Debugging Lighting Validation

0:03 Plug the Meow the Infinite printed comic Kickstarter[1] and tease the related fun stuff in celebration of it
📖
0:52 Recap our lighting discrepancy between the game and hhlightprof
🗩
2:00 Change the dump file paths in hhlightprof, with the determination to dump new lighting data in a single-threaded run of the game
2:58 Make InternalLightingCore() disable the LightBoxDumpTrigger() after dumping one set of data
4:05 Hit our Work alignment assertion in InternalLightingCore()
🏃
4:42 Build in -Od
4:55 Hit our Work alignment assertion in InternalLightingCore()
🏃
5:06 RemedyBG bug report: AND'ing a location with an integer
🏃
6:13 x13pixels Sheeet. I'll get that fixed
🗪
6:27 Fix the BigPad in lighting_work
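The alignment assertion and the "AND'ing a location with an integer" note both touch on a common idiom: for a power-of-two alignment, an address is aligned when its low bits are zero. The following is a minimal sketch of that check and of padding a struct to a fixed stride. The names `IsAligned`, `example_work` and the 64-byte figure are assumptions for illustration; the source does not show the actual lighting_work layout or its BigPad size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed value; the source does not state the alignment */

/* Returns 1 when Pointer sits on an Alignment-byte boundary.
   Alignment must be a power of two for the mask test to be valid. */
int IsAligned(const void *Pointer, uintptr_t Alignment)
{
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
    return ((uintptr_t)Pointer & (Alignment - 1)) == 0;
}

/* Hypothetical work item: the payload is padded so that consecutive
   array elements start exactly CACHE_LINE_SIZE bytes apart. */
typedef struct example_work
{
    uint32_t FirstTexel;
    uint32_t TexelCount;
    uint8_t Pad[CACHE_LINE_SIZE - 2*sizeof(uint32_t)];
} example_work;
```

Padding only fixes the stride between elements. The base address of the array still has to be aligned by whoever allocates it, which is the kind of thing an assertion like the one above can catch.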
7:52 Run without hitting that alignment assertion in InternalLightingCore
🏃
8:13 Run the game with the determination to capture lighting dumps
🏃
9:13 Add the LightBoxDumpTrigger to the debug UI in EndLightingComputation()
10:06 Dump our multi-threaded lighting
🏃
10:25 Disable multi-threading of the lighting
10:43 Dump our single-threaded lighting
🏃
11:33 Re-enable multi-threading of the lighting
11:50 Run hhlightprof on the single-threaded lighting data, with errors
🏃
12:23 Run hhlightprof on the multi-threaded lighting data, also with errors
🏃
12:43 Scour hhlightprof for bugs
📖
16:37 Make InternalLightingCore() dump the light Boxes and BoxTable after the BuildSpatialPartitionForLighting() call
20:09 Dump our lighting
🏃
20:37 Make hhlightprof load in and validate the Boxes and BoxTable
24:52 Run hhlightprof to find that the light boxes don't match
🏃
25:10 Make hhlightprof validate the BoxTable
26:16 Run hhlightprof to find that the boxrefs don't match
🏃
26:41 Note the simplicity of BuildSpatialPartitionForLighting()
📖
27:33 Enable the LightBoxDumpTrigger, to dump the first frame of lighting
27:59 Break into InternalLightingCore()
🏃
28:43 Fix InternalLightingCore() to dump the correct number of light boxes at the head
29:11 Dump our lighting
🏃
29:34 Run hhlightprof to find that the light boxes still don't match, but the error / texel is much lower
🏃
32:36 Fix InternalLightingCore() to dump the correct number of light boxes after BuildSpatialPartitionForLighting()
33:07 Dump our lighting
🏃
33:22 Run hhlightprof to find that the light boxes now match
🏃
33:40 Check InternalLightingCore() for BoxTable dumping errors
📖
34:18 Step into InternalLightingCore() and inspect the BoxTable values
🏃
35:31 Step through hhlightprof and find the BoxTable file size to be wrong
🏃
38:38 Step through the DEBUGDumpData() of the BoxTable
🏃
39:24 Add a SetFileSize() function pointer to the platform[2,3]
49:35 Make DEBUGDumpData() call SetFileSize()
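The annotations do not say why the BoxTable file size was wrong. One common cause, assumed here, is overwriting an existing larger file in place without truncating it, so stale bytes from the previous dump remain at the end. The sketch below reproduces that effect in portable C rather than with the platform SetFileSize() call from the episode; `WriteBytes` and `FileSizeInBytes` are hypothetical helpers.

```c
#include <assert.h>
#include <stdio.h>

/* Writes Size copies of Fill to FileName using the given fopen mode.
   Mode "wb" truncates an existing file; "r+b" overwrites it in place. */
int WriteBytes(const char *FileName, const char *Mode, size_t Size, int Fill)
{
    assert(FileName && Mode);
    FILE *File = fopen(FileName, Mode);
    if(!File)
    {
        return 0;
    }
    for(size_t Index = 0; Index < Size; ++Index)
    {
        fputc(Fill, File);
    }
    return fclose(File) == 0;
}

/* Returns the length of the file in bytes, or -1 on failure. */
long FileSizeInBytes(const char *FileName)
{
    long Result = -1;
    FILE *File = fopen(FileName, "rb");
    if(File)
    {
        if(fseek(File, 0, SEEK_END) == 0)
        {
            Result = ftell(File);
        }
        fclose(File);
    }
    return Result;
}
```

Writing 100 bytes with "wb" and then 10 bytes with "r+b" leaves a 100-byte file, so a reader that trusts the file size would load 90 bytes of stale data. Setting the file size explicitly after the write, or opening with a truncating mode, avoids that.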
50:03 Dump our lighting
🏃
50:31 Run hhlightprof to find that the light refs now match
🏃
50:49 Make hhlightprof record the max error / texel
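Recording a maximum alongside the accumulated error can be sketched as below. This assumes the error is the absolute difference between corresponding float values in the dumped and recomputed data; the source does not specify the exact metric, and `error_stats` and `CompareTexels` are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct error_stats
{
    double TotalError; /* sum of absolute differences */
    double MaxError;   /* largest single absolute difference */
    size_t Count;      /* number of values compared */
} error_stats;

/* Compares two equally sized float arrays value by value. */
error_stats CompareTexels(const float *Expected, const float *Actual, size_t Count)
{
    assert(Expected && Actual);
    error_stats Stats = {0.0, 0.0, Count};
    for(size_t Index = 0; Index < Count; ++Index)
    {
        double Error = (double)Expected[Index] - (double)Actual[Index];
        if(Error < 0.0)
        {
            Error = -Error;
        }
        Stats.TotalError += Error;
        if(Error > Stats.MaxError)
        {
            Stats.MaxError = Error;
        }
    }
    return Stats;
}
```

An average alone can hide a single badly wrong texel, which is why tracking the maximum is useful when deciding whether a mismatch is floating point noise or a real bug.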
51:27 Max error / texel: 0.001786
🏃
52:56 Dump our lighting in -O2, with a max error / texel of 0.001814
🏃
53:09 Disable the LightBoxDumpTrigger
53:34 Walk through the orphanage and dump our lighting
🏃
53:54 Max error / texel: 0.001195
🏃
54:00 Walk outside and dump our lighting
🏃
54:12 Max error / texel: 0.002587
🏃
54:21 Walk down to the dungeon and dump our lighting
🏃
54:31 Max error / texel: 0.009179, and the light boxes don't match
🏃
55:18 Consider how to proceed
🏃
56:25 Run our "Instructions Per Clock" analysis of hhlightprof
🏃
58:33 Make InternalLightingCore() compute 5 seconds of lighting
1:00:44 Introduce ProfileRun() in hhlightprof, to run it multiple times
1:02:05 Run hhlightprof for 9 seconds, to completion
🏃
1:02:23 Decrease the iterations of ProfileRun() from 60*5 to 60
1:02:42 Run hhlightprof for 3 seconds, to completion
🏃
1:02:48 Increase the iterations of ProfileRun() from 60 to 60*2
1:03:02 Run our "Instructions Per Clock" analysis of hhlightprof
🏃
1:03:23 Consult our "Instructions Per Clock" VTune analysis
🏃
1:04:02 Optimisation Opportunities: 1) Post-processing textures
📖
1:04:43 Optimisation Opportunities: 2) Accessing the lighting_box in the spatial partition
📖
1:05:43 Optimisation Opportunities: 3) Efficient loading of data
📖
1:06:07 Prepare to pack our ray casting data more concisely
📖
1:09:05 Run hhlightprof
🏃
1:09:19 Make hhlightprof record its execution time[4,5]
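Running a workload a fixed number of times and reporting the elapsed seconds can be sketched as follows. This is not the ProfileRun() from the episode: `TimeRepeatedRuns` and `CountCalls` are hypothetical, and the standard `clock()` used here measures processor time, whereas the source does not say which timer hhlightprof uses.

```c
#include <assert.h>
#include <time.h>

typedef void profiled_function(void *UserData);

/* Calls Function RepeatCount times and returns the elapsed
   processor time in seconds, as reported by clock(). */
double TimeRepeatedRuns(profiled_function *Function, void *UserData, int RepeatCount)
{
    assert(Function && RepeatCount > 0);
    clock_t Start = clock();
    for(int Run = 0; Run < RepeatCount; ++Run)
    {
        Function(UserData);
    }
    clock_t End = clock();
    return (double)(End - Start) / (double)CLOCKS_PER_SEC;
}

/* Stand-in workload that only counts how often it was called. */
void CountCalls(void *UserData)
{
    ++*(int *)UserData;
}
```

Repeating the run many times, as with the 60*2 iterations above, makes the total long enough that run-to-run noise is small relative to the differences being measured.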
1:12:55 hhlightprof total seconds elapsed: 7.173237
🏃
1:13:58 Introduce ray_cast_stack_entry to more concisely store the data needed by RayCast()
1:19:43 hhlightprof total seconds elapsed: 8.112103
🏃
1:20:32 Inspect the assembly of RayCast()
🏃
1:22:08 Replace ray_cast_stack_entry with a PACK_CAST_ENTRY() for RayCast() to use
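The annotations do not show what PACK_CAST_ENTRY() packs or how. As a general illustration of packing a stack entry into one machine word, the sketch below stores a 32-bit box index and the bit pattern of a float distance in a single 64-bit integer. The layout, the `PACK_ENTRY` macro and all function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: high 32 bits hold the box index,
   low 32 bits hold the raw bit pattern of a float. */
#define PACK_ENTRY(Index, Bits) \
    (((uint64_t)(uint32_t)(Index) << 32) | (uint64_t)(uint32_t)(Bits))

/* Reinterprets a float as its 32-bit pattern without aliasing violations. */
uint32_t FloatToBits(float Value)
{
    uint32_t Bits;
    assert(sizeof(Bits) == sizeof(Value));
    memcpy(&Bits, &Value, sizeof(Bits));
    return Bits;
}

float BitsToFloat(uint32_t Bits)
{
    float Value;
    memcpy(&Value, &Bits, sizeof(Value));
    return Value;
}

uint64_t PackEntry(uint32_t BoxIndex, float Distance)
{
    return PACK_ENTRY(BoxIndex, FloatToBits(Distance));
}

uint32_t UnpackBoxIndex(uint64_t Entry)
{
    return (uint32_t)(Entry >> 32);
}

float UnpackDistance(uint64_t Entry)
{
    return BitsToFloat((uint32_t)(Entry & 0xFFFFFFFFu));
}
```

A packed entry can be pushed and popped with a single 8-byte store or load. Whether that beats a plain struct depends on the generated code, which is why the timings before and after each change matter.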
1:26:48 hhlightprof total seconds elapsed: 7.259090
🏃
1:27:44 Inspect the assembly of RayCast() to see many jmp instructions
🏃
1:29:03 Introduce lighting_box_pack for lighting_box to contain, and RayCast() to use
1:33:45 hhlightprof total seconds elapsed: 7.063448
🏃
1:34:16 Inspect the assembly of RayCast() to still see jmp instructions
🏃
1:36:55 Consult AnyTrue()
📖
1:38:41 Let RayCast() push on a box regardless of its proximity
1:39:06 hhlightprof total seconds elapsed: 8.120685
🏃
1:39:36 Revert RayCast() to only push on boxes within a certain distance of the ray's origin
1:39:46 Consider determining more efficiently if RayTest() should push a box
📖
1:40:50 Make RayTest() determine more efficiently if it should push a box
1:42:20 hhlightprof total seconds elapsed: 6.932234
🏃
1:42:48 Inspect the assembly of RayCast() to still see jmp instructions
🏃
1:44:13 Try to make RayTest() determine even more efficiently if it should push a box
1:45:20 Inspect the assembly of RayCast() to still see jmp instructions
🏃
1:45:47 Research cmov intrinsic generation
📖
1:47:03 Read the gamedev.net forum post "Dependable cmov in Visual C++"[6]
📖
1:47:37 Try to make the compiler generate a cmov for the conditional box pushing code in RayTest()
1:49:16 Inspect the assembly of RayCast() to still see no cmov instructions
🏃
1:49:27 Try again to make the compiler generate a cmov for the conditional box pushing code in RayTest()
1:50:15 Inspect the assembly of RayCast() to see one cmov instruction
🏃
1:50:41 Try again to make the compiler generate a cmov for the conditional box pushing code in RayTest()
1:50:52 Inspect the assembly of RayCast() to see no further cmov instructions
🏃
1:51:55 Consult the Intel Intrinsics Guide[7] for mask instructions
📖
1:53:53 Try making RayCast() determine in SIMD if it should push a box
1:58:19 Inspect the assembly of RayCast() to still see no further cmov instructions
🏃
1:59:44 Try changing RayCast() to store off the StackAt to write back to the BoxStack
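One general way to push conditionally without a branch is to always store the candidate into the top slot and then advance the stack index by 0 or 1. The sketch below shows that pattern. It is not the episode's RayCast() code, and `box_stack`, `PushIf` and the capacity are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_CAPACITY 64 /* assumed capacity for this sketch */

typedef struct box_stack
{
    uint32_t Entries[STACK_CAPACITY];
    uint32_t Count;
} box_stack;

/* Always writes BoxIndex into the next free slot, then keeps it
   only if ShouldPush is non-zero by advancing Count by 0 or 1. */
void PushIf(box_stack *Stack, uint32_t BoxIndex, int ShouldPush)
{
    assert(Stack->Count < STACK_CAPACITY); /* the store must stay in bounds */
    Stack->Entries[Stack->Count] = BoxIndex;
    Stack->Count += (uint32_t)(ShouldPush != 0);
}
```

Writing the code this way only makes a branchless encoding possible. Whether the compiler actually emits a cmov, a setcc or a jmp has to be checked in the assembly, as the annotations here do repeatedly.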
2:00:30 Inspect the assembly of RayCast() to see cmov instructions
🏃
2:01:02 hhlightprof total seconds elapsed: 7.496200
🏃
2:01:33 Toggle RayCast() back to determine in scalar if it should push a box
2:01:41 Inspect the assembly of RayCast() to see our dreaded jmp instructions
🏃
2:01:57 Try making RayCast() set ShouldPush using a bitwise, rather than a conditional, OR
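The difference between a conditional (logical) OR and a bitwise OR can be sketched as below. Logical `||` and `&&` must short-circuit, which invites branches, while `|` and `&` on values normalised to 0 or 1 evaluate every operand. The way the three inputs are combined here is an assumption; only the names tInside, Mask and CloseEnough appear in the annotations.

```c
#include <assert.h>

/* Logical operators: the right-hand operand is evaluated only when needed. */
int ShouldPushLogical(int Inside, int Mask, int CloseEnough)
{
    return (Inside || CloseEnough) && Mask;
}

/* Bitwise operators on values normalised to 0 or 1: all operands are
   always evaluated, so no short-circuit branch is required. */
int ShouldPushBitwise(int Inside, int Mask, int CloseEnough)
{
    return ((Inside != 0) | (CloseEnough != 0)) & (Mask != 0);
}

/* Checks that both forms agree for every combination of 0/1 inputs. */
void CheckEquivalence(void)
{
    for(int Bits = 0; Bits < 8; ++Bits)
    {
        int Inside = (Bits >> 0) & 1;
        int Mask = (Bits >> 1) & 1;
        int CloseEnough = (Bits >> 2) & 1;
        assert(ShouldPushLogical(Inside, Mask, CloseEnough) ==
               ShouldPushBitwise(Inside, Mask, CloseEnough));
    }
}
```

The two forms are only interchangeable when evaluating the operands has no side effects and cannot fault, since the bitwise version evaluates all of them.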
2:02:22 Inspect the assembly of RayCast() to see cmov instructions
🏃
2:02:31 Consider our ShouldPush setting, in terms of OR'ing
📖
2:04:12 hhlightprof total seconds elapsed: 7.360275
🏃
2:04:35 Inspect the assembly of RayCast()
🏃
2:05:20 Try making RayCast() compute ShouldPush bitwise OR'ing and AND'ing only tInside, Mask and CloseEnough
2:07:01 Inspect the assembly of RayCast()
🏃
2:07:23 hhlightprof total seconds elapsed: 7.307276
🏃
2:07:40 Note that it seems cheaper to jmp than cmov
📖
2:08:27 Try making RayCast() compute ShouldPush using conditional tests
2:08:47 hhlightprof total seconds elapsed: 7.253258
🏃
2:08:56 Revert RayCast() to the original box pushing code
2:09:29 hhlightprof total seconds elapsed: 6.878540
🏃
2:09:45 Save off our jmp and cmov versions of the box pushing code in RayCast()
2:12:16 hhlightprof total seconds elapsed: 6.888590
🏃
2:12:31 Toggle RayCast() to the slower cmov box pushing code
2:12:41 hhlightprof total seconds elapsed: 7.079671
🏃
2:13:11 Q&A
🗩
2:13:24 Realise why the cmov version isn't faster
📖
2:14:56 Q&A
🗩
2:15:09 vaualbus Q: I think the MSDN you were looking at is SetFileInformationByHandle, maybe?
🗪
2:16:00 yurasniper Q: Logical || and && are short-circuited, so they will always have a jump, unless compiler can figure out some property that will allow it to collapse the expression. So to avoid jumps one should use bitwise | and & if possible. But also there were some people saying that cmov is worse than a jump over a few instructions. I think LLVM people, but I may be wrong
🗪
2:16:35 somebody_took_my_name Q: Can't you just use CloseEnough instead of CloserCloseEnough in the ShouldPush assignment? CloseEnough is already AND'd with Mask. Oh, and the assignment of StackY is busted. There is a 0, 1 instead of 2, 3 at the end
🗪
2:16:59 Fix the StackY setting in RayCast() and toggle to the faster jmp box pushing code
2:17:37 hhlightprof total seconds elapsed: 6.963080
🏃
2:18:20 Toggle RayCast() to the slower, SIMD cmov box pushing code
2:18:27 hhlightprof total seconds elapsed: 7.533641
🏃
2:18:39 Toggle RayCast() to the faster, scalar cmov box pushing code
2:18:44 hhlightprof total seconds elapsed: 7.265596
🏃
2:19:00 somebody_took_my_name Q: Can't you just use CloseEnough instead of CloserCloseEnough in the ShouldPush assignment? CloseEnough is already AND'd with Mask
🗪
2:19:31 Remove the superfluous CloserCloseEnough from RayCast()
2:20:17 hhlightprof total seconds elapsed: 7.195799
🏃
2:20:29 Toggle RayCast() to the faster jmp box pushing code
2:20:45 hhlightprof total seconds elapsed: 6.874347
🏃
2:21:00 Make a note to try pushing boxes using a circular buffer
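The circular buffer idea is only noted, not implemented, in this episode. As a rough sketch of what such a structure could look like, the ring below uses a power-of-two capacity so that wrapping is a single AND instead of a compare and branch. The names, the capacity and the first-in-first-out order are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define RING_CAPACITY 8 /* must be a power of two for the mask to work */

typedef struct box_ring
{
    uint32_t Entries[RING_CAPACITY];
    uint32_t WriteIndex; /* total number of pushes so far */
    uint32_t ReadIndex;  /* total number of pops so far */
} box_ring;

/* Number of entries currently stored; correct across unsigned wraparound. */
uint32_t RingCount(const box_ring *Ring)
{
    return Ring->WriteIndex - Ring->ReadIndex;
}

void RingPush(box_ring *Ring, uint32_t BoxIndex)
{
    assert(RingCount(Ring) < RING_CAPACITY);
    Ring->Entries[Ring->WriteIndex & (RING_CAPACITY - 1)] = BoxIndex;
    ++Ring->WriteIndex;
}

uint32_t RingPop(box_ring *Ring)
{
    assert(RingCount(Ring) > 0);
    uint32_t Result = Ring->Entries[Ring->ReadIndex & (RING_CAPACITY - 1)];
    ++Ring->ReadIndex;
    return Result;
}
```

Because the indices only ever increase and are masked on access, neither push nor pop needs a conditional to handle the wrap.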
2:22:17somebody_took_my_name Q: Somebody in chat had the idea of loading the ray caster through the dll for testing. Could this remove the floating point errors in the test code?
🗪
2:22:17somebody_took_my_name Q: Somebody in chat had the idea of loading the ray caster through the dll for testing. Could this remove the floating point errors in the test code?
🗪
2:22:17somebody_took_my_name Q: Somebody in chat had the idea of loading the ray caster through the dll for testing. Could this remove the floating point errors in the test code?
🗪
2:23:25tjom2000 Q: Is it possible to structure the game code so it would run reasonably fast in debug mode? Would that be worth the hassle?
🗪
2:23:25tjom2000 Q: Is it possible to structure the game code so it would run reasonably fast in debug mode? Would that be worth the hassle?
🗪
2:23:25tjom2000 Q: Is it possible to structure the game code so it would run reasonably fast in debug mode? Would that be worth the hassle?
🗪
2:23:53emperormetallix Q: Will you come to the dark side and try const?
🗪
2:23:53emperormetallix Q: Will you come to the dark side and try const?
🗪
2:23:53emperormetallix Q: Will you come to the dark side and try const?
🗪
2:24:40martinsmemory Q: When you dropping that low level course?
🗪
2:24:40martinsmemory Q: When you dropping that low level course?
🗪
2:24:40martinsmemory Q: When you dropping that low level course?
🗪
2:24:51vaualbus Q: Why have you uploaded the last Handmade Hero episode on Molly Rocket's YouTube account?
🗪
2:24:51vaualbus Q: Why have you uploaded the last Handmade Hero episode on Molly Rocket's YouTube account?
🗪
2:24:51vaualbus Q: Why have you uploaded the last Handmade Hero episode on Molly Rocket's YouTube account?
🗪
2:25:47isfoo Q: MSVC is very often not optimizing away obvious things (basically you cannot do so-called zero cost abstractions with it). For example it always does short circuiting.8 Or from experience, I also remember it sometimes calling empty constructors. Why not use some reasonable compiler like clang / gcc / icc?
🗪
2:25:47isfoo Q: MSVC is very often not optimizing away obvious things (basically you cannot do so-called zero cost abstractions with it). For example it always does short circuiting.8 Or from experience, I also remember it sometimes calling empty constructors. Why not use some reasonable compiler like clang / gcc / icc?
🗪
2:25:47isfoo Q: MSVC is very often not optimizing away obvious things (basically you cannot do so-called zero cost abstractions with it). For example it always does short circuiting.8 Or from experience, I also remember it sometimes calling empty constructors. Why not use some reasonable compiler like clang / gcc / icc?
🗪
2:26:58x1bzzr Q: In one of the first episodes of Handmade Hero you mentioned maybe it would be a good idea to recreate the window if WM_DESTROY was caught in the window procedure. In what sort of scenario does that happen?
🗪
2:27:48A few words on the C++ spec effectively preventing the optimising compiler from using const
🗩
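One commonly cited reason a compiler cannot lean on const can be sketched as follows. This is an illustrative assumption, not necessarily the argument made on stream, and the names Bump and ReadTwice are hypothetical. A pointer-to-const only promises that this particular pointer will not be used to write; if the underlying object was not itself declared const, other code may still legally modify it, so the value has to be reloaded after the call.

```cpp
#include <cassert>

// Hypothetical helper: modifies the pointee through a pointer-to-const.
// This is well-defined only because the object the caller passes in
// was not itself declared const.
static void Bump(const int *Value)
{
    *const_cast<int *>(Value) += 1;
}

// Reads the pointee, calls Bump, and reads it again. The second read
// cannot be replaced with the first value: const on the pointer does
// not guarantee the pointee is unchanged across the call.
static int ReadTwice(const int *Value)
{
    int First = *Value;
    Bump(Value);
    int Second = *Value;
    assert(Second == First + 1);
    return Second - First;
}
```

Called on an ordinary (non-const) int, ReadTwice returns 1, which is the difference a "const means unchanged" assumption would get wrong.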
2:28:08emperormetallix Q: These optimisations seem to be very low level. How do you know when it is worth going to this level vs zooming out to examine the overall algorithm, or memory layout, data volume, etc?
🗪
2:30:20vaualbus Q: Will you be in Jon's talk today? It's already started! Let's go there after
🗪
2:30:30x1bzzr Q: I guess what I'm curious about is why you shouldn't just terminate the application if you get WM_DESTROY in the window procedure
🗪
2:30:42martinsmemory Q: How many hours do you work a week normally?
🗪
2:31:38maavelar Q: Has your RSI problem improved from a couple of years back? If so, what helped you?
🗪
2:32:00internationalizationist Q: Is it possible to not call VirtualAlloc at all? You can create global array of bytes of whatever size you want and point Persistent / Transient storages to those arrays. Global storage should go to BSS (is it?), and the operating system must allocate enough memory at startup time (and you already allocated a determined amount of space (so you know how much it is at compile time))
🗪
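The idea raised in this question can be sketched roughly as below. This is a minimal illustration of the questioner's proposal, not the approach taken or endorsed on stream. The sizes, the game_memory_sketch struct and MakeStaticGameMemory are all hypothetical names chosen for the example. Zero-initialised arrays with static storage duration are typically placed in a BSS-style section by the toolchain, though the exact placement and when pages are actually committed depend on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes, fixed at compile time for illustration only.
static const std::size_t PERMANENT_SIZE = 1024 * 1024;     // 1 MiB
static const std::size_t TRANSIENT_SIZE = 4 * 1024 * 1024; // 4 MiB

// Zero-initialised global backing stores, used instead of VirtualAlloc.
static std::uint8_t GlobalPermanentBytes[PERMANENT_SIZE];
static std::uint8_t GlobalTransientBytes[TRANSIENT_SIZE];

// Hypothetical stand-in for a structure holding the two storage blocks.
struct game_memory_sketch
{
    std::size_t PermanentStorageSize;
    void *PermanentStorage;
    std::size_t TransientStorageSize;
    void *TransientStorage;
};

// Points the persistent and transient storage at the global arrays.
static game_memory_sketch MakeStaticGameMemory()
{
    game_memory_sketch Memory = {};
    Memory.PermanentStorageSize = sizeof(GlobalPermanentBytes);
    Memory.PermanentStorage = GlobalPermanentBytes;
    Memory.TransientStorageSize = sizeof(GlobalTransientBytes);
    Memory.TransientStorage = GlobalTransientBytes;
    assert(Memory.PermanentStorage && Memory.TransientStorage);
    return Memory;
}
```

One trade-off of this sketch is that the sizes are baked into the executable image, so they cannot be chosen at run time the way an explicit allocation call allows.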
2:32:32yakvie handmade_hero Hey Casey, I was wondering what's the current plan for Handmade Hero? Will you be releasing pieces of this code to the public domain?
🗪
2:32:47emperormetallix Q: What do you use instead of Ctrl / Alt keys? Vim controls?
🗪
2:33:38internationalizationist Q: Could you stream your daily working process (of 1935, for example) sometime?
🗪
2:33:52zzyzzyxx Here's an example where adding const with clang induced some optimization (minutes 27–28) [9]
🗪
2:34:37cencetv Is it important nowadays to still support x86?
🗪
2:35:06oisincar Apologies if you've answered this before but would there be any chance we'd see Vulkan on Handmade Hero?
🗪
2:35:28Wrap it up with a plug of the Meow the Infinite printed comic Kickstarter [10] and related fun videos at Molly Rocket's YouTube channel, [11] and Jon's stream [12]
🗩