Examining the CPU Voxel Sampling

0:00 Recap and set the stage for the day
🗩
0:31 Determine to blur the lighting samples across voxels and reduce the flicker, after speeding up the performance of the grid ray tracer
🏃
3:02 Prepare to enable hhlightprof to capture a grid-based ray cast run
6:56 Update InternalLightingCore() to dump out the new source_lightboxes
9:47 Compile in hhlightprof and update it to work with our grid ray caster
16:27 Break in to InternalLightingCore()
🏃
18:24 Reload, to see the walk table break
🏃
18:48 Investigate the walk table breakage on hot reloading
📖
20:52 Run in -Od, hot-reload and see the walk table break
🏃
21:32 Refamiliarise ourselves with the walk table structure and code
📖
22:42 Break in to InternalLightingCore() and inspect the LightSamplingWalkTable
🏃
25:28 Break in to GridRayCast() and inspect the WalkTable usage
🏃
27:02 Fix our walk table breakage by making ComputeWalkTable() block copy the SampleDirections, and set a fresh RayD and WalkTableOffset
33:32 Run in -Od, hot-reload and see the walk table remain intact
🏃
34:08 Run in -O2, hot-reload and see the walk table remain intact
🏃
34:42 Enable then disable the LightBoxDumpTrigger, to dump the lighting
🏃
35:17 Check out our lighting dump files, noting the large size of the source_lighting.dump
🗹
36:44 Hit a read access violation in GetAlignmentOffset() from hhlightprof
🏃
37:14 Make ProfileRun() push the SampleDirectionTable onto the TempArena
37:34 Hit a write access violation in PushDebugLine()
🏃
37:44 Make ProfileRun() disable UpdateDebugLines
38:03 Run hhlightprof successfully
🏃
38:35 Run an -O2 build of hhlightprof
🏃
38:59 hhlightprof total seconds elapsed: 7.646482
🏃
40:03 Disable LIGHTING_USE_GRID
40:20 hhlightprof total seconds elapsed: 7.287836
🏃
40:42 Save off our timings, enable LIGHTING_USE_GRID and make GridRayCast() return early if not Debugging
42:34 A few words on replacing language / compiler "Errors" and "Warnings" with "I could not compile this" and "Things I noticed about the code"
💢
🗩
45:29 hhlightprof total seconds elapsed (without ray casting): 1.246065
🏃
46:44 Let GridRayCast() do its work
46:54 hhlightprof total seconds elapsed: 7.706672
🏃
47:10 Reorganise GridRayCast() to decrement the CostMetric after the loops, and comment out debugging code
50:18 hhlightprof total seconds elapsed: 7.479611
🏃
50:46 Note why the AABB testing loop in GridRayCast() does not tend to use all four SIMD lanes, and the simplicity of ComputeWalkTable()
📖
52:53 Determine to decouple the spatial and lighting voxel grids
📖
53:55 Try decreasing the CostMetric from 16 to 4 in GridRayCast()
54:28 hhlightprof total seconds elapsed: 6.321697
🏃
55:00 Try decreasing the CostMetric from 4 to 0 in GridRayCast()
55:13 hhlightprof total seconds elapsed: 4.679288
🏃
55:34 Interpret our 4.679288 seconds performance when casting no rays
📖
58:16 Alignment of Atlas Cells
🖌
1:06:48 Determine to remove the mutex from our atlas traversal code
📖
1:10:18 Change ComputeLightPropagationWork() to distribute the lighting computation along the Y axis
1:12:00 hhlightprof total seconds elapsed: 4.422329
🏃
1:12:41 Consider compacting the lighting atlases
📖
1:14:44 Consider the performance of ComputeVoxelIrradianceAt()
📖
1:15:41 Try greatly simplifying ComputeVoxelIrradianceAt()
1:16:46 hhlightprof total seconds elapsed: 3.160531
🏃
1:16:59 Try further simplifying ComputeVoxelIrradianceAt()
1:17:33 hhlightprof total seconds elapsed: 1.815950
🏃
1:17:50 Determine to speed up ComputeVoxelIrradianceAt()
📖
1:18:22 Instrument ComputeVoxelIrradianceAt() to more specifically gauge its performance
1:21:48 hhlightprof total seconds elapsed: 2.806293
🏃
1:22:00 Break in to ComputeVoxelIrradianceAt() and inspect the assembly
🏃
1:23:31 Determine to optimise out some of the math in ComputeVoxelIrradianceAt()
📖
1:26:46 Make ComputeVoxelIrradianceAt() operate wide [1, 2]
1:49:07 Dependents, and Cycle Ordering
1:52:32 Continue to make ComputeVoxelIrradianceAt() operate wide [3, 4]
2:07:23 Let GridRayCast() use the original CostMetric
2:07:48 We are back to normal
🏃
2:07:53 Q&A
🗩
2:08:15 sagian2005 Q: Would things go better if you started with U, V and W each 4 wide?
🗪
2:09:36 cirdanvalen Q: Could you not pad by 8 bytes to fix the overlap?
🗪
2:10:03 mindmark42 Q: The L0 determines how many cache lines the CPU can hold, right?
🗪
2:10:43 rooctag Q: Would you ever consider just going over explaining how an operation takes x amount of CPU ops? Or will that be in the Intro to C?
🗪
2:10:59 ali4410 Q: Hi, do you recommend learning vi keybindings, emacs keybindings, or neither?
🗪
2:12:00 mindmark42 Q: The memory caches on the CPU
🗪
2:13:17 Hardware Caches [5, 6]
📖
2:22:50 mindmark42 Q: Yes, that answers my question. I was just off by one index
🗪
2:23:15 billdstrong Q: How much time are you expecting to shave off from making this routine wide? About 2 / 3 or so? What are your expectations for your grid walk optimization? Are you trying to get under 4 seconds or lower?
🗪
2:23:51 mindmark42 Q: Yeah, I mentioned those L caches. I was wondering how to determine how many cache lines a core holds
🗪
2:24:23 8-Way Caches [7, 8]
📖
2:29:01 Thanks, everyone
🗩
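
The questions at 2:10:03 and 2:23:51, and the 8-Way Caches segment at 2:24:23, ask how many cache lines a core holds. The arithmetic can be sketched in C as below. This is a minimal illustration and not code from the episode: the function names are hypothetical, and the figures used in the usage note (a 32 KiB cache, 64-byte lines, 8 ways) are assumed values common for an L1 data cache, not numbers taken from this index. The set index uses simple modulo indexing on the line number, which is also an assumption; real hardware may hash the address differently.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines a cache of the given total size can hold. */
static size_t cache_line_count(size_t cache_bytes, size_t line_bytes)
{
    assert(line_bytes > 0 && cache_bytes % line_bytes == 0);
    return cache_bytes / line_bytes;
}

/* Number of sets in a set-associative cache with `ways` lines per set. */
static size_t cache_set_count(size_t cache_bytes, size_t line_bytes, size_t ways)
{
    size_t lines = cache_line_count(cache_bytes, line_bytes);
    assert(ways > 0 && lines % ways == 0);
    return lines / ways;
}

/* Set an address maps to, ASSUMING plain modulo indexing on the line number. */
static size_t cache_set_index(size_t address, size_t line_bytes, size_t set_count)
{
    assert(line_bytes > 0 && set_count > 0);
    return (address / line_bytes) % set_count;
}
```

With the assumed figures, cache_line_count(32 * 1024, 64) gives 512 lines and cache_set_count(32 * 1024, 64, 8) gives 64 sets. Under the modulo assumption, addresses 4096 bytes apart land in the same set, so at most 8 such lines can be resident at once.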