Reducing GPU Memory Footprint

0:02 Update Milton and RemedyBG, with thanks to Ameen Sayegh[1] for the Milton grid patch 🗹
2:46 Try out Milton's grids 🖌
6:18 Milton or tablet driver issue: Smoothing 🖌
8:18 Demo our lighting sampling sphere 🏃
9:29 Note the fast lighting computation 📖
12:35 8-Wide Light Probes[2] 🖌
22:15 Plan to work on both the pixel shader and light propagation 🗩
24:06 See stutter in our frame rate 🏃
25:55 We've got an interloping puss 🗩
27:18 Consider that we're using too much GPU memory 🏃
28:42 Calculate our GPU memory requirements 🗹
31:51 Crash Nsight upon launching Handmade Hero 🏃
33:35 Add a new Nsight project for Handmade Hero 🗹
35:13 Crash Nsight upon launching Handmade Hero 🏃
36:36 Install RenderDoc[3] and configure it for Handmade Hero 🗹
40:26 Crash RenderDoc upon launching Handmade Hero 🏃
41:36 cloc Handmade Hero: 32,936 lines 🗹
43:24 Remove the light_buffer array from open_gl
47:41 Find that there is no change 🏃
47:51 Track the frame buffer memory usage in FreeFramebuffer() and CreateFramebuffer()
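
The bookkeeping described at 47:51 can be sketched roughly as follows. This is a minimal illustration, not the episode's code: tracked_framebuffer, TrackCreateFramebuffer() and TrackFreeFramebuffer() are hypothetical names, and the byte size is assumed to be computed by the caller. Only the idea of adding on create and subtracting on free comes from the annotation.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical record kept alongside each framebuffer so that the free
// path knows how much to subtract from the running total.
struct tracked_framebuffer
{
    size_t GPUMemoryUsed; // estimated bytes attributed to this framebuffer
};

// Running total across all live framebuffers (name borrowed from the
// annotations; how the real renderer stores it is an assumption).
static size_t TotalFramebufferMemory = 0;

// Call from the create path: remember the size and add it to the total.
static tracked_framebuffer TrackCreateFramebuffer(size_t Bytes)
{
    tracked_framebuffer Result = {};
    Result.GPUMemoryUsed = Bytes;
    TotalFramebufferMemory += Bytes;
    return Result;
}

// Call from the free path: subtract exactly what was added at creation.
static void TrackFreeFramebuffer(tracked_framebuffer *Framebuffer)
{
    assert(TotalFramebufferMemory >= Framebuffer->GPUMemoryUsed);
    TotalFramebufferMemory -= Framebuffer->GPUMemoryUsed;
    Framebuffer->GPUMemoryUsed = 0;
}
```

Storing the size on the framebuffer itself keeps the two paths symmetric, so the total cannot drift if the size calculation changes later.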
57:41 Find that we may get a faster frame rate with V-Sync disabled 🏃
58:13 Add a "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup() for the framebuffer and texture memory
1:04:58 Enable the debug system to handle umm type
1:07:57 Find that we have a TotalFramebufferMemory of 3GB 🏃
1:08:22 Add UsedMultisampleCount to the "Renderer" DEBUG_DATA_BLOCK in WinMainCRTStartup()
1:10:48 Find that our UsedMultisampleCount is 16, and calculate the memory requirements as 2.47GB 🏃
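
A back-of-the-envelope estimate of the kind made at 1:10:48 might look like the sketch below. Everything except the sample count of 16 is an assumption for illustration: the function names, the 1920x1080 resolution, the 4-byte colour and 4-byte depth formats, and the buffer count are not taken from the episode, and real drivers may pad or compress their allocations.

```cpp
#include <assert.h>
#include <stdint.h>

// Estimated bytes for BufferCount identical framebuffers, each storing
// one colour value and one depth value per sample.
static uint64_t EstimateFramebufferBytes(uint32_t Width, uint32_t Height,
                                         uint32_t SampleCount,
                                         uint32_t ColourBytes,
                                         uint32_t DepthBytes,
                                         uint32_t BufferCount)
{
    assert(SampleCount >= 1);
    uint64_t BytesPerSample = (uint64_t)ColourBytes + DepthBytes;
    uint64_t PerBuffer = (uint64_t)Width * Height * SampleCount * BytesPerSample;
    return PerBuffer * BufferCount;
}

// Convert a byte count to binary gigabytes for display.
static double BytesToGiB(uint64_t Bytes)
{
    return (double)Bytes / (1024.0 * 1024.0 * 1024.0);
}
```

Under these assumed values, one 1920x1080 buffer at 16 samples and 8 bytes per sample comes to 265,420,800 bytes (about 253 MiB), which shows how quickly several multisampled buffers can add up to gigabytes.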
1:13:14 Consider memory usage improvements: Only store one depth buffer 🗩
1:18:59 Consider memory usage improvements: Streamline the colour buffer 🗩
1:23:02 Consider rendering solid cubes into a single multisampling buffer with depth peeling disabled, resolve and composite this with our alpha items 🗩
1:30:15 Consider depth peeling in only two buffers 🗩
1:36:26 Spot our DepthPeelResolveBuffer, and make OpenGLEndFrame() use a single depth peel buffer
1:39:21 Find that we render just fine in one DepthPeelBuffer 🏃
1:40:11 Shrink open_gl down to only contain one DepthPeelBuffer
1:42:15 Find that we render correctly and have a TotalFramebufferMemory of 2GB 🏃
1:43:16 Step into CreateFramebuffer() and watch the memory usage 🏃
1:46:00 Add a new Nsight project for Handmade Hero using the correct Working Directory 🗹
1:46:51 Run successfully in Nsight 🏃 🦉
1:48:07 Continue to step through CreateFramebuffer() and spot that we're multiplying the MaxMultiSampleCount into the GPUMemoryUsed even when multisampling is disabled 🏃 🦉
1:48:53 Fix CreateFramebuffer() to multiply the correct SampleCount into the GPUMemoryUsed 🦉
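
The accounting bug spotted at 1:48:07 and fixed at 1:48:53 can be illustrated with a small sketch. The helper names and the Multisampled flag are hypothetical, and treating a non-multisampled framebuffer as having exactly one sample is an assumption about what "the correct SampleCount" means here.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Pick the sample count that actually applies to this framebuffer.
// The buggy version would have returned MaxMultiSampleCount unconditionally.
static uint32_t EffectiveSampleCount(bool Multisampled, uint32_t MaxMultiSampleCount)
{
    assert(MaxMultiSampleCount >= 1);
    return Multisampled ? MaxMultiSampleCount : 1;
}

// Estimated bytes for one framebuffer, using the effective sample count
// rather than the maximum the hardware supports.
static size_t FramebufferBytes(uint32_t Width, uint32_t Height,
                               uint32_t BytesPerSample,
                               bool Multisampled,
                               uint32_t MaxMultiSampleCount)
{
    uint32_t SampleCount = EffectiveSampleCount(Multisampled, MaxMultiSampleCount);
    return (size_t)Width * Height * BytesPerSample * SampleCount;
}
```

With a maximum of 16 samples, the unconditional multiply would overstate every non-multisampled buffer by a factor of 16, which is why a reported total can look far worse than the real usage.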
1:49:44 Step through CreateFramebuffer() to see that our GPUMemoryUsed is actually okay 🏃 🦉
1:55:55 Find that we have a TotalFramebufferMemory of 325MB 🏃
1:57:46 Nsight rendering time: 14ms / frame 🏃
1:58:38 Permit NVIDIA GPU performance counters[4] 🗹
2:01:09 Capture for Live Analysis in Nsight, and look for a combined memory count 🏃
2:07:39 Nsight: Range Profiler View 🏃
2:14:57 Reflect on our peel buffer memory usage reduction, and plan to pseudo-simulate the light probes 🗩
2:16:29 Try disabling V-Sync and multisampling, to find that the latter fails 🏃
2:17:17 Plan to fix the multisampling read 🗩