Debug Overlay Cleanup and Render Group Performance Investigation

0:30 "Today's episode might be a little bit of a potpourri" [α]
1:06 Determine to address the "Fix for profiler clipping issue on threads/frames/clocks view" issue [1]
2:07 Run the game and demo the incorrect clipping of the profiler
3:54 handmade_render_group.cpp: Fix GetClipRect() to use Rectangle.Min
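
The 3:54 entry does not show the fix itself. As a rough sketch of the idea, a clip rectangle can take its origin from the rectangle's Min corner and its size from Max - Min. The `v2`, `rectangle2` and `clip_rect` types and the `ClipRectFromRectangle()` name below are hypothetical stand-ins, not the project's actual definitions.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical types for illustration; the real definitions may differ.
struct v2 { float x, y; };
struct rectangle2 { v2 Min, Max; };
struct clip_rect { int32_t X, Y, Width, Height; };

// Build an integer clip rectangle whose origin is the rectangle's Min corner
// and whose size is Max - Min.
inline clip_rect ClipRectFromRectangle(rectangle2 Rect)
{
    clip_rect Result;
    Result.X = (int32_t)std::lround(Rect.Min.x);
    Result.Y = (int32_t)std::lround(Rect.Min.y);
    Result.Width = (int32_t)std::lround(Rect.Max.x - Rect.Min.x);
    Result.Height = (int32_t)std::lround(Rect.Max.y - Rect.Min.y);
    return Result;
}
```

If the origin is taken from anything other than the Min corner, the clip region is displaced, which would be consistent with the kind of incorrect clipping demoed at 2:07.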
5:00 Run the game to see that the profiler clipping is fixed
6:00 handmade_debug.cpp: Make DEBUGStart() offset the text Z
8:22 Run the game to see that the text drop-shadow looks correct
9:08 handmade_debug.cpp: Greatly increase the Z offset for the tooltip, so it draws above the profile bars
10:20 Run the game to see the tooltip box above the profile bars, but not its text contents, and discuss possible reasons why
17:53 handmade_debug_ui.cpp: Consider simplifying the text system
24:08 handmade_debug_ui.cpp: Enable DrawTooltips() to draw its text above everything
25:02 Run the game to see our tooltip text back
28:16 handmade_debug_interface.h: Temporarily conditionally append the function name to the UniqueFileCounterString__
29:19 Run the game to see those function names
30:42 handmade_debug.cpp: Enable DrawProfileBars() and DrawFrameBars() to put the Element->Name in their tooltip
32:59 Run the game to see our complete tooltips
34:16 Investigate why UpdateAndRenderEntities() takes so much time
37:10 Run the game to see that the EntityRender routine is the main culprit
39:45 handmade_entity.cpp: Further isolate EntityRender
41:01 Run the game to see that RenderVolume and RenderPieces are both slow
41:24 Investigate this slowness
43:43 Step through the assembly for the RenderPieces block
47:16 handmade_render_group.cpp: Rewrite GetBitmapDim() without function overloading
50:40 Inspect the assembly for the RenderPieces block and note that GetBitmapDim() has now been inlined, as has PushQuad()
57:03 Investigate why RoundReal32ToInt32() isn't being inlined [β]
59:29 Determine to excise roundf()
1:00:55 Consult the Intel Intrinsics Guide for rounding instructions [2]
1:03:53 handmade_intrinsics.h: Make RoundReal32ToInt32() and RoundReal32ToUInt32() use _mm_cvtss_si32()
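
A minimal sketch of what the 1:03:53 change could look like, assuming an x86 target with SSE. The function names come from the entry above. The bodies are an assumption about one straightforward way to use `_mm_cvtss_si32()`, not a copy of the episode's code.

```cpp
#include <cstdint>
#include <xmmintrin.h> // SSE: _mm_set_ss, _mm_cvtss_si32

// Round a float to a signed 32-bit integer with a single SSE conversion.
// The conversion uses the current MXCSR rounding mode, which defaults to
// round-to-nearest-even, so halfway cases differ from roundf().
inline int32_t RoundReal32ToInt32(float Real32)
{
    return _mm_cvtss_si32(_mm_set_ss(Real32));
}

// Assumption: the unsigned variant reuses the signed conversion, so it is
// only meaningful for inputs that fit in the signed 32-bit range.
inline uint32_t RoundReal32ToUInt32(float Real32)
{
    return (uint32_t)_mm_cvtss_si32(_mm_set_ss(Real32));
}
```

Because the intrinsic maps to a single instruction, the compiler can inline it at each call site, whereas roundf() is a call into the C runtime. Under the default rounding mode 2.5f rounds to 2 here, while roundf(2.5f) returns 3, so the two are not exact drop-in replacements for halfway values.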
1:05:25 Run the game to see that RenderPieces now takes 0.87% of the time, and consider the importance of profiling and looking at the assembly
1:09:27 "Death by a thousand function calls" [γ]
1:10:05 Investigate why the RenderVolume block takes a lot of time
1:10:55 Inspect the assembly for PushLineSegment() [δ]
1:13:47 handmade_intrinsics.h: Make SquareRoot() use _mm_sqrt_ss() [3]
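
The 1:13:47 change can be sketched in the same way, again assuming an x86 target with SSE. `SquareRoot()` is the name from the entry above. The body is one plausible way to wrap `_mm_sqrt_ss()` and may not match the episode's code exactly.

```cpp
#include <xmmintrin.h> // SSE: _mm_set_ss, _mm_sqrt_ss, _mm_cvtss_f32

// Scalar square root using the SSE sqrtss instruction.
// _mm_set_ss places the value in the low lane, _mm_sqrt_ss takes the root
// of that lane, and _mm_cvtss_f32 extracts it back to a float.
inline float SquareRoot(float Real32)
{
    return _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(Real32)));
}
```

As the 1:17:49 entry notes, this works but does not make RenderVolume faster, so the square root was not the bottleneck in that block.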
1:17:49 Run the game to see that that just works, but that RenderVolume is not faster
1:20:05 handmade_opengl.cpp: Add some debug timing blocks in OpenGLRenderCommands()
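
To illustrate the idea of per-block timing from the 1:20:05 entry, here is a small scoped timer. It is a stand-in built on `std::chrono`, and `scoped_timer` and `RenderCommandsExample()` are hypothetical names. The project's own debug timing mechanism is not shown in this guide and may work differently.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical scoped timer; not the project's actual debug macro.
struct scoped_timer
{
    const char *Name;
    double *ElapsedMsOut; // optional destination for the measurement
    std::chrono::steady_clock::time_point Start;

    explicit scoped_timer(const char *NameInit, double *Out = nullptr)
        : Name(NameInit), ElapsedMsOut(Out), Start(std::chrono::steady_clock::now())
    {
    }

    // On scope exit, store the elapsed time if a destination was given,
    // otherwise print it.
    ~scoped_timer()
    {
        std::chrono::duration<double, std::milli> Elapsed =
            std::chrono::steady_clock::now() - Start;
        if(ElapsedMsOut)
        {
            *ElapsedMsOut = Elapsed.count();
        }
        else
        {
            std::fprintf(stderr, "%s: %.3f ms\n", Name, Elapsed.count());
        }
    }
};

// Usage sketch: each braced scope is timed separately.
inline void RenderCommandsExample(double *SetupMs, double *QuadLoopMs)
{
    {
        scoped_timer Timer("Setup", SetupMs);
        // ... state setup would go here ...
    }
    {
        scoped_timer Timer("QuadLoop", QuadLoopMs);
        // ... per-quad draw calls would go here ...
    }
}
```

Timers like this measure CPU time spent issuing OpenGL calls, including driver overhead, not the time the GPU takes to execute them.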
1:22:30 Run the game to see that OpenGL::TexturedQuads takes the most time, and investigate why
1:25:51 Consider how to speed up passing the textures to the GPU
1:28:08 handmade_opengl.cpp: Experiment with ways to speed up the OpenGL::QuadLoop block in OpenGLRenderCommands()
1:35:41 Run the game to see garbage, note the speed-up of the block, but back out those changes in favour of potentially using texture arrays
1:38:20 Run the game to see that the cutscene does not work with the new 3D system
1:39:33 Calculate how large a texture array we would need for the cutscene
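
The figures used at 1:39:33 are not recorded in this guide. The general calculation for an uncompressed texture array without mipmaps is width × height × layer count × bytes per texel, sketched below with a hypothetical helper name.

```cpp
#include <cstdint>

// Bytes needed for a 2D texture array with no mipmaps.
// All parameters are supplied by the caller; nothing here reflects the
// actual cutscene assets.
inline uint64_t TextureArrayBytes(uint32_t Width, uint32_t Height,
                                  uint32_t LayerCount, uint32_t BytesPerTexel)
{
    return (uint64_t)Width * Height * LayerCount * BytesPerTexel;
}
```

As a purely illustrative input, ten 1920×1080 layers at 4 bytes per texel come to 82,944,000 bytes, about 79 MiB. A full mip chain would add roughly one third on top of that.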
1:43:06 handmade_cutscene.cpp: Reacquaint ourselves with the cutscene code
1:49:29 Run the game and try to interpret how the cutscene is being drawn
1:50:14 handmade_render_group.cpp: Make SetCameraTransform() take the NearClipPlane and FarClipPlane
1:53:19 handmade_cutscene.cpp: Make RenderLayeredScene() specify the clip planes
1:53:47 Run the game and note that our distance-based fog is knocking out some of the cutscene
1:54:28 handmade_render_group.cpp: Make SetCameraTransform() take Fog
1:56:34 Run the game to see all the pieces of the cutscene
1:58:03 handmade_cutscene.cpp: Make RenderLayeredScene() set the FocalLength and run the game to see it
2:01:45 "Where Santa Claus went, I don't know" [ε]
2:01:57 Run the game and investigate where Santa Claus went
2:08:45 Discover that the w values passed to PushQuad() are not all 0
2:09:57 handmade_render_group.cpp: Make PushBitmap() only set the ZBias for upright sprites
2:10:24 Run the game to see that Santa Claus is back
2:10:30 Q&A
2:11:05 desuused: What will be our way forward to reduce glDrawArrays calls? Will we look into texture atlases, megatextures, or invent our own way? Is it better to separate cutscene rendering because it's so different?
2:12:03 kilo_pasztetowej: Are we forced to use SIMD registers to use those intrinsics or are there some ways to do that without those, like inline assembly or something else?
2:16:48 desuused: Have you seen musl standard library? It has very good code quality. sqrtf on x86_64 is one instruction on it. It's for Linux, but parts of it are very portable [4]
2:21:17 botondar: Do you know whether the sqrt function in math.h uses processor instructions or calculates the results via some software algorithm? [5]
2:23:42 Close it down