Making a Stand-alone Lighting Performance Test

0:01 Recap and set the stage for the day
🗩
1:00 Plug x13pixels' RemedyBG version 0.3.0.0, with a brief history of Microsoft Visual Studio
🗩
5:09 Conditional Breakpoints in RemedyBG
🏃
6:58 x13pixels I didn't end up doing that, no
🗪
7:17 x13pixels It was already 15x faster just doing it the "normal" way. Well, okay, there are some tricks under the covers
🗪
7:44 x13pixels Yup!
🗪
7:48 Further love for RemedyBG [1]
🏃
12:37 Demo the current state of the lighting
🏃
13:51 30ms per frame
🏃
15:00 Make ComputeLightPropagationWork() a TIMED_FUNCTION
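Timing a function in this way is instrumentation: entering the function starts a clock, and leaving it adds the elapsed time to a named record. The sketch below illustrates that idea only. It is not the TIMED_FUNCTION macro used in the episode; `timing_record`, `scoped_timer`, `SKETCH_TIMED_FUNCTION` and `ComputeLightPropagationWorkSketch` are hypothetical names, and `std::chrono` stands in for whatever counter the real code uses.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical per-function record: number of calls and accumulated time.
struct timing_record
{
    const char *Name;
    std::uint64_t HitCount;
    std::uint64_t TotalNanoseconds;
};

// Starts timing on construction and adds the elapsed time to the record
// when the enclosing scope exits.
struct scoped_timer
{
    timing_record *Record;
    std::chrono::steady_clock::time_point Start;

    explicit scoped_timer(timing_record *RecordInit)
        : Record(RecordInit), Start(std::chrono::steady_clock::now())
    {
    }

    ~scoped_timer()
    {
        auto End = std::chrono::steady_clock::now();
        auto Elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(End - Start);
        Record->HitCount += 1;
        Record->TotalNanoseconds += (std::uint64_t)Elapsed.count();
    }
};

// Hypothetical macro: place at the top of a function to time its whole body.
#define SKETCH_TIMED_FUNCTION(RecordVar) scoped_timer SketchTimer_(&RecordVar)

static timing_record LightPropagationRecord = {"ComputeLightPropagationWork", 0, 0};

// Stand-in for the real work; only the instrumentation matters here.
static std::uint64_t ComputeLightPropagationWorkSketch(std::uint64_t Iterations)
{
    SKETCH_TIMED_FUNCTION(LightPropagationRecord);

    std::uint64_t Sum = 0;
    for(std::uint64_t Index = 0; Index < Iterations; ++Index)
    {
        Sum += Index;
    }
    return Sum;
}
```

After a few calls, `LightPropagationRecord.TotalNanoseconds / LightPropagationRecord.HitCount` gives a rough average cost per call.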
16:00 Check the Threads performance
🏃
18:58 Determine to reduce our time spent ray tracing
🏃
19:50 Describe our two-branch RayCast()
📖
22:25 Why we separated the collision detection and hierarchy traversal code in RayCast()
📖
24:22 Inspect the assembly of RayCast()
🏃
26:21 Describe our k-d-tree-like SplitBox()
📖
27:05 Consider speeding up the hierarchy traversal code in RayCast()
📖
29:54 Launch VTune
🏃
30:59 Set up to write our lighting data out to file
📖
33:50 Make BuildSpatialPartitionForLighting() write out the lighting boxes to file, introducing DEBUGDumpData() and a Dump platform_file_type
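A debug dump of this kind boils down to writing a contiguous array of boxes to disk as raw bytes. The following is a minimal sketch under stated assumptions: it uses C stdio rather than the platform file API that DEBUGDumpData() goes through in the episode, and `sketch_lighting_box`, `DumpBytesToFile` and `DumpLightingBoxes` are hypothetical names with an invented box layout.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical box layout for illustration only.
struct sketch_lighting_box
{
    float Px, Py, Pz; // centre
    float Rx, Ry, Rz; // half-extents
};

// Writes Size bytes from Data to FileName.
// Returns true only if every byte was written and the file closed cleanly.
static bool DumpBytesToFile(const char *FileName, const void *Data, std::size_t Size)
{
    bool Result = false;
    std::FILE *File = std::fopen(FileName, "wb");
    if(File)
    {
        std::size_t Written = std::fwrite(Data, 1, Size, File);
        Result = (Written == Size);
        if(std::fclose(File) != 0)
        {
            Result = false;
        }
    }
    return Result;
}

// The byte size of the array is the element count times the element size.
static bool DumpLightingBoxes(const char *FileName,
                              const sketch_lighting_box *Boxes,
                              std::size_t BoxCount)
{
    return DumpBytesToFile(FileName, Boxes, BoxCount * sizeof(*Boxes));
}
```

A raw dump like this is only readable by a program compiled with the same struct layout, which is acceptable for a throwaway profiling tool built from the same source.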
46:57 Traverse the world out to the dungeon with a view to triggering a debug dump of the lighting boxes
🏃
48:50 RemedyBG feature request: Editable values
🏃
49:06 Enable the LightBoxDumpTrigger
49:22 Dump the lighting boxes to file
🏃
49:50 Create hhlightprof.cpp, adding it to build.bat
56:28 Invoke hhlightprof
🗹
57:12 Fix hhlightprof to correctly get the DumpName
57:21 Add hhlightprof to RemedyBG
🗹
58:52 x13pixels Might have to append EXE? Thought that worked, though
🗪
58:57 Launch hhlightprof in RemedyBG
🏃
59:34 Introduce TestRayCast() in hhlightprof
1:02:36 Make hhlightprof set up the lighting Solution from our dump
1:06:44 Make hhlightprof initialise the SpecAtlas and DiffuseAtlas, and derive the BoxCount from the dump
1:13:41 Hit a write access violation on the Solution
🏃
1:14:06 Initialise the Solution globally
1:14:29 Successfully run hhlightprof
🏃
1:14:56 Step through hhlightprof
🏃
1:15:45 Fix the order of the arguments to fseek()
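The fseek() mistake is easy to make because `fseek(stream, offset, origin)` takes two integer arguments after the stream, so swapping them still compiles. The sketch below shows the usual seek-to-end pattern for sizing and loading a whole file, and how an element count can then be derived from the file size. `loaded_dump`, `LoadEntireFile` and `ElementCountFromDump` are hypothetical names, not the code written in the episode.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical result type: a heap buffer and its size in bytes.
struct loaded_dump
{
    void *Data;
    std::size_t Size;
};

// Reads an entire file into memory. On failure, Data is null and Size is 0.
// The caller releases Data with std::free().
static loaded_dump LoadEntireFile(const char *FileName)
{
    loaded_dump Result = {nullptr, 0};
    std::FILE *File = std::fopen(FileName, "rb");
    if(File)
    {
        // Correct argument order: stream, offset, origin.
        // fseek(File, SEEK_END, 0) also compiles, but on typical
        // implementations (SEEK_SET == 0, SEEK_END == 2) it seeks to
        // byte 2 from the start instead of to the end of the file.
        if(std::fseek(File, 0, SEEK_END) == 0)
        {
            long FileSize = std::ftell(File);
            if((FileSize > 0) && (std::fseek(File, 0, SEEK_SET) == 0))
            {
                std::size_t Size = (std::size_t)FileSize;
                void *Data = std::malloc(Size);
                if(Data && (std::fread(Data, 1, Size, File) == Size))
                {
                    Result.Data = Data;
                    Result.Size = Size;
                }
                else
                {
                    std::free(Data);
                }
            }
        }
        std::fclose(File);
    }
    return Result;
}

// Number of whole elements of ElementSize bytes contained in the dump.
static std::size_t ElementCountFromDump(loaded_dump Dump, std::size_t ElementSize)
{
    return ElementSize ? (Dump.Size / ElementSize) : 0;
}
```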
1:16:03 Step through hhlightprof and inspect the Solution
🏃
1:17:26 Increase the BoxCount in an effort to allow room for all our child boxes
1:18:06 Hit a read access violation on the Box->Radius in BuildSpatialPartitionForLighting()
🏃
1:19:00 Revert the BoxCount and instead allocate memory for four times that number to allow room for child boxes
1:19:27 Hit a read access violation on the Solution->tUpdateBlend in RayCast()
🏃
1:19:34 Make TestRayCast() initialise the Work
1:20:06 Run hhlightprof successfully
🏃
1:20:22 Make TestRayCast() set up the sampling sphere and cast many rays
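For readers who want a feel for what such a test loop does, here is a deliberately simplified sketch: a table of directions is cast from one origin against a set of axis-aligned boxes. It assumes a standard slab test and a brute-force loop over every box, whereas the episode's RayCast() traverses a spatial hierarchy and takes its directions from the sampling sphere. `v3`, `aabb`, `RayHitsBox` and `CastAllRays` are hypothetical names.

```cpp
#include <cstddef>
#include <limits>

struct v3 { float x, y, z; };
struct aabb { v3 Min; v3 Max; };

// Standard slab test. Returns true and writes the entry distance to *tHit
// if the ray Origin + t*Dir (t >= 0) intersects Box.
static bool RayHitsBox(v3 Origin, v3 Dir, aabb Box, float *tHit)
{
    float tMin = 0.0f;
    float tMax = std::numeric_limits<float>::infinity();

    const float O[3] = {Origin.x, Origin.y, Origin.z};
    const float D[3] = {Dir.x, Dir.y, Dir.z};
    const float Lo[3] = {Box.Min.x, Box.Min.y, Box.Min.z};
    const float Hi[3] = {Box.Max.x, Box.Max.y, Box.Max.z};

    for(int Axis = 0; Axis < 3; ++Axis)
    {
        if(D[Axis] == 0.0f)
        {
            // Parallel to this slab: a miss unless the origin lies inside it.
            if((O[Axis] < Lo[Axis]) || (O[Axis] > Hi[Axis])) return false;
        }
        else
        {
            float t0 = (Lo[Axis] - O[Axis]) / D[Axis];
            float t1 = (Hi[Axis] - O[Axis]) / D[Axis];
            if(t0 > t1) { float Temp = t0; t0 = t1; t1 = Temp; }
            if(t0 > tMin) tMin = t0;
            if(t1 < tMax) tMax = t1;
            if(tMin > tMax) return false;
        }
    }

    *tHit = tMin;
    return true;
}

// Casts every direction against every box and returns the number of hits.
static std::size_t CastAllRays(v3 Origin, const v3 *Dirs, std::size_t DirCount,
                               const aabb *Boxes, std::size_t BoxCount)
{
    std::size_t HitCount = 0;
    for(std::size_t DirIndex = 0; DirIndex < DirCount; ++DirIndex)
    {
        for(std::size_t BoxIndex = 0; BoxIndex < BoxCount; ++BoxIndex)
        {
            float tHit;
            if(RayHitsBox(Origin, Dirs[DirIndex], Boxes[BoxIndex], &tHit))
            {
                ++HitCount;
            }
        }
    }
    return HitCount;
}
```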
1:23:19 Run hhlightprof, casting all its rays
🏃
1:23:26 Prepare to cast enough rays to last a minimum of 10 seconds
📖
1:25:15 Make TestRayCast() multiply the rays cast by 256
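The multiplier exists to stretch the run so that a profiler has enough time to gather useful data. The sketch below shows one way to scale a workload and measure its duration. The loop body is a stand-in for a ray cast, and `workload_result`, `RunScaledWorkload` and `EstimateMultiplier` are hypothetical names. The episode tunes the multiplier by hand; the estimate function is an addition for illustration.

```cpp
#include <chrono>
#include <cstdint>

struct workload_result
{
    std::uint64_t RaysCast;
    double Seconds;
    std::uint64_t Checksum; // consumed result, so the loop is not optimised away
};

// Runs BaseRayCount * Multiplier iterations of a stand-in workload and times it.
static workload_result RunScaledWorkload(std::uint64_t BaseRayCount, std::uint64_t Multiplier)
{
    workload_result Result = {0, 0.0, 0};
    auto Start = std::chrono::steady_clock::now();

    std::uint64_t TotalRays = BaseRayCount * Multiplier;
    std::uint64_t State = 0x9E3779B97F4A7C15ull;
    for(std::uint64_t RayIndex = 0; RayIndex < TotalRays; ++RayIndex)
    {
        // Stand-in for one ray cast: a cheap xorshift step.
        State ^= State << 13;
        State ^= State >> 7;
        State ^= State << 17;
        Result.Checksum += State & 1;
        ++Result.RaysCast;
    }

    auto End = std::chrono::steady_clock::now();
    Result.Seconds = std::chrono::duration<double>(End - Start).count();
    return Result;
}

// Given a measured run, estimates the multiplier needed to reach TargetSeconds,
// assuming run time scales linearly with the number of rays.
static std::uint64_t EstimateMultiplier(workload_result Measured,
                                        std::uint64_t MeasuredMultiplier,
                                        double TargetSeconds)
{
    if(Measured.Seconds <= 0.0) return MeasuredMultiplier;
    double Scale = TargetSeconds / Measured.Seconds;
    std::uint64_t Estimate = (std::uint64_t)((double)MeasuredMultiplier * Scale + 0.5);
    return Estimate ? Estimate : 1;
}
```

Any such estimate only holds for one build configuration, since an optimised build can run the same workload many times faster than a debug build.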
1:25:25 Run hhlightprof for just over 10 seconds, without completing
🏃
1:25:36 Decrease the ray multiplier from 256 to 32 in TestRayCast()
1:25:45 Run hhlightprof for almost 10 seconds, without completing
🏃
1:25:54 Decrease the ray multiplier from 32 to 8 in TestRayCast()
1:26:02 Run hhlightprof for 9 seconds, to completion
🏃
1:26:12 Decrease the ray multiplier from 8 to 4 in TestRayCast()
1:26:20 Run hhlightprof for 5 seconds, to completion
🏃
1:26:25 Prepare to time our ray caster in VTune
🗩
1:28:12 Create a project in VTune for hhlightprof
🗹
1:28:49 A few words on the sheer plethora of performance counters
🗩
1:30:44 Set up our project for hhlightprof
🗹
1:34:03 Run hhlightprof in VTune
🏃
1:36:59 Run a -O2 build of hhlightprof in VTune
🏃
1:37:11 Increase the ray multiplier from 8 to 32 in TestRayCast()
1:37:28 Run hhlightprof for under 1 second, to completion
🏃
1:37:30 Increase the ray multiplier from 32 to 256 in TestRayCast()
1:37:39 Run hhlightprof for 5 seconds, to completion
🏃
1:37:46 Run hhlightprof in VTune
🏃
1:39:41 Check the Hotspots of hhlightprof in VTune
🏃
1:41:45 Microarchitecture Exploration in VTune
🏃
1:43:15 Run a Microarchitecture Exploration of hhlightprof in VTune
🏃
1:54:00 Run a Memory Access analysis of hhlightprof in VTune
🏃
1:55:09 Reflect on our isolated ray caster
🗩
1:55:41 Q&A
🗩
1:56:23 yurasniper Q: Might be a good idea to explain the difference between sampling and instrumentation profilers and how they work on some basic level, and why sampling profiling is not great idea, despite most people believing and saying that it is very good
🗪
1:58:39 lucid_frost Q: VTune organizes those metrics by something called the "top-down performance analysis methodology". There is a pretty detailed paper that introduced this that would likely help
🗪
1:58:49 dragoonx6 Q: Have you ever tried using clang-cl? It's a drop-in MSVC compatible compiler that has much better codegen than MSVC CL. It's compatible with link.exe, but even lld-link will give you usable PDBs. When I used it in my ray tracer, it ended up being 15 times faster than with regular MSVC CL
🗪
2:01:00 euphius Q: Games like CS:GO go up to 300fps. Are they pretty good optimized? Seems like getting Handmade Hero to that FPS would be hard?
🗪
2:01:43 robgeel Q: I think you never use the sphere sampling direction in hhlightprof, also when dumping boxes, you write out Solution->BoxCount * sizeof(Solution->Boxes), the sizeof takes the size of a pointer instead of a lighting_box
🗪
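The sizeof issue raised in that question can be shown in isolation. When `Boxes` is a pointer, `sizeof(Solution->Boxes)` is the size of the pointer itself, so multiplying it by the count under-reports the array size whenever an element is larger than a pointer. The sketch below uses hypothetical types (`sketch_lighting_box`, `sketch_solution`) with an invented field layout; only the `BoxCount` and `Boxes` names come from the question.

```cpp
#include <cstddef>

// Hypothetical element type; the real lighting_box has its own fields.
struct sketch_lighting_box
{
    float P[3];
    float Radius[3];
};

struct sketch_solution
{
    std::size_t BoxCount;
    sketch_lighting_box *Boxes;
};

// The pattern from the question: sizeof applied to the pointer member.
static std::size_t DumpSizeWrong(const sketch_solution *Solution)
{
    return Solution->BoxCount * sizeof(Solution->Boxes);
}

// The fix: sizeof applied to the pointed-to element.
static std::size_t DumpSizeRight(const sketch_solution *Solution)
{
    return Solution->BoxCount * sizeof(*Solution->Boxes);
}
```

Writing `sizeof(*Pointer)` rather than naming the type keeps the expression correct if the element type is later renamed or changed.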
2:01:47 Fix the SampleDirB setting in TestRayCast()
2:02:04 lobsang2 Q: What's the status of meowhash? Will it be reaching a new version soon?
🗪
2:02:51 jim0_o Q: Have you tried debugging why the stream loses so many frames when you move the character around?
🗪
2:03:44 kniffel5 Q: What should meowhash (not) be used for? [2]
🗪
2:07:27 desu_used Q: Are you sure the hash is "secure"? People have previously pointed out some issues with meowhash, if I recall correctly, generating collisions [3]
🗪
2:10:03 brian_nevec Q: What do you use meowhash for?
🗪
2:10:14 temdisponivel Q: Would Handmade Hero run on a 32-bit system as it is now, or would it need porting?
🗪
2:10:22 dragoonx6 Even for security?
🗪
2:10:58 vtlmks Q: Not using the SampleDir?
🗪
2:11:01 Fix TestRayCast() to set (and use) RayD
2:11:22 kniffel5 Q: Is meowhash cross platform? For ARM, PowerPC, etc?
🗪
2:12:29 mindmark42 Q: Any reason we're not using the checkerboard rendering?
🗪
2:12:54 Begin to wind down the stream, with a plug of the upcoming Jon and Sean talk [4]
🗩
2:14:09 rationalcoder Q: You use meowhash for normal hash tables in your everyday code, strings, vectors, etc?
🗪
2:14:39 Anticipate the Jon and Sean talk [5]
🗩
2:15:32 Wind down the stream
🗩