Introduction to Function Approximation with Andrew Bromage

?

?# Keyboard Navigation

## Global Keys

[, < / ], > Jump to previous / next episode

W, K, P / S, J, N Jump to previous / next marker

t / T Toggle theatre / SUPERtheatre mode

V Revert filter to original state Y Select link (requires manual Ctrl-c)## Menu toggling

q Quotes
r References
f Filter
y Link
c Credits
## In-Menu Movement

## Quotes and References Menus

Enter Jump to timecode

## Quotes, References and Credits Menus

o Open URL (in new tab)
## Filter Menu

x, Space Toggle category and focus next

X, ShiftSpace Toggle category and focus previous

v Invert topics / media as per focus## Filter and Link Menus

z Toggle filter / linking mode
## Credits Menu

Enter Open URL (in new tab)

W, K, P / S, J, N Jump to previous / next marker

t / T Toggle theatre / SUPERtheatre mode

V Revert filter to original state Y Select link (requires manual Ctrl-c)

a

w

s

s

d

h
j
k
l

←

↑

↓

↓

→

X, ShiftSpace Toggle category and focus previous

v Invert topics / media as per focus

⏫

Previous: 'Testing Better Entropy'

⏫

0:00Welcome to a special episode with Andrew Bromage

0:00Welcome to a special episode with Andrew Bromage

0:00Welcome to a special episode with Andrew Bromage

2:42 : How floating point numbers are represented by a computer^{1}

2:42 : How floating point numbers are represented by a computer^{1}

2:42 : How floating point numbers are represented by a computer^{1}

4:36 : Scientific notation, and how IEEE 754 represents numbers in binary

4:36 : Scientific notation, and how IEEE 754 represents numbers in binary

4:36 : Scientific notation, and how IEEE 754 represents numbers in binary

7:14Get Andrew back

🗹

7:14Get Andrew back

🗹

7:14Get Andrew back

🗹

9:27 : Return

9:27 : Return

9:27 : Return

10:36 : Continuing IEEE 754's^{2} representation of floating point numbers^{3}

10:36 : Continuing IEEE 754's^{2} representation of floating point numbers^{3}

10:36 : Continuing IEEE 754's^{2} representation of floating point numbers^{3}

16:20 : Subnormal numbers, the special-case numbers infinity, quiet NaN and signaling NaN, and the quality of being "algebraically closed"

16:20 : Subnormal numbers, the special-case numbers infinity, quiet NaN and signaling NaN, and the quality of being "algebraically closed"

24:10 : Any questions?

24:10 : Any questions?

24:10 : Any questions?

24:35Is it just a peculiarity of binary as a number system, that you can skip encoding the leading digit?

24:35Is it just a peculiarity of binary as a number system, that you can skip encoding the leading digit?

27:28 : Note the binary and decimal representations of floating point numbers in the IEEE 754 standard^{4}

27:28 : Note the binary and decimal representations of floating point numbers in the IEEE 754 standard^{4}

27:51 : Constants definitions in handmade_numerics.h

📖

27:51 : Constants definitions in handmade_numerics.h

📖

27:51 : Constants definitions in handmade_numerics.h

📖

30:22 : Constants definitions in C's float.h^{5} as compared with those in handmade_numerics.h, with a special mention of machine epsilon

📖

30:22 : Constants definitions in C's float.h^{5} as compared with those in handmade_numerics.h, with a special mention of machine epsilon

📖

📖

33:32 : Describe the IEEEBinary32 union, ieee754_number_category enum and the special-case number functions

📖

33:32 : Describe the IEEEBinary32 union, ieee754_number_category enum and the special-case number functions

📖

📖

36:48 : Describe Real32_Abs(), Real32_SetSign() and CategoryOfReal32()

📖

36:48 : Describe Real32_Abs(), Real32_SetSign() and CategoryOfReal32()

📖

36:48 : Describe Real32_Abs(), Real32_SetSign() and CategoryOfReal32()

📖

39:07 : Describe ExtractExponent() as similar to the CRT's frexp()^{6} with an example of its use in a sqrt() function

📖

39:07 : Describe ExtractExponent() as similar to the CRT's frexp()^{6} with an example of its use in a sqrt() function

📖

📖

47:36When multiplying a subnormal number by a power of two, does the floating point unit first shift the numbers into the normal range before incrementing the exponent?

47:36When multiplying a subnormal number by a power of two, does the floating point unit first shift the numbers into the normal range before incrementing the exponent?

50:38 : Describe ScaleByExponent()

📖

50:38 : Describe ScaleByExponent()

📖

50:38 : Describe ScaleByExponent()

📖

53:58 : Note the differing range of absolute values of the mantissa in text books (as used in handmade_numerics.h) and the CRT's frexp()^{7}

📖

53:58 : Note the differing range of absolute values of the mantissa in text books (as used in handmade_numerics.h) and the CRT's frexp()^{7}

📖

📖

57:01 : Quote James H. Wilkinson on the state of computer arithmetic in 1971

57:01 : Quote James H. Wilkinson on the state of computer arithmetic in 1971

57:01 : Quote James H. Wilkinson on the state of computer arithmetic in 1971

58:30 : Describe SlowDivision(), with emphasis on the sheer amount of specification compliance it contains^{8}

📖

58:30 : Describe SlowDivision(), with emphasis on the sheer amount of specification compliance it contains^{8}

📖

📖

1:06:29 : Walk through an example of SlowDivision(), noting why it uses 11 bits of precision in its application of Horner's rule^{9} (IA-32's RCPSS instruction^{10})

📖

1:06:29 : Walk through an example of SlowDivision(), noting why it uses 11 bits of precision in its application of Horner's rule^{9} (IA-32's RCPSS instruction^{10})

📖

📖

1:14:13 : How SlowDivision() finishes up its computation to the highest precision it can

📖

1:14:13 : How SlowDivision() finishes up its computation to the highest precision it can

📖

1:14:13 : How SlowDivision() finishes up its computation to the highest precision it can

📖

1:16:54 : Note the difference between SlowDivision() and how our FPU performs division

📖

1:16:54 : Note the difference between SlowDivision() and how our FPU performs division

📖

1:16:54 : Note the difference between SlowDivision() and how our FPU performs division

📖

1:19:13 : Calculating those polynomial approximations from SlowDivision(), with range illustrations in Mathematica

📖

1:19:13 : Calculating those polynomial approximations from SlowDivision(), with range illustrations in Mathematica

📖

📖

1:25:12 : Relative vs absolute error

📖

1:25:12 : Relative vs absolute error

📖

1:25:12 : Relative vs absolute error

📖

1:26:43 : Plot the error function in Mathematica, with a mention of Chebyshev's Equioscillation theorem^{11} and Chebyshev nodes^{12}

📖

1:26:43 : Plot the error function in Mathematica, with a mention of Chebyshev's Equioscillation theorem^{11} and Chebyshev nodes^{12}

📖

📖

1:31:35 : Plot the eighth order Chebyshev polynomial in the range -1 to 1 in Mathematica

📖

1:31:35 : Plot the eighth order Chebyshev polynomial in the range -1 to 1 in Mathematica

📖

1:31:35 : Plot the eighth order Chebyshev polynomial in the range -1 to 1 in Mathematica

📖

1:33:47 : Using the Remez exchange algorithm^{13} to find approximations to Chebyshev's set of polynomials

📖

1:33:47 : Using the Remez exchange algorithm^{13} to find approximations to Chebyshev's set of polynomials

📖

📖

1:38:21On the initial guesses in the Remez exchange algorithm

1:38:21On the initial guesses in the Remez exchange algorithm

1:38:21On the initial guesses in the Remez exchange algorithm

1:39:15 : Plot the ninth order Chebyshev polynomial, for comparison with the eighth order, to explain extrema

📖

1:39:15 : Plot the ninth order Chebyshev polynomial, for comparison with the eighth order, to explain extrema

📖

📖

1:40:20Struggle with the communication link

💢

🗩

1:40:20Struggle with the communication link

💢

🗩

1:40:20Struggle with the communication link

💢

🗩

1:41:49As the Remez exchange algorithm proceeds, what values does it use as its new guesses?

1:41:49As the Remez exchange algorithm proceeds, what values does it use as its new guesses?

1:41:49As the Remez exchange algorithm proceeds, what values does it use as its new guesses?

1:42:24 : Searching for the extremum, perhaps using golden-section search,^{14} as the Remez exchange algorithm proceeds

📖

1:42:24 : Searching for the extremum, perhaps using golden-section search,^{14} as the Remez exchange algorithm proceeds

📖

📖

1:46:50 : Plot sine in Mathematica, and introduce the computation of sine in the range 0 to 2π

📖

1:46:50 : Plot sine in Mathematica, and introduce the computation of sine in the range 0 to 2π

📖

1:46:50 : Plot sine in Mathematica, and introduce the computation of sine in the range 0 to 2π

📖

1:56:57 : Describe SinCos_TableVersion()

📖

1:56:57 : Describe SinCos_TableVersion()

📖

1:56:57 : Describe SinCos_TableVersion()

📖

1:58:31 : Deriving trigonometric identities

1:58:31 : Deriving trigonometric identities

1:58:31 : Deriving trigonometric identities

2:02:28 : Calculating cosine around "a", branch-free

2:02:28 : Calculating cosine around "a", branch-free

2:02:28 : Calculating cosine around "a", branch-free

2:04:00 : Point out the experimental SinCos_QuadrantVersion() for SIMD sines and cosines

📖

2:04:00 : Point out the experimental SinCos_QuadrantVersion() for SIMD sines and cosines

📖

2:04:00 : Point out the experimental SinCos_QuadrantVersion() for SIMD sines and cosines

📖

2:05:54 : Describe the XSinCosX table look up

📖

2:05:54 : Describe the XSinCosX table look up

📖

2:05:54 : Describe the XSinCosX table look up

📖

2:07:56 : Counting has always started at zero^{15}

📖

2:07:56 : Counting has always started at zero^{15}

📖

2:07:56 : Counting has always started at zero^{15}

📖

2:08:47 : Continued description of the XSinCosX table look up

📖

2:08:47 : Continued description of the XSinCosX table look up

📖

2:08:47 : Continued description of the XSinCosX table look up

📖

2:10:38 : Run calculate_sincos_tables and explain the result for .5

🏃

2:10:38 : Run calculate_sincos_tables and explain the result for .5

🏃

2:10:38 : Run calculate_sincos_tables and explain the result for .5

🏃

2:12:10 : Describe how FindSinCosAround() searches adjacent floating point numbers

📖

2:12:10 : Describe how FindSinCosAround() searches adjacent floating point numbers

📖

2:12:10 : Describe how FindSinCosAround() searches adjacent floating point numbers

📖

2:15:55Could you explain why part of the table version looks up into XSinCosX while the polynomial part is the same no matter where you are in the table?

2:15:55Could you explain why part of the table version looks up into XSinCosX while the polynomial part is the same no matter where you are in the table?

2:17:10 : Further explain the table lookups of cos(a) and sin(a) and the sin(e) and cos(e) polynomial approximations

📖

2:17:10 : Further explain the table lookups of cos(a) and sin(a) and the sin(e) and cos(e) polynomial approximations

📖

📖

2:19:46Since table lookups are hard in SIMD, what sort of stuff would you end up doing if you couldn't use a table?

2:19:46Since table lookups are hard in SIMD, what sort of stuff would you end up doing if you couldn't use a table?

2:20:09 : Describe the experimental SinCos_QuadrantVersion()

📖

2:20:09 : Describe the experimental SinCos_QuadrantVersion()

📖

2:20:09 : Describe the experimental SinCos_QuadrantVersion()

📖

2:29:32 : Describe ATan() and ATan2(), noting the current use of atan2 in Handmade Hero

📖

2:29:32 : Describe ATan() and ATan2(), noting the current use of atan2 in Handmade Hero

📖

2:29:32 : Describe ATan() and ATan2(), noting the current use of atan2 in Handmade Hero

📖

2:35:01Determine to remove atan2 from Handmade Hero with thanks to Andrew

2:35:01Determine to remove atan2 from Handmade Hero with thanks to Andrew

2:35:01Determine to remove atan2 from Handmade Hero with thanks to Andrew

2:35:31Q&A

2:35:31Q&A

2:35:31Q&A

2:43:17 : Note the educational nature of this sine and cosine implementation, with a mention of Cody-Waite reduction

2:43:17 : Note the educational nature of this sine and cosine implementation, with a mention of Cody-Waite reduction

2:46:51SSE denormal flushing

2:46:51SSE denormal flushing

2:46:51SSE denormal flushing

2:50:34 : Re-emphasise the slowness of SlowDivision()

📖

2:50:34 : Re-emphasise the slowness of SlowDivision()

📖

2:50:34 : Re-emphasise the slowness of SlowDivision()

📖

2:56:22 : Quote the Intel 64 and IA-32 Architectures Optimization Reference Manual:^{17} "Although x87 supports transcendental instructions, software library implementation of transcendental function can be faster in many cases"

📖

2:56:22 : Quote the Intel 64 and IA-32 Architectures Optimization Reference Manual:^{17} "Although x87 supports transcendental instructions, software library implementation of transcendental function can be faster in many cases"

📖

📖

2:57:04Thank you to Andrew for walking us through sine and cosine, with closing thoughts on numerical approximations

2:57:04Thank you to Andrew for walking us through sine and cosine, with closing thoughts on numerical approximations

⏬

Next: 'Never, Ever Update Your Development Tools. Ever.'

⏬