Memory Barriers and Semaphores
Semaphores:
A semaphore is essentially a number that the operating system keeps track of, which can be incremented and decremented. When you wait on a semaphore (on Win32, with one of the wait functions such as WaitForSingleObject(); written Wait() here for short), you're telling the OS to put your thread to sleep until the semaphore's number is greater than zero. Once it is, the Wait() call returns and the thread can do something. A successful Wait() also decrements the number by one, and the check and the decrement happen as a single atomic step, so two threads can never both claim the same increment.
Calling ReleaseSemaphore(), maybe a little counterintuitively, increments the semaphore. The name makes sense if you read it as releasing waiting threads to do work: raising the number lets threads that are waiting on the semaphore continue. Making the number go up is the only thing it changes about the semaphore's state. In cases like the one demonstrated on stream, the number goes up and down very quickly and stays close to 0, because most of the time the threads are waiting for the semaphore to be incremented.
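The counting behavior described above can be sketched in a few lines. This is only an illustration, not the code from the stream: it assumes Python's `threading.Semaphore` as a stand-in, where `acquire()` plays the role of Wait() and `release()` plays the role of ReleaseSemaphore(). The function name `semaphore_counter_demo` is made up for this example.

```python
import threading

def semaphore_counter_demo():
    """Show how waits and releases move the semaphore's count.

    Illustrative only: threading.Semaphore stands in for an OS semaphore.
    """
    sem = threading.Semaphore(0)  # the count starts at 0
    results = []

    # Count is 0, so a non-blocking wait fails instead of sleeping.
    results.append(sem.acquire(blocking=False))

    sem.release()  # count 0 -> 1
    sem.release()  # count 1 -> 2

    # Each successful wait takes one unit off the count.
    results.append(sem.acquire(blocking=False))  # count 2 -> 1
    results.append(sem.acquire(blocking=False))  # count 1 -> 0

    # Back at 0: the next wait would have to sleep, so this one fails.
    results.append(sem.acquire(blocking=False))
    return results

if __name__ == "__main__":
    print(semaphore_counter_demo())
```

The non-blocking form is used here only so the example can report what would happen without actually putting a thread to sleep; a real worker thread would use the blocking wait.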
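What this allows you to do is tell several threads that some work is ready without having to signal each one individually. As long as they are all waiting on the same semaphore object, each increment lets one waiting thread through, so raising the number by the amount of new work wakes up to that many threads, and any thread that is still busy will pick up the leftover count the next time it waits.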
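A minimal sketch of that pattern, again assuming Python's `threading.Semaphore` rather than the Win32 API used on stream. The names `run_workers`, `work_available`, and the `None` shutdown marker are all hypothetical choices for this example. Each queued item is matched by exactly one `release()`, so a thread that gets through the wait is guaranteed to find something in the queue.

```python
import threading
from collections import deque

def run_workers(num_workers, jobs):
    """Square each job on a pool of threads that share one semaphore."""
    work_available = threading.Semaphore(0)  # count = items waiting in queue
    queue = deque()
    lock = threading.Lock()  # protects queue and done
    done = []

    def worker():
        while True:
            work_available.acquire()  # sleep until count > 0, then take one
            with lock:
                job = queue.popleft()
            if job is None:  # hypothetical shutdown marker
                return
            with lock:
                done.append(job * job)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Publish the work first, then raise the count once per item.
    with lock:
        queue.extend(jobs)
    for _ in jobs:
        work_available.release()

    # One shutdown marker per worker, queued behind the real jobs.
    with lock:
        queue.extend([None] * num_workers)
    for _ in range(num_workers):
        work_available.release()

    for t in threads:
        t.join()
    return sorted(done)

if __name__ == "__main__":
    print(run_workers(4, [1, 2, 3, 4, 5]))
```

Note that the producer never addresses a particular thread: it only raises the count, and whichever workers are waiting get through. The lock here sidesteps the memory-ordering questions a lock-free queue would raise; it is a simplification for the sketch.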