Fundamentals of Audio and Video Programming for Games (Pro-Developer)
After running Circular Streams.exe, you'll notice that there are slots for three sounds, each with its own Play and Stop buttons (see Figure 6.1). To make the first test easy, we have added the Load Presets button, which will select three appropriate files (musical tracks in this case) and load them in with a single click. Click Play for one of the sounds, and then for the other two. As you randomly click Play and Stop, you should hear the music tracks smoothly start and stop without interfering with each other and without any sound glitches.
Use the Sound File button to load in alternative sounds, and test the sample again. Again, note the smooth starting, stopping, and looping of tracks.
Behind the scenes, what is happening is that each of the three slots in the UI matches a one-second buffer. Each time that you press Play, the first second of a track is loaded and begins playing. This buffer is topped up from the file on the hard disk around every tenth of a second. If all three streams are playing, then this topping up is going on for all three of them. This was the main design goal of this sample, to demonstrate multiple streams working together, and not tripping each other up.
If you have a small number of sounds in your project, you will want to have them in static (non-streaming) buffers, to maximize performance and simplify the coding. However, there are several situations where you might want to consider using streaming buffers. These include music and speech, because they are probably long files and are often sounds to which you do not want to apply special effects or 3-D processing. You may also want to use streaming when you have a very large number of sound effects, which is typical of many games, and want to maximize the performance (that is, minimize latency and glitching) of all of them.
To summarize, we suggest the use of streaming buffers in the following specific cases:
- If you have music tracks, then set up a single streaming buffer for each to play the music smoothly regardless of the length of the files.
- If you want to be more adventurous, try the idea of separating your music tracks into bass, percussion, sax, drums and so on, and then play one or more of these tracks together to create a certain atmosphere. At certain times during your game, you can turn on or off one or more of the tracks that make up the theme music of your game to add a different feel to the music. This requires multiple software buffers similar to the Circular Streams sample. However, note that it is not possible to synchronize the playing of any buffers exactly using DirectSound, so compose your music accordingly.
- For human (or even non-human) dialog that is lengthy (more than a few seconds long), consider setting up one stream per speaker.
- Use streaming when your application requires a very large number of sounds, and you want most or all of them to be created using hardware buffers. You should be careful not to create very small streaming buffers, because there are a number of impenetrable issues that lead to glitching if the buffers are too small (the recommended minimum is 100ms for a streaming buffer).
So with these specific applications in mind, we'll crack open the code.
The Circular Streams Project
Before we can make any progress in writing a polling-based streaming sample, we need to define a new class that supports this technique. We have called it CCircularSound, and it is similar to CStreamingSound. If you open the
class CCircularSound : public CSound
{
protected:
    long m_lWritePosition;
    long circularDistance(long lValue);

public:
    CCircularSound( LPDIRECTSOUNDBUFFER pDSBuffer, DWORD dwDSBufferSize,
                    CWaveFile* pWaveFile );
    ~CCircularSound();

    HRESULT TopUpBuffer(DWORD index);
    HRESULT Reset();
};
The declaration is noticeably simpler than the CStreamingSound class, with just a single protected variable, m_lWritePosition, which acts as a cursor pointing to the byte at which the next write should start. The constructor does nothing more than set m_lWritePosition to 0 and call the constructor of the CSound class.
CCircularSound::CCircularSound( LPDIRECTSOUNDBUFFER pDSBuffer,
                                DWORD dwDSBufferSize,
                                CWaveFile* pWaveFile )
    : CSound( &pDSBuffer, dwDSBufferSize, 1, pWaveFile, 0 )
{
    m_lWritePosition = 0;
}
The short protected method, circularDistance, wraps a value that lies up to one buffer length outside the buffer back into range. For example, the call circularDistance( m_dwDSBufferSize ) will return the value 0, as the boundaries of the buffer are from 0 to m_dwDSBufferSize - 1.
long CCircularSound::circularDistance(long lValue)
{
    // Values below the buffer wrap up; values past the end wrap down.
    if (lValue < 0)
        return lValue + m_dwDSBufferSize;
    if (lValue >= (long) m_dwDSBufferSize)
        return lValue - m_dwDSBufferSize;
    return lValue;
}
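To make the wrapping behavior concrete, here is a minimal standalone sketch of the same arithmetic, written without any DirectSound dependency. The names WrapIntoBuffer and ForwardDistance are our own illustrations and are not part of the sample; the buffer size is passed in rather than read from a member variable.

```cpp
#include <cassert>

// Wraps lValue into [0, lBufferSize), assuming it is at most one buffer
// length outside that range (the same assumption circularDistance makes).
long WrapIntoBuffer(long lValue, long lBufferSize)
{
    assert(lBufferSize > 0);
    assert(lValue >= -lBufferSize && lValue < 2 * lBufferSize);
    if (lValue < 0)
        return lValue + lBufferSize;
    if (lValue >= lBufferSize)
        return lValue - lBufferSize;
    return lValue;
}

// Number of bytes from lFrom forward (in the direction of play) to lTo,
// taking the wraparound at the end of the buffer into account.
long ForwardDistance(long lFrom, long lTo, long lBufferSize)
{
    return WrapIntoBuffer(lTo - lFrom, lBufferSize);
}
```

With a 1000-byte buffer, for instance, a cursor at offset 900 is 200 bytes behind a cursor at offset 100, because the distance is measured forward through the end of the buffer.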
All the serious processing in the class is encapsulated in one method, TopUpBuffer, which does all the polling and topping up of a sound buffer.
HRESULT CCircularSound::TopUpBuffer(DWORD index)
{
    DWORD dwCurrentPlay;   // Current play cursor.
    DWORD dwCurrentWrite;  // Current safe write cursor
                           // (not used in this sample).
    long  lSafeWriteTo;    // Upper safe-writing limit.
    long  lCurrentPlay;    // Signed copy of play cursor.
    long  actualRead;      // Actual number of bytes written.
    BOOL  fEOF = FALSE;    // End of File flag.

    if( m_apDSBuffer[0] == NULL || m_pWaveFile == NULL )
        return CO_E_NOTINITIALIZED;

    // Retrieve the current play position.
    m_apDSBuffer[0]->GetCurrentPosition( &dwCurrentPlay, &dwCurrentWrite );

    // Make sure calculations work with negative numbers.
    lCurrentPlay = (long) dwCurrentPlay;

    // Ensure a 32-byte safety margin (some cards
    // mis-report their play position).
    if (circularDistance(lCurrentPlay - m_lWritePosition) > 32)
    {
        lSafeWriteTo = circularDistance(lCurrentPlay - 32);

        // Do not top up the buffer if less than 10% of the buffer
        // size would be written.
        if (circularDistance(lSafeWriteTo - m_lWritePosition) >
            (long) m_dwDSBufferSize / 10)
        {
            // Test to see if the block to write is contiguous.
            if (lSafeWriteTo > m_lWritePosition)
            {
                // Top up the buffer.
                actualRead = PartFillBufferWithSound(0, m_lWritePosition,
                                                     lSafeWriteTo, &fEOF);

                // Update the write position depending on the number
                // of bytes actually written.
                m_lWritePosition =
                    circularDistance(m_lWritePosition + actualRead);
            }
            else
            {
                // Wrap around case.
                // First top up the end of the buffer.
                actualRead = PartFillBufferWithSound(0, m_lWritePosition,
                                 (long) m_dwDSBufferSize - 1, &fEOF);
                if (!fEOF)
                {
                    // Second top up the beginning of the buffer.
                    actualRead = PartFillBufferWithSound(0, 0L,
                                                         lSafeWriteTo, &fEOF);
                    m_lWritePosition = actualRead;
                }
                else
                    m_lWritePosition =
                        circularDistance(m_lWritePosition + actualRead);
            }

            // End of file reached so reset the read pointer into the
            // wave file back to the start.
            if (fEOF)
                m_pWaveFile->ResetFile();
        }
    }
    return S_OK;
}
After the usual checks to see that we have valid parameters, the TopUpBuffer method calls the DirectSound SDK method, GetCurrentPosition. The GetCurrentPosition method takes pointers to two DWORDs as parameters, and fills in the current play position and the safe-write position for the buffer. The current play position is what we want: the cursor position pointing to where the sound buffer is currently being played from. The next statement in our method copies this into a long value, because we need a signed type for calculations that may involve negative numbers. Unfortunately, not all sound cards report the play position very accurately, and some are known to misreport it by up to 32 bytes. Because of this, we subtract 32 bytes from the current play position to get a safe limit for writing. Notice how we use the circularDistance method to ensure that the calculation works when the pointers are wrapping around the buffer.
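The safe-limit calculation can be tried out in isolation. The following is a sketch only, assuming the play and write positions are already known as byte offsets; the SafeRegion type and the ComputeSafeRegion and Wrap functions are hypothetical names introduced for illustration, and the 32-byte margin is the figure used in the sample.

```cpp
#include <cassert>

// Result of the safe-region calculation (hypothetical helper type).
struct SafeRegion
{
    bool hasRoom;        // False if the writer is within the margin of the play cursor.
    long safeWriteTo;    // Upper safe-writing limit (meaningful only if hasRoom).
    long bytesAvailable; // Distance from the write position to safeWriteTo.
};

// Wraps a value that is at most one buffer length out of range.
static long Wrap(long lValue, long lBufferSize)
{
    if (lValue < 0)
        return lValue + lBufferSize;
    if (lValue >= lBufferSize)
        return lValue - lBufferSize;
    return lValue;
}

// Mirrors the first steps of TopUpBuffer: back off lMargin bytes from the
// play cursor and measure how far the write position is from that limit.
SafeRegion ComputeSafeRegion(long lPlay, long lWrite, long lBufferSize,
                             long lMargin = 32)
{
    assert(lBufferSize > lMargin);
    assert(lPlay >= 0 && lPlay < lBufferSize);
    assert(lWrite >= 0 && lWrite < lBufferSize);

    SafeRegion region = { false, lWrite, 0 };
    if (Wrap(lPlay - lWrite, lBufferSize) > lMargin)
    {
        region.hasRoom = true;
        region.safeWriteTo = Wrap(lPlay - lMargin, lBufferSize);
        region.bytesAvailable = Wrap(region.safeWriteTo - lWrite, lBufferSize);
    }
    return region;
}
```

For a 1000-byte buffer with the play cursor at 20 and the write position at 900, the safe limit wraps back to offset 988, leaving 88 bytes that can be topped up.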
As a matter of interest, we do not use the dwCurrentWrite value returned by the GetCurrentPosition call. This is a pointer to where it is safe to write to the buffer ahead of the play cursor. This can be used in a different style of polling model called After-Write-Cursor, where the buffer is topped up in advance of the play cursor to a certain number of milliseconds of playing time.
In practice, though, this model tends to have very low latency (which is good) but at the expense of having more glitches. The technique that we are using to top up the buffer is called Before-Play-Cursor, which tends to result in slightly higher latency than After-Write-Cursor, but with fewer glitches. The After-Write-Cursor model is appropriate mainly where audio data is being generated on the fly, which is not the case for most applications, so the Before-Play-Cursor model is usually the appropriate polling model.
It doesn't make sense to top up the buffer when only a trivial amount of data would be written. For this sample, we have simply defined a trivial amount to be less than ten percent of the length of the buffer. The recommended minimum (sometimes referred to as the wake-up time) is 10ms.
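To relate the ten-percent rule to the 10ms wake-up time, it helps to convert durations into byte counts. The sketch below assumes uncompressed PCM audio; MillisecondsToBytes and WorthToppingUp are illustrative names of our own, and the parameters correspond to the nSamplesPerSec and nBlockAlign fields of the wave format.

```cpp
#include <cassert>

// Converts a duration to a byte count, rounded down to a whole sample frame.
// dwSamplesPerSec and dwBlockAlign correspond to the wave format fields
// nSamplesPerSec and nBlockAlign.
unsigned long MillisecondsToBytes(unsigned long dwMilliseconds,
                                  unsigned long dwSamplesPerSec,
                                  unsigned long dwBlockAlign)
{
    assert(dwBlockAlign > 0);
    unsigned long long frames =
        (unsigned long long) dwSamplesPerSec * dwMilliseconds / 1000ULL;
    return (unsigned long) (frames * dwBlockAlign);
}

// The sample's rule: only top up when more than a tenth of the buffer
// could be written.
bool WorthToppingUp(unsigned long dwBytesAvailable, unsigned long dwBufferSize)
{
    return dwBytesAvailable > dwBufferSize / 10;
}
```

For 44.1 kHz, 16-bit stereo audio (4 bytes per sample frame), a one-second buffer is 176,400 bytes, so the ten-percent threshold is 17,640 bytes, or 100ms of audio. The 10ms wake-up time corresponds to 1,764 bytes, which is what ten percent works out to for a 100ms buffer.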
If our calculations tell us that it is time to top up the buffer, the next thing we do is check whether we have the simple case of filling a contiguous block of memory in the buffer. The more complex case would involve having to fill some data at the top of the buffer, and then wrap around and start filling from the beginning of the buffer.
In the former case, we make one call to the PartFillBufferWithSound method, which we have added to the CSound class. In the latter case, where a wraparound occurs, we make two calls to the PartFillBufferWithSound method, first filling to the top of the buffer, and then filling from the beginning. However, in both of these cases there is one important check to make: we need to know if the end of the wave file has been reached. If the end-of-file condition is true, we call ResetFile for the wave file to reset its read pointer back to the beginning of the file. Note that when we increment the m_lWritePosition cursor, it is always by the number of bytes actually written, to take into account the end-of-file situation.
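The contiguous and wraparound cases can be exercised without a sound card. The following sketch mirrors the branching described above using in-memory stand-ins: MemorySource plays the role of the wave file and RingBuffer the role of the DirectSound buffer. Both types, and the PartFill and TopUp functions, are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for the wave file reader.
struct MemorySource
{
    std::vector<unsigned char> data;
    std::size_t readPos = 0;

    // Copies up to count bytes into dest; returns the number actually copied.
    std::size_t Read(unsigned char* dest, std::size_t count)
    {
        std::size_t n = std::min(count, data.size() - readPos);
        std::copy(data.begin() + readPos, data.begin() + readPos + n, dest);
        readPos += n;
        return n;
    }
    void Reset() { readPos = 0; }
};

// Hypothetical stand-in for the circular sound buffer.
struct RingBuffer
{
    std::vector<unsigned char> bytes;
    long writePos = 0;
};

// Fills offsets lFrom..lTo inclusive and flags a short read as end of file.
long PartFill(RingBuffer& ring, MemorySource& src, long lFrom, long lTo, bool* pEOF)
{
    assert(lFrom >= 0 && lTo >= lFrom && lTo < (long) ring.bytes.size());
    std::size_t wanted = (std::size_t) (lTo - lFrom + 1);
    std::size_t got = src.Read(ring.bytes.data() + lFrom, wanted);
    *pEOF = (got != wanted);
    return (long) got;
}

// Tops up from ring.writePos to lSafeWriteTo, wrapping if necessary.
void TopUp(RingBuffer& ring, MemorySource& src, long lSafeWriteTo)
{
    const long size = (long) ring.bytes.size();
    bool eof = false;
    long actual = 0;

    if (lSafeWriteTo > ring.writePos)
    {
        // Contiguous case: one fill.
        actual = PartFill(ring, src, ring.writePos, lSafeWriteTo, &eof);
        ring.writePos = (ring.writePos + actual) % size;
    }
    else
    {
        // Wraparound case: fill to the end, then from the beginning.
        actual = PartFill(ring, src, ring.writePos, size - 1, &eof);
        if (!eof)
        {
            actual = PartFill(ring, src, 0, lSafeWriteTo, &eof);
            ring.writePos = actual;
        }
        else
        {
            ring.writePos = (ring.writePos + actual) % size;
        }
    }

    // On end of file, rewind so that the next top-up restarts the sound.
    if (eof)
        src.Reset();
}
```

Because the write position only ever advances by the number of bytes actually read, a short read at the end of the file leaves the cursor exactly where the next top-up should continue with data from the start of the file.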
The TopUpBuffer method is nice and simple. However, before we can use it, we should examine the PartFillBufferWithSound method of the CSound class, and also the CreateCircular method of the CSoundManager class.
DWORD CSound::PartFillBufferWithSound(int index, long lFrom, long lTo,
                                      BOOL *fEOF )
{
    HRESULT hr;
    VOID*   pDSLockedBuffer      = NULL; // Pointer to locked buffer memory.
    DWORD   dwDSLockedBufferSize = 0;    // Size of the locked DirectSound buffer.
    DWORD   dwWavDataRead        = 0;    // Amount of data read from
                                         // the wave file.

    if( m_apDSBuffer[index] == NULL )
        return CO_E_NOTINITIALIZED;

    // Make sure we have focus, and didn't just switch in from
    // an application which had a DirectSound device.
    if( FAILED( hr = RestoreBuffer( m_apDSBuffer[index], NULL ) ) )
        return DXTRACE_ERR( TEXT("RestoreBuffer"), hr );

    // Lock the buffer down.
    if( FAILED( hr = m_apDSBuffer[index]->Lock( 0, m_dwDSBufferSize,
                         &pDSLockedBuffer, &dwDSLockedBufferSize,
                         NULL, NULL, 0L ) ) )
        return DXTRACE_ERR( TEXT("Lock"), hr );

    // Read data into the section of the buffer identified by lFrom and lTo.
    if( FAILED( hr = m_pWaveFile->Read( (BYTE*) pDSLockedBuffer + lFrom,
                         lTo - lFrom + 1, &dwWavDataRead ) ) )
    {
        // Release the lock before bailing out.
        m_apDSBuffer[index]->Unlock( pDSLockedBuffer, dwDSLockedBufferSize,
                                     NULL, 0 );
        return DXTRACE_ERR( TEXT("Read"), hr );
    }

    // Unlock the buffer because we don't need it anymore.
    m_apDSBuffer[index]->Unlock( pDSLockedBuffer, dwDSLockedBufferSize,
                                 NULL, 0 );

    // Flag end of file if fewer bytes were read than requested.
    *fEOF = ( dwWavDataRead != (DWORD) ( lTo - lFrom + 1 ) );

    return dwWavDataRead;
}
The PartFillBufferWithSound method takes as parameters the index of the buffer to be written to (always 0 in this sample), two byte offsets identifying the inclusive range of bytes to write to, and a flag to be set if the end of the file has been reached. The method locks the buffer, reads in the appropriate number of bytes, and then unlocks the buffer. Note that we do not have to alter the Read method of the CWaveFile class; it already does what we want (reads in a section of the wave file) without modification.
In order to create an object of the CCircularSound class, the following method is added to the CSoundManager class.
HRESULT CSoundManager::CreateCircular( CCircularSound** ppCircularSound,
                                       LPTSTR strWaveFileName,
                                       DWORD dwCreationFlags,
                                       GUID guid3DAlgorithm,
                                       DWORD dwRequestedSize )
{
    HRESULT hr;

    if( m_pDS == NULL )
        return CO_E_NOTINITIALIZED;
    if( strWaveFileName == NULL || ppCircularSound == NULL )
        return E_INVALIDARG;

    LPDIRECTSOUNDBUFFER pDSBuffer = NULL;
    CWaveFile*          pWaveFile = NULL;

    pWaveFile = new CWaveFile();
    if( pWaveFile == NULL )
        return E_OUTOFMEMORY;
    pWaveFile->Open( strWaveFileName, NULL, WAVEFILE_READ );

    // Set up the DirectSound buffer, and note that the flag
    // DSBCAPS_GETCURRENTPOSITION2 is required.
    DSBUFFERDESC dsbd;
    ZeroMemory( &dsbd, sizeof(DSBUFFERDESC) );
    dsbd.dwSize          = sizeof(DSBUFFERDESC);
    dsbd.dwFlags         = dwCreationFlags | DSBCAPS_GETCURRENTPOSITION2;
    dsbd.dwBufferBytes   = dwRequestedSize;
    dsbd.guid3DAlgorithm = guid3DAlgorithm;
    dsbd.lpwfxFormat     = pWaveFile->m_pwfx;

    if( FAILED( hr = m_pDS->CreateSoundBuffer( &dsbd, &pDSBuffer, NULL ) ) )
    {
        return DXTRACE_ERR( TEXT("CreateSoundBuffer"), hr );
    }

    // Create the sound.
    *ppCircularSound = new CCircularSound( pDSBuffer, dwRequestedSize,
                                           pWaveFile );
    return S_OK;
}
The process here is to set up the DirectSound buffer structure, call CreateSoundBuffer with the address of the structure as input and a DirectSound buffer pointer as output, and finally call the constructor of the CCircularSound class. All nice and straightforward. Notice in particular that we set the DSBCAPS_GETCURRENTPOSITION2 creation flag, so that the DirectSound SDK GetCurrentPosition method reports the play position accurately enough for polling. Notice also that this is the only flag we have to set; although volume, frequency, panning, 3-D, and special effect flags could also be set for the buffer, none are required for this sample.
The Circular Streams.cpp File
Having added all the low-level functionality to implement a polling-based streaming class, we can examine the high-level code that takes advantage of it.
The only global variables we need are in the following code.
#define nStreams    3
#define AllStreams  (streamNumber=0;streamNumber<nStreams;streamNumber++)
#define NUM_SECONDS 1    // Size of circular buffers.

HINSTANCE       g_hInst                    = NULL;
CSoundManager*  g_pSoundManager            = NULL;
CCircularSound* g_pCircularSound[nStreams] = { NULL, NULL, NULL };
TCHAR           g_soundDir[MAX_PATH];      // AVBook directory path.
Notice how we are using the AllStreams macro as a lazy way of stepping through the streams. The NUM_SECONDS define provides the size of the circular buffers in seconds (it is only used once, in the LoadWaveAndCreateBuffer function).
The WinMain, MainDlgProc, OnOpenSoundFile and OnInitDialog functions should by now be very familiar to you, so we will not list them here. The only point of interest that is particular to polling is that we set a Windows timer going, ticking every 50ms, in the OnInitDialog function. If you are using buffers much smaller than 1 second, say the recommended minimum of 100ms, the timer should be set to tick every 10ms. Each time the timer ticks, the OnTimer function is called.
VOID OnTimer( HWND hwndDlg )
{
    int streamNumber;

    // Top up every stream that exists and is currently playing.
    for AllStreams
    {
        if (g_pCircularSound[streamNumber] != NULL)
        {
            if (g_pCircularSound[streamNumber]->IsSoundBufferPlaying(0))
            {
                g_pCircularSound[streamNumber]->TopUpBuffer(0);
            }
        }
    }
}
The OnTimer function is certainly simple; for each stream that is actually playing, a call is made to the TopUpBuffer method. This is all that's needed. The OnOpenSoundFile function calls the LoadWaveAndCreateBuffer function, which is the function that creates a CCircularSound object.
VOID LoadWaveAndCreateBuffer( HWND hDlg, TCHAR* strFileName, int streamNumber )
{
    HRESULT   hr;
    CWaveFile waveFile;
    DWORD     dwCircularSize;

    // Load the wave file.
    if( FAILED( hr = waveFile.Open( strFileName, NULL, WAVEFILE_READ ) ) )
    {
        waveFile.Close();
        SetDlgItemText( hDlg, IDC_FILENAME1 + streamNumber,
                        TEXT("Bad wave file.") );
        return;
    }

    if( waveFile.GetSize() == 0 )
    {
        waveFile.Close();
        SetDlgItemText( hDlg, IDC_FILENAME1 + streamNumber,
                        TEXT("Wave file blank.") );
        return;
    }

    // The wave file is valid, and waveFile.m_pwfx is the wave's format
    // so we are done with the reader.
    waveFile.Close();

    // Determine the dwCircularSize.
    // It should be an integer multiple of nBlockAlign.
    DWORD nBlockAlign    = (DWORD)waveFile.m_pwfx->nBlockAlign;
    INT   nSamplesPerSec = waveFile.m_pwfx->nSamplesPerSec;
    dwCircularSize  = nSamplesPerSec * NUM_SECONDS * nBlockAlign;
    dwCircularSize -= dwCircularSize % nBlockAlign;

    // Create a new sound.
    SAFE_DELETE( g_pCircularSound[streamNumber] );

    // Set up the DirectSound buffer.
    if( FAILED( hr = g_pSoundManager->CreateCircular(
                         &g_pCircularSound[streamNumber], strFileName,
                         0, GUID_NULL, dwCircularSize ) ) )
    {
        SetDlgItemText( hDlg, IDC_FILENAME1 + streamNumber,
                        TEXT("Could not support the file's audio format.") );
        return;
    }

    // Update the UI controls to show the sound as the file is loaded.
    EnablePlayUI( hDlg, TRUE, streamNumber );
    SetDlgItemText( hDlg, IDC_FILENAME1 + streamNumber, strFileName );
}
After performing a few checks on the validity of the wave file, notice the calculation of the buffer size using the sampling rate, the block alignment (the number of bytes in one sample frame, which is 1, 2, or 4 for 8-bit or 16-bit mono or stereo PCM), and the NUM_SECONDS define.
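As a worked example of this calculation, the sketch below computes the buffer size for a couple of common PCM formats. PcmFormat is a hypothetical stand-in for the three wave format fields involved, and BlockAlign and CircularBufferBytes are illustrative names; the sample itself reads nBlockAlign directly from the wave file's format structure rather than deriving it.

```cpp
#include <cassert>

// Hypothetical stand-in for the wave format fields used here.
struct PcmFormat
{
    unsigned long  nSamplesPerSec; // Sample frames per second, e.g. 44100.
    unsigned short nChannels;      // 1 = mono, 2 = stereo.
    unsigned short wBitsPerSample; // 8 or 16 for typical PCM.
};

// For PCM, one block (sample frame) holds one sample for every channel.
unsigned long BlockAlign(const PcmFormat& fmt)
{
    return (unsigned long) fmt.nChannels * fmt.wBitsPerSample / 8;
}

// Bytes needed to hold dwSeconds of audio; always a whole number of blocks.
unsigned long CircularBufferBytes(const PcmFormat& fmt, unsigned long dwSeconds)
{
    unsigned long dwBlockAlign = BlockAlign(fmt);
    assert(dwBlockAlign > 0);
    return fmt.nSamplesPerSec * dwSeconds * dwBlockAlign;
}
```

A one-second buffer of 44.1 kHz, 16-bit stereo audio therefore needs 44,100 x 4 = 176,400 bytes, while 22.05 kHz, 8-bit mono needs only 22,050 bytes.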
Following this, any existing sound is deleted, and then a call is made to CreateCircular to create the new circular sound buffer. Finally, the circular sound can be played with a call to the PlayCircularBuffer function.
HRESULT PlayCircularBuffer( BOOL bLooped, int streamNumber )
{
    HRESULT hr;

    if( NULL == g_pCircularSound[streamNumber] )
        return E_FAIL; // Sanity check

    if( FAILED( hr = g_pCircularSound[streamNumber]->Reset() ) )
        return E_FAIL;

    // Fill the entire buffer with wave data. If the wave file is small,
    // repeat the wave file if the user wants to loop the file,
    // otherwise fill in silence.
    LPDIRECTSOUNDBUFFER pDSB = g_pCircularSound[streamNumber]->GetBuffer( 0 );
    if( FAILED( hr = g_pCircularSound[streamNumber]->
                         FillBufferWithSound( pDSB, bLooped ) ) )
        return E_FAIL;

    // Always play with the LOOPING flag since the streaming buffer
    // wraps around before the entire wave file is played.
    if( FAILED( hr = g_pCircularSound[streamNumber]->
                         PlayBuffer( 0, 0, DSBPLAY_LOOPING, DSBVOLUME_MAX,
                                     DSBPAN_CENTER, NO_FREQUENCY_CHANGE ) ) )
        return E_FAIL;

    return S_OK;
}
The PlayCircularBuffer function is notable in that it calls FillBufferWithSound, a CSound method that it was not necessary to modify for the polling sample, which fills the one-second buffer up completely to start with. The PlayBuffer call sets the sound going.
That completes our examination of the polling model for streaming sounds. This is the recommended method, but for comparison purposes, the next section discusses the Three Streams sample.