OpenGL Distilled

5.4. Performance Issues

Copying pixel rectangles between client (host) memory and server (graphics hardware) memory is often the cause of performance bottlenecks. If your application needs to use glDrawPixels() or glReadPixels() in performance-critical code, you should read this section and make yourself aware of the performance issues.

5.4.1. Using Alternatives to glDrawPixels()

Applications rarely use glDrawPixels() to render pixel rectangles due to poor performance in several implementations. Many implementations fall back to software processing when applications specify uncommon type and format parameters. For best performance, specify a format of GL_RGBA and a type of GL_UNSIGNED_BYTE.

Regardless, glDrawPixels() still requires OpenGL to copy the entire pixel rectangle to the graphics hardware every time the application issues the glDrawPixels() command (typically, every frame). Even if the OpenGL implementation processes glDrawPixels() with full hardware acceleration, the performance impact of the data copy could be unacceptable.

The texture mapping feature, described in Chapter 6, "Texture Mapping," provides a more efficient solution. Applications store the pixel rectangle as a texture object, requiring only one data copy over the system bus (typically, at initialization time). To display the pixel rectangle, applications render a texture-mapped quadrilateral. Accessing the pixel data from texture memory is extremely efficient on nearly all modern OpenGL implementations.
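To put rough numbers on the difference, the following sketch compares how many bytes each approach sends over the system bus. It is only an illustration: the helper names are hypothetical, and it assumes a tightly packed GL_RGBA / GL_UNSIGNED_BYTE rectangle (4 bytes per pixel) and ignores the small amount of vertex data needed to draw the textured quadrilateral.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one pixel rectangle.
   Assumption: GL_RGBA / GL_UNSIGNED_BYTE, 4 bytes per pixel, no row padding. */
uint64_t rect_bytes(uint32_t width, uint32_t height)
{
    return (uint64_t)width * (uint64_t)height * 4u;
}

/* glDrawPixels(): the whole rectangle crosses the bus once per frame. */
uint64_t draw_pixels_traffic(uint32_t width, uint32_t height, uint32_t frames)
{
    assert(frames > 0);
    return rect_bytes(width, height) * (uint64_t)frames;
}

/* Texture object: the rectangle crosses the bus once, at initialization.
   The frame count does not change the total, so it is ignored. */
uint64_t texture_traffic(uint32_t width, uint32_t height, uint32_t frames)
{
    assert(frames > 0);
    (void)frames;
    return rect_bytes(width, height);
}
```

For a hypothetical 512 x 512 image displayed for 600 frames, draw_pixels_traffic(512, 512, 600) returns 629,145,600 bytes, while texture_traffic(512, 512, 600) returns 1,048,576 bytes, the size of a single copy.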

The GL_ARB_pixel_buffer_object[2] extension allows glDrawPixels() commands to source data from buffer objects stored in high-performance server memory. When correctly implemented, this extension allows glDrawPixels() to operate with performance comparable to texture mapping. Chapter 7, "Extensions and Versions," discusses the GL_ARB_pixel_buffer_object extension.

[2] When this book went to press, GL_ARB_pixel_buffer_object was a candidate for promotion to the OpenGL version 2.1 specification.

5.4.2. Flushing the OpenGL Pipeline

glReadPixels() suffers from the same performance issues as glDrawPixels(): inherent performance issues related to copying pixel rectangles over the system bus and unoptimized code paths for uncommon format and type parameters. These issues are minor, however, compared with the fact that glReadPixels() completely flushes the OpenGL rendering pipeline.

The glReadPixels() command doesn't return to the application until OpenGL has completed the read operation. This completely drains the OpenGL pipeline; the application isn't sending any new commands because it's stalled waiting for glReadPixels() to return.
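The cost of such a stall can be illustrated with a deliberately simplified timing model. It assumes that, when the pipeline stays full, the application and the graphics hardware work in parallel, so a frame takes as long as the slower of the two; and that, in the worst case, a blocking glReadPixels() forces the application's work, the rendering, and the read to run one after another. The function names and the millisecond figures are hypothetical and not measurements of any implementation.

```c
#include <assert.h>

/* Frame time when application and graphics hardware overlap
   (pipeline kept full): the slower side sets the pace.
   Assumption: perfect overlap, no other synchronization points. */
double pipelined_frame_ms(double cpu_ms, double gpu_ms)
{
    assert(cpu_ms >= 0.0 && gpu_ms >= 0.0);
    return cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
}

/* Worst-case frame time when a blocking read drains the pipeline:
   the application waits for rendering to finish and for the read to
   complete before it can issue commands for the next frame. */
double stalled_frame_ms(double cpu_ms, double gpu_ms, double read_ms)
{
    assert(cpu_ms >= 0.0 && gpu_ms >= 0.0 && read_ms >= 0.0);
    return cpu_ms + gpu_ms + read_ms;
}
```

With 6 ms of application work, 8 ms of rendering, and a 2 ms read, this model gives 8 ms per frame when pipelined and 16 ms per frame when stalled. Real implementations will differ, but the model shows why the stall, rather than the copy itself, tends to dominate.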

As with glDrawPixels(), the GL_ARB_pixel_buffer_object extension can help boost read performance. Applications read to a buffer object rather than all the way back to host memory; then they copy the buffer object contents back to the host when a performance delay is acceptable. Again, see Chapter 7, "Extensions and Versions," for a discussion of the GL_ARB_pixel_buffer_object extension.
