Wes Bethel
R3vis Corporation
P.O. Box 979
Novato, CA, 94948
ewbethel03 at r3vis dot com
December, 2000
R3vis Corporation was awarded a Phase I Small Business Innovation Research (SBIR) grant from the U.S. Department of Energy, Office of Science, effective 1 September 2000. The scope of the grant is to design and to implement technical enhancements in four area in OpenRM Scene Graph. OpenRM is an Open Source scene graph library distributed under the terms of the GNU Lesser General Public License (LGPL), hosted at SourceForge.net (http://openrm.sourceforge.net). The four technical areas are 1) implement a "thread-safe" scene graph API, 2) implement a pipelined-parallel rendering engine in the scene graph, 3) provide support for tiled displays, and 4) provide support for off-screen rendering. All of the funded work is intended to be deployed in OpenRM, and is available as Open Source software via the OpenRM Scene Graph project. The first public release of OpenRM containing thread-safe enhancements is version 1.4.0-alpha-1, released on December 7, 2000. This document describes the design and specification of thread-safe enhancements to OpenRM. The 1.4.0-alpha-1 version of OpenRM is thread-safe in both Unix and Windows operating environments.
In this document, we assume that you have a good understanding of graphics concepts, including OpenGL, and a good understanding of issues related to multithreaded, or parallel programming, including concepts such as synchronization mechanisms like mutexes and sempahores.
1.0 Definition of Thread Safety 2.0 OpenRM Scene Graph Thread-Safety Requirements 3.0 Multiple Readers, the Context Cache and the Component Manager 4.0 Multiple Writers: Policy-Based and Fine Grained Access Control 5.0 Typical Application Architecture 6.0 Implementation Details 6.1 Context Cache 6.2 Component Manager 7.0 Deployment Considerations 8.0 Future Work 9.0 Bibliography
The term thread safety is often used to describe specific measures of correctness for software components or code blocks that are executed by multiple, concurrent execution threads. A software component that is thread safe will produce "correct results" free from unexpected side-effects when executed by multiple, concurrent threads. In contrast, software components that are not thread safe are not guaranteed to produce correct results when executed by multiple, concurrent threads. As stated in [1], "thread safety" means that code can be run from multiple threads without "destructive results."
Most graphics and visualization applications generally require two types of thread safe behavior of graphics infrastructure software: "multiple readers" and "multiple writers." We have focused our work on these two primary areas of emphasis.
Multiple readers describes an architecture that uses multiple rendering threads that simultaneously read from a single scene graph to create multiple images. Such an architecture is commonly used to create rendering applications for tiled display surfaces. Tiled display surfaces, or systems, consist of several display devices that are can be used to present multiple views of a single scene. The multiple readers architecture is an effective way to achieve parallelism in graphics and visualization applications that are targeted for use in tiled display environments. One of the biggest challenges in providing for thread safety in a scene graph infrastructure that supports multiple readers is the absence of state variables in the scene graph itself. Our approach for a thread safe scene graph in a multiple reader environment uses two related mechanisms, called the "context cache" and "component manager", which are described later. Both of these control structures are embedded into the RMpipe object, which is controlled by the application. In essence, the RMpipe object is the control structure that allows multiple readers to be "reentrant." Reentrant is often thought of as code that is efficiently thread safe.
Multiple writers describes an architecture in which multiple application threads create or update a single scene graph. Multiple writers is an effective means to increase application performance by parallelizing data input operations. One of the challenges in supporting thread safety for multiple writers lies in providing orderly access to the scene graph. Our approach for providing thread safe support to multiple writers makes use of implicit and explicit synchronization operations, also described later.
To be succinct, our requirements for a thread safe scene graph system is that it must support both multiple readers and multiple writers without destructive side-effects.
The fundamental problem addressed by the context cache is thread safe storage of values that are both rendering context specific, but associated with single scene graph objects. One common pitfall encountered in thread unsafe systems is that context-specific data is stored as part of the scene graph itself. Context-specific data includes things like OpenGL display list indices, OpenGL texture object identifiers, and so forth. The fundamental issue is that there exists data that is dependent upon both the scene graph primitives (the things being drawn), and the rendering context (the environment in which they're drawn).
We define a context cache as a data structure that holds data dependent upon both the scene graph and the rendering context. Thus, the OpenRM context cache contains display list indices and texture object identifiers. The context cache is a part of the RMpipe object, which is the rendering context specific object in an application that uses the scene graph system, but is not a part of the scene graph itself. The context cache is created when the RMpipe object is "bound" to an OpenGL rendering context. The size of the context cache is a function of the size of the scene graph itself: as the number of objects in the scene graph grows, the number of potential cachable objects similarly grows.
The component manager is a "central clearing house" for all scene graph objects that may require storage and use of data that is rendering context specific. Creation and destruction requests for such objects are processed by the component manager. At the time of scene graph initialization, the component manager preallocates a large block of all such objects, and then manages it's own internal free and alloc object lists. This approach allows for synchronization of the size of the component manager object pools, and the size of the rendering context cache. In addition, the component manager acts as a control nexus that provides orderly access to the scene graph by multiple writer threads through the use of synchronization mechanisms (mutexes).
Two classes of problems must be addressed in a thread-safe system that supports multiple writer threads. One of these is a occurs when two writer threads attempt to read/write/modify a single object. Another occurs when two threads simultaneously invoke a constructor or destructor function for an object that is under control of the component manager.
In order to avoid having a mutex lock on every possible object and attribute, we strike a balance between exhaustive protection and no projection using a combination of OpenRM-supplied mutex locks, application-dictated mutex locks and an advertised policy concerning scene graph updates. OpenRM supplies mutex locks for all constructor and destructor functions for objects under control of the component manager. These locks will provide for orderly access to constructor/destructor functions for all object types under control of the component manager.
Applications may request that a mutex lock is created or destroyed for any scene graph node. Applications may test, lock or unlock mutexes at scene graph nodes.
Scene graph access policy is an important part of applications that use multiple writer threads, because node-level mutexes do not completely solve all problems associated with concurrent writes in a scene graph. Nodes contain primitives, which hold the actual vertex and color data used for drawing. Applications update primitives, not nodes, with vertex and color data. Therefore, the node-level mutexes do not explicitly coordinate access to a single primitive by multiple writers. The application itself must use a node-level mutex to control access to resources at a node, or within a subtree rooted at the locked node.
The policy that must be used by applications is to first obtain a node lock prior to updating any objects or attributes of the node. This problem crops up in a slightly different context - modification of a descendant of a node that is locked. The descendent node does not explicitly know that its ancestor is locked. This type of problem is not easily solved with direct access control, hence it is up to the application to make use of node-level access control to coordinate its access to node-level attributes or objects, as well as to coordinate access to portions of the scene graph that are rooted at a locked node.
To summarize the policy, multiple-writer applications must make extensive use of node-level locks to coordinate orderly access to scene graph nodes. The methods that modify the scene graph nodes do not know about the mutex locks. Access to scene graph constructor/destructor functions is protected by internal mutexes, and requires no special action on the part of the application: multiple application threads may ask for a new scene graph node (or other protected object - see the table below) without destructive side effects.
At the time of this writing (December 2000), the OpenRM renderer ignores any mutex locks that might be set by the application. In the future, the renderer may be modified to respond to the presence of a locked node by either pruning traversal of the scene graph at the locked node, or to block and wait for the node to become unlocked.
It is assumed that the application that uses the thread safe features of OpenRM (multiple readers or writers) will itself be a parallel application composed of multiple execution threads. A multiple renderer (reader) application will create and manage multiple threads, each of which will create an RMpipe object, and invoke the OpenRM frame rendering method to produce an image. Most applications will want some form of per-thread view computation, such as a sheared-frustum for tiled surface displays (each tile uses a unique view specification for it's view of the scene). The application must compute and assign such a view specification - the scene graph system is not a framework for setting up tiled displays - it is a framework for running multiple, simultanous rendering threads that draw to each display from one or more scene graph object databases. Similarly, multiple writer threads must be created and managed by the application.
Applications developers need not be concerned with the details of context cache and component manager implementation in OpenRM scene graph - the benefits provided by these services are realized at the application level without any explicit code. The purpose of documenting implementation details is to define the scope of the current system, and to provide technical details for the curious.
The context cache exists on a per-RMpipe basis. The RMpipe contains all the information and structures needed to perform rendering: an OpenGL context, and open drawable, etc. The context cache is a mechanism used to hold data that is specific to an OpenGL rendering context. At the present time, this data consists of display list indices and texture object identifiers.
Associated with these data values are what we call "context cache keys." The context cache key is an identifier, or fingerprint, that is associated with each of the display list indices or texture object identifiers. The cache key is merely an identity that is assigned to each display list index or texture object identifier. When an application writer thread assigns data to a "cacheable object," e.g., an RMprimitive, a cache key is automatically generated and stored inside the RMprimitive. Later, at render time, the value of the cache key in the RMprimitive is compared with the cache key stored in the context cache. If they differ, the OpenGL display list is regenerated for the RMprimitive. The purpose of the cache key stored in the scene graph object is to act as a unique identifier associated with the contents of the scene graph object. We don't need to know what the contents are, precisely, but we do need to be able to detect when they have changed, in order to take appropriate action (download a texture to OpenGL or regenerate a display list).
The cache key is a 64 (or 32) bit value, and is derived from the system clock, hence is a function of time. The high order bits of the key represent seconds, and low order bits represent microseconds. In the absence of protected (coordinated, or serialized) access to the key generator, there is virtually zero liklihood of duplicate context cache keys ever being generated during the course of an application run. This is due to the use of the high resolution system clock (microseconds). Unfortunately, a clock of sufficiently high resolution is not available on Win32 platforms, so an incremental counter supplants the high resolution system clock used in Unix. (Note that Win32 implementations will require a mutex lock on the key generator, and such a lock has not yet been implemented.)
The following table shows the OpenRM primitives that require context-specific data to be stored in the context cache. Of these objects, text and quadrics do not require a cache key because the contents of the OpenGL display lists are invariant - these are in effect pre-tesselated or pre-rendered representations of models (or character glyphs) that are simply instantiated as needed by the renderer.
Object or Process | Role in Rendering Context |
---|---|
RMprimitive | All draw code inside RMprimitive reduced to a single OpenGL display list index. |
Text and Fonts | A single display list identifier is used to render a bitmap glyph for each character. An internal font manager keeps track of display list indices for bitmap glyphs, and causes new lists to be built at render time when the application draws text. The font manager uses an internal cache to hold bitmaps for all characters of all font, style and size combinations. Display lists are built only when the application requests text rendering. The font manager will build display lists only when it encounters a new font, size or family combination. Applications do not have direct access to the font manager. |
Quadrics | When OpenRM is initialized, internal display lists are built to represent spheres, cones and cylinders. A number of lists are built for each quadrics object, representing varying degrees of tesselation. Applications do not have direct access to these display lists, but can control which of the tesselation models is rendered through a flag set at the RMprimitive level. |
RMtexture | Texture object identifiers are stored inside the RMtexture object. The texture object identifiers are obtained from OpenGL at the time when the RMtexture object is assigned to an RMnode as a scene parameter. Additionally, textures are downloaded when the RMtexture is assigned to the node as a scene parameter. |
RMimage | The glDrawPixels() used to draw the image is stored as an OpenGL display list index inside the RMimage object. The RMimage display list index is used by draw code for RM_SPRITES primitives and when drawing background image tile scene parameters. Even though RMimages specify pixel data for RMtexture objects, the display list identifier inside the RMimage object is not used within the context of downloading pixel data for textures. |
The context cache is created by OpenRM at the time that an RMpipe is "bound" to an OpenGL rendering context using the call "rmPipeMakeCurrent." The current design requires each RMpipe to use a unique OpenGL rendering context, and that the contexts used by multiple RMpipes do not share display lists or texture object identifiers.
When created, the context cache contains enough space to hold display lists for all possible permutations of text, as well as all possible combinations of pre-tesselated quadrics models. These tables are fixed in size, and can never grow or shrink. The initial cache also contains entries for display lists to be associated with RMprimitive objects, RMimage objects and texture object identifiers. To address the problems inherent with a growing and shrinking population of objects of these types, we preallocate a large amount of space to hold display list identifiers (and cache keys) for many objects. The amount of space allocated is a function of the number of objects present in the Component Manager object pool.
The primary function of the component manager is to serialize access to constructor and destructor functions for a "critical set" of scene graph objects. The critical set is defined as those objects whose population size may have an effect upon the size required by the context cache. When initialized, the component manager will preallocate a large number of objects (RMnodes, RMprimitives, RMimages and RMtextures). Later, when a context cache is created for an RMpipe, the size of the component manager object pool dictates the size of the context cache for certain classes of objects. The component manager is initialized when the scene graph system is initialized.
The component manager maintains its own internal lists of free and allocated objects. When the application invokes a constructor function (e.g., rmNodeNew()), a handle to the first item on the free list is returned to the caller. Internal to the component manager, the object is removed from the free list and added to the alloc list. Conversely, when an object is destroyed (e.g., rmNodeDelete()), the object is moved from the alloc list back to the free list. This approach replaces numerous mallocs of small objects with a single malloc for a large pool of objects.
In the current implementation, the initial size of the object pool is controlled by a compile-time constant. When all available objects have been exhausted, an error message is issued. In the future, we hope to implement a reallocation strategy that will grow the size of the object pool, and also produce growth in the context cache of the corresponding object type. At this point in time, if the size of the component manager's object pool is too small for a given application, the only remediation is to increase the size of the pool that is allocated when OpenRM is first initialized.
As of the time of this writing (December 5, 2000), most of the coding is complete and has been tested within a limited scope. The OpenRM demonstration distribution contains two new programs: a multiple reader and a multiple writer. The multiple reader creates several rendering windows, all of which simultaneously read from a single scene graph and draw to their respective windows. The multiple writer application is an implementation of a parallel isosurface visualization tool. Each of the multiple threads in the isosurfacer generates triangles for its subset of data, and stores the triangles into a single scene graph.
We have experienced significant difficulties with thread-unsafe behavior of OpenGL drivers under Linux, our primary development platform. In fact, all combinations of OpenGL drivers we have tested, including software-only implementations, have exhibited destructive results when multiple readers execute in a truly concurrent fashion. We have tested our multiple reader on Solaris, and that OpenGL implementation appears to be thread safe.
LLNL has provided us with a copy of VDL, a framework for creating multi-threaded apps for use on tiled displays. At this time, VDL appears to be completely serial, even though it is multi-threaded. As part of the SBIR work, we will port two or three of the OpenRM demonstration programs to VDL to show how OpenRM and VDL can be used together.
We are also awaiting a VRCO release of a threads-based CAVELib, and will resume demo application development work when that becomes available.
In the weeks ahead, we will implement a mutex lock on the cache key generator for Windows to ensure key uniqueness.
In the Phase I work, the size of the component manager object pool is fixed. When space is exhausted, an error message is reported, but there is no remedial action. In future work (Phase II SBIR), we will implement remedial action when the object pool has been exhausted, and propogate that change to the context cache of each active RMpipe.
Each RMpipe uses a unique OpenGL rendering context, and there is no sharing of display lists or texture objects between display lists. Future work will include sharing of display lists and textures between OpenGL contexts. However, due to the fundamental nature of multiple-readers, it will always be the case that a unique OpenGL context is required on each RMpipe.
[1] Programming with POSIX Threads, D. Butenhof, Addison-Wesley, 1997.
|
This page last modified -- Monday, 19-Jan-2004 13:28:14 PST |