Thursday, June 7, 2007

Quadtree Geometry Images

A good chunk of my work in graphics has been spent working on mega-LOD terrain systems. This started with a research project in my college years where I created a real-time fractal planetary terrain demo. It wasn't the first such system (as after all, elite did this after all so many years ago!), but it was one of the first that I know of that achieved both high actual polygon throughput, and very high apparent polygon detail through aggressive use of a higher resolution unique normal map correlated perfectly with the geometry (ie, the normals are true geometric normals sampled from a more detailed surface LOD). So it looked pretty good, compared to the handful of other planetary terrain systems at the time.

Bizzarely, the system that I developed years later for pandemic is still based around all the same core ideas. The basic structure is that of a geometry image quadtree (side note: I wrote the original demo a few years before hughe hoppe's geometry images paper - I called my idea 'point textures' back then. His term is better.) - so you can do non-height field surfaces. In fact, if you go through the hassle of supporting multi-chart geometry images, you can use the system to amplify general meshes. My current implementation doesn't have any nice ways of handling the resulting cuts in the geometry image, and I just assume a single chart parameterization. More recently, hughe hoppe and all have worked out a semi-complicated method of 'zippering' up a multi-chart geometry image to guarantee a fully regular, valence 4 quadrilateral mesh. One sub-problem, which they do not discuss if i remember correctly, is that the multi-chart geometry image still has invalid samples. You can use a push-pull filter to fill these in up until you hit borders between patches in UV space which can not be connected, as they don't correspond to correct edges in 3D space. So you have a problem where you need to cut and remove triangles and or edges in your tessellation. This is a big pain in the but, because it requires you to store additional information and or derive it with a dynamic tessellation approach. My current system just renders a straight full grid for each quadtree patch. However, this is something I'm going to improve.

I have been thinking about a vastly simpler approach to multi-chart geometry images that takes advantage of the fact that I'm using 3D projective texturing, so the local 2D texture connectivity is unused. I can probably get away with a simple push-pull filter to fill in a few samples worth of edge border, creating an overlap region where adacent UV-patches will intersect and slightly z-fight, rendering on top of each other. However, this shouldn't be a problem, because as they are textured based on their 3D position derived from the z-value, they will fight for the same color. It would be really easy to try, but it still requires a good way of handling invalid triangles and breaking edges as described above. I have a very simple path now where I use an alpha channel to kill pixels for invalid regions, which would almost work, but it would still result in huge invisible triangles connecting distant UV edges which eat up fillrate and cause strange problems.

My coworker pointed out this paper, which is similar in basic structure to my system. Some of the non-trivial details are described well, such as the border issue. I use the same border duplication, but I don't assemble a temporary by copying in the 4 adjacent rectangles to a patch (although past iterations used that technique). Instead I just make the surface generation deterministic and attempt to guarantee the adjacent sample points line up perfectly. In practice, this has some accuracy issues for planetary scale (one of many floating point woes when working at that scale), but is faster and works fine for more reasonable size terrains. Their implementation is pretty slow for real-time, but it was on a 6800GT, whereas mine is on 360 & 7800 - class GPU's.

I've gone farther in caching, high res normals, and surface generation for terrains. Caching is a critical part of such a system from a performance standpoint. I'm surprised at how widespread LRU caching is, as they used, when really this is a very poor strategy for a real-time system. If we want a stable frame-rate, we need to be able to sacrifice some quality and cap expensive GPU time eaters such as bicubic geometry image surface generation. A much better caching policy is to use a priority driven cache, where you view your patch memory utilization and the current quadtree slice as an optimization problem, tackled with a greedy method. At each frame you do at most a constant number of expensive operations, and you spend some CPU time to carefully select 1. whether any are worth doing, and 2. the best choice. I use prioritization functions that include screen pixel/texel ratio, visibility (through z-queries) and some temporal feedback/prediction to attempt to weight nodes proportional to how visually important they are projected some time into the near future. The exact time weighting falloff should depend on the ratio of max generation steps per frame vs the size of the full cache, but I just heuristically tune with this idea in mind. This is essentially the same as the streaming problem, and quadtrees for general textures combined with a priority driven cache makes for a great virtual texture system, ala the id tech 5 stuff. My terrain system employs something similar, but just for the geometry image surface. I also store a compressed normal map at 4x geometry resolution, which more effeciently codes the fine detail. For this I use real-time dxt5 compression, based on the id paper. More recently I've been thinking of a more effecient approach to store the fine detail as a local displacement along a lower res normal derived from the points, ie 4 points and 5x5 height offsets connecting a virtual quad. The normals could then be derived in the shader, the height offsets can be stored 4bits per texel in the alpha of DXT5, and the quality should be better than using 8bits per texel, without any runtime compression cost. Haven't tried that yet. And at some point, that type of representation could be used to directly point splat from, so it could represent very dense surfaces cheaply.

Another area where I've experiemented and finally come up with something I like is in the surface generation itself. After trying various variations of pure factal approaches, I finally settled on a 'stamp' based approach that works quite well. In essence, my compressed representation of the surface is an explicit representation of all of the edits used to create the surface. The most basic and most useful being a directional displacement map, which can add or blend in to the existing surface. By using a small library of such displacement maps at different scales, offsets, and rotations, you can create just about anything, and creating natural terrains this way modelled from real world DEMs is very natural and quick. Allowing the directional offset can cause problems however, as there is nothing preventing the surface from generating ugly self-intersections. I've more recently thought of an improvement, where I project the displacement stamp down in 3D to derive the UV coordinate, instead of directly mapping UV's in the 2D geometry image space, which would avoid any local folds. You could still get folds from successive stamps at offset angles, but it would be more difficult. Even without this, self-intersection is a circumventable problem. (ie, rapid undo helps)

Sampling all these displacement maps could quickly get out of hand in terms of performance, especially if we need expensive sampling, such as bicubic. Unfortunately, geometry images pretty much require bicubic b-spline upsampling. So to get around this, I cache lower frequency evaluations of the surface, and remove any undersampled stamps from the evaluation stack. Instead, all the undersampled stamps are bicubic upsampled at once from the single lower frequency cache. Since these intermediate caches need to be full resolution and bit-depth, they can be extremely memory intensive. I've come up with some good tricks to get around this, namely caching reduced size intermediates that are quickly upsampled. A potential issue is that this stack collapse trick only works for fully commutative operations, such as additive displacement. A blend operator, or more complex operators, such as smooth or flatten ops, aren't commutative and thus the order is important. Removing a few ops out of an ordered stack wouldn't generate the same results. This was a big headache until I decied to stop treating it as a purely technical problem, and treat the stack order as a real concept. Now I calculate (and recalculate) the stack ordering based on the resolution of the ops, and preserve a general coarse-to-fine ordering rule. This means that sometimes you will apply a non-commutative stamp that actually happens 'before' or 'under' other stamps, but this isn't as bad as it sounds, and combined with the ability to manually override the stack order, is actually useful.

In the more distant future, geometry images may give way to more full 3D techniques, such as voxels/distance fields, but they will maintain effeciency advantages for large organic surfaces, possibly indefinetely as authoring tool and intermediate format, if not in actual run-time. However, adaptive distance fields have many of the same advantages and perhaps full volumetric modelling will at some point replace geometry images completely. I'd guess that z-brush and their ilk are based on geometry images, but perhaps I'm behind the times and they are already using something like adaptive distance fields. Some simple modelling tests could reveal the differences: ADF's are true volumes and so don't have self-intersection issues and naturally support CSG type cut/fracture operations.