Details
That covers the essentials of
the algorithm. What's left is
a mass of
details, some of them crucial. First of all, where
is the height data actually stored?
In all of the previously-published algorithms, there is
a regular grid of height values (and other bookkeeping data),
on top of which the mesh is implicitly or explicitly defined. The
key innovation of my algorithm is that the data is actually
stored in an adaptive quadtree. This results in two major
benefits. First,
storage can be allocated adaptively according to the actual
dataset or the needs of the application; e.g. less storage can
be used in smoother areas or areas where the viewpoint is not
expected to travel. Second, the tree can grow or shrink
dynamically according to where the viewpoint is; procedural
detail can be added to the region near the viewpoint
on-the-fly, and deleted when the viewpoint moves
on.
In order to
store heightfield information in a quadtree, each quadtree
square must contain height values for at least its center
vertex and two of its edge vertices. All of the other vertex
heights are contained in other nearby nodes in the tree. The heights of the
corner vertices, for instance, come from the parent quadtree
square. The remaining edge vertex heights are stored in
neighboring squares. In my current implementation, I actually
store the center height and all four edge heights in the
quadtree square structure.
This simplifies things because all the necessary data to
process a square is readily available within the square or as
function parameters. The upshot is that the height of each edge
vertex is actually stored twice in the
quadtree.
Also, in my
current implementation, the same quadtree used for heightfield
storage is also used for meshing.
It should be possible to use two separate quadtrees,
one for heightfield storage and one for meshing. The potential benefits of such an
approach are discussed later.
A
lot of the tricky implementation details center around the
shared edge vertices between two adjacent squares. For instance, which
square is responsible for doing the vertex-enabled test on a
given edge vertex?
My answer is to arbitrarily say that a square only tests
its east and south edge vertices.
A square relies on its neighbors to the north and to the
west to test the corresponding edge
vertices.
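This convention can be sketched in a few lines (the names here are mine, not from the actual implementation): during Update(), a square runs the vertex-enabled test only for its east and south edge vertices, and relies on its north and west neighbors to test the shared vertices on the other two edges.

```cpp
#include <cassert>

// Illustrative convention only: a square "owns" the enable test for
// its east and south edge vertices; the north and west edge vertices
// are covered by the neighboring squares' own east/south tests.
enum EdgeVertex { EAST, NORTH, WEST, SOUTH };

// True if this square is responsible for testing the given edge vertex.
bool SquareTestsVertex(EdgeVertex v) {
    return v == EAST || v == SOUTH;
}
```

Because every interior edge vertex is the east or south vertex of exactly one square, each shared vertex still gets tested exactly once per Update() pass.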
Another
interesting question is, do we need to clear all enabled flags
in the tree at the beginning of Update(), or can we proceed
directly from the state left over from the previous frame? My answer is to work from
the previous state.
Which leads to more details: we've already covered the
conditions that allow us to enable a vertex or a square, but
how do we know when we can disable a vertex or a square? Recall from the original
Update() explanation that enabling a vertex can cause
dependent vertices to be enabled as well, rippling changes through
the tree. We can't
just disable a vertex in the middle of one of these dependency
chains while enabled vertices still depend on it. Otherwise we'd either
get cracks in the mesh, or important enabled vertices would not
get rendered.
If you take a
look at Figure 8, you'll notice that any given edge vertex has
four adjacent sub-squares that use the vertex as a corner. If
any vertex in any of those sub-squares is enabled, then the edge
vertex must be enabled.
Because the square itself will be enabled whenever a
vertex within it is enabled, one approach would be to just
check all the adjacent sub-squares of an edge vertex before
disabling it.
However, in my implementation, that would be costly, since
finding those adjacent sub-squares involves traversing around
the tree. Instead,
I maintain a reference count for each edge vertex. The
reference count records the number of adjacent sub-squares, from
0 to 4, which are enabled.
That means that every time a square is enabled or
disabled, the reference counts of its two adjacent edge
vertices must be updated.
Fortunately, the value is always in the range [0, 4], so
we can easily squeeze a reference count into three
bits.
Figure 8. Each edge
vertex has four adjacent sub-squares which use it as a corner. If
any of those squares are enabled, then the edge vertex must be
enabled. For example, the black vertex must be enabled if any of the
four gray squares are enabled.
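One plausible packing for those counts (an assumed layout, not necessarily the one in the actual implementation) puts the two 3-bit reference counts and the 1-bit 'static' flag together in a single byte:

```cpp
#include <cassert>
#include <cstdint>

// Two 3-bit sub-square reference counts plus a 1-bit flag fit in one
// byte. Field names are illustrative.
struct PackedCounts {
    uint8_t east_count  : 3;  // enabled sub-squares around the east vert, 0..4
    uint8_t south_count : 3;  // enabled sub-squares around the south vert, 0..4
    uint8_t is_static   : 1;  // node can't be deleted
};

// Called whenever an adjacent sub-square is enabled or disabled.
void OnSubSquareEnabled(PackedCounts& c, bool east) {
    if (east) { assert(c.east_count < 4); c.east_count++; }
    else      { assert(c.south_count < 4); c.south_count++; }
}
void OnSubSquareDisabled(PackedCounts& c, bool east) {
    if (east) { assert(c.east_count > 0); c.east_count--; }
    else      { assert(c.south_count > 0); c.south_count--; }
}
```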
Thus
the disable test for an edge vertex becomes straightforward: if
the vertex is currently enabled, and the associated reference
count is zero, and the vertex test with the current viewpoint
returns false, then disable the edge vertex. Otherwise leave it
alone. The conditions for disabling a square are
fairly straightforward: if the square is currently enabled, and
it's not the root of the tree, and none of its edge vertices
are enabled, and none of its sub-squares are enabled, and the
square fails the box test for the current viewpoint, then
disable it.
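Spelled out as code, under assumed names (the actual vertex and box tests are taken here as boolean inputs), the two disable tests look like this:

```cpp
#include <cassert>

// An enabled edge vertex may be disabled only when no adjacent
// sub-square still references it and the screen-space vertex test
// now fails for the current viewpoint.
bool CanDisableEdgeVertex(bool enabled, int ref_count, bool vertex_test_passed) {
    return enabled && ref_count == 0 && !vertex_test_passed;
}

// A square may be disabled when it is enabled, isn't the root, has no
// enabled edge vertices or sub-squares, and fails the box test.
bool CanDisableSquare(bool enabled, bool is_root,
                      bool any_edge_vertex_enabled,
                      bool any_subsquare_enabled,
                      bool box_test_passed) {
    return enabled && !is_root &&
           !any_edge_vertex_enabled && !any_subsquare_enabled &&
           !box_test_passed;
}
```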
Memory
A very important issue with
this (or any) LOD method is memory consumption. In a fully populated quadtree, a
single quadtree square is equivalent to about three vertices of
an ordinary heightfield, so it is imperative to keep the square
data-structure as compact as possible. Fortunately, the needs of the
Update() and Render() algorithms do not require each square to
contain all the information about 9 vertices. Instead, this is the laundry list
of required data:
- 5 vertex
heights (center, and edge verts east, north, west,
south)
- 6 error values (edge verts
east and south, and the 4 child squares)
- 2 sub-square-enabled
reference counts (for east and south verts)
- 8 1-bit enabled flags (for
each edge vertex and each child square)
- 4 child-square
pointers
- 2 height values for min/max
vertical extent
- 1 1-bit 'static' flag, to
mark nodes that can't be deleted
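As a concrete illustration of how that list might pack, here is one possible layout (my own sketch, not the article's actual structure), assuming 8-bit heights and errors and unpacked 32-bit child indices into a node pool; squeezing the indices to 20 bits would save a further 6 bytes:

```cpp
#include <cassert>
#include <cstdint>

// A hypothetical 32-byte packing of the laundry list above.
struct quadsquare_packed {
    uint8_t  height[5];        // center + east/north/west/south edge verts
    uint8_t  error[6];         // east & south edge verts + 4 child squares
    uint8_t  enabled_flags;    // 1 bit per edge vertex and per child square
    uint8_t  counts_and_static;// two 3-bit ref counts + 1-bit static flag
    uint8_t  min_y, max_y;     // vertical extent for the box test
    uint32_t child[4];         // indices into a fixed node pool
};
// 15 bytes of payload, 1 byte of padding, 16 bytes of child indices.
static_assert(sizeof(quadsquare_packed) == 32, "32-byte square target");
```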
Depending on
the needs of the application, the height values can usually be
squeezed comfortably into 8 or 16 bits. The error values can
use the same precision, or you can also do some non-linear
mapping voodoo to squeeze them into smaller data sizes. The reference counts
can fit into one byte along with the static flag. The enabled
flags fit in one byte. The
size of the child-square pointers depends on the maximum number
of nodes you anticipate.
I typically see node counts in the hundreds of
thousands, so I would say 20 bits each as a minimum. The min/max vertical values can
be squeezed in various ways if desired, but 8 bits each seems
like a reasonable minimum. All told, this amounts to at least
191 bits (24 bytes) per square assuming 8-bit height
values. 16-bit height
values bring the total to at least 29 bytes. A 32-byte sizeof(square)
seems like a good target for a thrifty implementation. 36 bytes is what I
currently live with in Soul Ride, because I haven't gotten
around to trying to bit-pack the child pointers. Another byte-saving trick I
use in Soul Ride is to use a fixed-pool allocator replacement
for quadsquare::new() and delete(). You can eliminate whatever
overhead the C++ library imposes (at least 4 bytes I would
expect) in favor of a single allocated bit per
square.
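A minimal fixed-pool allocator in that spirit might look like the following sketch (names and details are mine): nodes come from one preallocated array, a free list threads through the unused slots, and "allocated" state costs a single bit per square instead of per-allocation heap overhead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Node { uint32_t next_free; /* ... square data ... */ };

class NodePool {
public:
    explicit NodePool(uint32_t capacity)
        : nodes_(capacity), allocated_((capacity + 7) / 8, 0), free_head_(0) {
        // Thread the free list through every slot.
        for (uint32_t i = 0; i < capacity; i++) nodes_[i].next_free = i + 1;
    }
    uint32_t Alloc() {
        uint32_t i = free_head_;
        assert(i < nodes_.size() && "pool exhausted");
        free_head_ = nodes_[i].next_free;
        allocated_[i >> 3] |= uint8_t(1u << (i & 7));  // one bit per square
        return i;
    }
    void Free(uint32_t i) {
        allocated_[i >> 3] &= uint8_t(~(1u << (i & 7)));
        nodes_[i].next_free = free_head_;
        free_head_ = i;
    }
    bool IsAllocated(uint32_t i) const {
        return (allocated_[i >> 3] >> (i & 7)) & 1;
    }
private:
    std::vector<Node> nodes_;
    std::vector<uint8_t> allocated_;
    uint32_t free_head_;
};
```

A side benefit of indices over raw pointers is that they can later be narrowed to the 20-odd bits mentioned above without changing any calling code.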
There are
various compression schemes and tricks that could be used to
squeeze the data even smaller, at the expense of complexity and
performance degradation.
In any case, 36 bytes per 3 vertices is not entirely
unrespectable. That's 12
bytes/vertex.
Published accounts report implementations as small as 6 bytes per vertex. ROAM, for instance, only requires
storage of vertex heights and "wedgie thicknesses", so the base
data could be quite tiny by comparison. One reported refinement
stores the wedgie thicknesses at a fraction of the
resolution of the height mesh, giving further
savings.
However, such
comparisons are put in a different light when you consider that
the quadtree data structure is completely adaptive: in very
smooth areas or areas where the viewer won't ever go near, you
need only store sparse data.
At the same time, in areas of high importance to the
game, you can include very detailed features; for example the
roadway in a driving game can have shapely speed bumps and
potholes.
Geomorphs
Several published accounts go into some
detail on "vertex morphing", or "geomorphs". Basically, geomorphing is a
technique whereby when vertices are enabled, they smoothly
animate from their interpolated position to their correct
position. It looks great and
eliminates unsightly popping; see McNally's TreadMarks for a
nice example.
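The blend itself is trivial; the cost is in the storage. Assuming the square keeps (or can recompute) the interpolated height a vertex had before it was enabled, a per-frame morph with t ramping from 0 to 1 over a few frames might look like:

```cpp
#include <cassert>

// Linear geomorph blend:
//   t = 0 -> still at the popped-from (interpolated) height;
//   t = 1 -> fully at the stored (true) height.
float MorphHeight(float interpolated_h, float true_h, float t) {
    return interpolated_h + (true_h - interpolated_h) * t;
}
```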
Unfortunately,
doing geomorphs requires storing yet another height value for
the vertices that must morph, which would present a real
data-size problem for the adaptive quadtree algorithm as I've
implemented it. It
could result in adding several bytes per square to the storage
requirements, which should not be done lightly. incurs the
same per-vertex storage penalty, but avoids it because it
only has to store the extra height values for vertices that are
actually in the current mesh, not for every vertex in the
dataset.
I have three
suggestions for how to address the geomorph issue. The first alternative
is to spend the extra memory.
The second alternative is to optimize the
implementation, so that really small error tolerances would be
practical and geomorphs unnecessary. Moore's Law may take care
of this fairly soon without any additional software work. The third alternative is to split
the quadtree into two trees, a "storage tree" and a "mesh
tree". The storage tree
would hold all the heightfield information and precomputed
errors, but none of the transitory rendering data like enabled
flags, reference counts, geomorph heights, etc. The mesh tree would hold all that
stuff, along with links into the storage tree to facilitate
expanding the mesh tree and accessing the height data. The mesh tree could be
relatively laissez-faire about memory consumption, because its
size would only be proportional to the amount of
currently-rendered detail.
The storage tree, by contrast, being static,
could trim some fat by eliminating most of the child
links.
The
storage-tree/mesh-tree split could also, in addition to reducing
total storage, increase data locality and improve the
algorithm's cache usage.
_____________________________________________________________
Working
Code