Contents

Introduction to Bezier Curves and Surfaces

Data Structures and Classes

Surface Normal Calculation

Data Structures and Classes

The CBezierTessellation Class:

The CBezierTessellation class serves two purposes. The first is to hold the triangle index data for the current tessellation level. This data is sent to the rasterization hardware and indicates how to connect the vertices generated by the tessellation process into triangles. The second is to hold precalculated values of the basis functions and their derivatives for the (s, t) parameter pair at each sample point. Because the tessellation does not depend on the positions of the control points of the Bezier patch, we can use one CBezierTessellation object for all the Bezier patches in our application.
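As an illustration of the precalculated data, the following minimal sketch evaluates the four cubic Bernstein basis functions and their first derivatives on a uniform grid of parameter values. The names BasisSample, evalCubicBasis and buildBasisTable are hypothetical and are not taken from the CBezierTessellation class itself; the uniform grid is also an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record: the four cubic Bernstein basis values and their
// first derivatives at a single parameter value u.
struct BasisSample {
    float b[4];   // B0,3(u) .. B3,3(u)
    float db[4];  // d/du of B0,3(u) .. B3,3(u)
};

// Evaluate the cubic Bernstein polynomials and their derivatives at u.
inline BasisSample evalCubicBasis(float u) {
    const float v = 1.0f - u;
    BasisSample r;
    r.b[0] = v * v * v;
    r.b[1] = 3.0f * u * v * v;
    r.b[2] = 3.0f * u * u * v;
    r.b[3] = u * u * u;
    r.db[0] = -3.0f * v * v;
    r.db[1] = 3.0f * v * v - 6.0f * u * v;
    r.db[2] = 6.0f * u * v - 3.0f * u * u;
    r.db[3] = 3.0f * u * u;
    return r;
}

// Precompute the table once for `samples` evenly spaced values in [0, 1].
// The table depends only on the tessellation level, not on any control
// point, which is why one table can serve every patch.
inline std::vector<BasisSample> buildBasisTable(int samples) {
    assert(samples >= 2);
    std::vector<BasisSample> table;
    table.reserve(samples);
    for (int i = 0; i < samples; ++i)
        table.push_back(evalCubicBasis(float(i) / float(samples - 1)));
    return table;
}
```

The same table can be used for both the s and the t direction when the two directions share a tessellation level.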

The CBezierSurface and CBezierPatch Classes:

A single 4x4 patch can generate only a limited set of shapes. This is why objects are often built from sets of 4x4 patches connected to each other, where each patch defines only a small portion of the object’s surface. Neighboring patches are connected by sharing the control points along their common edge. To use the data efficiently, we define two classes: CBezierSurface, which holds all the control points used by the object, and CBezierPatch, which holds a 4x4 matrix of indices to control points stored in the patch’s parent CBezierSurface object.
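The relationship between the two classes can be sketched as follows. This is an illustration only: the type and member names (BezierSurfaceSketch, BezierPatchSketch, makeTwoPatches) are hypothetical, and the actual members of CBezierSurface and CBezierPatch are not shown in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point3 { float x, y, z; };

// Sketch of the surface: stores every control point exactly once.
struct BezierSurfaceSketch {
    std::vector<Point3> controlPoints;
};

// Sketch of a patch: a 4x4 matrix of indices into the parent's points.
struct BezierPatchSketch {
    const BezierSurfaceSketch* parent;
    std::size_t index[4][4];

    const Point3& controlPoint(int row, int col) const {
        assert(row >= 0 && row < 4 && col >= 0 && col < 4);
        return parent->controlPoints[index[row][col]];
    }
};

// Build two neighboring patches over a grid of 4 rows by 7 columns.
// Column 3 of the grid is the last column of the left patch and the
// first column of the right patch, so the shared edge is stored once.
inline void makeTwoPatches(const BezierSurfaceSketch& s,
                           BezierPatchSketch& left,
                           BezierPatchSketch& right) {
    assert(s.controlPoints.size() == 4 * 7);
    left.parent = &s;
    right.parent = &s;
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            left.index[r][c] = std::size_t(r * 7 + c);
            right.index[r][c] = std::size_t(r * 7 + c + 3);
        }
    }
}
```

Because both patches refer to the same stored points along the common edge, moving one of those points moves the edge of both patches and the surface cannot crack there.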

Streaming SIMD Extensions Implementation

Setup

We use Streaming SIMD Extensions to evaluate the surface position and surface normal for four sample points in parallel. Within the data structure, the basis function values for the sample points (s, t) are organized into groups of four values each. To improve cache locality, we store these values in contiguous memory blocks. The basis function values are laid out in memory as follows:

B0,3(s0), B0,3(s1), B0,3(s2), B0,3(s3), B1,3(s0), B1,3(s1), ..., B3,3(t0), B3,3(t1), B3,3(t2), B3,3(t3)
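The layout above can be produced with a small packing routine. The sketch below is written in plain C++ without intrinsics, and the names cubicBasis and packBasisBlock are hypothetical. It fills one block of 32 floats for four sample points: sixteen values for s followed by sixteen values for t, each basis function occupying four consecutive floats.

```cpp
#include <cassert>

// Cubic Bernstein basis values B0,3(u) .. B3,3(u).
inline void cubicBasis(float u, float out[4]) {
    const float v = 1.0f - u;
    out[0] = v * v * v;
    out[1] = 3.0f * u * v * v;
    out[2] = 3.0f * u * u * v;
    out[3] = u * u * u;
}

// Pack one 32-float block for four sample points (lanes 0..3):
//   out[i*4 + lane]      = Bi,3(s[lane])
//   out[16 + i*4 + lane] = Bi,3(t[lane])
// Each group of four floats can then be read as one SIMD register.
inline void packBasisBlock(const float s[4], const float t[4],
                           float out[32]) {
    assert(s != nullptr && t != nullptr && out != nullptr);
    for (int lane = 0; lane < 4; ++lane) {
        float bs[4];
        float bt[4];
        cubicBasis(s[lane], bs);
        cubicBasis(t[lane], bt);
        for (int i = 0; i < 4; ++i) {
            out[i * 4 + lane] = bs[i];
            out[16 + i * 4 + lane] = bt[i];
        }
    }
}
```

For aligned SSE loads the output buffer would additionally have to start on a 16-byte boundary; the sketch leaves allocation to the caller.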

For every iteration of four sample points, we need 32 floating-point values (4 s values x 4 basis functions + 4 t values x 4 basis functions). At 4 bytes per value this is 128 bytes, or 4 cache lines of 32 bytes each. We use the prefetch instruction to tell the processor that the next 4 cache lines will be used in the next iteration, so that the basis function values are already in the cache when the tessellation loop reaches them.

We also set up the tessellation process by expanding the control points four times, that is, by replicating each control point coordinate into all four elements of a SIMD register. This expansion is needed because the algorithm uses the control point values in parallel to generate four vertices. (When tessellating directly to screen space, we need to expand the control points after the transformation to screen space.)
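A minimal sketch of the expansion step, in plain C++ and with hypothetical type names (Vec4, ExpandedVec4), might look like this. Each coordinate of a control point is copied into four lanes so that a single SIMD multiply can combine it with the basis values of four different sample points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A control point in homogeneous form.
struct Vec4 { float x, y, z, w; };

// Hypothetical expanded form: every component replicated four times,
// one copy per SIMD lane. A real SSE version would keep each group of
// four floats 16-byte aligned so it can be loaded with an aligned move.
struct ExpandedVec4 {
    float x[4], y[4], z[4], w[4];
};

// Expand every control point of a surface. This is done once per set
// of (transformed) control points, not once per sample point.
inline std::vector<ExpandedVec4> expandControlPoints(
        const std::vector<Vec4>& pts) {
    std::vector<ExpandedVec4> out(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (int lane = 0; lane < 4; ++lane) {
            out[i].x[lane] = pts[i].x;
            out[i].y[lane] = pts[i].y;
            out[i].z[lane] = pts[i].z;
            out[i].w[lane] = pts[i].w;
        }
    }
    assert(out.size() == pts.size());
    return out;
}
```

The expansion quadruples the memory used by the control points, which is the price paid for avoiding a broadcast inside the inner loop.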

Position calculation

There is a big difference between the tessellation of screen space surfaces and the tessellation of object space surfaces. For object space surfaces, we only need to calculate the surface position “by the book”. For screen space surfaces, we usually need to transform using a perspective projection. The perspective projection changes the x, y coordinates of a point based on its z-value. To keep the surface invariant under projective transformations, we must convert our patch to a “Rational Bezier surface”. A Rational Bezier surface is based on a set of homogeneous control points coming from the control-point transformation algorithm. The fourth coordinate (W) is used as a weight: the weighted sum of each of the other coordinates is divided by the weighted sum of W.
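In the standard formulation of a rational bicubic patch, and assuming the transformed control points are written in homogeneous form as (X_ij, Y_ij, Z_ij, W_ij), the screen space x coordinate of the surface can be written as follows (y and z are analogous). The symbol names are chosen here for illustration.

```latex
x(s,t) \;=\;
\frac{\displaystyle\sum_{i=0}^{3}\sum_{j=0}^{3} B_{i,3}(s)\,B_{j,3}(t)\,X_{ij}}
     {\displaystyle\sum_{i=0}^{3}\sum_{j=0}^{3} B_{i,3}(s)\,B_{j,3}(t)\,W_{ij}}
```

The numerator and the denominator are both ordinary Bezier sums, so the four homogeneous components can be accumulated in the same loop and divided once at the end.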

The last stage of the surface position generation is the reformatting of the output data for the rasterizer. The data comes out as groups of 4 X’s, 4 Y’s, 4 Z’s and 4 W’s, while the rasterizer expects one (x, y, z, w) vector per vertex. We use the _MM_TRANSPOSE4_PS macro to transpose the values to the correct format.

Finally, the code for the tessellation of surface position is:

// Zero the vertex coordinates
vertex.x = vertex.y = vertex.z = vertex.w = _ZERO_;
for (k = 0; k < 4; k++) {
    for (t = 0; t < 4; t++) {
        // Pre-multiplied basis function values for s & t
        // (the [k][t] subscript is an assumption)
        F32vec4 coeff = coeffs[k][t];
        // Get the control point based on its index in the patch's
        // 4x4 index matrix (subscripts are an assumption)
        DWORD idx = _indices[k][t];
        PIIIVector4 *pt = &projBase[idx];
        // Sum the vertex components
        vertex.x += coeff * pt->x;
        vertex.y += coeff * pt->y;
        vertex.z += coeff * pt->z;
        // Weight calculation - required only when tessellating
        // directly to screen space
        vertex.w += coeff * pt->w;
    }
}

// Perspective division - required only when tessellating
// directly to screen space.
// 1 over w using rcp (a fast, approximate reciprocal)
F32vec4 rhw = rcp(vertex.w);
vertex.x *= rhw;
vertex.y *= rhw;
vertex.z *= rhw;

// Transpose the position values from four X's, four Y's, four Z's
// and four RHW's to one (x, y, z, rhw) vector per vertex.
_MM_TRANSPOSE4_PS(vertex.x, vertex.y, vertex.z, rhw);

// Store the four vertex positions, using unaligned moves.
// After the transpose each register holds one complete vertex
// (the vtx[0..3] subscripts are an assumption).
storeu(&vtx[0].sx, vertex.x);
storeu(&vtx[1].sx, vertex.y);
storeu(&vtx[2].sx, vertex.z);
storeu(&vtx[3].sx, rhw);
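A scalar reference version of the same calculation is useful for checking the output of the SIMD code. The sketch below evaluates one sample point at a time; the names HPoint, ScreenVertex, bernstein3 and evalRationalPatch are hypothetical, and the control points are assumed to be stored row by row in a flat array of 16 homogeneous points. It uses an exact division where the SIMD listing uses the approximate rcp, so the two results should agree only to within the precision of rcp.

```cpp
#include <cassert>
#include <cmath>

struct HPoint { float x, y, z, w; };            // homogeneous control point
struct ScreenVertex { float sx, sy, sz, rhw; }; // output vertex

// Cubic Bernstein basis values B0,3(u) .. B3,3(u).
inline void bernstein3(float u, float b[4]) {
    const float v = 1.0f - u;
    b[0] = v * v * v;
    b[1] = 3.0f * u * v * v;
    b[2] = 3.0f * u * u * v;
    b[3] = u * u * u;
}

// Evaluate a rational bicubic patch at (s, t).
// cp holds 16 control points; cp[k*4 + l] pairs with Bk,3(s) * Bl,3(t).
inline ScreenVertex evalRationalPatch(const HPoint cp[16],
                                      float s, float t) {
    float bs[4];
    float bt[4];
    bernstein3(s, bs);
    bernstein3(t, bt);

    float x = 0.0f, y = 0.0f, z = 0.0f, w = 0.0f;
    for (int k = 0; k < 4; ++k) {
        for (int l = 0; l < 4; ++l) {
            const float coeff = bs[k] * bt[l];
            const HPoint& p = cp[k * 4 + l];
            x += coeff * p.x;
            y += coeff * p.y;
            z += coeff * p.z;
            w += coeff * p.w;
        }
    }

    // Perspective division by the accumulated weight.
    assert(w != 0.0f);
    const float rhw = 1.0f / w;
    ScreenVertex v = { x * rhw, y * rhw, z * rhw, rhw };
    return v;
}
```

Running this routine on the four sample points of one SIMD iteration and comparing against the four stored vertices gives a simple regression test for the vectorized loop.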

___________________________________________________________________

Surface Normal Calculation