ALGORITHMS FOR STEREOSCOPIC IMAGING

True 3-D is easier than it looks

This article contains the following executables: STEREO.ARC

Victor J. Duvanenko and W.E. Robbins

Victor is a member of the technical staff at Truevision in Indianapolis, Indiana. He can be contacted at victor@truevision.com.


Humans, as well as many other animals, have two eyes that enable us to detect a certain range of frequencies reflected off or transmitted from surrounding objects. In other words, we have two sensors, offset horizontally by about 2.5 inches, that allow us to see and judge distances (depth) fairly precisely.

The ability to precisely judge depth comes in handy when manipulating objects with our hands. It's also beneficial to have a bit of redundancy in the visual system. Thus, our brain sees the world through our two eyes from two slightly offset perspectives. We learn to merge the two disparate views into a single image with added depth detail via triangulation.

The mechanism for extraction of depth information from two offset inputs is illustrated in Figure 1. It's easy to see that the further away an object is, the less disparate the two views (one from each eye) become. This is easy to verify by placing a finger close to your eyes, and then quickly opening and closing your left and right eyes in alternating fashion. The finger appears to move a lot. However, when you fully extend your arm, moving the finger farther away, and repeat the procedure, the finger doesn't seem to move as much. Our brain learns to interpret the amount of disparity between the two views as distance (or depth). Thus, only objects that are very far away will be perceived to have little or no disparity. This ability is commonly referred to as "stereoscopic vision" (or, more formally, "binocular disparity").

The ability to present distance or depth information in a natural fashion is absent from most current-generation computer-graphics environments, which usually present only a single view of the world, as if the viewer had only one eye. Thus, the visual sense is not fully exploited, and stereoscopic skills are not utilized. The amount of information that can be effectively communicated is reduced, although many applications would greatly benefit from an increased bandwidth.

Depth information is especially useful when dealing with three-dimensional objects. By presenting depth information, a more natural human-computer interaction occurs, with higher communication bandwidth and improved understanding. In fact, the need for object rotation and movement becomes less critical, since it's no longer necessary for understanding of depth relationships, but is only needed to reveal obscured sections. Depth is an integral part of the three-dimensional space we live in. Only when depth is presented to the viewer will the visual experience offered by computers be indistinguishable from that of the real world.

True 3-D vs. Conventional 3-D

Stereoscopic, or true 3-D image-generation methods differ from conventional 3-D techniques. The 3-D techniques most computer-graphics books describe are really less than 3-D, because much depth information is lost when the 3-D image is projected onto a 2-D screen. The only depth cues that remain are the relative order of objects from the viewer (that is, one object obscures another that is behind it), limited depth information in the perspective sizing (for example, objects and their features get smaller the further they are from the viewer) and shadows. These cues provide limited depth information, which in many cases must be augmented by motion to furnish more detail about the objects. The viewer then gathers depth information from movement of the objects and constructs a mental picture of the presented scene.

For example, if you're presented with several nonoverlapping objects of various sizes, you can't tell whether the smaller ones are further away or are simply smaller. It's also impossible to tell whether the objects are 20 inches or 20 feet away from you. This effect is especially pronounced with spherical objects (planets of the solar system), since there's little, if any, perceived perspective sizing. The only way to understand such a scene is to view it in motion from an angle. For complex scenes, motion is difficult to achieve and, unless real-time motion is used, the results aren't very useful or interesting.

True 3-D, on the other hand, provides more-detailed depth information, augmenting perspective sizing and object occlusion. When several objects are presented, you immediately, and almost instinctively, surmise the depth relationships. This additional insight is particularly useful when viewing objects not common in everyday life, or when objects exhibit uncommon, and possibly confusing, relationships. In other words, if what is being viewed is common or easily understood, monoscopic techniques may be sufficient. More complex objects call for more complex viewing techniques to minimize confusion and maximize comprehension. In any case, monoscopic 3-D techniques are less than true 3-D, and are referred to as "2 1/2-D" by stereoscopic enthusiasts.

In this article, we'll explore stereoscopic presentation, providing an overview of both the hardware and software requirements. We'll then present algorithms--and implement them in C--for generating the left-eye and right-eye views fundamental to stereoscopic viewing. This article is based on our work with the scanning-tunneling microscope (STM) data from North Carolina State University's Precision Engineering Center. The STM is capable of measuring properly prepared samples of materials at atomic levels and can sense differences in vertical height of 0.01 angstrom. A sample of material is scanned by the STM, providing the vertical-height information at every regularly spaced sample point. The result is an array of 3-tuples (x,y,z) that represents the surface of a sample at the atomic level.

The application that reads and displays an STM file was written in mid-1990 on a uVAX under VMS, displayed images on a LEX/90 24-bit/pixel stereo-capable display/accelerator, and used a menu-driven, text user interface. The complete source code that implements the system is available electronically; see "Availability," page 5.

Stereoscopic Hardware

Stereoscopic hardware is necessary for the proper delivery of the two disparate (left- and right-eye) images to the respective eyes. One approach time multiplexes the left- and right-eye views (that is, alternates between them on every frame) on a single display. The views are demultiplexed (separated) through a pair of shutter glasses worn by the user. While the left-eye image is shown on the display, the left-eye lens is transparent, and the right-eye lens is opaque. When the right-eye image is shown, the opposite occurs. This technique requires the glasses to be synchronized with the computer display. StereoGraphics (San Rafael, California) manufactures infrared-controlled LCD shutter glasses that allow several people to view the display simultaneously.

Another technique is to place a polarizing LCD panel in front of the screen. This polarizes the right-eye frame in one direction and the left-eye frame 90 degrees off. The user wears passive glasses that have the left-eye lens polarized 90 degrees from the right-eye lens. Tektronix (Beaverton, Oregon) manufactures such LCD panels.

With both techniques, the user's left eye sees only the left-eye image, whereas the right eye sees only the right-eye image. If the images are displayed fast enough, the human brain fuses the two views, and the depth information becomes naturally evident.

Both techniques use a single monitor capable of displaying frames at rates of 90 to 120 frames per second. This ensures that each eye is presented with a refresh rate high enough to avoid flicker, since each eye sees its corresponding image at half the total rate. However, by running at higher vertical rates, you lose resolution, because most display hardware has an upper limit on pixel-output bandwidth. Thus, when the vertical refresh rate doubles for stereoscopic viewing (from 60 to 120 Hz), the resolution has to be halved to keep the pixel-output bandwidth constant. Also, the apparent brightness of the image is reduced due both to passing through the glasses and to each eye seeing an image for only half of the time and seeing nothing for the other half.
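As a hypothetical example, display hardware with enough pixel bandwidth for 1280x1024 at 60 Hz (about 79 million pixels per second) can drive only about 1280x512 per eye at 120 Hz, since 1280x512x120 works out to the same 79 million pixels per second.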

Another approach uses a head-mounted display (HMD) consisting of active LCD display glasses that display the left-eye view on the left lens of the glasses and right-eye view on the right lens.

One low-cost hardware configuration for stereoscopic viewing consists of a display capable of high vertical-refresh rates (preferably 120 frames/second), a pair of active LCD shutter glasses with synchronization hardware, and a stereo-ready graphics card capable of displaying 120 frames per second, switching between two images during every vertical screen retrace, and sending a synchronization signal to the controller of the LCD glasses. Stereo-ready graphics cards are available from Truevision (Indianapolis, Indiana) and other vendors.

Stereoscopic Software

Many rendering software packages are currently available. As long as the software is capable of generating a single-perspective view of a scene, it is sufficient for generating stereoscopic views. However, a particular package may not provide enough information or have enough flexibility to generate proper stereoscopic views. Then the process becomes that of trial and error, which may lead to improper results. There's actually no need to create or use a full rendering package--wire-frame images produce very dramatic results--since the addition of depth information enhances the outcome. (If you succeed at displaying wire-frame images, try closing one eye to see what you have been missing!)

Once the two views have been rendered, both must be loaded into display memory, but only one of them should be visible at a time. The software must then be capable of toggling from one image to the other during the vertical blanking period. Most graphics cards allow for two images to be loaded at once and for fast toggling between them. The software must be capable of detecting the vertical retrace interval. This can be accomplished by either servicing a vertical retrace interrupt or by polling a line counter in the display controller. During the vertical blanking period, the software must flip to the other-eye image and synchronize the glasses. The software must know which is the left-eye image and which is the right-eye image, and allow you to see only the left-eye image when the left-eye lens is transparent. If the two images are displayed backwards, you'll most likely be incapable of fusing them into one and be very uncomfortable. (The effect is equivalent to swapping the left and right eye on your head.)
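The core of the display loop is small. The sketch below is only an illustration of the idea; wait_for_vertical_blank(), show_display_buffer(), and sync_glasses() are hypothetical stand-ins for whatever routines the graphics-card and shutter-glasses vendors actually provide.

/* Stereo page-flipping loop (sketch).  The three helpers are hypothetical;
   the essential point is that the buffer flip and the glasses switch both
   happen during the vertical blanking period, so the left-eye image is
   visible only while the left-eye lens is transparent. */
#define LEFT_EYE   0
#define RIGHT_EYE  1

void stereo_display_loop()
{
   int eye = LEFT_EYE;

   for ( ;; )
   {
      wait_for_vertical_blank();   /* poll the line counter or wait on the interrupt */
      show_display_buffer( eye );  /* make this eye's image visible */
      sync_glasses( eye );         /* open this eye's shutter, blank the other */
      eye = ( eye == LEFT_EYE ) ? RIGHT_EYE : LEFT_EYE;
   }
}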

Stereoscopic Algorithms

At this point, we'll discuss the steps, algorithms, and mathematics necessary for generating stereoscopic images. Most of the methods are based on those described by Larry Hodges in his PhD thesis, "Technologies, Applications, Hardcopy and Perspective Transformations for True Three-Dimensional CRT-Based Display" (North Carolina State University, 1988). Keep in mind that the data set we're working with is a uniform 2-D 200 x 200 grid with a Z value (height) given at each grid point.

To generate perspective stereoscopic views of the scene, we'll perform these steps:

  1. Make the viewing transformation matrix.
  2. Compute shading values and normals of each triangle.
  3. Transform interocular distance from the physical coordinate system to the viewing coordinate system.
  4. Make the left- and right-eye transformation matrices.
  5. Concatenate all matrices needed for the left-eye transformation into a single matrix.
  6. Transform, homogenize, clip, and render the left-eye view.
  7. Repeat steps #5 and #6 for the right-eye view.
  8. Display alternating views during successive frames.

Viewing Transformation

To simplify perspective projection, clipping, depth sorting, and stereoscopic-image generation, you need a viewing transformation that places the viewer reference point (VRP) at the origin, viewing down the negative Z-axis. In other words, the first step is to compute the transformation matrix necessary to move the viewer down to the origin, looking down the negative Z-axis. This transformation is explained as a four-step process by Foley and van Dam in Fundamentals of Interactive Computer Graphics (Addison-Wesley, 1984, pp. 258-261, 274-276). The technique is not efficient, but it is clear and performed only once per image. (See make_composite_viewing_matrix() in Listing Two (page 76) for implementation details.) This computation need be performed only when the viewer moves.

Homogeneous matrix form (4x4) was used for simplicity and efficiency. This method is beautiful, as it allows for accumulation of a series of transformations into a single matrix. For example, it's possible to construct a single matrix that not only performs the viewing transformation, but also the scaling to the device/display coordinates. Thus, each sample point in the original image needs to be multiplied by a 4x4 matrix only once to transform it all the way to the device/display coordinates.

Some computational optimizations were performed here. For instance, instead of handling each triangle separately, the entire image was transformed. This reduced the amount of computation by a factor of six, since every sample point generates two triangles, and all of the vertices are shared between neighboring triangles. The multiplication of a vector (the sample point, 1x4) by a matrix (4x4) is computationally intensive: 16 multiplications and 12 additions per sample point (a total of 640,000 multiplications and 480,000 additions). Also, the three divisions necessary to bring a point back to 3-D from the homogeneous space have been replaced by one reciprocal and three multiplications. This should execute faster on most floating-point platforms.
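The inner loop therefore reduces to one vector-by-matrix multiply plus one reciprocal per sample point; the following fragment is essentially the body of transform_and_homogenize_image() in Listing Four:

   /* p[] is the sample point as a row vector (x, y, z, 1); Tm is the single
      composite 4x4 matrix; t[] receives the transformed homogeneous point. */
   matrix_mult( p, 1, 4, Tm, 4, 4, t );
   if ( t[3] != 1.0 )              /* divide by W (homogenize) */
   {
      inv_W = 1.0 / t[3];          /* one reciprocal...            */
      t[0] *= inv_W;               /* ...and three multiplications */
      t[1] *= inv_W;
      t[2] *= inv_W;
   }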


Shading

Every triangle in the uniform STM grid was flat shaded (but other shading algorithms can easily be used). For every sample/grid point (the grid is 200x200), two triangles are created. The intensity/color of each triangle is based on the angle between its normalized normal and the normalized negative light-source direction vector (a normalized vector from the origin to the light-source position). Since both vectors have been normalized (and thus have a magnitude of unity), the angle between them is obtained by computing the dot product. The result will be between -1 and 1. For simplicity, all negative values were set to 0, implying that all triangles facing away from the light source were rendered as dark as possible. Thus, the resulting triangle shade values were between 0 and 1. The range of possible display intensities was known to be between 0 and 255. Therefore, the final triangle-shading value was obtained by multiplying the shade value by 255.
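In code, the per-triangle shading reduces to a clamped dot product scaled to the display range. A minimal sketch, assuming n and light are unit-length point_3D_t vectors as in Listing Three:

   /* Flat shading of one triangle: the dot product of two unit vectors is
      cos(theta); clamp it to [0,1] and scale to the 0..255 display range. */
   double cos_theta = n.x * light.x + n.y * light.y + n.z * light.z;
   int    intensity;

   if ( cos_theta < 0.0 )          /* triangle faces away from the light source */
      cos_theta = 0.0;
   intensity = (int)( cos_theta * 255.0 );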

For a 200x200 grid of sample points (a total of 40,000 samples), 80,000 triangle normals must be computed. Each triangle normal is obtained by computing the cross product of two adjacent edge vectors. (This is how a normal to a plane is found, and triangles are guaranteed to be planar, since it takes a minimum of three points to define a plane.) For example, for a grid point P[i,j], the normal to the first triangle is obtained by computing (P[i,j]P[i+1,j]) X (P[i,j]P[i+1,j+1]), and the normal to the second triangle by computing (P[i,j]P[i+1,j+1]) X (P[i,j]P[i,j+1]). A general cross-product formula would require performing three subtractions per edge to compute the edge vectors and then six multiplications and three subtractions, for a total of nine subtractions and six multiplications. Thus, for 80,000 triangles, 720,000 subtractions and 480,000 multiplications are required, which would take a noticeable amount of time even on the best of current workstations.
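For reference, a fully general normal computation looks like the sketch below (written against the point_3D_t type used in Listing Three); the grid-specific optimizations described next eliminate most of this work.

   /* General triangle normal: two edge vectors (three subtractions each),
      then their cross product (six multiplications, three subtractions). */
   void general_normal( point_3D_t p0, point_3D_t p1, point_3D_t p2,
                        point_3D_t *n )
   {
      point_3D_t u, v;

      u.x = p1.x - p0.x;   u.y = p1.y - p0.y;   u.z = p1.z - p0.z;
      v.x = p2.x - p0.x;   v.y = p2.y - p0.y;   v.z = p2.z - p0.z;

      n->x = u.y * v.z - u.z * v.y;
      n->y = u.z * v.x - u.x * v.z;
      n->z = u.x * v.y - u.y * v.x;
   }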

Several computational optimizations are possible due to the restrictive structure of the STM database. The grid of samples is uniform in both X and Y, and is X- and Y-axis aligned. Thus, the X and Y components of the vectors are known in advance, and need not be computed for each triangle. Only the Z component has to be computed. Also, for vectors between successive-row grid points (for instance, P[i,j]P[i+1,j]), the X component is 0; for vectors between successive-column grid points (for example, P[i,j]P[i,j+1]), the Y component is 0. This leads to several simplifications in the cross-product computations, as the 0 components do not contribute to the result and can be eliminated. The result is that only five subtractions and four multiplications are performed for every pair of triangles (or a total of 200,000 subtractions and 160,000 multiplications--a 3.6-times reduction in subtractions and 3.0-times reduction in multiplications). See procedure compute_normals() in Listing Three, page 76. However, these optimizations produce results that are not general (that is, specific for the STM database structure).

Several floating-point operations per triangle are also necessary to normalize the normal vectors: three multiplications, two additions, three divisions, and one square root. Computing the dot product between the normalized light-source direction vector and the normalized triangle normal requires three multiplications and two additions. It would be an interesting exercise in vector mathematics to try to reduce the number of operations required.

Transforming Interocular Distance

Once a scene can be generated from a single viewpoint, the move to stereoscopic scene generation is not difficult--two scenes must be generated from two slightly different perspectives. The trick is to decide how far apart to place the two viewpoints--the two eyes.

On average, the distance between human eyes (interocular distance) is 2.5 inches. This implies that when you look at any object, the views that each eye sees will differ by no more than 2.5 inches. This interocular distance stays nearly constant for each of us. We get used to it and derive depth information based on it. The further away an object is, the more similar the two views are; the closer an object is, the more disparate the two views.

Most computer-graphics books describe the rendering pipeline as a sequence of steps that ends with drawing a pixel on the display device. Pixel aspect ratio, gamma, and the color/shade capabilities of the display device may be the only other parameters considered. Several additional parameters must be considered when generating stereoscopic images. These parameters move beyond the pixels of the display and into your physical world.

Two images (a "stereopair") must be generated, one for each eye. However, your interocular distance (eye-to-eye spacing) is known only in the physical-coordinate system (about 2.5 inches). This distance needs to be transformed into (brought into) the viewing-coordinate system where the object being viewed lies. The result will be the interocular distance in the viewing-coordinate system, which will correspond to 2.5 inches in the physical-coordinate system.

Foley and van Dam refer to the transformation from the viewing-coordinate system to the device/display-coordinate system as "window-to-viewport scaling." The transformation from the device-coordinate system to the physical-coordinate system depends on the pixel size of the physical display. Since the interocular distance is measured along the X-axis, only the pixel width is significant (and this is obviously monitor dependent).

To map the interocular distance from the physical-coordinate system to the viewing-coordinate system, it must first be mapped to the device-coordinate system (pixels or dots). In other words, inches are mapped to pixels (and the dot pitch, the physical width of a pixel, varies between monitors). For instance, the dot pitch of the LEX/90 monitor is 0.021 inches/dot. Thus, 2.5 inches are equivalent to about 119 dots (pixels). These 119 pixels are the interocular distance in the device-coordinate system, and they must now be mapped to the viewing-coordinate system. The horizontal scale factor is the window width divided by the viewport width. Multiplying the 119 pixels by this scale factor transforms the 2.5 inches in the physical-coordinate system to the viewing-coordinate system. This computation is shown in set_per_projection_plane() in Listing Six, page 78.
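Numerically, the mapping is just two scale factors; the computation below is the one that appears (as e_v and e_w) in Listing One:

   /* Physical -> device: 2.5 inches / 0.021 inches-per-pixel = about 119 pixels. */
   e_v = INTEROCULAR_DISTANCE / PIXEL_WIDTH;

   /* Device -> viewing: divide by the window-to-viewport scale factor
      (viewport width in pixels over window width in viewing coordinates). */
   e_w = e_v / (( x_right_d - x_left_d ) / ( x_right - x_left ));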

Of course, your physical distance from the display needs to be transformed into the viewing-coordinate system as well. This is done in set_per_projection_plane(). These two transformations bring you from the physical world into the viewing-coordinate system (once again, where the object is). The set_per_projection_plane() procedure suggests the distance of the projection plane in the viewing-coordinate system (which corresponds to 18 inches in the physical world).

Now that we have the interocular distance and your distance from the display transformed into the viewing-coordinate system, the left- and right-eye view transformation matrices can be formed. Larry Hodges presents a simple technique whereby you first look down the Z-axis, and your eyes are centered around the origin (along the X-axis). Thus, the left eye is at coordinate (e/2, 0, 0) and the right eye is at (-e/2, 0, 0), where e is the interocular distance in the viewing-coordinate system. Second, the two views are generated using the TPT^-1 model (translate, project, translate back). By using homogeneous (4x4) matrices, a single (4x4) transformation matrix results. The left-eye view transformation consists of a concatenation of three operations: translation to the right by e/2, projection, and translation to the left by the same amount (T[R]PT[L]); see Figure 2. The display_3D_data() procedure (see Listing One, page 76) follows this model by first creating a left-eye (4x4) transformation matrix and then a right-eye matrix.

Concatenating All Matrices into One

The display_3D_data() procedure then combines the left-eye matrix with the viewing-transformation (4x4) matrix to produce a single (4x4) matrix that contains the complete transformation that each vertex of the 200x200 grid must undergo. Once a single (4x4) transformation matrix has been obtained, transform_and_homogenize_image() (Listing Four, page 78) is called, transforming the 200x200 grid (or 40,000 vertices). The result is homogenized (X, Y, and Z are divided by W) to bring the points back into 3-D from the homogeneous space.

At this point, render_image() (Listing Five, page 78) is called. It takes the transformed grid with shading information for each of the 80,000 triangles, clips the triangles to the viewport, and renders them (using the LEX/90 polygon command). Clipping is done very simplistically--a triangle is rendered only if it is entirely inside the viewport.
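The clip_to_viewport() routine itself isn't reproduced in the listings, but an all-or-nothing test of this kind takes only a few lines; a minimal sketch, with hypothetical viewport-bound names, might look like this:

   /* Trivial accept/reject (sketch): ACCEPT only if every vertex of the
      already-transformed triangle lies inside the viewport.  v[] holds x,y
      pairs, n is the number of coordinates; vp_x_min..vp_y_max are assumed
      per-eye viewport bounds in device coordinates (the real routine
      presumably offsets the Y bounds for the right-eye half of the buffer). */
   int clip_to_viewport( short v[], int n, int eye )
   {
      int i;

      for ( i = 0; i < n; i += 2 )
         if ( v[i]     < vp_x_min || v[i]     > vp_x_max ||
              v[i + 1] < vp_y_min || v[i + 1] > vp_y_max )
            return( REJECT );
      return( ACCEPT );
   }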

Displaying Alternating Views

The LEX/90 has two display buffers (double-buffered). It is capable of displaying either of the two buffers and of alternating between the buffers on each frame. The left-eye image was rendered into display-buffer one, and the right-eye image was rendered into display-buffer two. Then the LEX/90 is set into alternating-buffer mode, and--stereo! The LEX/90 displays the left-eye image in one frame, then displays the right-eye image, then the left-eye image, and so on, 60 times per second.

To view this image, we used a crusty pair of piezoelectric shutter glasses synchronized with the vertical sync pulse of the LEX/90 display. Thus, the left eye saw only the left-eye image and the right eye saw only the right-eye image. The display flickered a little, since each eye saw a picture at 30 Hz (frames/second), but the result was sufficient to get quite a few "Wow!"s.

Rolling Your Own Stereopair Images

Barr E. Bauer

Barr is a research scientist at Arris Pharmaceutical and can be contacted at 385 Oyster Point Blvd., South San Francisco, CA 94080.

In my work as a pharmaceutical researcher, I find stereo imaging to be not only useful, but essential because true 3-D presentation is often the only way to clearly present results. In other words, when viewed in conventional 3-D or 2-D, DNA structures (such as that on the cover of this magazine) can lose much of their structural content.

Workstations like Silicon Graphics' (Mountain View, California) Indigo provide real-time 3-D effects directly on the monitor. Vector objects can be displayed depth-cued, whereby uniform fading along the screen Z-axis gives a good sense of front to back. The pseudo-3-D effect is magnified when the object is rotated or when the fading is dramatic. Interior objects can be highlighted by making them less bright than objects in front, maintaining a clear what's-in-front distinction. The depth-cued display sometimes inverts in the mind's eye, making the bright part of the display appear to be in the rear. This curious side effect occurs at random and can complicate presentations. The Indigo can display solid-shaded objects, too, which also gives a good pseudo-3-D effect.

CrystalEyes from StereoGraphics (San Rafael, California) uses a special monitor that alternates the display of left- and right-eye views to achieve real-time stereo. Liquid-crystal goggles synched to the monitor through an infrared emitter separate the displays so that each eye sees only the image appropriate for it. The stereo effect is startling: Objects appear to float in front of the screen and can be rotated in real time. On the downside, the hardware is costly, significant programming must be done to support the graphics, the picture quality is degraded due to fewer display lines, and the manipulation of 2-D menus presents a programming challenge. Despite all this, CrystalEyes can show stereo to up to four people in front of a monitor and up to 20 people with projection.

Stereo slides enable stereo presentation to larger crowds or when maximum picture quality is desired. Separate left- and right-eye views are photographed, then projected through polarized filters and viewed with polarized glasses, one eye at 90 degrees with respect to the other. The images are rotated six degrees with respect to each other along the screen's Y-axis. Internal coordinate frames must be normalized to the screen frame before the two images are rotated and photographed, otherwise the resulting projected stereo images can be very distressing to look at. The result is a static image that looks good but requires careful image alignment, and it often takes a moment for the audience to "see" the stereo effect. The polarizing filters steal light intensity the farther you sit from the central axis of the screen, making the image appear dim to those sitting in the front corners of the room. The screen can fool camera exposure control often enough that multiple exposures and careful intensity matching give the best stereo image. If one of the pair is noticeably different in exposure, the stereo effect becomes more difficult to see. Regardless, this method is inexpensive and portable. I combine depth-cuing with stereo slides for a double effect that accentuates the stereo.
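The two views for the slides can be produced by rotating the normalized model about the screen Y-axis by plus and minus three degrees before rendering each eye's image. Here is a sketch of building that rotation in the 4x4 homogeneous form used in the main article (the helper name is mine):

   #include <math.h>

   /* Build a 4x4 rotation about the Y-axis; +3 degrees for one view and
      -3 degrees for the other gives the six-degree total separation. */
   void make_y_rotation( double degrees, double R[4][4] )
   {
      double a = degrees * 3.14159265358979 / 180.0;
      int    i, j;

      for ( i = 0; i < 4; i++ )                /* start from the identity */
         for ( j = 0; j < 4; j++ )
            R[i][j] = ( i == j ) ? 1.0 : 0.0;

      R[0][0] =  cos( a );   R[0][2] = -sin( a );
      R[2][0] =  sin( a );   R[2][2] =  cos( a );
   }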

Finally, stereo images can be put in print as stereo pairs, even with line art. The simplest picture that minimizes unnecessary overlap provides the clearest 3-D image. The pairs can be viewed with glasses, or by defocusing your eyes and merging the two images into one. Figure 3, for instance, is a stereo triplet from AI software developed at Arris Pharmaceutical Corp. to recognize 3-D molecular shape. To see it in stereo, most people cross their eyes slightly until two adjacent images fuse into a single image with depth; if you are "wall-eyed" (eyes turned outward), look instead at the left and center images. Beware, however, that this can give you a severe headache if you look long enough.



_ALGORITHMS FOR STEREOSCOPIC IMAGES_
by Victor Duvanenko and W.E. Robbins


[LISTING ONE]


display_3D_data( proj )
int proj;
{
   double  Tw[4][4], S[4][4], Td[4][4], Ry[4][4], Per[4][4], tmp[4][4];
   double  Left[4][4], Right[4][4];
   double  e_v,      /* interocular distance mapped back to the view  coord. */
           e_w;      /* interocular distance mapped back to the world coord. */
   printf( "Computing normals for shading.  " );
   compute_normals();      /* and the dot products for each triangle */
   printf( "Done.\n" );
   /* Perspective projection must use three steps:
      1)  Compute normals and project,
      2)  Divide by W (homogenize)
      3)  Transform into the device coordinates. */
   if ( proj == PERSPECTIVE )
   {
      /* map physical interocular distance into the view port coordinates. */
      e_v = INTEROCULAR_DISTANCE / PIXEL_WIDTH;      /* e_v == pixels */
      /* map from the viewport coordinate system to the world. */
      e_w = e_v / (( x_right_d - x_left_d ) / ( x_right - x_left ));
      set_to_identity( Left,  4 );
      set_to_identity( Right, 4 );

      /* Use the Translate, Project, Translate back model. */

      /* Create the Left eye transformation matrix. */
      set_to_identity( Tw, 4 );    /* translate the left eye to the origin */
      Tw[3][0] = -e_w / 2.0;
      matrix_mult( Left, 4, 4, Tw, 4, 4, tmp );
      /* Create the perspective projection matrix. */
      set_to_identity( Per, 4 );
      Per[2][3] = 1.0 / proj_plane;          /* 1/d */
      Per[3][3] = 0.0;
      matrix_mult( tmp, 4, 4, Per, 4, 4, Left );

      Tw[3][0] = e_w / 2.0;             /* translate back */
      matrix_mult( Left, 4, 4, Tw, 4, 4, tmp );
      copy_matrix( tmp, Left, 4, 4 );

      /* Create the Right eye transformation matrix. */
      set_to_identity( Tw, 4 );    /* translate the right eye to the origin */
      Tw[3][0] = e_w / 2.0;
      matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );
      /* Create the perspective projection matrix. */
      set_to_identity( Per, 4 );
      Per[2][3] = 1.0 / proj_plane;          /* 1/d */
      Per[3][3] = 0.0;
      matrix_mult( tmp, 4, 4, Per, 4, 4, Right );

      Tw[3][0] = -e_w / 2.0;             /* translate back */
      matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );
      copy_matrix( tmp, Right, 4, 4 );
#if 0
      printf( "Transforming, projecting and homogenizing the image.  " );
      transform_and_homogenize_image( image, tr_image, Tm );
      printf( "Done.\n" );
#endif
   }
   /* Create the world to device transformation matrix. */
   /* Create the translation matrix. Translate only in X and Y. */
   set_to_identity( Tw, 4 );
   Tw[3][0] = -x_left;   Tw[3][1] = -y_bottom;   Tw[3][2] = 0.0;
   /* Create a uniform scale matrix. */
   set_to_identity( S, 4 );
   S[0][0] = ( x_right_d - x_left_d   ) / ( x_right - x_left   );
   S[1][1] = ( y_top_d   - y_bottom_d ) / ( y_top   - y_bottom );
   S[2][2] = ( z_back_d  - z_front_d  ) / ( z_back  - z_front  );

   matrix_mult( Tw, 4, 4, S, 4, 4, tmp );
   copy_matrix( tmp, Tw, 4, 4 );

   /* Create the translation matrix. Translate only in X and Y. */
   set_to_identity( Td, 4 );
   Td[3][0] = x_left_d;   Td[3][1] = y_bottom_d;   Td[3][2] = 0.0;
   matrix_mult( Tw, 4, 4, Td, 4, 4, tmp );
   copy_matrix( tmp, Tw, 4, 4 );

   /* Since device/screen origin on LEX/90 is in upper left, we need to reflect
      Y and translate by screen height to place device origin in bottom left.*/
   set_to_identity( Ry, 4 );
   Ry[1][1] = -1.0;
   matrix_mult( Tw, 4, 4, Ry, 4, 4, tmp );
   copy_matrix( tmp, Tw, 4, 4 );

   set_to_identity( Td, 4 );
   Td[3][1] = SCREEN_HEIGHT;
   matrix_mult( Tw, 4, 4, Td, 4, 4, tmp );
   copy_matrix( tmp, Tw, 4, 4 );
   /* Now, Tw has the world to device/screen transformation matrix. */
   if ( proj == PARALLEL )
   {
      /* Beautiful!!!  Perform a single transformation of the image (for
         parallel projection). */
      printf( "Transforming, projecting and mapping the image onto screen." );
      matrix_mult( Tm, 4, 4, Tw, 4, 4, tmp );
      copy_matrix( tmp, Tm, 4, 4 );
      show_matrix( Tm, 4, 4 );
      transform_and_homogenize_image( image, tr_image, Tm );
      printf( "Done.\n" );
      printf( "Rendering the image.  " );
      render_image( LEFT_EYE );        printf( "Done.\n" );
   }
   if ( proj == PERSPECTIVE )
   {
      printf( "Transforming, projecting, homogenizing and mapping the " );
      printf( "Left image onto screen.  " );
      matrix_mult( Tm, 4, 4, Left, 4, 4, tmp );
      matrix_mult( tmp, 4, 4, Tw, 4, 4, Left );
      printf( "Left eye transformation matrix.\n" );
      show_matrix( Left, 4, 4 );
      transform_and_homogenize_image( image, tr_image, Left );
      printf( "Done.\n" );
      printf( "Rendering the Left eye image.  " );
      render_image( LEFT_EYE );        printf( "Done.\n" );

      printf( "Transforming, projecting, homogenizing and mapping the " );
      printf( "Right image onto screen.  " );
      matrix_mult( Tm, 4, 4, Right, 4, 4, tmp );
      matrix_mult( tmp, 4, 4, Tw, 4, 4, Right );
      /* Move the right eye view into the lower half of the buffer. */
      set_to_identity( Tw, 4 );
      Tw[3][1] = SCREEN_HEIGHT;                 /* move in device coord */
      matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );
      copy_matrix( tmp, Right, 4, 4 );
      printf( "Right eye transformation matrix.\n" );
      show_matrix( Right, 4, 4 );
      transform_and_homogenize_image( image, tr_image, Right );
      printf( "Done.\n" );
      {  int dz0 = 0, dz1 = 512, dz2 = 1;      /* dszom() takes its arguments by reference */
         dszom( &dz0, &dz1, &dz2 );            /* look at the lower half */
      }
      printf( "Rendering the Right eye image.  " );
      render_image( RIGHT_EYE );        printf( "Done.\n" );
   }
}





[LISTING TWO]


make_composite_viewing_matrix( proj )
int proj;      /* PARALLEL or PERSPECTIVE */
{
   double p1[4], p2[4], p3[4], p_tmp[4];
   double T[4][4], Rx[4][4], Ry[4][4], Rz[4][4], C_tmp1[4][4], C_tmp2[4][4];
   double Per[4][4], d1, d2, d12, cos_ang, sin_ang;
   /* Initialize the three points */
   p1[0] = eye_pt.x;  p1[1] = eye_pt.y;  p1[2] = eye_pt.z;  p1[3] = 1.0;
   p2[0] = p2[1] = p2[2] = 0.0;  p2[3] = 1.0;
   p3[0] = p1[0] + vup.x;  p3[1] = p1[1] + vup.y;  p3[2] = p1[2] + vup.z;
                           p3[3] = 1.0;
   /* Magnitude of vector p1->p2 */
   d12 = sqrt( p1[0] * p1[0] + p1[1] * p1[1] + p1[2] * p1[2] );
   /* Create the translation matrix. */
   set_to_identity( T, 4 );
   T[3][0] = -p1[0];   T[3][1] = -p1[1];   T[3][2] = -p1[2];
   /* Translate the three points p1, p2, and p3 to the origin. */
   matrix_mult( p1, 1, 4, T, 4, 4, p_tmp );
   copy_matrix( p_tmp, p1, 1, 4 );
   matrix_mult( p2, 1, 4, T, 4, 4, p_tmp );
   copy_matrix( p_tmp, p2, 1, 4 );
   matrix_mult( p3, 1, 4, T, 4, 4, p_tmp );
   copy_matrix( p_tmp, p3, 1, 4 );

   d1 = sqrt( p2[0] * p2[0] + p2[2] * p2[2] );      /* length of projection */
   cos_ang = -p2[2] / d1;
   sin_ang =  p2[0] / d1;
   /* Create the rotation about Y-axis matrix. */
   set_to_identity( Ry, 4 );
   Ry[0][0] = cos_ang;  Ry[0][2] = -sin_ang;
   Ry[2][0] = sin_ang;  Ry[2][2] =  cos_ang;
   /* Rotate the three points p2, and p3 about the Y-axis. */
   /* p1 is at the origin after translation => no need to rotate. */
   matrix_mult( p2, 1, 4, Ry, 4, 4, p_tmp );
   copy_matrix( p_tmp, p2, 1, 4 );
   matrix_mult( p3, 1, 4, Ry, 4, 4, p_tmp );
   copy_matrix( p_tmp, p3, 1, 4 );

   cos_ang = -p2[2] / d12;
   sin_ang = -p2[1] / d12;

   /* Create the rotation about X-axis matrix. */
   set_to_identity( Rx, 4 );
   Rx[1][1] = cos_ang;   Rx[1][2] = sin_ang;
   Rx[2][1] = -sin_ang;  Rx[2][2] = cos_ang;
   /* Rotate the three points p2, and p3 about the X-axis. */
   matrix_mult( p2, 1, 4, Rx, 4, 4, p_tmp );
   copy_matrix( p_tmp, p2, 1, 4 );
   matrix_mult( p3, 1, 4, Rx, 4, 4, p_tmp );
   copy_matrix( p_tmp, p3, 1, 4 );
   /* Sanity check. */
   printf( "The view vector should be [ %lf %lf %lf %lf ]\n", 0.0, 0.0, -d12,
                                                              1.0 );
   printf( "The view vector is [ %lf %lf %lf %lf ]\n", p2[0], p2[1],
                                                       p2[2], p2[3] );
   d2 = sqrt( p3[0] * p3[0] + p3[1] * p3[1] );
   cos_ang = p3[1] / d2;
   sin_ang = p3[0] / d2;
   /* Create the rotation about Z-axis matrix. */
   set_to_identity( Rz, 4 );
   Rz[0][0] = cos_ang;   Rz[0][1] = sin_ang;
   Rz[1][0] = -sin_ang;  Rz[1][1] = cos_ang;
   /* At this point the translation, and all rotation matrices are known
      and need to be combined into a single transformation matrix. */
   matrix_mult( T, 4, 4, Ry, 4, 4, C_tmp1 );
   matrix_mult( C_tmp1, 4, 4, Rx, 4, 4, C_tmp2 );
   matrix_mult( C_tmp2, 4, 4, Rz, 4, 4, C_tmp1 );
   copy_matrix( C_tmp1, Tm, 4, 4 );
}






[LISTING THREE]


void compute_normals()
{
   register i, j;
   point_3D_t p11_p21, p11_p22, p11_p12, cross_a, cross_b;
   double d, dx;              /* stepping distance (dx = dy) */
   double dx_sqrd;            /* dx^2 */
   int num_lines;
   point_3D_t na, nb;         /* normals to triangle A and B */
   dx = scan_sz / num_samples;

   num_lines = MIN( num_samples, MAX_IMAGE_SIZE );
   num_lines--;
   dx_sqrd = dx * dx;
   for( i = 0; i < num_lines; i++ )
      for( j = 0; j < num_lines; j++ )
      {
#if DEBUG
         printf( "P11 %f %f %f\n", image[i][j].x, image[i][j].y,
                                   image[i][j].z );
         printf( "P21 %f %f %f\n", image[i+1][j].x, image[i+1][j].y,
                                   image[i+1][j].z );
         printf( "P12 %f %f %f\n", image[i][j+1].x, image[i][j+1].y,
                                   image[i][j+1].z );
         printf( "P22 %f %f %f\n", image[i+1][j+1].x, image[i+1][j+1].y,
                                   image[i+1][j+1].z );
#endif
         p11_p21.z = image[ i + 1 ][ j ].z - image[ i ][ j ].z;
         p11_p22.z = image[ i + 1 ][ j + 1 ].z - image[ i ][ j ].z;
         p11_p12.z = image[ i ][ j + 1 ].z - image[ i ][ j ].z;
#if DEBUG
         printf( "dz11_21 = %f,  dz11_22 = %f,  dz11_12 = %f\n",
                 p11_p21.z, p11_p22.z, p11_p12.z );
#endif
         /* It's possible to eliminate one more multiplication in the
            computations below. */
         na.x = dx * ( p11_p21.z - p11_p22.z );
         na.y = dx * p11_p21.z;
         na.z = dx_sqrd;
#if DEBUG
         printf( "Na %f %f %f\n", na.x, na.y, na.z );
#endif
         nb.x = (-dx) * p11_p12.z;
         nb.y = dx * ( p11_p22.z - p11_p12.z );
         nb.z = dx_sqrd;
#if DEBUG
         printf( "Nb %f %f %f\n", nb.x, nb.y, nb.z );
#endif
         /* Normalize the normal vectors, since the intensity will be
            proportional to the angle between light source and the normal. */
         d = sqrt((double)( na.x * na.x + na.y * na.y + na.z * na.z ));
         na.x = na.x / d;
         na.y = na.y / d;
         na.z = na.z / d;
#if DEBUG
         printf( "Na %f %f %f\n", na.x, na.y, na.z );
#endif
         d = sqrt((double)( nb.x * nb.x + nb.y * nb.y + nb.z * nb.z ));
         nb.x = nb.x / d;
         nb.y = nb.y / d;
         nb.z = nb.z / d;
#if DEBUG
         printf( "Nb %f %f %f\n", nb.x, nb.y, nb.z );
#endif
         /* Compute the dot product between the light source vector and
            the normals (== to the angle between two unit vectors ).
            -1 <= cos( theta ) <= 1, which will be very useful. */
         image[ i ][ j ].sha = light_source.x * na.x + light_source.y * na.y +
                               light_source.z * na.z;
         image[ i ][ j ].shb = light_source.x * nb.x + light_source.y * nb.y +
                               light_source.z * nb.z;
      }
}






[LISTING FOUR]


transform_and_homogenize_image( s, d, tm )
point_3D_ex_t  s[][ MAX_IMAGE_SIZE ], d[][ MAX_IMAGE_SIZE ];
double *tm;         /* transformation matrix */
{
   register i, j;
   int  num_lines;
   double  p[4];      /* the point to be transformed */
   double  t[4], inv_W;

   num_lines = MIN( num_samples, MAX_IMAGE_SIZE );
   for( i = 0; i < num_lines; i++ )
      for( j = 0; j < num_lines; j++ )
      {
         p[0] = s[i][j].x;   p[1] = s[i][j].y;
         p[2] = s[i][j].z;   p[3] = 1.0;
         matrix_mult( p, 1, 4, tm, 4, 4, t );
         if ( t[3] != 1.0 )        /* divide by W (homogenize) */
         {
            inv_W = 1.0 / t[3];
            t[0] *= inv_W;   t[1] *= inv_W;  t[2] *= inv_W;
         }
         d[i][j].x = t[0];   d[i][j].y = t[1];   d[i][j].z = t[2];
         d[i][j].sha = s[i][j].sha;    d[i][j].shb = s[i][j].shb;
      }
}






[LISTING FIVE]


render_image( y )
int  y;                      /* LEFT_EYE or RIGHT_EYE */
{
   register i, j;
   int    num_lines;
   short  v[6], intensity;
   int    n_coords = 6;      /* coordinate count; dspoly() takes its arguments by reference */

   num_lines = MIN( num_samples, MAX_IMAGE_SIZE );
   num_lines--;
   for( i = 0; i < num_lines; i++ )
      for( j = 0; j < num_lines; j++ )
      {
         v[0] = ROUND( tr_image[ i     ][ j     ].x );
         v[1] = ROUND( tr_image[ i     ][ j     ].y );
         v[2] = ROUND( tr_image[ i + 1 ][ j     ].x );
         v[3] = ROUND( tr_image[ i + 1 ][ j     ].y );
         v[4] = ROUND( tr_image[ i + 1 ][ j + 1 ].x );
         v[5] = ROUND( tr_image[ i + 1 ][ j + 1 ].y );
         /* Render triangle A */
         intensity = ROUND( tr_image[ i ][ j ].sha * (double)(NUM_SHADES - 1));
         if ( intensity > ( NUM_SHADES - 1 ))
            intensity = NUM_SHADES - 1;      /* saturate */
         if ( intensity < 0 )
         {
#if 0
            printf( "Triangle A,  intensity = %d\n", intensity );
            printf( "v11.x = %f, v11.y = %f, v11.z = %f\n",
                    image[i][j].x, image[i][j].y, image[i][j].z );
            printf( "v21.x = %f, v21.y = %f, v21.z = %f\n",
                    image[i+1][j].x, image[i+1][j].y, image[i+1][j].z );
            printf( "v22.x = %f, v22.y = %f, v22.z = %f\n",
                    image[i+1][j+1].x, image[i+1][j+1].y, image[i+1][j+1].z );
#endif
            intensity = 0;
         }
         if ( clip_to_viewport( v, 6, y ) == ACCEPT )
            dspoly( &intensity, v, &n_coords );

         v[2] = ROUND( tr_image[ i     ][ j + 1 ].x );
         v[3] = ROUND( tr_image[ i     ][ j + 1 ].y );
         /* Render triangle B */
         intensity = ROUND( tr_image[ i ][ j ].shb * (double)( NUM_SHADES-1));
         if ( intensity > ( NUM_SHADES - 1 ))
            intensity = NUM_SHADES - 1;      /* saturate */
         if ( intensity < 0 )   intensity = 0;
         if ( clip_to_viewport( v, 6, y ) == ACCEPT )
            dspoly( &intensity, v, &n_coords );
      }
}





[LISTING SIX]
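
The set_per_projection_plane() routine is not reproduced here; the following is only a minimal sketch of what the article describes it doing, written against the globals used in Listing One (PIXEL_WIDTH, x_left, x_right, x_left_d, x_right_d, proj_plane), with VIEWER_DISTANCE as an assumed constant.

#define VIEWER_DISTANCE  18.0      /* inches from viewer to screen (assumed) */

/* Sketch only -- not the original listing.  Maps the viewer's physical
   distance from the screen through device coordinates (pixels) into the
   viewing coordinate system, and uses the result as the suggested
   projection-plane distance.  The interocular distance is mapped through
   the same two scale factors (see e_v and e_w in Listing One). */
set_per_projection_plane()
{
   double d_pixels;      /* viewer distance in device coordinates (pixels) */
   double x_scale;       /* window-to-viewport scale factor along X        */

   x_scale = ( x_right_d - x_left_d ) / ( x_right - x_left );

   d_pixels   = VIEWER_DISTANCE / PIXEL_WIDTH;   /* inches -> pixels          */
   proj_plane = d_pixels / x_scale;              /* pixels -> viewing coord.  */
}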










Copyright © 1993, Dr. Dobb's Journal
