# CHAPTER 7: VECTORS AND MATRICES

Vectors and matrices are important components in mathematical computations. These are fixed-length 1-d and 2-d arrays, respectively, usually composed of numbers, with defined operations such as matrix multiplication, transposition, determinants, and so on. Vectors and matrices figure prominently in applications like coordinate transformations and in solving simultaneous equations. In this chapter, we'll look at how to create user-defined versions of these useful data structures.

#### Figure 7.1 Vector types.

Since matrices are arrays, you might be tempted to implement them using the array-of-arrays technique described in Chapter 4. If you do, however, the result will be less than satisfactory--for several reasons. First, it's not possible to allow both dimensions to be determined at runtime. Second, even if you use the most compact statically-sized form, you will still have some unnecessary overhead. For example, each row of a 2-d array implemented with Sarray objects has its own length variable, virtual function table pointer, and pointer to its array elements, as illustrated in Figure 7.2. Most of these variables are redundant. You could keep just one of each for the entire matrix.

A worse problem lies in figuring out how to represent vectors associated with a matrix. Row vectors are easy, since each row is already an array. But what about column vectors, or diagonal vectors? There is no easy way to treat the elements of a column or diagonal as an array in its own right, since each element is stored in a separate row array. To further explore this fundam ental problem, we'll step back and take a look at how two-dimensional arrays are organized using the built-in C++ array types.

## ROW-MAJOR AND COLUMN-MAJOR MATRICES

`float m[3][5];`

#### Figure 7.2 A matrix implemented as an array of arrays.

The way a matrix is stored affects the way subscripting works. To find the ith row and jth column of a row-major matrix, the following subscript calculation is used:

`m[i][j] <==> *(m + i*ncols + j); // For row-major matrices`

For a column-major matrix, the calculation is:

`m[i][j] <==> *(m + i + j*nrows); // For column-major matrices`

In both cases, we need the size of only one dimension to compute the subscript. For row-major matrices, we need the number of columns, and for column-major matrices, we must know the number of rows.

The memory layout of a matrix directly affects what type of vectors can be easily obtained from the matrix. For a row-major matrix, it's easy to define row vectors, since the elements of each row are contiguous. Column vectors are more difficult to define, since the elements are non-contiguous. For column-major matrices, the opposite is true, with row-vectors being difficult to define and column vectors easy to define.

Actually, it's not all that difficult to define non-contiguous vectors. The secret lies in usng vector strides, our next topic.

## ROW-MAJOR AND COLUMN-MAJOR MATRICES

`float m[3][5];`

#### Figure 7.2 A matrix implemented as an array of arrays.

The way a matrix is stored affects the way subscripting works. To find the ith row and jth column of a row-major matrix, the following subscript calculation is used:

`m[i][j] <==> *(m + i*ncols + j); // For row-major matrices`

For a column-major matrix, the calculation is:

`m[i][j] <==> *(m + i + j*nrows); // For column-major matrices`

In both cases, we need the size of only one dimension to compute the subscript. For row-major matrices, we need the number of columns, and for column-major matrices, we must know the number of rows.

The memory layout of a matrix directly affects what type of vectors can be easily obtained from the matrix. For a row-major matrix, it's easy to define row vectors, since the elements of each row are contiguous. Column vectors are more difficult to define, since the elements are non-contiguous. For column-major matrices, the opposite is true, with row-vectors being difficult to define and column vectors easy to define.

Actually, it's not all that difficult to define non-contiguous vectors. The secret lies in usng vector strides, our next topic.

## VECTOR POINTERS

`template<class TYPE>`

`class VecPtr {`

`private:`

`TYPE *p;               // Ordinary pointer`

`const unsigned stride; // Pointer stride`

`public:`

`VecPtr(TYPE *t, unsigned s=1)       : p(t), stride(s) { }`

`VecPtr(const VecPtr<TYPE> &s)       : p(s.p), stride(s.stride) { }`

`void operator=(const VecPtr<TYPE> &s) { p = s.p; }`

`void operator=(TYPE *t)               { p = t; }`

`void operator++()                     { p += stride; }`

`void operator++(int)                  { p += stride; }`

`void operator+=(unsigned n)           { p += stride * n; }`

`TYPE &operator[](unsigned i) const    { return p[stride*i]; }`

`operator TYPE *() const               { return p; }`

`TYPE &operator *() const              { return *p; }`

`#ifdef TYPE_IS_STRUCTURE`

`TYPE *operator->() const              { return p; }`

`#endif`

`};`

This class exemplifies everything that's right about C++. It turns vector pointers into concrete data types that look virtually like ordinary pointers. The class is flexible in that, being a template, it can be used for any vector element type. Yet because the class is completely inlined, there is very little overhead (if any). This class can be as efficient as any hand-coded, low-level pointer implementation.

The basic purpose of the VecPtr class is to hide the stride calculations. For example, the overloaded '++' operator turns vector pointer incrementing into stride additions. This allows us to write code like:

`for(int i = 0; i<n; i++) { *v = 0; v++; } `

rather than

`for(int i = 0; i<n; i++) { *v = 0; v += stride; }`

The subscript operator is overloaded to allow vector pointers to be used like the name of an array. For example:

`for(int i = 0; i<n; i++) v[i] = 0;`

Of course, a loop like this won't be as efficient as one that uses pointer incrementing, since it takes an extra multiplication for the subscript operation.

## Optimizing Vector Pointers

`p + i <==> ((char *)p) + i * sizeof(T)`

`template<class TYPE>`

`class VecPtr {`

`...`

`VecPtr (TYPE *t, unsigned s=1) : p(t), stride(s*sizeof(TYPE)) { }`

`void operator++()                   { (char *)p += stride; }`

`void operator++(int)                { (char *)p += stride; }`

`void operator+=(unsigned n)         { (char *)p += stride * n; }`

`TYPE &operator[](unsigned i) const  {`

`return *((TYPE *)((char *)p + stride*i));`

`}`

`...`

`};`

This change may speed up vector pointers by as much as 20 percent. Note that the optimization is completely encapsulated in the VecPtr class. Outside the class, you continue to work with pointer strides in their normal vector element units.

`float m[3][3]; // Declare a 3x3 row-major matrix`

`// Set all values to zero by pointing to matrix with`

`// a vector that can access all elements. A stride of`

`// 1 will do the trick.`

`VecPtr<float> va(m, 1);`

`for (int i = 0; i<9; i++) va[i] = 0;`

`// Set diagonal elements to 1 with another vector pointer`

`VecPtr<float> vd(m, 4); // Stride is ncols + 1`

`for (int j = 0; j<3; j++) vd[j] = 1;`

The resulting matrix would be:

`100`

`010`

`001`

Pointer Increment Problems

`*p++ <==> (v.operator++()).operator*()`

`VecPtr<TYPE> operator++(int)`

`{`

`VecPtr before (*this);  // Make copy of object`

`p += stride;            // Do increment`

`return before;          // Return copy`

`}`

However, what happen in the following (admittedly concocted) expression?

`(v++)++; // Undefined result`

Such constructs are actually undefined. That's because the result of a post-increment operation is not an 1-Value (it can be assigned to). So it's really illegal to try and increment the result again. Note that pre-increment operation are 1-Values and don't have this problem.

We've forced both forms of operator++( ) to return void. That way, expressions like (v++)++ and *p++ won't compile, so there's no danger of receiving erroneous results. Due to the void returns, both the prefix and suffix forms of operator++( ) can be implemented using the same prefix semantics. There's no way to use the operators in illegal ways.

## DYNAMIC MATRICES

When you use the built-in array types, it's not possible to directly define matrices that have both their dimensions determined at runtime. Now that you know about pointers and strides, we can show you why this is true. In Chapter 4, we mentioned that, although the number of columns must be fixed, there is a way to make the number of rows variable. Here is one way to do it:

`typedef float ArrNx5[5]; // Declare array-of-5-columns type`

`void f(unsigned n)`

`{`

`// Dynamically allocate nx5 matrix, referenced by p`

`ArrNx5 *p = new ArrNx5[n];`

`// Set lower-right element to 99`

`p[n - 1][4] = 99;`

`}`

The variable p represents a pointer to the first element in the first row of the respective matrix. Even though this variable points to a floating point element, its type isn't float * or even float ** as you first might think. Instead, p is a pointer to floating point array of five columns. This can easily be seen by p's declaration.

A stride is implicit with a pointer to an array. In this case, p has the constant 5 as its stride. With this in mind, we have the following equivalence:

`P[i] <==> *(p + i*5)`

An ordinary pointer gets returned by subscript of a pointer to array. This pointer points to the first element of the ith row. The pointer can then be used in a subscript calculation to find the jth column of the row. Hence, double subscripts such as p[i][j] can be used.

Interestingly enough, the subscripting equation just shown is the same one used for vector pointer strides! The crucial difference is that, here, the stride is constant. Because of this, the compiler is able to easily plug in the constant stride in subscript and pointer increment calculations.

Contrast this with variable strides (such as in a dynamic matrix with the number of columns determined at runtime). Here, the compiler could not simply plug in values for the stride, because it doesn't know what the stride should be! The compiler would be forced to create a special data structure that carries the stride around with the pointer. This approach ought to sound familiar to you, since that's exactly what vector pointers do!

The conclusion is this: if you want to have dynamic matrices with both dimensions determined at runtime, you need to have vector pointers. Since C++ doesn't have vector pointers built in, it isn't able to handle dynamic matrices directly. However, nothing prevents you from creating your own dynamic matrices, as you'll now see.

## A Simple Dynamic Matrix Class

From the preceding discussions, two basic techniques emerge for defining dynamically sized matrices:

1. Represent the matrix as a one-dimensional array, using row-major or column-major order (row-major ordering is easiest). Allocate this array dynamically.

2. Use vector pointers to conveniently access rows, columns, and diagonals.

`template<class TYPE>`

`class SimpleMatrix {`

`private:`

`TYPE *data;`

`unsigned nrows, ncols;`

`public:`

`SimpleMatrix(unsigned nr, unsigned nc);`

`SimpleMatrix(const SimpleMatrix<TYPE> &m);`

`~SimpleMatrix();`

`void Copy(const SimpleMatrix<TYPE> &m);`

`SimpleMatrix<TYPE> &operator=(const SimpleMatrix<TYPE> &m);`

`TYPE *operator[](unsigned r);`

`VecPtr<TYPE> RowPtr(unsigned r=0);`

`VecPtr<TYPE> ColPtr(unsigned c=0);`

`VecPtr<TYPE> DiagPtr();`

`VecPtr<TYPE> PtrToAll();`

`const TYPE *operator[](unsigned r) const;`

`VecPtr<const TYPE> RowPtr(unsigned r=0) const;`

`VecPtr<const TYPE> ColPtr(unsigned c=0) const;`

`VecPtr<const TYPE> DiagPtr() const;`

`VecPtr<const TYPE> PtrToAll() const;`

`unsigned NRows() const;`

`unsigned NCols() const;`

`unsigned NElems() const;`

`};`

Note The complete code for the SimpleMatrix class is given on disk in the files simpmat.h and simpmat.mth. Test programs for this class are given in the files tstsmat.cpp and tstsmat2.cpp.

This class isn't a complete implementation of matrices. Its purpose is to show you the basic design and to point out some performance and safety problems. Later in this chapter, we'll create complete Vector and Matrix classes that are more robust.

The SimpleMatrix class stores the matrix data in a row-major 1-d array, pointed to by the member data, as shown in the following constructor:

`template<class TYPE>`

`SimpleMatrix<TYPE>::SimpleMatrix(unsigned nr, unsigned nc)`

`// General constructor`

`{`

`nrows = nr; ncols = nc;`

`data = new TYPE[nrows*ncols];`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::RowPtr(unsigned r)`

`// Return a row vector pointer to row r`

`{`

`return VecPtr<TYPE>(data + r*ncols, 1);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c)`

`// Return a column vector pointer to column c`

`{`

`return VecPtr<TYPE>(data+c, nrows);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::DiagPtr()`

`// Return a diagonal vector pointer. In case matrix`

`// isn't square, the smallest dimension is used to`

`// determine the diagonal stride.`

`{`

`unsigned dstride = ((nrows > ncols) ? ncols : nrows) + 1;`

`return VecPtr<TYPE>(data, dstride);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::PtrToAll()`

`// Return pointer to all elements of the matrix`

`{`

`return VecPtr<TYPE>(data, 1);`

`}`

Here is an example that uses these functions:

`SimpleMatrix<int> m(3, 3);`

`int i;`

`// Set all elements to '4'`

`VecPtr<int> va = m.PtrToAll();`

`for(i = 0; i<m.NElems(); i++) va[i] = 4;`

`// Set row 1 elements to '2'`

`VecPtr<int> vr = m.RowPtr(1);`

`for(i = 0; i<m.NCols(); i++) vr[i] = 2;`

`// Set col 1 elements to '3'`

`VecPtr<int> vc = m.ColPtr(1);`

`for(i = 0; i<m.NRows(); i++) vc[i] = 3;`

`// Set diagonal elements to '1'`

`VecPtr<int> vd = m.DiagPtr();`

`for(i = 0; i<m.NRows(); i++) vd[i] = 1;`

The resulting matrix would be:

`134`

`212`

`431`

## Two-Dimensional Subscripting

`template<class TYPE>`

`TYPE *SimpleMatrix<TYPE>::operator[](unsigned r)`

`{`

`return data + r*ncols;`

`}`

`SimpleMatrix<float> m(3, 4); // Declare 3x4 matrix`

`m[1][2] = 99;`

#### Figure 7.5 Two-dimensional subscripting.

`// Alternative definition for []`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::operator[](unsigned r)`

`{`

`return VecPtr(data + r*ncols, 1);`

`}`

`m[2][4] = 99; // Set lower-right element to 99`

However, the result won't be as efficient as using the original form of the subscript function. The statement translates into:

`// First call SimpleMatrix::operator[], then VecPtr::operator[]`

`(m.operator[](2)).operator[](4) = 99;`

By substituting the bodies of the two different operator functions, we can transform the statement into the following equivalent pointer arithmetic:

`*((data + 2*ncols) + 1*4) = 99;`

Two multiplications take place: one to compute the start of row 2 and one to find column 4 of that row. Note that the multiplication 1*4 has to take place, even though just using the number 4 would suffice. That's because the number 1 represents the value stored in the stride variable for the row vector. Because a general vector pointer type is being used, we can't optimize for some vectors having a stride of 1. In the case where an ordinary pointer is returned, this optimization automatically takes place. A more efficient calculation results:

`*((data + 2*ncols) + 4) = 99; // Optimized version`

Tests show that returning an ordinary pointer from the matrix subscript function, rather than returning a vector pointer, yields speeds almost twice as fast for operations like matrix multiplication. Next, we'll show you how to define matrix multiplication.

## Matrix Multiplication

The following function implements matrix multiplication:

`void MatMult(SimpleMatrix<Number> &c,`

`const SimpleMatrix<Number> &a, const SimpleMatrix<Number> &b)`

`{`

`unsigned i, j, k;`

`for (i = 0; i < a.NRows(); i++) {`

`for (j = 0; j < b.NCols(); j++) {`

`Number sum = 0;`

`for (k = 0; k < b.NRows(); k++) sum += a[i][k] * b[k][j];`

`c[i][j] = sum;`

`}`

`}`

`}`

We can also overload the '*' operator for matrix multiplication, as follows:

`template<class TYPE>`

`SimpleMatrix<TYPE>`

`operator*(const SimpleMatrix<TYPE> &a, const SimpleMatrix<TYPE> &b)`

`{`

`SimpleMatrix<TYPE> r(a.NRows(), b.NCols());`

`MatMult(r, a, b);`

`return r; // Copy constructor called`

`}`

#### Figure 7.6 Matrix multiplication.

This example multiplies two matrices:

`SimpleMatrix<float> a(3, 2), b(2, 4), c(3, 4);`

`...`

`c = a * b;`

As shown in MatMult( ), double subscripting is the obvious way to implement matrix multiplication. However, this implies a multiplication for each subscripting operation. Since we are accessing array elements in sequence, it might be faster to use pointer incrementing. Here's an alternative matrix multiplication function that uses vector pointers--and is roughly 25 percent faster than the double-subscripting method--for double-precision floating-point matrices:

`template<class TYPE>`

`void MatMult1(SimpleMatrix<TYPE> &c,`

`const SimpleMatrix<TYPE> &a, const SimpleMatrix<TYPE> &b)`

`{`

`unsigned i, j, k;`

`VecPtr<const TYPE> ar = a.RowPtr();`

`VecPtr<const TYPE> ac = a.ColPtr();`

`VecPtr<const TYPE>br = b.RowPtr();`

`VecPtr<const TYPE>bc = b.ColPtr();`

`VecPtr<TYPE> cr = c.RowPtr();`

`VecPtr<TYPE> cc = c.ColPtr();`

`TYPE const *bstart = br;`

`for (i = 0; i < a.NRows(); i++) {`

`cr = cc; // Point to row i. Note: stride not copied`

`br = bstart;`

`for (j = 0; j < b.NCols(); j++) {`

`TYPE sum = 0;`

`ar = ac; // Point row i. Note: stride not copied.`

`bc = br; // Point column j. Note: stride not copied.`

`for (k = 0; k < b.NRows(); k++) {`

`sum += *ar * *bc;`

`ar++; // Next column`

`bc++; // Next row`

`}`

`br++; // Next column`

`*cr = sum;`

`cr++; // Next column`

`}`

`ac++; // Next row`

`cc++; // Next row`

`}`

`}`

There's a way to be even more efficient. Note that the row vectors always have a stride of one, so we can replace them by ordinary pointers. You may see speed improvements of 5 to 10 percent by doing this. The changes involved are:

`Original Statement                   New Statement`

`VecPtr<const TYPE> ar = a.RowPtr();  const TYPE *ar = a.RowPtr();`

`VecPtr<const TYPE> br = b.RowPtr();  const TYPE *br = b.RowPtr();`

`VecPtr<TYPE> cr = c.RowPtr();        TYPE *cr = c.RowPtr();`

You may wonder about the use of the const keywords in these statements. We'll look at these keywords next.

## Constant Vectors and Matrices

True to the design principles explained in Chapters 2 and 3, we've made the VecPtr and SimpleMatrix classes work for constant objects. You'll notice that we supply two versions of the[ ]operator for SimpleMatrix:

`template<class TYPE>`

`TYPE *SimpleMatrix<TYPE>::operator [](unsigned r)`

`// Called for non-const matrices`

`{`

`return data + r*ncols;`

`template<class TYPE>`

`const TYPE *SimpleMatrix<TYPE>::operator[](unsigned i) const`

`// Called for const matrices`

`{`

`return data + r*ncols;`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c)`

`// Called for non-const matrices`

`{`

`return VecPtr<TYPE>(data+c, nrows);`

`}`

`template<class TYPE>`

`VecPtr<const TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c) const`

`// Called for const matrices`

`{`

`return VecPtr<const TYPE>(data+c, nrows);`

`}`

When returning a column vector from a constant matrix, we must somehow prevent the elements of the vector from being modified. To do this, we type the vector pointer to reference const TYPE elements, rather than TYPE elements. Note this is not the same as making the vector pointer itself constant. For instance, had we written const VecPtr<TYPE> instead of VecPtr<const TYPE>, we would have created a pointer that couldn't be modified. Thus, we couldn't do things like pointer incrementing, so the pointer would be somewhat useless.

## Matrix Transposition

The following Transpose( ) function shows the obvious implementation of a matrix transposition.

`template<class TYPE>`

`SimpleMatrix<TYPE> Transpose(const SimpleMatrix<TYPE> &m)`

`{`

`SimpleMatrix<TYPE> t(m.NCols(), m.NRows());`

`for (unsigned i = 0; i<m.NRows(); i++)`

`for (unsigned j = 0; j<m.NCols(); j++) t[j][i] = m[i][j];`

`return t; // Copy constructor called.`

`}`

#### Figure 7.7 Transpose of a matrix.

With this implementation, though, an extra copy takes place when the transposed version is returned. We could circumvent this problem by modifying Transpose( ) to assume that the result matrix has already been built, and to pass it as a parameter:

`template<class TYPE>`

`void SimpleMatrix<TYPE>`

`Transpose(SimpleMatrix<TYPE> &t, const SimpleMatrix<TYPE> &m)`

`{`

`for (unsigned i = 0; i<m.NRows(); i++)`

`for (unsigned j = 0; j<m.NCols(); j++) t[j][i] = m[i][j];`

`}`

Of course, this design assumes that you have created the result matrix with the proper size, an error-prone activity.

When it's acceptable to modify the matrix in place, there is another way to accomplish matrix transposition. Transposing a matrix is like changing it from row-major to column-major ordering. Instead of actually moving elements around, you can simply view the matrix as though it were column major. Here are the steps needed to make this work:

2. Interchange the number of rows and columns, and interchange the row and column strides.

`template<class TYPE>`

`void SimpleMatrix<TYPE>::Transpose()`

`// Transpose a matrix in place`

`{`

`// Interchange nrows and ncols`

`unsigned temp = nrows;`

`nrows = ncols;`

`ncols = temp;`

`// Interchange row stride and column stride`

`temp = rowstride;`

`rowstride = colstride;`

`colstride = temp;`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::RowPtr(unsigned r)`

`{`

`return VecPtr<TYPE>(data + r*colstride, rowstride);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned r)`

`{`

`return VecPtr<TYPE>(data + c*rowstride, colstride);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> SimpleMatrix<TYPE>::operator[](unsigned r)`

`{`

`return VecPtr<TYPE>(data + r*colstride, rowstride);`

`}`

Notice the generalized calculations used to determine the row and column vectors. Remember that, in these functions, either rowstride or colstride is equal to one. The other variable is equal to the number of columns in the original, non-transposed matrix. Another important point is that the operator[ ]( ) function can no longer return an ordinary pointer because there's no guarantee that a row vector will have a stride of one. The matrix might be transposed.

This technique eliminates the need to actually move the matrix elements around. Of course, this only works if you want to transpose a matrix in place. If you need to leave the original matrix alone, you must make a copy. In this case, doing the actual interchange between rows and columns is just as efficient. However, as we'll discuss next, a more sophisticated matrix class can be defined that eliminates the data movement, even when the original matrix is to be left untouched.

## MORE ROBUST VECTORS AND MATRICES

The SimpleMatrix class uses vector pointers as a fundamental part of its design. While vector pointers are fast and efficient, there is no way to do range checking on subscripts, since vector pointers do not keep track of the bounds of the vectors. This isn't necessarily a problem. Since range checking can easily reduce the performance of your code by 50 percent, you don't always want to use it. However, the capability should still be provided.

Along these lines, we'll next present two classes, Vector and Matrix, that are extensions of the VecPtr and SimpleMatrix classes and that incorporate the techniques of range checking and reference counting. Fundamental to the design of these two classes is that, strangely enough, each matrix object is actually composed of a 1-d vector object! The purpose of the Matrix class is to interpret the vector data as though it were two-dimensional.

Because vector objects will use a shared representation, it follows that matrices will, as well. This shared representation allows us to easily support efficient submatrix and matrix transpose operations. As you study the following design, note how much overhead is caused by the shared representation, and how much that overhead is compensated for by more efficient matrix operations.

## THE VECTOR CLASS

The Vector class is similar in many ways to the String class described in Chapter 6. (You should review that chapter now if you haven't already read it.) The Vector class uses the same letter-envelope reference counting scheme, having an underlying vector representation class VecRep:

`template<class TYPE> class Vector; // Forward declaration`

`template<class TYPE>`

`class VecRep {`

`private:`

`friend class Vector<TYPE>;`

`unsigned alloclen;`

`unsigned refcnt;`

`TYPE data[1];`

`VecRep(unsigned d);`

`void *operator new(size_t n, unsigned d);`

`void operator delete(void *p);`

`public: // So we can set up null_rep`

`static VecRep<TYPE> null_rep;`

`VecRep();`

`~VecRep();`

`};`

The complete code for the VecRep class is given on disk in the files vector.h and vector.mth.

`template<class TYPE>`

`VecRep<TYPE>::VecRep(unsigned d)`

`// Constructor to initialize a shared vector rep`

`{`

`if (this != &null_rep) {`

`// We only want to do the following if we're`

`// not referencing null_rep`

`alloclen = d;`

`refcnt = 1;`

`// We must call the default constructor for`

`// all but the first element`

`TYPE *q = data + 1;`

`for(unsigned i = 1; i<d; i++, q++) new(q) TYPE;`

`}`

`}`

`template<class TYPE>`

`void VecRep<TYPE>::~VecRep()`

`// Destructor to explicitly destroy all elements`

`// of the vector, except the first`

`{`

`TYPE *q = data + 1;`

`for(unsigned i = 1; i<alloclen; i++, q++) q->TYPE::~TYPE();`

`}`

With VecRep objects, we must track the number of elements in the vector so that we know how many times to call the constructor and destructor. To do this, the alloclen member is stored with each VecRep object.

As is true with strings, we use a null-vector representation called null_rep, which allows us to elegantly handle empty vectors--particularly those caused by memory allocation problems. Unlike all other VecRep objects that must be allocated dynamically, null_rep must be allocated statically. You must allocate one null_rep object per vector element type somewhere in your program. A macro is supplied to make this easier:

`#define INITNULLVEC(TYPE) VecRep<TYPE> VecRep<TYPE>::null_rep;`

With the VecRep class in hand, the Vector class can be defined:

`template<class TYPE> class Matrix; // Forward declaration`

`template<class TYPE>`

`class Vector {`

`protected:`

`friend class Matrix<TYPE>;`

`VecRep<TYPE> *rep; // Pointer to shared vector data`

`TYPE *start;       // Pointer to logical start of data`

`unsigned len;      // Number of logical elements`

`unsigned stride;   // Stride (offset to next logical element)`

`int Alloc(unsigned n);`

`void Bind(const Vector<TYPE> &v);`

`void Unbind();`

`void NewBinding(const Vector<TYPE> &v);`

`public:`

`Vector(unsigned n = 0, const TYPE *s = 0);`

`Vector(const Vector<TYPE> &v);`

`Vector(const Vector<TYPE> &v, SliceType styp,`

`unsigned n=0, unsigned str=1, unsigned ofs=0);`

`~Vector();`

`#ifndef NO_RANGE_CHECK`

`unsigned CheckIndx(unsigned i) const;`

`#endif`

`TYPE &operator[](unsigned i);`

`const TYPE &operator[](unsigned i) const;`

`void CopyN(const TYPE *src, unsigned n);`

`void Copy(const Vector<TYPE> &v);`

`void Share(const Vector<TYPE> &v);`

`void SetElements(const TYPE &x);`

`Vector<TYPE> &operator=(const Vector<TYPE> &v); `

`Vector<TYPE> &operator=(const TYPE &x);`

`int IsNull() const;`

`int IsUnique() const;`

`Vector<TYPE> Clone() const;`

`int EnsureUnique();`

`unsigned Length() const;`

`unsigned Stride() const;`

`// Low-level hooks`

`VecPtr<TYPE> All();`

`VecPtr<const TYPE> All() const;`

`};`

The complete code for the Vector class is given on disk in the files vector.h and vector.mth. Test programs for vectors are supplied in tstvec.cpp, tstvec2.cpp, and tstvec3.cpp.

`template<class TYPE>`

`Vector<TYPE>::Vector(unsigned n, const TYPE *s)`

`// Constructor to allocate space for an n element vector,`

`// and, if s != 0, to copy low-level array into it.`

`// If n == 0 or allocation fails, we bind to null rep,`

`// and do no copying.`

`{`

`if (Alloc(n) && s) CopyN(s, n);`

`}`

`template<class TYPE>`

`int Vector<TYPE>::Alloc(unsigned n)`

`// Allocates a block of n elements for this vector.`

`// If n == 0 or allocation fails, we bind to the null rep.`

`// ASSUMES TYPE has a default constructor.`

`// Returns 0 if we've bound to null_rep otherwise, returns 1.`

`{`

`if (n) {`

`rep = new(n) VecRep<TYPE>(n);`

`}`

`else {`

`rep = &VecRep<TYPE>::null_rep;`

`rep->refcnt++;`

`}`

`start = rep->data;`

`if (rep != &VecRep<TYPE>::null_rep) {`

`len = n;`

`stride = 1;`

`return 1;`

`}`

`else {`

`len = 0;`

`stride = 1;`

`return 0;`

`}`

`}`

The Alloc( ) function is similar to the one used in the String class, with the overloaded new operator for VecRep handling the dynamic allocation chores. Also, to handle any allocation problems, we use a zero-length vector that references null_rep. Note that new vectors initially have a stride of one, and that the vectors are fixed length (with the logical length set to the allocated dimensioned length). Here are some ways to create vectors:

`Vector<float> v(10); // Create a new vector of 10 elements`

`Vector<int> w;       // Create a null vector`

Like the String copy constructor, the Vector copy constructor actually uses share semantics:

`template<class TYPE>`

`Vector<TYPE>::Vector(const Vector<TYPE> &v)`

`// Copy constructor that shares all of the vector v`

`{`

`Bind(v);`

`len = v.len;`

`stride = v.stride;`

`start = v.start;`

`}`

The copy constructor's use of share semantics is all-important, since this allows us to return vectors from functions by value, without having to copy all of the vector elements.

The assignment operator also uses share semantics:

`template<class TYPE>`

`Vector<TYPE> &Vector<TYPE>::operator=(const Vector<TYPE> &v)`

`// Share semantics used for assignment`

`{`

`if (this != &v) Share(v); // Note trap for assignment to self`

`return *this;`

`}`

`template<class TYPE>`

`void Vector<TYPE>::Share(const Vector<TYPE> &v)`

`// Used to shares data with vector v`

`{`

`NewBinding(v);`

`len = v.len;`

`stride = v.stride;`

`start = v.start;`

`}`

## THE MATRIX CLASS

`template<class TYPE>`

`class Matrix {`

`protected:`

`Vector<TYPE> data;`

`unsigned nrows, ncols, rowstride, colstride;`

`public:`

`Matrix(unsigned nr=0, unsigned nc=0, const TYPE *s = 0);`

`Matrix(const Matrix<TYPE> &m);`

`Matrix(const Matrix<TYPE> &m, SliceType styp,`

`unsigned sr=0, unsigned sc=0,`

`unsigned nr=0, unsigned nc=0);`

`void Copy(const Matrix<TYPE> &m);`

`void Share(const Matrix<TYPE> &m);`

`Matrix<TYPE> Transpose();`

`Matrix<TYPE> &operator=(const Matrix<TYPE> &m);`

`Matrix<TYPE> &operator=(const TYPE &x);`

`#ifndef NO_RANGE_CHECK`

`unsigned CheckRow(unsigned i) const;`

`unsigned CheckCol(unsigned i) const;`

`#endif`

`Vector<TYPE> operator[](unsigned r);`

`const Vector<TYPE> operator[](unsigned r) const;`

`TYPE &operator()(unsigned r, unsigned c);`

`const TYPE &operator()(unsigned r, unsigned c) const;`

`Vector<TYPE> Row(unsigned r, SliceType styp=SHARED);`

`Vector<TYPE> Col(unsigned c, SliceType styp=SHARED);`

`Vector<TYPE> Diag(SliceType styp=SHARED);`

`Vector<TYPE> All(SliceType styp=SHARED);`

`const Vector<TYPE> Row(unsigned r, SliceType styp=SHARED) const;`

`const Vector<TYPE> Col(unsigned c, SliceType styp=SHARED) const;`

`const Vector<TYPE> Diag(SliceType styp=SHARED) const;`

`const Vector<TYPE> All(SliceType styp=SHARED) const;`

`// Mid-level hooks. Use these at your own risk:`

`VecPtr<TYPE> RowPtr(unsigned r=0);`

`VecPtr<TYPE> ColPtr(unsigned c=0);`

`VecPtr<TYPE> DiagPtr();`

`VecPtr<TYPE> PtrToAll();`

`VecPtr<const TYPE> RowPtr(unsigned r=0) const;`

`VecPtr<const TYPE> ColPtr(unsigned c=0) const;`

`VecPtr<const TYPE> DiagPtr() const;`

`VecPtr<const TYPE> PtrToAll() const;`

`int IsNull() const;`

`int IsUnique() const;`

`Matrix<TYPE> Clone() const;`

`int EnsureUnique();`

`unsigned NCols() const;`

`unsigned NRows() const;`

`unsigned RowStride() const;`

`unsigned ColStride() const;`

`int IsRowMajor() const;`

`int IsColMajor() const;`

`int IsSquare() const;`

`};`

The complete code for the Matrix class is given on disk in the files matrix.h and matrix.mth. Test programs are provided in the files tstmat.cpp, tstmat2.cpp, tstmat3.cpp, and tstmat4.cpp.

## Constructing and Assigning Matrices

`template<class TYPE>`

`Matrix<TYPE>::Matrix(unsigned nr, unsigned nc, const TYPE *s)`

`: data(nr * nc, s)`

`{`

`if (data.IsNull()) {`

`nrows = 0; ncols = 0; colstride = 1; rowstride = 1;`

`}`

`else {`

`nrows = nr; ncols = nc; colstride = nc; rowstride = 1;`

`}`

`}`

Note how the possibility of null matrices (due either to passing zero sizes or memory allocation problems) is handled properly.

The copy constructor and overloaded assignment operators use share semantics:

`template<class TYPE>`

`Matrix<TYPE>::Matrix(const Matrix<TYPE> &m)`

`: data(m.data)`

`{`

`if (data.IsNull()) {`

`nrows = 0; ncols = 0; colstride = 1; rowstride = 1;`

`}`

`else {`

`nrows = m.nrows; ncols = m.ncols;`

`colstride = m.colstride; rowstride = m.rowstride;`

`}`

`}`

`template<class TYPE>`

`Matrix<TYPE> &Matrix<TYPE>::operator=(const Matrix<TYPE> &m)`

`// Share semantics used for assignment.`

`{`

`if (this != &m) Share(m); // Note trap for assignment to self`

`return *this;`

`}`

`template<class TYPE>`

`void Matrix<TYPE>::Share(const Matrix<TYPE> &m)`

`{`

`data.Share(m.data);`

`nrows = m.nrows; ncols = m.ncols;`

`colstride = m.colstride; rowstride = m.rowstride;`

`}`

Here's an example of these functions:

`float one_d_arr[4] = {0, 1, 2, 3};`

`Matrix<float> m(3, 5);            // Elements left uninitialized`

`Matrix<float> w(2, 2, one_d_arr); // Elements set to {{0, 1},{2, 3}}`

`Matrix<float> z(w);               // Copy constructor called`

`m = z;                            // m now reassigned to z's data`

## Creating Submatrices

The Matrix class can support submatrices, which SimpleMatrix couldn't do. These submatrices can either overlay an existing matrix (with the data being shared) or be separate copies. Figure 7.10 shows an example of overlaying a 3x5 submatrix onto a 6x7 matrix.

`template<class TYPE>`

`Matrix<TYPE>::Matrix(const Matrix<TYPE> &m, SliceType styp,`

`unsigned sr, unsigned sc,`

`unsigned nr, unsigned nc)`

`// Constructor that constructs a submatrix of matrix m`

`// If styp==SHARED, it means to share the data with m`

`// NOTE: If sharing, we initially create a null vector, and`

`// then we immediately rebind to shared vector. If copying,`

`// a row-major submatrix is created, and if nr == 0,`

`// then m.nrows is used. If nc == 0, then m.ncols is used.`

`: data((styp == SHARED) ?`

`0 : ((nr ? nr : m.nrows) * (nc ? nc : m.ncols)))`

`{`

`unsigned start_ofs, n;`

`if (nr) nrows = nr; else { nrows = m.nrows; sr = 0; }`

`if (nc) ncols = nc; else { ncols = m.ncols; sc = 0; }`

`if (styp == SHARED) { // Sharing`

`// When sharing, strides are always the same as parent strides`

`colstride = m.colstride;`

`rowstride = m.rowstride;`

`// Calculate length of underlying vector and offset.`

`// Remember: either colstride or rowstride == 1.`

`if (rowstride == 1) {`

`n = colstride * (nrows-1) + ncols;`

`}`

`else {`

`n = rowstride * (ncols-1) + nrows;`

`}`

`start_ofs = sr * colstride + rowstride * sc;`

`data.Share(Vector<TYPE>(m.data, SHARED, n, 1, start_ofs));`

`}`

`else {`

`if (data.IsNull()) {`

`nrows = 0; ncols = 0; colstride = 1; rowstride = 1;`

`}`

`else {`

`// Set up as a row-major matrix`

`colstride = ncols; rowstride = 1;`

`// Copy data from shared submatrix into allocated space.`

`// Note that the constructor is called recursively here`

`// to create a shared submatrix used for the copy.`

`Copy(Matrix<TYPE>(m, SHARED, sr, sc, nrows, ncols));`

`}`

`}`

`}`

#### Figure 7.10 A submatrix of a matrix.

When making a submatrix copy, we use the same type of recursive call to the constructor that we used in the constructor for subvectors in the Vector class.

`Matrix<float> m(6, 7);`

`...`

`Matrix<float> subm(m, SHARED, 2, 2, 3, 5);`

The last four parameters of the submatrix constructor, which define the origin and size of the submatrix, default to zero. In this case, the submatrix encompasses all of the original matrix. This approach can be used to create an actual copy of the matrix--which the copy constructor can't do--since it uses share semantics exclusively. For example:

`Matrix<float> copy_of_m(m, COPIED);`

The two variables rowstride and colstride provide the key to allowing a submatrix to share data with its parent. For normal row-major matrices, the column stride is always equal to the number of columns in the matrix. However, that isn't necessarily true for submatrices. Consider, for instance, the 3x5 submatrix shown earlier in Figure 7.10. You can see that its column stride should be the number of columns of the parent 6x7 matrix, and thus should be 7 instead of 5.

If one of the matrices is transposed or yet another submatrix is taken of a submatrix, the matter is further complicated. However, by judicious use of rowstride and colstride, the strides are never confused. Here's the basic rule: the row and column strides of a submatrix are always the same as their parents' strides and must always reflect the physical layout of the data. The real data lies somewhere up the chain of parents and is organized in row-major order. Of course, if one parent is transposed, the row and column strides will be interchanged.

## Transposing Matrix Object

`template<class TYPE>`

`Matrix<TYPE> Matrix<TYPE>::Transpose()`

`// Returns a transposition of this matrix. Does not have to move`

`// the elements of the matrix around, just merely changes`

`// interpretation between row-major and column-major ordering.`

`{`

`Matrix<TYPE> t(*this); // Remember, copy constructor shares`

`// Interchange number of rows and cols`

`unsigned temp = t.nrows;`

`t.nrows = t.ncols;`

`t.ncols = temp;`

`// Interchange row stride and column stride`

`temp = t.colstride;`

`t.colstride = t.rowstride;`

`t.rowstride = temp;`

`return t;`

`}`

`Matrix<float> m(3, 5); // Create a 3x5 matrix`

`...`

`Matrix<float> t;       // Create a null matrix`

`t = m.Transpose();`

Because both assignment and copy construction use share semantics, we can write statements like the last one without worrying about undue copying taking place. If the matrix being transposed is large, we can indeed avoid a lot of data movement.

Of course, you don't get something for nothing. Clever use of rowstride and colstride for matrix transposition does have its price. The vector extraction functions, to be given next, aren't as efficient as they could be; they can't assume anything about the vector strides. For instance, a row vector may or may not have a stride of one, so we can't use an ordinary pointer to it.

## Extracting Vectors from a Matrix

Here are the RowPtr( ) and ColPtr( ) functions for the Matrix class. The other extraction functions are defined similarly. Also note that const versions can be defined for constant matrices (not shown):

`template<class TYPE>`

`VecPtr<TYPE> Matrix<TYPE>::RowPtr(unsigned r)`

`// Returns vector pointer to row r`

`{`

`return VecPtr<TYPE>(data.start+CHECKROW(r)*colstride, rowstride);`

`}`

`template<class TYPE>`

`VecPtr<TYPE> Matrix<TYPE>::ColPtr(unsigned c)`

`// Returns vector pointer to column c`

`{`

`return VecPtr<TYPE>(data.start+CHECKCOL(c)*rowstride, colstride);`

`}`

In these functions, range checking is handled by calls to the macros CHECKROW( ) and CHECKCOL( ). These macros are the two-dimensional counterparts to the CHECK( ) macro used in the Array classes (described in Chapter 4), and are defined in the range.h header file. The macros do nothing if NO_RANGE_CHECK is undefined.

`template<class TYPE>`

`Vector<TYPE> Matrix<TYPE>::Row(unsigned r, SliceType styp)`

`// Return a row slice of a matrix`

`{`

`return Vector<TYPE>(data, styp,`

`ncols, rowstride, CHECKROW(r)*colstride);`

`}`

`template<class TYPE>`

`Vector<TYPE> Matrix<TYPE>::Col(unsigned c, SliceType styp)`

`// Return a column slice of a matrix`

`{`

`return Vector<TYPE>(data, styp,`

`nrows, colstride, CHECKCOL(c)*rowstride);`

`}`

Notice that, by using the styp parameter, the vectors returned can either share their data with the original matrix or can be separate copies. The default action is to share the data.

## Subscripting with the Matrix Class

`template<class TYPE>`

`Vector<TYPE> Matrix<TYPE>::operator[](unsigned r)`

`// Row subscripting operator for non-const matrices.`

`// This routine does the same thing as Row(r, SHARED).`

`{`

`return Vector<TYPE>(data, SHARED,`

`ncols, rowstride, CHECKROW(r)*colstride);`

`}`

Since vectors have their own subscript operator function defined, you can write code like the following:

`Matrix<float> m(3, 5);`

`m[2][4] = 99; // Set lower-right element to 99`

You can circumvent this problem by defining operator[ ]( ) to return a vector pointer rather than a full vector. This will yield a more efficient subscript operation (perhaps only twice as slow as the fastest version), but you lose the ability to do range checking on the row vector. Of course, you can eliminate the subscripting altogether and instead use vector pointer incrementing. Doing so will yield speeds as fast as those for the SimpleMatrix class. This is an important benefit because you can obtain improved the performance even though you now have generalized matrices (such as transposed submatrices).

Another alternative is to define a different style of subscript operator. For example, you can overload the ( ) operator (normally used for function call syntax) to do two-dimensional subscripting:

`template<class TYPE>`

`TYPE &Matrix<TYPE>::operator()(unsigned r, unsigned c)`

`{`

`return data.start[CHECKROW(r)*colstride + CHECKCOL(c)*rowstride];`

`}`

`...`

`Matrix<float> m(3, 5); // Construct matrix`

`...`

`m(2, 4) = 99;          // Set lower-right element to 99`

The equation used in the operator( )( ) function is a generalization of the two-dimensional subscript equation used for row-major matrices. Normally, you multiply the row index by the number of columns and then add the column index, as in:

`m(r, c) = *(m + r*ncols + c)`

However, in the general case, the matrix might be transposed and might be a submatrix. Thus, we have to multiply both the row index and the column index by the appropriate row and column strides. Note that one of the strides will be one, so in effect we'll have either

`m(r, c) = *(m + r*colstride + c) // If row major`

or

`m(r, c) = *(m + r + c*rowstride) // If column major`

Thus, one of the multiplications in operator( )( ) is redundant. That's the price we pay for generality. Even so, expressions like m(i,j) have been timed to be two to three times faster than their m[i][j] counterparts.

The expression m(i,j) is actually the conventional notation used in mathematics for subscripting. Because of this, some people prefer to use the two-dimensional operator over its double-subscripting m[i][j] counterpart. Others prefer the more conventional C++ syntax for subscripting. Regardless of your personal preference, the mathematical notation will be significantly faster unless you define the [ ] operator for matrices to return a vector pointer rather than a full vector. In that case, the results are fairly even.

## Non-Contiguous Submatrices

A matrix is said to be contiguous if you can access all of the elements using a single 1-d vector. Normal matrices created by the Matrix class will always be contiguous, regardless of whether they are transposed. However, its possible for submatrices that are shared with a parent to be non-contiguous. Figure 7.11 shows some examples of contiguous and non-contiguous submatrices, assuming a row-major organization.

From Figure 7.11, the following rule emerges for determining whether a submatrix is contiguous: either the number of columns must equal the column stride or the number of rows must be one. For column-major matrices (not shown in Figure 7.11), the number of rows must equal the row stride or the number of columns must be one. The function All( ), which attempts to return a 1-d vector to all the elements in a matrix, makes these tests for contiguousness. If the matrix isn't contiguous, All( ) returns a null vector because it isn't possible to access all the elements with a single vector:

#### Figure 7.11 Some non-contigous submatrices.

`template<class TYPE>`

`Vector<TYPE> Matrix<TYPE>::All(SliceType styp)`

`// Returns a 1-d vector with all of the data for the matrix.`

`// If it isn't possible for the 1-d data to be contiguous`

`// (as will be the case for some submatrices), then a`

`// null vector is returned.`

`// ASSUMES either colstride or rowstride == 1.`

`{`

`if ((rowstride == 1 && nrows > 1 && ncols != colstride) ||`

`(colstride == 1 && ncols > 1 && nrows != rowstride))`

`return Vector<TYPE>(); // Return a null vector`

`return Vector<TYPE>(data, styp, nrows * ncols, 1, 0);`

`}`

As an example of handling non-contiguous submatrics, here is an overloaded assignment operator that sets all the elements of the matrix to a specified value. This is done by doing the assignments row by row, using vector pointers:

`template<class TYPE>`

`Matrix<TYPE> &Matrix<TYPE>::operator=(const TYPE &x)`

`// Set all elements in the matrix to the value x.`

`// Assumes a worst case non-contiguous submatrix.`

`{`

`// tc = column vector pointer to rows of t`

`// tr = row vector pointer to columns of t`

`VecPtr<TYPE> tc(data.start, colstride);`

`VecPtr<TYPE> tr(data.start, rowstride);`

`for (unsigned i = 0; i<nrows; i++) {`

`tr = tc; // Remember, strides aren't assigned!`

`for (unsigned j = 0; j<ncols; j++) {`

`*tr = x;`

`tr++;  // Next column`

`}`

`tc++;       // Next row`

`}`

`return *this;`

`}`

## DESIGN CRITIQUE

However, by using such tricks as vector pointers and reference counting, we can define certain operations to be much more efficient than the obvious ways would suggest. Consider, for instance, matrix transposition, which can be done by merely swapping some variables. Returning matrices from functions also becomes very efficient, making the code less clumsy. Never does an unnecessary copy of matrix elements take place.

There are some overriding questions: What is the cost of generality? Do the potential speed improvements offset the potential slowdowns? The answers, of course, depend on the particular application (and your compiler). However, in most non-trivial applications, the Vector and Matrix classes will probably be just as as efficient as simple, static built-in matrices.

An interesting test is to compare matrix multiply times for built-in statically sized matrices, SimpleMatrix objects, and Matrix objects. Table 7.1 shows the fastest times possible for all three types (using all the tricks described in this chapter). The results were obtained by multiplying two 30x30 double precision floating-point matrices together, 100 times, and taking the average. The tests were performed using 32-bit pointers on a 486 33-MHz machine.

As you might expect, statically sized matrices yielded the fastest results. Using a generalized matrix resulted in a 32 percent drop in speed, which isn't bad considering the flexibility that we gain. Interestingly enough, the general Matrix class performed slightly better than the SimpleMatrix class (probably due to some compiler optimization quirk). Their performance is close enough to suggest that the main performance penalty is due mostly to the use of dynamically sized matrices and not the fact that, in the case of the Matrix class, submatrices and efficient transposition are supported.

What is the space cost of generalized matrices? Figure 7.9 given earlier shows the memory layout of a Matrix object. Note that, in addition to the matrix elements themselves, a Matrix object has four data members plus a Vector object with four data members--one of which points to a VecRep object using two data members. Thus, an independent, unique matrix has an overhead of 10 data members (probably in the range of 20 to 40 bytes, depending on the implementation). This may be significant if you have a lot of small 3x3 or 4x4 matrices. However, for larger matrices, the overhead becomes less important, especially in light of the flexibility gained. And, of course, if the matrix data is shared, the overhead is reduced by two members plus the elements shared.

#### Table 7.1 Relative performance of different matrix types.

`Type                      Time     Performance Relative to Fastest`

`of Matrix              in Seconds       (Smaller Is Better)`

`Statically sized`

` built-in 2-d arrays      0.0470               1.0`

`SimpleMatrix objects      0.0660               1.40`

`Matrix objects            0.0621               1.32`

Go to Chapter 8 Return to Table of Contents