Vectors and matrices have a special relationship. It's common to take *1-d *slices of a matrix and call them vectors. For example, a row of a matrix is called a *row vector*. A column is known as a *column vector*. If the matrix is square (with the number of rows equal to the number of columns), you can think in terms of *diagonal vectors*, which span the diagonal of a matrix. Figure 7.1 shows a matrix with an example of each type of vector.

Since matrices are arrays, you might be tempted to implement them using the array-of-arrays technique described in Chapter 4. If you do, however, the result will be less than satisfactory--for several reasons. First, it's not possible to allow both dimensions to be determined at runtime. Second, even if you use the most compact statically-sized form, you will still have some unnecessary overhead. For example, each row of a *2-d* array implemented with *Sarray *objects has its own length variable, virtual function table pointer, and pointer to its array elements, as illustrated in Figure 7.2. Most of these variables are redundant. You could keep just one of each for the entire matrix.

A worse problem lies in figuring out how to represent vectors associated with a matrix. Row vectors are easy, since each row is already an array. But what about column vectors, or diagonal vectors? There is no easy way to treat the elements of a column or diagonal as an array in its own right, since each element is stored in a separate row array. To further explore this fundam ental problem, we'll step back and take a look at how two-dimensional arrays are organized using the built-in C++ array types.

Consider the following two-dimensional 3x5 matrix:

float m[3][5];

In C++, this matrix is stored, row by row, as a *1-d* array. Matrices stored in this way are called *row-major* matrices. Scanning down the *1-d* representation, the column index varies the fastest, while the row index varies the slowest. Some languages, such as Fortran, use *column-major* matrices, where the matrices are laid out column by column. Here, the row index varies fastest, while the column index varies the slowest. Figure 7.3 shows the difference between these two types of organization.

m[i][j] <==> *(m + i*ncols + j); // For row-major matrices

For a column-major matrix, the calculation is:

m[i][j] <==> *(m + i + j*nrows); // For column-major matrices

Consider the following two-dimensional 3x5 matrix:

float m[3][5];

In C++, this matrix is stored, row by row, as a *1-d* array. Matrices stored in this way are called *row-major* matrices. Scanning down the *1-d* representation, the column index varies the fastest, while the row index varies the slowest. Some languages, such as Fortran, use *column-major* matrices, where the matrices are laid out column by column. Here, the row index varies fastest, while the column index varies the slowest. Figure 7.3 shows the difference between these two types of organization.

m[i][j] <==> *(m + i*ncols + j); // For row-major matrices

For a column-major matrix, the calculation is:

m[i][j] <==> *(m + i + j*nrows); // For column-major matrices

It's possible to incorporate the ideas in the previous section into a new type of pointer, which we'll call *vector pointers*. These pointers are generalizations of ordinary pointers. An ordinary pointer always has a stride of one. That is, it's assumed the array elements are contiguous. Vectors may be non-contiguous (with evenly spaced elements, though), so a vector pointer can have a stride different than one. The following *VecPtr* class encapsulates these notions:

template<class TYPE>

class VecPtr {

private:

TYPE *p; // Ordinary pointer

const unsigned stride; // Pointer stride

public:

VecPtr(TYPE *t, unsigned s=1) : p(t), stride(s) { }

VecPtr(const VecPtr<TYPE> &s) : p(s.p), stride(s.stride) { }

void operator=(const VecPtr<TYPE> &s) { p = s.p; }

void operator=(TYPE *t) { p = t; }

void operator++() { p += stride; }

void operator++(int) { p += stride; }

void operator+=(unsigned n) { p += stride * n; }

TYPE &operator[](unsigned i) const { return p[stride*i]; }

operator TYPE *() const { return p; }

TYPE &operator *() const { return *p; }

#ifdef TYPE_IS_STRUCTURE

TYPE *operator->() const { return p; }

#endif

};

for(int i = 0; i<n; i++) { *v = 0; v++; }

for(int i = 0; i<n; i++) { *v = 0; v += stride; }

The pointer de-referencing operators *`**'* *and *'->' *are overloaded to make vector pointers look even more like ordinary pointers. For instance, the expression ^{*}*v* returns the vector element that is currently being referenced. Likewise, *v*->*m* returns the member *m* of the element being referenced. For this to work, *TYPE* must be a structure; otherwise, the class won't compile. We've commented out the operator->( ) function using ifdef statements in the class declaration.

for(int i = 0; i<n; i++) v[i] = 0;

p + i <==> ((char *)p) + i * sizeof(T)

Vector pointers can be optimized by pre-multiplying the vector stride by *sizeof(T)*. The stride then becomes a *byte stride*. In pointer arithmetic, the pointer can be cast as a *char* *, which saves the extra multiplication. For example, the following functions of the *VecPtr* class can be optimized this way:

template<class TYPE>

class VecPtr {

...

VecPtr (TYPE *t, unsigned s=1) : p(t), stride(s*sizeof(TYPE)) { }

void operator++() { (char *)p += stride; }

void operator++(int) { (char *)p += stride; }

void operator+=(unsigned n) { (char *)p += stride * n; }

TYPE &operator[](unsigned i) const {

return *((TYPE *)((char *)p + stride*i));

}

...

};

The following example shows how to use vector pointers with matrices. A matrix is created using built-in arrays and is then initialized to be a unit matrix (with all elements zero except on the diagonal, where ones are stored.)

float m[3][3]; // Declare a 3x3 row-major matrix

// Set all values to zero by pointing to matrix with

// a vector that can access all elements. A stride of

// 1 will do the trick.

VecPtr<float> va(m, 1);

for (int i = 0; i<9; i++) va[i] = 0;

// Set diagonal elements to 1 with another vector pointer

VecPtr<float> vd(m, 4); // Stride is ncols + 1

for (int j = 0; j<3; j++) vd[j] = 1;

The resulting matrix would be:

100

010

001

One unfortunate limitation with vector pointers is the inability to use expressions like **p*++. Why can't you use this? Consider what the expression translates into in terms of operation function calls:

*p++ <==> (v.operator++()).operator*()

First, the post-increment operator is called, and the pointer--as it existed before the increment--is returned. This pointer is then de-referenced to return the element. However, there is no way to return the original pointer *before* it has been incremented. Only a copy can be returned. On the surface, the following *operator++( ) *function seems to work:

VecPtr<TYPE> operator++(int)

{

VecPtr before (*this); // Make copy of object

p += stride; // Do increment

return before; // Return copy

}

However, what happen in the following (admittedly concocted) expression?

(v++)++; // Undefined result

When you use the built-in array types, it's not possible to directly define matrices that have both their dimensions determined at runtime. Now that you know about pointers and strides, we can show you why this is true. In Chapter 4, we mentioned that, although the number of columns must be fixed, there *is* a way to make the number of rows variable. Here is one way to do it:

typedef float ArrNx5[5]; // Declare array-of-5-columns type

void f(unsigned n)

{

// Dynamically allocate nx5 matrix, referenced by p

ArrNx5 *p = new ArrNx5[n];

// Set lower-right element to 99

p[n - 1][4] = 99;

}

The variable *p *represents a pointer to the first element in the first row of the respective matrix. Even though this variable points to a floating point element, its type isn't *float* *** or even *float* *** *as you first might think. Instead, *p* is *a pointer to floating point array of five columns.* This can easily be seen by *p*'*s* declaration.

A stride is implicit with a pointer to an array. In this case, *p* has the constant 5 as its stride. With this in mind, we have the following equivalence:

P[i] <==> *(p + i*5)

An ordinary pointer gets returned by subscript of a pointer to array. This pointer points to the first element of the *i*th row. The pointer can then be used in a subscript calculation to find the *j*th column of the row. Hence, double subscripts such as *p[i][j]* can be used.

Interestingly enough, the subscripting equation just shown is the same one used for vector pointer strides! The crucial difference is that, here, the stride is constant. Because of this, the compiler is able to easily plug in the constant stride in subscript and pointer increment calculations.

Contrast this with variable strides (such as in a dynamic matrix with the number of columns determined at runtime). Here, the compiler could not simply plug in values for the stride, because it doesn't know what the stride should be! The compiler would be forced to create a special data structure that carries the stride around with the pointer. This approach ought to sound familiar to you, since that's exactly what vector pointers do!

The conclusion is this: if you want to have dynamic matrices with both dimensions determined at runtime, you need to have vector pointers. Since C++ doesn't have vector pointers built in, it isn't able to handle dynamic matrices directly. However, nothing prevents you from creating your own dynamic matrices, as you'll now see.

From the preceding discussions, two basic techniques emerge for defining dynamically sized matrices:

1. Represent the matrix as a one-dimensional array, using row-major or column-major order (row-major ordering is easiest). Allocate this array dynamically.

2. Use vector pointers to conveniently access rows, columns, and diagonals.

The following *SimpleMatrix* class uses these techniques:

template<class TYPE>

class SimpleMatrix {

private:

TYPE *data;

unsigned nrows, ncols;

public:

SimpleMatrix(unsigned nr, unsigned nc);

SimpleMatrix(const SimpleMatrix<TYPE> &m);

~SimpleMatrix();

void Copy(const SimpleMatrix<TYPE> &m);

SimpleMatrix<TYPE> &operator=(const SimpleMatrix<TYPE> &m);

TYPE *operator[](unsigned r);

VecPtr<TYPE> RowPtr(unsigned r=0);

VecPtr<TYPE> ColPtr(unsigned c=0);

VecPtr<TYPE> DiagPtr();

VecPtr<TYPE> PtrToAll();

const TYPE *operator[](unsigned r) const;

VecPtr<const TYPE> RowPtr(unsigned r=0) const;

VecPtr<const TYPE> ColPtr(unsigned c=0) const;

VecPtr<const TYPE> DiagPtr() const;

VecPtr<const TYPE> PtrToAll() const;

unsigned NRows() const;

unsigned NCols() const;

unsigned NElems() const;

};

template<class TYPE>

SimpleMatrix<TYPE>::SimpleMatrix(unsigned nr, unsigned nc)

// General constructor

{

nrows = nr; ncols = nc;

data = new TYPE[nrows*ncols];

}

The size of the matrix is stored in the variables *nrows* and *ncols*, which are also used in calculating vector strides. The functions *RowPtr( )*, *ColPtr( )*, and *DiagPtr( )* all return vector pointers of their respective types:

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::RowPtr(unsigned r)

// Return a row vector pointer to row r

{

return VecPtr<TYPE>(data + r*ncols, 1);

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c)

// Return a column vector pointer to column c

{

return VecPtr<TYPE>(data+c, nrows);

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::DiagPtr()

// Return a diagonal vector pointer. In case matrix

// isn't square, the smallest dimension is used to

// determine the diagonal stride.

{

unsigned dstride = ((nrows > ncols) ? ncols : nrows) + 1;

return VecPtr<TYPE>(data, dstride);

}

Also, the *PtrToAll( )* function returns a vector pointer, with a stride of one, which allows you to access the matrix as though it were a one-dimensional array. This is useful for operations that apply to all elements, such as zeroing the matrix, scaling the matrix, and so on.

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::PtrToAll()

// Return pointer to all elements of the matrix

{

return VecPtr<TYPE>(data, 1);

}

Here is an example that uses these functions:

SimpleMatrix<int> m(3, 3);

int i;

// Set all elements to '4'

VecPtr<int> va = m.PtrToAll();

for(i = 0; i<m.NElems(); i++) va[i] = 4;

// Set row 1 elements to '2'

VecPtr<int> vr = m.RowPtr(1);

for(i = 0; i<m.NCols(); i++) vr[i] = 2;

// Set col 1 elements to '3'

VecPtr<int> vc = m.ColPtr(1);

for(i = 0; i<m.NRows(); i++) vc[i] = 3;

// Set diagonal elements to '1'

VecPtr<int> vd = m.DiagPtr();

for(i = 0; i<m.NRows(); i++) vd[i] = 1;

The resulting matrix would be:

134

212

431

The *SimpleMatrix* class overloads the subscript operator *[ ] *to return an ordinary pointer to the first element in row *r*.

template<class TYPE>

TYPE *SimpleMatrix<TYPE>::operator[](unsigned r)

{

return data + r*ncols;

}

By using a subscript on the returned pointer as well, two-dimensional subscripting can be achieved using a double subscript, as illustrated in Figure 7.5, and in the following code:

SimpleMatrix<float> m(3, 4); // Declare 3x4 matrix

m[1][2] = 99;

You might wonder why the *operator[ ]( )* function doesn't return a row vector pointer. For instance, we could have written:

// Alternative definition for []

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::operator[](unsigned r)

{

return VecPtr(data + r*ncols, 1);

}

This function is identical to *RowPtr( )* (except for its name). Since row vectors themselves have subscripting defined, we can use the same double subscripting as we used earlier:

m[2][4] = 99; // Set lower-right element to 99

// First call SimpleMatrix::operator[], then VecPtr::operator[]

(m.operator[](2)).operator[](4) = 99;

*((data + 2*ncols) + 1*4) = 99;

*((data + 2*ncols) + 4) = 99; // Optimized version

Matrix multiplication is perhaps the most common operation done on matrices, and can be defined as follows: given two matrices *a* and *b*, one of size *mxn* and the other of size *nxp*, the result of multiplying the two matrices is a new matrix, *c*, of size *mxp*, where the elements of *c* are computed by the equation:

In other words, each row of *a* is paired with each column of *b*, and the corresponding elements of these vectors are multiplied and then summed. The sum becomes one element of the result matrix. Figure 7.6 illustrates this process.

The following function implements matrix multiplication:

void MatMult(SimpleMatrix<Number> &c,

const SimpleMatrix<Number> &a, const SimpleMatrix<Number> &b)

{

unsigned i, j, k;

for (i = 0; i < a.NRows(); i++) {

for (j = 0; j < b.NCols(); j++) {

Number sum = 0;

for (k = 0; k < b.NRows(); k++) sum += a[i][k] * b[k][j];

c[i][j] = sum;

}

}

}

We can also overload the *'**' operator for matrix multiplication, as follows:

template<class TYPE>

SimpleMatrix<TYPE>

operator*(const SimpleMatrix<TYPE> &a, const SimpleMatrix<TYPE> &b)

{

SimpleMatrix<TYPE> r(a.NRows(), b.NCols());

MatMult(r, a, b);

return r; // Copy constructor called

}

This example multiplies two matrices:

SimpleMatrix<float> a(3, 2), b(2, 4), c(3, 4);

...

c = a * b;

Unfortunately, the operator *"**" function leads to a lot of copying. First, the result matrix is created, the matrix multiply is performed by *MatMult( )*, and a copy of the result is returned. (We can't safely return a reference to the result.) Then, the assignment to *c* takes place, where yet another copy takes place. The *Matrix* class given later in this chapter avoids this problem.

template<class TYPE>

void MatMult1(SimpleMatrix<TYPE> &c,

const SimpleMatrix<TYPE> &a, const SimpleMatrix<TYPE> &b)

{

unsigned i, j, k;

VecPtr<const TYPE> ar = a.RowPtr();

VecPtr<const TYPE> ac = a.ColPtr();

VecPtr<const TYPE>br = b.RowPtr();

VecPtr<const TYPE>bc = b.ColPtr();

VecPtr<TYPE> cr = c.RowPtr();

VecPtr<TYPE> cc = c.ColPtr();

TYPE const *bstart = br;

for (i = 0; i < a.NRows(); i++) {

cr = cc; // Point to row i. Note: stride not copied

br = bstart;

for (j = 0; j < b.NCols(); j++) {

TYPE sum = 0;

ar = ac; // Point row i. Note: stride not copied.

bc = br; // Point column j. Note: stride not copied.

for (k = 0; k < b.NRows(); k++) {

sum += *ar * *bc;

ar++; // Next column

bc++; // Next row

}

br++; // Next column

*cr = sum;

cr++; // Next column

}

ac++; // Next row

cc++; // Next row

}

}

This function uses six vector pointers, two for each matrix. By incrementing a column vector pointer, we can advance to the next row of a matrix. Incrementing a row vector pointer advances us to the next column of the row. To understand *MatMult1( )*, it's important to realize that when vector pointer assignments take place only the underlying ordinary pointers are copied--*not* the strides. (You can verify this by looking at the definition for *VecPtr* given earlier.) Thus, we can safely assign a column vector pointer to a row vector pointer without confusing the strides.

Original Statement New Statement

VecPtr<const TYPE> ar = a.RowPtr(); const TYPE *ar = a.RowPtr();

VecPtr<const TYPE> br = b.RowPtr(); const TYPE *br = b.RowPtr();

VecPtr<TYPE> cr = c.RowPtr(); TYPE *cr = c.RowPtr();

True to the design principles explained in Chapters 2 and 3, we've made the *VecPtr* and *SimpleMatrix* classes work for constant objects. You'll notice that we supply two versions of the*[ ]*operator for *SimpleMatrix*:

template<class TYPE>

TYPE *SimpleMatrix<TYPE>::operator [](unsigned r)

// Called for non-const matrices

{

return data + r*ncols;

template<class TYPE>

const TYPE *SimpleMatrix<TYPE>::operator[](unsigned i) const

// Called for const matrices

{

return data + r*ncols;

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c)

// Called for non-const matrices

{

return VecPtr<TYPE>(data+c, nrows);

}

template<class TYPE>

VecPtr<const TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned c) const

// Called for const matrices

{

return VecPtr<const TYPE>(data+c, nrows);

}

Another common matrix operation is to take the *transpose* of a matrix. The B transposition is accomplished by interchanging the rows and the columns. Figure 7.7 shows an example.

The following *Transpose( )* function shows the obvious implementation of a matrix transposition.

template<class TYPE>

SimpleMatrix<TYPE> Transpose(const SimpleMatrix<TYPE> &m)

{

SimpleMatrix<TYPE> t(m.NCols(), m.NRows());

for (unsigned i = 0; i<m.NRows(); i++)

for (unsigned j = 0; j<m.NCols(); j++) t[j][i] = m[i][j];

return t; // Copy constructor called.

}

template<class TYPE>

void SimpleMatrix<TYPE>

Transpose(SimpleMatrix<TYPE> &t, const SimpleMatrix<TYPE> &m)

{

for (unsigned i = 0; i<m.NRows(); i++)

for (unsigned j = 0; j<m.NCols(); j++) t[j][i] = m[i][j];

}

1. Generalize the vector operations--such as *RowPtr( ), ColPtr( ),* and *operator[ ]( )*--to use row and column strides, rather than the number of rows and columns directly. Maintain the row and column strides in separate variables.

2. Interchange the number of rows and columns, and interchange the row and column strides.

The basic idea here is to change what used to be row vectors into column vectors, and vice versa. In a row-major matrix, the row stride is one and the column stride is the number of columns in the matrix. If the matrix is transposed in place, the row stride becomes equal to the number of rows in the new matrix (which is the same as the number of columns in the original matrix), and the column stride becomes one. Assuming the variables *rowstride* and *colstride *were added to the *SimpleMatrix* class, here is a new in-place *Transpose( ) *function, along with modified *RowPtr( )*, *ColPtr( )*, and *operator[ ]( )* functions:

template<class TYPE>

void SimpleMatrix<TYPE>::Transpose()

// Transpose a matrix in place

{

// Interchange nrows and ncols

unsigned temp = nrows;

nrows = ncols;

ncols = temp;

// Interchange row stride and column stride

temp = rowstride;

rowstride = colstride;

colstride = temp;

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::RowPtr(unsigned r)

{

return VecPtr<TYPE>(data + r*colstride, rowstride);

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::ColPtr(unsigned r)

{

return VecPtr<TYPE>(data + c*rowstride, colstride);

}

template<class TYPE>

VecPtr<TYPE> SimpleMatrix<TYPE>::operator[](unsigned r)

{

return VecPtr<TYPE>(data + r*colstride, rowstride);

}

The *SimpleMatrix* class introduced you to techniques for defining dynamic matrices. However, some aspects of *SimpleMatrix* are not entirely safe, and can lead to a lot of redundant copying.

A more insidious problem is that, by their very nature, vector pointers induce aliasing. In the *SmartPtr* and *String* classes, aliasing was made safe by using reference counting. The same can be done for vectors and matrices. By allowing safe sharing, we can also eliminate a lot of the redundant copying that takes place.

The *Vector* class is similar in many ways to the *String* class described in Chapter 6. (You should review that chapter now if you haven't already read it.) The *Vector* class uses the same letter-envelope reference counting scheme, having an underlying vector representation class *VecRep*:

template<class TYPE> class Vector; // Forward declaration

template<class TYPE>

class VecRep {

private:

friend class Vector<TYPE>;

unsigned alloclen;

unsigned refcnt;

TYPE data[1];

VecRep(unsigned d);

void *operator new(size_t n, unsigned d);

void operator delete(void *p);

public: // So we can set up null_rep

static VecRep<TYPE> null_rep;

VecRep();

~VecRep();

};

The complete code for the *VecRep* class is given on disk in the files *vector.h* and *vector.mth.*

template<class TYPE>

VecRep<TYPE>::VecRep(unsigned d)

// Constructor to initialize a shared vector rep

{

if (this != &null_rep) {

// We only want to do the following if we're

// not referencing null_rep

alloclen = d;

refcnt = 1;

// We must call the default constructor for

// all but the first element

TYPE *q = data + 1;

for(unsigned i = 1; i<d; i++, q++) new(q) TYPE;

}

}

Note that the first element of the vector already has its constructor called implicitly by the *VecRep* constructor, because *VecRep* is defined with one element allocated. The *VecRep* destructor works exactly in reverse of the constructor:

template<class TYPE>

void VecRep<TYPE>::~VecRep()

// Destructor to explicitly destroy all elements

// of the vector, except the first

{

TYPE *q = data + 1;

for(unsigned i = 1; i<alloclen; i++, q++) q->TYPE::~TYPE();

}

#define INITNULLVEC(TYPE) VecRep<TYPE> VecRep<TYPE>::null_rep;

With the *VecRep* class in hand, the *Vector* class can be defined:

template<class TYPE> class Matrix; // Forward declaration

template<class TYPE>

class Vector {

protected:

friend class Matrix<TYPE>;

VecRep<TYPE> *rep; // Pointer to shared vector data

TYPE *start; // Pointer to logical start of data

unsigned len; // Number of logical elements

unsigned stride; // Stride (offset to next logical element)

int Alloc(unsigned n);

void Bind(const Vector<TYPE> &v);

void Unbind();

void NewBinding(const Vector<TYPE> &v);

public:

Vector(unsigned n = 0, const TYPE *s = 0);

Vector(const Vector<TYPE> &v);

Vector(const Vector<TYPE> &v, SliceType styp,

unsigned n=0, unsigned str=1, unsigned ofs=0);

~Vector();

#ifndef NO_RANGE_CHECK

unsigned CheckIndx(unsigned i) const;

#endif

TYPE &operator[](unsigned i);

const TYPE &operator[](unsigned i) const;

void CopyN(const TYPE *src, unsigned n);

void Copy(const Vector<TYPE> &v);

void Share(const Vector<TYPE> &v);

void SetElements(const TYPE &x);

Vector<TYPE> &operator=(const Vector<TYPE> &v);

Vector<TYPE> &operator=(const TYPE &x);

int IsNull() const;

int IsUnique() const;

Vector<TYPE> Clone() const;

int EnsureUnique();

unsigned Length() const;

unsigned Stride() const;

// Low-level hooks

VecPtr<TYPE> All();

VecPtr<const TYPE> All() const;

};

The *Vector* class has four data members: *rep*, a pointer to the vector representation object; *start,* a pointer to the logical start of the vector elements; *len,* the logical length of the vector; and *stride,* the vector's logical stride. The *start* member is analogous to the *text* member of the *String* class. Normally, *start* points to *rep->data,* but in case the vector is a subvector, it may point at some offset from *rep->data.*

The *Vector* class has functions virtually identical to those in the *String* class to support lazy copying--such as *Bind( ), Unbind( ), NewBinding( ), Share( ), EnsureUnique( ), Clone( ),* and *IsUnique( )*. The main difference is that lazy copying isn't automatically performed in the *Vector* class. With the *String* class, we want to ensure that, when modifications are made to a string, the string is unique in order not to affect any other strings sharing the same text. With vectors, the design is just the opposite. Because vectors are going to be used to access shared slices of matrices, we *want* vector modifications to affect the underlying shared matrix data. Functions like *EnsureUnique( )* are not used automatically, but are provided in case you do want to ensure a unique copy before modifying data.

The *Vector* class has three constructors. The first allocates a new vector representation of a specified length, and can optionally copy data from a low-level array into it:

template<class TYPE>

Vector<TYPE>::Vector(unsigned n, const TYPE *s)

// Constructor to allocate space for an n element vector,

// and, if s != 0, to copy low-level array into it.

// If n == 0 or allocation fails, we bind to null rep,

// and do no copying.

{

if (Alloc(n) && s) CopyN(s, n);

}

This constructor is also used as the default, with *n* and *s* defaulting to 0. In this mode, the vector created will reference the null vector representation *null_rep,* as shown in the following *Alloc( )* function called by the constructor:

template<class TYPE>

int Vector<TYPE>::Alloc(unsigned n)

// Allocates a block of n elements for this vector.

// If n == 0 or allocation fails, we bind to the null rep.

// ASSUMES TYPE has a default constructor.

// Returns 0 if we've bound to null_rep otherwise, returns 1.

{

if (n) {

rep = new(n) VecRep<TYPE>(n);

}

else {

rep = &VecRep<TYPE>::null_rep;

rep->refcnt++;

}

start = rep->data;

if (rep != &VecRep<TYPE>::null_rep) {

len = n;

stride = 1;

return 1;

}

else {

len = 0;

stride = 1;

return 0;

}

}

Vector<float> v(10); // Create a new vector of 10 elements

Vector<int> w; // Create a null vector

Like the *String* copy constructor, the *Vector* copy constructor actually uses share semantics:

template<class TYPE>

Vector<TYPE>::Vector(const Vector<TYPE> &v)

// Copy constructor that shares all of the vector v

{

Bind(v);

len = v.len;

stride = v.stride;

start = v.start;

}

The assignment operator also uses share semantics:

template<class TYPE>

Vector<TYPE> &Vector<TYPE>::operator=(const Vector<TYPE> &v)

// Share semantics used for assignment

{

if (this != &v) Share(v); // Note trap for assignment to self

return *this;

}

The *Share( )* function is very similar to the copy constructor, except it calls *NewBinding( )* instead of *Bind( )* because the vector is presumably already bound to a vector representation:

template<class TYPE>

void Vector<TYPE>::Share(const Vector<TYPE> &v)

// Used to shares data with vector v

{

NewBinding(v);

len = v.len;

stride = v.stride;

start = v.start;

}

template<class TYPE>

class Matrix {

protected:

Vector<TYPE> data;

unsigned nrows, ncols, rowstride, colstride;

public:

Matrix(unsigned nr=0, unsigned nc=0, const TYPE *s = 0);

Matrix(const Matrix<TYPE> &m);

Matrix(const Matrix<TYPE> &m, SliceType styp,

unsigned sr=0, unsigned sc=0,

unsigned nr=0, unsigned nc=0);

void Copy(const Matrix<TYPE> &m);

void Share(const Matrix<TYPE> &m);

Matrix<TYPE> Transpose();

Matrix<TYPE> &operator=(const Matrix<TYPE> &m);

Matrix<TYPE> &operator=(const TYPE &x);

#ifndef NO_RANGE_CHECK

unsigned CheckRow(unsigned i) const;

unsigned CheckCol(unsigned i) const;

#endif

Vector<TYPE> operator[](unsigned r);

const Vector<TYPE> operator[](unsigned r) const;

TYPE &operator()(unsigned r, unsigned c);

const TYPE &operator()(unsigned r, unsigned c) const;

Vector<TYPE> Row(unsigned r, SliceType styp=SHARED);

Vector<TYPE> Col(unsigned c, SliceType styp=SHARED);

Vector<TYPE> Diag(SliceType styp=SHARED);

Vector<TYPE> All(SliceType styp=SHARED);

const Vector<TYPE> Row(unsigned r, SliceType styp=SHARED) const;

const Vector<TYPE> Col(unsigned c, SliceType styp=SHARED) const;

const Vector<TYPE> Diag(SliceType styp=SHARED) const;

const Vector<TYPE> All(SliceType styp=SHARED) const;

// Mid-level hooks. Use these at your own risk:

VecPtr<TYPE> RowPtr(unsigned r=0);

VecPtr<TYPE> ColPtr(unsigned c=0);

VecPtr<TYPE> DiagPtr();

VecPtr<TYPE> PtrToAll();

VecPtr<const TYPE> RowPtr(unsigned r=0) const;

VecPtr<const TYPE> ColPtr(unsigned c=0) const;

VecPtr<const TYPE> DiagPtr() const;

VecPtr<const TYPE> PtrToAll() const;

int IsNull() const;

int IsUnique() const;

Matrix<TYPE> Clone() const;

int EnsureUnique();

unsigned NCols() const;

unsigned NRows() const;

unsigned RowStride() const;

unsigned ColStride() const;

int IsRowMajor() const;

int IsColMajor() const;

int IsSquare() const;

};

The *Matrix* class has five data members. The elements of a matrix are stored in the *1-d Vector* object *data*, which of course means the matrix data can be shared. This data is interpreted as a *2-d* array, with *nrows* number of rows and *ncols* number of columns. The two other members, *rowstride* and *colstride*, keep track of the current row and column stride of the matrix. These members are useful when the matrix is transposed and also for submatrices, as you'll see. Figure 7.9 shows the layout of a typical matrix.

Constructing and Assigning Matrices

Extracting Vectors from a Matrix

Subscripting with the Matrix Class

The first *Matrix* constructor allocates and constructs a brand new matrix by calling the *Vector* constructor for *data* and then setting up the matrix size and strides:

template<class TYPE>

Matrix<TYPE>::Matrix(unsigned nr, unsigned nc, const TYPE *s)

: data(nr * nc, s)

{

if (data.IsNull()) {

nrows = 0; ncols = 0; colstride = 1; rowstride = 1;

}

else {

nrows = nr; ncols = nc; colstride = nc; rowstride = 1;

}

}

The copy constructor and overloaded assignment operators use share semantics:

template<class TYPE>

Matrix<TYPE>::Matrix(const Matrix<TYPE> &m)

: data(m.data)

{

if (data.IsNull()) {

nrows = 0; ncols = 0; colstride = 1; rowstride = 1;

}

else {

nrows = m.nrows; ncols = m.ncols;

colstride = m.colstride; rowstride = m.rowstride;

}

}

template<class TYPE>

Matrix<TYPE> &Matrix<TYPE>::operator=(const Matrix<TYPE> &m)

// Share semantics used for assignment.

{

if (this != &m) Share(m); // Note trap for assignment to self

return *this;

}

template<class TYPE>

void Matrix<TYPE>::Share(const Matrix<TYPE> &m)

{

data.Share(m.data);

nrows = m.nrows; ncols = m.ncols;

colstride = m.colstride; rowstride = m.rowstride;

}

Here's an example of these functions:

float one_d_arr[4] = {0, 1, 2, 3};

Matrix<float> m(3, 5); // Elements left uninitialized

Matrix<float> w(2, 2, one_d_arr); // Elements set to {{0, 1},{2, 3}}

Matrix<float> z(w); // Copy constructor called

m = z; // m now reassigned to z's data

The *Matrix* class can support submatrices, which *SimpleMatrix* couldn't do. These submatrices can either overlay an existing matrix (with the data being shared) or be separate copies. Figure 7.10 shows an example of overlaying a 3x5 submatrix onto a 6x7 matrix.

template<class TYPE>

Matrix<TYPE>::Matrix(const Matrix<TYPE> &m, SliceType styp,

unsigned sr, unsigned sc,

unsigned nr, unsigned nc)

// Constructor that constructs a submatrix of matrix m

// If styp==SHARED, it means to share the data with m

// NOTE: If sharing, we initially create a null vector, and

// then we immediately rebind to shared vector. If copying,

// a row-major submatrix is created, and if nr == 0,

// then m.nrows is used. If nc == 0, then m.ncols is used.

: data((styp == SHARED) ?

0 : ((nr ? nr : m.nrows) * (nc ? nc : m.ncols)))

{

unsigned start_ofs, n;

if (nr) nrows = nr; else { nrows = m.nrows; sr = 0; }

if (nc) ncols = nc; else { ncols = m.ncols; sc = 0; }

if (styp == SHARED) { // Sharing

// When sharing, strides are always the same as parent strides

colstride = m.colstride;

rowstride = m.rowstride;

// Calculate length of underlying vector and offset.

// Remember: either colstride or rowstride == 1.

if (rowstride == 1) {

n = colstride * (nrows-1) + ncols;

}

else {

n = rowstride * (ncols-1) + nrows;

}

start_ofs = sr * colstride + rowstride * sc;

data.Share(Vector<TYPE>(m.data, SHARED, n, 1, start_ofs));

}

else {

if (data.IsNull()) {

nrows = 0; ncols = 0; colstride = 1; rowstride = 1;

}

else {

// Set up as a row-major matrix

colstride = ncols; rowstride = 1;

// Copy data from shared submatrix into allocated space.

// Note that the constructor is called recursively here

// to create a shared submatrix used for the copy.

Copy(Matrix<TYPE>(m, SHARED, sr, sc, nrows, ncols));

}

}

}

Here's an example of a 3x5 submatrix that shares its data with a 6x7 matrix, starting at row and column (2, 2), as illustrated earlier in Figure 7.10 :

Matrix<float> m(6, 7);

...

Matrix<float> subm(m, SHARED, 2, 2, 3, 5);

The last four parameters of the submatrix constructor, which define the origin and size of the submatrix, default to zero. In this case, the submatrix encompasses all of the original matrix. This approach can be used to create an actual copy of the matrix--which the copy constructor can't do--since it uses share semantics exclusively. For example:

Matrix<float> copy_of_m(m, COPIED);

The two variables *rowstride* and *colstride* provide the key to allowing a submatrix to share data with its parent. For normal row-major matrices, the column stride is always equal to the number of columns in the matrix. However, that isn't necessarily true for submatrices. Consider, for instance, the 3x5 submatrix shown earlier in Figure 7.10. You can see that its column stride should be the number of columns of the parent 6x7 matrix, and thus should be 7 instead of 5.

If one of the matrices is transposed or yet another submatrix is taken of a submatrix, the matter is further complicated. However, by judicious use of *rowstride *and *colstride*, the strides are never confused. Here's the basic rule: the row and column strides of a submatrix are always the same as their parents' strides and must always reflect the physical layout of the data. The real data lies somewhere up the chain of parents and is organized in row-major order. Of course, if one parent is transposed, the row and column strides will be interchanged.

template<class TYPE>

Matrix<TYPE> Matrix<TYPE>::Transpose()

// Returns a transposition of this matrix. Does not have to move

// the elements of the matrix around, just merely changes

// interpretation between row-major and column-major ordering.

{

Matrix<TYPE> t(*this); // Remember, copy constructor shares

// Interchange number of rows and cols

unsigned temp = t.nrows;

t.nrows = t.ncols;

t.ncols = temp;

// Interchange row stride and column stride

temp = t.colstride;

t.colstride = t.rowstride;

t.rowstride = temp;

return t;

}

Unlike the similar optimized *Transpose( ) *we showed for the *SimpleMatrix *class, this function does not need to assume that the matrix is to be transposed in place. Thus, the original matrix is untouched. Here's an example of this function:

Matrix<float> m(3, 5); // Create a 3x5 matrix

...

Matrix<float> t; // Create a null matrix

t = m.Transpose();

As is true with the *SimpleMatrix* class, the functions *RowPtr( )*, *ColPtr( )*, *DiagPtr( )*, and *PtrToAll( )* all are defined for the *Matrix* class. We'll refer to these as *vector extraction functions*. These functions use the modifications we suggested earlier--to handle the potential transposition of the matrix--by using *rowstride* and *colstride* rather than *nrows* and *ncols* directly. By using *rowstride* and *colstride*, the extraction functions also work for submatrices.

template<class TYPE>

VecPtr<TYPE> Matrix<TYPE>::RowPtr(unsigned r)

// Returns vector pointer to row r

{

return VecPtr<TYPE>(data.start+CHECKROW(r)*colstride, rowstride);

}

template<class TYPE>

VecPtr<TYPE> Matrix<TYPE>::ColPtr(unsigned c)

// Returns vector pointer to column c

{

return VecPtr<TYPE>(data.start+CHECKCOL(c)*rowstride, colstride);

}

In these functions, range checking is handled by calls to the macros *CHECKROW( )* and *CHECKCOL( )*. These macros are the two-dimensional counterparts to the *CHECK( )* macro used in the *Array* classes (described in Chapter 4), and are defined in the *range.h* header file. The macros do nothing if *NO_RANGE_CHECK* is undefined.

The range checking is done when the vector pointers are created. But remember that, when you use the vector pointers, nothing prevents you from going out of bounds. For this reason, another set of vector extraction functions are defined. These functions, which return fully range-checked *Vector *objects, are *Row( ), Col( ), Diag( )*, and *All( )*. Here are examples of *Row( )*and *Col( )*:

template<class TYPE>

Vector<TYPE> Matrix<TYPE>::Row(unsigned r, SliceType styp)

// Return a row slice of a matrix

{

return Vector<TYPE>(data, styp,

ncols, rowstride, CHECKROW(r)*colstride);

}

template<class TYPE>

Vector<TYPE> Matrix<TYPE>::Col(unsigned c, SliceType styp)

// Return a column slice of a matrix

{

return Vector<TYPE>(data, styp,

nrows, colstride, CHECKCOL(c)*rowstride);

}

To return a row vector, the subscript operator is overloaded for the *Matrix* class. As such, *operator[ ]( )* is the same as *Row( )*, except the returned vector always shares its data with the matrix:

template<class TYPE>

Vector<TYPE> Matrix<TYPE>::operator[](unsigned r)

// Row subscripting operator for non-const matrices.

// This routine does the same thing as Row(r, SHARED).

{

return Vector<TYPE>(data, SHARED,

ncols, rowstride, CHECKROW(r)*colstride);

}

Matrix<float> m(3, 5);

m[2][4] = 99; // Set lower-right element to 99

Thus, you can use the same double-subscripting syntax that's used for built-in arrays. There's only one problem. The *SimpleMatrix* class returned an ordinary pointer from the first subscript. Here, though, a full vector object is returned, and this object must be constructed. The construction process, while fairly efficient due to share semantics, still takes time. In fact, a matrix multiply routine that uses *Matrix* objects with double-subscripting is more than *six times* *slower* than its *SimpleMatrix* counterpart! And you'll only get this performance level when range checking is turned off. If you turn on range checking, the performance could be more than *ten times* slower.

template<class TYPE>

TYPE &Matrix<TYPE>::operator()(unsigned r, unsigned c)

{

return data.start[CHECKROW(r)*colstride + CHECKCOL(c)*rowstride];

}

...

Matrix<float> m(3, 5); // Construct matrix

...

m(2, 4) = 99; // Set lower-right element to 99

m(r, c) = *(m + r*ncols + c)

m(r, c) = *(m + r*colstride + c) // If row major

m(r, c) = *(m + r + c*rowstride) // If column major

A matrix is said to be *contiguous* if you can access all of the elements using a single *1-d *vector. Normal matrices created by the *Matrix* class will always be contiguous, regardless of whether they are transposed. However, its possible for submatrices that are shared with a parent to be non-contiguous. Figure 7.11 shows some examples of contiguous and non-contiguous submatrices, assuming a row-major organization.

From Figure 7.11, the following rule emerges for determining whether a submatrix is contiguous: either the number of columns must equal the column stride or the number of rows must be one. For column-major matrices (not shown in Figure 7.11), the number of rows must equal the row stride or the number of columns must be one. The function *All( )*, which attempts to return a *1-d *vector to all the elements in a matrix, makes these tests for contiguousness. If the matrix isn't contiguous, *All( )* returns a null vector because it isn't possible to access all the elements with a single vector:

template<class TYPE>

Vector<TYPE> Matrix<TYPE>::All(SliceType styp)

// Returns a 1-d vector with all of the data for the matrix.

// If it isn't possible for the 1-d data to be contiguous

// (as will be the case for some submatrices), then a

// null vector is returned.

// ASSUMES either colstride or rowstride == 1.

{

if ((rowstride == 1 && nrows > 1 && ncols != colstride) ||

(colstride == 1 && ncols > 1 && nrows != rowstride))

return Vector<TYPE>(); // Return a null vector

return Vector<TYPE>(data, styp, nrows * ncols, 1, 0);

}

template<class TYPE>

Matrix<TYPE> &Matrix<TYPE>::operator=(const TYPE &x)

// Set all elements in the matrix to the value x.

// Assumes a worst case non-contiguous submatrix.

{

// tc = column vector pointer to rows of t

// tr = row vector pointer to columns of t

VecPtr<TYPE> tc(data.start, colstride);

VecPtr<TYPE> tr(data.start, rowstride);

for (unsigned i = 0; i<nrows; i++) {

tr = tc; // Remember, strides aren't assigned!

for (unsigned j = 0; j<ncols; j++) {

*tr = x;

tr++; // Next column

}

tc++; // Next row

}

return *this;

}

In this chapter, you've seen in detail how to define generalized dynamic vectors and matrices. In our design, we were careful to remove all potential inefficiencies, without sacrificing generality. However, any time you make something general, performance is bound to suffer. This is manifested in basically one form: vector pointers must be used rather than ordinary pointers. Because of their generality, vector pointers with strides of one may be less efficient than ordinary pointers (depending on how well your compiler optimizes), even though the two are logically equivalent.

An interesting test is to compare matrix multiply times for built-in statically sized matrices, *SimpleMatrix* objects, and *Matrix* objects. Table 7.1 shows the fastest times possible for all three types (using all the tricks described in this chapter). The results were obtained by multiplying two 30x30 double precision floating-point matrices together, 100 times, and taking the average. The tests were performed using 32-bit pointers on a 486 33-MHz machine.

As you might expect, statically sized matrices yielded the fastest results. Using a generalized matrix resulted in a 32 percent drop in speed, which isn't bad considering the flexibility that we gain. Interestingly enough, the general *Matrix* class performed slightly better than the *SimpleMatrix* class (probably due to some compiler optimization quirk). Their performance is close enough to suggest that the main performance penalty is due mostly to the use of dynamically sized matrices and not the fact that, in the case of the *Matrix* class, submatrices and efficient transposition are supported.

What is the space cost of generalized matrices? Figure 7.9 given earlier shows the memory layout of a *Matrix* object. Note that, in addition to the matrix elements themselves, a *Matrix* object has four data members plus a *Vector* object with four data members--one of which points to a *VecRep* object using two data members. Thus, an independent, unique matrix has an overhead of 10 data members (probably in the range of 20 to 40 bytes, depending on the implementation). This may be significant if you have a lot of small 3x3 or 4x4 matrices. However, for larger matrices, the overhead becomes less important, especially in light of the flexibility gained. And, of course, if the matrix data is shared, the overhead is reduced by two members plus the elements shared.

Type Time Performance Relative to Fastest

of Matrix in Seconds (Smaller Is Better)

Statically sized

built-in 2-d arrays 0.0470 1.0

SimpleMatrix objects 0.0660 1.40

Matrix objects 0.0621 1.32