BLOCK TRUNCATION COMPRESSION

An efficient algorithm for image compression

Anton Kruger

Anton works for Truda Software, a Fortran and C consulting firm. He can be reached through the DDJ office.


Data compression algorithms can be divided into two groups. The first, error-free, or lossless compression algorithms, such as Huffman and LZW, incurs no loss of information during the compress/decompress cycle. This is an important consideration in many applications. For example, it is normally disastrous to have even 1 bit wrong in an executable file. Error-free compression is also imperative with the vast majority of text files. You don't want a compression program changing the source code of a computer program or introducing spelling mistakes into a word-processing document.

On the other hand, in some applications involving images, the error-free requirement can be dropped. When dealing with photographs of people's faces, natural scenes, and so on, some image degradation is often tolerable. This second group of compression algorithms is known as "lossy," and is well-suited for such applications. Examples of this group are the Fourier and cosine transform compression algorithms, as well as Block Truncation Compression (BTC), the focus of this article.

Block Truncation Compression

The idea behind BTC is to preserve local image statistics, typically over 4x4 blocks of the image, though larger blocks may also be used. One advantage of the BTC algorithm is that it does not require global image statistics (such as the image histogram). The implementation described in this article operates on 4x4 blocks, so that only four lines of the image are in memory at any given time. The algorithm is fairly simple and quite fast. For example, it takes about 7.25 seconds to compress a 256x256 8-bit (256-level) monochrome image, and about 2 seconds to decompress the resulting file on a 20-MHz AT clone. The most important property of Block Truncation Compression, however, is the compression ratio. The algorithm described in this article compresses monochrome image files down to 2 bits/pixel, so that the compressed images require 25 percent of their former disk space.

How does this compare with lossless compression algorithms? Two image files, "baboon" and "Valerie" were used as test files. Each contains 256x256 8-bit monochrome images in pure binary format: There are no file headers or any other information, and the images are stored line-after-line in the files. Both the popular compression program PKZIP and a DOS version of the UNIX compress utility failed to reduce the size of the two test images, and simply stored the files. Another popular file format for images is CompuServe's GIF format. A GIF-encoded version of "baboon" is about 20 percent larger, and a GIF-encoded version of "Valerie" is about 1 percent smaller than the original files. This is not too surprising, because the UNIX compress program and the GIF encoder both employ an LZW algorithm. (Sample GIF files are available eletronically; see page 3.) The files were also run through a Huffman compression program. The size of both files stayed practically the same.

In this instance we have two files that cannot easily be compressed with lossless compression algorithms, but can be compressed to 25 percent of their original size with a lossy compression algorithm. While the specific example may be extreme, it is generally true that much higher compression ratios are attainable with lossy compression algorithms than with lossless algorithms. (The price is a reduction in image quality.) Furthermore, as will hopefully become clear, this compression ratio is guaranteed with BTC, and is independent of the image properties. In fact, compression speed, decompression speed, and compression ratio are all essentially independent of the image, in contrast with error-free compression algorithms.

BTC is described in terms of Figure 1, adapted from Rosenfeld and Kak's Digital Picture Processing. Figure 1(a) shows a 4X4 block of pixels, labeled P0, P1 ,..., P15. The BTC algorithm operates by replacing the pixels in the block with only one of two judiciously chosen values, which we shall call a and b; see Figure 1(b). If a and b are made available to the decompressor, then the block of pixels can be represented with 1s and Os, as in Figure 1(c). This 16-bit bit plane can be coded into 2 bytes. Add to this 1 byte each for a and b, giving a total of 4 bytes required for the 4X4 block. Without compression, the block requires 16 bytes of storage. The savings is therefore 4/16 or 25 percent. This procedure is called "Block Truncation Compression" because the digital representation of the pixels in a block is truncated to 1 bit. The block truncation code (code) as well as a and b are needed to recover the block of pixels.

What values a and b are used to represent the pixels in the block? It was stated earlier that the idea behind the BTC algorithm is to preserve local image statistics. In particular, the first and second moments of a block of pixels are to be preserved. The first moment is simply the mean of the pixels in the block, as in Example 1(a), where n (=16) is the number of pixels in the block. The second moment is the mean of the squares of the pixel values; see Example 1(b). Let q be the number of pixels in the block greater or equal to the mean. When only a and b are used to represent the pixels in the block, the mean becomes that shown in Example 1(c), and the second moment becomes that shown in Example 1(d).

Because we want to preserve the mean and the second moment, we require these values to be the same as the original values. Thus, the equation in Example 1(a) is equated to that in 1(c) and the equation in 1(b) is equated to 1(d). When the two sets of equations are solved for a and b, we find the values in Example 1(e), where d and are given by Example 1(f).

The BTC algorithm now works as follows. First, divide the image into blocks of 4X4 pixels. For each block, compute the mean and second moments from the equations in Example 1(a) and Example 1(b). Second, construct a bit plane by replacing each pixel that has a value less than the mean with a 0, and pixels with values greater or equal to the mean with a 1. Third, pack the resulting bit plane into 2 bytes, which becomes the block truncation code. Finally, write a, b, and the 2-byte code to the output file.

Decompression is straightforward. First, read a, b, and a 2-byte block truncation code from the compressed file. Second, examine each bit in the 2-byte code. If a bit is clear, then display (or write to an output file) the value a. If a bit is set, display the value b. Repeat this until no more codes are in the file. Of course, the decompressed file differs from the original, which is why BTC is not an error-free compression algorithm. However, some local image statistics (the first and second moments) over 4X4 blocks of pixels are intact.

The following is a numerical example, taken from the references; refer to Figure 1(d). With the pixel values shown, it follows that n=16, m1=98.75, m2=18.39x103, and q=7. From this and the equations in Example 1(e) and Example 1(f), it follows that =92.95, d=0.882, a=17, and b=204, where a and b were rounded to the closest integer. If all pixels smaller than the mean are set to a, and the rest are set to b, the result is Figure 1(b). The bit-plane representation is given in Figure 1(c). One way of packing the bit plane is shown in Figure 1(d), and the block truncation code for this block is therefore 256* 136+227= 35043. Thus, for this block we write a=17 (1 byte), b=204 (1 byte), and code=35043 (2 bytes) to the output file.

Two special cases must be considered. The first is when q= n= 16, which occurs when all the pixel values are identical. In this instance one cannot compute d--see Example 1(f)--because a divide-by-zero error will result. This is handled by setting a= b= m1. The second case is when the computed values of a or b fall outside the closed interval [0...255]. This is handled by clamping values outside the interval so that they fall in the desired range.

Implementation

Listing One (page 104) shows code for compressing a 256x256 8-bit image with the BTC algorithm, and Listing Two (page 104) contains code for decompressing the compressed file. In this particular implementation, the image size, number of gray levels, and block size are all fixed and hard-coded into the algorithm. This was done for two reasons. The first is simplicity and clarity, and the second is that it is quite common to compress a large number of similar and related images on a regular basis. For example, a compression algorithm can be employed to compress the output of a scanner used to enter images into a personnel database. In any event, it should not be difficult to alter the code to suit other image sizes.

Listing One starts with a driver routine that calls btc4x4, the routine that does the actual compression. This routine first calls GetBlock, which fetches the next 4X4 block of pixels from the image file, and next calls GetStats, which computes the statistics a, b, and m1 (the mean) for each block according to the equations in Example 1(f). Next, each pixel in the block is examined and the appropriate bit in variable code is set, after which a, b, and code are written to the output file with a call to PutCode. The routine GetStats is quite straightforward. Floating-point calculations are used for simplicity, and rounding to the nearest integer is performed when required. The routine GetBlock maintains an internal four-line buffer that normally contains four lines of the image. A buffer pointer, bp, keeps track of the current location in the buffer. On each call, 16 pixels are copied from the four-line buffer, and bp is updated. After 64 calls, bp reaches the end of the four-line buffer, and is filled again until the end of the file is reached.

Listing Two starts with a driver for the decompression routine, btd4x4. This routine calls GetCode, which reads an a, b, and a BTC code from the compressed file. Each bit in the code is then examined, and a or b is stored in the variable block, depending on whether the bit is clear or not. It stands to reason that the same byte ordering, as well as the numbering of the bits must be followed by both the compression and decompression routines; see Figure 1(e). When block is full, PutBlock is called to write the block of pixels to an output file via a four-line buffer, quite similar to GetBlock. In other applications, PutBlock will display the block of pixels on a video display unit, or send the pixels to a printer, and so on.

As a final note on the implementation, the programs should be quite portable, though they were tested only on a PC under DOS. Two typedefs were used to create byte (1 byte) and word (2 byte) data-type names. You may need to change these to correspond to the proper variables on other systems.

Executable versions of both listings and the sample data files are available electronically.

Results and Discussion

How well does BTC work? This depends to some extent on the image, but the results are generally quite good for 4X4 block sizes. Figure 2(a) shows "Valerie," a 256x256 8-bit monochrome image. (The display was a VGA-compatible system with 6-bit 64-level gray scale capabilities.) In the BTC-compressed Figure 2(b), some block-iness is visible, especially around the baby's nose and chin, as well as the baby's right cheek. While it may not be clear on the photograph, there is also some false contouring on the baby's left cheek. This is typical of BTC, where errors show up as ragged edges and false contours in low-contrast areas. In any event, the image is acceptable, considering that it requires four times less disk space than the original.

Actually, some of the artifacts are due to the fact that a VGA-compatible system can only display 64 gray levels, and some block-iness and rough edges are present in the original Valerie image. When an image is going to be displayed on a VGA-compatible system, some additional savings are possible. A VGA-compatible system is only capable of displaying 64 gray levels, so we can save 2 bits each on a and b. When this is done, we need 1.6 bits/pixel, which implies 20 percent of the original disk space. However, implementation is a little bit more complex, because a, b, and the block truncation code no longer make up an integral number of bytes.

In the current implementation, we save a and b, computed from Example 1(e), along with the block truncation code to the output file. An alternative is to save the mean (m1) and standard deviation (), and then compute a and b during decompression. This is a slightly slower approach.

Finally, what happens when the BTC algorithm is applied to an image that has gone through the compress/decompress cycle once before--is there further reduction in image quality? The answer is generally no, which can be confirmed by working an example. Of course, if the image is compressed with another lossy compression algorithm, more image degradation will normally result.

References

Delp, E.J. and O.R. Mitchell. "Image Compression Using Block Truncation Coding." IEEE Transactions on Communications (September 1979).

Rosenfeld, A. and A.C. Kak. Digital Picture Processing, Second Edition. San Diego, Calif.: Academic Press, 1982.


Copyright © 1992, Dr. Dobb's Journal

Back to Table of Contents