# CHAPTER 37: APPROXIMATION ALGORITHMS

Performance bounds for approximation algorithms

Assume that we are working on an optimization problem in which each potential solution has a positive cost, and that we wish to find a near-optimal solution. Depending on the problem, an optimal solution may be defined as one with maximum possible cost or one with minimum possible cost; the problem may be a maximization or a minimization problem.

#### (37.1)

This definition applies for both minimization and maximization problems. For a maximization problem, 0 < C C*, and the ratio C*/C gives the factor by which the cost of an optimal solution is larger than the cost of the approximate solution. Similarly, for a minimization problem, 0 < C* C, and the ratio C/C* gives the factor by which the cost of the approximate solution is larger than the cost of an optimal solution. Since all solutions are assumed to have positive cost, these ratios are always well defined. The ratio bound of an approximation algorithm is never less than 1, since C/C* < 1 implies C*/C > 1. An optimal algorithm has ratio bound 1, and an approximation algorithm with a large ratio bound may return a solution that is very much worse than optimal.

where, as before, C* is the cost of an optimal solution and C is the cost of the solution produced by the approximation algorithm. The relative error is always nonnegative. An approximation algorithm has a relative error bound of (n) if

#### (37.2)

It follows from the definitions that the relative error bound can be bounded as a function of the ratio bound:

(n)  (n) - 1.

#### (37.3)

(For a minimization problem, this is an equality, whereas for a maximization problem, we have (n) = ((n) - 1) / (n), which satisfies inequality (37.3) since (n) 1.)

For many problems, approximation algorithms have been developed that have a fixed ratio bound, independent of n. For such problems, we simply use the notation or , indicating no dependence on n.

For other problems, computer scientists have been unable to devise any polynomial-time approximation algorithm having a fixed ratio bound. For such problems, the best that can be done is to let the ratio bound grow as a function of the input size n. An example of such a problem is the set-cover problem presented in Section 37.3.

Some NP-complete problems allow approximation algorithms that can achieve increasingly smaller ratio bounds (or, equivalently, increasingly smaller relative error bounds) by using more and more computation time. That is, there is a trade-off between computation time and the quality of the approximation. An example is the subset-sum problem studied in Section 37.4. This situation is important enough to deserve a name of its own.

The running time of a polynomial-time approximation scheme should not increase too rapidly as decreases. Ideally, if decreases by a constant factor, the running time to achieve the desired approximation should not increase by more than a constant factor. In other words, we would like the running time to be polynomial in 1/ as well as in n.

Chapter outline

The first three sections of this chapter present some examples of polynomial-time approximation algorithms for NP-complete problems, and the last section presents a fully polynomial-time approximation scheme. Section 37.1 begins with a study of the vertex-cover problem, an NP-complete minimization problem that has an approximation algorithm with a ratio bound of 2. Section 37.2 presents an approximation algorithm with ratio bound 2 for the case of the traveling-salesman problem in which the cost function satisfies the triangle inequality. It also shows that without triangle inequality, an -approximation algorithm cannot exist unless P = NP. In Section 37.3, we show how a greedy method can be used as an effective approximation algorithm for the set-covering problem, obtaining a covering whose cost is at worst a logarithmic factor larger than the optimal cost. Finally, Section 37.4 presents a fully polynomial-time approximation scheme for the subset-sum problem.

# 37.1 The vertex-cover problem

Even though it may be difficult to find an optimal vertex cover in a graph G, however, it is not too hard to find a vertex cover that is near-optimal. The following approximation algorithm takes as input an undirected graph G and returns a vertex cover whose size is guaranteed to be no more than twice the size of an optimal vertex cover.

#### Figure 37.1 The operation of APPROX-VERTEX-COVER. (a) The input graph G, which has 7 vertices and 8 edges. (b) The edge (b,c), shown heavy, is the first edge chosen by APPROX-VERTEX-COVER. Vertices b and c, shown lightly shaded, are added to the set A containing the vertex cover being created. Edges (a, b), (c, e), and (c,d), shown dashed, are removed since they are now covered by some vertex in A. (c) Edge (e, f) is added to A. (d) Edge (d,g) is added to A. (e) The set A, which is the vertex cover produced by APPROX-VERTEX-COVER, contains the six vertices b, c, d, e, f, g. (f) The optimal vertex cover for this problem contains only three vertices: b, d, and e.

APPROX-VERTEX-COVER has a ratio bound of 2.

Proof The set C of vertices that is returned by APPROX-VERTEX-COVER is a vertex cover, since the algorithm loops until every edge in E[G] has been covered by some vertex in C.

To see that APPROX-VERTEX-COVER returns a vertex cover that is at most twice the size of an optimal cover, let A denote the set of edges that were picked in line 4 of APPROX-VERTEX-COVER. No two edges in A share an endpoint, since once an edge is picked in line 4, all other edges that are incident on its endpoints are deleted from E' in line 6. Therefore, each execution of line 5 adds two new vertices to C, and |C| = 2 |A|. In order to cover the edges in A, however, any vertex cover--in particular, an optimal cover C*--must include at least one endpoint of each edge in A. Since no two edges in A share an endpoint, no vertex in the cover is incident on more than one edge in A. Therefore, |A| |C*|, and |C| 2 |C*|, proving the theorem.

## Exercises

Professor Nixon proposes the following heuristic to solve the vertex-cover problem. Repeatedly select a vertex of highest degree, and remove all of its incident edges. Give an example to show that the professor's heuristic does not have a ratio bound of 2.

# 37.2 The traveling-salesman problem

c(u, w)  c(u, v) + c(v, w).

The triangle inequality is a natural one, and in many applications it is automatically satisfied. For example, if the vertices of the graph are points in the plane and the cost of traveling between two vertices is the ordinary euclidean distance between them, then the triangle inequality is satisfied.

As Exercise 37.2-1 shows, restricting the cost function to satisfy the triangle inequality does not alter the NP-completeness of the traveling- salesman problem. Thus, it is unlikely that we can find a polynomial-time algorithm for solving this problem exactly. We therefore look instead for good approximation algorithms.

In Section 37.2.1, we examine an approximation algorithm for the traveling-salesman problem with triangle inequality that has a ratio bound of 2. In Section 37.2.2, we show that without triangle inequality, an approximation algorithm with constant ratio bound does not exist unless P = NP.

## 37.2.1 The traveling-salesman problem with triangle inequality

APPROX-TSP-TOUR(G, c)

1 select a vertex r  V[G] to be a "root" vertex

2 grow a minimum spanning tree T for G from root r

using MST-PRIM(G, c, r)

3 let L be the list of vertices visited in a preorder tree walk of T

4 return the hamiltonian cycle H that visits the vertices in the order L

Recall from Section 13.1 that a preorder tree walk recursively visits every vertex in the tree, listing a vertex when it is first encountered, before any of its children are visited.

Figure 37.2 illustrates the operation of APPROX-TSP-TOUR. Part (a) of the figure shows the given set of vertices, and part (b) shows the minimum spanning tree T grown from root vertex a by MST-PRIM. Part (c) shows how the vertices are visited by a preorder walk of T, and part (d) displays the corresponding tour, which is the tour returned by APPROX-TSP-TOUR. Part (e) displays an optimal tour, which is about 23% shorter.

The running time of APPROX-TSP-TOUR is (E) = (V2), since the input is a complete graph (see Exercise 24.2-2). We shall now show that if the cost function for an instance of the traveling-salesman problem satisfies the triangle inequality, then APPROX-TSP-TOUR returns a tour whose cost is not more than twice the cost of an optimal tour.

APPROX-TSP-TOUR is an approximation algorithm with a ratio bound of 2 for the traveling-salesman problem with triangle inequality.

Proof Let H* denote an optimal tour for the given set of vertices. An equivalent statement of the theorem is that c(H) 2c(H*), where H is the tour returned by APPROX-TSP-TOUR. Since we obtain a spanning tree by deleting any edge from a tour, if T is a minimum spanning tree for the given set of vertices, then

c(T)  c(H*) .

#### (37.4)

a, b, c, b, h, b, a, d, e, f, e, g, e, d, a.

Since the full walk traverses every edge of T exactly twice, we have

c(W) = 2c(T) .

#### (37.5)

Equations (37.4) and (37.5) imply that

c(W)  2c(H*),

#### (37.6)

and so the cost of W is within a factor of 2 of the cost of an optimal tour.

#### Figure 37.2 The operation of APPROX-TSP-TOUR. (a) The given set of points, which lie on vertices of an integer grid. For example, f is one unit to the right and two units up from h. The ordinary euclidean distance is used as the cost function between two points. (b) A minimum spanning tree T of these points, as computed by MST-PRIM. Vertex a is the root vertex. The vertices happen to be labeled in such a way that they are added to the main tree by MST-PRIM in alphabetical order. (c) A walk of T, starting at a. A full walk of the tree visits the vertices in the order a, b, c, b, h, b, a, d, e, f, e, g, e, d, a. A preorder walk of T lists a vertex just when it is first encountered, yielding the ordering a, b, c, h, d, e, f, g. (d) A tour of the vertices obtained by visiting the vertices in the order given by the preorder walk. This is the tour H returned by APPROX-TSP-TOUR. Its total cost is approximately 19.074. (e) An optimal tour H* for the given set of vertices. Its total cost is approximately 14.715.

a, b, c, h, d, e, f, g .

This ordering is the same as that obtained by a preorder walk of the tree T. Let H be the cycle corresponding to this preorder walk. It is a hamiltonian cycle, since every vertex is visited exactly once, and in fact it is the cycle computed by APPROX-TSP-TOUR. Since H is obtained by deleting vertices from the full walk W, we have

c(H)  c(W).

#### (37.7)

Combining inequalities (37.6) and (37.7) completes the proof.

In spite of the nice ratio bound provided by Theorem 37.2, APPROX-TSP-TOUR is usually not the best practical choice for this problem. There are other approximation algorithms that typically perform much better in practice (see the references at the end of this chapter).

## 37.2.2 The general traveling-salesman problem

If we drop the assumption that the cost function c satisfies the triangle inequality, good approximate tours cannot be found in polynomial time unless P = NP.

If P NP and 1, there is no polynomial-time approximation algorithm with ratio bound for the general traveling-salesman problem.

Proof The proof is by contradiction. Suppose to the contrary that for some number 1, there is a polynomial-time approximation algorithm A with ratio bound . Without loss of generality, we assume that is an integer, by rounding it up if necessary. We shall then show how to use A to solve instances of the hamiltonian-cycle problem (defined in Section 36.5.5) in polynomial time. Since the hamiltonian-cycle problem is NP-complete, by Theorem 36.14, solving it in polynomial time implies that P = NP, by Theorem 36.4.

Let G = (V, E) be an instance of the hamiltonian-cycle problem. We wish to determine efficiently whether G contains a hamiltonian cycle by making use of the hypothesized approximation algorithm A. We turn G into an instance of the traveling-salesman problem as follows. Let G' = (V, E') be the complete graph on V; that is,

E' = {(u, v): u, v  V and u  v}.

Assign an integer cost to each edge in E' as follows:

Representations of G' and c can be created from a representation of G in time polynomial in |V| and |E|.

Now, consider the traveling-salesman problem (G', c). If the originalgraph G has a hamiltonian cycle H, then the cost function c assigns to each edge of H a cost of 1, and so (G', c) contains a tour of cost |V| On the other hand, if G does not contain a hamiltonian cycle, then any tour of G' must use some edge not in E. But any tour that uses an edge not in E has a cost of at least

( |V| + 1) + (|V| - 1) >  |V| .

Because edges not in G are so costly, there is a large gap between the cost of a tour that is a hamiltonian cycle in G (cost |V|) and the cost of any other tour (cost greater than |V|).

What happens if we apply the approximation algorithm A to the traveling-salesman problem (G', c)? Because A is guaranteed to return a tour of cost no more than times the cost of an optimal tour, if G contains a hamiltonian cycle, then A must return it. If G has no hamiltonian cycle, then A returns a tour of cost more than |V|. Therefore, we can use A to solve the hamiltonian-cycle problem in polynomial time.

## Exercises

Show how in polynomial time we can transform one instance of the traveling-salesman problem into another instance whose cost function satisfies the triangle inequality. The two instances must have the same set of optimal tours. Explain why such a polynomial-time transformation does not contradict Theorem 37.3, assuming that P NP.

Suppose that the vertices for an instance of the traveling-salesman problem are points in the plane and that the cost c(u, v) is the euclidean distance between points u and v. Show that an optimal tour never crosses itself.

# 37.3 The set-covering problem

The set-covering problem is an optimization problem that models many resource-selection problems. It generalizes the NP-complete vertex-cover problem and is therefore also NP-hard. The approximation algorithm developed to handle the vertex-cover problem doesn't apply here, however, and so we need to try other approaches. We shall examine a simple greedy heuristic with a logarithmic ratio bound. That is, as the size of the instance gets larger, the size of the approximate solution may grow, relative to the size of an optimal solution. Because the logarithm function grows rather slowly, however, this approximation algorithm may nonetheless give useful results.

#### (37.8)

We say that any satisfying equation (37.8) covers X. Figure 37.3 illustrates the problem.

The set-covering problem is an abstraction of many commonly arising combinatorial problems. As a simple example, suppose that X represents a set of skills that are needed to solve a problem and that we have a given set of people available to work on the problem. We wish to form a committee, containing as few people as possible, such that for every requisite skill in X, there is a member of the committee having that skill. In the decision version of the set-covering problem, we ask whether or not a covering exists with size at most k, where k is an additional parameter specified in the problem instance. The decision version of the problem is NP-complete, as Exercise 37.3-2 asks you to show.

## A greedy approximation algorithm

In the example of Figure 37.3, GREEDY-SET-COVER adds to the sets Sl, S4, S5, S3 in order.

The algorithm works as follows. The set U contains, at each stage, the set of remaining uncovered elements. The set contains the cover being constructed. Line 4 is the greedy decision-making step. A subset S is chosen that covers as many uncovered elements as possible (with ties broken arbitrarily). After S is selected, its elements are removed from U, and S is placed in . When the algorithm terminates, the set contains a subfamily of that covers X.

The algorithm GREEDY-SET-COVER can easily be implemented to run in time polynomial in . Since the number of iterations of the loop on lines 3-6 is at most min , and the loop body can be implemented to run in time , there is an implementation that runs in time . Exercise 37.3-3 asks for a linear- time algorithm.

## Analysis

We now show that the greedy algorithm returns a set cover that is not too much larger than an optimal set cover. For convenience, in this chapter we denote the dth harmonic number (see Section 3.1) by H(d) .

GREEDY-SET-COVER has a ratio bound

Proof The proof proceeds by assigning a cost to each set selected by the algorithm, distributing this cost over the elements covered for the first time, and then using these costs to derive the desired relationship between the size of an optimal set cover and the size of the set cover returned by the algorithm. Let Si denote the ith subset selected by GREEDY-SET-COVER; the algorithm incurs a cost of 1 when it adds Si to . We spread this cost of selecting Si evenly among the elements covered for the first time by Si. Let cx denote the cost allocated to element x, for each x X. Each element is assigned a cost only once, when it is covered for the first time. If x is covered for the first time by Si, then

The algorithm finds a solution of total cost , and this cost has been spread out over the elements of X. Therefore, since the optimal cover also covers X, we have

#### (37.9)

The remainder of the proof rests on the following key inequality, which we shall prove shortly. For any set S belonging to the family ,

#### (37.10)

From inequalities (37.9) and (37.10), it follows that

proving the theorem. It thus remains only to prove inequality (37.10). For any set , let

be the number of elements in S remaining uncovered after S1, S2, . . . , Si have been selected by the algorithm. We define u0 = |S| to be the number of elements of S, which are all initially uncovered. Let k be the least index such that uk = 0, so that every element in S is covered by at least one of the sets S1, S2, . . . , Sk. Then, ui-1 ui, and ui-1 - ui elements of S are covered for the first time by Si, for i = 1, 2, . . . , k. Thus,

Observe that

because the greedy choice of Si guarantees that S cannot cover more new elements than Si does (otherwise, S would have been chosen instead of Si). Consequently, we obtain

For integers a and b, where a < b, we have

Using this inequality, we obtain the telescoping sum

since H(0) = 0. This completes the proof of inequality (37.10).

GREEDY-SET-COVER has a ratio bound of (ln |X| + 1).

Proof Use inequality (3.12) and Theorem 37.4.

In some applications, max is a small constant, and so the solution returned by GREEDY-SET-COVER is at most a small constant times larger than optimal. One such application occurs when this heuristic is used to obtain an approximate vertex cover for a graph whose vertices have degree at most 3. In this case, the solution found by GREEDY-SET-COVER is not more than H(3) = 11/6 times as large as an optimal solution, a performance guarantee that is slightly better than that of APPROX-VERTEX-COVER.

## Exercises

Consider each of the following words as a set of letters: {arid, dash, drain, heard, lost, nose, shun, slate, snare, thread}. Show which set cover GREEDY-SET-COVER produces when ties are broken in favor of the word that appears first in the dictionary.

Show how to implement GREEDY-SET-COVER in such a way that it runs in time .

Show that the following weaker form of Theorem 37.4 is trivially true:

Create a family of set-cover instances demonstrating that GREEDY-SET-COVER can return a number of different solutions that is exponential in the size of the instance. (Different solutions result from ties being broken differently in the choice of S in line 4.)

# 37.4 The subset-sum problem

The optimization problem associated with this decision problem arises in practical applications. In the optimization problem, we wish to find a subset of {x1, x2, . . . , xn} whose sum is as large as possible but not larger than t. For example, we may have a truck that can carry no more than t pounds, and n different boxes to ship, the ith of which weighs xi pounds. We wish to fill the truck as full as possible without exceeding the given weight limit.

In this section, we present an exponential-time algorithm for this optimization problem and then show how to modify the algorithm so that it becomes a fully polynomial-time approximation scheme. (Recall that a fully polynomial-time approximation scheme has a running time that is polynomial in 1/ as well as in n.)

## An exponential-time algorithm

If L is a list of positive integers and x is another positive integer, then we let L + x denote the list of integers derived from L by increasing each element of L by x. For example, if L = 1, 2, 3, 5, 9, then L + 2 = 3, 4, 5, 7, 11. We also use this notation for sets, so that

S + x = {s + x : s  S} .

EXACT-SUBSET-SUM(S, t)

1  n  |S|

2  L0  0

3  for i  1 to n

4  do Li  MERGE-LISTS(Li-1, Li-1 + xi)

5  remove from Li every element that is greater than t

6  return the largest element in Ln

Let Pi denote the set of all values that can be obtained by selecting a (possibly empty) subset of {x1, x2, . . . , xi} and summing its members. For example, if S = {1, 4, 5}, then

P1 = {0, 1} ,

P2 = {0, 1, 4, 5} ,

P3 = {0, 1, 4, 5, 6, 9, 10} .

Given the identity

#### (37.11)

we can prove by induction on i (see Exercise 37.4-1) that the list Li is a sorted list containing every element of Pi whose value is not more than t. Since the length of Li can be as much as 2i, EXACT-SUBSET-SUM is an exponential-time algorithm in general, although it is a polynomial-time algorithm in the special cases in which t is polynomial in |S| or all of the numbers in S are bounded by a polynomial in |S|.

## A fully polynomial-time approximation scheme

or, equivalently,

(1 - )y  z  y .

We can think of such a z as "representing" y in the new list L'. Each y is represented by a z such that the relative error of z, with respect to y, is at most . For example, if = 0.1 and

L = 10, 11, 12, 15, 20, 21, 22, 23, 24, 29,

then we can trim L to obtain

L' = 10, 12, 15, 20, 23, 29,

where the deleted value 11 is represented by 10, the deleted values 21 and 22 are represented by 20, and the deleted value 24 is represented by 23. It is important to remember that every element of the trimmed version of the list is also an element of the original version of the list. Trimming a list can dramatically decrease the number of elements in the list while keeping a close (and slightly smaller) representative value in the list for each element deleted from the list.

The following procedure trims an input list L = y1, y2, . . . , ym in time (m), assuming that L is sorted into nondecreasing order. The output of the procedure is a trimmed, sorted list.

TRIM(L, )

1  m  |L|

2  L'  <y1>

3  last  <y1>

4  for i  2 to m

5       do if last < (1 - )yi

6             then append yi onto the end of L'

7                  last  yi

8  return L'

The elements of L are scanned in increasing order, and a number is put into the returned list L' only if it is the first element of L or if it cannot be represented by the most recent number placed into L'.

Given the procedure TRIM, we can construct our approximation scheme as follows. This procedure takes as input a set S = {x1, x2, . . . , xn} of n integers (in arbitrary order), a target integer t, and an "approximation parameter" , where 0 < < 1.

APPROX-SUBSET-SUM(S, t, )

1  n  |S|

2  L0  0

3  for i  1 to n

4       do Li  MERGE-LISTS(Li-1, Li-1 + xi)

5          Li  TRIM(Li,  / n)

6          remove from Li every element that is greater than t

7  let z be the largest value in Ln

8  return z

Line 2 initializes the list L0 to be the list containing just the element 0. The loop in lines 3-6 has the effect of computing Li as a sorted list containing a suitably trimmed version of the set Pi, with all elements larger than t removed. Since Li is created from Li-1, we must ensure that the repeated trimming doesn't introduce too much inaccuracy. In a moment, we shall see that APPROX-SUBSET-SUM returns a correct approximation if one exists.

As an example, suppose we have the instance

L = 104, 102, 201, 101

with t = 308 and = 0.20. The trimming parameter is / 4 = 0.05. APPROX-SUBSET-SUM computes the following values on the indicated lines:

line 2:  L0  =  0 ,

line 4:  L1  =  0, 104 ,

line 5:  L1  =  0, 104 ,

line 6:  L1  =  0, 104 ,

line 4:  L2  =  0, 102, 104, 206 ,

line 5:  L2  =  0, 102, 206 ,

line 6:  L2  =  0, 102, 206 ,

line 4:  L3  =  0,102, 201, 206, 303, 407 ,

line 5:  L3  =  0,102, 201, 303, 407 ,

line 6:  L3  =  0, 102, 201, 303 ,

line 4:  L4  =  0, 101, 102, 201, 203, 302, 303, 404 ,

line 5:  L4  =  0,101, 201, 302, 404 ,

line 6:  L4  =  0,101, 201, 302 .

The algorithm returns z = 302 as its answer, which is well within = 20% of the optimal answer 307 = 104 + 102 + 101; in fact, it is within 2%.

APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.

Proof The operations of trimming Li in line 5 and removing from Li every element that is greater than t maintain the property that every element of Li is also a member of Pi. Therefore, the value z returned in line 8 is indeed the sum of some subset of S. It remains to show that it is not smaller than 1 - times an optimal solution. (Note that because the subset-sum problem is a maximization problem, equation (37.2) is equivalent to C*(1-) C.) We must also show that the algorithm runs in polynomial time.

To show that the relative error of the returned answer is small, note that when list Li is trimmed, we introduce a relative error of at most /n between the representative values remaining and the values before trimming. By induction on i, it can be shown that for every element y in Pi that is at most t, there is a z Li such that

(1 - /n)iy  z  y.

#### (37.12)

If y* Pn denotes an optimal solution to the subset-sum problem, then there is a z Ln such that

(1 - /n)ny*  z  y*;

#### (37.13)

the largest such z is the value returned by APPROX-SUBSET-SUM. Since it can be shown that

the function (1 - /n)n increases with n, so that n > 1 implies

1 -  <(1 - /n)n,

and thus,

(1 - )y*  z.

Therefore, the value z returned by APPROX-SUBSET-SUM is not smaller than 1 - times the optimal solution y*.

To show that this is a fully polynomial-time approximation scheme, we derive a bound on the length of Li. After trimming, successive elements z and z' of Li must have the relationship z/z' > 1/(1 - /n). That is, they must differ by a factor of at least 1/(1 - /n). Therefore, the number of elements in each Li is at most

using equation (2.10). This bound is polynomial in the number n of input values given, in the number of bits 1g t needed to represent t, and in 1/. Since the running time of APPROX-SUBSET-SUM is polynomial in the length of the Li, APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme.

## Exercises

Prove equation (37.11).

Prove equations (37.12) and (37.13).

How would you modify the approximation scheme presented in this section to find a good approximation to the smallest value not less than t that is a sum of some subset of the given input list?

# Problems

a. Prove that the problem of determining the minimum number of bins required is NP-hard. (Hint: Reduce from the subset-sum problem.)

The first-fit heuristic takes each object in turn and places it into the first bin that can accommodate it. Let

.

b. Argue that the optimal number of bins required is at least S.

c. Argue that the first-fit heuristic leaves at most one bin less than half full.

d. Prove that the number of bins used by the first-fit heuristic is never more than 2S.

e. Prove a ratio bound of 2 for the first-fit heuristic.

f. Give an efficient implementation of the first-fit heuristic, and analyze its running time.

a. Prove that the size of the maximum clique in G(k) is equal to the kth power of the size of the maximum clique in G.

b. Argue that if there is an approximation algorithm that has a constant ratio bound for finding a maximum-size clique, then there is a fully polynomial-time approximation scheme for the problem.

Show that the greedy set-covering heuristic can be generalized in a natural manner to provide an approximate solution for any instance of the weighted set-covering problem. Show that your heuristic has a ratio bound of H(d), where d is the maximum size of any set Si.

# Chapter notes

There is a wealth of literature on approximation algorithms. A good place to start is Garey and Johnson [79]. Papadimitriou and Steiglitz [154] also have an excellent presentation of approximation algorithms. Lawler, Lenstra, Rinnooy Kan, and Shmoys [133] provide an extensive treatment of the traveling-salesman problem.

Papadimitriou and Steiglitz attribute the algorithm APPROX-VERTEX-COVER to F. Gavril and M. Yannakakis. The algorithm APPROX-TSP-TOUR appears in an excellent paper by Rosenkrantz, Stearns, and Lewis [170]. Theorem 37.3 is due to Sahni and Gonzalez [172]. The analysis of the greedy heuristic for the set-covering problem is modeled after the proof published by Chvátal [42] of a more general result; this basic result as presented here is due to Johnson [113] and Lovász [141]. The algorithm APPROX-SUBSET-SUM and its analysis are loosely modeled after related approximation algorithms for the knapsack and subset-sum problem by Ibarra and Kim [111].