# CHAPTER 35: COMPUTATIONAL GEOMETRY

In this chapter, we look at a few computational-geometry algorithms in two dimensions, that is, in the plane. Each input object is represented as a set of points {pi}, where each pi = (xi, yi) and xi, yi R. For example, an n-vertex polygon P is represented by a sequence p0, p1, p2, . . . , pn-1 of its vertices in order of their appearance on the boundary of P. Computational geometry can also be performed in three dimensions, and even in higher- dimensional spaces, but such problems and their solutions can be very difficult to visualize. Even in two dimensions, however, we can see a good sample of computational-geometry techniques.

Section 35.1 shows how to answer simple questions about line segments efficiently and accurately: whether one segment is clockwise or counterclockwise from another that shares an endpoint, which way we turn when traversing two adjoining line segments, and whether two line segments intersect. Section 35.2 presents a technique called "sweeping" that we use to develop an O(n lg n)-time algorithm for determining whether there are any intersections among a set of n line segments. Section 35.3 gives two "rotational-sweep" algorithms that compute the convex hull (smallest enclosing convex polygon) of a set of n points: Graham's scan, which runs in time O(n 1g n), and Jarvis's march, which takes O(nh) time, where h is the number of vertices of the convex hull. Finally, Section 35.4 gives an O(n 1g n)-time divide-and-conquer algorithm for finding the closest pair of points in a set of n points in the plane.

# 35.1 Line-segment properties

In this section, we shall explore the following questions:

There are no restrictions on the given points.

We can answer each question in O(1) time, which should come as no surprise since the input size of each question is O(1). Moreover, our methods will use only additions, subtractions, multiplications, and comparisons. We need neither division nor trigonometric functions, both of which can be computationally expensive and prone to problems with round-off error. For example, the "straightforward" method of determining whether two segments intersect--compute the line equation of the form y = mx + b for each segment (m is the slope and b is the y-intercept), find the point of intersection of the lines, and check whether this point is on both segments--uses division to find the point of intersection. When the segments are nearly parallel, this method is very sensitive to the precision of the division operation on real computers. The method in this section, which avoids division, is much more accurate.

## Cross products

#### Figure 35.1 (a) The cross product of vectors p1 and p2 is the signed area of the parallelogram. (b) The lightly shaded region contains vectors that are clockwise from p. The darkly shaded region contains vectors that are counterclockwise from p.

a matrix:1

1 Actually, the cross product is a three-dimensional concept. It is a vector that is perpendicular to both p1 and p2 according to the "right-hand rule" and whose magnitude is |x1y2 - x2y1|. In this chapter, however, it will prove convenient to treat the cross product simply as the value x1y2 - x2y1.

To determine whether a directed segment is clockwise from a directed segment with respect to their common endpoint p0, we simply translate to use p0 as the origin. That is, we let p1 - p0 denote the vector p'1 = (x'1, y'1), where x'1 = x1 - x0 and y'1 = y1 - y0, and we define p2 - p0 similarly. We then compute the cross product

`(p1 - p0) x (p2 - p0) = (x1 - x0) (y2 - y0) - (x2 - x0) (y1 - y0).`

If this cross product is positive, then is clockwise from ; if negative, it is counterclockwise.

## Determining whether two line segments intersect

is true. The rectangles must intersect in both dimensions. The first two comparisons above determine whether the rectangles intersect in x; the second two comparisons determine whether the rectangles intersect in y.

The second stage in determining whether two line segments intersect decides whether each segment "straddles" the line containing the other. A segment straddles a line if point p1 lies on one side of the line and point p2 lies on the other side. If p1 or p2 lies on the line, then we say that the segment straddles the line. Two line segments intersect if and only if they pass the quick rejection test and each segment straddles the line containing the other.

#### Figure 35.3 Determining whether line segment straddles the line containing segment . (a) If it does straddle, then the signs of the cross products (p3 - p1) x (p2 - p1) and (p4 - p1) x (p2 - p1) differ. (b) If it does not straddle, then the signs of the cross products are the same. (c)-(d) Boundary cases in which at least one of the cross products is zero and the segment straddles. (e) A boundary case in which the segments are collinear but do not intersect. Both cross products are zero, but they would not be computed by our algorithm because the segments fail the quick rejection test--their bounding boxes do not intersect.

We can use the cross-product method to determine whether line segment straddles the line containing points p1 and p2. The idea, as shown in Figures 35.3(a) and (b), is to determine whether directed segments and have opposite orientations relative to . If so, then the segment straddles the line. Recalling that we can determine relative orientations with cross products, we just check whether the signs of the cross products (p3 - p1) X (p2 - p1) and (p4 - p1) X (p2 - p1) are different. A boundary condition occurs if either cross product is zero. In this case, either p3 or p4 lies on the line containing segment . Because the two segments have already passed the quick rejection test, one of the points p3 and p4 must in fact lie on segment . Two such situations are shown in Figures 35.3(c) and (d). The case in which the two segments are collinear but do not intersect, shown in Figure 35.3(e), is eliminated by the quick rejection test. A final boundary condition occurs if one or both of the segments has zero length, that is, if its endpoints are coincident. If both segments have zero length, then the quick rejection test suffices. If just one segment, say , has zero length, then the segments intersect if and only if the cross product (p3 - p1) x (p2 - p1) is zero.

## Other applications of cross products

Later sections of this chapter will introduce additional uses for cross products. In Section 35.3, we shall need to sort a set of points according to their polar angles with respect to a given origin. As Exercise 35.1-2 asks you to show, cross products can be used to perform the comparisons in the sorting procedure. In Section 35.2, we shall use red-black trees to maintain the vertical ordering of a set of nonintersecting line segments. Rather than keeping explicit key values, we shall replace each key comparison in the red-black tree code by a cross-product calculation to determine which of two segments that intersect a given vertical line is above the other.

## Exercises

Prove that if p1 X p2 is positive, then vector p1 is clockwise from vector p2 with respect to the origin (0, 0) and that if this cross product is negative, then p1 is counterclockwise from p2.

Write pseudocode to sort a sequence p1, p2, . . . . ,pn) of n points according to their polar angles with respect to a given origin point p0. Your procedure should take 0(n lg n) time and use cross products to compare angles.

Professor Amundsen proposes the following method to determine whether a sequence p0, p1, . . . . , pn-1 of n points forms the consecutive vertices of a convex polygon. (See Section 16.4 for definitions pertaining to polygons.) Output "yes" if the set {pipi + 1pi + 2 : i = 0, 1, . . . ,n - 1}, where subscript addition is performed modulo n, does not contain both left turns and right turns; otherwise, output "no." Show that although this method runs in linear time, it does not always produce the correct answer. Modify the professor's method so that it always produces the correct answer in linear time.

Given a point p0 = (x0, y0), the right horizontal ray from p0 is the set of points { pi = (xi, yi) : xi x0 and yi = y0}, that is, it is the set of points due right of p0 along with p0 itself. Show how to determine whether a given right horizontal ray from p0 intersects a line segment in O(1) time by reducing the problem to that of determining whether two line segments intersect.

Show how to compute the area of an n-vertex simple, but not necessarily convex, polygon in (n) time.

# 35.2 Determining whether any pair of segments intersects

The algorithm runs in O(n lg n) time, where n is the number of segments we are given. It determines only whether or not any intersection exists; it does not print all the intersections. (By Exercise 35.2-1, it takes (n2) time in the worst case to find all the intersections in a set of n line segments.)

Our algorithm for determining whether any two of n line segments intersect makes two simplifying assumptions. First, we assume that no input segment is vertical. Second, we assume that no three input segments intersect at a single point. (Exercise 35.2-8 asks you to describe an implementation that works even if these assumptions fail to hold.) Indeed, removing such simplifying assumptions and dealing with boundary conditions is often the most difficult part of programming computational-geometry algorithms and proving their correctness.

## Ordering segments

Since we assume that there are no vertical segments, any input segment that intersects a given vertical sweep line intersects it at a single point. We can thus order the segments that intersect a vertical sweep line according to the y-coordinates of the points of intersection.

For any given x, the relation ">x" is a total order (see Section 5.2) on segments that intersect the sweep line at X. The order may differ for differing values of x, however, as segments enter and leave the ordering. A segment enters the ordering when its left endpoint is encountered by the sweep, and it leaves the ordering when its right endpoint is encountered.

What happens when the sweep line passes through the intersection of two segments? As Figure 35.4(b) shows, their positions in the total order are reversed. Sweep lines v and w are to the left and right, respectively, of the point of intersection of segments e and f, and we have e >v f and f >w e. Note that because we assume that no three segments intersect at the same point, there must be some vertical sweep line x for which intersecting segments e and f are consecutive in the total order >x. Any sweep line that passes through the shaded region of Figure 35.4(b), such as z, has e and f consecutive in its total order.

## Moving the sweep line

Sweeping algorithms typically manage two sets of data:

The sweep-line status is a total order T, for which we require the following operations:

INSERT(T, s): insert segment s into T.

DELETE(T, s): delete segment s from T.

## Segment-intersection pseudocode

The following algorithm takes as input a set S of n line segments, returning the boolean value TRUE if any pair of segments in S intersects, and FALSE otherwise. The total order T is implemented by a red-black tree.

#### Figure 35.5 The execution of ANY-SEGMENTS-INTERSECT. Each dashed line is the sweep line at an event point, and the ordering of segment names below each sweep line is the total order T at the end of the for loop in which the corresponding event point is processed. The intersection of segments d and b is found when segment c is deleted.

`ANY-SEGMENTS-INTERSECT(S)`

`2  sort the endpoints of the segments in S from left to right,`

`breaking ties by putting points with lower y-coordinates first`

`3  for each point p in the sorted list of endpoints`

`4      do if p is the left endpoint of a segment s`

`5            then INSERT(T, s)`

`6                 if (ABOVE(T, s) exists and intersects s)`

`or (BELOW(T, s) exists and intersects s)`

`7                    then return TRUE`

`8                 if p is the right endpoint of a segment s`

`9                    then if both ABOVE(T, s) and BELOW(T, s) exist`

`and ABOVE(T, s) intersects BELOW(T, s)`

`10                            then return TRUE`

`11                         DELETE(T, s)`

`12  return FALSE`

Figure 35.5 illustrates the execution of the algorithm. Line 1 initializes the total order to be empty. Line 2 determines the event-point schedule by sorting the 2n segment endpoints from left to right, breaking ties by putting points with lower y-coordinates first. Note that line 2 can be performed by lexicographically sorting the endpoints on (x, y).

Each iteration of the for loop of lines 3-11 processes one event point p. If p is the left endpoint of a segment s, line 5 adds s to the total order, and lines 6-7 return TRUE if s intersects either of the segments it is consecutive with in the total order defined by the sweep line passing through p. (A boundary condition occurs if p lies on another segment s'. In this case, we only require that s and s' be placed consecutively into T.) If p is the right endpoint of a segment s, then s is to be deleted from the total order. Lines 9-10 return TRUE if there is an intersection between the segments surrounding s in the total order defined by the sweep line passing through p; these segments will become consecutive in the total order when s is deleted. If these segments do not intersect, line 11 deletes segment s from the total order. Finally, if no intersections are found in processing all the 2n event points, line 12 returns FALSE.

## Correctness

The following theorem shows that ANY-SEGMENTS-INTERSECT is correct.

The call ANY-SEGMENTS-INTERSECT(S) returns TRUE if and only if there is an intersection among the segments in S.

Proof The procedure can be incorrect only by returning TRUE when no intersection exists or by returning FALSE when there is at least one intersection. The former case cannot occur, because ANY-SEGMENTS-INTERSECT returns TRUE only if it finds an intersection between two of the input segments.

To show that the latter case cannot occur, let us suppose for the sake of contradiction that there is at least one intersection, yet ANY-SEGMENTS-INTERSECT returns FALSE. Let p be the leftmost intersection point, breaking ties by choosing the one with the lowest y-coordinate, and let a and b be the segments that intersect at p. Since no intersections occur to the left of p, the order given by T is correct at all points to the left of p. Because no three segments intersect at the same point, there exists a sweep line z at which a and b become consecutive in the total order.2 Moreover, z is to the left of p or goes through p. There exists a segment endpoint q on sweep line z that is the event point at which a and b become consecutive in the total order. If p is on sweep line z, then q = p. If p is not on sweep line z, then q is to the left of p. In either case, the order given by T is correct just before q is processed. (Here we rely on p being the lowest of the leftmost intersection points. Because of the lexicographical order in which event points are processed, even if p is on sweep line z and there is another intersection point p' on z, event point q = p is processed before the other intersection p' can interfere with the total order T.) There are only two possibilities for the action taken at event point q:

2If we allow three segments to intersect at the same point, there may be an intervening segment c that intersects both a and b at point p. That is, we may have a <w c and c <w b for all sweep lines w to the left of p for which a <w b.

1. Either a or b is inserted into T, and the other segment is above or below it in the total order. Lines 4-7 detect this case.

2. Segments a and b are already in T, and a segment between them in the total order is deleted, making a and b become consecutive. Lines 8-11 detect this case.

In either case, the intersection p is found, contradicting the assumption that the procedure returns FALSE.

## Running time

If there are n segments in set S, then ANY-SEGMENTS-INTERSECT runs in time O(n lg n). Line 1 takes O(1) time. Line 2 takes O(n lg n) time, using merge sort or heapsort. Since there are 2n event points, the for loop of lines 3-11 iterates at most 2n times. Each iteration takes 0(lg n) time, since each red-black-tree operation takes O(lg n) time and, using the method of Section 35.1, each intersection test takes O(1) time. The total time is thus O(n lg n).

## Exercises

Show that there may be (n2) intersections in a set of n line segments.

Given two nonintersecting segments a and b that are comparable at x, show how to use cross products to determine in O(1) time which of a >x b or b >x a holds.

Show how to implement the red-black-tree procedures so that ANY-SEGMENTS-INTERSECT works correctly even if some segments are vertical or more than three segments intersect at the same point. Prove that your implementation is correct.

# 35.3 Finding the convex hull

In this section, we shall present two algorithms that compute the convex hull of a set of n points. Both algorithms output the vertices of the convex hull in counterclockwise order. The first, known as Graham's scan, runs in O(n lg n) time. The second, called Jarvis's march, runs in O(nh) time, where h is the number of vertices of the convex hull. As can be seen from Figure 35.6, every vertex of CH(Q) is a point in Q. Both algorithms exploit this property, deciding which vertices in Q to keep as vertices of the convex hull and which vertices in Q to throw out.

## Graham's scan

#### Figure 35.7 The execution of GRAHAM-SCAN on the set Q of Figure 35.6. The current convex hull contained in stack S is shown in gray at each step. (a) The ordered polar angles of p1,p2, . . . , p12 relative to p0 and the initial stack S containing p0, p1, and p2. (b)-(k) Stack S after each iteration of the for loop of lines 7-10. Dashed lines show nonleft turns, which cause points to be popped from the stack. In part (h), for example, the right turn at angle p7p8p9 causes p8 to be popped, and then the right turn at angle p6p7p9 causes p7 to be popped. (l) The convex hull returned by the procedure, which matches that of Figure 35.6.

`GRAHAM-SCAN(Q)`

`1  let p0 be the point in Q with the minimum y-coordinate,`

`or the leftmost such point in case of a tie`

`2  let p1, p2, . . . , pm be the remaining points in Q,`

`sorted by polar angle in counterclockwise order around p0`

`(if more than point has the same angle, remove all but`

`the one that is farthest from p0)`

`3  top[S] 0`

`4  PUSH(p0,S)`

`5  PUSH(p1,S)`

`6  PUSH(p2,S)`

`7  for i  3 to m`

`8        do while the angle formed by points NEXT-TO-TOP(S),`

`TOP(S), and pi makes a nonleft turn`

`9               do POP(S)`

`10           PUSH(S,pi)`

`11  return S`

Figure 35.7 illustrates the progress of GRAHAM-SCAN. Line 1 chooses point p0 as the point with the lowest y-coordinate, picking the leftmost such point in case of a tie. Since there is no point in Q that is below p0 and any other points with the same y-coordinate are to its right, p0 is a vertex of CH(Q). Line 2 sorts the remaining points of Q by polar angle relative to p0, using the same method--comparing cross products--as in Exercise 35.1-2. If two or more points have the same polar angle relative to p0, all but the farthest such point are convex combinations of p0 and the farthest point, and so we remove them entirely from consideration. We let m denote the number of points other than p0 that remain. The polar angle, measured in radians, of each point in Q relative to p0 is in the half-open interval [0, /2). Since polar angles increase in a counterclockwise fashion, the points are sorted in counterclockwise order relative to p0. We designate this sorted sequence of points by p1, p2, . . . , pm . Note that points p1 and pm are vertices of CH(Q) (see Exercise 35.3-1). Figure 35.7(a) shows the points of Figure 35.6, with the ordered polar angles of p1, p2, . . . , p12 relative to p0.

The remainder of the procedure uses the stack S Lines 3-6 initialize the stack to contain, from bottom to top, the first three points p0, p1, and p2. Figure 35.7(a) shows the initial stack S. The for loop of lines 7-10 iterates once for each point in the subsequent p3, p4, . . . ,pm) The intent is that after processing point pi, stack S contains, from bottom to top, the vertices of CH({p0,p1, . . . ,pi}) in counterclockwise order. The while loop of lines 8-9 removes points from the stack if they are found not to be vertices of the convex hull. When we traverse the convex hull counterclockwise, we should make a left turn at each vertex. Thus, each time the while loop finds a vertex at which we make a nonleft turn, the vertex is popped from the stack. (By checking for a nonleft turn, rather than just a right turn, this test precludes the possibility of a straight angle at a vertex of the resulting convex hull. This is just what we want, since every vertex of a convex polygon must not be a convex combination of other vertices of the polygon.) After we pop all the vertices that have nonleft turns when heading toward point pi, we push pi onto the stack. Figures 35.7(b)-(k) show the state of the stack S after each iteration of the for loop. Finally, GRAHAM-SCAN returns the stack S in line 11. Figure 35.7(1) shows the corresponding convex hull.

The following theorem formally proves the correctness of GRAHAM-SCAN.

If GRAHAM-SCAN is run on a set Q of points, where |Q| 3, then a point of Q is on the stack S at termination if and only if it is a vertex of CH(Q).

Proof As noted above, a vertex that is a convex combination of p0 and some other vertex in Q is not a vertex of CH(Q). Such a vertex is not included in the sequence pl, p2, . . . , p m, and so it can never appear on stack S.

The crux of the proof lies in the two situations shown in Figure 35.8. Part (a) deals with nonleft turns, and part (b) deals with left turns.

We first show that each point popped from stack S is not a vertex of CH(Q). Suppose that point pj is popped from the stack because angle pkpjpi makes a nonleft turn, as shown in Figure 35.8(a). Because we scan the points in order of increasing polar angle relative to point p0, there is a triangle with point pj either in the interior of the triangle or on line segment . In either case, point pj cannot be a vertex of CH(Q).

We now show that each point on stack S is a vertex of CH(Q) at termination. We start by proving the following claim: GRAHAM-SCAN maintains the invariant that the points on stack S always form the vertices of a convex polygon in counterclockwise order.

The claim holds immediately after the execution of line 6, since points p0, p1, and p2 form a convex polygon. Now we examine how stack S changes during the course of GRAHAM-SCAN. Points are either popped or pushed. In the former case, we rely on a simply geometrical property: if a vertex is removed from a convex polygon, the resulting polygon is convex. Thus, popping a point from stack S preserves the invariant.

#### Figure 35.8 The two basic situations in the proof of correctness of GRAHAM- SCAN. (a) Showing that a point popped from the stack in GRAHAM-SCAN is not a vertex of CH(Q). If point pj is popped from the stack because angle pkpjpi makes a nonleft turn, then the shaded triangle contains point pj. Point pj is therefore not a vertex of CH(Q). (b) If point pi is pushed onto the stack, then there must be a left turn at angle pkpjpi. Because pi follows pj in the polar-angle ordering of points and because of how p0 was chosen, pi must be in the shaded region. If the points on the stack form a convex polygon before the push, then they must form a convex polygon afterward.

Before we consider the case in which a point is pushed onto the stack, let us examine another geometrical property, illustrated in Figures 35.9(a) and (b). Let P be a convex polygon, and choose any side of P. Consider the region bounded by and the extensions of the two adjacent sides. (Depending on the relative angles of the adjacent sides, the region may be either bounded, like the shaded region in part (a), or unbounded, like the shaded region in part (b).) If we add any point ps in this region to P as a new vertex, with the sides replacing side , the resulting polygon is convex.

Now consider a point pi that is pushed onto S. Referring back to Figure 35.8(b), let pj be the vertex on the top of S just prior to pushing pi, and let pk be the predecessor of pj on S. We claim that pi must fall within the shaded region of Figure 35.8(b), which corresponds directly to the shaded regions of Figure 35.9. Because the angle pkpjpi makes a left turn, pi must be on the shaded side of the extension of . Because pi follows pj in the polar-angle ordering, it must be on the shaded side of . Moreover, because of how we chose p0, point pi must be on the shaded side of the extension of . Thus, pi is in the shaded region, and therefore after pi has been pushed onto stack S, the points on S form a convex polygon. This completes the proof of the claim.

At the end of GRAHAM-SCAN, therefore, the points of Q that are on stack S form the vertices of a convex polygon. We have shown that all points not on S are not vertices of CH(Q) or, equivalently, that all vertices of CH(Q) are on S. Since S contains only vertices from Q and its points form a convex polygon, they must form CH(Q).

#### Figure 35.9 Adding a point in the shaded region to a convex polygon P yields another convex polygon. The shaded region is bounded by a side of and the extensions of the two adjacent sides. (a) The shaded region is bounded. (b) The shaded region is unbounded.

We now show that the running time of GRAHAM-SCAN is O(n lg n), where n = |Q|. Line 1 takes(n) time. Line 2 takes O(n lg n) time, using merge sort or heapsort to sort the polar angles and the cross-product method of Section 35.1 to compare angles. (Removing all but the farthest point with the same polar angle can be done in a total of O(n) time.) Lines 3-6 take O(1) time. Because m n - 1, the for loop of lines 7-10 is executed at most n - 3 times. Since PUSH takes O(1) time, each iteration takes O(1) time exclusive of the time spent in the while loop of lines 8-9, and thus overall the for loop takes O(n) time exclusive of the nested while loop.

## Jarvis's march

#### Figure 35.10 The operation of Jarvis's march. The first vertex chosen is the lowest point p0. The next vertex, p1, has the least polar angle of any point with respect to p0. Then, p2 has the least polar angle with respect to p1. The right chain goes as high as the highest point p3. Then, the left chain is constructed by finding least polar angles with respect to the negative x-axis.

Intuitively, Jarvis's march simulates wrapping a taut piece of paper around the set Q. We start by taping the end of the paper to the lowest point in the set, that is, to the same point p0 with which we start Graham's scan. This point is a vertex of the convex hull. We pull the paper to the right to make it taut, and then we pull it higher until it touches a point. This point must also be a vertex of the convex hull. Keeping the paper taut, we continue in this way around the set of vertices until we come back to our original point p0.

We could implement Jarvis's march in one conceptual sweep around the convex hull, that is, without separately constructing the right and left chains. Such implementations typically keep track of the angle of the last convex-hull side chosen and require the sequence of angles of hull sides to be strictly increasing (in the range of 0 to 2 radians). The advantage of constructing separate chains is that we need not explicitly compute angles; the techniques of Section 35.1 suffice to compare angles.

If implemented properly, Jarvis's march has a running time of O(nh). For each of the h vertices of CH(Q), we find the vertex with the minimum polar angle. Each comparison between polar angles takes O(1) time, using the techniques of Section 35.1. As Section 10.1 shows, we can compute the minimum of n values in O(n) time if each comparison takes O(1) time. Thus, Jarvis's march takes O(nh) time.

## Exercises

Prove that in the procedure GRAHAM-SCAN, points p1 and pm must be vertices of CH(Q).

Given a set of points Q, prove that the pair of points farthest from each other must be vertices of CH(Q).

#### Figure 35.11 The definition of a star-shaped polygon, for use in Exercise 35.3-4. (a) A star-shaped polygon. The segment from point p to any point q on the boundary intersects the boundary only at q. (b) A non-star-shaped polygon. The shaded region on the left is the shadow of q, and the shaded region on the right is the shadow of q'. Since these regions are disjoint, the kernel is empty.

Show how to implement the incremental method for computing the convex hull of n points so that it runs in O(n lg n) time.

# 35.4 Finding the closest pair of points

A brute-force closest-pair algorithm simply looks at all the pairs of points. In this section, we shall describe a divide-and-conquer algorithm for this problem whose running time is described by the familiar recurrence T(n) = 2T(n/2) + O(n). Thus, this algorithm uses only O(n lg n) time.

## The divide-and-conquer algorithm

A given recursive invocation with inputs P, X, and Y first checks whether |P| 3. If so, the invocation simply performs the brute-force method described above: try all pairs of points and return the closest pair. If |P| > 3, the recursive invocation carries out the divide-and-conquer paradigm as follows.

Divide: It finds a vertical line l that bisects the point set P into two sets PL and PR such that |PL| = |P|/2, |PR| = |P| /2, all points in PL are on or to the left of line l, and all points in PR are on or to the right of l. The array X is divided into arrays XL and XR, which contain the points of PL and PR respectively, sorted by monotonically increasing x-coordinate. Similarly, the array Y is divided into arrays YL and YR, which contain the points of PL and PR respectively, sorted by monotonically increasing y-coordinate.

Conquer: Having divided P into PL and PR, it makes two recursive calls, one to find the closest pair of points in PL and the other to find the closest pair of points in PR. The inputs to the first call are the subset PL and arrays XL and YL; the second call receives the inputs PR, XR, and YR. Let the closest-pair distances returned for PL and PR be L and R, respectively, and let = min(L, R).

Combine: The closest pair is either the pair with distance found by oneof the recursive calls, or it is a pair of points with one point in PL and the other in PR. The algorithm determines if there is such a pair whose distance is less than . Observe that if there is a pair of points with distance less than , both points of the pair must be within units of line l. Thus, as Figure 35.12(a) shows, they both must reside in the 2-wide vertical strip centered at line l. To find such a pair, if one exists, the algorithm does the following.

1. It creates an array Y', which is the array Y with all points not in the 2-wide vertical strip removed. The array Y' is sorted by y-coordinate, just as Y is.

2. For each point p in the array Y', the algorithm tries to find points in Y' that are within units of p. As we shall see shortly, only the 7 points in Y' that follow p need be considered. The algorithm computes the distance from p to each of these 7 points and keeps track of the closest-pair distance ' found over all pairs of points in Y'.

3. If ' < , then the vertical strip does indeed contain a closer pair than was found by the recursive calls. This pair and its distance ' are returned. Otherwise, the closest pair and its distance found by the recursive calls are returned.

#### Figure 35.12 Key concepts in the proof that the closest-pair algorithm needs to check only 7 points following each point in the array Y'. (a) If PL PL and PR PR are less than units apart, they must reside within a x 2 rectangle centered at line l. (b) How 4 points that are pairwise at least units apart can all reside within a x square. On the left are 4 points in PL, and on the right are 4 points in PR There can be 8 points in the x 2 rectangle if the points shown on line l are actually pairs of coincident points with one point in PL and one in PR.

The above description omits some implementation details that are necessary to achieve the O(n 1g n) running time. After proving the correctness of the algorithm, we shall show how to implement the algorithm to achieve the desired time bound.

## Correctness

The correctness of this closest-pair algorithm is obvious, except for two aspects. First, by bottoming out the recursion when |P| 3, we ensure that we never try to divide a set of points with only one point. The second aspect is that we need only check the 7 points following each point p in array Y'; we shall now prove this property.

Suppose that at some level of the recursion, the closest pair of points is pL PL and PR PR. Thus, the distance ' between PL and PR is strictly less than . Point pL must be on or to the left of line l and less than units away. Similarly, pR is on or to the right of l and less than units away. Moreover, pL and pR are within units of each other vertically. Thus, as Figure 35.12(a) shows, pL and pR are within a X 2 rectangle centered at line l. (There may be other points within this rectangle as well.)

We next show that at most 8 points of P can reside within this X 2 rectangle. Consider the X square forming the left half of this rectangle. Since all points within PL are at least units apart, at most 4 points can reside within this square; Figure 35.12(b) shows how. Similarly, at most 4 points in PR can reside within the X square forming the right half of the rectangle. Thus, at most 8 points of P can reside within the X 2 rectangle. (Note that since points on line l may be in either PL or PR, there may be up to 4 points on l. This limit is achieved if there are two pairs of coincident points, each pair consisting of one point from PL and one point from PR, one pair is at the intersection of l and the top of the rectangle, and the other pair is where l intersects the bottom of the rectangle.)

Having shown that at most 8 points of P can reside within the rectangle, it is easy to see that we need only check the 7 points following each point in the array Y'. Still assuming that the closest pair is PL and PR, let us assume without loss of generality that pL precedes pR in array Y'. Then, even if pL occurs as early as possible in Y' and pR occurs as late as possible, pR is in one of the 7 positions following pL. Thus, we have shown the correctness of the closest-pair algorithm.

## Implementation and running time

As we have noted, our goal is to have the recurrence for the running time be T(n) = 2T(n/2) + O(n), where T(n) is, of course, the running time for a set of n points. The main difficulty is in ensuring that the arrays XL, XR, YL, and YR, which are passed to recursive calls, are sorted by the proper coordinate and also that the array Y' is sorted by y-coordinate. (Note that if the array X that is received by a recursive call is already sorted, then the division of set P into PL and PR is easily accomplished in linear time.)

The key observation is that in each call, we wish to form a sorted subset of a sorted array. For example, a particular invocation is given the subset P and the array Y, sorted by y-coordinate. Having partitioned P into PL and PR, it needs to form the arrays YL and YR, which are sorted by y-coordinate. Moreover, these arrays must be formed in linear time. The method can be viewed as the opposite of the MERGE procedure from merge sort in Section 1.3.1: we are splitting a sorted array into two sorted arrays. The following pseudocode gives the idea.

`1  length[YL]  length[YR]  0`

`2  for i  1 to length[Y]`

`3        do if Y[i]  PL`

`4              then length[YL]  length[YL] + 1`

`5                   Y[length[YL]]  Y[i]`

`6              else length[YR]  length[YR] + 1`

`7                   Y[length[YR]]  Y[i]`

We simply examine the points in array Y in order. If a point Y[i] is in PL, we append it to the end of array YL; otherwise, we append it to the end of array YR. Similar pseudocode works for forming arrays XL, XR, and Y'.

Thus, T(n) = O(n lg n) and T'(n) = O(n lg n).

## Exercises

Professor Smothers comes up with a scheme that allows the closest-pair algorithm to check only 5 points following each point in array Y'. The idea is always to place points on line l into set PL. Then, there cannot be pairs of coincident points on line l with one point in PL and one in PR. Thus, at most 6 points can reside in the x 2 rectangle. What is the flaw in the professor's scheme?

Without increasing the asymptotic running time of the algorithm, show how to ensure that the set of points passed to the very first recursive call contains no coincident points. Prove that it then suffices to check the points in the 6 (not 7) array positions following each point in the array Y'. Why doesn't it suffice to check only the 5 array positions following each point?

Given two points p1 and p2 in the plane, the L-distance between them is max(|x1 - x2|, |y1 - y2|). Modify the closest-pair algorithm to use the L-distance.

# Problems

a. Give an O(n2)-time algorithm to find the convex layers of a set on n points.

b. Prove that (n lg n) time is required to compute the convex layers of a set of n points on any model of computation that requires (n lg n) time to sort n real numbers.

Suppose that Q has k nonempty maximal layers, and let yi be the y-coordinate of the leftmost point in Li for i = 1, 2, . . . ,k. For now, assume that no two points in Q have the same x - or y-coordinate.

a. Show that y1 > y2 > . . . > yk.

Consider a point (x, y) that is to the left of any point in Q and for which y is distinct from the y-coordinate of any point in Q. Let Q' = Q {(x, y)}.

b. Let j be the minimum index such that yj < y, unless y < yk, in which case we let j = k + 1. Show that the maximal layers of Q' are as follows.

If j k, then the maximal layers of Q' are the same as the maximal layers of Q, except that Lj also includes (x, y) as its new leftmost point.

If j = k + 1, then the first k maximal layers of Q' are the same as for Q, but in addition, Q' has a nonempty (k + 1)st maximal layer: Lk+1 = {(x, y)}.

d. Do any difficulties arise if we now allow input points to have the same x- or y-coordinate? Suggest a way to resolve such problems.

A group of n Ghostbusters is battling n ghosts. Each Ghostbuster is armed with a proton pack, which shoots a stream at a ghost, e radicating it. A stream goes in a straight line and terminates when it hits the ghost. The Ghostbusters decide upon the following strategy. They will pair off with the ghosts, forming n Ghostbuster-ghost pairs, and then simultaneously each Ghostbuster will shoot a stream at his or her chosen ghost. As we all know, it is very dangerous to let streams cross, and so the Ghostbusters must choose pairings for which no streams will cross.

Assume that the position of each Ghostbuster and each ghost is a fixed point in the plane and that no three positions are collinear.

a. Argue that there exists a line passing through one Ghostbuster and one ghost such the number of Ghostbusters on one side of the line equals the number of ghosts on the same side. Describe how to find such a line in O(n lg n) time.

b. Give an O(n2 lg n)-time algorithm to pair Ghostbusters with ghosts in such a way that no streams cross.

Points drawn uniformly from a unit-radius disk. The convex hull has (n1/3) expected size.

Points drawn uniformly from the interior of a convex polygon with k sides, for any constant k. The convex hull has (1g n) expected size.

Points drawn according to a two-dimensional normal distribution. The convex hull has expected size.

a. Given two convex polygons with n1 and n2 vertices respectively, show how to compute the convex hull of all n1 + n2 points in O(n1 + n2) time. (The polygons may overlap.)

b. Show that the convex hull of a set of n points drawn independently according to a sparse-hulled distribution can be computed in O(n) expected time. (Hint: Recursively find the convex hulls of the first n/2 points and the second n/2 points, and then combine the results.)

# Chapter notes

This chapter barely scratches the surface of computational-geometry algorithms and techniques. Books on computational geometry include those by Preparata and Shamos [160] and Edelsbrunner [60].

Although geometry has been studied since antiquity, the development of algorithms for geometric problems is relatively new. Preparata and Shamos note that the earliest notion of the complexity of a problem was given by E. Lemoine in 1902. He was studying euclidean constructions--those using a ruler and a straightedge--and devised a set of five primitives: placing one leg of the compass on a given point, placing one leg of the compass on a given line, drawing a circle, passing the ruler's edge through a given point, and drawing a line. Lemoine was interested in the number of primitives needed to effect a given construction; he called this amount the "simplicity" of the construction.

The algorithm of Section 35.2, which determines whether any segments intersect, is due to Shamos and Hoey [176].

The original version of Graham's scan is given by Graham [91]. The package-wrapping algorithm is due to Jarvis [112]. Using a decision-tree model of computation, Yao [205] proved a lower bound of (n lg n) for the running time of any convex-hull algorithm. When the number of vertices h of the convex hull is taken into account, the prune-and-search algorithm of Kirkpatrick and Seidel [120], which takes O(n lg h) time, is asymptotically optimal.

The O(n lg n)-time divide-and-conquer algorithm for finding the closest pair of points is by Shamos and appears in Preparata and Shamos [160]. Preparata and Shamos also show that the algorithm is asymptotically optimal in a decision-tree model.