The objections to grid based methods are:

- considerable computer time to interpolate a large regular grid to represent relatively few data points,
- lack of flexibility in responding to variable data densities in different parts of a map,
- non-honouring of data points when the grid is kept insufficiently fine in order to hold computer time to reasonable levels, and
- difficulty in representing fault information adequately on a continuous surface.

These have led to the intensive development in recent years of alternative methods of generating contour maps.

The two most widely known are methods based on triangulation of the data set, and the process of contour following without either grid or triangulation as a guide.

The human cartographer, given a scattered data set to contour, will visualize a set of triangles in the area in which he is working that helps him locate the contour he is tracing. These triangles have no physical existence, but they provide a structure he can use to estimate the position of the contour line relative to other data points. It would seem likely that an automated approach that did the same would have considerable benefits, particularly as it would always honour all data points, since the data points would form the vertices of the triangulation. This approach is summarised in Figure 3.1, where the data points connected by a triangulation can then be contoured using linear approximations across the planar facets of the triangles that have been created; a minimal sketch of this facet contouring is given after the list below. There used to be little interest in the automatic triangulation of seismic and other data sets because:

- it appeared impossible to generate the same triangulation, and hence the same map, from the same set of data, independent of the starting point of the triangulation process, and
- the time taken for automatic triangulation was exorbitant. Although the time taken for gridding is related to the square of the number of data points, some of the methods of automatic triangulation were at best related to the cube!
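
As an illustration of the linear facet contouring described above, the following minimal sketch (a hypothetical illustration, not the author's code) finds the straight segment along which a contour level crosses one planar triangle facet by interpolating linearly along each edge.

```python
# Linear contouring across one planar triangle facet: the contour level
# crosses an edge wherever the level lies between the two vertex values,
# and the crossing point is found by linear interpolation along that edge.

def edge_crossing(p, q, zp, zq, level):
    """(x, y) point where `level` crosses edge p-q, or None."""
    if (zp - level) * (zq - level) > 0 or zp == zq:
        return None                     # level does not cross this edge
    t = (level - zp) / (zq - zp)        # fraction of the way from p to q
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def contour_segment(tri, z, level):
    """Straight contour segment across one triangle, or None.

    tri -- three (x, y) vertex coordinates; z -- the three data values.
    """
    crossings = []
    for i in range(3):
        j = (i + 1) % 3
        c = edge_crossing(tri[i], tri[j], z[i], z[j], level)
        if c is not None:
            crossings.append(c)
    return crossings[:2] if len(crossings) >= 2 else None

# Example: the 150-contour enters and leaves this facet across two edges.
print(contour_segment([(0, 0), (4, 0), (0, 4)], [100, 200, 140], 150))
```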

Improvements in the last ten years have now produced reliable triangulation procedures that produce the 'most equilateral' (and therefore unique) set of triangles possible - the Delaunay triangulation - in a time linearly related to the number of data points, and without the need for large computer memory. Some of the multitude of alternative names for the same procedure (or its dual) found in the literature are Thiessen, Voronoi, Dirichlet, and Deltri. For any given data set it is now much faster to generate unsmoothed contours from an automatic triangulation procedure than from a grid interpolation approach. At the same time all data points are honoured, the resolution of the map varies with the data density, and maps can be joined together without error at the margins.

The problem with the triangulation technique is presently that of smooth contouring. A number of 'patch' functions exist as for grid contouring, but their calculation is more involved because the network of triangles is irregular and no calculation short cuts can be wrung from nice orthogonal axes. Smooth contouring from triangles presently takes two to five times longer than smooth contouring of a regular rectangular grid.

A major reason for the upsurge of interest in triangulation techniques has been that they are ideally suited to fault insertion. If the fault trace is entered as a set of data points, the triangulation process will include those points and will automatically relate them to the rest of the data set. The triangulation can then be 'unzipped' so that there is no direct connection in the data structure between the two sides of the fault, as sketched below. Contouring can then take place and the result will be a perfect edge to the fault, depending only on the input resolution of the fault line.
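
The 'unzipping' step can be sketched as follows (an illustrative reading, not the author's code): after triangulation, any adjacency link that properly crosses the fault trace is deleted, so no connection remains between the two sides.

```python
# Remove adjacency links that cross the fault polyline ('unzipping').

def _ccw(a, b, c):
    """Signed orientation of the triple a, b, c."""
    return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

def _crosses(p, q, a, b):
    """True if segment p-q properly crosses segment a-b."""
    return (_ccw(p, q, a) * _ccw(p, q, b) < 0 and
            _ccw(a, b, p) * _ccw(a, b, q) < 0)

def unzip(neighbours, coords, fault):
    """Delete every neighbour link crossing any segment of the fault.

    neighbours -- dict: point index -> set of neighbour indices
    coords     -- dict: point index -> (x, y)
    fault      -- list of (x, y) vertices along the fault trace
    """
    for p in list(neighbours):
        for q in list(neighbours[p]):
            for a, b in zip(fault, fault[1:]):
                if _crosses(coords[p], coords[q], a, b):
                    neighbours[p].discard(q)
                    neighbours[q].discard(p)
                    break
```

Links whose endpoints lie on the fault itself give a zero orientation product and are kept, so the fault points remain connected along each side of the break.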

Any triangulation that is to be used as the basis of isarithm map production must have the properties of stability, equilateralness, and non-intersection. It is desirable that the triangulation resulting from any data distribution should be independent of the starting point of the triangulation process inside the distribution. This is particularly important in cases where ambiguity of triangulation might be expected to occur, for instance where four near-equidistant data points could be divided into two triangles in either of two ways. From the triangulation viewpoint there would be little difference, but in terms of the contoured surface there could be dramatic changes related not to real variation but purely to the imposed triangulation. Any triangulation approach usually attempts to achieve a set of triangles that are as equilateral as possible with minimum line length. In the past this has often meant iterative processes such as those in SCA (1975) and GTN (1977) that attempted refinement of an initial triangulation. This did not produce a unique solution, and was expensive in computer time.

Brassel and Reif (1979) published a paper concerning the sub-division of a two-dimensional area into Thiessen polygons based on the location of a set of random data points. Their method is related to work by Rhynsburger (1973), Shamos and Bentley (1978), Green and Sibson (1978), Gold (1977) and Elfick (1979). The Thiessen polygon is an important concept in geographical thought as it can be used to define the region of influence of any point in a real context. It is, however, only one form of display of the final solution: instead of showing the regions surrounding points, it is also possible to connect the neighbouring points, or Thiessen neighbours, to produce the dual of the Thiessen polygons, the Delaunay triangulation. The same calculation procedure that generates one can be used to generate the other. Figure 3.2 shows the relation between the Delaunay triangles connecting a point with its neighbours and the Thiessen polygon surrounding that point. In Brassel and Reif's case the polygon network was the most important; in the present instance the triangulation is wanted.
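
The duality can be demonstrated with a modern library (scipy is an assumption here; it postdates this text but implements the same constructions): the same point set yields both the triangulation and the polygons.

```python
# One point set, two dual structures: the Delaunay triangulation and the
# Thiessen (Voronoi) polygons.
import numpy as np
from scipy.spatial import Delaunay, Voronoi

rng = np.random.default_rng(0)
pts = rng.random((50, 2))      # 50 scattered data points

tri = Delaunay(pts)            # the triangulation
vor = Voronoi(pts)             # its dual, the Thiessen polygons

print(tri.simplices.shape)     # (number of triangles, 3) vertex indices
print(vor.vertices.shape)      # coordinates of the Thiessen polygon vertices
```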

The Delaunay triangulation (Delaunay, 1934) has all the desired properties for use as a base for automatic contouring. The problem of calculating the triangulation is closely akin to that of Thiessen polygon generation, but certain modifications can be made which increase the speed of computation, help the algorithm reach linearity, and allow certain calculations to be omitted. The only major problem is that considered by Yoeli (1977) related to the representation of known topographic structure, where 'break-lines' may have to be included to maintain ridges and valleys. More recently there has been a great increase of interest in triangulation approaches to mapping, and a number of algorithms and reviews have appeared (Sabin, 1980; Peucker, 1980; Sibson, 1981; Watson, 1982).

The Brassel and Reif algorithm approaches the problem of Thiessen polygon formation by choosing an arbitrary starting point and, as necessity demands, creating a set of imaginary guaranteed neighbours outside the data area to be polygonised. Once a known neighbour has been determined by this arbitrary starting method, each neighbour of a given point can be found by rotation about that point in a clockwise direction, at the same time building up an index list of other neighbourhood relationships for use later on.

The algorithm works on a one-dimensionally sorted data list and has to check a considerable number of candidate points to find the correct next neighbour of a given data point. Figure 3.3 illustrates the principle. If point 0 is the point whose neighbours are to be discovered and 1 is a known neighbour, the line 0-1 is a base from which the next rightmost neighbour can be determined. A new rightmost neighbour has been found when a circumscribing circle passing through the base line 0-1 and the new point, 2, contains within it no other data point. The line 0-2 then becomes the new base line and point 3 can be found as the next neighbour, and so on around the rotation point, 0, until point 1 is reached again. At this time all the neighbours of point 0 have been defined and a Thiessen polygon for point 0 could be calculated; a sketch of the circumscribing-circle test is given below. In the Brassel and Reif algorithm the formation of the Thiessen polygon coincides with the discovery of the neighbours. While the neighbours for the rotation point are being discovered it is possible to update indices of the neighbours for all connected points, greatly reducing calculation effort at later stages. In a random data set about two-thirds of all calculations will have already been made by the time any given data point is investigated. It should be noted that if the Delaunay triangulation is required no polygon calculation is necessary, as it is simply the triangle set indicated by ABCD etc. in Figure 3.3.
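
The empty-circumcircle test can be sketched as follows (a hypothetical illustration; Brassel and Reif's actual implementation differs in detail): a candidate is accepted as the next neighbour of the base line only if the circle through the three points contains no other data point.

```python
# Circumscribing-circle test: a candidate forms a Delaunay triangle with
# the base line only if no other point lies inside the circumcircle.
import math

def circumcircle(a, b, c):
    """Centre and radius of the circle through points a, b, c."""
    d = 2 * (a[0]*(b[1]-c[1]) + b[0]*(c[1]-a[1]) + c[0]*(a[1]-b[1]))
    if d == 0:
        return None                      # collinear: no finite circle
    ux = ((a[0]**2+a[1]**2)*(b[1]-c[1]) + (b[0]**2+b[1]**2)*(c[1]-a[1])
          + (c[0]**2+c[1]**2)*(a[1]-b[1])) / d
    uy = ((a[0]**2+a[1]**2)*(c[0]-b[0]) + (b[0]**2+b[1]**2)*(a[0]-c[0])
          + (c[0]**2+c[1]**2)*(b[0]-a[0])) / d
    return (ux, uy), math.dist(a, (ux, uy))

def is_delaunay(base0, base1, cand, others):
    """True if no point in `others` lies inside the circumcircle."""
    cc = circumcircle(base0, base1, cand)
    if cc is None:
        return False
    centre, r = cc
    return all(math.dist(p, centre) >= r for p in others)
```

As discussed below, this centre-and-radius calculation is precisely the floating point cost that the angle test of Figure 3.7 avoids.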

Much of the Brassel and Reif algorithm is concerned with searching for the next neighbour for a given base line. In Figure 3.3 their method is to calculate the centre and radius of the circumscribing circle for some suitable starting point, using points from the sorted list to define the area of search. If a point is found inside the circle, a new centre and radius are calculated for the new circumscribing circle and the process iterates until all possible points in the sorted list have been checked. This process has two disadvantages. First, in the direction perpendicular to the sort there is no segregation, so with a singly sorted list the band of points to be considered within the radius of the present circumscribing circle can be very numerous in a large data set. It would appear therefore that a two-dimensional sort structure should be used to minimise searching time for any given point. Associated with the single-direction sort is the fact that all points in the sorted list within the radius limit bounds have to be checked. There is no way of checking the points which are nearer the base line in preference to those which are less likely to be neighbours because they lie further away.

The second problem relates to the amount of calculation involved in determining the new circumscribing circle centre and radius. Although some method must be used to determine point position in relation to a possible circumscribing circle, the centre calculation should be avoided for Delaunay triangulation: it necessarily involves considerable floating point calculation, and should be performed only when a Thiessen vertex needs to be calculated and all neighbours are known.

A major strength of their approach, however, is that it enables them to proceed in a logical spatial manner through the data set, never covering the same ground again. Once neighbours have been determined no more points will be found in that region. This makes for considerable economy in storage.

Figure 3.4 shows a map of the location of 50 data points that are to be triangulated and the resulting Delaunay triangulation of those data points. It can be seen by inspection that the points A, B, and C form a set of Thiessen neighbours and a Delaunay triangle, as the circumscribing circle through those points includes no other points in the data distribution. This is also the case for all other triangles shown in the network on the right of Figure 3.4.

In all triangulation systems there is a boundary problem that must be solved in some manner. The points lying within the data window (left box, Figure 3.4) may well not be isolated, but only part of a larger data set. If this is the case, any arbitrary triangulation around the outside of the present data area must be incorrect. As it is impossible to know what lies outside the data window, some boundary condition must be set up to act as a frame and to provide a set of boundary triangles, so that isarithms can be extrapolated outside the present apparent data area. Many possibilities exist, but one of the most efficient is to place a set of imaginary points around the outside of the area, just outside the data window. The position of these imaginary points is shown on the right of Figure 3.4, as is their Delaunay triangular relationship with the real data points. Once these imaginary points have been added to the original data set, the whole area can be triangulated starting with any pair of imaginary points as initial known neighbours. The question of how many imaginary points should be used, and their distribution, is considered later.

A suitable data structure that enables fast access to points lying in the immediate proximity of others, according to Knuth (1973), is a two-dimensional sort that can be likened to a box structure. Figure 3.5 shows a simplified data set containing some 16 points. A single sort in the X direction results in the X order shown at the bottom of the diagram, and similarly the result of a Y order is given to the right of the diagram. Only the four points labelled 1, 4, 8, and 12 have been entered in the X and Y order. If a Y order is performed first, and then multiple X orders, a box structure can be achieved: the Y order is first split into a series of Y sections.

This operation of putting data points in boxes is very fast and linear, as no overall sorting is necessary: a division of the X and Y coordinates by the box side length provides an instant reference to the box that contains the point. In Figure 3.5 boxes A and B form the first Y section and C and D the second section. Each Y section is then sorted in terms of X, and the resulting X order can be divided into sections in its turn, thus segregating A and B into separate boxes and later segregating C and D when the second Y section has been put in X order. The result is a succession of points in an ordered listing such that each row of boxes is in Y order, but the data inside each box is in X order. An index to the first point in each box can then be created, as in the right of Figure 3.5; a sketch of this construction is given below. Given the coordinate position of any point in the area it is then very fast to find the boxes that will contain its probable neighbours. In the simple 2 x 2 box structure of Figure 3.5 there is only a small advantage to be gained. The larger the data set, however, the greater the saving will be, provided the box structure becomes correspondingly more detailed.
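
A minimal sketch of the box structure (names are illustrative; coordinates are assumed already translated to the window origin) follows. Dividing each coordinate by the box side gives the box indices directly, so no comparison sort is needed for the box assignment itself.

```python
# Box structure: assign each point to a box by coordinate division, order
# the points box by box, and index the first point of each box.

def build_boxes(points, side, nx, ny):
    """Return (ordered_points, start), where start[iy][ix] is the position
    in ordered_points of the first point of box (ix, iy)."""
    boxes = [[[] for _ in range(nx)] for _ in range(ny)]
    for p in points:
        ix = min(int(p[0] // side), nx - 1)   # instant box reference
        iy = min(int(p[1] // side), ny - 1)
        boxes[iy][ix].append(p)

    ordered, start, pos = [], [[0] * nx for _ in range(ny)], 0
    for iy in range(ny):                      # rows of boxes in Y order
        for ix in range(nx):
            boxes[iy][ix].sort()              # X order inside each box
            start[iy][ix] = pos
            ordered.extend(boxes[iy][ix])
            pos += len(boxes[iy][ix])
    return ordered, start
```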

An immediate question is the number of points which should, on average, be contained within each box in order to maximise the likelihood of finding the desired Thiessen neighbour as quickly as possible. As the search procedure is local, both the rotation point and the known neighbour of Figure 3.3 will often lie in the same box. Given that the rightmost neighbour is wanted in the clockwise algorithm of Figure 3.3, that the data distribution is assumed random, and that the two base line points are probably already in the box, an average of four points per box seems reasonable. Very few redundant points will then be searched while looking for the Thiessen neighbour for any given base line.

Of course base line length will be variable depending on data distribution. In those cases where the base line is very long, the number of boxes containing on average four points that will have to be opened and searched will be quite large. They will not, however, contain as many points as if a uni-dimensional sort was employed.

Figure 3.6 shows the application of the structure to a more complex data set. Here the total data set includes about 80 points, which leads to a rectangular 20-box structure with 5 boxes along the X axis and 4 boxes down the Y axis. A single-sort algorithm would search on an outwards-inwards basis, in that the points found first and considered as neighbours are those likely to lie furthest away from the base line. On average many incorrect neighbours will be tested before the right point is found.

An inwards-outwards search involves an iterative expansion of the area being searched if no suitable points are found. If there are large empty areas in the data set there will be a large number of empty boxes. As these can be searched very quickly, speed is not significantly degraded in the empty areas. There is, however, some decrease in speed in the populated areas: although the average contents per box would have been four, the actual contents in the populated areas will be proportionately higher to compensate for the empty areas. The process of iterative expansion is shown in Figure 3.7, indicating the method of determining the Delaunay triangle and of resolving problem cases; a sketch of the expansion is given below. The algorithm always starts from a rotation point and a known neighbour which form a base line for further computation. The area of interest for the first search is defined as the series of boxes covered by the circle having the base line as its diameter. A first approximation is that the boxes to be searched are those covered by the square that encloses the circle. This ensures fast determination of the desired area, but can lead to redundant tests of points lying inside the corners of the square yet outside the search circle. In Figure 3.7 point 3 can be seen to lie within circle A and would definitely be the Thiessen neighbour required to form the Delaunay triangle with the base line. As it is quite possible that more than one point will lie within the circle, it is important to be able to distinguish immediately which point inside the search circle will be the Thiessen neighbour.
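
The inwards-outwards expansion can be sketched as follows (illustrative names; `candidates_in` is a hypothetical callback returning the points stored in one box): all boxes covered by the square enclosing the current search circle are opened, and the circle's area is grown by a fixed factor if nothing is found.

```python
# Inwards-outwards search over the box structure: open the boxes covered
# by the square enclosing the search circle, expanding the circle until a
# candidate is found or the whole area has been covered.
import math

def boxes_for_circle(cx, cy, r, side, nx, ny):
    """Indices of the boxes covered by the square enclosing the circle."""
    ix0 = max(int((cx - r) // side), 0)
    ix1 = min(int((cx + r) // side), nx - 1)
    iy0 = max(int((cy - r) // side), 0)
    iy1 = min(int((cy + r) // side), ny - 1)
    return [(ix, iy) for iy in range(iy0, iy1 + 1)
                     for ix in range(ix0, ix1 + 1)]

def expanding_search(cx, cy, r0, side, nx, ny, candidates_in):
    """Grow the search circle until some box yields a candidate point."""
    r = r0                                  # start: base line as diameter
    while True:
        found = []
        for box in boxes_for_circle(cx, cy, r, side, nx, ny):
            found.extend(candidates_in(box))
        if found:
            return found, r
        if r > side * max(nx, ny):          # whole area already covered
            return [], r
        r *= math.sqrt(2.5)                 # area grows ~2.5x per step
```

The radius factor of sqrt(2.5) corresponds to the two-to-three-times area increase per expansion recommended later in the text.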

Consider a point lying on circle A on the clockwise side of the base line. Wherever that point may fall, the angle subtended from the base line to that point will always be the same. In the special case when the base line is the diameter of the circle, that angle will be a right angle. Any point lying outside circle A will subtend an angle more acute, and any point inside an angle more obtuse, than the circle perimeter angle. Simple geometry shows that the desired neighbour is the point lying inside the circle which subtends the largest angle from the base line; a sketch of this test is given below. Point 3 will have an angle rather greater than 90°. If there were another point inside circle A as well as point 3, and its angle were larger than that subtended by point 3, then it would be the Thiessen neighbour and would form the Delaunay triangle.
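
A minimal sketch of the angle test (illustrative names, not the author's code): among candidates on the clockwise side of the base line, the Thiessen neighbour is the one subtending the largest angle over the base line, and no circumcentre arithmetic is required.

```python
# Angle test: the next Delaunay vertex is the candidate that subtends the
# largest angle over the base line.
import math

def subtended_angle(base0, base1, p):
    """Angle at p in the triangle base0-p-base1 (law of cosines)."""
    a = math.dist(p, base0)
    b = math.dist(p, base1)
    c = math.dist(base0, base1)
    # cos of the angle at p, opposite the base line; clamp for safety
    return math.acos(max(-1.0, min(1.0, (a*a + b*b - c*c) / (2*a*b))))

def best_neighbour(base0, base1, candidates):
    """Candidate subtending the largest angle from the base line."""
    return max(candidates, key=lambda p: subtended_angle(base0, base1, p))
```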

Imagine that circle A has no point inside it. The search circle area must then be increased by some factor to the size indicated by circle B. All boxes covered by the square enclosing circle B are then opened and the points inside inspected. Points 1, 2, 4 and 10 would be discovered. The only point subtending an angle larger than the perimeter angle of circle B would be point 4, and hence it would be discovered as a neighbour. The inefficiency of using the enclosing square rather than circle B as the search area is that point 10 would be investigated, as it lies within the square but outside the circle, and would then be discarded because its subtended angle is less than the circle B perimeter angle. This inefficiency can be quite useful: if no points exist inside the search circle, the new maximum search circle that need be employed is that which would include point 10 on its perimeter, rather than the much larger circle C in Figure 3.7. Thus, although point 10 would be ignored in the presence of points 1, 2 and 4, it would serve as a limit for further search if those interior points were not present.

It is possible that in some cases, particularly with regularly distributed data, more than one point may lie exactly on the circle perimeter while no points lie inside the circle. It is then necessary to employ a decision rule to decide which of the points 1 and 2 in Figure 3.7 should be chosen as the Thiessen neighbour of the rotation point and the known neighbour. The angles subtended at points 1 and 2 are the same, but in terms of a clockwise rotation around the rotation point it is clear that point 1 should be chosen in preference to point 2. The correct neighbour can be determined once the point closest to the known neighbour and the point closest to the rotation point have been identified: the chosen point is the one that is closest to the known neighbour, providing it is not also closest to the rotation point.
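
One hedged reading of this decision rule, for the two-candidate case of Figure 3.7 (the function and the fallback when both tests pick the same point are interpretations, not the author's code), might be:

```python
# Tie-break between two cocircular candidates: prefer the one closest to
# the known neighbour unless it is also closest to the rotation point.
import math

def tie_break(rotation, known, p1, p2):
    closest_to_known = p1 if math.dist(p1, known) <= math.dist(p2, known) else p2
    closest_to_rot = p1 if math.dist(p1, rotation) <= math.dist(p2, rotation) else p2
    if closest_to_known is not closest_to_rot:
        return closest_to_known
    return p2 if closest_to_known is p1 else p1   # assumed fallback
```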

If no points are found in either circle A or B, the search circle is enlarged again to size C. At the edge of the data point distribution this process could continue indefinitely, as there would be no further points to discover and no limit to the circle size that would discover any. This is one reason why the imaginary points that lie outside the data window in Figure 3.4 are essential. Any expanding circle search inside the data area will be halted at some stage by contact with an imaginary point, which will always be able to form a Delaunay triangle and act as a Thiessen neighbour. When the rotation point and the known neighbour are both imaginary points, further effort is abandoned, as the two points are already known to be neighbours. It is therefore essential to include the imaginary points in the basic box structure of the data: every perimeter box contains one imaginary data point, as sketched below. This ensures that a limit is set on all searches, no matter which boxes are opened during a search for a neighbour.
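
A sketch of the imaginary frame (the centring of each point within its box is an illustrative choice, not specified in the text): one imaginary point is placed in each perimeter box of a box grid whose outer ring lies just outside the data window.

```python
# One imaginary point per perimeter box, so that every expanding circle
# search is eventually stopped by a guaranteed neighbour.

def frame_points(xmin, ymin, side, nx, ny):
    """Imaginary points at the centres of the perimeter boxes of an
    nx-by-ny box grid anchored at (xmin, ymin)."""
    pts = []
    for iy in range(ny):
        for ix in range(nx):
            if ix in (0, nx - 1) or iy in (0, ny - 1):   # perimeter box
                pts.append((xmin + (ix + 0.5) * side,
                            ymin + (iy + 0.5) * side))
    return pts
```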

The increase in circle size at each phase of a search must ensure both that the absolute minimum of redundant searching is performed and that the circle area does not grow so rapidly that a vast number of new points are found. A reasonable area increase lies between two and three times for each circle expansion. A very small base line is usually indicative of an area of the window where data is densely packed; although absolute circle growth will then be slow, a neighbour is likely to be found very quickly. If the base line is large there is always the possibility that an overly large number of points will be searched, but as the algorithm starts with the base line as diameter, as in Figure 3.7, even then the number of points searched is unlikely to be excessive.

The triangulations resulting from many triangular mapping program systems have been critically affected by the choice of the starting point for the triangulation process. It is essential that the triangulation should be stable within the data area irrespective of starting position. This is particularly necessary to ensure that maps of consecutive areas can be overlapped successfully without join marks, so that updating and modification of maps can be performed without complete redrawing of an entire area. Figure 3.8 shows the triangulated data set of Figure 3.4 in two orientations. In A the orientation is the same as in Figure 3.4, and in B the data set has been rotated through 180 degrees. As the Delaunay network is unique, the triangulation is the same in both cases. The process of triangulation is in both cases that of an expanding wave.

As an arbitrary decision the bottom left hand corner is chosen as the starting point. The neighbours for the bottom left hand corner point are those in contact with shell 1 in Figure 3.8. Thus after the points on shell 1 are found, we have complete rotational information for the lower left corner point and partial information for the points lying on shell 1. The information available for these shell points concerns all their relationships to the previous shell: at the start, the only information available relates to their neighbours on the first shell and to the original starting point, which is on shell 0. Each point on shell 1 is then considered in turn as a rotation point and a new set of neighbours built up on shell 2. By the time shell 2 is complete, all neighbours for points on shell 1 have been found, and some of the neighbours for points on shell 2, including all those on previous shells. For both Figure 3.8A and 3.8B the process continues shell after shell until, by shell 6, all neighbours for all points have been found. Every point is visited as a rotation point only once, unless its neighbours have already been discovered from previous rotation points, in which case it is complete. An updating process keeps track of all the relationships. Investigation of the length of each shell indicates that in large data sets only a small proportion of the points exist on any given shell. As these points are the only active ones, both in terms of finding new neighbours and for the relationships between points, they are the only ones that need to be kept in memory, and the relationship lists maintained for instant reference remain quite small. The algorithm can therefore be run equally efficiently on small or large computers. In all, the number of active relationship lists that need be kept will on average be approximately 1.5 times the square root of the number of data points for a randomly distributed data set, with six items per list; a randomly distributed set of 10,000 points, for example, would need only about 150 active lists.
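
The shell bookkeeping can be illustrated (under the simplifying assumption that the neighbour relationships have already been found, whereas in the algorithm they are discovered shell by shell) as a breadth-first grouping of points by distance from the starting corner:

```python
# Group point indices into shells by graph distance from a starting point.
# Only the current frontier (the active shell) need be held in memory.

def shells(neighbours, start):
    """neighbours -- dict mapping each point index to its Thiessen
    neighbours; returns a list of shells, each a list of point indices."""
    seen, frontier, result = {start}, [start], []
    while frontier:
        result.append(frontier)
        nxt = []
        for p in frontier:
            for q in neighbours[p]:
                if q not in seen:
                    seen.add(q)
                    nxt.append(q)
        frontier = nxt                 # the wave expands one shell outward
    return result
```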

Although the same starting location was used for both A and B in Figure 3.8, the shell formation is different as it adjusts to the different data distribution encountered. The final pattern of the triangulation, however, is the same.

In Laser-Scan DTM software a rather different order of calculation is used: the first shell includes all the imaginary points, so the shells are circular. This uses rather more memory, as the live shell contains at maximum approximately 4 times the square root of the number of points.