Home :: Tutorials :: Simplifying a Polyline

Simplifying a Polyline - The Algorithm

To get a better idea of the problem we are solving, let's define the requirements to be satisfied by our polyline simplification algorithm. These should be fairly simple.

  1. The algorithm must accept a list of user-specified points that define a polyline.
  2. The algorithm must return a list of points that define a polyline that visually approximates the original polyline.
  3. The algorithm must accept a user-specified tolerance value that determines the quality of the visual approximation of the original polyline.
  4. The algorithm must minimize the length of the resulting list of points.

So, we want an algorithm that takes a polyline and simplifies it as much as possible while retaining a given faithfulness to the appearance of the original curve.

It is quite difficult to draw using the mouse so it reasonable to assume that the drawn curve is rough in nature. Given the roughness of the curve, we have an opportunity to alter it without negatively impacting the curve's appearance. Tolerance allows us to quantify the degree to which we can alter the curve while preserving the appearance of the original curve.

Just to refresh our memories, our algorithm is trying to solve the case when we draw straight lines which end up with extra points in the line. The requirements do not state how a tolerance value works other than it effects the quality of the approximation of the original curve. We need to define the effect of a tolerance value on the approximation and we must define what are valid tolerance values.

If I have three colinear points (all on the same line), then one of those points can be removed without effecting the appearance of the line. If one of the points is slightly off of the line defined by the other two points, then we may be able to remove that point without drastically impacting the final appearance as compared to the original curve. The further away the point is from the line, the worse the appearance will be as compared to the original curve. We can control the quality of the approximation by adjusting the distance a point may stray from a line before it is no longer considered to be on the line. This turns out to be a reasonable measure for tolerance.

When the mid-point of the line is on the curve, it is 0 units from the line. As the point moves away from the curve, this value increases. Therefore, we define tolerance to be a positive number of units a point may stray from a line before it is considered no longer a part of the line. As this tolerance number increases, the quality of the approximation will decrease along with the number of points in the approximated polyline. If we need an exact visual duplicate of the original curve, then we use a tolerance of zero. If we are more concerned about reducing the number of points in our curve, then we can increase the tolerance to reduce the point count while, at the same time, losing quality as compared to the original curve.

From this discussion of the tolerance value, we now know enough to describe the algorithm.

define lastPoint to be the first point in the original point list
loop through the remaining list of points, assigning each to currentPoint
    form a line from lastPoint to currentPoint
    test all points between, but not including lastPoint and currentPoint
        if the test point's distance from the line is more than the tolerance then
            add the point before currentPoint to the result array
            set lastPoint equal to the added point
            proceed to the next point in the top-most loop
        end if
    end test
end loop
add the last point in the user-specified point list to the result array