Thursday, June 4, 2015

A New Variant of Douglas–Peucker Algorithm (2/4) [Original Algo. MATLAB Implementation]

Previous Posts  

Original Douglas-Peucker Algorithm

The following algorithm description is taken from Wikipdeia [1] and CodeProject article by [2]. The Ramer–Douglas–Peucker algorithm (RDP) is an algorithm for reducing the number of points in a curve that is approximated by a series of points. This algorithm is also known under the names Douglas–Peucker algorithm, iterative end-point fit algorithm and split-and-merge algorithm.

The Douglas-Peucker algorithm uses a point-to-edge distance tolerance. The algorithm starts with a crude simplification that is the single edge joining the first and last vertices of the original polyline. It then computes the distance of all intermediate vertices to that edge. The vertex that is furthest away from that edge, and that has a computed distance that is larger than a specified tolerance, will be marked as a key and added to the simplification. This process will recurse for each edge in the current simplification, until all vertices of the original polyline are within tolerance of the simplification results. This process is illustrated below:

Initially, the simplification consists of a single edge. During the first step, the fourth vertex is marked as a key and the simplification is adjusted accordingly. During the second step, the first edge of the current simplification is processed. The maximum vertex distance to that edge falls below the tolerance threshold, and no new key is added. During the third step, a key is found for the second edge of the current simplification. This edge is split at the key and the simplification is updated. This process continues until no more keys can be found. Note that at each step, only one edge of the current simplification is processed.
This algorithm has a worst case running time of O(nm), and O(n log m) on average, where m is the size of the simplified polyline. As such, this is an output dependent algorithm, and will be very fast when m is small. To make it even faster, the Distance between points routine is applied as a pre-processing step.

MATLAB Implementation

function out_points = points_simplify_2d(points, epsilon)
  id_end = size(points,1);
  id_start = 1;

  % Indeces of points are not removed by Douglas-Peucker algorithm
  remaining_ids = true(id_end,1);

  % Call douglas_peucker_iteration (recursive function)
  [points remaining_ids] = douglas_peucker_iteration( ...
     points,epsilon,id_start,id_end, remaining_ids);
  out_points = points(remaining_ids,:);

function [points remaining_ids]  = douglas_peucker_iteration(points, epsilon, id_start, id_end, remaining_ids)
  % For the points (start + 1) to (end) -> next_points_relative is
  % relative coordinates from start point
  next_points_relative = bsxfun(@minus, points(id_start+1:id_end,:), 
  end_point_relative = next_points_relative(end,:)';

  % Efficient Method to get distances_from_start_end_line
  beta = (end_point_relative' *
  b = next_points_relative-bsxfun(@times,beta,end_point_relative)';
  distances_from_start_end_line = hypot(b(:,1),b(:,2));

  % Identify maximum distance and get its index
  [dmax dmax_id] = max(distances_from_start_end_line);
  dmax_id  = id_start + dmax_id; %ID of the edge point

  if dmax <= epsilon;
    remaining_ids(id_start+1:id_end-1) = false; 
    [points remaining_ids] = douglas_peucker_iteration( ...
       points,epsilon,id_start,dmax_id, remaining_ids);
    [points remaining_ids] = douglas_peucker_iteration( ...
       points,epsilon,dmax_id,id_end, remaining_ids);

[2] Polyline Simplification

No comments: