Wednesday, October 12, 2011

In-memory index structures

Points
K-d tree (Point K-d tree)


Recursive subdivision of point-set into two halves using
vertical/horizontal line.
Horizontal line on even levels, vertical on odd levels
One point in each leaf
    Alternating splits on dimensions.
    Cost in two dimensions: O(√N + T) query time, O(N) storage
    Cost in higher dimensions: O(N^(1-1/d) + T) time complexity for d > 2
      Range tree
        Nested binary search trees
        Cost in two dimensions: O((logN)^2 + T) query time, O(NlogN) storage, Fractional cascading reduces query complexity to O(log N + T).
          Quadtree
            Division into 4 subspaces
            Point and region versions
            Range query complexity similar to K-d tree

            Can be used for compressing/clustering
            information (Point versus region quadtree)



            Intervals

            Used for stabbing queries on 1-d intervals
            – Root defined by median end-point
            – Left child stores all intervals that end before the median
            – Right child stores all intervals that begin after the median
            – The root stores all intervals that intersect the median; order by left end-points and by right end-points in separate lists


            Used together with Range Tree for window query:
            O(N log N) storage, O(N log N) construction time, O((logN)^2 + T) query time



            Voronoi Diagram for NN Search

            O(N) space, O(N log N) construction time, O(logN) query time


            kth order Voronoi diagram defines the partitioning based on k-closest sites. For 2 dimensions, O(k(N-k)) space, O(k2N log k + kN log N) construction time, O(k + logN) query time.

            A number of approximate algorithms.

            References:

            Computational geometry, algorithms and applications, de Berg et al, Springer.


            No comments:

            Post a Comment