程序代写代做代考 Excel data structure algorithm hp0
hp0
HIERARCHICAL REPRESENTATIONS OF
POINT DATA
Hanan Samet
Computer Science Department and
Institute for Advanced Computer Studies and
Center for Automation Research
University of Maryland
College Park, MD 20742 USA
e-mail: hjs@umiacs.umd.edu
Copyright © 1998 Hanan Samet
These notes may not be reproduced by any means (mechanical or
electronic or any other) without the express written permission of Hanan
Samet
hp1
PRELIMINARIES
• File ≡ collection of records (N)
• Each record contains several attributes or keys (k)
Queries:
1. Point query
2. Range query (includes partial match)
3. Boolean query ≡ combine 1 and 2 with AND, OR, NOT
Search methods
1. Organize data to be stored
• boundaries of regions in the search space are
determined by the data
• e.g., binary search tree
2. Organize the embedding space from which the data is
drawn
• region boundaries in the search space are fixed
• e.g., address computation methods such as digital
searching
3. Hybrid
• use 1 for some attributes and 2 for others
Extreme solution:
• Bitmap representation where one bit is reserved for
every possible record in the multidimensional point
space whether or not it is present
• Problems:
1. large number of attributes
2. continuous data (non-discrete)
Copyright © 1998 by Hanan Samet
hp2
SIMPLE NON-HIERARCHICAL DATA STRUCTURES
1. Sequential list
NAME X Y
Chicago 35 42
Mobile 52 10
Toronto 62 77
Buffalo 82 65
Denver 5 45
Omaha 27 35
Atlanta 85 15
Miami 90 5
2. Inverted List
X Y
Denver Miami
Omaha Mobile
Chicago Atlanta
Mobile Omaha
Toronto Chicago
Buffalo Denver
Atlanta Buffalo
Miami Toronto
• 2 sorted lists
• Data is pointers
• Enables pruning the search with respect to one key
Copyright © 1998 by Hanan Samet
hp3
GRID METHOD
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
• Divide space into squares of width equal to the search
region
• Each cell contains a list of all points within it
• Assume L∞ distance metric (i.e., chessboard)
• Assume C = uniform distribution of points per cell
• Average search time for k-dimensional space is O(F•2k)
F = number of records found = C since query region has
the width of a cell
2k = number of cells examined
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp4
(82,65)
Buffalo
Buffalo
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp4
(82,65)
Buffalo
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp4
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp4
(82,65)
Buffalo
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp4
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp4
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp4
(82,65)
Buffalo
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp4
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp4
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
7
v
hp4
(85,15)
Atlanta
Atlanta
Copyright © 1998 by Hanan Samet
POINT QUADTREE (Finkel/Bentley)
1 hp4
b
• Marriage between a uniform grid and a binary search tree
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp4
(52,10)
Mobile
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp4
(62,77)
Toronto
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp4
(82,65)
Buffalo
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp4
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp4
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
7
v
hp4
(85,15)
Atlanta
Atlanta
Copyright © 1998 by Hanan Samet
8
z
hp4
(90,5)
Miami
Miami
Copyright © 1998 by Hanan Samet
hp5
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp5
1. B is the new root
2
r
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp5
2. D is the new root
2
r
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp5
3. G is the new root
2
r
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp52
r
4. I is the new root
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp53
z
• Optimal solution is to use H as the new root since the
shaded region is empty
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
hp53
z
• Optimal solution is to use H as the new root since the
shaded region is empty
4
g
• Problems:
1. must search for H
2. a node such as H may possibly not exist as is the
case if node J is present
J
PROBLEM OF DELETION IN POINT QUADTREES
1
b
A
I
G
B
C
H
E
D
F
• Delete node A
• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
4
v
hp6
• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of
its bordering axes than any other candidate node
which is on the same side of those axes
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
4
v
hp6
• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of
its bordering axes than any other candidate node
which is on the same side of those axes
Copyright © 1998 by Hanan Samet
5
g
hp6
A
B
E
D
C
condition does not always hold
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
4
v
hp6
• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of
its bordering axes than any other candidate node
which is on the same side of those axes
Copyright © 1998 by Hanan Samet
5
g
hp6
A
B
E
D
C
condition does not always hold
Copyright © 1998 by Hanan Samet
6
z
hp6
A
B
ED
C
more than one node satisfies it
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
4
v
hp6
• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of
its bordering axes than any other candidate node
which is on the same side of those axes
Copyright © 1998 by Hanan Samet
5
g
hp6
A
B
E
D
C
condition does not always hold
Copyright © 1998 by Hanan Samet
6
z
hp6
A
B
ED
C
more than one node satisfies it
Copyright © 1998 by Hanan Samet
7
r
hp6
Lx
Ly
dx
dy
2. Use the L1 metric to break ties or deadlocks
• L1 metric is the sum of the displacements from the
bordering x and y axes
• rationale: area of shaded region is Lx•dy+Ly•dx-dx•dy
which can be approximated by 2dx•(Lx+Ly) assuming
dx=dy and dx being very small
Copyright © 1998 by Hanan Samet
MECHANICS OF DELETION IN POINT QUADTREES
• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region
hp61
b
B
A
Copyright © 1998 by Hanan Samet
2
z
hp6
A
B
C
D A
D
C
B
• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree
Copyright © 1998 by Hanan Samet
3
r
hp6
D is the closest node
Copyright © 1998 by Hanan Samet
4
v
hp6
• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of
its bordering axes than any other candidate node
which is on the same side of those axes
Copyright © 1998 by Hanan Samet
5
g
hp6
A
B
E
D
C
condition does not always hold
Copyright © 1998 by Hanan Samet
6
z
hp6
A
B
ED
C
more than one node satisfies it
Copyright © 1998 by Hanan Samet
7
r
hp6
Lx
Ly
dx
dy
2. Use the L1 metric to break ties or deadlocks
• L1 metric is the sum of the displacements from the
bordering x and y axes
• rationale: area of shaded region is Lx•dy+Ly•dx-dx•dy
which can be approximated by 2dx•(Lx+Ly) assuming
dx=dy and dx being very small
Copyright © 1998 by Hanan Samet
8
v
hp6
• at most one of the remaining candidates will be in
the shaded region
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
2
z
hp7
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
2
z
hp7
Copyright © 1998 by Hanan Samet
3
v
hp7
1. no reinsertion in opposite quadrant (SW)
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
2
z
hp7
Copyright © 1998 by Hanan Samet
3
v
hp7
1. no reinsertion in opposite quadrant (SW)
Copyright © 1998 by Hanan Samet
4
r
hp7
2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
2
z
hp7
Copyright © 1998 by Hanan Samet
3
v
hp7
1. no reinsertion in opposite quadrant (SW)
Copyright © 1998 by Hanan Samet
4
r
hp7
2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant
Copyright © 1998 by Hanan Samet
5
g
hp7
3. NEWROOT: apply to quadrant containing replacement
node (NE)
• same subquadrant (NE): no reinsertion
• adjacent subquadrants (NW,SE): ADJ
• opposite subquadrant (SW): NEWROOT
Copyright © 1998 by Hanan Samet
ALGORITHM FOR DELETION IN POINT QUADTREES
1 hp7
b
A
N
M
L
JK D
B E
C
H
F
G
I
• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant
Copyright © 1998 by Hanan Samet
2
z
hp7
Copyright © 1998 by Hanan Samet
3
v
hp7
1. no reinsertion in opposite quadrant (SW)
Copyright © 1998 by Hanan Samet
4
r
hp7
2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant
Copyright © 1998 by Hanan Samet
5
g
hp7
3. NEWROOT: apply to quadrant containing replacement
node (NE)
• same subquadrant (NE): no reinsertion
• adjacent subquadrants (NW,SE): ADJ
• opposite subquadrant (SW): NEWROOT
Copyright © 1998 by Hanan Samet
6
b
hp7
• Comparison with conventional reinsertion algorithm
1. 5/6 reduction in number of nodes requiring
reinsertion (2/3 if pick a candidate at random)
2. number of comparisons during reinsertion was
observed to be ~log4n vs. a much larger factor
3. (average total path length)/(optimal total path length)
was observed to be constant vs. an increase
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp85
v
(6,5)
Buffalo
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp85
v
(6,5)
Buffalo
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp86
z
(0,3)
Denver
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp85
v
(6,5)
Buffalo
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp86
z
(0,3)
Denver
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp87
g
(2,2)
Omaha
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp85
v
(6,5)
Buffalo
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp86
z
(0,3)
Denver
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp87
g
(2,2)
Omaha
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp88
r
(6,1)
Atlanta
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)
• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower
left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5
1
b
(0,8) (8,8)
(8,0)(0,0)
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp82
r
(2,3)
Chicago
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp83
z
(4,0)
Mobile
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp84
g
(4,6)
Toronto
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp85
v
(6,5)
Buffalo
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp86
z
(0,3)
Denver
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp87
g
(2,2)
Omaha
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp88
r
(6,1)
Atlanta
Copyright © 1998 by Hanan Samet
hp8hp8hp8hp8hp8hp8hp8hp8hp8
(7,0)
Miami
9
v
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp9
(82,65)
Buffalo
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp9
(82,65)
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp9
(5,45)
Denver
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp9
(82,65)
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp9
(5,45)
Denver
Copyright © 1998 by Hanan Samet
6
g
hp9
(27,35)
Omaha
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp9
(82,65)
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp9
(5,45)
Denver
Copyright © 1998 by Hanan Samet
6
g
hp9
(27,35)
Omaha
Copyright © 1998 by Hanan Samet
7
z
hp9
(85,15)
Atlanta
Copyright © 1998 by Hanan Samet
PR QUADTREE (Orenstein)
1 hp9
b
Regular decomposition point representation
Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Maximum level of decomposition depends on the
minimum separation between two points
• if two points are very close, then decomposition can be
very deep
• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points
1.
2.
3.
4.
Ex: c = 1
Copyright © 1998 by Hanan Samet
2
r
hp9
(52,10)
Mobile
Copyright © 1998 by Hanan Samet
3
z
hp9
(62,77)
Toronto
Copyright © 1998 by Hanan Samet
4
g
hp9
(82,65)
Buffalo
Copyright © 1998 by Hanan Samet
5
v
hp9
(5,45)
Denver
Copyright © 1998 by Hanan Samet
6
g
hp9
(27,35)
Omaha
Copyright © 1998 by Hanan Samet
7
z
hp9
(85,15)
Atlanta
Copyright © 1998 by Hanan Samet
8
r
hp9
(90,5)
Miami
Copyright © 1998 by Hanan Samet
REGION SEARCH
1 hp10
b
Use of quadtree results in pruning the search space
Ex: Find all points within radius r of point A
A
r
Copyright © 1998 by Hanan Samet
REGION SEARCH
1 hp10
b
Use of quadtree results in pruning the search space
Ex: Find all points within radius r of point A
A
r
Copyright © 1998 by Hanan Samet
hp10
If a quadrant subdivision point p lies in a region l, then
search the quadrants of p specified by l
1. SE 6. NE 11. All but SW
2. SE, SW 7. NE, NW 12. All but SE
3. SW 8. NW 13. All
4. SE, NE 9. All but NW
5. SW, NW 10. All but NE
1 2 3
9 10
13
1211
4
5
876
2
r
Copyright © 1998 by Hanan Samet
REGION SEARCH
1 hp10
b
Use of quadtree results in pruning the search space
Ex: Find all points within radius r of point A
A
r
Copyright © 1998 by Hanan Samet
hp10
If a quadrant subdivision point p lies in a region l, then
search the quadrants of p specified by l
1. SE 6. NE 11. All but SW
2. SE, SW 7. NE, NW 12. All but SE
3. SW 8. NW 13. All
4. SE, NE 9. All but NW
5. SW, NW 10. All but NE
1 2 3
9 10
13
1211
4
5
876
2
r
Copyright © 1998 by Hanan Samet
hp103
z
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp113
z
2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp113
z
2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3
Copyright © 1998 by Hanan Samet
hp114
g
3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp113
z
2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3
Copyright © 1998 by Hanan Samet
hp114
g
3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A
Copyright © 1998 by Hanan Samet
hp115
v
4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp113
z
2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3
Copyright © 1998 by Hanan Samet
hp114
g
3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A
Copyright © 1998 by Hanan Samet
hp115
v
4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A
Copyright © 1998 by Hanan Samet
hp116
z
5. examine block 11 as the distance from P to the southern
border of 1 is shorter than the distance from P to A;
however, reject F as it is further from P than A
Copyright © 1998 by Hanan Samet
hp11
FINDING THE NEAREST OBJECT
• Ex: find the nearest object to P
1
b
P
12 8 7 6
13 9 1 4 5
2 3
10 11
D
E C
F
A
B
• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:
Copyright © 1998 by Hanan Samet
hp112
r
1. start at block 2 and compute distance to P from A
Copyright © 1998 by Hanan Samet
hp113
z
2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3
Copyright © 1998 by Hanan Samet
hp114
g
3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A
Copyright © 1998 by Hanan Samet
hp115
v
4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A
Copyright © 1998 by Hanan Samet
hp116
z
5. examine block 11 as the distance from P to the southern
border of 1 is shorter than the distance from P to A;
however, reject F as it is further from P than A
Copyright © 1998 by Hanan Samet
hp117
r
• If F was moved, a better order would have started with
block 11, the southern neighbor of 1, as it is closest
new F
Copyright © 1998 by Hanan Samet
hp12
COMPARISON OF POINT, MX, AND PR QUADTREES
FEATURE MX PR POINT
Regular
decomposition
Yes Yes No
Type of nodes Data Stored in
leaf nodes and
non-leaf nodes
are for control
Data stored in leaf
nodes and non-
leaf nodes are for
control
Data stored in
leaf nodes and
non-leaf nodes
Shape of tree
depends on order
of inserting nodes
No No Yes
Deletion Simple but may
have to collapse
WHITE nodes
Simple but may
have to collapse
WHITE nodes
Complex
Size of space
represented
Finite Finite Unbounded
Type of data
represented
Discrete Continuous Continuous
Shape of space
represented
Square Rectangle Unbounded
Stores
coordinates
No Yes Yes
Depth of tree (d);
assume M points
All nodes are at
the same depth,
n for a 2n by 2n
region
For square region
with side length L
and minimum
separation S
between two
points,
log4(M-1) ≤ d
and
d≤log2((L/S)2.5)
log4(3M) ≤ d
and d ≤ M-1
Copyright © 1998 by Hanan Samet
APPLICATION OF THE MX QUADTREE (Hunter)
• Represent the boundary as a sequence of BLACK
pixels in a region quadtree
• Useful for a simple digitized polygon (i.e., non-
intersecting edges)
• Three types of nodes
1. interior – treat like WHITE nodes
2. exterior – treat like WHITE nodes
3. boundary – the edge of the polygon passes
through them and treated like BLACK nodes
• Disadvantages
1. a thickness is associated with the line segments
2. no more than 4 lines can meet at a point
hp131
b
Copyright © 1998 by Hanan Samet
APPLICATION OF THE MX QUADTREE (Hunter)
• Represent the boundary as a sequence of BLACK
pixels in a region quadtree
• Useful for a simple digitized polygon (i.e., non-
intersecting edges)
• Three types of nodes
1. interior – treat like WHITE nodes
2. exterior – treat like WHITE nodes
3. boundary – the edge of the polygon passes
through them and treated like BLACK nodes
• Disadvantages
1. a thickness is associated with the line segments
2. no more than 4 lines can meet at a point
hp131
b
Copyright © 1998 by Hanan Samet
hp132
r
Copyright © 1998 by Hanan Samet
hp14
MX-CIF QUADTREE (Kedem)
1
b
Collections of small rectangles for VLSI applications
Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets
1.
2.
3.
1
2
3
4
5
6
9
87
10 11
12
D {11}
F {12}E {3,4,5}
B {1}
A {2,6,7,8,9,10}
C {}
A
B
C
E
D
F
Copyright © 1998 by Hanan Samet
hp14
MX-CIF QUADTREE (Kedem)
1
b
Collections of small rectangles for VLSI applications
Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets
1.
2.
3.
1
2
3
4
5
6
9
87
10 11
12
D {11}
F {12}E {3,4,5}
B {1}
A {2,6,7,8,9,10}
C {}
A
B
C
E
D
F
Copyright © 1998 by Hanan Samet
hp142
r
Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point
4.
Copyright © 1998 by Hanan Samet
hp14
MX-CIF QUADTREE (Kedem)
1
b
Collections of small rectangles for VLSI applications
Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets
1.
2.
3.
1
2
3
4
5
6
9
87
10 11
12
D {11}
F {12}E {3,4,5}
B {1}
A {2,6,7,8,9,10}
C {}
A
B
C
E
D
F
Copyright © 1998 by Hanan Samet
hp142
r
Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point
4.
Copyright © 1998 by Hanan Samet
hp14
one for y-axis
Binary tree for y-
axis through A
Y1
Y2
10
Y4
2
Y5
Y3
6
Y7
8
Y6
3
g
Copyright © 1998 by Hanan Samet
hp14
MX-CIF QUADTREE (Kedem)
1
b
Collections of small rectangles for VLSI applications
Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets
1.
2.
3.
1
2
3
4
5
6
9
87
10 11
12
D {11}
F {12}E {3,4,5}
B {1}
A {2,6,7,8,9,10}
C {}
A
B
C
E
D
F
Copyright © 1998 by Hanan Samet
hp142
r
Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point
4.
Copyright © 1998 by Hanan Samet
hp14
one for y-axis
Binary tree for y-
axis through A
Y1
Y2
10
Y4
2
Y5
Y3
6
Y7
8
Y6
3
g
Copyright © 1998 by Hanan Samet
hp14
if a rectangle intersects both x and y axes, then
associate it with the y axis
4
v
Copyright © 1998 by Hanan Samet
hp14
MX-CIF QUADTREE (Kedem)
1
b
Collections of small rectangles for VLSI applications
Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets
1.
2.
3.
1
2
3
4
5
6
9
87
10 11
12
D {11}
F {12}E {3,4,5}
B {1}
A {2,6,7,8,9,10}
C {}
A
B
C
E
D
F
Copyright © 1998 by Hanan Samet
hp142
r
Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point
4.
Copyright © 1998 by Hanan Samet
hp14
one for y-axis
Binary tree for y-
axis through A
Y1
Y2
10
Y4
2
Y5
Y3
6
Y7
8
Y6
3
g
Copyright © 1998 by Hanan Samet
hp14
if a rectangle intersects both x and y axes, then
associate it with the y axis
4
v
Copyright © 1998 by Hanan Samet
hp145
z
one for x-axis
Binary tree for x-
axis through A
X1
X3
9
X5
7
X4
X2
X6
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
4
g
hp15
(82,65)
Buffalo
Buffalo
x test
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
4
g
hp15
(82,65)
Buffalo
Buffalo
x test
Copyright © 1998 by Hanan Samet
5
v
hp15
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
4
g
hp15
(82,65)
Buffalo
Buffalo
x test
Copyright © 1998 by Hanan Samet
5
v
hp15
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp15
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
4
g
hp15
(82,65)
Buffalo
Buffalo
x test
Copyright © 1998 by Hanan Samet
5
v
hp15
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp15
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
7
r
hp15
(85,15)
Atlanta
Atlanta
y test
Copyright © 1998 by Hanan Samet
K-D TREE (Bentley)
• Test one attribute at a time instead of all simultaneously
as in the point quadtree
• Usually cycle through all the attributes
• Shape of the tree depends on the order in which the
data is encountered
1 hp15
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
2
r
hp15
(52,10)
Mobile
Mobile
x test
Copyright © 1998 by Hanan Samet
3
z
hp15
(62,77)
Toronto
Toronto
y test
Copyright © 1998 by Hanan Samet
4
g
hp15
(82,65)
Buffalo
Buffalo
x test
Copyright © 1998 by Hanan Samet
5
v
hp15
(5,45)
Denver
Denver
Copyright © 1998 by Hanan Samet
6
g
hp15
(27,35)
Omaha
Omaha
Copyright © 1998 by Hanan Samet
7
r
hp15
(85,15)
Atlanta
Atlanta
y test
Copyright © 1998 by Hanan Samet
8
v
hp15
(90,5)
Miami
Miami
Copyright © 1998 by Hanan Samet
hp16
K-D TREE DELETION
• Similar to deletion in binary search trees
• Assume branch to HISON if test value is ≥ root value
• Assume root discriminates on the x coordinate and
subsequent alternation with the y coordinate value
• Algorithm:
1. replace root by node in HISON with the minimum x
coordinate value
2. repeat the process for the position of the replacement
node using the x or y coordinate values depending on
whether the replacement node is an x or a y
discriminator
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
3
z
hp17
min y
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
3
z
hp17
min y
Copyright © 1998 by Hanan Samet
4
g
hp17
min y
G (25,30)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
I (27,35)
H (29,40)
J (28,45)
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
3
z
hp17
min y
Copyright © 1998 by Hanan Samet
4
g
hp17
min y
G (25,30)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
I (27,35)
H (29,40)
J (28,45)
Copyright © 1998 by Hanan Samet
5
v
hp17
• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
3
z
hp17
min y
Copyright © 1998 by Hanan Samet
4
g
hp17
min y
G (25,30)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
I (27,35)
H (29,40)
J (28,45)
Copyright © 1998 by Hanan Samet
5
v
hp17
• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)
Copyright © 1998 by Hanan Samet
6
r
hp17
• What if the HISON of the root is empty?
A
T
Copyright © 1998 by Hanan Samet
EXAMPLE OF K-D TREE DELETION
Ex:
hp171
b
J (28,45)
A (20,20)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
G (25,30)
H (29,40)
I (27,35)
x
y
x
y
x
y
x
Copyright © 1998 by Hanan Samet
2
r
hp17
min x
Copyright © 1998 by Hanan Samet
3
z
hp17
min y
Copyright © 1998 by Hanan Samet
4
g
hp17
min y
G (25,30)
B (10,30) D (40,50)
C (10,20) E (30,40) F (35,60)
I (27,35)
H (29,40)
J (28,45)
Copyright © 1998 by Hanan Samet
5
v
hp17
• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)
Copyright © 1998 by Hanan Samet
6
r
hp17
• What if the HISON of the root is empty?
A
T
Copyright © 1998 by Hanan Samet
7
z
hp17
1. replace root node A with the node B in LOSON having
the minimum x coordinate value and set the HISON
pointer of A to be the old LOSON pointer of A
2. repeat process for the position of the replacement node
B
T’
B
B is the node with
the minimum x
coordinate value
and replaces A
Copyright © 1998 by Hanan Samet
K-D TREE SEARCH
• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined
hp181
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
K-D TREE SEARCH
• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined
hp181
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
r
hp18
• Ex: find all points within 10 of (20,30)
(20,30)
Copyright © 1998 by Hanan Samet
K-D TREE SEARCH
• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined
hp181
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
r
hp18
• Ex: find all points within 10 of (20,30)
(20,30)
Copyright © 1998 by Hanan Samet
3
z
hp18
1. search the region entirely to the left of Chicago (x=35)
Copyright © 1998 by Hanan Samet
K-D TREE SEARCH
• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined
hp181
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
r
hp18
• Ex: find all points within 10 of (20,30)
(20,30)
Copyright © 1998 by Hanan Samet
3
z
hp18
1. search the region entirely to the left of Chicago (x=35)
Copyright © 1998 by Hanan Samet
4
g
hp18
2. search the region entirely below Denver (y=45)
Copyright © 1998 by Hanan Samet
K-D TREE SEARCH
• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined
hp181
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto (82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
r
hp18
• Ex: find all points within 10 of (20,30)
(20,30)
Copyright © 1998 by Hanan Samet
3
z
hp18
1. search the region entirely to the left of Chicago (x=35)
Copyright © 1998 by Hanan Samet
4
g
hp18
2. search the region entirely below Denver (y=45)
Copyright © 1998 by Hanan Samet
5
v
hp18
3. search yields Omaha
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
5
v
hp19
(82,65)
Buffalo
BuffaloToronto
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
5
v
hp19
(82,65)
Buffalo
BuffaloToronto
Copyright © 1998 by Hanan Samet
6
z
hp19
(5,45)
Denver
Chicago
Denver
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
5
v
hp19
(82,65)
Buffalo
BuffaloToronto
Copyright © 1998 by Hanan Samet
6
z
hp19
(5,45)
Denver
Chicago
Denver
Copyright © 1998 by Hanan Samet
7
g
hp19
(27,35)
Omaha
ChicagoOmaha
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
5
v
hp19
(82,65)
Buffalo
BuffaloToronto
Copyright © 1998 by Hanan Samet
6
z
hp19
(5,45)
Denver
Chicago
Denver
Copyright © 1998 by Hanan Samet
7
g
hp19
(27,35)
Omaha
ChicagoOmaha
Copyright © 1998 by Hanan Samet
8
r
hp19
(85,15)
Atlanta
Mobile
Atlanta
Copyright © 1998 by Hanan Samet
PR K-D TREE (Knowlton)
• A region contains at most one data point
• Analogous to EXCELL with bucket size of 1
1 hp19
b
(0,100) (100,100)
(100,0)(0,0)
Copyright © 1998 by Hanan Samet
2
r
hp19
(35,42)
Chicago
Chicago
Copyright © 1998 by Hanan Samet
3
z
hp19
(52,10)
Mobile
MobileChicago
Copyright © 1998 by Hanan Samet
4
g
hp19
(62,77)
Toronto
Mobile Toronto
Copyright © 1998 by Hanan Samet
5
v
hp19
(82,65)
Buffalo
BuffaloToronto
Copyright © 1998 by Hanan Samet
6
z
hp19
(5,45)
Denver
Chicago
Denver
Copyright © 1998 by Hanan Samet
7
g
hp19
(27,35)
Omaha
ChicagoOmaha
Copyright © 1998 by Hanan Samet
8
r
hp19
(85,15)
Atlanta
Mobile
Atlanta
Copyright © 1998 by Hanan Samet
(90,5)
Miami
9
v
hp19
MiamiAtlanta
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
4
z
hp20
16,X
OmahaDenver
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
4
z
hp20
16,X
OmahaDenver
Copyright © 1998 by Hanan Samet
5
g
hp20
26,Y
ChicagoMobile
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
4
z
hp20
16,X
OmahaDenver
Copyright © 1998 by Hanan Samet
5
g
hp20
26,Y
ChicagoMobile
Copyright © 1998 by Hanan Samet
6
g
hp20
40,Y
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
4
z
hp20
16,X
OmahaDenver
Copyright © 1998 by Hanan Samet
5
g
hp20
26,Y
ChicagoMobile
Copyright © 1998 by Hanan Samet
6
g
hp20
40,Y
Copyright © 1998 by Hanan Samet
AtlantaMiami
7
v
hp20
10,Y10,Y
Copyright © 1998 by Hanan Samet
ADAPTIVE K-D TREE
• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum
hp201
b
(0,100) (100,100)
(100,0)(0,0)
(5,45)
Denver (35,42)
Chicago
(27,35)
Omaha
(52,10)
Mobile
(62,77)
Toronto
(82,65)
Buffalo
(85,15)
Atlanta
(90,5)
Miami
Copyright © 1998 by Hanan Samet
2
v
hp20
57,X
Copyright © 1998 by Hanan Samet
3
r
hp20
31,X
Copyright © 1998 by Hanan Samet
4
z
hp20
16,X
OmahaDenver
Copyright © 1998 by Hanan Samet
5
g
hp20
26,Y
ChicagoMobile
Copyright © 1998 by Hanan Samet
6
g
hp20
40,Y
Copyright © 1998 by Hanan Samet
AtlantaMiami
7
v
hp20
10,Y10,Y
Copyright © 1998 by Hanan Samet
8
r
hp20
72,X
BuffaloToronto
Copyright © 1998 by Hanan Samet
hp21
GRID FILE (Nievergelt,Hinterberger,Sevcik)
• Two level grid for storing points
• Uses a grid directory (a 2d array of grid blocks) on disk
that contains the address of the bucket (i.e., page) that
contains the data associated with the grid block
• Linear scales (a pair of 1d arrays) in core that access the
grid block in the grid directory (on disk) that is associated
with a particular point thereby enabling the
decomposition of the space to be arbitrary
• Guarantees access to any record with two disk
operations — one for each level of the grid
1. access the grid block
2. access the bucket
• Each bucket has finite capacity
• Partition upon overflow
1. bucket partition — overflowing bucket is associated
with more than one grid block
2. grid partition — overflowing bucket is associated with
just one grid block
• Splitting policies
1. split at midpoint and cycle through attributes
2. adaptive
• increases granularity of frequently queried attributes
• favors some attributes over others
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
3
r
hp22
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
C
y=70
A
70
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
3
r
hp22
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
C
y=70
A
70
Copyright © 1998 by Hanan Samet
4
v
hp22
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
3
r
hp22
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
C
y=70
A
70
Copyright © 1998 by Hanan Samet
4
v
hp22
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
g
hp22
(27,35)
Omaha
5. Insert Omaha causing a grid partition yielding bucket D
y=43
43
D
D
C
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
3
r
hp22
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
C
y=70
A
70
Copyright © 1998 by Hanan Samet
4
v
hp22
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
g
hp22
(27,35)
Omaha
5. Insert Omaha causing a grid partition yielding bucket D
y=43
43
D
D
C
Copyright © 1998 by Hanan Samet
6
r
hp22
(85,15)
Atlanta
6. Insert Atlanta causing a bucket partition yielding bucket E
E
Copyright © 1998 by Hanan Samet
GRID FILE EXAMPLE
• Assume bucket size = 2
hp221
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
Linear scales
1. Initially Chicago and Mobile in bucket A
x:
0 100
y:
0 100
A
Copyright © 1998 by Hanan Samet
2
z
hp22
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
45
x=45
B
Copyright © 1998 by Hanan Samet
3
r
hp22
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
C
y=70
A
70
Copyright © 1998 by Hanan Samet
4
v
hp22
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
g
hp22
(27,35)
Omaha
5. Insert Omaha causing a grid partition yielding bucket D
y=43
43
D
D
C
Copyright © 1998 by Hanan Samet
6
r
hp22
(85,15)
Atlanta
6. Insert Atlanta causing a bucket partition yielding bucket E
E
Copyright © 1998 by Hanan Samet
7. Insert Miami causing a grid partition yielding bucket F
7
v
hp22
(90,5)
Miami
B
x=70
E
F
70
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
4
r
hp23
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
4
r
hp23
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
v
hp23
(27,35)
Omaha
5. Insert Omaha; bucket A overflows; split A yielding bucket D
D
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
4
r
hp23
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
v
hp23
(27,35)
Omaha
5. Insert Omaha; bucket A overflows; split A yielding bucket D
D
Copyright © 1998 by Hanan Samet
6
g
hp23
6. Bucket A is still too full, so perform a grid partition
D C
B
E
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
4
r
hp23
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
v
hp23
(27,35)
Omaha
5. Insert Omaha; bucket A overflows; split A yielding bucket D
D
Copyright © 1998 by Hanan Samet
6
g
hp23
6. Bucket A is still too full, so perform a grid partition
D C
B
E
Copyright © 1998 by Hanan Samet
7
z
hp23
(85,15)
Atlanta
7. Insert Atlanta causing no change
Copyright © 1998 by Hanan Samet
EXCELL (Tamminen)
• Uses regular decomposition
• Like grid file, guarantees access to any record with two
disk operations
• Differentiated from grid file by absence of linear scales
which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2
hp231
b
(0,100) (100,100)
(100,0)(0,0)
(35,42)
Chicago
(52,10)
Mobile
1. Initially, Chicago and Mobile in bucket A
A
Copyright © 1998 by Hanan Samet
2
r
hp23
(62,77)
Toronto
2. Insert Toronto causing a grid partition yielding bucket B
B
Copyright © 1998 by Hanan Samet
3
z
hp23
(82,65)
Buffalo
3. Insert Buffalo causing a grid partition yielding bucket C
CA
Copyright © 1998 by Hanan Samet
4
r
hp23
(5,45)
Denver
4. Insert Denver causing no change
Copyright © 1998 by Hanan Samet
5
v
hp23
(27,35)
Omaha
5. Insert Omaha; bucket A overflows; split A yielding bucket D
D
Copyright © 1998 by Hanan Samet
6
g
hp23
6. Bucket A is still too full, so perform a grid partition
D C
B
E
Copyright © 1998 by Hanan Samet
7
z
hp23
(85,15)
Atlanta
7. Insert Atlanta causing no change
Copyright © 1998 by Hanan Samet
8
v
hp23
(90,5)
Miami
8. Insert Miami causing a bucket partition yielding bucket F
F
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
4
g
hp24
Grid
Method
2: all attributes
(fixed number
of cells)
E
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
4
g
hp24
Grid
Method
2: all attributes
(fixed number
of cells)
E
Copyright © 1998 by Hanan Samet
5
v
hp24
Grid
File
3: permit the
number of
cells to vary
Excell
E D
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
4
g
hp24
Grid
Method
2: all attributes
(fixed number
of cells)
E
Copyright © 1998 by Hanan Samet
5
v
hp24
Grid
File
3: permit the
number of
cells to vary
Excell
E D
Copyright © 1998 by Hanan Samet
6
r
hp24
Point
Quadtree
4: only partition
the
overflowing
cellsD
PR
Quadtree
H
MX
Quadtree
E
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
4
g
hp24
Grid
Method
2: all attributes
(fixed number
of cells)
E
Copyright © 1998 by Hanan Samet
5
v
hp24
Grid
File
3: permit the
number of
cells to vary
Excell
E D
Copyright © 1998 by Hanan Samet
6
r
hp24
Point
Quadtree
4: only partition
the
overflowing
cellsD
PR
Quadtree
H
MX
Quadtree
E
Copyright © 1998 by Hanan Samet
7
z
hp24
K-d tree
5: only partition
the
overflowing
cells into 2
cellsD
PR
K-d tree
H
Copyright © 1998 by Hanan Samet
SUMMARY
• Data structures can be grouped:
N = No data organization
D = Organize data to be
stored (e.g., binary
search tree)
E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)
H = Hybrid (combines at
least two of N, D, and E)
hp241
b
Copyright © 1998 by Hanan Samet
2
r
hp24
Sequential
List
0: no
organization
N
Copyright © 1998 by Hanan Samet
3
z
hp24
Inverted
List
1: one attribute
H
Copyright © 1998 by Hanan Samet
4
g
hp24
Grid
Method
2: all attributes
(fixed number
of cells)
E
Copyright © 1998 by Hanan Samet
5
v
hp24
Grid
File
3: permit the
number of
cells to vary
Excell
E D
Copyright © 1998 by Hanan Samet
6
r
hp24
Point
Quadtree
4: only partition
the
overflowing
cellsD
PR
Quadtree
H
MX
Quadtree
E
Copyright © 1998 by Hanan Samet
7
z
hp24
K-d tree
5: only partition
the
overflowing
cells into 2
cellsD
PR
K-d tree
H
Copyright © 1998 by Hanan Samet
8
g
hp24
Adaptive
K-d tree
6: partition using
attribute
having
greatest
spread across
the cell
D
Copyright © 1998 by Hanan Samet
hp25
DRAWBACKS OF MOST HASHING METHODS
• Require rehashing all the data when the hash table
becomes too full
• Goal: only move a few records
• Solutions:
1. Knott
• use a trie in the form of a binary tree
2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)
• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets
3. Linear hashing (Litwin)
• provides for linear growth in the number of
buckets (i.e., the hash table grows at a rate of one
bucket at a time)
• does not make use of a directory
1
b
A
B C
D
Copyright © 1998 by Hanan Samet
hp25
DRAWBACKS OF MOST HASHING METHODS
• Require rehashing all the data when the hash table
becomes too full
• Goal: only move a few records
• Solutions:
1. Knott
• use a trie in the form of a binary tree
2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)
• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets
3. Linear hashing (Litwin)
• provides for linear growth in the number of
buckets (i.e., the hash table grows at a rate of one
bucket at a time)
• does not make use of a directory
1
b
A
B C
D
Copyright © 1998 by Hanan Samet
hp252
r
• bucket overflow is solved by splitting
• drawback: accessing a bucket at
level m requires m operations
E F
Copyright © 1998 by Hanan Samet
hp25
DRAWBACKS OF MOST HASHING METHODS
• Require rehashing all the data when the hash table
becomes too full
• Goal: only move a few records
• Solutions:
1. Knott
• use a trie in the form of a binary tree
2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)
• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets
3. Linear hashing (Litwin)
• provides for linear growth in the number of
buckets (i.e., the hash table grows at a rate of one
bucket at a time)
• does not make use of a directory
1
b
A
B C
D
Copyright © 1998 by Hanan Samet
hp252
r
• bucket overflow is solved by splitting
• drawback: accessing a bucket at
level m requires m operations
E F
Copyright © 1998 by Hanan Samet
hp253
z
• implemented as a directory of pointers to buckets
• accessing a bucket requires 1 operation
A A A A B C D D
Copyright © 1998 by Hanan Samet
hp25
DRAWBACKS OF MOST HASHING METHODS
• Require rehashing all the data when the hash table
becomes too full
• Goal: only move a few records
• Solutions:
1. Knott
• use a trie in the form of a binary tree
2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)
• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets
3. Linear hashing (Litwin)
• provides for linear growth in the number of
buckets (i.e., the hash table grows at a rate of one
bucket at a time)
• does not make use of a directory
1
b
A
B C
D
Copyright © 1998 by Hanan Samet
hp252
r
• bucket overflow is solved by splitting
• drawback: accessing a bucket at
level m requires m operations
E F
Copyright © 1998 by Hanan Samet
hp253
z
• implemented as a directory of pointers to buckets
• accessing a bucket requires 1 operation
A A A A B C D D
Copyright © 1998 by Hanan Samet
hp254
g
• bucket overflow may cause doubling the directory
• e.g., EXCELL
A A A A A A A A B B E F D D D D
Copyright © 1998 by Hanan Samet
hp26
MECHANICS OF LINEAR HASHING
• Assume a file with m buckets
• Use two hashing functions hi(k) = f(k) mod 2i+1 for i = n
and i = n+1
1. compute hn+1(k) = x and use the result if x < m 2. otherwise use hn(k) • Such a file is said to be at level n,n+1 • There exist primary and overflow buckets • When a record hashes to a full primary bucket, then it is inserted into an overflow bucket corresponding to the primary bucket • τ : storage utilization factor τ = number of records in file divided by the total of available slots in primary and overflow buckets • When τ > a given load α, then one of the buckets is split
• When bucket i is split, it and its overflow bucket’s records
are rehashed using hn+1 and distributed into buckets i
and i + 2n as is appropriate
Copyright © 1998 by Hanan Samet
hp27
LINEAR HASHING INSERTION ALGORITHM
• Let s denote the identity of the next bucket to be split and
cycles from 0 to 2n-1
• Insertion algorithm
1. compute bucket address i for record r
2. insert r in bucket i
3. if τ > α, then split bucket i creating bucket i + 2n and
reinsert in buckets i and i + 2n
4. if s = 2n, then all buckets have been split
• increment n
• reset s to 0
5. if buckets i or i+1 overflow, then allocate an overflow
bucket
6. if rehashing causes some overflow buckets to be
reclaimed, repeat steps 3-5
• Notes
1. a bucket split need not necessarily occur when a
record hashes to a full bucket, nor does the bucket
being split need to be full
2. key principle is that eventually every bucket will be
split and ideally all overflow buckets will be emptied
and reclaimed
3. if the storage utilization gets too low, then buckets
should be reclaimed
Copyright © 1998 by Hanan Samet
hp28
BIT INTERLEAVING HASHING FUNCTIONS
• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Example with y being more significant than x
City x y f(z)=z div 12.5 Bit Interleaved
x y Value
Chicago 35 42 2 3 14
Mobile 52 10 4 0 16
Toronto 62 77 4 6 56
Buffalo 82 65 6 5 54
Denver 5 45 0 3 10
Omaha 27 35 2 2 12
Atlanta 85 15 6 1 22
Miami 90 5 7 0 21
1
b
Copyright © 1998 by Hanan Samet
hp28
BIT INTERLEAVING HASHING FUNCTIONS
• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Example with y being more significant than x
City x y f(z)=z div 12.5 Bit Interleaved
x y Value
Chicago 35 42 2 3 14
Mobile 52 10 4 0 16
Toronto 62 77 4 6 56
Buffalo 82 65 6 5 54
Denver 5 45 0 3 10
Omaha 27 35 2 2 12
Atlanta 85 15 6 1 22
Miami 90 5 7 0 21
1
b
Copyright © 1998 by Hanan Samet
hp282
r
0 0 1
1 1 0
y
x
• Ex: Atlanta (6,1)
Copyright © 1998 by Hanan Samet
hp28
BIT INTERLEAVING HASHING FUNCTIONS
• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Example with y being more significant than x
City x y f(z)=z div 12.5 Bit Interleaved
x y Value
Chicago 35 42 2 3 14
Mobile 52 10 4 0 16
Toronto 62 77 4 6 56
Buffalo 82 65 6 5 54
Denver 5 45 0 3 10
Omaha 27 35 2 2 12
Atlanta 85 15 6 1 22
Miami 90 5 7 0 21
1
b
Copyright © 1998 by Hanan Samet
hp282
r
0 0 1
1 1 0
y
x
• Ex: Atlanta (6,1)
Copyright © 1998 by Hanan Samet
hp283
z
0 1 0 1 1 0 = 22
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp295
v
Denver
• Insert Denver (10) into bucket 2 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp295
v
Denver
• Insert Denver (10) into bucket 2 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp296
g
Omaha
• Insert Omaha (12) into bucket 0 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp295
v
Denver
• Insert Denver (10) into bucket 2 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp296
g
Omaha
• Insert Omaha (12) into bucket 0 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp297
r
Atlanta
• Insert Atlanta (22) into bucket 2’s overflow area
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp295
v
Denver
• Insert Denver (10) into bucket 2 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp296
g
Omaha
• Insert Omaha (12) into bucket 0 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp297
r
Atlanta
• Insert Atlanta (22) into bucket 2’s overflow area
Copyright © 1998 by Hanan Samet
hp298
z
Omaha
4
• Insert Miami (21) into bucket 1:
τ=0.67 and split bucket 0 creating bucket 4; move Omaha to
bucket 4
Miami
Copyright © 1998 by Hanan Samet
hp29
EXAMPLE OF LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists
1
b
0
Copyright © 1998 by Hanan Samet
hp292
r
• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1
Chicago
0 1
Mobile
Copyright © 1998 by Hanan Samet
hp29
• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2
3
z
Toronto
Chicago
2
Copyright © 1998 by Hanan Samet
hp29
• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3
4
g
Buffalo
0 3
Copyright © 1998 by Hanan Samet
hp295
v
Denver
• Insert Denver (10) into bucket 2 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp296
g
Omaha
• Insert Omaha (12) into bucket 0 which causes it to overflow
Copyright © 1998 by Hanan Samet
hp297
r
Atlanta
• Insert Atlanta (22) into bucket 2’s overflow area
Copyright © 1998 by Hanan Samet
hp298
z
Omaha
4
• Insert Miami (21) into bucket 1:
τ=0.67 and split bucket 0 creating bucket 4; move Omaha to
bucket 4
Miami
Copyright © 1998 by Hanan Samet
hp299
v
Miami
5
• Reclaim the overflow area of bucket 0:
τ=0.67 again and split bucket 1 creating bucket 5; move
Miami to bucket 5
Copyright © 1998 by Hanan Samet
hp30
ORDER PRESERVING LINEAR HASHING (OPLH)
• Problem: the hash function hn(k) = k mod 2n implies that
all records in a given bucket agree in the least
n significant bits
1. OK for random access
2. unacceptable for sequential access as each record will
be in a different bucket
• Solution: use the hash function
hn(k) = (reverse(k)) mod 2n
1. tests the n most significant bits
2. all records in a bucket are within a given range
City x y f(z)=z mod 12.5 Bit Interleaved Value
x y x most sig y most sig
Chicago
35 42 2 3 44 28
Mobile 52 10 4 0 1 2
Toronto 62 77 4 6 11 7
Buffalo 82 65 6 5 39 27
Denver 5 45 0 3 40 20
Omaha 27 35 2 2 12 12
Atlanta 85 15 6 1 37 26
Miami 90 5 7 0 21 42
• Shortcomings
1. records may not be scattered too well
• overflow is much more common than with traditional
hashing methods
• random access is slower since several overflow
buckets may have to be examined
2. creates a large number of sparsely filled buckets
• sequential access may be slower as may have to
examine many empty buckets
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp314
g
(82,65)
Buffalo3
• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp314
g
(82,65)
Buffalo3
• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3
Copyright © 1998 by Hanan Samet
hp315
v
(5,45)
Denver
• Insert Denver (20) into bucket 0
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp314
g
(82,65)
Buffalo3
• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3
Copyright © 1998 by Hanan Samet
hp315
v
(5,45)
Denver
• Insert Denver (20) into bucket 0
Copyright © 1998 by Hanan Samet
hp316
g
(27,35)
Omaha4
• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp314
g
(82,65)
Buffalo3
• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3
Copyright © 1998 by Hanan Samet
hp315
v
(5,45)
Denver
• Insert Denver (20) into bucket 0
Copyright © 1998 by Hanan Samet
hp316
g
(27,35)
Omaha4
• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4
Copyright © 1998 by Hanan Samet
hp317
v
(85,15)
Atlanta
• Insert Atlanta (26) into bucket 2
Copyright © 1998 by Hanan Samet
hp31
EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists
(0,100) (100,100)
(100,0)(0,0)
0
1
b
Copyright © 1998 by Hanan Samet
hp312
r
(52,10)
Mobile
(35,42)
Chicago
1
• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1
Copyright © 1998 by Hanan Samet
hp313
z
(62,77)
Toronto
2
• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2
Copyright © 1998 by Hanan Samet
hp314
g
(82,65)
Buffalo3
• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3
Copyright © 1998 by Hanan Samet
hp315
v
(5,45)
Denver
• Insert Denver (20) into bucket 0
Copyright © 1998 by Hanan Samet
hp316
g
(27,35)
Omaha4
• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4
Copyright © 1998 by Hanan Samet
hp317
v
(85,15)
Atlanta
• Insert Atlanta (26) into bucket 2
Copyright © 1998 by Hanan Samet
hp318
z
(90,5)
Miami
5
• Insert Miami (42) into bucket 2: τ=0.67 and split bucket 1
creating bucket 5; move Miami to bucket 2’s overflow area
Copyright © 1998 by Hanan Samet
hp32
COMPARISON OF OPLH WITH EXCELL
OPLH EXCELL
1. Implicit directory
• one directory element
per primary bucket
• all buckets stored at
one of two levels
• overflow buckets
1. Explicit directory
• set of primary buckets
2. Reverse bit interleaving 2. Reverse bit interleaving
3. Bucket overflow
• allocate at most two
additional buckets (one
for the bucket that has
been split and one for
the overflowing bucket)
3. Bucket overflow
• triggers bucket split or
directory doubling
4.Retrieval of a record
requires examining primary
and overflow buckets
4.Retrieve any record with
two disk accesses
• Summary: order preserving linear hashing (OPLH) yields
a more gradual growth in the size of the
directory at the expense of the loss of the
guarantee of retrieval of any record with just
two disk accesses
Copyright © 1998 by Hanan Samet
hp33
EXAMPLES OF COMPARISON OF OPLH WITH EXCELL
1. Reversed bit interleaving with the y coordinate value as
the most significant (i.e., y is split first)
OPLH: EXCELL:
• 6 primary buckets • 7 buckets
• 2 overflow buckets • 16 directory elements
2. Reversed bit interleaving with the x coordinate value as
the most significant (i.e., x is split first)
OPLH: EXCELL:
• 7 primary buckets • 6 buckets
• no overflow buckets • 8 directory elements
(52,10)
Mobile
(35,42)
Chicago
(62,77)
Toronto
(82,65)
Buffalo
(5,45)
Denver
(27,35)
Omaha (85,15)
Atlanta
(90,5)
Miami
(52,10)
Mobile
(5,45)
Denver
(85,15)
Atlanta
(27,35)
Omaha
(0,100) (100,100)
(100,0)(0,0)
0
1
2
3
4
5
(0,100) (100,100)
(100,0)(0,0)
A
(52,10)
Mobile
(35,42)
Chicago
B
(62,77)
Toronto
C
(82,65)
Buffalo
(5,45)
Denver
(27,35)
OmahaD
(85,15)
Atlanta
(90,5)
Miami
B B B B
B BB
A G
E
F F
(0,100) (100,100)
(100,0)(0,0)
A
(52,10)
Mobile
(35,42)
Chicago
D
(62,77)
Toronto
B
(82,65)
Buffalo
(5,45)
Denver
(27,35)
Omaha
(85,15)
Atlanta
(90,5)
Miami
D CC
E F
(0,100) (100,100)
(100,0)(0,0)
0
2
1
3
4
6
5
(35,42)
Chicago
(62,77)
Toronto
(82,65)
Buffalo
(90,5)
Miami
Copyright © 1998 by Hanan Samet
hp341
bSPIRAL HASHING (Martin)
• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the
probability of the occurence of overflow
2. all buckets that are candidates for a split have the
same probability of overflowing
• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses
1. records are distributed in the active buckets in an
uneven manner
2. split the bucket with the highest probability of
overflowing
• When a bucket s is split
1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used
• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split
• Ex: assume that initially there are d – 1 active buckets
starting at address 1
• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)
1d = 2:
Copyright © 1998 by Hanan Samet
hp341
bSPIRAL HASHING (Martin)
• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the
probability of the occurence of overflow
2. all buckets that are candidates for a split have the
same probability of overflowing
• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses
1. records are distributed in the active buckets in an
uneven manner
2. split the bucket with the highest probability of
overflowing
• When a bucket s is split
1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used
• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split
• Ex: assume that initially there are d – 1 active buckets
starting at address 1
• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)
1d = 2:
Copyright © 1998 by Hanan Samet
hp342
r
2 3
Copyright © 1998 by Hanan Samet
hp341
bSPIRAL HASHING (Martin)
• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the
probability of the occurence of overflow
2. all buckets that are candidates for a split have the
same probability of overflowing
• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses
1. records are distributed in the active buckets in an
uneven manner
2. split the bucket with the highest probability of
overflowing
• When a bucket s is split
1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used
• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split
• Ex: assume that initially there are d – 1 active buckets
starting at address 1
• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)
1d = 2:
Copyright © 1998 by Hanan Samet
hp342
r
2 3
Copyright © 1998 by Hanan Samet
hp343
z
3 4 5
Copyright © 1998 by Hanan Samet
hp341
bSPIRAL HASHING (Martin)
• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the
probability of the occurence of overflow
2. all buckets that are candidates for a split have the
same probability of overflowing
• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses
1. records are distributed in the active buckets in an
uneven manner
2. split the bucket with the highest probability of
overflowing
• When a bucket s is split
1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used
• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split
• Ex: assume that initially there are d – 1 active buckets
starting at address 1
• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)
1d = 2:
Copyright © 1998 by Hanan Samet
hp342
r
2 3
Copyright © 1998 by Hanan Samet
hp343
z
3 4 5
Copyright © 1998 by Hanan Samet
hp344
g
4 5 6 7
Copyright © 1998 by Hanan Samet
hp35
SPIRAL HASHING FUNCTION
• Assume that initially there are d −1 active buckets starting
at address 1
• Key idea is the behavior of the function y = dx
1. dx+1 − dx = dx • (d −1)
2. bucket s has just been split
3. let dx = s +1 = address of first active bucket
4. implies dx+1 − dx = (s +1) • (d −1) = number of active buckets
5. last active bucket is at address dx+1 −1
6. [dx,dx+1 ) is range of active buckets
• Use two hashing functions
1. h(k) maps key k uniformly into [0,1) which is the range of the
difference in exponent values of addresses iny = dx+1 − dx
2. y(k) maps h(k) into an address in [s+1,(s+1)+(s+1)•(d −1))
• y(k) = dx(k) with x(k) in range [logd (s+1),logd (s+1)+1)
• before split, active buckets lay in range [logd s,logd s+1)
• want to make sure that all key values previously in
bucket s − i.e., x(k) in [logds,logd (s+1)) are rehashed
into one of the new buckets with an x(k) value in
[logds+1,logd (s+1)+1)
• leave other key values in [logd (s+1),logd s +1) unchanged
• difficult to choose x(k)
a. x(k) = logd (s+1) + h(k)
• drawback: must rehash all keys when a bucket is split
b. x(k) = logd (s+1) − h(k) + h(k)
• guarantees that if k is hashed into bucket b (≥ s + 1)
then it continues to hash there until bucket b is split
• implies that x(k) is a number in the range
[logd (s +1),logd (s+1) +1) whose fractional part is h(k)
Copyright © 1998 by Hanan Samet
hp36
BEHAVIOR OF THE SPIRAL HASHING FUNCTION
• Ex: d = 2
1
b
y
15
14
13
12
11
10
9
8
7
6
5
4
3
2
1
y = 2x
x
Bucket
Number
Relative
Load
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0
0.1
0.2
0.5
1.0
The function y = 2x Relative load of buckets 1-15
Bucket address Hash interval Relative load
1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932
Copyright © 1998 by Hanan Samet
hp36
BEHAVIOR OF THE SPIRAL HASHING FUNCTION
• Ex: d = 2
1
b
y
15
14
13
12
11
10
9
8
7
6
5
4
3
2
1
y = 2x
x
Bucket
Number
Relative
Load
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0
0.1
0.2
0.5
1.0
The function y = 2x Relative load of buckets 1-15
Bucket address Hash interval Relative load
1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932
Copyright © 1998 by Hanan Samet
hp362
r
• Split bucket 3
Copyright © 1998 by Hanan Samet
hp36
BEHAVIOR OF THE SPIRAL HASHING FUNCTION
• Ex: d = 2
1
b
y
15
14
13
12
11
10
9
8
7
6
5
4
3
2
1
y = 2x
x
Bucket
Number
Relative
Load
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0
0.1
0.2
0.5
1.0
The function y = 2x Relative load of buckets 1-15
Bucket address Hash interval Relative load
1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932
Copyright © 1998 by Hanan Samet
hp362
r
• Split bucket 3
Copyright © 1998 by Hanan Samet
hp363
z
• Yields buckets 6 and 7
Copyright © 1998 by Hanan Samet
hp37
TABLES FOR EXAMPLE OF SPIRAL HASHING
• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH
• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64
Chicago 35 42 2 3 14 .21875
Mobile 52 10 4 0 16 .25
Toronto 62 77 4 6 56 .875
Buffalo 82 65 6 5 54 .84375
Denver 5 45 0 3 10 .15625
Omaha 27 35 2 2 12 .1875
Atlanta 85 15 6 1 22 .34375
Miami 90 5 7 0 21 .328125
1
b
Copyright © 1998 by Hanan Samet
hp37
TABLES FOR EXAMPLE OF SPIRAL HASHING
• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH
• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64
Chicago 35 42 2 3 14 .21875
Mobile 52 10 4 0 16 .25
Toronto 62 77 4 6 56 .875
Buffalo 82 65 6 5 54 .84375
Denver 5 45 0 3 10 .15625
Omaha 27 35 2 2 12 .1875
Atlanta 85 15 6 1 22 .34375
Miami 90 5 7 0 21 .328125
1
b
Copyright © 1998 by Hanan Samet
hp372
r
0 0 1
1 1 0
y
x
• Ex: Atlanta (6,1)
Copyright © 1998 by Hanan Samet
hp37
TABLES FOR EXAMPLE OF SPIRAL HASHING
• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them
• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits
• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH
• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64
Chicago 35 42 2 3 14 .21875
Mobile 52 10 4 0 16 .25
Toronto 62 77 4 6 56 .875
Buffalo 82 65 6 5 54 .84375
Denver 5 45 0 3 10 .15625
Omaha 27 35 2 2 12 .1875
Atlanta 85 15 6 1 22 .34375
Miami 90 5 7 0 21 .328125
1
b
Copyright © 1998 by Hanan Samet
hp372
r
0 0 1
1 1 0
y
x
• Ex: Atlanta (6,1)
Copyright © 1998 by Hanan Samet
hp373
z
0 1 0 1 1 0 = 22
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4
3
z
Toronto
5
4
5
4 5
TOR CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4
3
z
Toronto
5
4
5
4 5
TOR CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7
4
g
Buffalo
6 7
6 7
TOR
BUF
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4
3
z
Toronto
5
4
5
4 5
TOR CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7
4
g
Buffalo
6 7
6 7
TOR
BUF
Copyright © 1998 by Hanan Samet
hp385
g
• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area
Omaha
Denver
DEN
OMA
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4
3
z
Toronto
5
4
5
4 5
TOR CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7
4
g
Buffalo
6 7
6 7
TOR
BUF
Copyright © 1998 by Hanan Samet
hp385
g
• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area
Omaha
Denver
DEN
OMA
Copyright © 1998 by Hanan Samet
hp386
v
• Insert Atlanta (.34) into bucket 5: τ=0.7 and split bucket 4
creating buckets 8 and 9; Denver is moved to bucket 8,
while Chicago, Mobile, and Omaha are moved to bucket 9
which causes it to overflow
Atlanta
9
8
9
8 9
CHI
MOB
DEN
OMA
ATL
Copyright © 1998 by Hanan Samet
hp38
MECHANICS OF EXAMPLE OF SPIRAL HASHING
• Assume d = 2 and primary and overflow bucket capacity of 2
• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists
1
b
1
1
Copyright © 1998 by Hanan Samet
hp38
• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2
2
r
Mobile
Chicago
3
1 2
2 3
CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4
3
z
Toronto
5
4
5
4 5
TOR CHI
MOB
Copyright © 1998 by Hanan Samet
hp38
• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7
4
g
Buffalo
6 7
6 7
TOR
BUF
Copyright © 1998 by Hanan Samet
hp385
g
• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area
Omaha
Denver
DEN
OMA
Copyright © 1998 by Hanan Samet
hp386
v
• Insert Atlanta (.34) into bucket 5: τ=0.7 and split bucket 4
creating buckets 8 and 9; Denver is moved to bucket 8,
while Chicago, Mobile, and Omaha are moved to bucket 9
which causes it to overflow
Atlanta
9
8
9
8 9
CHI
MOB
DEN
OMA
ATL
Copyright © 1998 by Hanan Samet
hp387
v
• Insert Miami (.33) into bucket 5: τ=0.67 and split bucket 5
creating buckets 10 and 11; Move Atlanta and Miami to
bucket 10
10 11
ATL
MIA
Miami
1110
11
Copyright © 1998 by Hanan Samet
hp39
COMPARISON OF SPIRAL AND LINEAR HASHING
• Main advantage of spiral hashing over linear hashing is
that the bucket being split is the one most likely to
overflow
• Disadvantages of spiral hashing:
1. the buckets that have been split are not reused
• overcome by using a mapping between logical and
physical addresses
2. expensive to calculate function y = dx
• overcome by use of an approximation
Copyright © 1998 by Hanan Samet
hp40
WHY SPIRAL?
• Can rewrite y = dx as ρ = e jθ using polar coordinates
which yields the equation of a spiral
• Ex: ρ = e (ln 2)•θ/2π
1
b
-10 -5 5 10 15
-10
-5
5
10
3
6
12
14
7
15
13
5
10
11
9
• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π
• Mechanics of a bucket split
1. let first active bucket be at a = e j•b (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active
bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b
4. buckets a + 1 through g are now the active buckets
• Ex: d = 2
8
16
Copyright © 1998 by Hanan Samet
hp40
WHY SPIRAL?
• Can rewrite y = dx as ρ = e jθ using polar coordinates
which yields the equation of a spiral
• Ex: ρ = e (ln 2)•θ/2π
1
b
-10 -5 5 10 15
-10
-5
5
10
3
6
12
14
7
15
13
5
10
11
9
• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π
• Mechanics of a bucket split
1. let first active bucket be at a = e j•b (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active
bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b
4. buckets a + 1 through g are now the active buckets
• Ex: d = 2
8
16
Copyright © 1998 by Hanan Samet
hp402
r
1. active buckets 6 through 11
Copyright © 1998 by Hanan Samet
hp40
WHY SPIRAL?
• Can rewrite y = dx as ρ = e jθ using polar coordinates
which yields the equation of a spiral
• Ex: ρ = e (ln 2)•θ/2π
1
b
-10 -5 5 10 15
-10
-5
5
10
3
6
12
14
7
15
13
5
10
11
9
• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π
• Mechanics of a bucket split
1. let first active bucket be at a = e j•b (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active
bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b
4. buckets a + 1 through g are now the active buckets
• Ex: d = 2
8
16
Copyright © 1998 by Hanan Samet
hp402
r
1. active buckets 6 through 11
Copyright © 1998 by Hanan Samet
hp403
z
2. split bucket 6 to yield buckets 12 and 13
Copyright © 1998 by Hanan Samet
hp40
WHY SPIRAL?
• Can rewrite y = dx as ρ = e jθ using polar coordinates
which yields the equation of a spiral
• Ex: ρ = e (ln 2)•θ/2π
1
b
-10 -5 5 10 15
-10
-5
5
10
3
6
12
14
7
15
13
5
10
11
9
• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π
• Mechanics of a bucket split
1. let first active bucket be at a = e j•b (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active
bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b
4. buckets a + 1 through g are now the active buckets
• Ex: d = 2
8
16
Copyright © 1998 by Hanan Samet
hp402
r
1. active buckets 6 through 11
Copyright © 1998 by Hanan Samet
hp403
z
2. split bucket 6 to yield buckets 12 and 13
Copyright © 1998 by Hanan Samet
hp404
g
3. active buckets 7 through 13
Copyright © 1998 by Hanan Samet
hp40
WHY SPIRAL?
• Can rewrite y = dx as ρ = e jθ using polar coordinates
which yields the equation of a spiral
• Ex: ρ = e (ln 2)•θ/2π
1
b
-10 -5 5 10 15
-10
-5
5
10
3
6
12
14
7
15
13
5
10
11
9
• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π
• Mechanics of a bucket split
1. let first active bucket be at a = e j•b (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active
bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b
4. buckets a + 1 through g are now the active buckets
• Ex: d = 2
8
16
Copyright © 1998 by Hanan Samet
hp402
r
1. active buckets 6 through 11
Copyright © 1998 by Hanan Samet
hp403
z
2. split bucket 6 to yield buckets 12 and 13
Copyright © 1998 by Hanan Samet
hp404
g
3. active buckets 7 through 13
Copyright © 1998 by Hanan Samet
hp405
v
• Observations
1. insead of h(k) uniformly mapping key k into [0,1), use
hθ(k) to uniformly map k into [0,2π)
2. length of arc has constant value between successive
integer values of ρ
Copyright © 1998 by Hanan Samet