This lab focuses on two major themes in modern computing — computational geometry and algorithms for large data sets. You’ll implement foundational algorithms that are widely used in fields like GIS, computer graphics, spatial databases, and big data analytics.
Table of contents
Open Table of contents
Introduction
Computational geometry provides the algorithmic tools needed to solve spatial problems, while large data sets require scalable, memory-efficient strategies. Mastering these domains prepares students to handle challenges in areas like:
- Map-based applications and spatial search
- Data analytics at scale
- Robotics and computer vision
- Scientific simulations and real-time monitoring
In this lab, we implement:
- Convex Hull (2D) — the smallest polygon enclosing a set of points.
- Range Search using KD-Tree — efficient spatial querying.
- Streaming Algorithms for Large Data — approximate solutions using limited memory.
Geometry and Big Data Algorithms
1. Convex Hull using Graham Scan
Concept: Given a set of 2D points, the convex hull is the minimal convex boundary containing all points.
import matplotlib.pyplot as plt
def graham_scan(points):
points = sorted(points)
def cross(o, a, b):
return (a[0] - o[0])*(b[1] - o[1]) - (a[1] - o[1])*(b[0] - o[0])
lower, upper = [], []
for p in points:
while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
lower.pop()
lower.append(p)
for p in reversed(points):
while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
upper.pop()
upper.append(p)
return lower[:-1] + upper[:-1]
2. 2D Range Search with KD-Tree
from scipy.spatial import KDTree
def build_kd_tree(points):
return KDTree(points)
def range_query(kd_tree, query_point, radius):
return kd_tree.query_ball_point(query_point, radius)
3. Count Distinct Elements in Streaming Data (using HyperLogLog-style estimation)
import hashlib
import random
def simulate_stream(n=100000):
return [random.randint(0, 10**6) for _ in range(n)]
def hash_value(x):
return int(hashlib.md5(str(x).encode()).hexdigest(), 16)
def estimate_cardinality(stream, num_buckets=64):
max_trailing_zeros = [0]*num_buckets
for val in stream:
h = hash_value(val)
bucket = h % num_buckets
zeros = len(bin(h >> 6)) - len(bin((h >> 6).rstrip('0')))
max_trailing_zeros[bucket] = max(max_trailing_zeros[bucket], zeros)
avg = sum(max_trailing_zeros) / len(max_trailing_zeros)
return int(2**avg * num_buckets * 0.77351)
Example Usage
Click to expand
# 1. Convex Hull
points = [(random.randint(0, 100), random.randint(0, 100)) for _ in range(30)]
hull = graham_scan(points)
# 2. KD-Tree Range Query
tree = build_kd_tree(points)
neighbors = range_query(tree, (50, 50), 20)
# 3. Streaming Approximation
stream_data = simulate_stream()
approx_unique = estimate_cardinality(stream_data)
print("Convex Hull Points:", hull)
print("Neighbors in range:", neighbors)
print("Estimated Unique Elements:", approx_unique)
Visualization
Click to expand
def plot_convex_hull(points, hull):
xs, ys = zip(*points)
hx, hy = zip(*(hull + [hull[0]])) # Close the polygon
plt.figure(figsize=(7, 7))
plt.scatter(xs, ys, label="Points")
plt.plot(hx, hy, 'r-', label="Convex Hull")
plt.title("Convex Hull (Graham Scan)")
plt.legend()
plt.show()
plot_convex_hull(points, hull)
Time and Space Complexity
Algorithm | Time Complexity |
---|---|
Convex Hull (Graham Scan) | \(O(n \log n)\) |
KD-Tree Construction | \(O(n \log n)\) |
KD-Tree Range Query | \(O(\log n + k)\) |
Cardinality Estimation | \(O(n)\) |
Applications
-
Convex Hull:
- Geographic boundary detection
- Collision detection in games
- Object shape recognition
-
KD-Tree Search:
- Nearest neighbor in spatial datasets
- Robotics and motion planning
- Location-based services
-
Streaming Algorithms:
- Real-time analytics on sensor or clickstream data
- Network traffic analysis
- Social media trend monitoring
Conclusion
This lab bridges classic geometry algorithms with scalable data techniques. From determining geometric boundaries to handling millions of data points efficiently, these tools are foundational to both theoretical and applied computing.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C.
Introduction to Algorithms, 3rd/4th Edition, MIT Press.