Geospatial Search: Quad Tree vs Geohash vs PostGIS vs Elasticsearch

Quick Comparison Table

AspectQuad TreeGeohashPostGISElasticsearch
TypeTree-based spatial indexHash-based grid encodingDatabase extensionSearch engine
Data StructureRecursive tree subdivisionString encodingB-tree spatial indexInverted index + spatial
Query SpeedO(log n) averageO(1) lookupVery fast with indexesFast full-text + spatial
PrecisionAdjustable (depth)Fixed by hash lengthHigh precisionAdjustable
Memory UsageHigh (tree structure)Low (strings only)ModerateHigh (full index)
ImplementationIn-memory, customDatabase native or appPostgreSQL extensionBuilt-in
ScalabilitySingle machineScales well distributedScales with PostgreSQLHorizontal scaling
Use CaseReal-time, in-memoryMobile apps, cachingProduction databasesSearch + location
Learning CurveHighLowModerateLow
Range QueriesExcellentGood (prefix matching)ExcellentGood
Radius QueriesGoodRequires grid searchExcellentExcellent

1. Quad Tree

Concept

Quad Tree is a recursive spatial partitioning structure that divides 2D space into four quadrants.

Click to view code
             World (max_points=4)
            /        |         \       \
          NW        NE         SW       SE
         /  \      /  \       /  \     /  \
       NW   NE   NW   NE    NW   NE  NW   NE

How It Works

Insertion Algorithm:

Click to view code (python)
class QuadNode:
    def __init__(self, boundary, max_points=4):
        self.boundary = boundary  # (x, y, width, height)
        self.points = []
        self.max_points = max_points
        self.divided = False
        self.northeast = None
        self.northwest = None
        self.southeast = None
        self.southwest = None
    
    def insert(self, point):
        # Point = (lat, lng, data)
        if not self.boundary.contains(point):
            return False
        
        if len(self.points) < self.max_points:
            self.points.append(point)
            return True
        
        if not self.divided:
            self.subdivide()
        
        # Try to insert into children
        return (self.northeast.insert(point) or
                self.northwest.insert(point) or
                self.southeast.insert(point) or
                self.southwest.insert(point))
    
    def query_radius(self, center, radius):
        found = []
        
        # Check if search area intersects boundary
        if not self.boundary.intersects_circle(center, radius):
            return found
        
        # Check points in this node
        for point in self.points:
            if distance(center, point) <= radius:
                found.append(point)
        
        # Recursively check children
        if self.divided:
            found.extend(self.northeast.query_radius(center, radius))
            found.extend(self.northwest.query_radius(center, radius))
            found.extend(self.southeast.query_radius(center, radius))
            found.extend(self.southwest.query_radius(center, radius))
        
        return found
    
    def subdivide(self):
        x, y, w, h = self.boundary
        half_w, half_h = w / 2, h / 2
        
        self.northeast = QuadNode((x + half_w, y, half_w, half_h))
        self.northwest = QuadNode((x, y, half_w, half_h))
        self.southeast = QuadNode((x + half_w, y + half_h, half_w, half_h))
        self.southwest = QuadNode((x, y + half_h, half_w, half_h))
        self.divided = True

Query Example (Radius Search):

Click to view code (python)
# Find all drivers within 1km of rider
quad_tree = build_quad_tree(all_drivers)
nearby_drivers = quad_tree.query_radius(
    center=(40.7128, -74.0060),  # NYC
    radius=1.0  # 1km
)

Pros

  • ✅ O(log n) average query time
  • ✅ Excellent for dynamic updates (drivers entering/leaving)
  • ✅ Works in-memory (fast for real-time)
  • ✅ Adaptive: Tree depth adjusts to data distribution
  • ✅ Good for visualization (spatial hierarchy)

Cons

  • ❌ Complex to implement correctly
  • ❌ High memory overhead (tree nodes)
  • ❌ Unbalanced trees can degrade to O(n)
  • ❌ Difficult to persist and replicate
  • ❌ Not suitable for very large datasets (GB+)

Real-World Example: Uber Driver Matching

Click to view code (python)
class UberDriverMatcher:
    def __init__(self):
        self.quad_tree = QuadNode(
            boundary=(0, 0, 180, 180),  # World bounds
            max_points=10  # Split after 10 drivers
        )
    
    def add_driver(self, driver_id, lat, lng):
        self.quad_tree.insert((lat, lng, {'id': driver_id}))
    
    def find_nearby_drivers(self, rider_lat, rider_lng, radius_km=2):
        # Convert km to lat/lng degrees (approx)
        radius_degrees = radius_km / 111.0
        
        drivers = self.quad_tree.query_radius(
            center=(rider_lat, rider_lng),
            radius=radius_degrees
        )
        
        # Sort by distance
        drivers.sort(
            key=lambda d: distance(
                (rider_lat, rider_lng),
                (d[0], d[1])
            )
        )
        
        return drivers

# Usage
matcher = UberDriverMatcher()
matcher.add_driver('D1', 40.7128, -74.0060)
matcher.add_driver('D2', 40.7180, -74.0050)

nearby = matcher.find_nearby_drivers(
    rider_lat=40.7150,
    rider_lng=-74.0055,
    radius_km=2
)

When to Use

  • Real-time location services (Uber, Lyft)
  • Game engines (collision detection)
  • In-memory spatial queries (<100M points)
  • Dynamic data with frequent updates
  • Low-latency requirements (<100ms)

2. Geohash

Concept

Geohash converts lat/lng to a string by recursively dividing the world into grids.

Click to view code
World (8 parts):        Each part (4 parts):
  0 1                        0 1
  2 3                        2 3
  4 5                        4 5
  6 7                        6 7

"9q8yy" =
  9    -> quadrant 9
  q    -> further subdivision
  8    -> more subdivision
  y    -> even more
  y    -> finest detail

How It Works

Encoding Algorithm:

Click to view code (python)
import base32

# Base32 alphabet for Geohash
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def encode_geohash(lat, lng, precision=11):
    """
    precision: hash length (higher = more precise)
    precision 1 = ~5,000km
    precision 5 = ~4.8km
    precision 9 = ~3.8 meters
    precision 11 = ~1.5 meters (Uber precision)
    """
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    
    geohash = []
    bits = 0
    bit = 0
    
    is_lat = False
    while len(geohash) < precision:
        if is_lat:
            mid = (lat_range[0] + lat_range[1]) / 2
            if lat > mid:
                bits = (bits << 1) + 1
                lat_range[0] = mid
            else:
                bits = bits << 1
                lat_range[1] = mid
        else:
            mid = (lng_range[0] + lng_range[1]) / 2
            if lng > mid:
                bits = (bits << 1) + 1
                lng_range[0] = mid
            else:
                bits = bits << 1
                lng_range[1] = mid
        
        is_lat = not is_lat
        bit += 1
        
        if bit == 5:
            geohash.append(BASE32[bits])
            bits = 0
            bit = 0
    
    return ''.join(geohash)

def decode_geohash(geohash):
    """Decode geohash back to lat/lng range"""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    
    is_lat = False
    for c in geohash:
        bits = BASE32.index(c)
        
        for i in range(4, -1, -1):
            bit = (bits >> i) & 1
            
            if is_lat:
                mid = (lat_range[0] + lat_range[1]) / 2
                if bit == 1:
                    lat_range[0] = mid
                else:
                    lat_range[1] = mid
            else:
                mid = (lng_range[0] + lng_range[1]) / 2
                if bit == 1:
                    lng_range[0] = mid
                else:
                    lng_range[1] = mid
            
            is_lat = not is_lat
    
    lat = (lat_range[0] + lat_range[1]) / 2
    lng = (lng_range[0] + lng_range[1]) / 2
    return lat, lng

Radius Search Using Geohash:

Click to view code (python)
from itertools import product

def get_neighbor_geohashes(geohash, precision=11):
    """
    Get all geohashes that neighbor the given geohash
    
    For radius search, we need:
    1. The geohash itself
    2. All 8 neighbors
    3. Neighbors of neighbors (if radius is large)
    """
    # Neighbors mapping (pre-computed)
    NEIGHBORS = {
        'right': {...},
        'left': {...},
        'top': {...},
        'bottom': {...}
    }
    
    # Get 8 neighbors
    neighbors = set()
    neighbors.add(geohash)
    
    # Simplified: truncate and search all permutations
    # In production, use precomputed neighbor tables
    prefix = geohash[:-1]
    for i in range(32):  # 32 base32 chars
        neighbor = prefix + BASE32[i]
        neighbors.add(neighbor)
    
    return neighbors

def find_nearby_with_geohash(lat, lng, radius_km=2):
    """Find all points within radius using geohash"""
    
    # 1. Encode target location
    target_hash = encode_geohash(lat, lng, precision=9)
    
    # 2. Get neighbors (geohashes can be split across boundaries)
    neighbor_hashes = get_neighbor_geohashes(target_hash)
    
    # 3. Query database for all points in those geohashes
    candidates = []
    for hash_prefix in neighbor_hashes:
        # SELECT * FROM locations WHERE geohash LIKE 'ezs42...'
        candidates.extend(
            db.query(f"geohash LIKE '{hash_prefix}%'")
        )
    
    # 4. Filter by actual distance (geohash is approximate)
    results = []
    for point in candidates:
        if distance((lat, lng), (point.lat, point.lng)) <= radius_km:
            results.append(point)
    
    return results

Geohash Precision Table

PrecisionErrorUse Case
1±5,000 kmContinental scale
3±1,000 kmCountry scale
5±4.8 kmCity scale
7±150 mNeighborhood
9±3.8 mBuilding
11±1.5 mStreet location (Uber)
13±0.6 mPrecision meter

Pros

  • ✅ Very easy to implement
  • ✅ Compact string representation (sortable)
  • ✅ O(1) encoding/decoding
  • ✅ Easy to persist and cache
  • ✅ Good for prefix-based queries
  • ✅ Works well distributed (partition by hash)

Cons

  • ❌ Geohash boundaries can split search areas (need neighbor check)
  • ❌ Less accurate than true R-tree indexes
  • ❌ Requires post-filtering by distance
  • ❌ Fixed precision limits accuracy
  • ❌ Border issues (point on boundary of grids)

Real-World Example: Uber Pool Matching

Click to view code (python)
class GeoHashPool:
    def __init__(self, redis_client):
        self.redis = redis_client  # Fast in-memory store
        self.precision = 11  # 1.5 meter precision
    
    def add_rider(self, rider_id, lat, lng):
        geohash = encode_geohash(lat, lng, self.precision)
        
        # Store in Redis sorted set for range queries
        # Key pattern: "riders:{geohash_prefix}"
        prefix = geohash[:5]  # Use first 5 chars for bucketing
        
        self.redis.zadd(
            f"riders:{prefix}",
            {rider_id: float(geohash)},
            score=lat  # Can also score by distance
        )
    
    def find_pool_matches(self, rider_lat, rider_lng, pool_size=4):
        geohash = encode_geohash(rider_lat, rider_lng, self.precision)
        prefix = geohash[:5]
        
        # Get nearby riders
        nearby_riders = set()
        
        # Search in main bucket and neighbors
        for neighbor_prefix in self.get_neighbor_prefixes(prefix):
            riders = self.redis.zrange(
                f"riders:{neighbor_prefix}",
                0, -1
            )
            nearby_riders.update(riders)
        
        # Filter by actual distance
        matches = []
        for rider_id in nearby_riders:
            rider_data = self.redis.hgetall(f"rider:{rider_id}")
            distance = calculate_distance(
                (rider_lat, rider_lng),
                (float(rider_data['lat']), float(rider_data['lng']))
            )
            
            if distance <= 2.0:  # 2km
                matches.append({
                    'rider_id': rider_id,
                    'distance': distance
                })
        
        return sorted(matches, key=lambda x: x['distance'])[:pool_size]

# Usage
pool = GeoHashPool(redis_client)
pool.add_rider('R1', 40.7128, -74.0060)
pool.add_rider('R2', 40.7150, -74.0055)

matches = pool.find_pool_matches(40.7140, -74.0065, pool_size=4)

When to Use

  • Mobile apps (location caching)
  • Distributed systems (partition by geohash)
  • Redis/cache-based storage
  • Approximate location queries
  • URL shortener with geographic prefix (like what3words)

3. PostGIS (PostgreSQL Spatial Extension)

Concept

PostGIS adds native spatial data types and functions to PostgreSQL using R-tree indexes.

Click to view code (sql)
-- Create table with spatial column
CREATE TABLE drivers (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    location GEOMETRY(POINT, 4326),  -- WGS84 projection
    updated_at TIMESTAMP
);

-- Create spatial index (B-tree on GIST)
CREATE INDEX idx_drivers_location ON drivers 
USING GIST (location);

How It Works

R-tree Index Structure:

Click to view code
                    Root (bounds entire space)
                   /          |          \
              Branch1      Branch2      Branch3
             /   |   \    /   |   \    /   |   \
           Leaf Leaf Leaf Leaf Leaf Leaf...
           
Each leaf contains:
- Bounding box
- Pointer to actual data
- MBR (Minimum Bounding Rectangle)

Real-Time Driver Location Updates:

Click to view code (sql)
-- Update driver location
UPDATE drivers 
SET location = ST_Point(-74.0060, 40.7128)
WHERE id = 1;

-- Find nearest 5 drivers
SELECT id, name, 
       ST_Distance(location, ST_Point(-74.0065, 40.7140)) as distance
FROM drivers
WHERE ST_DWithin(
    location,
    ST_Point(-74.0065, 40.7140),
    0.02  -- ~2km (in degrees)
)
ORDER BY distance
LIMIT 5;

Radius Search with PostGIS:

Click to view code (sql)
-- Find all drivers within 2km (using haversine formula)
SELECT 
    id,
    name,
    ST_Distance(
        location::geography,
        ST_Point(-74.0065, 40.7140)::geography
    ) / 1000 as distance_km
FROM drivers
WHERE ST_DWithin(
    location::geography,
    ST_Point(-74.0065, 40.7140)::geography,
    2000  -- 2000 meters
)
ORDER BY distance_km
LIMIT 10;

Complex Spatial Queries:

Click to view code (sql)
-- Find drivers within polygon (e.g., Manhattan boundaries)
SELECT id, name
FROM drivers
WHERE ST_Contains(
    ST_GeomFromText('POLYGON((-74.0265 40.6960, -73.9262 40.6960, -73.9262 40.8895, -74.0265 40.8895, -74.0265 40.6960))'),
    location
);

-- Find drivers near a route
SELECT id, name
FROM drivers
WHERE ST_DWithin(
    location,
    ST_GeomFromText('LINESTRING(-74.0 40.7, -74.0 40.8)'),
    0.02  -- 2km buffer
);

-- Find drivers in delivery zones with highest concentration
SELECT zone_id, COUNT(*) as driver_count
FROM drivers d
JOIN delivery_zones z ON ST_Contains(z.boundary, d.location)
GROUP BY zone_id
ORDER BY driver_count DESC;

Python Integration:

Click to view code (python)
from psycopg2 import sql
from psycopg2.extras import RealDictCursor

class PostGISDriver:
    def __init__(self, db_connection):
        self.conn = db_connection
    
    def update_driver_location(self, driver_id, lat, lng):
        with self.conn.cursor() as cur:
            cur.execute(
                """
                UPDATE drivers 
                SET location = ST_Point(%s, %s),
                    updated_at = NOW()
                WHERE id = %s
                """,
                (lng, lat, driver_id)  # Note: Point(lng, lat)
            )
        self.conn.commit()
    
    def find_nearby_drivers(self, rider_lat, rider_lng, radius_km=2):
        with self.conn.cursor(cursor_factory=RealDictCursor) as cur:
            cur.execute(
                """
                SELECT 
                    id,
                    name,
                    ST_Distance(
                        location::geography,
                        ST_Point(%s, %s)::geography
                    ) / 1000 as distance_km
                FROM drivers
                WHERE ST_DWithin(
                    location::geography,
                    ST_Point(%s, %s)::geography,
                    %s
                )
                AND updated_at > NOW() - INTERVAL '5 minutes'
                ORDER BY distance_km
                LIMIT 10
                """,
                (rider_lng, rider_lat, rider_lng, rider_lat, radius_km * 1000)
            )
            return cur.fetchall()
    
    def find_drivers_in_zone(self, zone_id):
        with self.conn.cursor(cursor_factory=RealDictCursor) as cur:
            cur.execute(
                """
                SELECT d.id, d.name, d.location
                FROM drivers d
                JOIN zones z ON z.id = %s
                WHERE ST_Contains(z.boundary, d.location)
                """,
                (zone_id,)
            )
            return cur.fetchall()

# Usage
postgis = PostGISDriver(db_connection)
postgis.update_driver_location(1, 40.7128, -74.0060)

nearby = postgis.find_nearby_drivers(40.7140, -74.0065, radius_km=2)
for driver in nearby:
    print(f"{driver['name']}: {driver['distance_km']:.2f}km away")

Spatial Index Types

Index TypeUse CaseSpeed
GISTGeneral purpose, mixed queriesGood
BRINLarge tables, sequential dataVery Fast
SPGISTNon-traditional spacesGood
GINFull-text + spatialVery Fast

Pros

  • ✅ Production-grade, battle-tested
  • ✅ Very accurate spatial calculations
  • ✅ Complex spatial operations (intersect, contains, etc.)
  • ✅ ACID transactions with spatial data
  • ✅ Excellent query optimization
  • ✅ Persistent and reliable
  • ✅ Good performance with millions of points

Cons

  • ❌ Requires PostgreSQL infrastructure
  • ❌ Updates require disk I/O (slower than in-memory)
  • ❌ Complex setup and tuning
  • ❌ Overkill for simple radius searches
  • ❌ Network latency for remote queries
  • ❌ Scaling requires sharding/replication

When to Use

  • Production location services (Uber, Google Maps)
  • Complex spatial queries
  • Large datasets (100M+ points)
  • Need for historical data and analytics
  • Enterprise systems requiring reliability

4. Elasticsearch (Geo Spatial)

Concept

Elasticsearch is a search engine with built-in geospatial indexing, great for combining full-text search with location.

Click to view code (json)
// Elasticsearch mapping
{
  "mappings": {
    "properties": {
      "driver_name": { "type": "text" },
      "driver_rating": { "type": "float" },
      "location": { "type": "geo_point" },  // Spatial data type
      "vehicle_type": { "type": "keyword" }
    }
  }
}

How It Works

Indexing:

Click to view code (python)
from elasticsearch import Elasticsearch

es = Elasticsearch(['localhost:9200'])

# Create index with geo mapping
es.indices.create(
    index='drivers',
    body={
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 1
        },
        "mappings": {
            "properties": {
                "driver_id": {"type": "keyword"},
                "name": {"type": "text"},
                "rating": {"type": "float"},
                "location": {
                    "type": "geo_point"
                },
                "vehicle_type": {
                    "type": "keyword",
                    "fields": {
                        "raw": {"type": "keyword"}
                    }
                },
                "is_available": {"type": "boolean"},
                "updated_at": {"type": "date"}
            }
        }
    }
)

# Index a driver
es.index(
    index='drivers',
    id='D1',
    body={
        'driver_id': 'D1',
        'name': 'John Doe',
        'rating': 4.8,
        'location': {
            'lat': 40.7128,
            'lon': -74.0060
        },
        'vehicle_type': 'sedan',
        'is_available': True,
        'updated_at': '2025-12-27T10:30:00Z'
    }
)

Radius Search:

Click to view code (python)
# Find drivers within 2km
search_result = es.search(
    index='drivers',
    body={
        "query": {
            "bool": {
                "must": [
                    {
                        "geo_distance": {
                            "distance": "2km",
                            "location": {
                                "lat": 40.7140,
                                "lon": -74.0065
                            }
                        }
                    },
                    {
                        "term": {
                            "is_available": True
                        }
                    }
                ]
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    "location": {
                        "lat": 40.7140,
                        "lon": -74.0065
                    },
                    "order": "asc",
                    "unit": "km"
                }
            }
        ],
        "size": 10
    }
)

# Process results
for hit in search_result['hits']['hits']:
    driver = hit['_source']
    distance = hit['sort'][0]  # Distance in km
    print(f"{driver['name']}: {distance:.2f}km away, Rating: {driver['rating']}")

Combined Geo + Text Search:

Click to view code (python)
# Find "premium" drivers (rating > 4.5) within 3km with keyword match
search_result = es.search(
    index='drivers',
    body={
        "query": {
            "bool": {
                "must": [
                    {
                        "geo_distance": {
                            "distance": "3km",
                            "location": {
                                "lat": 40.7140,
                                "lon": -74.0065
                            }
                        }
                    },
                    {
                        "multi_match": {
                            "query": "sedan suv",
                            "fields": ["vehicle_type^2", "name"]
                        }
                    }
                ],
                "filter": [
                    {
                        "range": {
                            "rating": {"gte": 4.5}
                        }
                    }
                ]
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    "location": {
                        "lat": 40.7140,
                        "lon": -74.0065
                    },
                    "order": "asc"
                }
            },
            {"rating": {"order": "desc"}}  # Then by rating
        ]
    }
)

Aggregations (Analytics):

Click to view code (python)
# Get driver distribution by zones
search_result = es.search(
    index='drivers',
    body={
        "aggs": {
            "drivers_by_zone": {
                "geohash_grid": {
                    "field": "location",
                    "precision": 6  # ~1.2km cells
                },
                "aggs": {
                    "avg_rating": {
                        "avg": {"field": "rating"}
                    },
                    "vehicle_types": {
                        "terms": {"field": "vehicle_type", "size": 5}
                    }
                }
            }
        },
        "size": 0
    }
)

# Analyze
for bucket in search_result['aggregations']['drivers_by_zone']['buckets']:
    print(f"Zone: {bucket['key']}")
    print(f"  Drivers: {bucket['doc_count']}")
    print(f"  Avg Rating: {bucket['avg_rating']['value']:.2f}")

Geo Query Types

Query TypeUse CaseExample
geo_distanceFind within radius2km radius
geoboundingboxRectangle searchNYC bounds
geo_polygonIrregular zoneDelivery zone
geohash_gridAggregation by gridHeatmap

Pros

  • ✅ Combines full-text + location search
  • ✅ Horizontal scaling (distributed)
  • ✅ Real-time indexing
  • ✅ Powerful aggregations
  • ✅ Analytics and visualization ready
  • ✅ REST API (language agnostic)
  • ✅ High availability built-in

Cons

  • ❌ Requires cluster management
  • ❌ Higher memory footprint than databases
  • ❌ Query results are approximate
  • ❌ Complex tuning and optimization
  • ❌ More operational overhead
  • ❌ Overkill if you don't need full-text search

When to Use

  • Ride-sharing with search (find drivers + filter by rating)
  • Job listings (location + job title/skills)
  • Real estate (location + amenities search)
  • E-commerce (location + product search)
  • Analytics/Dashboards with spatial data

Comparison: Head-to-Head

Accuracy

Click to view code
PostGIS      ████████████████████ (Perfect - WGS84)
Elasticsearch ██████████░░░░░░░░░░ (Very Good)
Quad Tree    ██████████░░░░░░░░░░ (Depends on depth)
Geohash      ████████░░░░░░░░░░░░ (Good, limited precision)

Query Speed

Click to view code
Quad Tree    ████████████████░░░░ (O(log n), in-memory)
Geohash      ████████████████░░░░ (O(1), + distance filter)
Elasticsearch ██████████░░░░░░░░░░ (Fast, distributed)
PostGIS      ██████████░░░░░░░░░░ (Very fast, disk I/O)

Scalability

Click to view code
Elasticsearch ████████████████░░░░ (Horizontal)
PostGIS      ███████░░░░░░░░░░░░░ (Sharding needed)
Geohash      ████████████████░░░░ (Partition by hash)
Quad Tree    ███░░░░░░░░░░░░░░░░░ (Single machine)

Implementation Complexity

Click to view code
Geohash      ████░░░░░░░░░░░░░░░░ (Very simple)
Elasticsearch ██████░░░░░░░░░░░░░░ (Moderate)
PostGIS      ████████░░░░░░░░░░░░ (Moderate-High)
Quad Tree    ████████████░░░░░░░░ (Complex)

Cost (Infrastructure)

Click to view code
Quad Tree    ██░░░░░░░░░░░░░░░░░░ (Single box)
Geohash      ███░░░░░░░░░░░░░░░░░ (Minimal overhead)
PostGIS      ███████░░░░░░░░░░░░░ (PostgreSQL license)
Elasticsearch ██████████░░░░░░░░░░ (Cluster + memory)

Decision Matrix

Choose Quad Tree if:

  • Real-time, in-memory (<100M points)
  • Need sub-100ms latency
  • Building a game engine or collision detection
  • Custom requirements

Choose Geohash if:

  • Mobile app needs offline capability
  • Distributed system with easy partitioning
  • Approximate location is acceptable
  • Memory efficiency critical

Choose PostGIS if:

  • Production system with millions of drivers
  • Need complex spatial operations
  • Already using PostgreSQL
  • Historical data and analytics important

Choose Elasticsearch if:

  • Need combined full-text + location search
  • Building user-facing search interface
  • Need real-time aggregations/analytics
  • Want to scale horizontally easily

Hybrid Approaches

Uber's Architecture (Estimated)

Click to view code
Mobile App
    ↓
  Geohash (for caching)
    ↓
Redis (geohash-based buckets)
    ↓
Quad Tree (in-memory for real-time)
    ↓
PostgreSQL + PostGIS (persistent storage)

Netflix: Recommendations

Click to view code
User Location (Geohash)
    ↓
Elasticsearch (Search by genre + location)
    ↓
Custom ML model (Personalized recommendations)

Google Maps

Click to view code
User Query (Full-text)
    ↓
Elasticsearch (Search results)
    ↓
PostGIS (Route optimization)
    ↓
Quad Tree (Visualization tiles)

Performance Benchmarks

Click to view code
Scenario: Find 100 drivers within 2km of location
Dataset: 10M drivers across US

Quad Tree:        45ms  (after loading into memory)
Geohash:          80ms  (including distance filter)
PostGIS:         150ms  (network + query)
Elasticsearch:   200ms  (distributed, added overhead)

Summary Table: Feature Capabilities

FeatureQuad TreeGeohashPostGISElasticsearch
Radius Search✅ Excellent⚠️ Good✅ Excellent✅ Excellent
Range Search⚠️ Good✅ Excellent✅ Excellent✅ Excellent
Polygon Search❌ No❌ No✅ Excellent✅ Good
Distance Calculation⚠️ Approximate❌ Requires post-filter✅ Exact✅ Exact
Full-text Search❌ No❌ No❌ No✅ Excellent
Aggregation⚠️ Limited⚠️ Limited✅ Good✅ Excellent
ACID Transactions❌ No⚠️ Limited✅ Full❌ No
Clustering❌ No✅ Yes✅ Yes✅ Yes
Real-time Updates✅ Excellent✅ Excellent⚠️ Good✅ Excellent
Memory Efficient❌ No✅ Yes✅ Yes⚠️ Moderate