GeoDataViewer Team

Extracting POIs from OpenStreetMap and Overture Maps: A Practical Guide

Learn how to extract EV charging stations and gas stations for New York City using osmium on OSM data, and compare the workflow with Overture Maps' GeoParquet pipeline.

Point-of-interest (POI) extraction is one of the most common tasks in modern geospatial data engineering. Whether you are building a routing application, analyzing retail density, or mapping infrastructure, you need a reliable pipeline to pull location-tagged features from large-scale vector datasets.

This guide demonstrates two approaches to the same problem: extracting EV charging stations and gas stations within a bounding box around New York City. We will first use osmium to process raw OpenStreetMap (OSM) data, then run the equivalent query against Overture Maps’ cloud-native GeoParquet release.


1. The Data Sources

Source | Format | Update Frequency | License
OpenStreetMap | XML / PBF (Protocolbuffer Binary Format) | Real-time (minutely diffs) | ODbL
Overture Maps | GeoParquet (cloud-optimized) | Monthly releases | Varies by theme (ODbL for OSM-derived themes; CDLA Permissive 2.0 for places)

OSM is the gold standard for community-edited vector data, but it requires preprocessing. Overture Maps is a curated, conflated dataset produced by a consortium (Amazon, Esri, Meta, Microsoft, TomTom) that normalizes OSM and other sources into a unified schema.


2. Introduction to osmium-tool

osmium-tool is a fast, multi-purpose command-line utility for processing OpenStreetMap data in PBF or XML format. Built on the high-performance libosmium C++ library, it is the standard tool in the OSM ecosystem for tasks that do not require a full database.

Key capabilities relevant to this guide:

Command | Purpose
osmium extract | Clip a PBF file to a bounding box or polygon boundary
osmium tags-filter | Extract objects matching specific tag patterns (e.g., amenity=fuel)
osmium export | Convert OSM data to GeoJSON and other text-based formats
osmium merge | Combine multiple PBF files into one
osmium apply-changes | Apply OSM change files (minutely / hourly / daily diffs) to keep an extract current
osmium check-refs | Validate referential integrity (ways → nodes, relations → members)

osmium-tool streams data rather than loading the entire file into memory, which makes it suitable for processing multi-gigabyte regional extracts on modest hardware. It is the closest equivalent to ogr2ogr for OSM-native workflows.


3. Extracting POIs with osmium

3.1 Prerequisites

Install osmium-tool (no other converter is needed for this guide):

# macOS
brew install osmium-tool

# Ubuntu / Debian
sudo apt-get install osmium-tool

# Arch Linux
sudo pacman -S osmium-tool

You will also need a bounding box for New York City. The approximate WGS84 extent is:

West:   -74.30
South:   40.48
East:   -73.68
North:   40.92
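
If you script later steps, it can help to keep the extent in one place. The following is a minimal sketch, not part of any library: the BBox class and its method names are hypothetical, and it only assumes the west, south, east, north ordering used throughout this guide.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in WGS84 degrees, ordered west, south, east, north."""
    west: float
    south: float
    east: float
    north: float

    def as_cli_arg(self) -> str:
        """Format as 'west,south,east,north' with two decimals."""
        return ",".join(f"{value:.2f}" for value in self)

    def contains(self, lon: float, lat: float) -> bool:
        """Return True when the point lies inside or on the edge of the box."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north


# The approximate NYC extent listed above.
NYC_BBOX = BBox(west=-74.30, south=40.48, east=-73.68, north=40.92)
```

NYC_BBOX.as_cli_arg() yields the comma-separated string that the osmium and overturemaps commands in this guide expect, and contains() is handy for sanity checks on the output.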

3.2 Downloading the OSM Extract

Geofabrik provides continent- and country-level extracts. For the United States, download the Northeast region:

wget https://download.geofabrik.de/north-america/us/northeast-latest.osm.pbf

This file includes all OSM data for the northeastern US (~2–3 GB). For a production pipeline, consider applying OSM change files on a schedule (for example with osmium apply-changes, or with pyosmium-up-to-date from the pyosmium package) to keep the extract current.

3.3 Extracting the NYC Bounding Box

Instead of processing the entire region, extract a subset first. This reduces memory pressure and speeds up tag filtering:

osmium extract \
  --bbox="-74.30,40.48,-73.68,40.92" \
  --strategy=smart \
  --output=nyc.osm.pbf \
  northeast-latest.osm.pbf

The --strategy=smart flag tells osmium to preserve complete multipolygon relations even when only part of the geometry falls inside the bbox — critical for administrative boundaries and large buildings.

3.4 Tag-Based POI Filtering

OSM stores POI semantics in tags. The keys we care about are:

POI Type | Primary Tag
EV Charging Station | amenity=charging_station
Gas Station | amenity=fuel

Run osmium’s tag filter to extract nodes, ways, and relations matching these tags:

# Extract EV charging stations
osmium tags-filter \
  nyc.osm.pbf \
  n/amenity=charging_station \
  w/amenity=charging_station \
  r/amenity=charging_station \
  --overwrite \
  --output=nyc_charging_stations.osm.pbf

# Extract gas stations
osmium tags-filter \
  nyc.osm.pbf \
  n/amenity=fuel \
  w/amenity=fuel \
  r/amenity=fuel \
  --overwrite \
  --output=nyc_gas_stations.osm.pbf

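Prefixing with n/, w/, and r/ restricts the filter to nodes, ways, and relations respectively. Charging stations are usually mapped as nodes, but some are mapped as ways (building or site outlines) or relations (complex sites), so filtering all three object types is safest. By default, tags-filter also keeps the nodes referenced by matching ways and relations, which is what allows their geometries to be built during export.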

OSM Tag Attributes for These POIs

Raw OSM data stores POI semantics as free-form key-value pairs. Below are the most useful tags for our two target categories. These tags are community-contributed; completeness varies by region.

EV Charging Station (amenity=charging_station)

Tag | Description | Example
amenity | POI category (fixed) | charging_station
operator | Operating company | Tesla, ChargePoint, EVgo
brand | Brand name | Tesla Supercharger, Electrify America
capacity | Number of charging points | 8
socket:type2 | Count of Type 2 (Mennekes) connectors | 4
socket:chademo | Count of CHAdeMO connectors | 2
socket:tesla_supercharger | Count of Tesla Supercharger stalls | 8
socket:ccs | Count of CCS connectors | 4
voltage | Supply voltage | 400
amperage | Maximum current | 350
output | Power output in kW | 250 kW
fee | Is a fee required? | yes / no
opening_hours | Hours of operation | 24/7
payment:* | Accepted payment methods | payment:app=yes
network | Charging network identifier | ChargePoint Network
access | Access restriction | yes / customers / private

Gas Station (amenity=fuel)

Tag | Description | Example
amenity | POI category (fixed) | fuel
brand | Fuel brand | Shell, BP, Exxon
operator | Operating company | Shell Oil Company
fuel:diesel | Diesel availability | yes
fuel:octane_95 | 95 octane petrol availability | yes
fuel:octane_98 | 98 octane petrol availability | yes
fuel:e10 | E10 ethanol blend availability | yes
fuel:electric | On-site EV charging | yes
opening_hours | Hours of operation | Mo-Su 06:00-23:00
payment:* | Accepted payment methods | payment:credit_cards=yes
self_service | Self-service pumps available? | yes / no
car_wash | Car wash on site? | yes / no
shop | Convenience store attached? | convenience

Note: Because OSM tags are crowdsourced, not every station will have all of these fields. Production pipelines should treat all tags as optional and handle missing keys gracefully.
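
One way to handle missing keys gracefully is to normalize each feature's properties into a fixed shape, with None for anything absent or unparseable. The sketch below is illustrative only: the function names and the output field names are hypothetical, and it assumes the tags arrive as a flat dict of strings, as in the properties of an exported GeoJSON feature.

```python
def parse_int(value):
    """Return value as an int, or None when it is missing or not a plain integer."""
    try:
        return int(str(value).strip())
    except ValueError:
        return None


def normalize_charging_station(props):
    """Map free-form OSM tags onto a fixed record; every field is optional."""
    fee_map = {"yes": True, "no": False}
    sockets = {}
    for key, value in props.items():
        # Only count keys such as socket:type2, not sub-keys like socket:type2:output.
        if key.startswith("socket:") and key.count(":") == 1:
            sockets[key.split(":")[1]] = parse_int(value)
    return {
        "operator": props.get("operator"),
        "brand": props.get("brand"),
        "capacity": parse_int(props.get("capacity")),
        "fee": fee_map.get(props.get("fee")),  # None when the tag is absent or unusual
        "sockets": sockets,
    }
```

Values that do not parse (for example capacity="8 stalls") become None rather than raising, so one badly tagged station does not stop the run.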

3.5 Converting to GeoJSON for Visualization

The resulting .osm.pbf files are not directly viewable in most GIS tools. Convert them to GeoJSON using osmium export:

osmium export \
  --geometry-types=point,polygon \
  --add-unique-id=type_id \
  --output-format=geojson \
  --output=nyc_charging_stations.geojson \
  nyc_charging_stations.osm.pbf

osmium export \
  --geometry-types=point,polygon \
  --add-unique-id=type_id \
  --output-format=geojson \
  --output=nyc_gas_stations.geojson \
  nyc_gas_stations.osm.pbf

The --geometry-types option filters the output by geometry type; it does not convert geometries. Passing only point would silently drop every station mapped as a way or relation, so point,polygon keeps nodes as points and closed ways or multipolygons as polygons. osmium export does not compute centroids, so if your analysis needs exactly one point per POI, derive a representative point from each polygon in a later step (for example in GeoPandas or QGIS).
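
If the exported GeoJSON contains polygon features and you want one point per POI, the reduction can be done after export. The following is a minimal standard-library sketch under a few assumptions: the helper names are hypothetical, the centroid is computed in plain lon/lat coordinates (a reasonable approximation at building scale), and for a MultiPolygon only the first polygon's outer ring is used.

```python
def ring_centroid(ring):
    """Area-weighted centroid of a closed ring of [x, y] pairs (shoelace formula)."""
    area2 = cx = cy = 0.0
    for p, q in zip(ring, ring[1:]):
        cross = p[0] * q[1] - q[0] * p[1]
        area2 += cross
        cx += (p[0] + q[0]) * cross
        cy += (p[1] + q[1]) * cross
    if area2 == 0:
        # Degenerate ring (zero area): fall back to the mean of the vertices.
        n = len(ring)
        return [sum(p[0] for p in ring) / n, sum(p[1] for p in ring) / n]
    return [cx / (3 * area2), cy / (3 * area2)]


def to_point_feature(feature):
    """Return a copy of the feature with polygon geometry replaced by its centroid."""
    geometry = feature.get("geometry") or {}
    if geometry.get("type") == "Polygon":
        ring = geometry["coordinates"][0]
    elif geometry.get("type") == "MultiPolygon":
        ring = geometry["coordinates"][0][0]
    else:
        return feature  # points and other geometry types pass through unchanged
    point = {"type": "Point", "coordinates": ring_centroid(ring)}
    return {**feature, "geometry": point}
```

Applying to_point_feature to every entry in a FeatureCollection's features list gives a points-only layer while keeping each feature's properties intact.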

Quick preview: If you are working inside VS Code, drag the resulting .geojson into the editor and press Ctrl/Cmd + Alt + M to open it in the Geo Data Viewer Fast extension — instant Kepler.gl map preview without leaving your workspace.

3.6 One-Command Pipeline

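For automated pipelines, chain the steps so that each one runs only if the previous one succeeded:

osmium extract \
  --bbox="-74.30,40.48,-73.68,40.92" \
  --strategy=smart \
  --overwrite \
  --output=nyc.osm.pbf \
  northeast-latest.osm.pbf \
  && osmium tags-filter \
  nyc.osm.pbf \
  n/amenity=charging_station w/amenity=charging_station r/amenity=charging_station \
  --overwrite \
  --output=nyc_charging_stations.osm.pbf \
  && osmium export \
  --geometry-types=point,polygon \
  --overwrite \
  --output=nyc_charging_stations.geojson \
  nyc_charging_stations.osm.pbf

The intermediate files are deliberate. osmium tags-filter reads its input more than once to collect the nodes referenced by matching ways and relations, so it cannot read from a pipe unless --omit-referenced is given, and that option would strip the node locations needed to build way geometries.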
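
For pipelines driven from Python, the same three stages can be run as separate processes with named intermediate files. This is a sketch, not a reference implementation: build_commands and run_pipeline are hypothetical helpers, the file naming scheme is an assumption, and the script expects the osmium binary to be on PATH.

```python
import subprocess


def build_commands(source_pbf, bbox, tag, prefix):
    """Return the extract, tags-filter, and export commands as argument lists."""
    clipped = f"{prefix}.osm.pbf"
    filtered = f"{prefix}_filtered.osm.pbf"
    geojson = f"{prefix}.geojson"
    return [
        ["osmium", "extract", f"--bbox={bbox}", "--strategy=smart",
         "--overwrite", f"--output={clipped}", source_pbf],
        ["osmium", "tags-filter", clipped,
         f"n/{tag}", f"w/{tag}", f"r/{tag}",
         "--overwrite", f"--output={filtered}"],
        ["osmium", "export", "--overwrite", f"--output={geojson}", filtered],
    ]


def run_pipeline(commands, dry_run=False):
    """Run each command in order; stop at the first failure."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
```

A call such as run_pipeline(build_commands("northeast-latest.osm.pbf", "-74.30,40.48,-73.68,40.92", "amenity=charging_station", "nyc_charging"), dry_run=True) prints the commands without executing them, which is useful for reviewing a scheduled job.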


4. Extracting the Same POIs from Overture Maps

4.1 What Is Overture Maps?

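Overture Maps is a cloud-native geospatial dataset built by normalizing and conflating multiple sources: OpenStreetMap for themes such as buildings and transportation, plus data contributed by partners such as Meta, Microsoft, Esri, and TomTom. The places theme used in this guide comes mainly from partner datasets rather than OSM, so expect its POIs to differ from an OSM extract. Data is released monthly as GeoParquet files hosted on AWS S3, partitioned by theme and type.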

Key advantages over raw OSM:

  • Unified schema: Tags are normalized into a consistent categories taxonomy
  • Conflation: Duplicate features from multiple sources are merged
  • Cloud-native: Directly queryable via DuckDB or Python without downloading full regions
  • No preprocessing: No need to run osmium extract or osmium tags-filter

Overture Maps Schema Attributes

Overture normalizes source data into a strict schema. The places theme (which contains POIs like charging stations and gas stations) exposes the following columns:

Column | Type | Description
id | string | Unique Overture feature ID
geometry | GeoParquet WKB | Point geometry (WGS84)
bbox | struct | Spatial bounding box (xmin, xmax, ymin, ymax) used for row-group pruning
confidence | double | Confidence score (0.0–1.0) from conflation
categories | struct | primary category + alternate array of secondary categories
names | struct | primary name + common array + rules for localization
addresses | array | Address records (street, locality, region, postcode, country)
phones | array | Phone numbers
emails | array | Contact emails
websites | array | URLs
socials | array | Social media links
opening_hours | array | Structured opening-hours rules
brand | struct | names + wikidata reference
operator | struct | names + wikidata reference
sources | array | Provenance: source dataset, record ID, and confidence per source
subtype | string | Feature sub-classification

For EV charging stations and gas stations, the critical fields are:

Overture Field | EV Charging Station | Gas Station
categories.primary | electric_vehicle_charging_station | gas_station
names.primary | Station or location name | Station or location name
brand | Charging network brand | Fuel brand
operator | Operating company | Operating company
addresses | Street address | Street address
confidence | Conflation confidence | Conflation confidence

Unlike OSM’s flat tag model, Overture’s schema is typed and hierarchical. Missing values are represented as NULL rather than absent keys, which simplifies downstream analytics.

4.2 Method A: DuckDB + Spatial Extension

The fastest way to query Overture Maps is via DuckDB with the spatial extension. DuckDB can read GeoParquet directly from S3 using HTTP range requests.

Install DuckDB:

brew install duckdb        # macOS
# Or download from https://duckdb.org/docs/installation/

Run the query:

INSTALL spatial;
LOAD spatial;

INSTALL httpfs;
LOAD httpfs;

-- The Overture bucket lives in us-west-2
SET s3_region = 'us-west-2';

-- Query EV charging stations in NYC bbox
SELECT
    id,
    names.primary AS name,
    categories.primary AS category,
    confidence,
    ST_X(geometry) AS lon,
    ST_Y(geometry) AS lat
FROM read_parquet(
    's3://overturemaps-us-west-2/release/2025-04-23.0/theme=places/type=place/*',
    hive_partitioning = true
)
WHERE
    bbox.xmin > -74.30
    AND bbox.xmax < -73.68
    AND bbox.ymin > 40.48
    AND bbox.ymax < 40.92
    AND categories.primary = 'electric_vehicle_charging_station';

The bbox column is a struct of xmin, xmax, ymin, and ymax. Because Parquet stores min/max statistics for each row group, DuckDB can skip row groups whose bbox values fall entirely outside the filter instead of reading every row. This plays the role that a spatial index plays in a database.

For gas stations, change only the category:

    AND categories.primary = 'gas_station'

Export results to GeoJSON:

COPY (
    SELECT
        id,
        names.primary AS name,
        categories.primary AS category,
        confidence,
        geometry
    FROM read_parquet(
        's3://overturemaps-us-west-2/release/2025-04-23.0/theme=places/type=place/*',
        hive_partitioning = true
    )
    WHERE
        bbox.xmin > -74.30
        AND bbox.xmax < -73.68
        AND bbox.ymin > 40.48
        AND bbox.ymax < 40.92
        AND categories.primary IN ('electric_vehicle_charging_station', 'gas_station')
) TO 'nyc_pois_overture.geojson'
WITH (FORMAT GDAL, DRIVER 'GeoJSON');
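
When the same query is needed for several releases, extents, or categories, it can be generated rather than copied. The helper below is a hypothetical sketch: build_places_query is not part of DuckDB or Overture, and it assumes the xmin/xmax/ymin/ymax bbox field names and the S3 path layout shown in this guide. It only builds the SQL text; running it is left to whichever DuckDB client you use.

```python
def build_places_query(release, bbox, categories):
    """Build a SELECT over the Overture places theme for one bbox and category list."""
    west, south, east, north = bbox
    # Quote each category as a SQL string literal, doubling any single quotes.
    quoted = ", ".join("'" + c.replace("'", "''") + "'" for c in categories)
    path = (
        "s3://overturemaps-us-west-2/release/"
        f"{release}/theme=places/type=place/*"
    )
    return (
        "SELECT id, names.primary AS name, categories.primary AS category,\n"
        "       confidence, geometry\n"
        f"FROM read_parquet('{path}', hive_partitioning = true)\n"
        f"WHERE bbox.xmin > {west} AND bbox.xmax < {east}\n"
        f"  AND bbox.ymin > {south} AND bbox.ymax < {north}\n"
        f"  AND categories.primary IN ({quoted})"
    )


query = build_places_query(
    "2025-04-23.0",
    (-74.30, 40.48, -73.68, 40.92),
    ["electric_vehicle_charging_station", "gas_station"],
)
```

The resulting string can be pasted into the DuckDB shell or passed to the duckdb Python package after the spatial and httpfs extensions are loaded and the S3 region is set.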

4.3 Method B: Python with overturemaps-py

For Python-centric workflows, install the official overturemaps package:

pip install overturemaps

Query and export:

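from overturemaps import core

# Define NYC bounding box (west, south, east, north)
bbox = (-74.30, 40.48, -73.68, 40.92)

# Download places within bbox as a GeoDataFrame
gdf = core.geodataframe("place", bbox=bbox)

TARGETS = {"electric_vehicle_charging_station", "gas_station"}


def primary_of(value):
    """Return the 'primary' entry of a struct column value, or None if missing."""
    return value.get("primary") if isinstance(value, dict) else None


# Flatten the nested structs into plain columns
gdf["category"] = gdf["categories"].apply(primary_of)
gdf["name"] = gdf["names"].apply(primary_of)

# Filter for charging stations and gas stations
pois = gdf[gdf["category"].isin(TARGETS)]

# Keep flat columns only: nested structs and arrays may not serialize cleanly to GeoJSON
pois = pois[["id", "name", "category", "confidence", "geometry"]]

# Save to GeoJSON
pois.to_file("nyc_pois_overture.geojson", driver="GeoJSON")

print(f"Extracted {len(pois)} POIs")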

overturemaps.core.geodataframe handles S3 partitioning, Parquet reading, and geometry reconstruction automatically. The result is a GeoDataFrame ready for analysis or export.

VS Code workflow: If you saved the output as .geojson (via pois.to_file(...)), you can preview it directly in VS Code with the Geo Data Viewer Fast extension — no need to switch to a browser.

4.4 Method C: Command-Line Interface

For shell-based pipelines, use the overturemaps CLI:

pip install overturemaps

overturemaps download \
  --bbox="-74.30,40.48,-73.68,40.92" \
  -f geojson \
  --type=place \
  -o nyc_places.geojson

Then filter locally with jq. This assumes the CLI writes categories as a nested JSON object inside each feature's properties:

# Keep only charging stations and gas stations
jq '.features |= map(select(
      (.properties.categories.primary? // "") as $c
      | $c == "electric_vehicle_charging_station" or $c == "gas_station"))' \
  nyc_places.geojson > nyc_pois_overture.geojson
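
If you would rather not depend on an extra command-line tool, the same filter can be written with Python's standard library. This is a sketch under assumptions: the helper names are hypothetical, and it accepts categories either as a nested object or as a JSON-encoded string, since the exact serialization may depend on the tool that produced the file.

```python
import json

TARGETS = {"electric_vehicle_charging_station", "gas_station"}


def primary_category(properties):
    """Return categories.primary from a feature's properties, or None."""
    categories = (properties or {}).get("categories")
    if isinstance(categories, str):
        # Some writers store nested structs as JSON text; try to decode it.
        try:
            categories = json.loads(categories)
        except json.JSONDecodeError:
            return None
    if isinstance(categories, dict):
        return categories.get("primary")
    return None


def filter_collection(collection, targets=TARGETS):
    """Return a copy of a GeoJSON FeatureCollection keeping only target categories."""
    kept = [
        feature
        for feature in collection.get("features", [])
        if primary_category(feature.get("properties")) in targets
    ]
    return {**collection, "features": kept}
```

To use it, load nyc_places.geojson with json.load, pass the result through filter_collection, and write the returned dict back out with json.dump.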

Note: The --type=place flag targets the places theme. Other available types include building, division, segment, and connector.


5. Comparing the Two Approaches

Dimension | osmium (Raw OSM) | Overture Maps
Data freshness | Real-time (minutely diffs) | Monthly releases
Preprocessing required | Yes: download, extract, tags-filter | No: query directly from S3
Schema consistency | Tag-dependent (community-defined) | Normalized categories taxonomy
Conflation | None (raw source data) | Yes (deduplicated across sources)
Query language | osmium CLI / C++ API | DuckDB SQL / Python / CLI
Output format | OSM XML / PBF / GeoJSON | GeoParquet / GeoJSON / GeoDataFrame
Best for | Real-time applications, custom tag logic | Analytics, rapid prototyping, conflated datasets

When to Choose osmium

  • You need minute-level freshness (e.g., tracking newly opened charging stations)
  • You are working with custom OSM tags not yet normalized by Overture
  • You are building a custom extract pipeline for a specific region
  • You need the full OSM object graph (ways, relations, member references)

When to Choose Overture Maps

  • You want immediate queryability without downloading multi-gigabyte regions
  • You need a clean, conflated dataset for analysis or visualization
  • You are running exploratory analytics where schema consistency matters
  • You want cloud-native access without local storage constraints

6. Validating the Results

Regardless of the pipeline you choose, always validate the output before analysis:

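  1. Count check: As a rough expectation, NYC should have ~500–800 EV charging stations and ~1,000–1,500 gas stations (subject to mapping completeness); counts far outside these ranges usually point to a filter or bbox mistake
  2. Spatial check: Plot the points on a basemap. The bounding box used in this guide also covers parts of New Jersey and neighboring New York counties, so clip to the five-borough boundary polygon if you need NYC-only results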
  3. Attribute check: Inspect name, operator, brand, and capacity fields for charging stations; brand and opening_hours for gas stations
  4. Duplicate check: Overture’s conflation should reduce duplicates; raw OSM may contain multiple nodes for the same physical location
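
The count, spatial, and duplicate checks above can be partly automated. The following is a minimal sketch with a hypothetical summarize helper; it assumes point features in a GeoJSON features list and treats points that round to the same five-decimal coordinate (about one meter) as possible duplicates, which is a heuristic rather than a real conflation step.

```python
def summarize(features, bbox):
    """Count point features, points outside the bbox, and likely duplicates."""
    west, south, east, north = bbox
    count = outside = duplicates = 0
    seen = set()
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch only inspects point features
        lon, lat = geometry["coordinates"][:2]
        count += 1
        if not (west <= lon <= east and south <= lat <= north):
            outside += 1
        key = (round(lon, 5), round(lat, 5))
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"count": count, "outside_bbox": outside, "possible_duplicates": duplicates}
```

A non-zero outside_bbox value suggests the extract or query extent was not applied as intended, and a high possible_duplicates value is a cue to inspect those locations by hand.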

To inspect the GeoJSON visually, you have two options:

  • In VS Code: Open the .geojson file in your editor and press Ctrl/Cmd + Alt + M to launch the Geo Data Viewer Fast extension for an instant Kepler.gl preview.
  • In the browser: Drag the file into GeoDataViewer’s Studio to render the points on an interactive map with attribute tables.

7. Summary

Task | Recommended Tool | Command
Extract raw OSM POIs | osmium | osmium extract, osmium tags-filter, osmium export
Query cloud-native POIs | Overture Maps + DuckDB | SELECT ... FROM read_parquet('s3://...') WHERE categories.primary = ...
Python-based extraction | Overture Maps + Python | overturemaps.core.geodataframe("place", bbox=...)
Quick shell pipeline | Overture Maps CLI | overturemaps download --bbox=... --type=place

Both OpenStreetMap and Overture Maps are powerful, but they serve different stages of the data pipeline. Use osmium when you need surgical precision on the freshest raw data. Use Overture Maps when you want clean, analytics-ready data without the preprocessing overhead.

For viewing the extracted GeoJSON files, upload them to GeoDataViewer’s Studio for instant browser-based inspection — no QGIS required.
