The Evolution of GIS Data Loading: From Single Files to Modern Tile Architectures
A technical overview of how geospatial data delivery has evolved — from monolithic file formats through raster and vector tiles to modern single-file tile archives like PMTiles — and how to choose the right approach for your data.
Introduction
In 2005, Google Maps shipped a radical idea: instead of downloading a map, you streamed it. The browser fetched 256×256 pixel tiles as you panned and zoomed, assembling a seamless view from dozens of tiny images. It felt like magic — and it set off a chain reaction that would reshape how the entire GIS industry thinks about data delivery.
Twenty years later, that chain reaction has produced a rich ecosystem: single-file parsers, raster tile pyramids, vector tile pipelines, single-file tile archives, and on-the-fly tile servers. Each stage emerged to solve the problems created by the previous one. This article traces that evolution — not as a catalog of technologies, but as a story of constraints, breakthroughs, and the new constraints those breakthroughs created.
1. Phase One: The Single File (2000s–Present)
The problem: You have a dataset. You want to see it on a map. The simplest possible answer: parse the file, extract the geometries, draw them.
This is where every GIS practitioner starts. Drag a GeoJSON, Shapefile, KML, or GPX onto a map and watch it render. The model is beautifully simple — the file is the dataset, every attribute is available, and nothing stands between you and your data.
What it got right:
- Zero preprocessing. No build step, no tile generation, no server.
- Full attribute fidelity. Every column, every property, every metadata field survives the journey from disk to screen.
- Offline-first. A laptop in the field with no connectivity can still load and inspect local files.
The cracks appear around 50 MB. A browser must parse the entire file before rendering a single pixel. At 200 MB, a modest municipal building footprint dataset, you’re looking at multi-second parse times and hundreds of megabytes of memory pressure. Verbose formats like GML add XML overhead, and Shapefile adds multi-file binary decoding, which makes the problem worse. And without a spatial index, every pan and zoom re-renders from the full geometry set, so performance degrades linearly with feature count.
The single-file model didn’t break — it just hit a ceiling. For datasets under 50 MB and 100,000 features, it remains the right answer. But the world had bigger data, and that demanded a different approach.
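The ceiling described above can be checked before a file ever reaches the map. The following is a minimal sketch, assuming a GeoJSON FeatureCollection held in memory as text; the function name `inspect_geojson` and the two threshold constants are illustrative, with the thresholds taken from this article's rule of thumb rather than from any library.

```python
import json

# Rule-of-thumb limits from this article; tune them for your own hardware.
MAX_BYTES = 50 * 1024 * 1024
MAX_FEATURES = 100_000


def inspect_geojson(text: str) -> dict:
    """Parse a GeoJSON FeatureCollection and report whether it suits the single-file model."""
    size_bytes = len(text.encode("utf-8"))
    # The whole document is parsed up front, which is exactly the cost discussed above.
    collection = json.loads(text)
    features = collection.get("features", [])
    return {
        "size_bytes": size_bytes,
        "feature_count": len(features),
        "single_file_ok": size_bytes < MAX_BYTES and len(features) < MAX_FEATURES,
    }


sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [13.4, 52.5]},
            "properties": {"name": "waypoint"},
        }
    ],
})
report = inspect_geojson(sample)
print(report)
```

Because `json.loads` materializes every feature, the memory cost of this check grows with the file, which is the same limitation the browser faces.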
2. Phase Two: Raster Tiles (2005–Present)
The breakthrough: Don’t send the data. Send pre-rendered pictures of the data.
Google Maps didn’t invent tile-based mapping — the concept dates back to early GIS — but it proved the model at planetary scale. The idea was elegantly simple:
- Render the entire world into 256×256 pixel images at 20 zoom levels.
- Store them in a predictable URL pattern: /{z}/{x}/{y}.png.
- The client requests only the 12–20 tiles visible in the current viewport.
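The three steps above rest on a small piece of arithmetic: converting a longitude/latitude into tile indices at a given zoom. Below is a sketch of the widely used Web Mercator XYZ formula; the helper names and the `tiles.example.com` URL template are hypothetical and only serve the illustration.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Convert WGS 84 lon/lat to XYZ tile indices (Web Mercator, origin at top-left)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lon=180 or near-polar latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def viewport_tiles(west: float, south: float, east: float, north: float, zoom: int):
    """List the (z, x, y) tiles covering a bounding box that does not cross the antimeridian."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


def tile_url(template: str, z: int, x: int, y: int) -> str:
    """Fill a /{z}/{x}/{y} style template."""
    return template.format(z=z, x=x, y=y)


# Hypothetical endpoint, used only to show the URL shape.
TEMPLATE = "https://tiles.example.com/{z}/{x}/{y}.png"
for z, x, y in viewport_tiles(-180.0, -85.0, 180.0, 85.0, 1):
    print(tile_url(TEMPLATE, z, x, y))
```

The viewport determines the request count, not the dataset: the same bounding box yields the same handful of URLs whether the tiles show a neighborhood or a continent.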
What this solved:
- Scale independence. A global satellite imagery basemap and a neighborhood zoning map cost the same to render in the browser — 20 tile requests, composited by the GPU as textures.
- CDN compatibility. Every tile URL is immutable. Cache it at the edge, serve it to millions of users, never regenerate it.
- Predictable performance. The server did the heavy lifting at build time. The client just stitches images.
The new problem: Raster tiles are pictures. You cannot click a building to see its name. You cannot change the color of roads from gray to red without re-rendering the entire planet. You cannot toggle a layer on and off, because the layers are baked into the pixels. And at global scale, the storage cost is staggering: zoom level 14 alone is 4^14, roughly 268 million tiles, and each additional zoom level quadruples the count.
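The tile-count arithmetic behind the storage concern is easy to verify, since each zoom level splits every tile into four. This sketch computes per-level and cumulative counts; the average tile size passed to `estimate_storage_gb` is a made-up input for illustration, and real pyramids are often smaller because empty ocean tiles are skipped or deduplicated.

```python
def tiles_at_zoom(zoom: int) -> int:
    """Number of tiles in a full global grid at one zoom level (2^z by 2^z)."""
    return 4 ** zoom


def tiles_through_zoom(max_zoom: int) -> int:
    """Total tiles in a full pyramid from zoom 0 through max_zoom."""
    return sum(4 ** z for z in range(max_zoom + 1))


def estimate_storage_gb(max_zoom: int, avg_tile_kb: float) -> float:
    """Rough upper bound, assuming every tile exists and has the same average size."""
    return tiles_through_zoom(max_zoom) * avg_tile_kb / (1024 * 1024)


for z in (10, 12, 14):
    print(z, tiles_at_zoom(z), tiles_through_zoom(z))

# avg_tile_kb is an assumption; measure your own tiles before trusting the result.
print(round(estimate_storage_gb(14, avg_tile_kb=10.0)), "GB at an assumed 10 KB per tile")
```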
Raster tiles solved the scale problem but created an interactivity problem. The next phase would tackle that directly.
3. Phase Three: Vector Tiles (2013–Present)
The breakthrough: Send the geometry, not the picture. Let the client render it.
Mapbox launched the Mapbox Vector Tile (MVT) specification in 2013, and it fundamentally changed the equation. Instead of pre-rendered PNGs, the server sends Protocol Buffer-encoded geometry (points, lines, and polygons with a subset of attributes) on the same z/x/y tile grid, with coordinates stored as integers relative to each tile. The client (MapLibre GL JS, Mapbox GL JS) decodes the buffers and renders them with GPU-accelerated WebGL.
What this solved:
- Interactivity. Click any feature. Hover for tooltips. The geometry is live in the browser, not baked into pixels.
- Client-side styling. Change road colors, building fills, label fonts, line widths — all in the browser, no tile regeneration needed. One tile set serves infinite visual styles.
- Efficient encoding. MVT Protocol Buffers are typically 5–10× smaller than equivalent GeoJSON. A tile that would be 200 KB as GeoJSON might be 30 KB as MVT.
- Label collision resolution. The browser sees the full tile context and can resolve label overlaps intelligently.
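Part of the compactness listed above comes from how MVT stores geometry: as a flat list of integers made of command headers followed by zigzag-encoded coordinate deltas. The sketch below follows the geometry encoding as described in the Mapbox Vector Tile specification, but it covers only linestrings and leaves out the Protocol Buffer wrapping, so treat it as an illustration rather than a working encoder.

```python
MOVE_TO, LINE_TO, CLOSE_PATH = 1, 2, 7  # command ids used by MVT geometry


def command(cmd_id: int, count: int) -> int:
    """Pack a command id and repeat count into one integer."""
    return (cmd_id & 0x7) | (count << 3)


def zigzag(value: int) -> int:
    """Map signed deltas to unsigned ints so small magnitudes stay small (32-bit range)."""
    return (value << 1) ^ (value >> 31)


def encode_linestring(points: list[tuple[int, int]]) -> list[int]:
    """Encode tile-local integer coordinates as MVT geometry integers."""
    if len(points) < 2:
        raise ValueError("a linestring needs at least two points")
    out = [command(MOVE_TO, 1)]
    cursor_x, cursor_y = 0, 0
    first_x, first_y = points[0]
    out += [zigzag(first_x - cursor_x), zigzag(first_y - cursor_y)]
    cursor_x, cursor_y = first_x, first_y
    out.append(command(LINE_TO, len(points) - 1))
    for x, y in points[1:]:
        # Each vertex is stored relative to the previous one.
        out += [zigzag(x - cursor_x), zigzag(y - cursor_y)]
        cursor_x, cursor_y = x, y
    return out


print(encode_linestring([(2, 2), (2, 10), (10, 10)]))
```

Three vertices become eight small integers, which Protocol Buffers then store as variable-length values; GeoJSON would spell out each coordinate as decimal text.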
The new problem: Tile generation is expensive. A global OpenStreetMap extract takes hours on consumer hardware. And attribute fidelity takes a hit — MVT tiles carry only a subset of properties to keep tile sizes manageable. You trade full attribute access for scalable interactivity.
This phase also spawned a new generation of tools, each optimized for a different point in the size–speed spectrum:
| Tool | Approach | Best For |
|---|---|---|
| Tippecanoe | CLI, GeoJSON → MVT | Medium datasets (50 MB–2 GB). Fine control over zoom levels, feature dropping, and attribute filtering |
| Planetiler | Java, OSM PBF → MVT | Planet-scale OSM extracts. Processes the 70 GB planet file in ~2 hours on a 32-core machine |
| Martin | Rust, PostgreSQL/PostGIS → MVT | Live tile serving from a database. Generates tiles on-the-fly, supports PMTiles as a source |
Vector tiles solved interactivity. But they created a new operational headache: millions of tiny files.
4. Phase Four: The Single-File Tile Archive (2011–2021)
The problem: A global vector tile set is millions of individual .mvt files. Deploying them means either a tile server (infrastructure to maintain) or a directory tree with millions of files (a storage and transfer nightmare).
The first answer — MBTiles (2011): Pack the entire tile pyramid into a single SQLite database. One .mbtiles file replaces millions of individual tiles. SQLite is universally supported, and the ecosystem is mature — QGIS, MapTiler, TileMill, and countless tools produce MBTiles natively.
But MBTiles carried SQLite’s baggage into the browser. Reading individual tiles from a SQLite database over HTTP requires understanding SQLite’s page structure — you either download the entire file or use a WASM-compiled SQLite engine (~1 MB added to your bundle) to issue virtual queries.
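To make the MBTiles model concrete, here is a minimal sketch using Python's built-in `sqlite3` module with the `metadata` and `tiles` tables from the MBTiles specification. The helper names are illustrative. One detail worth noticing is that MBTiles stores rows in the TMS scheme, so the y index is flipped relative to the XYZ URLs used on the web.

```python
import sqlite3


def create_mbtiles(conn: sqlite3.Connection) -> None:
    """Create the two core tables of an MBTiles database."""
    conn.execute("CREATE TABLE metadata (name TEXT, value TEXT)")
    conn.execute(
        "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX tile_index ON tiles (zoom_level, tile_column, tile_row)"
    )


def xyz_to_tms_row(z: int, y: int) -> int:
    """Flip an XYZ row index into the TMS row index that MBTiles stores."""
    return (1 << z) - 1 - y


def put_tile(conn: sqlite3.Connection, z: int, x: int, y: int, data: bytes) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
        (z, x, xyz_to_tms_row(z, y), data),
    )


def get_tile(conn: sqlite3.Connection, z: int, x: int, y: int):
    """Return the tile blob for an XYZ address, or None if it is absent."""
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, xyz_to_tms_row(z, y)),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
create_mbtiles(conn)
conn.execute("INSERT INTO metadata VALUES ('format', 'pbf')")
put_tile(conn, 2, 1, 0, b"example-tile-bytes")
print(get_tile(conn, 2, 1, 0))
```

Every lookup goes through the SQLite engine, which is convenient on a server or desktop and is precisely the dependency that is awkward to satisfy over plain HTTP.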
The second answer, PMTiles (2021): What if you could have the single-file convenience of MBTiles but with native HTTP range-request access? That’s the insight behind PMTiles. A PMTiles file is a simple binary structure: in version 3 of the format, a fixed 127-byte header and the root directory sit within the first 16 KB, followed by metadata, optional leaf directories, and the raw tile data. The client reads the header and directory, then issues byte-range requests for individual tiles: exactly the tiles it needs, nothing more.
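The access pattern can be sketched in a few lines. This example assumes the version 3 header layout as I read the PMTiles specification (a 7-byte magic string, a version byte, then little-endian 64-bit offset and length pairs); verify the field order against the spec before relying on it. Directory decoding is omitted, and `read_range` is a stand-in for a real HTTP range request.

```python
import struct

HEADER_LEN = 127  # fixed header size in the v3 layout assumed here
# Magic, version, then eight little-endian uint64 fields (four offset/length pairs).
_PREFIX = struct.Struct("<7sB8Q")


def parse_header(buf: bytes) -> dict:
    """Decode the leading offset/length fields of an assumed v3 header."""
    if len(buf) < HEADER_LEN:
        raise ValueError("need at least 127 bytes")
    (magic, version, root_off, root_len, meta_off, meta_len,
     leaf_off, leaf_len, data_off, data_len) = _PREFIX.unpack_from(buf, 0)
    if magic != b"PMTiles":
        raise ValueError("not a PMTiles archive")
    return {
        "version": version,
        "root_dir": (root_off, root_len),
        "metadata": (meta_off, meta_len),
        "leaf_dirs": (leaf_off, leaf_len),
        "tile_data": (data_off, data_len),
    }


def range_header(offset: int, length: int) -> dict:
    """Build the HTTP header a client would send for one byte range."""
    if length <= 0:
        raise ValueError("length must be positive")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def read_range(archive: bytes, offset: int, length: int) -> bytes:
    """Stand-in for an HTTP range request against a static host."""
    return archive[offset:offset + length]


# Synthetic archive: a header followed directly by one fake tile.
fake_tile = b"tile-bytes"
header = _PREFIX.pack(b"PMTiles", 3, 127, 0, 127, 0, 127, 0, 127, len(fake_tile))
archive = header.ljust(HEADER_LEN, b"\x00") + fake_tile

info = parse_header(read_range(archive, 0, HEADER_LEN))
data_off, data_len = info["tile_data"]
print(range_header(data_off, data_len))
print(read_range(archive, data_off, data_len))
```

In a real archive the client would decode the directory to find each tile's offset and length, then reuse the same range-request step for every tile in view.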
What this unlocked:
- True serverless deployment. Host a .pmtiles file on S3, Cloudflare R2, or any static file server that honors HTTP range requests. No tile server, no database, no backend code.
- Minimal client overhead. The PMTiles JavaScript reader is ~10 KB. Compare that to SQLite WASM at ~1 MB.
- MapLibre integration. MapLibre GL JS reads PMTiles through a pmtiles:// source URL once the pmtiles library’s protocol handler has been registered with addProtocol. No tile server sits in between.
The trade-off: PMTiles is write-once. Unlike MBTiles (which you can modify with SQL UPDATE statements), PMTiles files are not designed for incremental updates. For static basemaps and periodic releases, this is fine. For live, frequently-updated data, you need a different approach — which brings us to Martin and on-the-fly tile serving.
The arc from MBTiles to PMTiles mirrors the broader evolution: each phase simplifies the deployment model while preserving the access pattern of the previous phase. PMTiles gives you vector tile interactivity with single-file simplicity and zero server infrastructure.
5. Where We Are Now: A Decision Framework
The evolution didn’t make earlier approaches obsolete — it expanded the toolkit. Each phase is the right answer for a specific combination of dataset size, interactivity needs, and deployment constraints.
The evolutionary arc:
| Stage | Phase 1: Single File | Phase 2: Raster Tiles | Phase 3 & 4: Vector Tiles |
|---|---|---|---|
| Problem | "I have a file, show it to me" | "My data is too big for a browser to parse" | "I need clickable features at scale" |
| Solution | Parse and render the entire file client-side | Pre-render into millions of 256px images, fetch only visible tiles | Send geometry as Protocol Buffers in tiles, let the GPU render them |
| New problem | Doesn't scale past ~50 MB | No interactivity, can't re-style | Millions of files to manage |
Solution to the tile-management problem: PMTiles, a single-file archive with HTTP range-request access.
Decision rules for today’s practitioner:
- Under 50 MB, under 100K features, need full attributes → Phase One. Load as a single file. GeoJSON, Shapefile, KML, and GPX files of this size parse in under a second, keep every attribute, and need zero preprocessing. This is the domain of data exploration and quick inspection.
- 50 MB–2 GB, need interactivity, static deployment → Phase Three + Four. Convert to vector tiles with Tippecanoe, package as PMTiles, host on any static storage. You get clickable features, client-side styling, and zero server costs.
- Over 2 GB, planet-scale basemap → Phase Three + Four at scale. Use Planetiler to process OSM extracts into PMTiles. A 32-core machine turns the 70 GB planet file into a 50–100 GB PMTiles archive in about two hours.
- Live database, dynamic data, frequent updates → Phase Three with Martin. Serve vector tiles directly from PostgreSQL/PostGIS. Tiles are generated on-the-fly, with no build step, and are always current.
- Raster imagery, no interactivity needed → Phase Two, modernized. Use COG (Cloud Optimized GeoTIFF) for single raster files with range-request access, or raster PMTiles for tiled imagery. Same tile model, better formats.
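The decision rules above can be written down as a small function, which makes the order of the checks explicit. This is a sketch of this article's framework only; the `Dataset` fields and the returned labels are hypothetical names, and the thresholds are the rules of thumb stated above.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    size_mb: float
    feature_count: int = 0
    is_raster: bool = False
    live_database: bool = False


def choose_pipeline(ds: Dataset) -> str:
    """Map a dataset description to one of the article's delivery approaches."""
    if ds.is_raster:
        return "Phase Two: COG or raster PMTiles"
    if ds.live_database:
        return "Phase Three: Martin on PostgreSQL/PostGIS"
    if ds.size_mb < 50 and ds.feature_count < 100_000:
        return "Phase One: single file"
    if ds.size_mb <= 2048:
        return "Phase Three + Four: Tippecanoe to PMTiles"
    return "Phase Three + Four at scale: Planetiler to PMTiles"


examples = [
    Dataset(size_mb=0.005, feature_count=40),         # GPS waypoints
    Dataset(size_mb=200, feature_count=350_000),      # building footprints
    Dataset(size_mb=70_000, feature_count=10**9),     # planet extract
    Dataset(size_mb=10, feature_count=5_000, live_database=True),
]
for ds in examples:
    print(ds.size_mb, "MB ->", choose_pipeline(ds))
```

Raster and live-database checks come first because they change the architecture regardless of size; a small file with more than 100K features deliberately falls through to the tiled path.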
6. The Constant Across All Phases: Preprocessing
If there’s one lesson that spans the entire evolution, it’s this: raw source data is never optimized for web delivery. Every phase — single file, raster tiles, vector tiles — benefits enormously from preprocessing. The difference between a sluggish 10-second load and an instant render is often 10 minutes of data preparation.
For vector data (applies to all phases):
- Simplify geometry. A coastline polygon with 50,000 vertices is indistinguishable from one with 500 vertices at zoom level 10. Use Mapshaper, ST_Simplify in PostGIS, or Tippecanoe’s built-in simplification. This single step often reduces file size by 80–95%.
- Drop unnecessary attributes. A Shapefile of building footprints might carry 40 columns of tax assessment data. For a web map, keep the 3–5 columns users actually need. Every dropped column reduces parse time and memory pressure.
- Reproject to WGS 84 (EPSG:4326). Web mapping libraries expect latitude/longitude. Source data in State Plane, UTM, or other projected coordinate systems must be reprojected before loading.
- Split multi-layer files. A GML file might contain roads, buildings, and water bodies in one XML document. Split them into separate layers for independent styling and visibility control.
- Convert to a web-friendly intermediate format. GeoJSON is the lingua franca of web mapping. If your source is GML, Shapefile, or KML, converting to GeoJSON (or directly to vector tiles) before loading eliminates format parsing overhead in the browser.
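Two of the steps above, simplifying geometry and dropping attributes, are small enough to sketch directly. The simplifier below is a plain Douglas-Peucker implementation, the same family of algorithm that PostGIS ST_Simplify uses; it is not the code of any of the tools named above, and the recursive form is meant for illustration rather than for lines with tens of thousands of vertices.

```python
import math


def _distance_to_segment(p, a, b) -> float:
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance: float):
    """Douglas-Peucker: keep only vertices that deviate more than tolerance."""
    if len(points) < 3:
        return list(points)
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


def keep_properties(feature: dict, keep: set) -> dict:
    """Return a copy of a GeoJSON feature with only the listed properties."""
    props = feature.get("properties") or {}
    return {**feature, "properties": {k: v for k, v in props.items() if k in keep}}


line = [(0, 0), (1, 0.01), (2, 0), (3, 5), (4, 0)]
print(simplify(line, tolerance=0.1))

feature = {"type": "Feature", "geometry": None,
           "properties": {"name": "Depot", "tax_id": "A-17", "height": 12}}
print(keep_properties(feature, {"name", "height"}))
```

The tolerance is in the same units as the coordinates, so a value that suits projected meters is far too large for degrees.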
For raster data:
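- Use COG format. A standard GeoTIFF is not laid out for partial reads, so a client often has to download most or all of the file before it can display any portion. A Cloud Optimized GeoTIFF arranges its internal tiles and overviews so that HTTP range requests work: the client reads only the data it needs for the current viewport and zoom level.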
- Build internal overviews. COG files with pyramid levels enable smooth zooming without the client doing expensive downsampling on the fly.
- Choose compression wisely. JPEG for RGB imagery (10:1 compression with minimal visual loss), LZW or ZSTD for DEM/terrain data where precision matters.
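The overview pyramid mentioned above follows simple halving arithmetic. The sketch below uses one common convention, which is to keep halving until the whole image fits in a single internal block; the 512-pixel block size is an assumption, and the defaults of GDAL or other tools may differ.

```python
import math


def overview_factors(width: int, height: int, block_size: int = 512) -> list[int]:
    """Power-of-two decimation factors until the image fits in one block."""
    factors = []
    factor = 1
    while (math.ceil(width / factor) > block_size
           or math.ceil(height / factor) > block_size):
        factor *= 2
        factors.append(factor)
    return factors


def overview_sizes(width: int, height: int, block_size: int = 512):
    """Pixel dimensions of each overview level, largest first."""
    return [(math.ceil(width / f), math.ceil(height / f))
            for f in overview_factors(width, height, block_size)]


print(overview_factors(10_000, 8_000))
print(overview_sizes(10_000, 8_000))
```

Each level holds a quarter of the pixels of the one before it, so the full set of overviews adds roughly a third to the uncompressed size of the base image.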
7. GeoDataViewer’s Place in the Evolutionary Arc
GeoDataViewer sits at the entry point of this pipeline. It’s the tool you reach for when you first encounter a dataset and need to understand what you’re working with — before committing to a deployment architecture.
Where it fits:
- Phase One (single file): GeoDataViewer excels here. Drag-and-drop support for 20+ formats — GeoJSON, Shapefile, KML, KMZ, GPX, GML, CSV, GeoParquet, FlatGeobuf, WKT, Excel, and more. Instant rendering with full attribute tables, feature inspection, filtering, and timeline animation.
- Phase Two (raster): Load GeoTIFF and COG files for immediate visual assessment. DEM files render as hillshade terrain with adjustable exaggeration.
- Phase Four (PMTiles): Load .pmtiles archives directly, vector or raster, with per-layer styling controls. Inspect the tile archive’s metadata, vector layers, and zoom range before deploying to production.
- Format conversion: Convert between 19 format families using GDAL WASM, all client-side. Shapefile → GeoJSON, KML → GeoPackage, GeoJSON → FlatGeobuf: the conversion step that often precedes tile generation.
What GeoDataViewer intentionally does not do:
- It is not a tile generation pipeline. It does not produce MVT tiles from source data.
- It is not a persistent database. Data lives in the browser session and is gone on reload.
- It is not a replacement for QGIS or ArcGIS for spatial analysis workflows.
The recommended workflow, end to end:
- Inspect in GeoDataViewer. Drop your file, understand its structure — geometry types, attribute columns, feature count, spatial extent, time fields if any.
- Preprocess based on what you learned. Simplify, filter columns, reproject, split layers. GeoDataViewer’s converter can handle the format conversion step.
- Choose your deployment architecture using the decision framework above.
- Generate tiles with Tippecanoe (medium data), Planetiler (planet-scale), or serve dynamically with Martin (live database).
- Deploy as PMTiles on static hosting, or via Martin for dynamic data.
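For the tile-generation step in the workflow above, a Tippecanoe invocation can be assembled in a script. This sketch only builds and prints the command rather than running it. The file names are hypothetical, the flags are ones documented in the Tippecanoe README, and writing a `.pmtiles` output directly is an assumption that holds for recent releases of the Felt-maintained fork; with older versions, write `.mbtiles` and convert afterwards.

```python
import shlex


def tippecanoe_command(source: str, output: str, layer: str,
                       keep_attributes=()) -> list[str]:
    """Build an argv list for a basic Tippecanoe run."""
    argv = [
        "tippecanoe",
        "-o", output,                  # output archive
        "-zg",                         # let Tippecanoe guess a suitable max zoom
        "--drop-densest-as-needed",    # thin features when a tile grows too large
        "-l", layer,                   # layer name inside the tiles
    ]
    for name in keep_attributes:
        argv += ["-y", name]           # keep only the named attributes
    argv.append(source)
    return argv


# Hypothetical file and attribute names, for illustration only.
cmd = tippecanoe_command("buildings.geojson", "buildings.pmtiles",
                         "buildings", ["name", "height"])
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; keeping the command as a list avoids shell quoting problems with file names that contain spaces.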
8. Conclusion: Evolution Is Accumulation, Not Replacement
The story of GIS data loading is not one of obsolescence. Each phase built on the last, and every phase remains the right answer for its domain:
- Single files are still the best way to inspect, explore, and share small-to-medium datasets. Nothing beats drag-and-drop simplicity when the data fits in memory.
- Raster tiles still power every basemap you use daily. When you don’t need interactivity, pre-rendered images are the most efficient delivery mechanism.
- Vector tiles made web maps interactive at scale, and the MVT format is now an open standard supported by virtually every mapping library.
- PMTiles collapsed the tile server and the file archive into a single static asset, making planet-scale vector maps deployable with zero infrastructure.
The modern GIS practitioner doesn’t pick one approach — they pick the right approach for each dataset. A 5 KB GeoJSON of GPS waypoints, a 200 MB Shapefile of building footprints, and a 50 GB planet-scale basemap are all “geospatial data,” but they demand entirely different pipelines.
The through-line across all four phases: preprocess your data for the web. A raw GML file with 200,000 features and 50 attribute columns will never perform well in a browser — not as a single file, not as vector tiles, not in any architecture. Spend 10 minutes simplifying geometry, dropping unused columns, and converting to a web-friendly format, and the same data becomes responsive at any scale.
GeoDataViewer is your first step in that journey. See what you have. Understand what you need. Then choose the right tool for the job — whether that’s Tippecanoe, Planetiler, Martin, or simply a well-preprocessed GeoJSON file.