First Dive Into the Geospatial World: PostGIS, JTS, and Lessons Learned

I’m not from the GIS world. Day to day, I work with Scala-based, microservice-oriented systems, where most concerns revolve around business logic, clean APIs, and distributed architecture. Geospatial data was new territory for me, and this post summarizes my first dive into it: insights, lessons, and tips from the perspective of a backend engineer working with geospatial features for the first time in a real-world project.

We needed geospatial logic to manipulate, display, and analyze shipment trajectories using data sourced from vessels, trucks, IoT devices, and more. Two tools proved particularly useful:

  • PostGIS - a spatial extension for PostgreSQL that adds support for geographic objects, so location-based data can be stored, queried, and manipulated directly in the database.
  • JTS Topology Suite - a Java library for modeling and processing 2D linear geometry.

These two tools complement each other well. PostGIS efficiently handles large-scale, persistent spatial data, while JTS offers more control and flexibility for complex logic in the application layer - such as business rules involving multiple conditional steps, temporary geometry transformations, or situations where avoiding repeated database calls is critical for performance.

Figure: Example spatial data of a shipment trajectory between the ports of Rotterdam, Netherlands and Houston, USA.

Here are some insights & lessons I learned while working with coordinates and geospatial queries:

1. Integrating PostGIS into the backend

Since PostGIS is a PostgreSQL extension, using it is just like running regular SQL. From the backend perspective, the main thing is handling spatial types in your application code.

To work with PostGIS types like geography(Point, 4326), you’ll need to convert between your application model and the spatial format expected by the database. This is especially relevant when ingesting data from sources like JSON APIs - you parse raw coordinates into your domain model, then insert or query them using PostGIS functions such as:

ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)

In my particular case, I worked with Doobie, a functional JDBC layer in Scala. It allows you to write raw SQL and define how spatial types are moved between the database and the application (with postgis-jdbc driver) via implicit mapping.
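As a rough illustration of the "parse raw coordinates into your domain model" step, here is a minimal standard-library sketch. The `Position` type and its helpers are hypothetical names, not taken from the project, and the Doobie mapping itself is not shown. It validates raw values and renders them as EWKT, a text form that PostGIS functions such as ST_GeogFromText accept.

```scala
// Hypothetical domain model; the names are illustrative only.
final case class Position(longitude: Double, latitude: Double)

object Position {
  /** Validates raw values, e.g. parsed from a JSON payload. NaN is rejected too. */
  def fromRaw(longitude: Double, latitude: Double): Either[String, Position] =
    if (!(longitude >= -180.0 && longitude <= 180.0)) Left(s"longitude out of range: $longitude")
    else if (!(latitude >= -90.0 && latitude <= 90.0)) Left(s"latitude out of range: $latitude")
    else Right(Position(longitude, latitude))

  /** Renders EWKT with x = longitude and y = latitude, tagged with SRID 4326. */
  def toEwkt(p: Position): String =
    s"SRID=4326;POINT(${p.longitude} ${p.latitude})"
}
```

The resulting string would typically be passed as a bind parameter rather than concatenated into the SQL text.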

2. Spatial indices

To work effectively with PostGIS, it's important to first understand why it performs so well on spatial queries - and the answer starts with spatial indexing. Standard indexing strategies like B-tree aren't sufficient for spatial operations. PostGIS leverages PostgreSQL's native index types to support spatial data operations, and this is one of its most powerful features.

PostGIS primarily utilizes GiST (Generalized Search Tree) indexes, which organize data into a hierarchical structure based on the bounding boxes of geometries. This approach allows the database to quickly eliminate non-relevant records by checking for bounding box intersections before performing more precise geometry calculations. PostGIS uses an R-Tree index implemented on top of GiST to index spatial data. R-Trees break up data into rectangles, sub-rectangles, and sub-sub rectangles.

Figure: R-Tree hierarchy (source).
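The two-phase idea behind bounding-box indexing can be sketched in a few lines of plain Scala. This is only an illustration of "cheap envelope test first, exact test second". It scans candidates linearly instead of walking a tree, and all names in it are made up for the example.

```scala
object BoundingBoxFilter {
  final case class Envelope(minX: Double, minY: Double, maxX: Double, maxY: Double) {
    /** True when the two rectangles overlap or touch. */
    def intersects(o: Envelope): Boolean =
      minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY
  }

  /** Smallest axis-aligned rectangle containing all (x, y) points. */
  def envelopeOf(points: Seq[(Double, Double)]): Envelope = {
    require(points.nonEmpty, "at least one point is required")
    val xs = points.map(_._1)
    val ys = points.map(_._2)
    Envelope(xs.min, ys.min, xs.max, ys.max)
  }

  /**
   * Phase 1 (cheap): keep items whose envelope overlaps the query envelope.
   * Phase 2 (exact): run the expensive predicate only on the survivors.
   */
  def search[A](items: Seq[A], query: Envelope)(envelope: A => Envelope)(exact: A => Boolean): Seq[A] =
    items.filter(a => envelope(a).intersects(query)).filter(exact)
}
```

A real R-Tree applies the same envelope test at every level of the hierarchy, so whole groups of geometries are discarded at once.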

SP-GiST (Space-Partitioned Generalized Search Tree) is quite similar to GiST but is specifically tuned for spatial partitioning - i.e., where data is non-overlapping and can be perfectly divided into spaces.

Another index worth mentioning is BRIN (Block Range Index), which offers even faster index creation and a much smaller index size, but is appropriate only for special kinds of data: spatially sorted, with infrequent or no updates (e.g., timestamped GPS logs).

To learn more about spatial indices, I recommend the PostGIS documentation.

3. PostGIS makes complex operations easy

PostGIS abstracts away the complexity of spatial computation behind concise SQL. Take a practical example: filtering out vessel trajectories that cross a restricted area. A real-world use case would be, e.g., a closure of the Suez Canal that suddenly blocks access to it.

With a PostGIS geography(LineString, 4326) column representing vessel positions:

CREATE TABLE trajectories (
    id uuid PRIMARY KEY,
    positions geography(LineString, 4326) NOT NULL
);

PostGIS allows us to express this requirement in a single, readable SQL query, with no need for complex loops or geometry math in your application code:

SELECT * FROM trajectories
WHERE NOT ST_Intersects(
    positions,
    ST_MakeEnvelope(32.1, 30.01, 32.66, 31.03, 4326)::geography
);

Figure: Bounding box [32.1, 30.01, 32.66, 31.03] used as a simplified Suez Canal region.

Another real-world scenario is filtering trajectories based on the distance to some position - for example, finding vessels near the port in Gdańsk (approx. [18.698, 54.401] long/lat) to trigger alerts:

SELECT * FROM trajectories WHERE ST_Distance(
positions,
ST_SetSRID(ST_MakePoint(18.698, 54.401), 4326)::geography
) <= 100000; -- 100km
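For a quick sanity check of such a threshold in application code (for instance in a unit test), a haversine-based approximation may be enough. The sketch below is an assumption-laden stand-in, not what PostGIS does: it treats the Earth as a sphere, while geography calculations use a spheroid, and it only looks at trajectory vertices rather than the segments between them. All names are hypothetical.

```scala
object Haversine {
  // Mean Earth radius in meters; a spherical approximation.
  private val EarthRadiusMeters = 6371008.8

  /** Great-circle distance in meters between two lon/lat points given in degrees. */
  def distanceMeters(lon1: Double, lat1: Double, lon2: Double, lat2: Double): Double = {
    val dLat = math.toRadians(lat2 - lat1)
    val dLon = math.toRadians(lon2 - lon1)
    val a = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
    // Clamp to 1.0 to guard against rounding pushing the value out of asin's domain.
    2 * EarthRadiusMeters * math.asin(math.sqrt(math.min(1.0, a)))
  }

  /** True if any (lon, lat) vertex of the trajectory lies within maxMeters of the point. */
  def anyVertexWithin(trajectory: Seq[(Double, Double)], lon: Double, lat: Double, maxMeters: Double): Boolean =
    trajectory.exists { case (x, y) => distanceMeters(x, y, lon, lat) <= maxMeters }
}
```

Expect small differences from the ST_Distance result, so this is better suited to coarse checks than to exact comparisons.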

4. JTS in-memory computations are also valuable

Sometimes in-memory computation is the way to go, even for large-scale data. Consider the following real-world example: we needed to roughly check whether vessel trajectories cross land, in order to identify incorrect trajectories for analytical purposes. The volume was significant, the validation was supposed to detect only really serious collisions (ignoring “small” ones caused by a signal lost for a couple of minutes), and it had to run on every request.

The first idea was to:

  1. Load an open-source map of the earth's coastlines into PostGIS. We chose Natural Earth data, which I cover later in this post.
  2. Apply ST_Buffer and ST_Simplify to shrink and clean up the geometries, improving performance and reducing noise.
  3. Run the checks at the SQL level with ST_Intersects.

Figure: Zoomed-in examples of the greatly simplified earth coastlines map for some major sea ports.

However, after those operations, the coastline map in WKB (Well-Known Binary) format turned out to be only 175 KB. As this API is rather heavily used, and our PostgreSQL instances hosted on Azure were often limited in some operations, we decided that we could afford to load the whole map into the JVM at startup and perform the checks in memory using the JTS library.

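import java.io.FileNotFoundException
import scala.io.Source
import scala.jdk.CollectionConverters._
import scala.util.Using
import org.locationtech.jts
import org.locationtech.jts.geom.Geometry

// Classpath resource with one hex-encoded WKB geometry per line.
private val CoastlinesPolygonsFilePath = "/coastlines_polygons.wkb"

// Built once at startup: an in-memory STR-packed R-tree over the coastline polygons.
private val coastlinesJTSSpatialIndex: jts.index.strtree.STRtree = {
  val inputStream = getClass.getResourceAsStream(CoastlinesPolygonsFilePath)
  if (inputStream == null) throw new FileNotFoundException(CoastlinesPolygonsFilePath)

  val index = new jts.index.strtree.STRtree()
  val reader = new jts.io.WKBReader()

  Using.resource(Source.fromInputStream(inputStream)) { source =>
    source.getLines().foreach { wkbHex =>
      val wkbBytes = jts.io.WKBReader.hexToBytes(wkbHex)
      val geometry = reader.read(wkbBytes)
      // Index each polygon by its bounding box.
      index.insert(geometry.getEnvelopeInternal, geometry)
    }
  }

  // Build eagerly so the first request does not pay for it.
  index.build()
  index
}

def isTrajectoryColliding(trajectory: jts.geom.LineString): Boolean =
  coastlinesJTSSpatialIndex
    // Cheap step: candidates whose bounding box overlaps the trajectory's.
    .query(trajectory.getEnvelopeInternal)
    .asInstanceOf[java.util.List[Geometry]]
    .asScala
    // Exact step: precise intersection test on the candidates only.
    .exists(coastline => coastline.intersects(trajectory))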

5. Mind the coordinate system - SRID

In PostGIS, geometries are tied to an SRID (Spatial Reference System Identifier), like 4326 for standard GPS coordinates, and spatial functions rely on it. JTS, on the other hand, treats geometries as simple X/Y points without any coordinate system awareness. Once you process geometry in JTS, you lose the SRID information unless you manage it manually.

What I learned the hard way is that if you modify a geometry in JTS and reinsert it into PostGIS without resetting the SRID, operations like ST_Distance, ST_Intersects, or ST_Area might give incorrect results. So always explicitly set the SRID again before saving geometries back to PostGIS, for example:

SELECT ST_SetSRID(geometry, 4326)
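On the JTS side, one way to manage the SRID manually is to create geometries through a GeometryFactory configured with SRID 4326 and to ask WKBWriter to include the SRID in its output. The sketch below assumes JTS (org.locationtech.jts) is on the classpath. The object and method names are hypothetical, and it is one possible approach rather than the one used in the project.

```scala
import org.locationtech.jts.geom.{Coordinate, GeometryFactory, LineString, PrecisionModel}
import org.locationtech.jts.io.WKBWriter

object SridAwareExport {
  val Wgs84Srid = 4326

  // Geometries created by this factory carry SRID 4326 as metadata.
  private val factory = new GeometryFactory(new PrecisionModel(), Wgs84Srid)

  /** Builds a LineString from (lon, lat) pairs; JTS expects zero or at least two points. */
  def lineString(lonLat: Seq[(Double, Double)]): LineString =
    factory.createLineString(lonLat.map { case (lon, lat) => new Coordinate(lon, lat) }.toArray)

  /** Hex-encoded WKB that also embeds the geometry's SRID. */
  def toEwkbHex(line: LineString): String = {
    // 2 = output dimension (X/Y only); true = include the SRID in the output.
    val writer = new WKBWriter(2, true)
    WKBWriter.toHex(writer.write(line))
  }
}
```

JTS still ignores the SRID in its own computations. The value only travels along with the geometry so that it is not lost on the way back to the database.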

6. Crossing the 180th meridian is tricky

This one surprised me. If your geometry crosses the 180th meridian (also known as the antimeridian), it might behave oddly. Some functions don’t handle it well, and even valid geometries can become “invalid” from the PostGIS / JTS point of view.
An example trajectory that crosses the 180th meridian, when rendered from GeoJSON, will look like this:

Figure: Vessel trajectory from Tokyo, Japan to Los Angeles, USA.

In this example, the trajectory looks like it spans the entire globe. It is a well-known problem in the GIS world. By default, longitude values range from -180° to +180°, with the 180th meridian marking the boundary. Unfortunately, many geometry engines - including PostGIS and JTS - treat this boundary as a hard edge rather than a wraparound.

Figure: Discontinuity from +180° to -180°.

The workaround we use to cover this case is normalizing longitudes to the 0°-360° range before processing. It can be achieved by built-in PostGIS or JTS methods (e.g., ST_ShiftLongitude) and works well for most internal processing.
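The normalization itself is simple arithmetic. As a minimal plain-Scala sketch (hypothetical names, working on raw (lon, lat) pairs rather than on PostGIS or JTS types), negative longitudes get 360° added so that a Pacific crossing becomes a continuous sequence:

```scala
object LongitudeShift {
  /** Maps a longitude from the -180..180 range to 0..360 by adding 360 to negative values. */
  def to360(lon: Double): Double =
    if (lon < 0.0) lon + 360.0 else lon

  /** Applies the shift to every (lon, lat) vertex; latitudes are left untouched. */
  def shift(trajectory: Seq[(Double, Double)]): Seq[(Double, Double)] =
    trajectory.map { case (lon, lat) => (to360(lon), lat) }
}
```

After the shift, a Tokyo to Los Angeles route no longer jumps by roughly 360° between neighbouring points. The trade-off is that the discontinuity moves to the prime meridian, so this only helps for data that does not also cross 0°.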

On the other hand, the solution recommended by the GeoJSON spec for such cases is: “Any geometry that crosses the antimeridian SHOULD be represented by cutting it in two such that neither part's representation crosses the antimeridian.”

This would allow, e.g., your frontend map to render two valid geometries instead of one invalid.
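Cutting a trajectory at the antimeridian can be sketched in plain Scala as follows. This is an illustrative approximation with hypothetical names: it assumes longitudes are within -180..180, that a jump of more than 180° between neighbouring points means a crossing, that no input vertex lies exactly on the antimeridian, and it interpolates the crossing latitude linearly instead of along a great circle.

```scala
object AntimeridianSplit {
  final case class LonLat(lon: Double, lat: Double)

  /** Splits a line into parts so that no part crosses the 180th meridian. */
  def split(points: List[LonLat]): List[List[LonLat]] = points match {
    case Nil => Nil
    case head :: tail =>
      // done: finished parts (newest first); current: part in progress (newest point first).
      val (done, current, _) =
        tail.foldLeft((List.empty[List[LonLat]], List(head), head)) {
          case ((done, current, prev), next) =>
            if (math.abs(next.lon - prev.lon) > 180.0) {
              val eastbound = prev.lon > 0
              val edge = if (eastbound) 180.0 else -180.0
              // Unwrap the next longitude so the segment is continuous.
              val nextLon = if (eastbound) next.lon + 360.0 else next.lon - 360.0
              val t = (edge - prev.lon) / (nextLon - prev.lon)
              val lat = prev.lat + t * (next.lat - prev.lat)
              val closed = (LonLat(edge, lat) :: current).reverse
              // Start the new part on the opposite side of the meridian.
              (closed :: done, List(next, LonLat(-edge, lat)), next)
            } else (done, next :: current, next)
        }
      (current.reverse :: done).reverse
  }
}
```

Each resulting part could then be emitted as one member of a GeoJSON MultiLineString.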

7. No standard order of coordinates

Another surprising thing I found is that there is no universally accepted order for coordinates (latitude/longitude or longitude/latitude). If some API returns coordinates in raw array format like [[10, 0], [15, 0]], it’s not immediately apparent whether you're looking at [long, lat] or [lat, long], unless it's explicitly stated.

To give some more examples:

  • PostGIS follows the long/lat (x, y) convention in its WKB (Well-Known Binary) and WKT (Well-Known Text) representations
  • Google Maps search by coordinates is by lat/long
  • JTS library works in a Cartesian plane and uses the Coordinate(x, y) model, and it treats x as horizontal (longitude) and y as vertical (latitude)

If you accidentally loaded data into PostGIS in lat/long order, you can still easily swap it using the ST_FlipCoordinates function. Anyway, always check your data coordinate order because it could send you halfway around the world. Literally.
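A cheap guard against swapped coordinates is a range check: latitude can never exceed ±90°, so a value beyond that must be a longitude. The plain-Scala sketch below (hypothetical names) uses this heuristic. Note that it cannot decide anything for data where every value fits within ±90°, such as the [[10, 0], [15, 0]] example above.

```scala
object CoordinateOrder {
  sealed trait Guess
  case object LonLat extends Guess    // pairs look like (longitude, latitude)
  case object LatLon extends Guess    // pairs look like (latitude, longitude)
  case object Ambiguous extends Guess // every value fits both interpretations
  case object Invalid extends Guess   // neither position can be a latitude

  /** Guesses the order from the fact that |latitude| never exceeds 90 degrees. */
  def guess(pairs: Seq[(Double, Double)]): Guess = {
    val firstExceedsLat = pairs.exists { case (a, _) => math.abs(a) > 90.0 }
    val secondExceedsLat = pairs.exists { case (_, b) => math.abs(b) > 90.0 }
    (firstExceedsLat, secondExceedsLat) match {
      case (true, true)   => Invalid
      case (true, false)  => LonLat
      case (false, true)  => LatLon
      case (false, false) => Ambiguous
    }
  }

  /** Swaps the two values of every pair. */
  def flip(pairs: Seq[(Double, Double)]): Seq[(Double, Double)] =
    pairs.map(_.swap)
}
```

When the guess is Ambiguous, the only reliable answer is the documentation of the data source.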

8. Natural Earth & OpenStreetMap data

Two excellent open-source resources cover most basic geospatial data needs - Natural Earth and OpenStreetMap (OSM).

While OpenStreetMap is a well-known, massive global project offering high-detail maps, I was particularly interested in data that would roughly cover country borders and coastlines. For this purpose, Natural Earth was the winner, offering clean, pre-simplified geometries. The datasets are lightweight and easy to import into PostGIS. For global-scale spatial queries and overlays, it just works.

9. IntelliJ extensions to see geographies visually

If you want to quickly check the geographies visually, you can export your PostGIS data to GeoJSON format and paste it into the geojson.io site. However, as a Scala developer, I work mainly in IntelliJ, and I would rather stay within the IDE if possible. To achieve this, you can use:

  • IntelliJ built-in GeoViewer - works out of the box; however, you need an IntelliJ Ultimate license. While working with the database plugin, you can click Show Geo Viewer on your query result from PostGIS. A nice touch is that it offers many map backgrounds (in the example below, Esri.WorldPhysical).

Figure: IntelliJ built-in GeoViewer.

  • GeoJSON Editor - an IntelliJ plugin that adds full support for files ending in .geo.json. It is useful if you keep such files in your repo, e.g., as snapshots of particular geospatial test results or in readmes, or if you export data in GeoJSON format from PostGIS to keep it around.

Figure: GeoJSON Editor in IntelliJ.

What is particularly interesting is that the GeoJSON Editor allows you to edit geographies, e.g., by adding simple shapes or modifying the route, and saves the output to the file in real time. This is quite useful for creating specific test cases.

Figure: GeoJSON Editor edit mode.

Summary

Coming from a non-GIS background, my first dive into geospatial data was a surprisingly pleasant experience. Tools like PostGIS and JTS made it natural to build features directly in the backend, from simple distance checks to handling antimeridian crossings and coastline collisions. I hope you’ve found this post helpful - whether you’re new to spatial data or experimenting with it in your work.

Reviewed by: Krzysztof Grajek