Wednesday, May 29, 2013

Natural Language Geocoding


CLAVIN (Cartographic Location And Vicinity INdexer) is an open source software package for geotagging and geoparsing text. It combines several open source tools to extract location names from unstructured text documents and geocode them.

CLAVIN does not simply 'look up' location names mentioned in a piece of text but "uses intelligent heuristics in an attempt to identify precisely which 'Springfield' (for example) was intended by the author, based on the context of the document."
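
To give a feel for what context-based resolution might look like, here is a minimal Python sketch. It is not CLAVIN's actual code or algorithm. The tiny gazetteer, the `Place` class, the `resolve` function and the rounded population figures are all made up for illustration. The assumed heuristic is simple: prefer a candidate whose region is also mentioned in the document, and otherwise fall back to the most populous candidate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    name: str
    region: str
    population: int  # rough, illustrative figure (assumption, not real gazetteer data)


# Hypothetical miniature gazetteer: one ambiguous name, several candidates.
GAZETTEER = {
    "Springfield": [
        Place("Springfield", "Illinois", 114_000),
        Place("Springfield", "Missouri", 169_000),
        Place("Springfield", "Massachusetts", 155_000),
    ],
}


def resolve(name: str, text: str) -> Optional[Place]:
    """Pick the candidate for `name` that best fits the surrounding text."""
    candidates = GAZETTEER.get(name, [])
    if not candidates:
        return None
    lowered = text.lower()

    def score(place: Place):
        # A region mentioned in the document outranks any population difference;
        # population only breaks ties between equally supported candidates.
        context_hit = place.region.lower() in lowered
        return (context_hit, place.population)

    return max(candidates, key=score)


if __name__ == "__main__":
    doc = "The governor of Illinois gave a speech in Springfield."
    print(resolve("Springfield", doc))
```

With the sample sentence, the mention of Illinois tips the choice towards Springfield, Illinois, even though it is not the largest candidate in the toy data. A real geoparser would also need to find the place names in the text in the first place, which this sketch skips.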

You can try out CLAVIN for yourself by pasting or typing your own text into the Online Demo. The demo lets you either parse the document and return a list of all the locations mentioned in the text, or view those locations on a Google Map.

You can view the code for CLAVIN on GitHub.

1 comment:

Anonymous said...

I tried the following snippet of text from a Google search "Russia wikipedia" and received perfect results!

"Russia or, also officially known as the Russian Federation, is a country in northern Eurasia. It is a federal semi-presidential republic, comprising 83 federal subjects. Wikipedia
Capital: Moscow"

I really like the slanted box and text field look as well, great demo!