Python Simplified


Geocoding in Python Using Geopy

Introduction

Have you ever come across a dataset with an address column like the one below? How do you handle it? If the dataset is large and you try to encode the address column directly, you run into a high-cardinality problem: almost every address is unique, so the encoding produces nearly as many categories as there are rows.

[Figure: sample address data]

As part of feature engineering, you can handle this in several ways. For example, you can extract the country, city, and area from the addresses and use them as features. Another way is to geocode the addresses into geographic coordinates (latitude and longitude) and use those as features.

Some of the popular packages that are used for geocoding and reverse geocoding in Python are geopy, geocoder, opencage, etc.

In this article, you will first learn what geocoding and reverse geocoding are, and then use the geopy package to convert addresses into latitude and longitude and back. Finally, you will see how to calculate the distance between two addresses.

Geocoding

Geocoding is the process of converting addresses into geographic coordinates (i.e. latitude and longitude).

Reverse Geocoding

Reverse Geocoding is the process of converting geographic coordinates (latitude & longitude) into a human-readable address.

[Figure: geocoding example]

Geocoding and reverse geocoding are offered by service providers such as OpenStreetMap, Bing, Google, and Azure Maps. These services expose APIs that anyone can use. However, each comes with its own limitations, such as usage limits and terms of use, and many of them charge for usage.

Geopy

The geopy package is not a geocoding service provider. It only provides a common interface to several geocoding services in a single package.

[Figure: geopy services. Source: https://geopy.readthedocs.io]

Below is the list of the services implemented in geopy. You can use any of these geocoders, but keep in mind that each service comes with its own terms and conditions, pricing, API keys, etc. This article uses Nominatim, the geocoder for OpenStreetMap data, because it is free to use.

[Figure: list of geopy geocoders]

If you don’t want to use geopy, then you can directly use the API provided by the above services. For example, you can use Google Geocoding API directly instead of geopy. 
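As a rough sketch of what calling a service directly can look like, the snippet below builds a request for the public Nominatim search endpoint using only the standard library. The helper names (build_search_url, fetch_first_match) and the user-agent string are made up for illustration, and you should check the service's usage policy before sending real traffic.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_search_url(address, limit=1):
    """Build a Nominatim search URL for a free-form address (hypothetical helper)."""
    params = {"q": address, "format": "json", "limit": limit}
    return NOMINATIM_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_first_match(address, user_agent="myapp"):
    """Send the request and return (lat, lon) as floats, or None if nothing matched."""
    request = urllib.request.Request(
        build_search_url(address),
        headers={"User-Agent": user_agent},  # identify your application to the service
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        results = json.load(response)
    if not results:
        return None
    # The JSON response carries the coordinates as strings
    return float(results[0]["lat"]), float(results[0]["lon"])


# Only builds the URL; no network request is made here
print(build_search_url("225 Baker St NW, Atlanta, GA 30313, USA"))
```

Calling fetch_first_match() performs the actual HTTP request, so it needs network access. geopy wraps this kind of plumbing for you, which is the main reason to prefer it.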

Installation

				
					pip install geopy
				
			

Syntax

				
					from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="<<some_app_name>>")
location = geolocator.geocode("<<address>>")
				
			

Geocoding (forward geocoding) using geopy

Let’s look at an example for the address Georgia Aquarium, Atlanta, USA. First, we need to create a Nominatim geolocator object called geolocator. As mentioned earlier, you can use any other geolocator of your choice. But we will stick to Nominatim throughout this post as it is free to use without any API keys, etc.

Next, pass the address you want to geocode to the geocode() method. The result is stored in the location object, from which you can read details such as the address, latitude, and longitude, as shown below. If the service finds no match, geocode() returns None.

				
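					from geopy.geocoders import Nominatim
# user_agent identifies your application to the Nominatim service
geolocator = Nominatim(user_agent="myapp")
location = geolocator.geocode("225 Baker St NW, Atlanta, GA 30313, USA")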
				
			
				
					print(location.address)
				
			
				
					Georgia Aquarium, 225, Baker Street Northwest, Atlanta, Fulton County, Georgia, 30313, United States
				
			
				
					print(location.latitude, location.longitude)
				
			
				
					33.76326745 -84.39511726814364
				
			
				
					print(location.raw)
				
			
				
					{
   "boundingbox":[
      "33.7623777",
      "33.7643007",
      "-84.3960032",
      "-84.3939931"
   ],
   "class":"tourism",
   "display_name":"Georgia Aquarium, 225, Baker Street Northwest, Atlanta, Fulton County, Georgia, 30313, United States",
   "importance":0.9273629297966992,
   "lat":"33.76326745",
   "licence":"Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
   "lon":"-84.39511726814364",
   "osm_id":28912103,
   "osm_type":"way",
   "place_id":97669427,
   "type":"aquarium"
}
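
To tie this back to the feature-engineering use case from the introduction, here is a minimal sketch that geocodes a list of addresses and copes with addresses the service cannot resolve (geocode() returns None for those). The helper addresses_to_coordinates is hypothetical and not part of geopy. It accepts any callable that behaves like geolocator.geocode, and the stand-in geocoder exists only so the example runs without network access.

```python
from types import SimpleNamespace


def addresses_to_coordinates(addresses, geocode):
    """Return one (latitude, longitude) tuple per address.

    `geocode` is any callable that behaves like geolocator.geocode:
    it returns an object with .latitude/.longitude, or None if no match.
    """
    coordinates = []
    for address in addresses:
        location = geocode(address)
        if location is None:
            # Keep the row but mark the coordinates as missing
            coordinates.append((None, None))
        else:
            coordinates.append((location.latitude, location.longitude))
    return coordinates


def fake_geocode(address):
    """Stand-in for geolocator.geocode so the sketch runs offline."""
    known = {
        "225 Baker St NW, Atlanta, GA 30313, USA": SimpleNamespace(
            latitude=33.76326745, longitude=-84.39511726814364
        )
    }
    return known.get(address)


rows = ["225 Baker St NW, Atlanta, GA 30313, USA", "not a real address"]
print(addresses_to_coordinates(rows, fake_geocode))
```

With a real geolocator you would pass geolocator.geocode instead of fake_geocode. For many rows, geopy also ships geopy.extra.rate_limiter.RateLimiter, which wraps a geocode function and adds a delay between calls, for example RateLimiter(geolocator.geocode, min_delay_seconds=1).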
				
			

Reverse Geocoding using geopy

In the above example, from the address, we got latitude and longitude. Let’s now apply reverse geocoding to get the address from the given geographic coordinates (latitude and longitude).

For reverse geocoding, you need to call the reverse() method on the geolocator object by passing latitude and longitude as parameters. Then you can get the address and a lot of other details as you can see below.

				
					location = geolocator.reverse("33.76326745, -84.39511726814364")
print(location.address)
				
			
				
					Georgia Aquarium, 225, Baker Street Northwest, Atlanta, Fulton County, Georgia, 30313, United States
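
Reverse geocoding is also a convenient way to get structured parts such as the city and country, which the introduction mentions as possible features. With Nominatim, location.raw from reverse() normally includes an "address" dictionary of components. The sketch below assumes that layout. The helper name and the hand-written sample dictionary are illustrative, and the keys that appear (city, town, village, and so on) vary by place.

```python
def extract_address_parts(raw):
    """Pick a few components out of a Nominatim-style raw result (hypothetical helper)."""
    address = raw.get("address", {})
    # Smaller places are reported under "town" or "village" instead of "city"
    city = address.get("city") or address.get("town") or address.get("village")
    return {
        "city": city,
        "state": address.get("state"),
        "postcode": address.get("postcode"),
        "country": address.get("country"),
    }


# Hand-written sample shaped like location.raw; not actual service output
sample_raw = {
    "address": {
        "city": "Atlanta",
        "county": "Fulton County",
        "state": "Georgia",
        "postcode": "30313",
        "country": "United States",
    }
}
print(extract_address_parts(sample_raw))
```

With a live result you would call extract_address_parts(location.raw) on the object returned by geolocator.reverse().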
				
			

Note that other geocoders, such as Bing and Google, may require additional parameters (for example, an API key), so check the documentation of the geocoder you use.

The geocode() method accepts other parameters such as timeout, limit, language, and geometry. You can use these to control the geocoder output.
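
As an illustration of these parameters, the sketch below asks for several candidate matches instead of one. exactly_one, limit, language, and timeout are keyword arguments of Nominatim's geocode(). The wrapper top_matches is a hypothetical helper, and the stub geolocator is there only so the example runs offline.

```python
from types import SimpleNamespace


def top_matches(geolocator, query, limit=3, language="en", timeout=10):
    """Return up to `limit` (address, latitude, longitude) candidates for a query."""
    results = geolocator.geocode(
        query,
        exactly_one=False,  # return a list of matches instead of a single one
        limit=limit,        # maximum number of matches to request
        language=language,  # preferred language of the returned address
        timeout=timeout,    # seconds to wait for the service before giving up
    )
    # With exactly_one=False, geopy returns None when nothing matches
    return [(r.address, r.latitude, r.longitude) for r in (results or [])]


class StubGeolocator:
    """Offline stand-in that mimics the call signature used above."""

    def geocode(self, query, exactly_one=True, limit=None, language=False, timeout=None):
        if "Aquarium" not in query:
            return None
        match = SimpleNamespace(
            address="Georgia Aquarium", latitude=33.76326745, longitude=-84.39511726814364
        )
        return [match][:limit]


print(top_matches(StubGeolocator(), "Georgia Aquarium"))
print(top_matches(StubGeolocator(), "nowhere"))
```

With the real service you would pass the Nominatim geolocator created earlier, for example top_matches(geolocator, "Georgia Aquarium").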

Distance between two coordinates

Sometimes you may want to calculate the distance between two addresses given their latitude and longitude. For this, geopy provides two methods: geodesic distance, which models the Earth as an ellipsoid, and great-circle distance, which models it as a sphere. The two results usually differ only slightly.

Let’s use the two coordinates below to calculate the distance between them. Coordinate 1 refers to Georgia Aquarium and coordinate 2 refers to Stone Mountain Park.

Coordinate 1: 33.76326745, -84.39511726814364 (225 Baker St NW, Atlanta, GA 30313, USA — Georgia Aquarium)

 

Coordinate 2: 33.804504, -84.1587461 (1000 Robert E Lee Blvd, Stone Mountain, GA 30083, United States — Stone Mountain Park)

Distance between coordinates using the geodesic method:

				
					from geopy import distance

georgia_aquarium = (33.76326745, -84.39511726814364)
stone_mountain = (33.804504, -84.1587461)

print(distance.distance(georgia_aquarium, stone_mountain).miles)
print(distance.distance(georgia_aquarium, stone_mountain).km)
				
			
				
					13.896931316085105 
22.36494303195367
				
			

Distance between coordinates using the great-circle method:

				
					from geopy import distance

georgia_aquarium = (33.76326745, -84.39511726814364)
stone_mountain = (33.804504, -84.1587461)

print(distance.great_circle(georgia_aquarium, stone_mountain).miles)
print(distance.great_circle(georgia_aquarium, stone_mountain).km)
				
			
				
					13.869732545588302 
22.321170853847264
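
To make the great-circle number less of a black box, here is a minimal standard-library sketch of the haversine formula, which treats the Earth as a sphere. The mean Earth radius of 6371.009 km is an assumption chosen to match the default that geopy documents, and the function name is made up for this example. The result should land close to the great-circle output above.

```python
import math

EARTH_RADIUS_KM = 6371.009  # assumed mean Earth radius


def haversine_km(point_a, point_b, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    # Haversine of the central angle between the two points
    a = math.sin(d_lat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(d_lon / 2) ** 2
    # Convert the central angle to a distance on the sphere
    return 2 * radius_km * math.asin(math.sqrt(a))


georgia_aquarium = (33.76326745, -84.39511726814364)
stone_mountain = (33.804504, -84.1587461)

print(haversine_km(georgia_aquarium, stone_mountain))
```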
				
			

Conclusion

In this article, you learned what geocoding and reverse geocoding are, how to use the geopy package for both, and how to calculate the distance between two coordinates. I hope you found this article useful.

Chetan Ambi


A Software Engineer & Team Lead with over 10 years of IT experience, and a Technical Blogger with a passion for cutting-edge technology. Currently working in the field of Python, Machine Learning & Data Science. Chetan Ambi holds a Bachelor of Engineering Degree in Computer Science.
