Office for Technology Commercialization
http://www.research.umn.edu/techcomm
612-624-0550

Spatial Hadoop: Spatial Data Analysis Extension for Apache Hadoop

Technology #20150081

Categories
Apache Hadoop, Spatial Data, Location Data
Researchers
Ahmed Eldawy
Ph.D. Candidate, Data Management Lab, Department of Computer Science and Engineering, College of Science and Engineering
Mohamed F. Mokbel
Associate Professor, Department of Computer Science and Engineering, College of Science and Engineering
Managed By
Andrew Morrow
Technology Licensing Officer

Location and Spatial Data Analysis

Spatial Hadoop, an extension to the open-source commodity computing platform Apache Hadoop, makes it possible to perform spatial analysis on large quantities of location and GPS data.

Please Note: SpatialHadoop has been adopted by the Eclipse Foundation under the name GeoJinni.

Apache Hadoop Commodity Computing Extension

The availability of large data sets creates high demand for computational power. Apache Hadoop is an open-source data analysis system designed to run on clusters of commodity hardware. Spatial Hadoop extends it with its own high-level language, Pigeon, and a set of tools for large-scale spatial and location data analysis.

Spatial Big Data Opportunities

Massive amounts of location data are being gathered from GPS devices, geo-tagged social media, map and location services, satellite imagery, and check-in services. Spatial Hadoop provides spatial indexes, range queries, spatial joins, and a suite of computational geometry tools. With these, location data can be processed efficiently, creating new opportunities in fields such as location-based advertising, weather prediction, and transportation optimization.
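To make the idea of a spatial index and a range query concrete, here is a minimal single-machine sketch in Python. It is not Spatial Hadoop's API or its Pigeon language; the `GridIndex` class, its methods, and the sample coordinates are hypothetical and chosen only for illustration. The sketch assumes a simple uniform grid: each point is stored in a grid cell, so a range query only has to inspect the cells that overlap the query rectangle.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Toy uniform-grid index over 2D points (illustrative only, not Spatial Hadoop code)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cell_x, cell_y) -> list of (x, y, label)

    def _cell(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def insert(self, x, y, label):
        self.cells[self._cell(x, y)].append((x, y, label))

    def range_query(self, x_min, y_min, x_max, y_max):
        # Visit only the cells that overlap the query rectangle,
        # then filter the points in those cells exactly.
        cx_min, cy_min = self._cell(x_min, y_min)
        cx_max, cy_max = self._cell(x_max, y_max)
        hits = []
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for x, y, label in self.cells.get((cx, cy), []):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(label)
        return sorted(hits)


# Hypothetical sample data: (longitude, latitude, label).
index = GridIndex(cell_size=1.0)
index.insert(-93.23, 44.97, "minneapolis")
index.insert(-93.09, 44.95, "saint_paul")
index.insert(-87.63, 41.88, "chicago")

# All points inside a rectangle around the Twin Cities.
twin_cities = index.range_query(-94.0, 44.0, -93.0, 45.0)
print(twin_cities)
```

In a distributed setting the same principle would apply at a larger scale: data partitioned by location lets a query skip partitions that cannot contain matching records.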

BENEFITS AND FEATURES OF SPATIAL HADOOP:

  • Process location data and GPS data efficiently through commodity computing.
  • Simple, high-level language for data management.
  • Open source and compatible with other Apache Hadoop extensions.

Phase of Development

Product available; released as open source for download in Fall 2014.