The core mission of GeoCommons is bringing the power of geography to the masses. First we focused on making data more accessible, then on quality cartographic visualization of data, and today we've added geographic analysis of data. Over Christmas we blogged about the "12 Analytics of Christmas" we'd built for GeoIQ. Now we're celebrating Christmas in June by bringing all those analytics and more to GeoCommons, so anyone can run analysis on their data and share it with the world. GeoCommons has a variety of geoprocessing and statistical analysis tools available for anyone to use. If you aren't familiar with the lingo, geoprocessing capabilities allow you to do geographic queries like aggregating customers by zip code, or finding all the houses within one mile of a metro stop. Statistical analysis tools let you look at things like the difference between two time periods, or run a correlation between two data sets. We have tutorials and user manuals to step you through all the new capabilities. There are also step-by-step videos and guides for each new analysis feature, embedded right at the point where you choose it. The focus across the board is to make geographic analysis accessible to as many people as possible. Bill made this quick video so you can see an example of running through an analysis:
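To make the "houses within one mile of a metro stop" query concrete, here is a minimal sketch in plain Python. This is not GeoCommons's implementation; the metro stop, the addresses, and the coordinates are all made-up sample data, and the distance math is an ordinary haversine great-circle formula:

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical sample data: one metro stop and two houses as (lat, lon).
metro_stop = (38.8977, -77.0365)
houses = {
    "123 Elm St": (38.9000, -77.0300),   # a few blocks away
    "456 Oak Ave": (38.9500, -77.1200),  # several miles away
}

# The "within one mile" query is just a distance filter.
nearby = [
    addr for addr, (lat, lon) in houses.items()
    if haversine_miles(metro_stop[0], metro_stop[1], lat, lon) <= 1.0
]
print(nearby)
```

A real geoprocessing engine would do the same test against a spatial index instead of scanning every row, but the underlying question is the same.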
So, why such a long wait since Christmas to get some analytic goodness on GeoCommons? In short: scalability. Providing analysis capabilities for even a large enterprise pales in comparison to the traffic we get on public GeoCommons. There was a real concern we could melt the servers if the whole of GeoCommons got analysis happy. We wanted everyone to be able to go ballistic with analysis: big data sets, small data sets, concurrent analyses from a single user. The problem was that the traditional approach to running GIS analysis uses a synchronous, dedicated client-server connection. You kick off your analysis, lock up your desktop, and go get coffee while you hope for the best. That is lame enough on a desktop, but it definitely won't work for a multi-tenant Web application. We could not afford to be CPU bound by a single-server architecture if we wanted tens of thousands of users running analyses across GeoCommons. What to do?
Brilliant engineering team to the rescue. Tim "Chippy" Waters, Andrew "The Wizard" Semprebon, and Matt "Do The" Dew were in the trenches building a new approach to let us unleash analytics across GeoCommons. I'll leave the gritty details to them and just give the CliffsNotes version. The team built a distributed, asynchronous analysis capability that allows GeoCommons to run multiple concurrent analyses at scale. As a user, you can kick off multiple analyses, and GeoCommons will notify you when they are done while letting you keep trucking along making data and maps in the application. If the load starts getting heavy, we can just spin up more analysis servers to distribute jobs across. That pushes the limits of analysis to a point where we are computationally efficient enough to provide it for free to everyone. So give it a shot and run some analysis. It is not nearly as scary as folks make it out to be. A couple of clicks and you'll be cranking out powerful geographic analysis like an Easy-Bake Oven.
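The team hasn't published the internals here, but the general pattern described above (submit jobs without blocking, keep working, collect results when the queue drains, add workers under load) can be sketched with a plain Python work queue. All names and the toy "analysis" payload are illustrative assumptions, not GeoIQ's actual API:

```python
import queue
import threading

job_queue = queue.Queue()
results = {}

def analysis_worker():
    """Pull analysis jobs off the shared queue until a None sentinel arrives."""
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: shut this worker down
            job_queue.task_done()
            break
        job_id, numbers = job
        # Stand-in "analysis": sum the payload. A real job would run a
        # geoprocessing or statistical routine instead.
        results[job_id] = sum(numbers)
        job_queue.task_done()

# Spin up a pool of workers; under heavier load you would start more.
workers = [threading.Thread(target=analysis_worker) for _ in range(3)]
for w in workers:
    w.start()

# The "user" kicks off several analyses and is never blocked waiting.
for job_id in range(5):
    job_queue.put((job_id, list(range(job_id + 1))))

job_queue.join()       # every submitted job finished: the notification point
for _ in workers:      # one sentinel per worker to stop the pool
    job_queue.put(None)
for w in workers:
    w.join()

print(results)
```

In a real deployment the queue would be a network service shared by many machines rather than an in-process `queue.Queue`, which is what lets you add analysis servers without touching the web tier.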
Welcome to the Esri DC Development Center blog. We write about features of our work on big data analytics, open platforms, and open data; what is new and exciting at Esri and in the community; and general industry thought leadership and discussions of geospatial data visualization and analysis.
Please explore what we're working on and let us know if you have any questions or ideas!