It has been fun to watch different members of the GeoCommons team provide their take on each of the analytics we’ve created. This is really just the tip of the iceberg of what is possible. The real excitement is what analytics you, the user, will create. In addition to the pre-baked analytical tools, you also have a blank slate to create your own equation.

For any data set in GeoCommons that you would like to run an equation over, you can select the “custom analysis” widget.
custom_equation_selection
Then you type in your equation, selecting from the attributes available in the data set. For example, I’ve pulled in a data set from the US Census with the rate of population growth, and I would like to predict how long it will take the population to increase by 10% for each county in the United States. To do so, I punch in the equation much as I would in Excel, clicking on the attributes I’d like to use as my variables, like below.

custom_equation

If you are not overly nerdy, you can skip this next bit.

In order to predict how long it will take for a county to increase its population by 10%, I made the following assumptions. The current (base) population level is “P0” and the rate of growth is “r” %. Then I used the simple formula: change in population over time t = rate of change * base population. Solving that gives population at time t = exp (r * t) * base population, so a 10% increase means:

1.1 * base population = exp (r * t) * base population.

natural log (1.1) = r * t

Therefore t = natural log (1.1) / r

If r is a percent figure, then

t = 100 * natural log(1.1) / r
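For anyone who would rather read code than algebra, here is a minimal JavaScript sketch of the same formula. The function name `yearsToGrow`, the `targetIncrease` parameter, and the choice to return `null` for zero or negative growth are assumptions made for illustration; none of them are part of GeoCommons.

```javascript
// Years needed for a population to grow by `targetIncrease` (0.10 = 10%),
// assuming continuous exponential growth at `ratePercent` percent per year.
// The name and the default value are illustrative, not a GeoCommons API.
function yearsToGrow(ratePercent, targetIncrease = 0.10) {
  // Zero or negative growth never reaches the target, so report no result.
  if (!(ratePercent > 0)) {
    return null;
  }
  // t = 100 * ln(1 + targetIncrease) / r, with r expressed in percent.
  return 100 * Math.log(1 + targetIncrease) / ratePercent;
}

console.log(yearsToGrow(2));   // a 2% rate: a little under 5 years
console.log(yearsToGrow(-1));  // null: a shrinking place has no answer
```

Returning `null` instead of a negative or infinite number is just one way to handle places that are not growing.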

If your Saturday night party is not seeing who can recite the most digits of Pi from memory then this is the spot to start reading again.

GeoCommons then creates a new attribute from the result of my analysis and I can create a map to visualize it like below:

population_prediction
All places that were experiencing negative growth were excluded from the results

I can see from the map that there are a few places that are taking a really long time to grow, like York County in Maine, which will see 10% growth in 4,811 years. Mapping places that are stagnant is really not all that interesting. What I’d really like to see is which places are growing fastest, and how much longer it will be before they add another 10% to their population. To do this, I’m going to take the same data and map the growth rate for each county. Then I can click on the fastest growing counties to see how long it will take for them to add another 10% to their population.


Using the filter tool, I can see that Loving County, Texas has the highest growth rate at 12.5%, and it will take about 9 months for it to add another 10% to its population. That means another 10% of the population is going to need housing and services, and this is a market that could have opportunity in it. Conversely, it could also be a county that will need more community services like schools and health care, and this could be an indicator of the need for more public investment by the state.
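The two figures quoted so far can be checked by hand with the formula from the nerdy section. The sketch below is only a sanity check in JavaScript; the helper names are made up for illustration, and the only inputs are the 12.5% rate and the 4,811 years mentioned in the text.

```javascript
// Time (in years) to add 10%, given a growth rate in percent per year.
const yearsToAddTenPercent = (ratePercent) => 100 * Math.log(1.1) / ratePercent;

// Loving County, Texas: 12.5% growth rate, as quoted in the text.
const lovingYears = yearsToAddTenPercent(12.5);
const lovingMonths = lovingYears * 12;
console.log(lovingYears.toFixed(2));   // "0.76"
console.log(lovingMonths.toFixed(1));  // "9.1"

// Inverting the formula recovers the rate implied by a known time.
// York County, Maine at 4,811 years works out to roughly 0.002% per year.
const impliedRatePercent = (years) => 100 * Math.log(1.1) / years;
console.log(impliedRatePercent(4811).toFixed(4));  // "0.0020"
```

So the 9 months is a rounded 0.76 years, and York County's implied growth rate is effectively flat.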

You could go in a lot of directions with a relatively simple analysis. Another user may want to tweak the analysis and look at the growth of school-aged children to see what the impact will be on the education system. Another user may examine locations by income bracket to see what goods and services could take advantage of the increase in demand.

This is the power of making analytics social and collaborative. All these analyses and results are discoverable and linked. Even better, all the linking and metadata generation is done automatically. That is the beauty of building on the Web: the ability to interconnect data and, now, analysis.

We can also make analysis itself more accessible to the masses. I don’t necessarily need to know all the mathematical intricacies of the equation we ran for the result:

(100)*Math.log(1.1)/[popp8t9]

If another user had a data set with the growth rate for elementary-aged children, and that attribute was called [child5t10], they could just plug it into the equation. They don’t necessarily need to know natural logarithms to use it:

(100)*Math.log(1.1)/[child5t10]
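To make the "swap the attribute" idea concrete, here is a rough JavaScript sketch of the same equation applied to whichever attribute holds the growth rate. The rows and their values are hypothetical, invented only to show the mechanics (they are not Census figures), and `timeToTenPercent` is an illustrative name rather than anything GeoCommons provides.

```javascript
// Hypothetical rows standing in for a data set; the values are made up
// purely to show the mechanics and are not real Census figures.
const rows = [
  { name: "County A", popp8t9: 2.0, child5t10: 1.0 },
  { name: "County B", popp8t9: 0.5, child5t10: 4.0 },
];

// Build a calculator for whichever attribute holds the percent growth rate,
// mirroring how [popp8t9] is swapped for [child5t10] in the equation.
function timeToTenPercent(attribute) {
  return (row) => 100 * Math.log(1.1) / row[attribute];
}

const byPopulation = rows.map(timeToTenPercent("popp8t9"));
const byChildren = rows.map(timeToTenPercent("child5t10"));
console.log(byPopulation.map((t) => t.toFixed(2)));  // [ '4.77', '19.06' ]
console.log(byChildren.map((t) => t.toFixed(2)));    // [ '9.53', '2.38' ]
```

The equation itself never changes. Only the attribute name passed in does.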

Or let’s say I did not have a rate of change, just counts of elementary school-aged children for two time periods, [child5t10_2008] and [child5t10_2009]. Then I could run this equation to calculate the growth rate:

([child5t10_2009] - [child5t10_2008])/[child5t10_2008] * 100
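As a small sketch of how the two equations chain together, the JavaScript below computes the percent change and then feeds it into the time-to-10% formula. The counts are hypothetical, the function names are illustrative, and the log-based rate is an extra that does not appear in the equations above. It is included only because it matches the exp (r * t) model exactly when the two counts are one year apart.

```javascript
// Percent change between two observations, matching the equation above.
function percentGrowth(earlier, later) {
  return (later - earlier) / earlier * 100;
}

// Continuous (log) rate in percent, which lines up exactly with the
// exp(r * t) model; for small changes the two rates are nearly equal.
function continuousRatePercent(earlier, later) {
  return Math.log(later / earlier) * 100;
}

// Hypothetical counts of school-aged children in 2008 and 2009.
const rate = percentGrowth(1000, 1020);             // about 2 (percent)
const logRate = continuousRatePercent(1000, 1020);  // about 1.98 (percent)
console.log(100 * Math.log(1.1) / rate);     // about 4.77 years
console.log(100 * Math.log(1.1) / logRate);  // about 4.81 years
```

For growth rates of a few percent, the difference between the two rates barely moves the answer.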

I could decide that calculating growth rates is a generic enough function to deserve its own analysis widget, like “buffer” or “aggregation”… well, that is a topic for another post.

P.S. In fact, I do need to recognize the help from Raj Kulkarni at GMU, because I was a bit rusty on logistic equations myself (too much blogging and emailing, sad to say). Thanks Raj!

 

