Democratizing GIS through Participatory Accessibility
Over the last few blog posts we’ve raised several problems with the current approach to GIS, but have not specifically offered solutions. At some point you have to stop being an armchair quarterback and offer suggestions for a path forward. Since our critique has meandered across multiple blog posts, I thought it would be useful to first give a quick recap of the challenges facing GIS:
- Historically geography as a discipline has been focused on creating a “specialty” around the science of geographic information vs. building broadly accessible tools
- The emergence of “location as a feature” and the popularity of SoLoMo have empowered massive new audiences to generate geographic data outside the discipline of geography and GIS
- The technical structure and culture of GIS is monolithic, where the creation, management, analysis and visualization of geographic data happens in a single closed end-to-end workflow
- In the emerging ecosystem of location data, the monolithic structure of GIS has problems of scope (requiring GIS to house all specialty expertise for any subject matter with a geographic component) and scale (the need for GIS professionals to verify all external data, whose volume far outstrips the supply of professionals)
- The computational architecture of GIS is bounded by CPU speed, which has peaked, requiring distributed computational approaches
- The size, speed, and persistence of location data being generated changes fundamental concepts around error bounds, sample sizes, and margin of error that make traditional “authoritative” approaches to data problematic
There are several potential solutions to these problems, and many have been successfully fielded in numerous domains already. The common theme that cuts across all potential solutions is a shift from a monolithic to a distributed architecture and culture for GIS as an industry. Adapting and augmenting the current technical “architecture” of GIS is likely less important and easier than changing the monolithic “culture” of GIS. To keep this blog post manageable I will just hit on the highlights of a potential technical and cultural evolution for GIS.
Shifting to a Distributed Architecture – The Data Ecosystem
So, let’s take a deep breath and dive in (just imagine @ajturner saying this at about 327 wpm). The shift away from a monolithic architecture depends on one premise: geospatial data is going to be created by a wide variety of applications and platforms. Further, data will not need to be siphoned through any single point or system in order for it to be usable. Data can remain resident in its home location then queried and indexed remotely through open standards. Search across multiple data stores can be unified in a federated manner and communicate validating metadata. Data should be translatable into multiple formats to meet end user demand. Critical across all of these is that they are self service, and a user without GIS training can easily discover and create data. A distributed approach to data means the creation of an ecosystem. In an ecosystem there is the ability to perform tasks in a variety of applications that are interconnected through both data portability and distributed analytics. This prevents vendor lock-in and the promulgation of proprietary formats.
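To make the federated idea a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `DataStore` class, the store names, and the sample records are stand-ins for independently hosted services, not any real product's API. The only point is to show that each store can answer the same query in place, that every result carries its source with it, and that the merged result can be translated into a portable format (GeoJSON here).

```python
import json
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    lon: float
    lat: float


class DataStore:
    """Stand-in for one independently hosted data store (hypothetical)."""

    def __init__(self, store_name, records):
        self.store_name = store_name
        self.records = records

    def query(self, bbox):
        """Return records inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
        min_lon, min_lat, max_lon, max_lat = bbox
        return [
            r for r in self.records
            if min_lon <= r.lon <= max_lon and min_lat <= r.lat <= max_lat
        ]


def federated_search(stores, bbox):
    """Ask every store the same question and tag each hit with its source."""
    hits = []
    for store in stores:
        for record in store.query(bbox):
            hits.append({"source": store.store_name, "record": record})
    return hits


def to_geojson(hits):
    """Translate search hits into a GeoJSON FeatureCollection string."""
    features = [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [h["record"].lon, h["record"].lat],
            },
            "properties": {"name": h["record"].name, "source": h["source"]},
        }
        for h in hits
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


# Made-up sample data: two stores that never share a database.
city_store = DataStore("city-open-data", [
    Record("Library", -77.03, 38.90),
    Record("Airport", -77.45, 38.95),
])
county_store = DataStore("county-parcels", [
    Record("Fire Station", -77.02, 38.91),
])

hits = federated_search([city_store, county_store], (-77.10, 38.85, -77.00, 38.95))
geojson_text = to_geojson(hits)
```

In a real deployment the `query` method would be a remote call over an open standard rather than a list filter, but the shape of the idea is the same: the data stays home and the question travels.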
It is insufficient to only provide connectors that allow GIS applications to ingest third party data – so called “volunteered geographic information”. This means all data must still run through a GIS before being consumed, so that it can be stamped as authoritative. This ensures that GIS will remain monolithic by forcing all emerging data streams to be pushed to “‘private data silos’ rather than being part of a larger system of applications that work together” (Wikipedia 2011). As the applications that produce useful geospatial data proliferate, this approach will hit both scale and scope barriers.
Just as data is being generated by a wide number of new applications, the analysis of this data is increasingly occurring in places other than GIS applications. This includes both spatial and non-spatial analysis of data, both of which are very useful to the community. Projects like “R” (an open source statistical programming language) provide valuable analytical routines including sophisticated spatial analysis functions. For very large sets of data a variety of users are leveraging technologies like Hadoop, Cassandra, MongoDB, CouchDB and several others for sophisticated analysis of streaming data sources. Spatial extensions are actively being integrated into NoSQL data stores. SimpleGeo has done great work with Cassandra, SPADAC’s innovations on top of Hadoop with MrGEO are impressive, and the community has been rapidly building out spatial capabilities for CouchDB and MongoDB. All of these examples have the ability to distribute processing and not be CPU bound. An ecosystem of data and analysis allows new innovation to be plugged in easily, without legacy approaches acting as a choke point.
Any of the various sources of “data” should be able to push their data to an “analysis” end point to have a question answered. For instance, if a user wanted to take an aggregation of data from a MongoDB datastore and the results of an intersection from MrGEO and run a statistical correlation in “R”, that should be a fluid task and not require a user to visit three separate interfaces. Further, the results should be easily discoverable and tweakable; a user swaps out the MongoDB data with another source and the same correlation is run, or the user changes the equation to take another perspective.
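As a rough illustration of that kind of swappability, here is a small Python sketch. It does not call MongoDB, MrGEO, or R; the three source functions are hypothetical placeholders that return made-up numbers, and the two analyses are plain Python. The assumption being shown is only that if sources and analyses share a simple contract, either side can be swapped without touching the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def sample_covariance(xs, ys):
    """Sample covariance, offered as an alternative 'equation'."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical sources: each is just a callable returning one value per region.
def aggregated_counts():
    return [1.0, 2.0, 3.0, 4.0]


def intersection_areas():
    return [2.0, 4.0, 6.0, 8.0]


def alternative_counts():
    return [8.0, 6.0, 4.0, 2.0]


def run_analysis(source_x, source_y, analysis):
    """Pull data from two sources and hand it to whichever analysis is chosen."""
    return analysis(source_x(), source_y())


# Same correlation, different source; same sources, different equation.
original = run_analysis(aggregated_counts, intersection_areas, pearson)
swapped_source = run_analysis(alternative_counts, intersection_areas, pearson)
swapped_equation = run_analysis(aggregated_counts, intersection_areas, sample_covariance)
```

The design choice worth noting is that `run_analysis` knows nothing about where the numbers come from or what is computed, which is what lets a user tweak one piece at a time.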
In a distributed Web environment each of these actions can be tracked, so a persistent lineage of data analysis can be created automatically. Each analysis or derivative of a data set can inherit the metadata from the source, and then catalog what manipulations or analyses were done to the data. This allows any user to introspect a map or analysis to see each step taken to create it as well as the original source for all data. The crowd of users, both “on the ground” users and “analysts”, can provide quality assurance feedback on both data accuracy and data analysis. A critical analysis may have been based on an event happening at a specific location, but an “on the ground” user notices the event is in the wrong location and pushes back photographs with evidence. Alternatively, thirteen different users all provide feedback identifying a wrong location and the alternative location hits a critical mass. Factual has done some brilliant work scaling this concept successfully to massive POI data stores.
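Here is one way the lineage and "critical mass" ideas could look in Python. This is a sketch under my own assumptions, not a description of how any existing platform does it: the `Dataset` class, the `derive` method, and the simple vote threshold in `consensus_location` are all hypothetical names and rules chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Dataset:
    values: list
    metadata: dict
    lineage: list = field(default_factory=list)

    def derive(self, step_name, func):
        """Apply func to the values and return a new Dataset that inherits
        the source metadata and appends the step to its lineage."""
        return Dataset(
            values=func(self.values),
            metadata=dict(self.metadata),
            lineage=self.lineage + [step_name],
        )


def consensus_location(reports, threshold):
    """Return the most-reported location if at least `threshold` users
    reported it; otherwise return None (no critical mass yet)."""
    if not reports:
        return None
    location, count = Counter(reports).most_common(1)[0]
    return location if count >= threshold else None


# Made-up example: every derived product remembers where it came from.
raw = Dataset([3, 1, 2], {"source": "field survey"})
ordered = raw.derive("sort ascending", sorted)
top_two = ordered.derive("keep first two", lambda values: values[:2])

# Made-up example: thirteen users agree on a corrected location, one does not.
reports = [(-77.03, 38.90)] * 13 + [(-77.50, 39.10)]
corrected = consensus_location(reports, threshold=13)
```

A fixed threshold is the simplest possible rule. A production system would likely weigh reporter reputation or supporting evidence such as photographs, but the lineage record is what makes any such feedback traceable to a specific step and source.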
From the opposite perspective, a sales group in the field runs a correlation of new demographic data or social media trending against their current sales data, and an analyst at headquarters notes they ran the analysis incorrectly and provides feedback/corrections. The same approach provides value even among fellow GIS professionals, where analysis lineage provides the opportunity to collaborate around work. Importantly, it allows external experts to participate as well – the spatial analysis was spot on but the anthropological assumptions were misguided. The anthropologist can access the work and data, and contribute back a correction or better model without having to learn the intricacies of GIS. Leveraging the “crowd” is not just useful for “open” data; the internal crowd across an enterprise can be tapped in the same way to get more out of “internal” data. The VGI mindset is just as limiting internally for an enterprise as it is externally in the public domain.
An ecosystem fosters equality, which leads to better collaboration and better data/analysis. It is an illusion that just because the data is created and analyzed by a GIS professional it is always going to be correct, especially considering the incredible number of subject matter areas that have to be covered by spatial data and analysis. The reality is that all data and analysis starts off quite bad, and only by having many eyes and perspectives iterate across it does the product converge with truth. A distributed ecosystem of data and analysis maximizes the number of eyes and perspectives validating work in a method that creates collective wisdom through data lineage and analysis tracking.
Shifting to a Distributed Culture
The separation of data as “authoritative” and “volunteered” is another label for the “professional” vs. “amateur” debate that has permeated the democratization enabled by the Web over the last decade. This was the same argument pushed against Wikipedia by the encyclopedia and broader publishing industry. Yet, multiple studies have found the accuracy of Wikipedia meets and exceeds that of popular encyclopedias. The same argument emerged from journalists with the emergence of blogs: amateurs could not be trusted and would never be a viable source of trusted content for the public. Blogs, though, have become an important part of the media community, but the technical and cultural approach to blogging did not emerge from word processing.
Granted, GIS is not word processing and the situations differ at many levels – just as journalists did not disappear because of the rise of blogs, neither will GIS analysts. In both situations, though, the community had to evolve and adapt. GIS is at a similar crossroads – not only does the technology need to adapt, but so does the culture. GIS and the professionals that are qualified to use it cannot be the only place where location data can be created, managed, analyzed and visualized. Technology is meaningless without addressing the people side of the equation; people are part of the system and will be the root of all solutions. All of these functions need to be available across the community through a self service delivery mechanism available from “novice to expert”. In short, we need to stop creating artificial barriers and distinctions between the GIS community and the rest of the world working with geographic data. While it is easy to think these distinctions will create job security and relevance, the reality is the opposite. The most relevant journalists today are the ones that embraced blogging and social media. It will be no different in GIS, and the GIS professionals/academics/researchers I’ve respected the most over the years are the ones at the leading edge – embracing the technological and cultural shift in front of us.
3 Responses to Democratizing GIS through Participatory Accessibility
‘…Data can remain resident in its home location then queried and indexed remotely through open standards.’ True.
‘…Data should be translatable into multiple formats to meet end user demand. Critical across all of these is that they are self service, and a user without GIS training can easily discover and create data.’ Also true, and … almost theoretically done
‘.. A distributed approach to data means the creation of an ecosystem. In an ecosystem there is the ability to perform tasks in a variety of applications that are interconnected through both data portability and distributed analytics.’ Agreed.
‘…This prevents vendor lock-in and the promulgation of proprietary formats.’
You’ll have to concede that either way, some vendor would find ways to get around this by splitting its offer into segments, and … fortunately for the sake of legitimate ROI, will make the necessary bucks to live while giving the community ways to get in. No innovation on the horizon without a somewhat mixed model, as we’ve come to see in various related sectors.
On the other side, the real leap is for the end-user to be given tools that are likely to help make sense of such amount of data. That’s really what it is about.
I love what you’re doing. Truly.
“Historically geography as a discipline has been focused on creating a “specialty” around the science of geographic information vs. building broadly accessible tools”
You are confusing the Discipline of Geography with practitioners that have chosen to specialize in the application of geographic information systems. Geographers are incredibly broad in their interests and have traditionally synthesized spatial data from broad and disparate sources.
GIS was and is a powerful tool for studying large spatial datasets and thus was made use of by Geographers. Some of us decided to capitalize on the growing interest in GIS and make a living specializing in it. Are we still geographers? Sure. But we are not defining the discipline.
Geography is a social science that has traditionally mixed more defined and rigid topics together. In many ways classically trained geographers are desperately needed to help understand the world better through the explosion of new information that geospatial technology adoption is enabling.
My 2 cents – Justin
Thanks for the comments Justin and Ceedoo –
Ceedoo, I agree there need to be monetization paths, but I believe they should be based on usage and not vendor lock-in. I don’t believe that proprietary systems which make data portability challenging in order to stifle competition are a sustainable monetization path going forward. Too many negative externalities. An open and distributed architecture does not mean there are no opportunities for monetization; it just means the customer has an open choice without the heavy burden of unreasonable switching costs.
Justin – I 100% agree that we need more people trained in geography. Although I don’t think the status quo is well suited to meet demand. In my opinion it is the discipline of Geography that has tied practitioners of GIS to it. This was the whole debate around GIS being science and not a tool. Since it is a science it has to be housed in Geography – see the AAG annals article here on it all – http://dusk.geo.orst.edu/annals.html
While not exclusively, the majority of us learn GIS and related spatial thinking in universities. Personally I think if we are going to meet demand we need to start teaching spatial thinking and technology in Computer Science departments, Business Schools etc. Take the knowledge to the people instead of expecting the people to come to the knowledge. I’d love to see Geography as a discipline blossom and take advantage of the upsurge in demand, but I think they are too focused on the post modern interpretations of Google Earth and ubiquitous location tracking. My Geography alma mater is constantly fighting budget cuts and reaching for alumni donations to just keep level. In order to meet the needs of the future we need the bigger more well funded departments to get behind geospatial technologies.
We now have 4 cents