Hierarchy or Folksonomy? Is there a Hybrid between Order and Chaos?
When we started the very first iteration of GeoCommons in 2005, folksonomies were all the rage, and we jumped on board, using tags to organize the geospatial data that was pushed into the new platform. While the prototype was deployed, we ran into many of the same issues other applications have found with folksonomies:
1) people’s tags may be difficult for others to understand,
2) people may have tagged items inappropriately for others’ needs.
In short, your users will not always implement tags in ways that are productive for the community, resulting in the extreme in Flickr’s 20 million unique tags. How many of those 20 million tags are misspelled words, or so far off the beaten path that they never get found?
In addition to the problems you encounter with folksonomies in general, you have the further complications of geospatial data. First, all geospatial data sets have location tags, but adding them in an unstructured way creates enough chaos that it is very difficult to leverage location tags in a thorough way. Second, many potential users do not know the variety of geodata available. Put more simply, they do not know what to search for, so the ability to browse through data by topic is appealing.
Despite their downsides, folksonomies are incredibly powerful and have been hugely effective in organizing vast amounts of data on the web. So, as we worked on the next iteration of GeoCommons, we started looking at possible hybrid approaches to folksonomies and hierarchies.
Specifically, we looked at the two problems specific to geospatial data listed above: 1) place tags and 2) organizing data for browsing. Solving them required both short-term and long-term solutions.
Fortunately, we had a small advantage over many crowdsourced projects in that we have a full-time data team. They are a great group of folks who spend their days finding cool geodata and coming up with clever ways to organize it.
Through the data team and the other community members who contributed data to the first iteration of GeoCommons, we had a big pool of data with a wide variety of tags to examine. What we found were some distinct trends in the tagging and titling of data. Across the data there was a common set of tags that broke the data up into a useful set of distinct categories, but there were also many data sets tagged in ways that often made them undiscoverable. After the analysis, we started to look at structures we could establish to encourage self-similarity in tagging while keeping the flexibility to adapt.
The result was the creation of a location and topical taxonomy, based on our existing corpus of data, that has the intelligence to adapt as the content grows and evolves. I can’t go into the technical details in depth, but fundamentally the concept is to intelligently leverage the taxonomies and structures to provide suggestions that help users tag their data better.
In many cases this can be very simple, like providing tips on how to tag and title effectively to make your data more valuable to the community. For instance, with titles we found that, across GeoCommons, four key pieces of information had been used for datasets in the past:
1) Source name,
2) Original name of dataset from source (or short description of dataset),
3) Geographic area,
4) Time period of data.
Examples:
Communicating this effectively to users is a great way to get better consistency across data contributions, while still allowing flexibility for users to be creative and bring in information that does not fit the rigid mold of a hierarchy. Of course, this is the simplest approach, and you can get far more clever.
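As a rough illustration of the four-part title tip, here is a minimal Python sketch. The class name `DatasetTitle`, the separator format, and the sample values are all hypothetical; the only thing taken from the post is the list of four pieces of information.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetTitle:
    """The four pieces of information found in well-titled datasets."""
    source: str  # source name
    name: str    # original dataset name, or a short description
    area: str    # geographic area
    period: str  # time period of the data

    def missing_parts(self) -> List[str]:
        """Return the names of any parts the contributor left blank."""
        return [part for part, value in vars(self).items() if not value.strip()]

    def format(self) -> str:
        """Join the parts into one title (the layout is an assumption)."""
        return f"{self.source}: {self.name}, {self.area}, {self.period}"


# Hypothetical values, for illustration only.
title = DatasetTitle("Example Agency", "Road Centerlines", "Virginia", "2007")
print(title.format())
print(DatasetTitle("", "Road Centerlines", "Virginia", "").missing_parts())
```

A contribution form could use something like `missing_parts()` to show a gentle tip rather than reject the upload, which keeps the flexibility described above.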
Del.icio.us, for instance, has a great feature that notifies a user when they are putting in a new tag no one has used before and asks if that is what they meant to do. You can also suggest tags from your taxonomy that are semantically related to the data the user is contributing. This creates a consistency across tags that makes data easier to find as the system scales to larger volumes.
The nice thing about taxonomies, as opposed to folksonomies, is that they can be structured as trees, which means you can compute across them quite easily. With a solid and adaptive taxonomy in place, you can go a long way in intelligently guiding users towards creating better and more consistent tags. At least that is what we think, and it will be fun to see how it works out after the launch.
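A new-tag check of this kind might look like the Python sketch below. It is not how Del.icio.us or GeoCommons implements the feature. The function name `check_tag`, the sample tags, and the similarity cutoff are assumptions. It also uses plain string similarity from the standard library's `difflib`, which catches misspellings but not semantic relatedness.

```python
import difflib


def check_tag(tag, known_tags, max_suggestions=3, cutoff=0.8):
    """Flag a never-before-used tag and suggest close existing tags."""
    normalized = tag.strip().lower()
    if normalized in known_tags:
        return {"tag": normalized, "is_new": False, "suggestions": []}
    # String similarity only; a real system could rank by meaning instead.
    suggestions = difflib.get_close_matches(
        normalized, sorted(known_tags), n=max_suggestions, cutoff=cutoff
    )
    return {"tag": normalized, "is_new": True, "suggestions": suggestions}


# Hypothetical tag vocabulary, for illustration only.
known = {"transportation", "population", "hydrology"}
result = check_tag("transporation", known)
if result["is_new"]:
    print("New tag:", result["tag"], "Did you mean:", result["suggestions"])
```

Returning suggestions instead of silently correcting the tag leaves the final choice with the contributor, so genuinely new tags can still enter the system.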
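To make "compute across them" concrete, here is a small Python sketch of a taxonomy stored as a tree. The topics, the child-to-parent layout, and the function names are hypothetical and are not the GeoCommons taxonomy. The sketch shows two tree computations: finding a tag's ancestors, and rolling dataset counts up to broader topics for browsing.

```python
from collections import Counter

# Hypothetical topical taxonomy: child -> parent (None marks a root).
PARENT = {
    "transportation": None,
    "roads": "transportation",
    "rail": "transportation",
    "environment": None,
    "hydrology": "environment",
}


def ancestors(tag):
    """Walk up the tree and return the tag's ancestors, nearest first."""
    path = []
    node = PARENT.get(tag)
    while node is not None:
        path.append(node)
        node = PARENT.get(node)
    return path


def rollup_counts(dataset_tags):
    """Count datasets per topic, crediting each dataset to ancestor topics too."""
    counts = Counter()
    for tags in dataset_tags:
        covered = set()  # a set, so each dataset counts once per topic
        for tag in tags:
            if tag in PARENT:  # free-form tags outside the taxonomy are skipped
                covered.add(tag)
                covered.update(ancestors(tag))
        counts.update(covered)
    return counts


counts = rollup_counts([["roads"], ["rail", "roads"], ["hydrology"]])
print(counts["transportation"], counts["environment"])
```

With a flat folksonomy there is no parent to roll up to, so a dataset tagged only "roads" would never appear under a broader "transportation" heading.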