With all the discussion around Spatial Data Infrastructures, a National GIS, and public geodata, I thought it would be useful to look at a very recent example of what the status quo is proposing: Cal-Atlas, which was released yesterday. It is great to see California investing in making its geospatial data available to the public, but I believe we should look critically at what the investment achieved and whether this is the type of approach the rest of the nation should follow.

Cal-Atlas breaks its portal into four functions: discover, download, view, and contribute. Discover is a metadata search engine over what looks like a corpus of 1,145 data sets. Of those, it looks like 650 have links to additional information, some of which should lead you to downloadable data. I followed a few and it took me another five or six pages of navigation. I arrived at a spreadsheet of agricultural census data, but with no georeferencing attached to it. Then I had to go back and track down the metadata link to sort out what the attributes mean and, hopefully, find a boundary file to join it to. That required an entirely new search.
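For readers who have not done this kind of join, here is a minimal sketch of what the last step involves once you finally have both files. It assumes the spreadsheet and the boundary file share a key such as a county FIPS code; the column names, the `join_on_key` helper, and the sample values are all made up for illustration and are not taken from Cal-Atlas.

```python
import csv
import io

# Hypothetical attribute table standing in for the agricultural census spreadsheet.
# The counts are placeholder values, not real census figures.
ATTRIBUTES_CSV = """fips,farms
06019,100
06029,200
"""

# Hypothetical boundary records standing in for a county boundary file.
# Real records would carry polygon geometry; it is omitted here.
BOUNDARIES = [
    {"fips": "06019", "name": "Fresno", "geometry": None},
    {"fips": "06107", "name": "Tulare", "geometry": None},
]


def join_on_key(attributes_csv, boundaries, key="fips"):
    """Attach attribute rows to the boundary records that share the same key."""
    rows = {row[key]: row for row in csv.DictReader(io.StringIO(attributes_csv))}
    joined, unmatched = [], []
    for boundary in boundaries:
        row = rows.get(boundary[key])
        if row is None:
            # Boundary with no attribute row: report it instead of dropping it silently.
            unmatched.append(boundary[key])
        else:
            joined.append({**boundary, **row})
    return joined, unmatched


joined, unmatched = join_on_key(ATTRIBUTES_CSV, BOUNDARIES)
```

The join itself is trivial. The hard part is everything before it: finding a boundary file and working out from the metadata which column is the shared key.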

On the upside there is also a download page where I can get data directly. Awesome – except I can’t search across it, and by my quick count there are only seventy data sets. There is metadata, which comes as an XML file that Firefox tries to download and then defaults to having IE open. IE then can’t read it. I’m sure I could put some more work in and get at the metadata, but I have a meeting at 10:30 so, fail.
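For what "some more work" might look like: a short script can pull the useful parts out of a metadata file without a browser. This is only a sketch. It assumes the file follows the FGDC CSDGM layout (`idinfo`, `eainfo`, `attr`, `attrlabl`, `attrdef`), which I have not confirmed for the Cal-Atlas files, and the sample record and the `summarize_metadata` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up FGDC-style record; element names follow the FGDC CSDGM short tags.
SAMPLE_METADATA = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Example county layer</title></citeinfo></citation>
  </idinfo>
  <eainfo>
    <detailed>
      <attr><attrlabl>FIPS</attrlabl><attrdef>County FIPS code</attrdef></attr>
      <attr><attrlabl>FARMS</attrlabl><attrdef>Number of farms</attrdef></attr>
    </detailed>
  </eainfo>
</metadata>"""


def summarize_metadata(xml_text):
    """Return the dataset title and a {label: definition} map of its attributes."""
    root = ET.fromstring(xml_text)
    title = root.findtext("idinfo/citation/citeinfo/title", default="(untitled)")
    attributes = {
        attr.findtext("attrlabl", default=""): attr.findtext("attrdef", default="")
        for attr in root.iter("attr")
    }
    return title, attributes


title, attributes = summarize_metadata(SAMPLE_METADATA)
```

Ten minutes of scripting is not a lot, but it is ten minutes more than most members of the public will spend.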

Then we have “view”, which consists of PDF’d maps. On the upside users can contribute their maps. There is also a “Common Operating Picture” – an interactive map built on the ESRI Flex API – although it does not appear to have any of the California data in it, just a few USGS feeds. Lastly there is “contribute”, where you can share metadata, PDF maps, imagery and vector data. A positive step, but I will leave it to the link to describe the work required to do so.

It is great to see the effort and investment from the state of California in opening up their geodata, but there are several red flags here. Namely the amount of money invested versus the amount of data effectively made available to the public. From my perspective not a good ratio. Also the usability of the system for accessing the data is fairly challenging. By comparison on both points look at what Jason Birch did for the City of Nanaimo.

The second red flag is the vendor lock in:

“To see what we have, you can connect to our services via ArcGIS Desktop or ArcGIS Explorer by opening the following URL with those applications: http://atlas.resources.ca.gov/arcgis/services.”

If you don’t have ESRI your only other option is SOAP. Want to know how to use the SOAP interface? Just use this link…..oooops! This is exactly what concerns me about the various SDI and National GIS proposals floating around. Governments have the right motivation to make data available, but we seem to do a good job as an industry of preventing that from actually happening.

 

18 Responses to Cal-Atlas: The SDI Canary in the Coal Mine

  1. Tom says:

    If you dig deeper, you’ll find the ESRI REST API that we’ve all come to know and love. ;)

    http://atlas.resources.ca.gov/ArcGIS/rest/services

    So you’re not stuck with accessing the maps through SOAP. You can use KML:

    http://atlas.resources.ca.gov/ArcGIS/rest/services/Live/USGS_StreamWatch/MapServer/kml/mapImage.kmz

    or WMS:

    http://atlas.resources.ca.gov/arcgis/services/Live/USGS_StreamWatch/MapServer/WMSServer?request=GetCapabilities&service=WMS

    It’s a shame that the verbiage on the Cal-Atlas home page is so oriented towards ESRI desktop software. I was able to view the maps using the KML link just fine in Google Earth.
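For anyone without ESRI software, a WMS endpoint like the one above can be used from any HTTP client. The sketch below builds a GetMap URL with the standard WMS 1.1.1 parameters. It assumes the server accepts version 1.1.1 and EPSG:4326; the layer name and the `get_map_url` helper are placeholders, since the real layer names have to be read from the GetCapabilities response.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint quoted in the comment above.
WMS_ENDPOINT = (
    "http://atlas.resources.ca.gov/arcgis/services/"
    "Live/USGS_StreamWatch/MapServer/WMSServer"
)


def get_map_url(endpoint, layers, bbox, width=512, height=512):
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": ",".join(layers),
        "styles": "",
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# The layer name "0" is a placeholder; the bbox roughly covers California.
url = get_map_url(WMS_ENDPOINT, ["0"], (-124.5, 32.5, -114.0, 42.0))
```

The resulting URL can be pasted into a browser or fetched with any HTTP library, which is the point: no desktop GIS required.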

  2. Sean:

    Well said on all points. Making spatial data search-engine friendly–with metadata easily accessible–would seem to be a more effective goal than creating SDIs that reflect the hierarchical thought patterns of programmers/engineers rather than the methods normal people use to find information.

    Brian Timoney

  3. Dave Smith says:

    SDI is not and should not be about OGC vs. REST vs. ESRI vs. Google vs. whoever… SDI is a high-level concept, wrapped around open data discovery and access. Nor should it be wrapped around a monolithic, single point of entry and access. It’s more about getting together as a community and agreeing on common, best practices, and working together and collaboratively to densify/augment/enhance the disparate datasets that everyone is already working on, and to make these data far more transparent, discoverable and accessible.

    Before even jumping into technology implementations, core questions still remain to be asked – can we even find the data? Is it even accessible on the web? If so, is there adequate metadata to make informed decisions on whether or not it’s even the appropriate data for what we need? Is the metadata record actually accurate? Is the associated data current, complete, and accurate? Is it integratable into other applications? And so on…

    Specific technologies and implementations come and go. Certainly we still see ArcIMS servers and other things out there, and certainly many success stories such as the Nanaimo Geo REST implementation exist – as such, it’s a fundamental premise that whatever NSDI we continue to pursue will NEED to accommodate a melange of technologies as the various participants and providers make improvements and updates to their stacks.

    And – as Brian points out above – we need to think about ontologies and other aspects – where does the data fit in a science domain? Engineering? Is it really informative to just talk about “roads” without some notion of whether the focus is addressing and direction, roadway maintenance, traffic volumes, or whatever else? Similarly, “streams” – do we need to know about stream flow volumes? Flooding? What’s upstream/downstream? Fish passage? Biodiversity of macroinvertebrates in that reach? There’s a lot more to this than just some simple technology implementation or providing basic FGDC metadata records. But the first step is in building the community and the partnerships and elevating the dialogue.

  4. Sean Gillies says:

    Technology may come and go, Dave, but it’s wrong to say that all are equally good/bad or fleeting/lasting, or that it’s not important to make the right choices. Going against the grain of the Web, which is a hallmark of SDI web sites, leads to failure, and will for years to come.

    That said, it’s good to be reminded of all the challenges that remain *inside* the SDI.

  5. Dave Smith says:

    I don’t think it’s a fair characterization to say that “going against the grain of the Web is a hallmark of SDI web sites”, nor did I suggest all are “equally good/bad”, nor that it’s NOT important to make the right choices.

    Again, the point to be made is that any forward-thinking SDI has to be able to accommodate technologies that already exist and which are in widespread use, as well as to be able to accommodate emergent technologies. Things like OGC WxS and ESRI technologies, for whatever technical shortcomings, have been easily implementable and usable, up and running directly out of the box in just a matter of minutes, and as such they have proliferated, so we can’t just ignore WxS. Obviously, as some of these platforms continue to be updated, they will begin to implement new technologies. GeoServer has added REST support, even ESRI is recognizing the need to support and integrate with other technologies, and to anyone who’s watched the changes taking place there over the past year, it’s not at all unreasonable or unfathomable to speculate that they would support REST in some upcoming iteration in the near future as well.

    Certainly, as RESTful implementations such as the one Jason Birch has developed in Nanaimo become more replicable and consistent, specifically in terms of how geo assets, metadata, capabilities and analysis are handled, and become more widespread and easily implementable, it’s quite reasonable to anticipate and expect that these types of implementations WILL phase out legacy map services over time. We ARE still just in the infancy and “early adopter” phase of RESTful implementations, whereas we are already a decade into the implementation curve for WxS, and certainly well into the waning part of the curve for ArcIMS map services, with these widely implemented and supported throughout the community – as such, it’s NOT reasonable to expect that we can just altogether ignore WxS or vendor-specific implementations and technology, and to refuse to acknowledge or implement them due to their shortcomings, in building an SDI.

    In looking beyond individual implementations, we have to look at the big picture. We have to be pragmatic, we have to consider demands and usage. COTS GIS software is typically not yet geared to connecting to RESTful assets, and the installed base of COTS GIS software clients remains huge, particularly when compared to the number of web and desktop apps that support RESTful geo. As the benefits of RESTful implementations continue to proliferate, we will hopefully see more and more vendors able to support and access RESTful geo assets, at which point, we will gradually get more and more RESTful data and capabilities slipstreamed into the use of all those non-programmer analysts, engineers, scientists and other users of desktop ArcGIS and similar software.

    As such, we quite clearly need to anticipate and support both existing AND new technologies, *in parallel*, as the curve for one continues to wane and the curve for the next continues to rise. I believe I’ve been quite consistent in saying this throughout, for the last month now.

  6. Peter Keane says:

    @DaveSmith-

    Forgive me for saying so, but

    “GeoServer has added REST support, even ESRI is recognizing the need to support and integrate with other technologies, and to anyone who’s watched the changes taking place there over the past year, it’s not at all unreasonable or unfathomable to speculate that they would support REST in some upcoming iteration in the near future as well”

    betrays a misunderstanding of what we mean by REST (or RESTful design). You seem to suggest that REST is a “technology” to be “supported.” It’s not — it is simply a set of design principles upon which the Web is based. Any technology can be deemed to be more or less RESTful. Those that are more RESTful (i.e., follow the five primary constraints of REST) will enjoy the same benefits the web enjoys: scalability, evolvability, interoperability, etc. Less RESTful technologies will necessarily be less scalable, evolvable, interoperable in all cases.

    It is essential that any effort to develop or implement technologies on the Web be informed by the principles and practices of RESTful design. I would suggest that it is *particularly* important for the GIS community since the benefits of RESTful design are so consonant with the ideal GIS infrastructure. Just to be clear: REST is simply an articulation of the principles “at play” in a particular type of networked environment (the Web). These principles have been true since day one. The “emergence” of REST is nothing more than a growing awareness of the principles behind good web-based system design. If the GIS infrastructure and the technologies behind it do not follow RESTful principles, it would be a great mistake to NOT start thinking about implementing better design principles either wholesale or piecemeal (there are many, many ways to move towards more RESTful design).

    –peter keane

  7. Steven Feldman says:

    Wouldn’t more discussion about What could be achieved through an NSDI and Why that might be important be more important than a discussion on How to implement?

  8. Sean Gillies says:

    I think there’s abundant agreement on “what”: rich, coherent, current datasets that get attention and care, and take on lives of their own (insert “Web 2.0” boilerplate here).

    Never mind the implementation (and thanks, Peter, for pointing out that REST isn’t a technology), I’m talking about design. A successful NSDI needs unifying, proven design principles, and I think that means adopting the design principles of the Web. Not as an afterthought, but from the start.

  9. Dave Smith says:

    For Peter Keane, to clarify what I meant above, it’s NOT a matter of “supporting” REST, nor REST as “technology”, what I am referring to is, quite specifically, and confined very narrowly, to a matter of how *geospatial* technology, data and analysis is supported within the framework of RESTful patterns and practices.

    For example, how do we handle and deal consistently with spatial extents, datums, projections, temporal slices, different types of spatial queries and filters, how can we process and analyze this data –in many ways, the geospatial community is only just beginning to look at how to implement these types of things in any kind of consistent fashion. OGC has already been looking at these questions for some time, and does have some pieces worked out – some of which can and are being adapted to RESTful approaches by various folks. Adopting RESTful patterns and practices is only *part* of the design question, it is not the design question in and of itself.

    Meanwhile, we have to recognize and deal with the fact that, across the country, there are tens of thousands of organizations, state, county, local government, NGOs, academia and others, who currently host a gamut of geospatial data and resources, easily many tens of thousands of individual layers and assets, as OGC WxS, ArcIMS, ESRI shapefiles, and so on – we can’t just turn our backs on these assets simply because they aren’t RESTful.

    For Sean, I have never suggested REST as an “afterthought”. From the outset, I have quite consistently been suggesting RESTful patterns and practice in conjunction with legacy WxS and other services – you will also see this echoed in the NSDI2.net proposal and elsewhere – it’s anticipated that we would need to support a hybrid mélange of technologies. From the outset, I have been suggesting that we need to be forward-thinking, to accommodate new and better patterns and practices as they emerge – while at the same time still accommodating legacy approaches and technology. Just as it would be a grave design flaw to NOT have a RESTful vision moving forward, it would likewise be a grave design flaw to NOT support all of the tens of thousands of WxS and other legacy assets already out there.

  10. Peter Keane says:

    @DaveSmith said:

    “For example, how do we handle and deal consistently with spatial extents, datums, projections, temporal slices, different types of spatial queries and filters, how can we process and analyze this data –in many ways, the geospatial community is only just beginning to look at how to implement these types of things in any kind of consistent fashion. OGC has already been looking at these questions for some time, and does have some pieces worked out – some of which can and are being adapted to RESTful approaches by various folks. Adopting RESTful patterns and practices is only *part* of the design question, it is not the design question in and of itself.”

    It sounds very much to me like you are ignoring the idea of a “uniform interface”. Here’s a bit from Roy F.’s dissertation:

    """
    5.1.5 Uniform Interface

    The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components (Figure 5-6). By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved. Implementations are decoupled from the services they provide, which encourages independent evolvability.
    """

    I’ll quote one other bit from Roy, this from a recent thread on rest-discuss:

    """
    “All important resources should be identifiable by URI.”

    I think you should look at each of those words in turn and consider why they were chosen. That particular quote is from

    http://www.w3.org/2001/tag/2002/01-uriMediaType-9
    and
    http://www.w3.org/2002/04/22-tag-summary

    but it was also in the first drafts of the TAG’s webarch. That principle was not new — I remember TimBL mentioning it during his keynote in Geneva, May 1994, and it dates from Engelbart’s work:

    http://www.bootstrap.org/augdocs/augment-132082.htm#11K

    which in turn influenced my design when HTTP/1.0 needed finishing. By definition, working on improving the Web Project meant increasing the number of Web-accessible resources.

    Here is a more recent variation on the same theme that I just ran across while doing a search:

    http://derivadow.com/2007/12/28/web-design-20-its-all-about-the-resource-and-its-url/

    Other things that might be worth keeping in mind is that REST is designed for reuse, not just use. The notion that anyone has control over a successful application’s reuse is pure fantasy, as described in

    ….Roy
    """

    There is a whole lot of wisdom there, it seems to me, and might (I’d hope) give some food for thought to the GIS community.

    –peter

  11. Sean Gorman says:

    Take a weekend off and all sorts of good discussion pops up. Obviously I should do it more often ;-)

    My take is that in order to make itself truly relevant the geospatial niche needs to make itself meaningful and accessible to the mainstream. This starts with IT and the Web. GIS in general has operated outside of the IT mainstream for years. Creating funky proprietary approaches to things and sometimes even rolling those up into standards. The problem is often no one uses the standards, simply because that is not the way the rest of the world works.

    There is a great piece in the NYT on the opportunities to revamp broken systems in times of crisis, because it presents the chance to avoid special interests that normally stand in the way.

    http://www.nytimes.com/2009/02/01/magazine/01Economy-t.html?pagewanted=all

    A lot of folks have talked about the impact of Vivek Kundra’s appointment at OMB. At the DC OCTO he had a running portfolio assessment of all IT projects. Projects that were not providing a good return on investment would either be killed or require a hostile takeover. Meaning the project was a bad idea, so why throw good money after bad – simply cut your losses. Or the idea was good and the implementation team stunk, so you replace the team. Looking at the track record of SDIs over the last thirty years and the amount of money invested into them, you have to wonder how they would stack up in such a portfolio analysis.

  12. Dave Smith says:

    @ Peter Keane – I haven’t at all disagreed on REST or uniform interfaces. In fact, what I have been talking about has been STRESSING the need for uniform interfaces.

    Again, if you go back and actually _read and consider_ what I’m saying above, you will see that I make the point of fundamental questions like, how do we consistently construct a RESTful URI which will fetch all features within a given bounding box, or within a given radius, or within a buffered distance, and so on. Are we talking bounding box in lat/long? Or are we talking SPCS? Are we talking buffer distance in feet, km, miles? You might structure that RESTful URI in one way, another person might structure it in another way. Yes, folks are doing these things in a RESTful fashion, but in many instances, without any consistency among them as of yet other than within the scope of each individual application. How do we build desktop client software and server-to-server software to transparently discover, connect to and use them? You keep answering the GENERIC questions of REST which were not asked, but which have already been answered over and over, but as of yet, nobody has addressed these kinds of SPECIFIC questions. And the reason is that nobody yet has the answers. The answers will come, but again, that is why we need to anticipate and be ready for and promote the maturation of RESTful geo implementations, but in the meanwhile, still be able to function with and accommodate the tens of thousands of services and assets already out there.
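To make the ambiguity concrete, here is a rough sketch of a bounding-box query that states its coordinate reference system and buffer units explicitly instead of leaving them implied. The parameter names (`bbox`, `crs`, `buffer_m`) and the `BBoxQuery` class are hypothetical and follow no standard; the international foot is used for the conversion by assumption.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Conversion factors to meters; "ft" is the international foot by assumption.
METERS_PER_UNIT = {"m": 1.0, "km": 1000.0, "ft": 0.3048, "mi": 1609.344}


@dataclass(frozen=True)
class BBoxQuery:
    """A bounding-box query that states its CRS and buffer units explicitly."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    crs: str  # e.g. "EPSG:4326" for lat/long, or a State Plane code
    buffer: float = 0.0
    buffer_unit: str = "m"

    def to_query_string(self):
        """Serialize to hypothetical parameter names; no standard is implied."""
        buffer_m = self.buffer * METERS_PER_UNIT[self.buffer_unit]
        return urlencode({
            "bbox": f"{self.min_x},{self.min_y},{self.max_x},{self.max_y}",
            "crs": self.crs,
            "buffer_m": round(buffer_m, 3),
        })


query = BBoxQuery(-122.5, 37.7, -122.3, 37.9, crs="EPSG:4326", buffer=1, buffer_unit="mi")
qs = query.to_query_string()
```

Two services could each make equally reasonable but different choices here, which is exactly the consistency problem described above.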

    @Sean Gorman Gotta work 24/7 to keep up with me, there’s no such thing as a weekend off ;^) – IMHO, not enough people are talking about what Vivek Kundra’s appointment means in terms of geo. I’m one of the very few who is: http://surveying-mapping-gis.blogspot.com/2009/02/nsdi-for-democracy.html – and if you look at what he did for the District of Columbia with his Open Data Catalog, essentially what he did is build and promote a (say it with me) SDI for DC, and lo and behold, it’s turned out to be a huge success. Again, SDI is about facilitating informed discovery and access. It’s NOT some shrink-wrapped piece of ESRI software, nor is it about some one group of folks going out and finding data sets and bringing them into their own environment and massaging them so that folks can use some “one-stop shopping” portal to build mashups. It’s about collaboration, participation, contribution, and so on.

  13. Peter Keane says:

    @DaveSmith

    Actually, the beauty of REST is that you *can* talk about it in generic ways. A system’s RESTfulness has very little to do with the specific task it is accomplishing; those are domain concerns and, while obviously important, have very little to do with a system’s RESTfulness (and, it follows, appropriateness for the web). If domain concerns are leaking into your interface beyond the standardization of formats (media types) and link relations, then you are not designing a RESTful system.

    I do not mean to be snarky, but your comment “how do we consistently construct a RESTful URI which will fetch all features within a given bounding box” again betrays a fundamental misunderstanding about what constitutes a RESTful system. REST says nothing whatsoever about URI construction beyond the fact that they need to be opaque. My best guess as to what might constitute a “RESTful URI” would be whether it was exposed via hypertext (and thus discoverable). URI construction based on anything other than links in hypertext (e.g., by some documentation saying here’s how you embed lat/lon in a URL) is inherently unRESTful. Constructing good URIs may be important for other aspects of a system, but it has nothing to do with REST.

    I’ll just add that your question might reasonably be reframed as such: “When performing a GET request on the URI representing a bounding box, what format will the returned representation be in, and what link relations are defined in that representation that will lead the user (human or machine) to representations of all of the features it contains?”
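A small sketch of that reframing, with an in-memory dictionary standing in for the server. The client starts from an entry point and follows link relations found in each representation; it never builds a URI itself. Every URI, field name, and relation name below is hypothetical.

```python
# In-memory stand-in for a server: URI -> JSON-like representation.
# All URIs, field names, and link relation names here are hypothetical.
FAKE_WEB = {
    "/": {"links": [{"rel": "region", "href": "/r/8f3a"}]},
    "/r/8f3a": {
        "bbox": [-122.5, 37.7, -122.3, 37.9],
        "links": [
            {"rel": "feature", "href": "/f/1"},
            {"rel": "feature", "href": "/f/2"},
        ],
    },
    "/f/1": {"name": "Gauge A", "links": []},
    "/f/2": {"name": "Gauge B", "links": []},
}


def get(uri):
    """Stand-in for an HTTP GET; a real client would use an HTTP library."""
    return FAKE_WEB[uri]


def follow(representation, rel):
    """Fetch every resource linked from a representation with the given relation."""
    return [get(link["href"]) for link in representation["links"] if link["rel"] == rel]


# The client knows only the entry point and the relation names, never URI layouts.
entry = get("/")
region = follow(entry, "region")[0]
feature_names = [feature["name"] for feature in follow(region, "feature")]
```

Because the URIs are opaque to the client, the server could rename `/r/8f3a` tomorrow without breaking anything, as long as the relation names and the format stay the same.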

    I fear that REST, by embracing genericism and simplicity, is misunderstood to be “easy” or “obvious”; it is neither. It is a very disciplined approach to building distributed systems. I am not addressing your specific concerns because those have nothing to do with REST. Here are the first few important questions to ask: “Does every important resource have a name (i.e., URI)?”, “What does it mean to GET/PUT/DELETE to a URI?”, “Are those actions appropriately idempotent?”, “Have I designed resources that accept POST requests to create new resources?”, “What media type(s) will a request return?”, “What media type(s) will a POST accept?” These are questions that any RESTful system will address.

    I think the cognitive dissonance here is that conversations are going on about two different layers — the domain layer and the interface layer. The confusion arises because in more RPC-ish systems, the two are entwined and must be discussed together. The beauty of a REST approach is that the interface is abstracted and can be approached in itself — domain concerns mainly arise in decisions around *formats* NOT *actions*.

    REST is neither easy, nor is it a silver bullet. Designing good systems is extremely difficult no matter how you slice it. But what REST offers in terms of transparency, interoperability, and evolvability could be extremely useful to the GIS community. To not take a hard look at what needs to be done to make things more RESTful would be a missed opportunity.

  14. Indeed this is a very good conversation to be having.

    My primary concern, as @seangorman pointed out, is that Geo- too often thinks of itself as the leading factor in developing an architecture or solution rather than as just an aspect of a large architecture.

    These various, and important, questions about services, queries, etc. are broader concerns. Doing a bounding box search is not much different from a time search or person or related query.

    We should be looking to and aligning our geospatial tools with generic interfaces. This was the point behind GeoRSS and OpenSearch-Geo. Taking already broadly implemented and accepted standards such as RSS, Atom, and OpenSearch, which merely needed “geo” added to them, was simpler in terms of design as well as developer acceptance.
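As a small illustration of how little "geo" needs to be added, the sketch below reads a GeoRSS-Simple point out of an ordinary Atom entry using only a generic XML parser. The sample entry and the `entry_location` helper are made up for the example; the namespaces and the "latitude longitude" ordering of `georss:point` follow GeoRSS-Simple.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Made-up Atom entry carrying a GeoRSS-Simple point ("lat lon", space separated).
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:georss="http://www.georss.org/georss">
  <title>Example stream gauge</title>
  <georss:point>38.58 -121.49</georss:point>
</entry>"""


def entry_location(entry_xml):
    """Return (title, latitude, longitude) from an Atom entry with a georss:point."""
    entry = ET.fromstring(entry_xml)
    title = entry.findtext("atom:title", default="", namespaces=NAMESPACES)
    lat_text, lon_text = entry.findtext("georss:point", namespaces=NAMESPACES).split()
    return title, float(lat_text), float(lon_text)


title, lat, lon = entry_location(SAMPLE_ENTRY)
```

Any feed reader that ignores the `georss` namespace still handles the entry normally, which is a large part of why this approach was easy for developers to accept.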

    Where our expertise comes in is in answering very domain specific questions: coordinate order and CRS definitions, industry standards for return formats such as GeoTIFF, shapefiles (fun in themselves in asking about how to represent multiple files as a single resource). But the answers should fit within the non-geo frameworks and interfaces.

  15. Dave Smith says:

    @Peter Keane: I see you are finally starting to understand what I’ve been getting at all along – For me, the conversation has ALWAYS been around the geo-specific questions. E.g., how to use a RESTful interface to ask for features within a bounding box, and to get back results as intended – whether that response is a graphic image, whether it’s feature data, what formats, and so on – and how to do so consistently.

    It’s not enough to just expose these interfaces as URIs and have them be crawled and discovered, as we STILL have to know some specific things about these interfaces, again, what specific capabilities they have and what specifically it’s expecting and providing (Can you give me a .png image? Can your image show major highways as a 3 pixel wide line? Can you give me results as a JSON response? What units of measure are you using? Lat/Long? SPCS? What datum? How DO you represent Lat/Long – DDDMMSS? DDD.DDDD? 360 degrees or +/-180? Can you intersect that dataset with another and give me those results? – and those questions HAVE to be answered, we can’t just shrug them off, and YES, EXACTLY, they are NOT easy or obvious questions to answer, and have NOT yet fully been answered) – It’s never safe to just make assumptions about what the API might or might not do, or what the developers on the other end actually had in mind. And those are the critical questions that we need to wrap our heads around as a community.
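Two of the representation questions above (DDDMMSS versus decimal degrees, and 0 to 360 versus +/-180 longitudes) are easy to handle once you know which convention the other side uses. A minimal sketch, with hypothetical helper names:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def normalize_longitude(lon):
    """Map a longitude in any range (e.g. 0..360) into [-180, 180)."""
    return (lon + 180) % 360 - 180


lon_from_dms = dms_to_decimal(121, 30, 0, "W")
lon_from_360 = normalize_longitude(238.5)
```

Both calls describe the same meridian. The conversions are trivial; what is missing is a reliable way for a client to find out which convention a given service expects.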

    While these questions are awaiting answer, are we to just stop everything and say “Um, we can’t build an SDI because we have to wait for the Geo REST community to catch up and answer these questions”? No, we can’t do that either. The needs are now, the needs have existed all along, the effort’s already been ongoing for over a decade, and it is all the more needed if we are to invest in infrastructure. So again, as I’ve said all along, support a hybrid melange, with support for legacy AND newer technologies, practices and approaches.

  16. Dave Smith says:

    @Andrew Turner: Yes, there are some problems with interfaces, but these can and should be fixable over time. But in the big-picture view, of a constellation of disparate servers, e.g. FEMA, USEPA, District of Columbia OCTO, Fairfax County, et cetera – all the various stewards of data – that OVERALL architecture does not change much at all, whether individual assets are RESTful, or WxS or whatever – the only things that really change in the big picture view are those specific interfaces at each server and how we discover, find and use those interfaces – hence again the need to consider hybrid solutions capable of supporting both legacy and current technology. Meanwhile, the big picture view – that overall piece is what SDI concerns itself with more than anything – a tool for identifying the gaps, overlaps, aligning assets and leveraging investments, and so on.

  17. [...] is the Cooperative Agreement Program. While I’ve voiced my opinions on the current state of SDI’s and using bailout money to fund it on this blog, I do believe the goal of this grant very much [...]

