thammond – 2009 June 05
[Update - 2009.06.07: As pointed out by Todd Carpenter of NISO (see comments below) the phrase “SRU by contrast is an initiative to update Z39.50 for the Web” is inaccurate. I should have said “By contrast SRU is an initiative recognized by ZING (Z39.50 International Next Generation) to bring Z39.50 functionality into the mainstream Web”.]
[Update - 2009.06.08: Bizarrely I find in mentioning query languages below that I omitted to mention SQL. I don’t know what that means. Probably just that there’s no Web-based API. And that again it’s tied to a particular technology - RDBMS.]
There are two well-known public search APIs for generic Web-based search: OpenSearch and SRU. (Note that the key term here is “generic”, so neither Solr/Lucene nor XQuery really qualifies for that slot. Also, I am concentrating here on “classic” query languages rather than on semantic query languages such as SPARQL.)
OpenSearch was created by Amazon’s A9.com and is a cheap and cheerful means to interface to a search service by declaring a template URL and returning a structured XML format. It therefore allows for structured result sets while placing no constraints on the query string. As outlined in my earlier post Search Web Service, there is support for search operation control parameters (pagination, encoding, etc.), but no inroads are made into the query string itself which is regarded as opaque.
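To make the template URL idea concrete, here is a minimal sketch of how a client might fill one in. The template itself, the `example.org` host and the helper name `fill_template` are assumptions for illustration; only the `{searchTerms}`, `{startIndex}` and `{count}` parameter names and the trailing `?` for optional parameters come from the OpenSearch convention.

```python
import re
from urllib.parse import quote

# Hypothetical template of the kind an OpenSearch description document declares.
# A trailing "?" inside the braces marks a parameter as optional.
TEMPLATE = "http://example.org/search?q={searchTerms}&start={startIndex?}&count={count?}"

_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def fill_template(template, values):
    """Substitute percent-encoded values into an OpenSearch-style URL template."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            # The query string is opaque: it is encoded, never parsed.
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required template parameter: {name}")

    return _PARAM.sub(substitute, template)


print(fill_template(TEMPLATE, {"searchTerms": "dna repair", "count": 10}))
```

Whatever the user types simply lands in `{searchTerms}`; the client has no idea whether the service will read it as keywords, a phrase or a structured query.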
SRU by contrast is an initiative to update Z39.50 for the Web and is firmly focussed on structured queries and responses. Specifically a query can be expressed in the high-level query language CQL which is independent of any underlying implementation. Result records are returned using any declared W3C XML Schema format and are transported within a defined XML wrapper format for SRU. (Note that the SRU 2.0 draft provides support for arbitrary result formats based on media type.)
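As a rough illustration of the structured side, an SRU `searchRetrieve` request carrying a CQL query might be assembled as below. The base URL, the function name `sru_search_url`, the choice of version 1.2 and the sample CQL string are assumptions; the parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`, `recordSchema`) are the standard SRU request parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base_url, cql_query, start_record=1, maximum_records=10,
                   record_schema=None, version="1.2"):
    """Build an SRU searchRetrieve URL for a CQL query (illustrative sketch)."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,              # CQL, independent of the back end
        "startRecord": start_record,     # SRU record positions start at 1
        "maximumRecords": maximum_records,
    }
    if record_schema is not None:
        # Names the XML schema in which result records should be returned.
        params["recordSchema"] = record_schema
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and query, for illustration only.
print(sru_search_url("http://example.org/sru",
                     'dc.title = "dna repair" and dc.creator = hammond',
                     maximum_records=5))
```

Unlike the OpenSearch case, the `query` value here has a defined grammar, so a server can parse it into index, relation and term without knowing anything about the client.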
One can summarize the respective OpenSearch and SRU functionalities as in this table:
What I wanted to discuss here was the OpenSearch and SRU interfaces to a Search Web Service such as outlined in my previous post. The diagram at the top of this post shows query forms for OpenSearch and SRU and associated result types. The Search Web Service is taken to be exposing an SRU interface. It might be simplest to walk through each of the cases.
thammond – 2009 May 26
thammond – 2009 May 08
The new OAI-PMH interface to Nature.com sports one particular novelty which may well be of interest here: it makes use of the PRISM Aggregator Message. (For an announcement of this service see the post on our web publishing blog Nascent.)
OAI-PMH is a protocol for harvesting metadata records from a digital repository, and those records may be expressed in a variety of different metadata formats. For reasons of interoperability a base metadata format (‘Dublin Core’) is mandated for all OAI-PMH implementations. The expectation is that this base format would be augmented by community-specific vocabularies.
Our natural inclination was to mirror the article descriptions which we already circulate in our RSS feeds and within our HTML pages (as META tags) and PDF files (as XMP packets). In these cases we have used open data models (e.g. RDF) with simple properties cherry-picked from the DC and PRISM namespaces. But OAI-PMH has a special ‘gotcha’ in this regard: any metadata format must allow for W3C XML Schema validation. That is, the properties need to be constrained by an XSD data model. Enter PRISM Aggregator Message (PAM).
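For readers unfamiliar with how a harvester chooses between metadata formats, the sketch below builds an OAI-PMH `ListRecords` request. The `verb`, `metadataPrefix`, `from`, `until` and `set` arguments and the mandatory `oai_dc` prefix are part of OAI-PMH itself; the base URL, the function name and the `pam` prefix used in the second call are assumptions, so check the repository's `ListMetadataFormats` response for the prefix it actually advertises.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (illustrative sketch)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting arguments.
    if from_date is not None:
        params["from"] = from_date
    if until_date is not None:
        params["until"] = until_date
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint.
BASE = "http://example.org/oai"

# Base Dublin Core format, which every repository must support.
print(list_records_url(BASE))

# Same harvest in a community-specific format; "pam" is an assumed prefix.
print(list_records_url(BASE, metadata_prefix="pam", from_date="2009-05-01"))
```

Each prefix a repository advertises is tied to an XML Schema, which is why a loosely modelled RDF description cannot be offered as an OAI-PMH format without an XSD behind it.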
Chuck Koscher – 2009 May 06
Geoffrey Bilder – 2009 May 01
Geoffrey Bilder – 2009 April 27
Geoffrey Bilder – 2009 March 23
Geoffrey Bilder – 2009 March 20