Friday 14 March 2008

Shout 'Yahoo!' : more use of metadata and the Semantic Web

Within the lucrative world of information retrieval on the web, Yahoo! is considered an 'old media company': a company that has gone in a different direction to, say, Google. Yahoo! has been a bit patchy when it comes to openness. It is keen on locking data and widgets down; Google is keen on unlocking data and widgets. And to Yahoo!'s detriment, Google has discovered that there is more money to be made their way, and that users and developers alike are, to a certain extent, very happy with the Google ethos. Google Code is an excellent example of this fine ethos, with the Google Book Search API being a recently announced addition to the Code arsenal.

Since there must be some within Yahoo! ranks attributing their current fragility to a lack of openness, Yahoo! have recently announced their Yahoo! Search 'open platform'. They might be a little slow in fully committing to openness, but cracking open Yahoo! Search is a big and interesting step. For me, it's particularly interesting...

Yesterday Amit Kumar (Product Management, Yahoo! Search) released further details of the new Yahoo! Search platform. This included (among many other things) a commitment to harnessing the potential of metadata and Semantic Web content. More specifically, this means greater support of Dublin Core, Friend-of-a-Friend (FOAF) and other applications of RDF (Resource Description Framework), Creative Commons, and a plethora of microformats.

Greater use of these initiatives by Yahoo! is great news for the information and computing professions, not least because it may stimulate wider deployment of these standards, thus making possible a mainstream 'killer app' that fully harnesses the potential of structured data. For example, if your purpose is to be discovered by Google, there is currently no real demand for Dublin Core (DC) metadata to be embedded within the XHTML of a web page, or for you to link to an XML or RDF encoded DC file. Google just doesn't use it. It may use dc.title, but that's about it.

That is not to say that it's useless, of course. Specialist search tools use it, Content Management Systems (CMS) use it, many national governments use it as the basis for metadata interoperability and resource discovery (e.g. eGMS), it forms the basis of many Information Architecture (IA) strategies, and so on. But this Google conundrum has been a big problem for the metadata, indexing and Semantic Web communities (see, for example). Their tools offer so much potential, but this potential is generally confined to particular communities of practice. Your average information junkie has never used a Semantic Web tool in his/her life.

But if a large-scale retrieval device (i.e. Yahoo!) showed some real commitment to harnessing structured data, then it could usher in a new age of large-scale information retrieval; one based on an intelligent blend of automatic indexing, metadata, and Semantic Web tools (e.g. OWL, SKOS, FOAF, etc.). In short, there would be a huge demand for the 'data web' outside the distinct communities of practice over which librarians, information managers and some computer scientists currently preside. And by implication this would entail greater demand for their skills. All we need to do now is get more people to create metadata and ontologies!
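For anyone wondering what "DC metadata embedded within the XHTML of a web page" looks like in practice, here is a minimal sketch in Python. It assumes the common convention of `<meta name="DC.*" content="...">` elements in the page head, plus a `<link rel="schema.DC">` declaration. The sample page, the class name `DublinCoreExtractor` and the function `extract_dc` are all made up for illustration; this is not how Yahoo! or any other search engine actually reads the data.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs from a page (illustrative only)."""

    def __init__(self):
        super().__init__()
        # Maps a lower-cased element name (e.g. "dc.title") to a list of values,
        # because DC elements such as subject may legitimately repeat.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("dc.") and content is not None:
            self.fields.setdefault(name, []).append(content)


def extract_dc(markup):
    """Return a dict of the Dublin Core meta elements found in the markup."""
    parser = DublinCoreExtractor()
    parser.feed(markup)
    parser.close()
    return parser.fields


# Hypothetical XHTML page carrying embedded Dublin Core metadata.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Example page</title>
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <meta name="DC.title" content="Example page" />
  <meta name="DC.creator" content="A. Author" />
  <meta name="DC.subject" content="metadata" />
  <meta name="DC.subject" content="Semantic Web" />
</head>
<body><p>Hello</p></body>
</html>"""

if __name__ == "__main__":
    for name, values in extract_dc(SAMPLE_PAGE).items():
        print(name, values)
```

The point is simply that the structured data is already sitting in the page for any indexer that chooses to look; a real harvester would also need to handle qualified elements, encoding schemes and linked RDF files, none of which is attempted here.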

Given the fragile state of Yahoo!, initiatives like this (if they come to fruition!) should be applauded. Shout 'Yahoo!' for Yahoo! I'm not sure if it will prevent a Microsoft takeover though...

1 comment:

  1. Just following on from my comments about Google openness, the said company have now released an API for Google Book Search. The digital library at Library Wageningen UR has taken a stab at implementing it. See: http://wowter.net/2008/03/18/implementing-the-google-book-search-api-at-library-wageningen-ur/
