Information Strategy Group, LJMU: Personal Information Management (PIM)


Friday, 9 October 2009

Wave a washout?

This is just a brief posting to flag up a review of Google Wave on the BBC dot.life blog.

Google unveiled Wave at their Google I/O conference in late May 2009. The Wave development team presented a lengthy demonstration of what it can do and – given that it was probably a well-rehearsed presentation and demo – Wave looked pretty impressive. It might be a little bit boring of me, but I was particularly impressed by the context-sensitive spell checker ("Icland is an icland" – amazing!). Those of you who missed that demonstration can check it out in the video below. And try not to get annoyed at the sycophantic applause of their fellow Google developers...

Since then Wave has been hyped up by the technology press and even made mainstream news headlines at the BBC, Channel 4 News, etc. when it went on limited (invitation only) release last week. Dot.life has reviewed Wave and the verdict was not particularly positive. Surprisingly they (Rory Cellan-Jones, Stephen Fry, Bill Thompson and others) found it pretty difficult to use and pretty chaotic. I'm now anxious to try it out myself because I was convinced that it would be pretty amazing. Their review is funny and worth reading in full; but the main issues were noted as follows:

"Well, I'm not entirely sure that our attempt to use Google Wave to review Google Wave has been a stunning success. But I've learned a few lessons.

First of all, if you're using it to work together on a single document, then a strong leader (backed by a decent sub-editor, adds Fildes) has to take charge of the Wave, otherwise chaos ensues. And that's me - so like it or lump it, fellow Wavers.

Second, we saw a lot of bugs that still need fixing, and no very clear guide as to how to do so. For instance, there is an "upload files" option which will be vital for people wanting to work on a presentation or similar large document, but the button is greyed out and doesn't seem to work.

Third, if Wave is really going to revolutionise the way we communicate, it's going to have to be integrated with other tools like e-mail and social networks. I'd like to tell my fellow Wavers that we are nearly done and ready to roll with this review - but they're not online in Wave right now, so they can't hear me.

And finally, if such a determined - and organised - clutch of geeks and hacks struggle to turn their ripples and wavelets into one impressive giant roller, this revolution is going to struggle to capture the imagination of the masses."

My biggest concern about Wave was the important matter of critical mass, and this is something the dot.life review hints at too. A tool like Wave is only ever going to take off if large numbers of people buy into it; if your organisation suddenly dumps all existing communication and collaboration tools in favour of Wave. It's difficult to see that happening any time soon...

Tuesday, 16 June 2009

11 June 2009: the day Common Tags was born and collaborative tagging died?

Mirroring the emergence of other Web 2.0 concepts, 2004-2006 witnessed a great deal of hyperbole about collaborative tagging (or 'folksonomies' as they are sometimes known). It is now 2009 and most of us know what collaborative tagging is so I'll avoid contributing to the pile of definitions already available. The hype subsided after 2006 (how active is Tagsonomy now?), but the implementation of tagging within services of all types didn't; tagging became and is ubiquitous.

The strange thing about collaborative tagging is that when it emerged the purveyors of its hype (Clay Shirky in particular, but there were many others) drowned out the comments made by many in the information, computer and library sciences. The essence of these comments was that collaborative tagging broke so many of the well-established rules of information retrieval that it would never really work in general resource discovery contexts. In fact, collaborative tagging was so flawed on a theoretical level that further exploration of its alleged benefits was considered futile. Indeed, to this day, research has been limited for this reason, and I recall attending a conference in Bangalore in which lengthy discussions ensued about tagging being ineffective and entirely unscalable. For the tagging evangelists though, these comments simply provided proof that these communities were 'stuck in their ways' and harboured an unwillingness to break with theoretical norms. One of the most irritating aspects of the position adopted by the evangelists was that they relied on the power of persuasion and were never able to point to evidence. Moreover, even their powers of persuasion were lacking because most of them were generally 'technology evangelists' with no real understanding of the theories of information retrieval or knowledge organisation; they were simply being carried along by the hype.

The difficulties surrounding collaborative tagging for general resource discovery are multifarious and have been summarised elsewhere; but one of the intractable problems relates to the lack of vocabulary control or collocation and the effect this has on retrieval recall and precision. The Common Tags website summarises the root problem in three sentences (we'll come back to Common Tags in a moment…):

"People use tags to organize, share and discover content on the Web. However, in the absence of a common tagging format, the benefits of tagging have been limited. Individual things like New York City are often represented by multiple tags (like 'nyc', 'new_york_city', and 'newyork'), making it difficult to organize related content; and it isn’t always clear what a particular tag represents—does the tag 'jaguar' represent the animal, the car company, or the operating system?"
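The effect on recall and precision can be made concrete with a minimal sketch. Everything in it is assumed for illustration: the bookmark ids, their tags and the relevance judgements are invented, and the search is a plain exact-match lookup rather than how any particular tagging service works.

```python
# Hypothetical bookmarks: item id -> free-text tags chosen by whoever saved it.
bookmarks = {
    "times-square-guide": {"nyc", "travel"},
    "brooklyn-bridge-history": {"new_york_city", "history"},
    "manhattan-food-blog": {"newyork", "food"},
    "jaguar-habitat-report": {"jaguar", "wildlife"},
    "jaguar-xk-road-test": {"jaguar", "cars"},
}


def search(tag):
    """Return the ids of all items carrying exactly this tag string."""
    return {item for item, tags in bookmarks.items() if tag in tags}


def recall(retrieved, relevant):
    """Share of the relevant items that the search actually found."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def precision(retrieved, relevant):
    """Share of the retrieved items that are actually relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


# Relevance judgements, assumed for illustration.
about_new_york = {"times-square-guide", "brooklyn-bridge-history", "manhattan-food-blog"}
about_jaguar_animal = {"jaguar-habitat-report"}

# Synonymous tags hurt recall: 'nyc' finds only one of three New York items.
nyc_recall = recall(search("nyc"), about_new_york)

# Ambiguous tags hurt precision: 'jaguar' also returns the car review.
jaguar_precision = precision(search("jaguar"), about_jaguar_animal)

print(f"recall for 'nyc': {nyc_recall:.2f}")
print(f"precision for 'jaguar': {jaguar_precision:.2f}")
```

With no vocabulary control, the three New York items are not collocated under a single term, and the two senses of 'jaguar' cannot be told apart.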

These problems have been recognised since the beginning and were anticipated in the theoretical arguments posited by those in our communities of practice. Research has therefore focused on how searching or browsing tags can be made more reliable for users, either by structuring them, mapping them to existing knowledge structures, or using them in conjunction with other retrieval tools (e.g. supplementing tools based on automatic indexing). In short, tags in themselves are of limited use and the trend is now towards taming them using tried and tested methods. For advocates of Web 2.0 and the social ethos it often promotes, this is really a reversal of the tagging philosophy - but it appears to be necessary.

The root difficulty relates to the use of collaborative tagging in Personal Information Management (PIM). Make no bones about it, tagging originally emerged as a PIM tool and it is here that it has been most successful. I, for example, make good use of BibSonomy to organise my bookmarks and publications. BibSonomy might be like delicious on steroids, but one of its key features is the use of tags. In late 2005 I submitted a paper to the WWW2006 Collaborative Tagging Workshop with a colleague. Submitted at the height of tagging hyperbole, it was a theoretical paper exploring some of the difficulties with tagging as a general resource discovery tool. In particular, we aimed to explore the difficulties in expecting a tool optimised for PIM to yield benefits when used for general resource discovery and we noted how 'PIM noise' was being introduced into users' results. How could tags that were created to organise a personal collection be expected to provide a reasonable level of recall, let alone precision? Unfortunately it wasn't accepted; but since it scored well in peer review I like to think that the organising committee were overwhelmed by submissions!! (It is also noteworthy that no other collaborative tagging workshops have been held since.)

Nevertheless, the basic thesis remains valid. It is precisely this tension (i.e. PIM vs. general resource discovery) which has compromised the effectiveness of collaborative tagging for anything other than PIM. Whilst patterns can be observed in collaborative tagging behaviour, we generally find that the problems summarised in the Common Tags quote above are insurmountable – and this is simply because tags are used for PIM first and foremost, and often tell us nothing about the intellectual content of the resource ('toPrint' anyone? 'toRead', 'howto', etc.). True – users of tagging systems can occasionally discover similar items tagged by other users. But how useful is this and how often do you do it? And how often do you search tags? I never do any of these things because the results are generally feeble and I'm not particularly interested in what other people have been tagging. Is anyone? So whilst tags have taken off in PIM, their utility in facilitating wider forms of information retrieval has been quite limited.

Common Tags

Last Friday the Common Tags initiative was officially launched. Common Tags is a collaboration between some established Web companies and university research centres, including DERI at the National University of Ireland and Yahoo!. It is an attempt to address the multifarious problems above and to widen the use of tags. Says the Common Tags website:

"The Common Tag format was developed to address the current shortcomings of tagging and help everyone—including end users, publishers, and developers—get more out of Web content. With Common Tag, content is tagged with unique, well-defined concepts – everything about New York City is tagged with one concept for New York City and everything about jaguar the animal is tagged with one concept for jaguar the animal. Common Tag also provides access to useful metadata that defines each concept and describes how the concepts relate to one another. For example, metadata for the Barack Obama Common Tag indicates that he's the President of the United States and that he’s married to Michelle Obama."

Great! But how is Common Tags achieving this? Answer: RDFa. What else? Common Tags enables each tag to be defined using a concept URI taken from Freebase or DBPedia (much like more formal methods, e.g. SKOS/RDF) thus permitting the unique identification of concepts and ameliorating some of our resource discovery problems (see Common Tags workflow diagram below). A variety of participating social bookmarking websites will also enable users to bookmark using Common Tags (e.g. ZigTag, Faviki, etc.). In short, Common Tags attempts to Semantic Web-ify tags using RDFa/XHTML compliant web pages and in so doing makes tags more useful in general resource discovery contexts. Faviki even describes them as Semantic Tags and employs the logo strapline, 'tags that make sense'. Common Tags won't solve everything, but at least we should see some improvement in recall and increased precision in certain circumstances, as well as the benefits of Semantic Web integration.
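The idea of pairing each tag with a concept URI can be sketched in a few lines. This is only an illustration of the principle, not the Common Tag format itself: the `ConceptTag` class, the item ids and the index are hypothetical, and the URIs are written in the DBPedia style purely as examples.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptTag:
    label: str    # the string the user typed
    concept: str  # URI saying what the tag is taken to mean


# Example concept URIs in the DBPedia style (illustrative only).
NYC = "http://dbpedia.org/resource/New_York_City"
JAGUAR_ANIMAL = "http://dbpedia.org/resource/Jaguar"
JAGUAR_CARS = "http://dbpedia.org/resource/Jaguar_Cars"

# Hypothetical tagged items: (item id, tag).
tagged = [
    ("times-square-guide", ConceptTag("nyc", NYC)),
    ("brooklyn-bridge-history", ConceptTag("new_york_city", NYC)),
    ("manhattan-food-blog", ConceptTag("newyork", NYC)),
    ("jaguar-habitat-report", ConceptTag("jaguar", JAGUAR_ANIMAL)),
    ("jaguar-xk-road-test", ConceptTag("jaguar", JAGUAR_CARS)),
]


def index_by_concept(pairs):
    """Group item ids under the concept URI rather than the label string."""
    index = defaultdict(set)
    for item, tag in pairs:
        index[tag.concept].add(item)
    return index


index = index_by_concept(tagged)

# All three label variants now collocate under one concept...
print(sorted(index[NYC]))
# ...and the two senses of 'jaguar' are kept apart.
print(sorted(index[JAGUAR_ANIMAL]))
print(sorted(index[JAGUAR_CARS]))
```

Retrieval by concept rather than by label is what recovers the collocation and disambiguation that free-text tags lack.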

So, in summary, collaborative tagging hasn't died, but at least now - at long last - it might become useful for something other than PIM. There is irony in the fact that formal description methods have to be used to improve tag utility, but will the evangelists see it? Probably not.

Monday, 24 November 2008

Wikifying search

This post follows a series of other posts pontificating about the efficacy of search engines in information retrieval. Over the weekend Google announced the release of Google SearchWiki. Google SearchWiki essentially allows users to customise searches by re-ranking, deleting, adding, and commenting on their results. This is personalised searching (see video below). As the Official Google Blog notes:

"With just a single click you can move the results you like to the top or add a new site. You can also write notes attached to a particular site and remove results that you don't feel belong."
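As a rough sketch of what such per-user customisation amounts to, the snippet below applies a stored set of edits to a result list. It is not Google's implementation: the `UserEdits` structure, the `personalise` function, the example URLs and the decision to place user-added sites directly after promoted ones are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UserEdits:
    """One user's stored changes for one query (hypothetical structure)."""
    promoted: list = field(default_factory=list)  # results moved to the top, in order
    removed: set = field(default_factory=set)     # results the user deleted
    added: list = field(default_factory=list)     # sites the user added
    notes: dict = field(default_factory=dict)     # url -> user's comment


def personalise(results, edits):
    """Apply a user's edits to the engine's ranked list of result URLs."""
    kept = [url for url in results if url not in edits.removed]
    promoted = [url for url in edits.promoted if url in kept]
    extra = [url for url in edits.added if url not in kept]
    rest = [url for url in kept if url not in promoted]
    ordered = promoted + extra + rest
    # Attach the user's note to each result (empty string if none).
    return [(url, edits.notes.get(url, "")) for url in ordered]


results = ["a.example", "b.example", "c.example", "d.example"]
edits = UserEdits(
    promoted=["c.example"],
    removed={"b.example"},
    added=["e.example"],
    notes={"c.example": "best overview"},
)

personal = personalise(results, edits)
for url, note in personal:
    print(url, note)
```

The engine's own ranking is left untouched; the edits are replayed on top of it for that one user, which is why an account is needed to store them.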

The advantages of SearchWiki are a little unclear at first; however, things become clearer when we learn that such changes can only be effected if you have an iGoogle account. Google have – quite understandably – been very specific about this aspect of SearchWiki. Search is their bread and butter; messing with the formula would be like dancing with the devil!

Google SearchWiki doesn't do anything further to address our Anomalous State of Knowledge (ASK), nor can I see myself using it, but it is an indication that Google is interested in better exploring the potential of social data to improve relevance feedback. Google will, of course, harvest vast amounts of data pertaining to users' information seeking behaviour which can then be channelled into improving their bread and butter. (And from my perspective, I would be interested to know how they analyse such data and effect changes in their PageRank algorithm.) Their move also resonates with an increasing trend to support users in their Personal Information Management (PIM); to assist users in re-finding information they have previously located, or those frequently conducting the same searches over and over. It particularly reminds me of research undertaken by Bruce et al. (2004). For example, users increasingly choose not to bookmark a useful or interesting web page, but simply find it again – because they know they can. If you continually encounter information that is irrelevant to your area, re-rank it accordingly - so the SearchWiki ethos goes...

Perusing recent blogs, it is clear that some consider this development to have business motivations. Technology guru and Wired magazine co-founder, John Battelle, thinks SearchWiki is an attempt to attract more users to iGoogle (whose user base is small at the moment), whilst simultaneously rendering iGoogle the centre of users’ personal web universe. To my mind Google is always about business. PageRank is a great free-text searching tool, thus permitting huge market penetration. SearchWiki is simply another business tool which happens to offer some (vaguely?) useful functionality.

Wednesday, 12 December 2007

Microsoft Listas: a chicken or egg conundrum?

Microsoft Live Labs has recently launched a new 'tech preview' called Listas, a personal information management web tool. Essentially this is a social bookmarking and collaborative tagging application (similar to the likes of del.icio.us, RawSugar, etc.), allowing users to share web content they have encountered with other users. As the name suggests, Listas is about creating lists. Lists? Sounds a bit boring, eh? Perhaps it is in a way; however, Listas allows you to create extremely rich lists, comprising text, images, RSS feeds, multimedia and so forth, making Listas more like a web clippings service - which is probably the smartest aspect of this tool.

Part of the supposed attraction of Listas is the ability of users to collaborate and share their lists with other users. Users can also subscribe to persons that create particularly interesting lists (top lists and tags are available on the homepage). There is also a community section where you can find the most popular items from the public lists and peruse the tag clouds.

Microsoft have created a toolbar to assist users in compiling lists quickly, but alas it is only available for IE. The absence of a Firefox plug-in is a spectacular oversight from Microsoft; kind of like cutting off your nose to spite your face. These systems are so reliant on active user communities that a Catch-22 scenario inevitably ensues: collaborative Web 2.0 tools require an active community to attract more users. The more users, the more powerful the tool becomes.

Although Listas may fulfil the needs of some personal information management junkies (the term 'information management' is used in the loosest possible sense here!) and has some neat web clipping features, I can't imagine this service getting the critical mass it requires to be useful from a community perspective. More to the point, I can't imagine that users would be particularly interested in someone else's shopping list, meeting minutes, or random web clippings. Have Microsoft completely missed the boat here? Or maybe it's me – can I not see the community value in this? Of course, the beauty of Microsoft Live Labs is that it doesn't really matter; let's just try it and see if it works – a nice ethic to have, if you can afford it.
