14 July 2008

"Producing great search results" comments

I've just been reading an interesting blog article by Jared Spool about information search. This, along with information scent and structure, was my PhD thesis topic. Having had the chance to dedicate myself to this topic for several years, I would like to weigh in with a few comments.

First of all, a minor point: Jared said, "In fact, we’re confident that it really takes a lot of hard work and skill to make something that will create a delightful experience for your users." This is very true indeed. In one of my thesis experiments, I found that the most effective way of presenting search results was also the least satisfying: people hated it because it required the most effort to make a judgement, yet their relevance judgements were the most accurate.

And what was this method of summarising a document?

We found that it was just the title of the page (assuming that the title was in some way related to the content and not the result of a technical glitch like a 404'd page). We tried a number of different methods of summarising the text, and these were all inferior to the title, even when combined with it. For some reason, the automatic summaries interfered with the judgements and made them less accurate. The summarisation methods we tested were: the initial text (as used by Alta Vista, which was popular when my thesis began), keyword-embedded text, i.e. snippets surrounding the query terms (as used by Google, which was becoming big), and keywords alone. By that time, meta information was already being abused and too few documents had usable meta descriptions for it to be worth testing.
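
To make the distinction between those automatic methods concrete, here is a minimal sketch in Python (my own illustration, not the code used in the study) of "initial text" versus a keyword-embedded snippet:

    def initial_text(doc, length=160):
        # Alta Vista-style summary: just the first part of the page text.
        return doc[:length]

    def keyword_snippet(doc, query_terms, window=60):
        # Google-style summary: a fragment of text surrounding the first
        # occurrence of a query term, so the keyword is shown in context.
        lower = doc.lower()
        for term in query_terms:
            pos = lower.find(term.lower())
            if pos != -1:
                start = max(0, pos - window)
                end = min(len(doc), pos + len(term) + window)
                return "..." + doc[start:end] + "..."
        return initial_text(doc)  # fall back if no query term appears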

There are other methods, such as thumbnails of the page or lists of its graphics, but I'm not aware of any research showing that they improve upon a plain text summary.

Though we didn't test this, I think that a human-written summary would give humans the best basis for judging relevance. Remember that this isn't meant to feed a machine's judgement, but a human's. Sadly, with the amount of competition for Google rankings and the prevalence of dynamic pages, there is little chance of such a system being developed now without it immediately being abused. A proper human-readable text summary of a document created by a machine is still a long way off.

The main problem arises when users try to judge a document's relevance from something considerably less than the complete document. To put it in other terms, the problem is whether users can judge the information scent properly. Conventionally, judgements are made from a short text summary or abstract of the document that contains enough information for accurate discrimination.

Despite all this, people's searches returned relevant documents most of the time regardless of the type of information need, and the "discrimination index" was even better. This was a metric we developed to work out the overall accuracy of a judgement. To calculate it, take the number of "hits" (relevant documents that were judged as relevant) and divide by the total number of relevant documents. This gives the proportion of hits. Then take the number of false positives (documents judged as relevant when they are not) and divide by the total number of non-relevant documents. This is the proportion of false positives. Finally, subtract the proportion of false positives from the proportion of hits. This gives a real number between -1.0 and +1.0, with -1.0 being completely wrong, +1.0 being completely right and 0 being chance, and it is a nicely stabilised figure that can be used across different studies.
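
For anyone who wants to reuse it, here is a minimal sketch of that calculation in Python (the function name and the example numbers are my own; only the formula comes from the thesis):

    def discrimination_index(hits, total_relevant, false_positives, total_non_relevant):
        # Proportion of relevant documents that were judged relevant.
        hit_rate = hits / total_relevant
        # Proportion of non-relevant documents that were wrongly judged relevant.
        false_positive_rate = false_positives / total_non_relevant
        # -1.0 is completely wrong, +1.0 is completely right, 0 is chance.
        return hit_rate - false_positive_rate

    # e.g. 8 of 10 relevant documents found, 3 of 20 non-relevant accepted:
    print(discrimination_index(8, 10, 3, 20))  # 0.8 - 0.15 = 0.65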

You might also be interested in the idea of negative information scent. Pirolli and Card said that there was no such thing: there was only a presence (which would cause movement towards) or absence (which would cause random movements). We found that there was such a thing as negative scent. Something that was perceived as clearly non-relevant would discourage any pursuit.

"The problem with Search is that we force the user to specify their goal in terms of the phrase they think will most likely produce a reasonable result." This is true. See part 1 of my "interface as translators" article which explains how interfaces are two way translators between a user and his or her needs, and a language that the machine needs to use.

From personal experience, a lot of search interfaces are badly broken. I'm sure we can all think of experiences with flight booking systems that seem to have been designed without careful consideration, never mind adequate testing.

For example, I search for flights on particular days (journey out and return) only to be told that there are no flights available. Okay, fine. But why? Is it because all the flights are fully booked (in which case I will either move my dates to quieter weekdays or fly several weeks later), or because the airline simply doesn't fly that day (in which case I will try the days before)? And if they don't fly that day, what days do they fly?

However, the screen remains mute for me. I have to go back (sometimes having to input all my information again which with dates and dodgy Javascript calendars is no fun) and try another date. I tried on a Thursday - would Friday be better? Or Wednesday? Is there only one suitable flight a week? How many are there?

And if I do find a date with seats, is there a cheaper day to travel? Perhaps if I tried the day before... But of course, I don't know the flight frequency. Then multiply this by a number of airlines, and it becomes a source of frustration: it feels as though the airlines are making it difficult for me to give them money. That is a very bad way to do business: never make it hard for customers to give you money.

All this is information that a searcher wants. The best flight-booking system I have come across showed the dates for both legs of the journey for the entire month, along with prices. Admittedly it was only a local flight service, but I could instantly see price and availability for a range of dates and adjust my journey accordingly, with only a couple of clicks on each date and no waiting for new pages to download. A rough sketch of that kind of presentation follows below.
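
To give an idea of what that presentation involves, here is a minimal sketch in Python (entirely my own; fetch_fare is a hypothetical lookup that returns a price, or None when there is no flight or no seats on a given day) of a month-at-a-glance price calendar rather than a single yes/no answer:

    from datetime import timedelta

    def price_calendar(origin, destination, start, days, fetch_fare):
        # One row per date: the fare if a bookable flight exists, otherwise None,
        # so the searcher can compare a whole month instead of probing day by day.
        rows = []
        for offset in range(days):
            day = start + timedelta(days=offset)
            rows.append((day, fetch_fare(origin, destination, day)))
        return rows

    def show(rows):
        for day, fare in rows:
            print(day.isoformat(), "no flight or sold out" if fare is None else fare)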

It was so good that I have flown several times with them and look forward with joy to using their website again because it is so much better.

Still, it's an interesting and important article. As I have shown, my experience with more complex searches has not been good. I might spend some time developing a prototype flight booking system that works (TM)!

Part 2 has just come out and I will comment on that tomorrow.
