While Oliver Wagner, co-conceiver of Lycos iQ, views the failure of the social search revolution as a missed opportunity (though he is not referring to pure social search approaches), and Robert Basic even counts social bookmarking services among the social search engines, anyone who has looked more closely at the complexity of a search algorithm knows that a social search based solely on user ratings was doomed from the start.
A social search in which users’ ratings of the search results influence future rankings cannot work. This does not mean the iQ concept cannot work: there it is not the search results themselves that are rated, but rather users’ contributions within a community, which are then displayed in the organic results whenever they are relevant to a search query.
But why can’t a pure social search work? The Long Tail, as described by Chris Anderson, also applies to the population of search queries, as Excite’s search log files show. A few terms are searched frequently (“Britney Spears naked”), many terms are searched rarely (“Computational Lexicography”), some only once a month, once a year, or in some cases for the very first time. Obviously, it is impossible to collect enough rating data for the rarely searched terms to be of any use for ranking. A single evaluation for a query-URL pair is not enough to grant it influence on the ranking, because anyone could rate their own page and thereby improve its position. People also have different opinions, so only a large number of evaluations yields a meaningful picture. Of course, one could argue that if you waited long enough, you would eventually have sufficient data for every query-URL pair, and even if you never got data for the long, long, long tail, at least the frequent searches and a large part of the Long Tail would be covered. (Has anyone noticed that I am not talking about rating URLs, but query-URL pairs? Obviously, a URL can be relevant for one search query but not for another, so an evaluation only makes sense for the combination of query and URL. This makes the pool of usable data even smaller.)
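To make the sparsity argument concrete, here is a minimal sketch in Python. The Zipf exponent, the number of searches, and the rating threshold are all invented for illustration; they are not measured values from Excite’s logs.

```python
# Sketch (hypothetical numbers): how much of a Zipf-distributed query
# population ever collects enough ratings to be usable for ranking?
import numpy as np

rng = np.random.default_rng(42)

# Assume query popularity follows a Zipf law (exponent 1.5 is an
# illustrative choice, not a value measured from any real log).
queries = rng.zipf(1.5, size=1_000_000)  # each draw = one search; value = query rank

ranks, counts = np.unique(queries, return_counts=True)

MIN_RATINGS = 20  # assumed minimum number of ratings before a pair is trustworthy
usable = counts >= MIN_RATINGS

print(f"distinct queries seen:           {len(ranks)}")
print(f"queries with >= {MIN_RATINGS} searches:      {usable.sum()} "
      f"({100 * usable.sum() / len(ranks):.1f}% of distinct queries)")
print(f"share of traffic they cover:     {100 * counts[usable].sum() / counts.sum():.1f}%")
```

And this simulation is still generous: it counts searches rather than actual ratings, and each query’s ratings would in reality be split across many competing URLs, so coverage at the query-URL-pair level would be considerably worse.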
The next problem with this approach is that data collected over a long period are no longer “fresh”. Search queries and pages evolve, and what is relevant today may not be tomorrow. Someone searching for “stock market crash” today probably does not want to see pages about the market collapse of 2000 (although those pages are theoretically relevant too, and certainly are for some users; it takes a huge amount of data to filter out this noise). This does not apply to all query-URL pairs: Apple’s iPod page was relevant three years ago and still is today. Some evaluations therefore have a short half-life, others a long one. But how do you tell them apart? There is surely a way (one I currently consider very complex), but then we face the same problem again: the data cannot be used for the long tail. We would probably have enough data to benefit from the social component for “Britney Spears naked”, but for “Computational Lexicography” the outlook is bleak, and one has to fall back on classical methods. And a closer look at the query population quickly shows that covering only the popular queries will not get you very far.
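One conceivable mechanism is to aggregate ratings with an exponential time decay. The sketch below assumes each query-URL pair has its own half-life; all numbers (half-lives, vote counts, ages) are invented. Note that the hard, unsolved part is exactly what the sketch takes as given: knowing each pair’s half-life in the first place.

```python
# Minimal sketch of time-decayed rating aggregation, assuming a known
# half-life per query-URL pair (all values here are invented).
import math
import time

def decayed_score(ratings, half_life_days):
    """Aggregate (timestamp, vote) pairs, halving a vote's weight
    every half_life_days days. vote is +1 (good) or -1 (bad)."""
    now = time.time()
    decay = math.log(2) / (half_life_days * 86_400)  # per-second decay rate
    return sum(vote * math.exp(-decay * (now - ts)) for ts, vote in ratings)

# "stock market crash" pages age quickly; Apple's iPod page barely ages.
old_votes = [(time.time() - 400 * 86_400, +1)] * 50  # 50 positive votes, ~13 months old

news_like = decayed_score(old_votes, half_life_days=30)    # old votes nearly worthless
evergreen = decayed_score(old_votes, half_life_days=3650)  # old votes still count
print(f"news-like pair:  {news_like:.3f}")
print(f"evergreen pair:  {evergreen:.3f}")
```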
If you look at the source code of the result pages of all popular search engines, you can see that they track clicks on the results. Instead of letting users explicitly rate a result as good or bad, another, even more complex mechanism seems to be at work here. Just because a user has clicked on a result does not mean the page can be considered good. Of course, one could measure how long a user stays on a page before returning to the results page, but the tabs common in today’s browsers let users open several results at once, so they do not return as quickly and thus distort the signal (this, too, can be filtered out again; after all, such a user has clicked several results within a short time). Still, it is doubtful that this click popularity alone is enough for a good ranking algorithm, just as link popularity alone would not be. Google itself admits that PageRank is only one of more than 100 factors, and click popularity is presumably just another one of them.
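How such filtering could look is sketched below, under invented assumptions: a made-up click-log format and an arbitrary 5-second window for deciding that consecutive clicks were tabs being opened rather than pages being read.

```python
# Minimal sketch (hypothetical log format): estimating dwell time from
# a click log, discarding bursts where a user opened several results in
# quick succession (likely tabs), since their "return time" is meaningless.
from dataclasses import dataclass

@dataclass
class Click:
    user: str
    query: str
    url: str
    ts: float  # seconds since epoch

BURST_WINDOW = 5.0  # assumed: clicks less than 5 s apart = multi-tab behaviour

def dwell_times(clicks):
    """Yield (query, url, dwell) triples, where dwell is the time until
    the same user's next click for that query. Bursty clicks are skipped."""
    by_user_query = {}
    for c in sorted(clicks, key=lambda c: c.ts):
        by_user_query.setdefault((c.user, c.query), []).append(c)
    for seq in by_user_query.values():
        for prev, nxt in zip(seq, seq[1:]):
            gap = nxt.ts - prev.ts
            if gap >= BURST_WINDOW:  # ignore tab-opening bursts
                yield prev.query, prev.url, gap

log = [Click("u1", "ipod", "a.example", 0.0),
       Click("u1", "ipod", "b.example", 2.0),    # 2 s later: a tab, not a read
       Click("u1", "ipod", "c.example", 180.0)]  # b.example held attention ~3 min
for query, url, dwell in dwell_times(log):
    print(query, url, f"{dwell:.0f}s")
```

Even this cleaned-up dwell signal only says that a page held attention, not that it answered the query, which is why it can be at best one factor among many.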
Social search can therefore be, if anything, only a small gimmick unless it is supplemented by further methods. But the social idea itself suffers under this restriction: how do you persuade users to rate results when only a fraction of those ratings are usable and thus actually benefit other searchers?