The current discussion around declining search quality on Google goes to the bread-and-butter issue in organic search: how good are the search results on the first page? In this context, the discussion is dominated by search spam, content farms, and the gaming of the Google algorithm. That makes sense!
In my opinion, there are a lot of unaddressed “big problems” in search beyond fixing spam. Here are just a few of them.
The content explosion: There is a growing diversity of content types, explosive growth of online content, and ever more multilingual content, all of which add to the complexity that current and next-generation search engines need to handle. No single search engine today is able to cover the complete set of information on the Web, and this will remain a big challenge for search engines into the future.
Hidden content sources: Part of the content explosion is the continuing proliferation of specialized content sources and databases whose content we can’t readily discover through mainstream search engines. This phenomenon, called the Invisible Web or the Deep Web, was first written about in the late ’90s (my previous startup, Intelliseek, delivered the first search engine for the Invisible Web in 1999), and it remains a big open issue. Attention on it has lessened only because of the sheer noise around other memes, like social search and real-time search, in the past few years.
Understanding user intent: Then there are age-old, unaddressed issues around understanding user intent. Much of the quality of search results comes down to not knowing what the heck the searcher really needs. We are still feeding keywords into a single search box and expecting the search engine to magically give us what we need. Not finding our answers, more of us are typing longer queries, hoping those will surface the answers we need. In other words, we as users are compensating for something that search engines fundamentally do not understand today: our search intent.
Understanding the content: More than 16 years after the first Web search engine, we are still processing textual information with little understanding of the semantics involved. Search engines do not understand the meaning of the content that they index, and this is another factor limiting the quality of the results they deliver. For a long time, there has been buzz about the Semantic Web, which is supposed to usher in richer search and information experiences, starting from more meaningful data and sophisticated software that can make inferences from that data in ways that are not possible today. Hailed as “Web 3.0”, it is seen as the next phase in the evolution of the Web, and it is a realm of new problems and opportunities for search engines.
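To make the limitation concrete, here is a toy sketch (the documents, terms, and function are all hypothetical, and real engines are vastly more sophisticated) of the purely lexical matching at the heart of classic keyword search: an inverted index matches character strings, not meanings, so a document about “pre-owned automobiles” is invisible to a query for “used car”.

```python
from collections import defaultdict

# Two tiny documents about the same topic, phrased differently.
docs = {
    1: "buy a used car at a good price",
    2: "affordable pre-owned automobiles for sale",
}

# Toy inverted index: each term maps to the set of documents containing it.
# Matching is purely lexical -- the index knows strings, not meanings.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Return the ids of documents containing every query term."""
    result = None
    for term in query.lower().split():
        hits = index.get(term, set())
        result = hits if result is None else result & hits
    return result or set()

# Doc 2 is about used cars, but "used car" never matches "pre-owned automobiles".
print(search("used car"))  # matches only doc 1
```

The gap between what this sketch does and what a searcher actually means (synonymy, paraphrase, inference) is precisely the territory the Semantic Web effort is trying to address.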
Handling user input: For the most part, search interfaces have continued to use the age-old search box for typing keywords. While promising work has been done on accepting natural-language questions as input, nothing commercially viable has emerged that works at Web scale. Without solving this problem first, there is no hope of being able to speak to a search engine and have it bring back what you are looking for.
Presenting search results: The ten-results-per-page, read-only SERP interface that first appeared in the mid-’90s is essentially what we are stuck with even today (granted, there have been recent touches like page previews / summaries, and videos / images shown alongside links to pages / sites). A retrospective look at this 2007 interview with usability expert Jakob Nielsen, which considers possible changes in search result interfaces by 2010, is very revealing about the relatively slow pace of change in the SERP interface. Others have attempted purely visual searches, and still others have tried to categorize / cluster search results. Still, what the mainstream search engines offer as an interface for consuming search results is not noticeably innovative.
Personalizing: For the most part, search results are one-size-fits-all. Everyone gets the same results regardless of their interests and their connections. Some attempts have been made to personalize search results, based both on models of individual interests and on the likes / recommendations of a person’s social group, but that is a really challenging problem to solve well. At Zakta, our Zakta.com service made the SERP read-write and personalizable. Other services have tried to bypass the search engine itself with Q&A services that flow through a user’s social network.
Leveraging social connections and recommendations: First generation attempts have been made to have search results be influenced by the recommendations of others in a person’s social circle. Some speculate that Facebook might be sitting on so much recommendation data that they might have a potent alternative to Google in the search arena. Regardless, this remains an unsolved search problem today.
Facilitating collaboration in search: Web searching has been a lonely activity since its inception. Combined with the limiting read-only SERP interface, searchers have never really been able to leverage the work, findings, or knowledge of others (including those they deeply trust) in the search process. In the post-Web 2.0 world we are in, this remains a noticeable gap in search. One area of opportunity is for search engines to let people search together to find what they need.
Specialized searches in verticals and niches: For a while in the early and mid-2000s, the buzz was all about vertical search engines, and then somehow that meme just faded away. But the core reasons vertical / specialized search engines are attractive remain. Shopping, travel, and plenty of other verticals could benefit from continued development of specialized search solutions that go beyond the mainstream search engine experience.
These are but a few examples of the many open “big problems” with search. Seeing this, we cannot help but acknowledge that we are still in our infancy in meeting the search needs of an increasingly online, connected, and mobile populace.
At Zakta, my startup, we are working on solutions to some aspects of these big search problems. We are combining semantics, curation, and collaboration technologies with traditional Web searching to deliver a new search engine called SearchTeam. We hope SearchTeam, suited both to collaborative search and to personal or collaborative curation of Web content, will become a very useful part of people’s search toolset. At this time, SearchTeam is in private beta.
What do you think are open problems, big or small, with search engines?