The first website was launched at CERN by Tim Berners-Lee on August 6, 1991. By 1994, the year the World Wide Web Consortium (W3C) was founded and a year after CERN placed the web’s underlying software in the public domain, there were over 3,000 public websites. It’s now estimated that there are more than 1.13 billion websites, of which around 200 million are actively maintained and visited by internet users.
Given the plethora of web pages, it’s no surprise that web search engines have emerged to become the most important and widely used service on the Internet. In this series of posts, we’ll explore the rise of web search engines and how everything is now changing with the emergence of ChatGPT.
Early Years of Web Search Innovation
The first web search engine was JumpStation which was created by Jonathon Fletcher in December 1993. It had the basic components of modern web search engines in that it used a web robot to find and crawl web pages to build a searchable index. It also provided a web form for keywords which then returned a listing of links to web pages.
JumpStation had two major limitations at launch: it didn’t index the full content of a web page, and it didn’t rank its search results. The real breakthrough came when people started developing better ways of rating the relevance and quality of website content to generate a ranked listing of websites in search results.
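The crawl, index, and keyword lookup pipeline described above can be sketched in a few lines of Python. This is only an illustration, not JumpStation’s actual code: the in-memory `TOY_WEB`, the example URLs, and the choice to index page titles only are all assumptions made for the sketch. Note that the results come back in alphabetical order, with no notion of relevance.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page title, outgoing links).
TOY_WEB = {
    "a.example": ("Paris travel guide", ["b.example", "c.example"]),
    "b.example": ("Paris hotels", ["a.example"]),
    "c.example": ("Blue sky physics", []),
}

def crawl(start):
    """Breadth-first crawl: follow links outward from a start page."""
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        title, links = TOY_WEB[url]
        yield url, title
        for link in links:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)

def build_index(pages):
    """Inverted index: each title word maps to the set of URLs containing it."""
    index = {}
    for url, title in pages:
        for word in title.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Return URLs whose titles contain every query word, unranked."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)  # alphabetical order, not a relevance ranking

index = build_index(crawl("a.example"))
print(search(index, "paris"))         # every page with "paris" in its title
print(search(index, "paris hotels"))  # pages matching both words
```

Because only titles are indexed, a page that discusses Paris at length under an unrelated title would never be found, which mirrors the first limitation above.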
The first significant breakthrough in search engine technology came from Robin Li, who developed RankDex in 1996 and later patented it. RankDex was the first search engine to analyze the hyperlinks between web pages to determine the relevance and quality of websites for a search query and return a ranked listing of web pages. About two years after the RankDex patent was granted, Larry Page was issued a patent for PageRank, which cites RankDex.
Both RankDex and PageRank were important innovations because they addressed two important challenges. The first was returning a ranked listing of highly relevant, high-quality web pages, and the second was countering search engine spam from low-quality websites.
Many search engines in the early days relied on traditional information retrieval methods, which analyze the contents of a document to determine relevance. The challenge with these methods is that while a web page might contain the search keywords or related terms, it’s hard to tell whether the page has high-quality content. Even worse, content-only methods are susceptible to crude search engine spamming by pages that repeat the matching keywords and related terms but contain low-quality spam content.
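To make the spam problem concrete, here is a minimal sketch of content-only ranking that scores a page by counting how often the query terms appear in it. The scoring function and the two sample documents are invented for illustration; real information retrieval systems of the time used more refined weighting, but the weakness is similar.

```python
def keyword_score(text, query):
    """Score a document by how many times the query terms occur in it."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

# Hypothetical documents: one useful page and one keyword-stuffed page.
docs = {
    "guide": "a practical guide to planning a trip to paris",
    "spam": "paris paris paris trip trip cheap paris trip",
}

query = "paris trip"
ranked = sorted(docs, key=lambda name: keyword_score(docs[name], query), reverse=True)
print(ranked)  # the keyword-stuffed page outranks the useful one
```

The stuffed page scores 7 against the guide’s 2, so it ranks first even though it says nothing useful. Nothing in the page’s own text lets this scorer tell the difference.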
With RankDex and PageRank, the pattern of outgoing and incoming hyperlinks on a web page enables network analysis across all the web pages on the internet.
This network analysis can reveal patterns such as authoritative websites with large volumes of incoming hyperlinks from other trusted sites, clusters of websites with related content with mutual links, and much more. These signals enabled RankDex and PageRank to provide a superior search experience that returned highly relevant and high quality websites while suppressing spammy websites.
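The idea of scoring pages by their incoming links can be sketched with the simplified, textbook form of PageRank below. This is not Google’s production algorithm and not RankDex; the damping factor of 0.85, the iteration count, and the four-page link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to; every
    link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: nobody links to "spam", three pages link to "hub".
links = {
    "hub": ["news", "blog"],
    "news": ["hub"],
    "blog": ["hub"],
    "spam": ["hub"],
}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page that attracts links from the others ends up with the highest score, while the page nobody links to is left with only the baseline share. Stuffing a page with keywords does nothing to change that, because the score depends on what other pages do.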
It’s hard to overstate the importance of RankDex and PageRank. These breakthrough innovations were among the key reasons why Robin Li and Larry Page were able to grow Baidu and Google into some of the most successful internet companies, each beating the competitors in its market.
Source: andrewjdupree.com
Search Engine Innovation after RankDex and PageRank
If we look back at the track record of technology innovation after RankDex/PageRank, it’s been a while since we had an equivalent level of disruptive innovation. Google has continued to dominate and maintain its leadership by enhancing search and scaling with the growth of the internet.
From a user perspective, the experience still feels very similar to what it was like using early search engines. Users still enter search keywords and then navigate a ranked listing of links to web pages.
To be fair, there have been a huge number of incremental improvements and enhancements to the search experience. Web search engines now let users search across all forms of content on the web, including text, images, videos, books, maps, product listings, and travel information, and the list keeps growing with the explosion of content types on the web. Users can now see summarized content extracted from web pages via Google’s knowledge graph, and they can also search using images instead of just keywords. These are all important and meaningful improvements that have enabled Google to continue dominating Search.
Despite all the progress to date, Search still feels like a DIY tool instead of a real solution
Users turn to search engines because they’re trying to find an answer to a question they have in their minds and/or to achieve a task or goal. For simple questions such as “Why is the Sky Blue?” Google does a great job of answering the question via a knowledge panel with a link to an authoritative source.
For more complex questions or tasks where there isn’t a one size fits all answer, it becomes more difficult. For example, here’s what Google shows for “recommendations for romantic trip to Paris”
There is a lot of great information on the search results page and there are links to great websites for more information. However, it’s still a DIY affair where the user has to click on each website listing, navigate the website content, and then compare information across various websites to plan their romantic trip to Paris.
Users may have to open multiple browser tabs with different variations of search keywords to find websites whose recommendations are more in line with their preferences and price points. What’s worse is that every search is new and independent of the previous ones, so users feel like they’re starting from scratch each time. It’s not hard to imagine a user spending hours running multiple searches, visiting dozens of websites, and clicking on multiple ads to plan a trip.
What if there was a better way? What if Google was like a personal concierge who knew your preferences and could engage in a natural back and forth dialogue to help you plan your trip?
Google already has all the technology pieces to transform search from a DIY tool into a genuinely valuable, full-service solution for users. Google already has a lot of your personal information and your interaction history across multiple Google products. Google already has some of the best machine learning and AI technology for learning user preferences and needs and providing recommendations. It also turns out that, with LaMDA, it has some of the best conversational technology for enabling natural back-and-forth conversations between Google and users.
So why did it take 20+ years for there to be real change in the Search landscape?
The cynical answer is that it wasn’t in Google’s best interests to truly solve the Search problem because it would kill their search advertising revenue. If Google Search had a perfect understanding of the user’s intent, it would find the perfect website to show in the search results page and the user’s needs would be met with one query. There also wouldn’t be the need or opportunity to pollute this perfect experience with ads. This would dramatically reduce the volume of ad impressions and clicks which in turn would lead to a huge reduction in revenue for Google.
The simpler answer is that Google was innovating at the pace that it needed to grow and fend off its competitors and challengers.
Thanks to PageRank and the subsequent stream of innovation, enhancement, and refinement of its product and services, Google was able to continue offering the best search experience. It wasn’t perfect but as long as there wasn’t a better alternative, Google could continue to be a successful company.
ChatGPT changes everything
We’re now seeing the emergence of a better alternative to search engines with ChatGPT, which was launched in November 2022. Google is now facing a real Innovator’s Dilemma moment and is aggressively taking action to accelerate the pace of innovation and change across Search and its other products and services.
In the next post, we’ll explore ChatGPT, why it’s creating such an existential crisis for Search, and why Search will never be the same again.
Please stay tuned!
Continue to Part 2 of “Search will never be the same after ChatGPT”