How Search Engines Work

Exhibit 26.2  How the Google search engine works. (Source: Google).

A search engine is a software platform that searches and retrieves information from the internet. When a user types words into a search bar, search engines like Google, Yahoo and Bing scour their index of webpages, and analyse, rank and retrieve pages that match the search query, within a fraction of a second. This is a two-pronged process — the maintenance of a webpage index, and the search and retrieval of webpages.

These processes are explained by Google in the “How Search Works” video linked to Exhibit 26.2.

Crawl and Index

A webpage index is a gigantic database that stores the variables or clues that search algorithms use to generate search results. The Google index, which is well over 100 million gigabytes (100 petabytes) in size, holds more than 200 such variables. It is spread across data centres around the globe through distributed file systems such as Google’s Colossus.
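One common way to picture such an index is as an inverted index, which maps each word to the pages that contain it. The sketch below is only an illustration under that assumption: the page URLs, the text and the `build_index` helper are hypothetical, and a real index stores many more signals than words.

```python
from collections import defaultdict

# Hypothetical toy pages; a production index records far more than words.
PAGES = {
    "a.example/home": "search engines crawl and index webpages",
    "b.example/seo": "engines rank webpages for each search query",
}


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


index = build_index(PAGES)
print(sorted(index["search"]))  # pages containing the word "search"
print(sorted(index["crawl"]))   # pages containing the word "crawl"
```

Looking up a word is then a single dictionary access, which is one reason a lookup structure of this kind can answer queries quickly even when the underlying collection is very large.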

To create and maintain an index, search engines use automated programmes called spiders or bots (short for robots, e.g. Googlebot) that crawl webpages from link to link and retrieve data about the pages.

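Spiders need a starting list of addresses. Domain names and IP addresses are coordinated by ICANN (Internet Corporation for Assigned Names and Numbers), and Google states that its crawl begins with web addresses from past crawls and from sitemaps submitted by website owners. From these starting points, spiders seek out websites and crawl from link to link within each domain, gathering information on the site’s pages.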
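The link-to-link crawl can be sketched as a breadth-first traversal. This is a minimal illustration, not how Googlebot is implemented: the `TOY_WEB` dictionary stands in for the real web, the URLs are made up, and `fetch_links` is a hypothetical callback that would, in practice, download a page and extract its links.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "site.example/": ["site.example/about", "site.example/blog"],
    "site.example/about": ["site.example/"],
    "site.example/blog": ["site.example/blog/post-1", "other.example/"],
    "site.example/blog/post-1": [],
    "other.example/": [],
}


def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first crawl from the seed URLs, visiting each page once."""
    queue = deque(seeds)
    seen = set(seeds)   # URLs already queued, so no page is fetched twice
    visited = []        # crawl order
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


order = crawl(["site.example/"], lambda url: TOY_WEB.get(url, []))
print(order)
```

The `seen` set keeps the spider from looping when pages link back to each other, and `max_pages` is an assumed safety limit for the sketch.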

Search and Retrieve

As a user types into the engine’s search box, algorithms interpret what the user is looking for and identify the relevant pages in the index.

A query typically matches millions of pages with relevant information. Algorithms rank these pages on many factors; Google, for example, relies on over 200 unique signals or clues. The results are then displayed in rank order on the search engine results page (SERP).
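To make the match-then-rank step concrete, here is a small sketch that scores pages on two signals. Everything in it is an assumption for illustration: the `INDEX` entries, the two signals (matched words and inbound links) and their weights are invented, and they are not Google's actual signals or formula.

```python
# Hypothetical index entries: the words on each page plus one extra signal.
INDEX = {
    "a.example/guide": {"words": {"search", "engine", "index"}, "inbound_links": 40},
    "b.example/news": {"words": {"search", "results"}, "inbound_links": 5},
    "c.example/shop": {"words": {"shoes", "sale"}, "inbound_links": 90},
}


def search(query, index, top_n=10):
    """Return URLs matching the query, best score first."""
    terms = set(query.lower().split())
    scored = []
    for url, page in index.items():
        matched = len(terms & page["words"])
        if matched == 0:
            continue  # page is not relevant to this query
        # Assumed weights: 10 points per matched word, 0.1 per inbound link.
        score = matched * 10 + page["inbound_links"] * 0.1
        scored.append((score, url))
    # Highest score first; ties broken alphabetically by URL.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored[:top_n]]


results = search("search engine", INDEX)
print(results)
```

Note that the shop page has the most inbound links but is left out entirely because it matches none of the query words: in this sketch relevance filters first and the other signal only orders the pages that remain.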

