Exhibit 26.2 How the Google search engine works. (Source: Google).
A search engine is a software platform that searches for and retrieves information
from the internet. When a user types words into a search bar, search engines such as Google, Yahoo and Bing scour their index of webpages
and, within a fraction of a second, analyse, rank and retrieve the pages that match the search query. This is a two-pronged
process: the maintenance of a webpage index, and the search and retrieval of webpages.
These processes are explained by Google in the “How Search Works” video linked to Exhibit 26.2.
A webpage index is a gigantic database that stores the variables or clues that
search algorithms use to generate search results. The Google index, which is well over 100 million gigabytes (about 100 petabytes) in size, contains
more than 200 such variables for the pages it covers. The index is distributed across
data centres around the globe through distributed file systems such as Google's Colossus.
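To make the idea concrete, the sketch below builds a toy inverted index in Python: each word maps to the set of pages that contain it, so a lookup never has to scan every page. The page URLs and texts are invented for illustration; a real index such as Google's also stores the many ranking signals described above.

    from collections import defaultdict

    # Toy inverted index: each word maps to the set of pages containing it.
    inverted_index = defaultdict(set)

    # Hypothetical pages, for illustration only.
    pages = {
        "example.com/tea": "green tea brewing guide",
        "example.com/coffee": "coffee roasting and brewing",
    }

    for url, text in pages.items():
        for word in text.split():
            inverted_index[word].add(url)

    # A lookup touches only the relevant entry, not every page.
    print(inverted_index["brewing"])
    # {'example.com/tea', 'example.com/coffee'}  (set order may vary)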
To create and maintain an index, search engines use automated programmes called spiders or bots (short
for robots, e.g. Googlebot) that crawl the web from link to link and retrieve data about the pages.
Spiders source domain names and IP addresses from ICANN (the Internet Corporation for Assigned Names and Numbers).
They use this information to seek out websites and crawl from link to link within each domain, scouring each
page for information.
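This link-to-link crawl can be pictured as a breadth-first traversal of the web. The following is a minimal, hypothetical spider in Python (using the third-party requests library); real crawlers such as Googlebot add robots.txt politeness rules, scheduling and proper HTML parsing at a vastly larger scale.

    from collections import deque
    import re
    import requests

    def crawl(seed_url, max_pages=10):
        """Breadth-first crawl: fetch a page, pull out its links, follow them."""
        queue = deque([seed_url])
        seen = {seed_url}
        while queue:
            url = queue.popleft()
            try:
                html = requests.get(url, timeout=5).text
            except requests.RequestException:
                continue  # skip pages that cannot be fetched
            # Naive link extraction; a real spider parses the HTML properly.
            for link in re.findall(r'href="(https?://[^"]+)"', html):
                if link not in seen and len(seen) < max_pages:
                    seen.add(link)
                    queue.append(link)
        return seen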
When a user types a query into the engine's search box, algorithms interpret
the information the user is seeking and identify the relevant pages in the index.
A query usually matches millions of pages with relevant information. Algorithms rank these pages
based on numerous factors, such as the more than 200 unique
signals or clues that Google relies on. The results are then listed in rank order on the user's search engine
results page (SERP).
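Query handling can then be sketched as two steps over the index: gather the candidate pages, then rank them. In the toy Python example below, a page's score is simply the number of query words it matches, a deliberately crude stand-in for the 200-plus signals Google combines.

    # Toy index from the earlier sketch: word -> pages containing it.
    index = {
        "tea": {"example.com/tea"},
        "brewing": {"example.com/tea", "example.com/coffee"},
        "coffee": {"example.com/coffee"},
    }

    def search(query):
        """Rank candidate pages by how many query words they match."""
        scores = {}
        for word in query.lower().split():
            for page in index.get(word, set()):
                scores[page] = scores.get(page, 0) + 1
        # Best matches first, as on a SERP.
        return sorted(scores, key=scores.get, reverse=True)

    print(search("tea brewing"))
    # ['example.com/tea', 'example.com/coffee']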