HOW A SEARCH ENGINE WORKS

Published on Jul 02, 2017

PRESENTATION OUTLINE

HOW A SEARCH ENGINE WORKS

                            BY: RUDRA, SHARAD
Photo by DocChewbacca

Crawling

  • Crawling, the acquisition of data about a website, is where it all begins. This involves scanning the site and building a complete list of everything on it: page titles, images, the keywords each page contains, and any other pages it links to.
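
To make the scanning step concrete, here is a minimal sketch that pulls the title, images, and outgoing links from a single page using Python's standard html.parser module. The PageScanner class, the scan_page helper, and the sample HTML are hypothetical names made up for illustration. A real crawler would also fetch pages over the network and handle malformed markup, which is left out here.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects the title, image sources, and outgoing links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title>...</title> is kept in this sketch.
        if self._in_title:
            self.title += data


def scan_page(html):
    """Return what the crawl records about one page (hypothetical helper)."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return {
        "title": scanner.title.strip(),
        "images": scanner.images,
        "links": scanner.links,
    }


# Made-up page used only for the example.
SAMPLE_HTML = """
<html><head><title>Cats</title></head>
<body><img src="cat.png"><a href="/dogs">Dogs</a><a href="/birds">Birds</a></body></html>
"""

page = scan_page(SAMPLE_HTML)
print(page["title"])   # Cats
print(page["images"])  # ['cat.png']
print(page["links"])   # ['/dogs', '/birds']
```

The links list is what lets a crawler move on to the next pages.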

How is a website crawled exactly?

  • An automated bot – a spider – visits each page, just like you or I would, only very quickly. Even in the earliest days, Google reported that they were reading a few hundred pages a second.
Photo by LaMenta3

Indexing

  • Indexing is the process of taking all of the data gathered in a crawl and placing it in a big database. Imagine trying to make a list of all the books you own, their authors, and their page counts. Going through each book is the crawl, and writing the list is the index.
Photo by quimby
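
The slide does not say how the database is organised. One common textbook structure is an inverted index, which maps each word to the pages that contain it. The sketch below assumes that structure. The tokenize and build_index helpers and the example pages are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of page URLs in which it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Made-up crawl results: URL -> page text.
CRAWLED = {
    "site.example/cats": "Cats sleep most of the day",
    "site.example/dogs": "Dogs play most of the day",
}

index = build_index(CRAWLED)
print(sorted(index["day"]))   # both pages
print(sorted(index["cats"]))  # only the cats page
```

With this structure, looking up a word is a single dictionary access, so the pages do not need to be rescanned for every query.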

Indexing

  • Any site that is linked to from an already-indexed site, or that has been manually submitted for indexing, will eventually be crawled. Some sites are crawled more frequently than others, and some to a greater depth.
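
The discovery rule above can be sketched as a breadth-first walk over links. The example assumes a small in-memory link graph in place of the real web. The discover function, the graph, and the site names are all hypothetical. Seeds stand in for sites that are already indexed or were manually submitted.

```python
from collections import deque


def discover(link_graph, seeds):
    """Return pages in the order a breadth-first crawl would reach them."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        # Follow each outgoing link once; already-seen pages are skipped.
        for target in link_graph.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Made-up link graph: page -> pages it links to.
LINK_GRAPH = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": [],
    "d.example": ["a.example"],
    "island.example": ["a.example"],  # nothing links TO this site
}

linked_only = discover(LINK_GRAPH, ["a.example"])
with_submission = discover(LINK_GRAPH, ["a.example", "island.example"])
print(linked_only)      # island.example is never reached
print(with_submission)  # island.example appears because it was submitted
```

No page links to island.example, so it is found only when it is passed in as a seed, which mirrors the manual submission case.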

Ranking & Retrieval

  • The last step is what you see – you type in a search query, and the search engine attempts to display the most relevant documents it finds that match your query. This is the most complicated step, but also the most relevant to you. It is also the area in which search engines differentiate themselves.
Photo by Brett Jordan

THE RESULT

  • The ranking algorithm checks your search query against billions of pages to determine how relevant each one is. The operation is complex, and because it is what sets one search engine apart from another, companies closely guard their ranking algorithms as secrets.
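
Real ranking algorithms are secret, so the following is only a toy illustration of scoring pages for relevance. It uses TF-IDF, a standard textbook weighting, and is not any search engine's actual method. The rank function, the smoothing used in the IDF term, and the sample pages are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, pages):
    """Return URLs of matching pages, most relevant first (TF-IDF score)."""
    docs = {url: Counter(tokenize(text)) for url, text in pages.items()}
    n_docs = len(docs)
    scores = {}
    for url, counts in docs.items():
        total = sum(counts.values())
        score = 0.0
        for term in set(tokenize(query)):
            # Term frequency: share of the page made up of this term.
            tf = counts[term] / total if total else 0.0
            # Document frequency: how many pages contain the term at all.
            df = sum(1 for c in docs.values() if term in c)
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += tf * idf
        if score > 0:
            scores[url] = score
    return sorted(scores, key=scores.get, reverse=True)


# Made-up pages used only for the example.
PAGES = {
    "site.example/cats": "cats cats and more cats",
    "site.example/pets": "cats and dogs are popular pets",
    "site.example/cars": "cars are fast",
}

cat_results = rank("cats", PAGES)
dog_results = rank("dogs", PAGES)
print(cat_results)  # the page that mentions cats most often comes first
print(dog_results)
```

Pages that never mention a query term receive no score and are left out of the results.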