Crawling is where it all begins: the acquisition of data about a website. It involves scanning the site and building a complete inventory of everything on it – the page title, the images, the keywords it contains, and any other pages it links to.
An automated bot – a spider – visits each page, just as you or I would, only far more quickly. Even in its earliest days, Google reported that its spiders were reading a few hundred pages a second.
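As a rough sketch of how such a spider works, here is a minimal breadth-first crawler in Python. It records each page's title and queues the links it finds. The seed URL and page limit are invented for the example; a real spider would also respect robots.txt, rate-limit itself, and extract far more than titles and links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkAndTitleParser(HTMLParser):
    """Collects the page title and every outgoing link on a page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def crawl(seed, max_pages=10):
    """Breadth-first crawl: fetch a page, record its title, queue its links."""
    frontier = [seed]   # pages waiting to be visited
    seen = set()        # URLs already fetched
    pages = {}          # url -> page title
    while frontier and len(pages) < max_pages:
        url = frontier.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except (OSError, ValueError):
            continue  # skip pages that fail to load or have unusable URLs
        parser = LinkAndTitleParser()
        parser.feed(html)
        pages[url] = parser.title.strip()
        # Resolve relative links against the current page and queue them
        frontier.extend(urljoin(url, link) for link in parser.links)
    return pages

if __name__ == "__main__":
    # example.com is a placeholder seed, not a real crawl target
    for url, title in crawl("https://example.com").items():
        print(title, "->", url)
```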
Indexing is the process of taking all the data gathered during a crawl and placing it in a big database. Imagine trying to make a list of all the books you own, with each one's author and page count: going through each book is the crawl, and writing the list is the index.
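In code, that big database is typically an inverted index: a map from each word to the documents that contain it. A minimal sketch, with toy documents invented for the example:

```python
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Toy crawl output: document id -> page text (invented for illustration)
documents = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "a quick recipe for dog treats",
}

index = build_index(documents)
print(sorted(index["quick"]))  # [1, 3] -- every document containing "quick"
print(sorted(index["brown"]))  # [1, 2]
```

Once the index exists, lookups never have to re-read the pages themselves – just as your book list saves you from re-reading the books.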
Any site that is linked to from an already-indexed site, or whose owner has manually requested indexing, will eventually be crawled – some sites more frequently than others, and some to a greater depth.
The last step is the one you actually see: you type in a search query, and the search engine attempts to display the most relevant documents that match it. This is the most complicated step, but also the one that matters most to you – and it is the area in which search engines differentiate themselves.
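At its simplest, answering a query means intersecting the index entries for each query word. A sketch, reusing the toy index from the indexing example (rebuilt here so the snippet runs on its own):

```python
from collections import defaultdict

# Same toy documents as in the indexing sketch above
documents = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "a quick recipe for dog treats",
}
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)

def search(query, index):
    """Return the ids of documents containing every word in the query."""
    result = None
    for word in query.lower().split():
        postings = index.get(word, set())
        result = postings if result is None else result & postings
    return result or set()

print(search("quick dog", index))  # {3}: the only page with both words
```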
The ranking algorithm checks your search query against billions of pages to determine how relevant each one is. These algorithms are valuable enough that companies guard them as closely held secrets.
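Public ranking schemes do exist, though, and a classic starting point is TF-IDF: reward documents where the query words appear often, and weight rarer words more heavily. A sketch of that idea – real engines combine hundreds of signals, so this is illustration only:

```python
import math
from collections import Counter

def tf_idf_scores(query, documents):
    """Score each document against the query with a basic TF-IDF sum."""
    n_docs = len(documents)
    tokenized = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    # Document frequency: how many documents contain each word?
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))
    scores = {}
    for doc_id, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            if df[term] == 0:
                continue  # term appears nowhere in the collection
            tf = counts[term] / len(words)      # term frequency in this document
            idf = math.log(n_docs / df[term])   # rarer terms get a higher weight
            score += tf * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

documents = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "a quick recipe for dog treats",
}
print(tf_idf_scores("quick dog", documents))  # document 3 ranks first
```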