From the very beginning, indexing has followed a simple principle: gather everything new on the Internet at a given moment, including new sites, new pages, and content updates, and then surface whatever has been indexed and is relevant to a topic or a keyword search.
In this library-like index, the place of books is taken by long lists of web pages that Google has "heard of"; new content and page updates are retrieved and indexed almost immediately.
This analysis is called "crawling". How a page is analyzed and then indexed depends on the quality of the Google Spider crawler, which is evidently quite good, since the index is now updated continuously.
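As a very rough illustration of this crawl-and-discover loop, here is a minimal sketch in Python. It is not Google's actual crawler; the seed URLs, the page limit, and the helper names are assumptions made purely for illustration.

```python
# Minimal sketch of a crawl loop: fetch a page, extract its links,
# and queue newly discovered URLs. Purely illustrative, not Googlebot.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=10):
    frontier = deque(seed_urls)   # URLs waiting to be fetched
    seen = set(seed_urls)         # URLs already discovered
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
        except OSError:
            continue              # skip pages that cannot be fetched
        fetched += 1
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)   # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)   # newly discovered page
    return seen


# Example seed, analogous to "pages captured in previous crawls":
# crawl(["https://www.example.com/"])
```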
Some parts of a site can be "hidden" from these robots by using the robots.txt file or a noindex tag.
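For example, a robots.txt rule that asks crawlers to stay out of a directory, and a noindex tag placed in a page's head section, might look like this (the path is a placeholder):

```
# robots.txt at the root of the site
User-agent: *
Disallow: /private/
```

```html
<meta name="robots" content="noindex">
```

Note that robots.txt blocks crawling of the listed paths, while the noindex tag lets a page be crawled but asks for it to be kept out of the index.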
Crawling results are added to the Google index database only after a careful yet quick analysis that determines whether the site is of good quality or not.
Crawling starts from web pages already captured in previous crawls, plus the URLs that webmasters provide in their sitemaps.
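A sitemap is simply an XML file listing the URLs a webmaster wants crawled. A minimal one following the sitemaps.org protocol could look like this (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
</urlset>
```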