Source: Ahref Blog
Critics:
The leading search engines, such as Google, Bing, and Yahoo!, use crawlers to find pages for their algorithmic search results. Pages that are linked from other search engine-indexed pages do not need to be submitted because they are found automatically. The Yahoo! Directory and DMOZ, two major directories which closed in 2014 and 2017 respectively, both required manual submission and human editorial review.
Google offers Google Search Console, through which an XML Sitemap feed can be created and submitted for free to ensure that all pages are found, especially pages that are not discoverable by automatically following links. This is in addition to Google's URL submission console.
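As a rough illustration of what such a Sitemap feed contains, the sketch below builds a minimal XML Sitemap with Python's standard library. The function name `build_sitemap` and the example.com URLs are hypothetical; only the `urlset`, `url`, and `loc` elements and the sitemaps.org namespace come from the Sitemap protocol itself.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML Sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, including one that no other page links to.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/orphan-page",
])
print(sitemap_xml)
```

The resulting file would typically be saved as sitemap.xml and submitted through Search Console; optional elements such as a last-modified date are left out here to keep the sketch short.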
Yahoo! formerly operated a paid submission service that guaranteed crawling for a cost per click; however, this practice was discontinued in 2009. Search engine crawlers may look at a number of different factors when crawling a site. Not every page is indexed by search engines. The distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled.
Mobile devices are used for the majority of Google searches. In November 2016, Google announced a major change to the way they are crawling websites and started to make their index mobile-first, which means the mobile version of a given website becomes the starting point for what Google includes in their index.
In May 2019, Google updated the rendering engine of their crawler to be the latest version of Chromium (74 at the time of the announcement). Google indicated that they would regularly update the Chromium rendering engine to the latest version. In December 2019, Google began updating the User-Agent string of their crawler to reflect the latest Chrome version used by their rendering service.
The delay was to allow webmasters time to update their code that responded to particular bot User-Agent strings. Google ran evaluations and felt confident the impact would be minor. To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain.
Additionally, a page can be explicitly excluded from a search engine’s database by using a meta tag specific to robots (usually <meta name="robots" content="noindex">). When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages that a webmaster does not want crawled.
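The robots.txt check described above can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, and the example.com URLs are made up for illustration, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as it might appear at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("https://example.com/products/widget"))   # True
print(may_crawl("https://example.com/cart/checkout"))     # False
print(may_crawl("https://example.com/search?q=widgets"))  # False
```

Here the rules are parsed from a string so the example runs offline. A crawler would instead fetch the live file, which is where a stale cached copy can lead to the occasional unwanted crawl.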
Pages typically prevented from being crawled include login-specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam. In 2020, Google sunsetted the standard (and open-sourced their code) and now treats it as a hint rather than a directive.
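To check whether a page carries the robots meta tag mentioned earlier, one option is to scan its HTML for `<meta name="robots">` and inspect the directives. The sketch below uses Python's standard `html.parser`; the class name `RobotsMetaParser` and the helper `is_noindex` are hypothetical, and it only looks at the generic `robots` name, not crawler-specific variants.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(is_noindex(page))  # True
```

A check like this could be run over internal search result pages or cart pages to confirm that they are actually marked as excluded.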
To adequately ensure that pages are not indexed, a page-level robots meta tag should be included. A variety of methods can increase the prominence of a webpage within the search results. Cross linking between pages of the same website to provide more links to important pages may improve their visibility. Page design makes users trust a site and want to stay once they find it. When people bounce off a site, it counts against the site and affects its credibility.
Writing content that includes frequently searched keyword phrases so as to be relevant to a wide variety of search queries will tend to increase traffic. Updating content so as to keep search engines crawling back frequently can give additional weight to a site. Adding relevant keywords to a web page’s metadata, including the title tag and meta description, will tend to improve the relevancy of a site’s search listings, thus increasing traffic.
URL canonicalization of web pages accessible via multiple URLs, using the canonical link element or 301 redirects, can help make sure links to different versions of the URL all count towards the page’s link popularity score. These are known as incoming links, which point to the URL and can count towards the page’s link popularity score, impacting the credibility of a website.
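The effect of canonicalization on link counting can be illustrated with a small sketch. This is not the canonical link element or a 301 redirect itself; it simply applies a few assumed normalization rules (force https, drop a leading "www.", drop trailing slashes, query strings, and fragments) so that link variants collapse to one URL. The `canonicalize` function and the example.com links are hypothetical, and real sites must choose rules that match how their URLs actually behave.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url):
    """Map common variants of a URL to one canonical form (illustrative rules)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # Assumption: scheme, query string, and fragment do not change the content.
    return urlunsplit(("https", host, path, "", ""))


# Hypothetical incoming links pointing at different versions of the same page.
inbound_links = [
    "http://www.example.com/widgets/",
    "https://example.com/widgets",
    "https://EXAMPLE.com/widgets?utm_source=newsletter",
    "https://example.com/about",
]

link_counts = Counter(canonicalize(link) for link in inbound_links)
print(link_counts["https://example.com/widgets"])  # 3
```

Without canonicalization the three widget links would be counted against three separate URLs; with it, all three are credited to a single page.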
SEO techniques can be classified into two broad categories: techniques that search engine companies recommend as part of good design (“white hat”), and those techniques of which search engines do not approve (“black hat”). Search engines attempt to minimize the effect of the latter, among them spamdexing. Industry commentators have classified these methods and the practitioners who employ them as either white hat SEO or black hat SEO.
White hats tend to produce results that last a long time, whereas black hats anticipate that their sites may eventually be banned either temporarily or permanently once the search engines discover what they are doing. An SEO technique is considered a white hat if it conforms to the search engines’ guidelines and involves no deception. As the search engine guidelines are not written as a series of rules or commandments, this is an important distinction to note.
White hat SEO is not just about following guidelines but is about ensuring that the content a search engine indexes and subsequently ranks is the same content a user will see. White hat advice is generally summed up as creating content for users, not for search engines, and then making that content easily accessible to the online “spider” algorithms, rather than attempting to trick the algorithm away from its intended purpose.
White hat SEO is in many ways similar to web development that promotes accessibility, although the two are not identical. Black hat SEO attempts to improve rankings in ways that are disapproved of by the search engines or involve deception. One black hat technique uses hidden text, either as text colored similarly to the background, in an invisible div, or positioned off-screen. Another method gives a different page depending on whether the page is being requested by a human visitor or a search engine, a technique known as cloaking.
Another category sometimes used is grey hat SEO. This is in between the black hat and white hat approaches, where the methods employed avoid the site being penalized but do not act in producing the best content for users. Grey hat SEO is entirely focused on improving search engine rankings.
Search engines may penalize sites they discover using black or grey hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines’ algorithms or by a manual site review. One example was the February 2006 Google removal of both BMW Germany and Ricoh Germany for the use of deceptive practices.
Both companies, however, quickly apologized, fixed the offending pages, and were restored to Google’s search engine results page. SEO is not an appropriate strategy for every website, and other Internet marketing strategies can be more effective, such as paid advertising through pay-per-click (PPC) campaigns, depending on the site operator’s goals. Search engine marketing (SEM) is the practice of designing, running, and optimizing search engine ad campaigns.
Its difference from SEO is most simply depicted as the difference between paid and unpaid priority ranking in search results. SEM focuses on prominence more so than relevance; website developers should treat SEM as highly important for visibility, since most users navigate to the primary listings of their search. A successful Internet marketing campaign may also depend upon building high-quality web pages to engage and persuade internet users, setting up analytics programs to enable site owners to measure results, and improving a site’s conversion rate.
In November 2015, Google released a full 160-page version of its Search Quality Rating Guidelines to the public, which revealed a shift in their focus towards “usefulness” and mobile local search. In recent years the mobile market has exploded, overtaking the use of desktops, as shown by StatCounter in October 2016, when they analyzed 2.5 million websites and found that 51.3% of the pages were loaded by a mobile device.
Google has been one of the companies capitalizing on the popularity of mobile usage by encouraging websites to use its Google Search Console and the Mobile-Friendly Test, which allow companies to measure their website against the search engine results and determine how user-friendly it is. The closer together the keywords are, the more the ranking will improve based on key terms.