Why Google Indexes Blocked Web Pages

September 6, 2024

Google’s John Mueller answered a question about why Google indexes pages that are disallowed from crawling by robots.txt, and why it’s safe to ignore the related Search Console reports about those crawls.

Bot Traffic To Query Parameter URLs

The person asking the question documented that bots were creating links to non-existent query parameter URLs (?q=xyz) pointing at pages that have noindex meta tags and are also blocked in robots.txt. What prompted the question is that Google is crawling the links to those pages, getting blocked by robots.txt (without seeing the noindex robots meta tag), and then reporting the URLs in Google Search Console as “Indexed, though blocked by robots.txt.”
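To make the situation concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rule, the `/search` path, and the example.com URLs are assumptions for illustration, not details from the original question. It shows only that a crawler obeying robots.txt would never request the blocked URL, so it never gets the chance to read any noindex tag in the page's HTML.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com; the /search path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def can_crawl(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A bot-generated query parameter URL and an ordinary page (both hypothetical).
blocked_url = "https://example.com/search?q=xyz"
allowed_url = "https://example.com/about"

for url in (blocked_url, allowed_url):
    status = "crawlable" if can_crawl(url) else "blocked, HTML never fetched"
    print(f"{url}: {status}")
```

Note that Python's parser does simple path-prefix matching, which is narrower than the pattern matching Google documents for its own crawler, so this is an approximation of the behavior rather than a reproduction of Googlebot.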

The person asked the following question:

“But here’s the big question: why would Google index pages when they can’t even see the content? What’s the advantage in that?”

Google’s John Mueller confirmed that if Google can’t crawl the page, it can’t see the noindex meta tag. He also made an interesting mention of the site: search operator, advising to ignore its results because “average” users won’t see them.

He wrote:

“Yes, you’re correct: if we can’t crawl the page, we can’t see the noindex. That said, if we can’t crawl the pages, then there’s not a lot for us to index. So while you might see some of those pages with a targeted site:-query, the average user won’t see them, so I wouldn’t fuss over it. Noindex is also fine (without robots.txt disallow), it just means the URLs will end up being crawled (and end up in the Search Console report for crawled/not indexed; neither of these statuses cause issues to the rest of the site). The important part is that you don’t make them crawlable + indexable.”
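The three cases Mueller describes can be sketched as a small decision function. This is an illustration of the reasoning only, not Google's indexing logic: the outcome labels are paraphrases of his wording, and the function and class names are hypothetical. The noindex check uses Python's standard `html.parser` and only runs when the page is crawlable, mirroring the point that a blocked page's HTML is never read.

```python
from html.parser import HTMLParser

# Outcome labels paraphrased from Mueller's answer (illustrative only).
BLOCKED = "may be indexed without content; noindex never seen"
NOINDEXED = "crawled, not indexed"
INDEXABLE = "crawlable and indexable (the case to avoid for junk URLs)"


class RobotsMetaParser(HTMLParser):
    """Records whether the page has <meta name="robots"> containing noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def expected_outcome(blocked_by_robots: bool, html: str) -> str:
    """Map the robots.txt state and page HTML to the outcome described above."""
    if blocked_by_robots:
        # The page is never fetched, so its HTML is never inspected.
        return BLOCKED
    if has_noindex(html):
        return NOINDEXED
    return INDEXABLE


page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(expected_outcome(True, page))
print(expected_outcome(False, page))
print(expected_outcome(False, "<html><head></head></html>"))
```

The ordering of the checks is the point: the robots.txt test comes first, so adding a disallow rule hides the noindex tag from the crawler instead of reinforcing it.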

Takeaways:

1. Mueller’s answer confirms the limitations of using the site: advanced search operator for diagnostic purposes. One reason is that it is not connected to the regular search index; it is a separate thing altogether.

Google’s John Mueller commented on the site: search operator in 2021:

“The short answer is that a site: query is not meant to be complete, nor used for diagnostics purposes.

A site query is a specific kind of search that limits the results to a certain website. It’s basically just the word site, a colon, and then the website’s domain.

This query limits the results to a specific website. It’s not meant to be a comprehensive collection of all the pages from that website.”

2. A noindex tag without a robots.txt disallow is fine for situations like this one, where a bot is linking to non-existent pages that Googlebot then discovers.

3. URLs with the noindex tag will generate a “crawled/not indexed” entry in Search Console, and those entries won’t have a negative effect on the rest of the website.

Read the question and answer on LinkedIn:

Why would Google index pages when they can’t even see the content?

Featured Image by Shutterstock/Krakenimages.com
