1.2 Indexability & status · High · Verified

Indexed though blocked by robots

A URL blocked in robots.txt can still be indexed as a bare link with no description, because blocking crawling does not remove a URL from the index: Google can still discover the URL through links from other pages. To keep a page out of the index, I allow crawling and use noindex, not a Disallow.

What it is

URL blocked in robots.txt yet still indexed (often URL-only).

Why it matters

Google can index a blocked URL it can’t read, producing a poor, contentless listing it can’t improve.

How to fix it

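Allow crawling and add noindex if exclusion is wanted. Do not combine noindex with a Disallow: the two work against each other, because a crawler that is blocked from fetching the page never sees the noindex.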
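A minimal sketch of that rule, using only the Python standard library. It checks whether a noindex on a page can take effect, which requires that robots.txt lets the crawler fetch the page. The function names, the example URL, and the plain dict of response headers are assumptions made for illustration. Python's urllib.robotparser also does simple prefix matching and does not reproduce Google's wildcard handling, so treat the result as an approximation.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())


def crawl_allowed(robots_txt, url, agent="Googlebot"):
    """True if the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def has_noindex(html, headers):
    """True if noindex appears in a robots meta tag or the X-Robots-Tag header."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    # Assumes headers is a plain dict keyed exactly "X-Robots-Tag".
    header = headers.get("X-Robots-Tag", "").lower()
    return any("noindex" in value for value in finder.directives + [header])


def noindex_is_effective(robots_txt, url, html, headers):
    """noindex only works when the crawler may fetch the page and read it."""
    return crawl_allowed(robots_txt, url) and has_noindex(html, headers)


if __name__ == "__main__":
    blocked_rules = "User-agent: *\nDisallow: /private/\n"
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    url = "https://example.com/private/report.html"  # hypothetical URL
    print(noindex_is_effective(blocked_rules, url, page, {}))
```

With the Disallow in place the check fails even though the page carries noindex, which is the situation this item describes.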

How to find it on your site

  1. Search Google with the site: operator and look for URLs showing no available description.
  2. Check the Search Console Pages report for Indexed, though blocked by robots.txt.
  3. Decide whether each URL should be indexed or not.
  4. To remove it, allow crawling and add noindex, rather than relying on robots.txt.
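Steps 3 and 4 can be sketched as a small triage helper. This is an illustration under assumptions, not a definitive tool: the triage function, the action wording, and the example URLs are hypothetical, and the input is assumed to be a hand-made mapping of each reported URL to whether it should be indexed. It relies on Python's urllib.robotparser, which only approximates how Google matches robots.txt rules.

```python
from urllib.robotparser import RobotFileParser


def triage(robots_txt, pages, agent="Googlebot"):
    """Return {url: action} for URLs reported as indexed.

    pages maps each URL to True (should be indexed) or False (should not).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    actions = {}
    for url, want_indexed in pages.items():
        blocked = not parser.can_fetch(agent, url)
        if blocked and want_indexed:
            # The listing can only improve once the content can be read.
            actions[url] = "remove the Disallow"
        elif blocked:
            # noindex is invisible while the URL is blocked, so unblock first.
            actions[url] = "remove the Disallow, then add noindex"
        elif want_indexed:
            actions[url] = "no change"
        else:
            actions[url] = "add noindex"
    return actions


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    reported = {
        "https://example.com/private/pricing.html": True,   # hypothetical URLs
        "https://example.com/private/staging.html": False,
    }
    for url, action in triage(rules, reported).items():
        print(url, "->", action)
```

Both blocked cases start by lifting the Disallow. The only difference is whether a noindex follows.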

Cross-reference to ranking and citation factors

A URL indexed without content adds nothing and clutters the index. The fix aligns crawl and index directives so the page is handled as intended.

Impact

Medium-high; produces poor, contentless listings. Direct.

Evidence

A robots-blocked URL can still be indexed; use noindex (crawlable) to exclude. Google Search Central, Block Search indexing with noindex; Google Search Central, Intro to robots.txt