Indexed, though blocked by robots.txt
A URL blocked in robots.txt can still be indexed as a bare link with no description, because blocking crawling does not remove a URL from the index. To keep a page out of the index, I allow crawling and use noindex rather than a Disallow rule.
The URL is blocked in robots.txt yet still indexed, often as a URL-only listing.
Why it matters
Google can index a blocked URL it can’t read, producing a poor, contentless listing it can’t improve.
How to fix it
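If the page should be excluded, allow crawling and add noindex. A Disallow rule and noindex cannot work together: Google has to crawl the page to see the noindex, so a blocked page's noindex is never read.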
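As a rough illustration of what the noindex signal looks like, the Python sketch below reports whether a page declares noindex through a robots meta tag or an X-Robots-Tag response header. The function name has_noindex and the helper class are hypothetical, not part of any library, and the sketch ignores user-agent-scoped header values such as "googlebot: noindex".

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.extend(
                d.strip().lower() for d in attr.get("content", "").split(",")
            )


def has_noindex(html, headers=None):
    """Return True if the page signals noindex via a meta tag or X-Robots-Tag header.

    `headers` is a plain dict of response headers (hypothetical input shape).
    """
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.extend(d.strip().lower() for d in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

Google can only act on this signal if robots.txt lets it fetch the page, which is why the Disallow has to go first.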
How to find it on your site
- Search Google with the site: operator and look for URLs showing no available description.
- Check the Search Console Pages report for Indexed, though blocked by robots.txt.
- Decide whether each URL should be indexed or not.
- To remove a URL from the index, allow crawling and add noindex rather than relying on robots.txt.
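The decision in the last two steps can be sketched in Python with the standard library's urllib.robotparser. The plan_action function, its return strings, and the example.com URLs are hypothetical. The parser's matching is simpler than Google's (it does not implement Google's wildcard patterns), so treat the result as a first pass rather than a verdict. Unblocking pages that should be indexed is an assumption here, following from Google being unable to read blocked content.

```python
from urllib.robotparser import RobotFileParser


def plan_action(url, robots_txt, should_be_indexed, user_agent="Googlebot"):
    """Suggest a fix for one URL, given the robots.txt text and the indexing intent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = not parser.can_fetch(user_agent, url)

    if should_be_indexed:
        if blocked:
            return "remove Disallow so the content can be crawled"
        return "no change needed"
    if blocked:
        # A noindex on a blocked page is never seen, so unblock first.
        return "remove Disallow, then add noindex"
    return "add noindex"


robots = "User-agent: *\nDisallow: /private/\n"
for url, wanted in [
    ("https://example.com/private/page", False),
    ("https://example.com/public", True),
]:
    print(url, "->", plan_action(url, robots, wanted))
```

The list of URLs and the should_be_indexed flag would come from your own review of the Search Console report; the sketch only combines that intent with the crawl rule.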
Cross-reference to ranking and citation factors
A URL indexed without content adds nothing and clutters the index. The fix aligns crawl and index directives so the page is handled as intended.
Impact
Medium-high; poor listings. Direct.
Evidence
A robots-blocked URL can still be indexed; to exclude a page, use noindex on a page that remains crawlable. Google Search Central, Block Search indexing with noindex; Google Search Central, Intro to robots.txt
Sources