{"id":10072,"date":"2026-06-16T02:28:44","date_gmt":"2026-06-15T18:28:44","guid":{"rendered":"http:\/\/longzhuplatform.com\/?p=10072"},"modified":"2026-06-16T02:28:44","modified_gmt":"2026-06-15T18:28:44","slug":"googles-mueller-says-llms-txt-cant-help-llms-differentiate-sites-via-sejournal-mattgsouthern","status":"publish","type":"post","link":"http:\/\/longzhuplatform.com\/?p=10072","title":{"rendered":"Google\u2019s Mueller Says llms.txt Can\u2019t Help LLMs Differentiate Sites via @sejournal, @MattGSouthern"},"content":{"rendered":"<div id=\"narrow-cont\"> <p>Google\u2019s John Mueller argued that LLM systems can\u2019t use files like llms.txt to decide which websites to surface for a given query.<\/p> <p>He made the comments on a recent episode of Search Off the Record, the podcast from Google\u2019s Search Relations team.<\/p> <p>His comment points to a broader signal problem, not just intentional gaming. Even a well-written llms.txt file is still self-reported information from the site that wants to be chosen.<\/p> <p>For discovery, Mueller pointed back to normal HTML pages and internal links.<\/p> <h2>What Mueller Said<\/h2> <p>The conversation started with a question about whether publishers should convert websites to Markdown for LLMs. Mueller and co-host Martin Splitt agreed that HTML is still the foundation for crawling and discovery.<\/p> <p>The discussion got specific when Mueller turned to llms.txt. He described the discovery use case as a dead end:<\/p> <blockquote> <p>\u201cIt\u2019s basically you\u2019re telling these systems, like, I have the best website ever. And here are all of the pages that everyone must go to. And you must buy all of my products or whatever you put in there. 
So in LLM system, it basically, by design, can\u2019t trust what is here as a way of differentiating between different websites.\u201d<\/p> <\/blockquote> <p>His argument comes down to differentiation. If every site uses llms.txt to promote itself, the files can all make similar claims. An LLM deciding which site best answers a query still needs another way to tell them apart.<\/p> <h2>What \u2018By Design\u2019 Might Mean<\/h2> <p>\u201cBy design\u201d could mean two different things, and Mueller didn\u2019t clarify which.<\/p> <p>One reading is architectural. LLM systems evaluate web content and can\u2019t use self-reported files when picking sources.<\/p> <p>The other reading treats it as a signal problem. Self-reported signals lose value when everyone provides them. Meta keywords stopped working for the same reason. Every site stuffed them, and search engines couldn\u2019t extract a useful ranking signal.<\/p> <p>Both readings reach the same conclusion on discovery. But they imply different things about whether the limitation could change over time.<\/p> <h2>Where Mueller Sees A Role<\/h2> <p>Mueller didn\u2019t reject all uses of llms.txt. He carved out one case where it could help:<\/p> <blockquote> <p>\u201cIf someone is already on your website, maybe some kind of automated system is helpful.\u201d<\/p> <\/blockquote> <p>He used the example of an agent trying to buy a photograph from a specific site. The LLM would visit the site and look for instructions on how to complete the purchase.<\/p> <p>The argument splits discovery from navigation. llms.txt can\u2019t help an LLM choose which site to visit. But it could help once the agent is already there, like a store directory for someone who already walked in.<\/p> <h2>Beyond The Gaming Argument<\/h2> <p>Mueller has called building Markdown pages for bots \u201ca stupid idea\u201d. 
He\u2019s also compared llms.txt to the keywords meta tag.<\/p> <p>SEJ\u2019s Roger Montti wrote that llms.txt is \u201cinherently untrustworthy\u201d because nothing stops site owners from adding self-serving content. SE Ranking\u2019s analysis of 300,000 domains found no link between llms.txt adoption and citation frequency in LLM answers.<\/p> <p>Those arguments focused on what happens when people game the files. Mueller\u2019s podcast comment adds the nuance that there\u2019s no mechanism within the files to help an LLM pick one site over another.<\/p> <h2>Why This Matters<\/h2> <p>The gaming argument against llms.txt has always had a counterargument available. Platforms could learn to penalize manipulation, the way search engines handled spammy structured data.<\/p> <p>The differentiation argument leaves a harder problem. Penalizing manipulation may address abuse, but it doesn\u2019t explain how self-reported files help an LLM choose one site over another. Your most accurate llms.txt file still can\u2019t tell an LLM to pick your site over a competitor\u2019s.<\/p> <h2>Looking Ahead<\/h2> <p>Standards for how agents navigate sites haven\u2019t settled yet, Mueller acknowledged. He mentioned WebMCP alongside other file types under discussion.<\/p> <p>None have become a standard. By his estimate, it could take six months to a year, or longer, for agentic systems to settle on a format. The discovery layer, where HTML and internal linking already work, isn\u2019t part of that discussion.<\/p> <\/div> ","protected":false},"excerpt":{"rendered":"<p>Google\u2019s John Mueller argued that LLM systems can\u2019t use files like llms.txt to decide which websites to surface for a given query. He made the comments on a recent episode of Search Off the Record, the podcast from Google\u2019s Search Relations team. 
His comment points to a broader signal problem, not just intentional gaming. Even [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":10073,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[28285,179,299,4318,90,180,80,3181],"class_list":["post-10072","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-accessibility","tag-differentiate","tag-googles","tag-llms","tag-llms-txt","tag-mattgsouthern","tag-mueller","tag-sejournal","tag-sites"],"acf":[],"_links":{"self":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/10072","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=10072"}],"version-history":[{"count":0,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/10072\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/media\/10073"}],"wp:attachment":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=10072"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=10072"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=10072"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}