{"id":2450,"date":"2026-01-30T00:11:11","date_gmt":"2026-01-29T16:11:11","guid":{"rendered":"http:\/\/longzhuplatform.com\/?p=2450"},"modified":"2026-01-30T00:11:11","modified_gmt":"2026-01-29T16:11:11","slug":"what-the-latest-web-almanac-report-reveals-about-bots-cms-influence-llms-txt-via-sejournal-theshelleywalsh","status":"publish","type":"post","link":"http:\/\/longzhuplatform.com\/?p=2450","title":{"rendered":"What The Latest Web Almanac Report Reveals About Bots, CMS Influence &amp; llms.txt via @sejournal, @theshelleywalsh"},"content":{"rendered":"<div id=\"narrow-cont\"> <p>The Web Almanac is an annual report that translates the HTTP Archive dataset into practical insight, combining large-scale measurement with interpretation from industry experts.<\/p> <p>To get insights into what the 2025 report can tell us about what is actually happening in SEO, I spoke with one of the authors of the SEO chapter update, Chris Green, a well-known industry expert with over 15 years of experience.<\/p> <p>Chris shared with me some surprises about the adoption of llms.txt files and how CMS platforms are shaping SEO far more than we realize: little-known facts that the data surfaced in the research, and insights that would usually go unnoticed.<\/p> <p>You can watch the full interview with Chris on the IMHO recording at the end, or continue reading the article summary.<\/p> <blockquote> <p>\u201cI think the data [in the Web Almanac] helped to show me that there\u2019s still a lot broken. The web is really messy. Really messy.\u201d<\/p> <\/blockquote> <h2><strong>Bot Management Is No Longer \u2018Google, Or Not Google?\u2019<\/strong><\/h2> <p>Although bot management has been binary for some time \u2013 allow\/disallow Google \u2013 it\u2019s becoming a new challenge. 
It is something that Eoghan Henn had picked up previously and that Chris found in his research.<\/p> <p>We began our conversation by talking about how robots.txt files are now being used to express intent about AI crawler access.<\/p> <p>Chris responded that, firstly, SEOs need to be conscious of the different crawlers, what their intentions are, and fundamentally what blocking them might do, i.e., blocking some bots has bigger implications than others.<\/p> <p>Secondly, it requires the platform providers to actually respect those rules and treat those files appropriately. That isn\u2019t always happening, and the ethics around robots and AI crawlers is an area that SEOs need to know about and understand more.<\/p> <p>Chris explained that although the Almanac report showed the symptom of robots.txt usage, SEOs need to get ahead and understand how to control the bots.<\/p> <blockquote> <p>\u201cIt\u2019s not only understanding what the impact of each [bot\/crawler] is, but also how to communicate that with the business. If you\u2019ve got a team who want to cut as much bot crawling as possible because they want to save money, that might desperately impact your AI visibility.\u201d<\/p> <\/blockquote> <p>\u201cEqually, you might have an editorial team that doesn\u2019t want to get all of their work scraped and regurgitated. 
So, we, as SEOs, need to understand that dynamic, how to control it technically, but how to put that argument forward in the business as well,\u201d Chris explained.<\/p> <p>As more platforms and crawlers are introduced, SEO teams will have to consider all implications, and collaborate with other teams to ensure the right balance of access is applied to the site.<\/p> <h2><strong>Llms.txt Is Being Applied Despite No Official Platform Adoption<\/strong><\/h2> <p>The first surprising finding of the report was that adoption of the proposed llms.txt standard is around 2% of sites in the dataset.<\/p> <p>Llms.txt has been a heated topic in the industry, with many SEOs dismissing the value of the file. Some tools, such as Yoast, have included support for the standard, but as yet, there has been no demonstration of actual uptake by AI providers.<\/p> <p>Chris admitted that 2% was a higher adoption than he expected. But much of that growth appears to be driven by SEO tools that have added llms.txt as a default or optional feature.<\/p> <p>Chris is skeptical of its long-term impact. As he explained, Google has repeatedly stated it does not plan to use llms.txt, and without clear commitment from the major AI providers, especially OpenAI, it risks remaining a niche, symbolic gesture rather than a functional standard.<\/p> <p>That said, Chris has seen log-file data suggesting some AI crawlers are already fetching these files, and in limited cases, they may even be referenced as sources. Chris views this less as a competitive advantage and more as a potential parity mechanism, something that may help certain sites be understood, but not dramatically elevate them.<\/p> <blockquote> <p>\u201cGoogle has time and again said they don\u2019t plan to use llms.txt which they reiterated in Zurich at Search Central last year. I think, fundamentally, Google doesn\u2019t need it as they do have crawling and rendering nailed. 
So, I think it hinges on whether OpenAI say they will or won\u2019t use it and I think they have other problems than trying to set up a new standard.\u201d<\/p> <\/blockquote> <h2>Different, But Reassuringly The Same Where It Matters<\/h2> <p>I went on to ask Chris about how SEOs can balance the difference between search engine visibility and machine visibility.<\/p> <p>He thinks there is \u201ca significant overlap between what SEO was before we started worrying about this and where we are at the start of 2026.\u201d<\/p> <p>Despite this overlap, Chris was clear that if anyone thinks optimizing for search and machines is the same, then they are not aware of the two different systems, the different weightings, the fact that interpretation, retrieval, and generation are completely different.<\/p> <p>Although there are different systems and different capabilities in play, he doesn\u2019t think SEO has fundamentally changed. His belief is that SEO and AI optimization are \u201ckind of the same, reassuringly the same in the places that matter, but you will need to approach it differently\u201d because it diverges in how outputs are delivered and consumed.<\/p> <p>Chris did say that SEOs will move more towards feeds, feed management, feed optimization.<\/p> <blockquote> <p>\u201cGoogle\u2019s universal commerce protocol where you could potentially transact directly from search results or from a Gemini window obviously changes a lot. It\u2019s just another move to push the website out of the loop. But the information, what we\u2019re actually optimizing still needs to be optimized. It\u2019s just in a different place.\u201d<\/p> <\/blockquote> <h2><strong>CMS Platforms Shape The Web More Than SEOs Realize<\/strong><\/h2> <p>Perhaps the biggest surprise from Web Almanac 2025 was the scale of influence exerted by CMS platforms and tooling providers.<\/p> <p>Chris said that he hadn\u2019t realized just how big that impact is. \u201cPlatforms like Shopify, Wix, etc. 
are shaping the actual state of tech SEO probably more profoundly than I think a lot of people truly give it credit for.\u201d<\/p> <p>Chris went on to explain that \u201cas well-intentioned as individual SEOs are, I think our overall impact on the web is minimal outside of CMS platforms providers. I would say if you are really determined to have an impact outside of your specific clients, you need to be nudging WordPress or Wix or Shopify or some of the big software providers within those ecosystems.\u201d<\/p> <p>This creates an opportunity: websites that do implement technical standards correctly could achieve significant differentiation when most sites lag behind best practices.<\/p> <p>One of the more interesting insights from this conversation was how much of the web is broken and how little impact we [SEOs] really have.<\/p> <blockquote> <p>Chris explained that \u201ca lot of SEOs believe that Google owes us because we maintain the internet for them. We do the dirty work, but I also don\u2019t think we have as much impact perhaps at an industry level as maybe some like to believe. I think the data in the Web Almanac kind of helped show me that there\u2019s still a lot broken. The web is really messy. Really messy.\u201d<\/p> <\/blockquote> <h2><strong>AI Agents Won\u2019t Replace SEOs, But They Will Replace Bad Processes<\/strong><\/h2> <p>Our conversation concluded with AI agents and automation. Chris started by saying, \u201cAgents are easily misunderstood because we use the term differently.\u201d<\/p> <p>He emphasized that agents are not replacements for expertise, but accelerators of process. Most SEO workflows involve repetitive data gathering and pattern recognition, areas well-suited to automation. The value of human expertise lies in designing processes, applying judgment, and contextualizing outputs.<\/p> <p>Early-stage agents could automate 60-80% of the work, similar to a highly capable intern. 
\u201cIt\u2019s going to take your knowledge and your expertise to make that applicable to your given context. And I don\u2019t just mean the context of web marketing or the context of ecommerce. I mean the context of the business that you\u2019re specifically working for,\u201d he said.<\/p> <p>Chris argued that a lot of SEOs don\u2019t spend enough time customizing what they do to the client specifically. He thinks there\u2019s an opportunity to build an 80% automated process and then add your real value when your human intervention optimizes the last 20%, the business logic.<\/p> <p>SEOs who engage with agents, refine workflows, and evolve alongside automation are far more likely to remain indispensable than those who resist change altogether.<\/p> <p>However, when experimenting with automation, Chris warned we should avoid automating broken processes.<\/p> <blockquote> <p>\u201cYou need to understand the process that you\u2019re trying to optimize. If the process isn\u2019t very good, you\u2019ve just created a machine to produce mediocrity at scale, which frankly doesn\u2019t help anyone.\u201d<\/p> <\/blockquote> <p>Chris thinks that this will give SEOs an edge as AI is more widely adopted. \u201cI suggest the people that engage with it and make those processes better and show how they can be continually evolved, they\u2019ll be the ones that have greater longevity.\u201d<\/p> <h2><strong>SEOs Can Succeed By Engaging With The Complexity<\/strong><\/h2> <p>The Web Almanac 2025 doesn\u2019t suggest that SEO is being replaced, but it does show that its role is expanding in ways many teams haven\u2019t fully adapted to yet. 
Core principles like crawlability and technical hygiene still matter, but they now exist within a more complex ecosystem shaped by AI crawlers, feeds, closed systems, and platform-level decisions.<\/p> <p>Where technical standards are poorly implemented at scale, those who understand the systems that shape them can still gain a meaningful advantage.<\/p> <p>Automation works best when it accelerates well-designed processes and fails when it simply scales inefficiency. SEOs who focus on process design, judgment, and business context will remain essential as automation becomes more common.<\/p> <p>In an increasingly messy and machine-driven web, the SEOs who succeed will be those willing to engage with that complexity rather than ignore it.<\/p> <p>SEO in 2026 isn\u2019t about choosing between search and AI; it\u2019s about understanding how multiple systems consume content and where optimization now happens.<\/p> <p>Watch the full video interview with Chris Green here:<\/p> <p><iframe loading=\"lazy\" title=\"Web Almanac Report Data Analysis: Bot Management, CMS Influence &amp; llms.txt - IMHO With Chris Green\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/d8CrlTFLesg?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p> <p>Thank you to Chris Green for offering his insights and being my guest on IMHO.<\/p> <hr\/> <p><em>Featured Image: Shelley Walsh\/Search Engine Journal<\/em><\/p> <\/div> ","protected":false},"excerpt":{"rendered":"<p>The Web Almanac is an annual report that translates the HTTP Archive dataset into practical insight, combining large-scale measurement with expert interpretation from 
industry experts. To get insights into what the 2025 report can tell us about what is actually happening in SEO, I spoke with one of the authors of the SEO chapter update, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2451,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[6986,87,89,820,2257,1031,4318,840,4089,80,1387,345],"class_list":["post-2450","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-accessibility","tag-almanac","tag-amp","tag-bots","tag-cms","tag-influence","tag-latest","tag-llms-txt","tag-report","tag-reveals","tag-sejournal","tag-theshelleywalsh","tag-web"],"acf":[],"_links":{"self":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/2450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2450"}],"version-history":[{"count":0,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/2450\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/media\/2451"}],"wp:attachment":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2450"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}",
"templated":true}]}}