<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>
<channel>
<title>Crawl Budget Optimization for JavaScript Websites – Prerender.io</title>
<atom:link href="https://prerender.io/blog/crawl-budget/feed/" rel="self" type="application/rss+xml" />
<link>https://prerender.io/crawl-budget/</link>
<description>Prerender. JavaScript SEO, solved with Dynamic Rendering</description>
<lastBuildDate>Thu, 18 Sep 2025 20:08:46 +0000</lastBuildDate>
<language>en-US</language>
<sy:updatePeriod>
hourly </sy:updatePeriod>
<sy:updateFrequency>
1 </sy:updateFrequency>
<generator>https://wordpress.org/?v=6.9.4</generator>
<image>
<url>https://prerender.io/wp-content/uploads/favicon-150x150.png</url>
<title>Crawl Budget Optimization for JavaScript Websites – Prerender.io</title>
<link>https://prerender.io/crawl-budget/</link>
<width>32</width>
<height>32</height>
</image>
<item>
<title>Google JS Rendering: Noindex No Longer Means Not Rendered</title>
<link>https://prerender.io/blog/understanding-google-noindex-rendering/</link>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Thu, 18 Sep 2025 20:08:45 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawl budget]]></category>
<category><![CDATA[crawl budget optimization]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[javascript seo]]></category>
<category><![CDATA[prerendering]]></category>
<category><![CDATA[rendering]]></category>
<guid isPermaLink="false">https://prerender.io/?p=6251</guid>
<description><![CDATA[Discover how Google's new JavaScript rendering behavior affects noindex pages and what it means for your site. Understand the implications for SEO, learn to manage crawl budgets efficiently, and explore solutions like Prerender.io to optimize content processing.]]></description>
<content:encoded><![CDATA[<p data-block-id="2kpfm">For most site owners, Google’s search pipeline feels like a black box. We know the broad strokes: Google crawls, renders, indexes, and ranks content, but the finer details are often obscure.</p>
<p data-block-id="4lqvp">That is why technical SEOs pay close attention to every shift in behavior, as even subtle changes in how Google processes websites can have big implications. One of those shifts is happening right now in JavaScript rendering: pages with noindex directives are still being rendered.</p>
<p data-block-id="5qr3c">In this article, we’ll explore Google’s new rendering behavior, why pages marked noindex are now being rendered, what it means for your site’s health, and what SEOs should do differently.</p>
<h2 id="343qf" data-block-id="343qf">A Quick Refresher: How Google Processes Websites</h2>
<p data-block-id="37lk1">Before we dive into the main topic, let’s quickly revisit the three core stages of Google’s search pipeline:</p>
<ol type="1">
<li><strong>Crawling:</strong> this is the first stage where Googlebot discovers a page and fetches its HTML, CSS, and JS.</li>
<li><strong>Rendering:</strong> Google’s Web Rendering Service (based on Chromium) then executes the JavaScript and builds a DOM to see what the page actually displays.</li>
<li><strong>Indexing:</strong> finally, Google decides whether to store the page in its index, making it eligible to rank in search results.</li>
</ol>
<p data-block-id="gpbd">And where does <strong>noindex</strong> come in? A noindex directive (via meta tag or HTTP header) tells Google to exclude a webpage from its index. Basically, it’s like hanging a “<em>you can look, but don’t list me</em>” sign for Googlebot.</p>
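<p>For reference, the two equivalent forms of the directive look like this (generic examples):</p>

```
<!-- Meta tag form, placed in the page's <head> -->
<meta name="robots" content="noindex">

HTTP header form, sent with the response:
X-Robots-Tag: noindex
```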
<p data-block-id="245r5"><strong>Related: </strong>Learn More About <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">How Google Crawls and Indexes Websites</a></p>
<h2 id="8kc45" data-block-id="8kc45">The Old Understanding: Noindex Meant No JS Rendering (and No Indexing)</h2>
<p data-block-id="2qp1r">For years, the general understanding of the SEO community was that a noindex tag would stop Google from indexing a page and prevent Googlebot from rendering its JavaScript. This is still spelled out in the <a href="https://developers.google.com/search/docs/crawling-indexing/javascript/javascript-seo-basics#:~:text=When%20Google%20encounters%20,render%20or%20index%20the%20page">Google Search Documentation</a>:</p>
<p data-block-id="18q0f">“<em>When Google encounters noindex in the robots meta tag before running JavaScript, it doesn’t render or index the page.</em>”</p>
<p data-block-id="fc7oi">For this reason, noindex became a reliable safety net that SEOs built strategies around. You could keep certain pages out of search results while still allowing Google to crawl them for link discovery. And since rendering was skipped, the assumption was that fewer resources were being used, conserving your <a href="https://prerender.io/blog/impact-of-noindex-vs-nofollow-tags/">crawl budget</a>.</p>
<p data-block-id="c4a0c">This principle of handling and managing noindex pages also shaped how technical specialists approached site architecture, navigation, and audits, particularly for <a href="https://prerender.io/blog/how-to-optimize-seo-for-large-scale-websites/">large-scale websites</a> with thousands of pages.</p>
<p data-block-id="56irh">However, recent observations suggest this long-established understanding no longer holds true.</p>
<h2 id="7pave" data-block-id="7pave">The New Understanding: Noindex No Longer Stops JS Rendering</h2>
<p data-block-id="fhru1">It appears that Google is now rendering noindex pages, at least when it comes to executing JS and handling fetch requests. To better put things into context, <strong>Dave Smart</strong>, a Technical SEO Expert, ran a series of <a href="https://tamethebots.com/blog-n-bits/noindex-does-not-mean-not-rendered">controlled tests</a> to confirm and document this behavioral shift.</p>
<p data-block-id="dlr70">He set up pages that triggered JavaScript fetch() calls to a logging endpoint and monitored Googlebot’s behavior. If Googlebot rendered the page and executed the script, the requests would appear in the server logs.</p>
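<p>A minimal sketch of that kind of test script (our reconstruction, not Smart’s actual code; the logging endpoint is a placeholder):</p>

```javascript
// Sketch of a render-detection beacon (endpoint URL is hypothetical).
// Embedded in a page served with a noindex directive, the request only
// fires if the crawler actually executes the page's JavaScript.
const LOG_ENDPOINT = "https://example.com/render-log"; // placeholder

function buildRenderBeacon(pageUrl) {
  // POST rather than GET, so log entries are unambiguously render-time
  // requests and not ordinary link crawling.
  return {
    url: LOG_ENDPOINT,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ page: pageUrl, ts: Date.now() }),
    },
  };
}

// On the live test page:
//   const b = buildRenderBeacon(location.href);
//   fetch(b.url, b.options);
```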
<p data-block-id="an0gq">The results were conclusive across multiple test scenarios:</p>
<p data-block-id="3i4rv"><strong>Test 1: Page with <code>&lt;meta name="robots" content="noindex"&gt;</code> and JS fetch. </strong>Googlebot made the POST fetch request (indicating it executed the script). However, the page was still treated as noindexed and was not put in the index.</p>
<figure class="image strchf-type-image undefined strchf-size-undefined strchf-align-center" data-wp-editing="1"><picture><source srcset="https://images.storychief.com/account_57345/csnwhhwinw7-twehflxevnowodfmnnvjwecpkb5-sy67qyrpepj-hmqmlxtg-yqlulwsakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg 1x" media="(max-width: 768px)" /><source srcset="https://images.storychief.com/account_57345/csnwhhwinw7-twehflxevnowodfmnnvjwecpkb5-sy67qyrpepj-hmqmlxtg-yqlulwsakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg 1x" media="(min-width: 769px)" /><img decoding="async" src="https://images.storychief.com/account_57345/csnwhhwinw7-twehflxevnowodfmnnvjwecpkb5-sy67qyrpepj-hmqmlxtg-yqlulwsakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg" /></picture><figcaption></figcaption></figure>
<p data-block-id="81kiq"><strong>Test 2: Page with X-Robots-Tag: noindex (HTTP header) and JS fetch. </strong>Googlebot again performed the POST fetch. The page remained excluded from the index, consistent with Test 1.</p>
<figure class="image strchf-type-image undefined strchf-size-undefined strchf-align-center"><picture><source srcset="https://images.storychief.com/account_57345/vi0zxwy4be68dbgnqxmay0rrei-lwnwbjrfdbcuef-plhobw3vtmiisupcrnltmsmkudakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg 1x" media="(max-width: 768px)" /><source srcset="https://images.storychief.com/account_57345/vi0zxwy4be68dbgnqxmay0rrei-lwnwbjrfdbcuef-plhobw3vtmiisupcrnltmsmkudakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg 1x" media="(min-width: 769px)" /><img fetchpriority="high" decoding="async" class="alignnone" src="https://images.storychief.com/account_57345/vi0zxwy4be68dbgnqxmay0rrei-lwnwbjrfdbcuef-plhobw3vtmiisupcrnltmsmkudakeytrtfpnx6cs-zylsao4nazq_ddcf60baeba0a4a67302268175e3a92a_800.jpg" alt="Test 2: Page with X-Robots-Tag: noindex (HTTP header) and JS fetch" width="800" height="509" /></picture><figcaption></figcaption></figure>
<p data-block-id="7hl17"><strong>Test 3: 404 page with JS fetch. </strong>No rendering happened here. Googlebot didn’t execute the script at all, which shows that a hard 404 still stops the rendering pipeline.</p>
<figure class="image strchf-type-image undefined strchf-size-undefined strchf-align-center"><picture><source srcset="https://images.storychief.com/account_57345/ilrxxkwsvrtn7wdmaaeomtppedlncwr266b-h2cjsoorj0tyj7togaqyqkzuc-kf8umiikeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(max-width: 768px)" /><source srcset="https://images.storychief.com/account_57345/ilrxxkwsvrtn7wdmaaeomtppedlncwr266b-h2cjsoorj0tyj7togaqyqkzuc-kf8umiikeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(min-width: 769px)" /><img decoding="async" class="alignnone" src="https://images.storychief.com/account_57345/ilrxxkwsvrtn7wdmaaeomtppedlncwr266b-h2cjsoorj0tyj7togaqyqkzuc-kf8umiikeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg" alt="Test 3: 404 page with JS fetch" width="800" height="490" /></picture><figcaption></figcaption></figure>
<p data-block-id="fpnq4"><strong>Test 4: Noindex page with JS redirect. </strong>Googlebot executed the script and logged the POST request, but the page was still excluded and, interestingly, the redirect target wasn’t discovered or followed.</p>
<figure class="image strchf-type-image undefined strchf-size-undefined strchf-align-center"><picture><source srcset="https://images.storychief.com/account_57345/o3-mq2-ega2pvnzchcksoh6j9jx0fzounzochv2ubswwj0y7q7gv14kk8txubi7aqqingkeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(max-width: 768px)" /><source srcset="https://images.storychief.com/account_57345/o3-mq2-ega2pvnzchcksoh6j9jx0fzounzochv2ubswwj0y7q7gv14kk8txubi7aqqingkeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(min-width: 769px)" /><img decoding="async" class="alignnone" src="https://images.storychief.com/account_57345/o3-mq2-ega2pvnzchcksoh6j9jx0fzounzochv2ubswwj0y7q7gv14kk8txubi7aqqingkeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg" alt="Test 4: Noindex page with JS redirect" width="800" height="797" /></picture><figcaption></figcaption></figure>
<p data-block-id="5kf6t">In short,<strong> Google still respects noindex for indexing, but it no longer skips rendering. </strong>And in Smart’s words:</p>
<p data-block-id="4m7sg">“<em>The fact that the requests made to the test API endpoint were made with a POST method, and not a GET method, gives me more confidence that these requests are being made as part of the rendering process.</em>”</p>
<p data-block-id="f3152">These findings have also been corroborated by other technical SEO experts, showing a systematic change in Google rendering behavior.</p>
<h2 id="121gg" data-block-id="121gg">Why Would Google Render Noindex Pages?</h2>
<p data-block-id="4ra89">On the surface, this seems inefficient. Why spend resources rendering a page that will not be indexed? While Google hasn’t officially confirmed the reasoning, three likely drivers stand out:</p>
<ul>
<li><strong>Detecting manipulation: </strong>some websites try to game the system by dynamically removing noindex tags via JavaScript after initial crawling. Rendering allows Google to spot these tricks.</li>
<li><strong>Extracting signals: </strong>even if a page is “noindex,” it may contain useful internal links, <a href="https://prerender.io/blog/structured-data-for-seo/" target="_blank" rel="noopener noreferrer">structured data</a>, or other signals that help Google understand the rest of the site.</li>
<li><strong>Comprehensive site analysis: </strong>rendering provides Google with a complete picture of how a website functions, including user experience metrics and technical implementation quality.</li>
</ul>
<p data-block-id="dklt9">To be fair, from Google’s perspective, this makes sense for maintaining search quality and preventing abuse. For website owners, however, it introduces unexpected challenges.</p>
<h2 id="88adr" data-block-id="88adr">SEO Impacts of Rendering Noindex Pages</h2>
<p data-block-id="3601e">Google’s shift to rendering noindex pages affects how your site is crawled, analyzed, and reported. Here’s what you need to know:</p>
<h3 id="2os1h" data-block-id="2os1h">1. Technical Issue Visibility</h3>
<p data-block-id="apikl">Noindex tags no longer shield technical problems from Google’s analysis. Issues such as JS errors, slow load times, broken API calls, or accessibility problems previously hidden behind the noindex directive now become visible in Search Console or other <u><a href="https://prerender.io/blog/best-technical-seo-tools/">SEO testing tools</a></u>.</p>
<p data-block-id="1ka65">Now, while these pages won’t directly impact rankings, the technical issues they reveal may indicate broader <a href="https://prerender.io/resources/free-downloads/site-audit-tool/" target="_blank" rel="noopener noreferrer">site health problems</a>.</p>
<p data-block-id="u4vf"><strong>Resource:</strong> <a href="https://prerender.io/blog/technical-seo-issues/">See 19 Technical SEO Issues That Hurt Your Website Performance</a></p>
<h3 id="ei7ts" data-block-id="ei7ts">2. Signals Still Get Processed</h3>
<p data-block-id="t7hk">Noindexed pages don’t disappear from Google’s understanding of your site. Structured data, canonical tags, and internal links are still read, meaning misconfigurations can bleed into your broader site signals.</p>
<p data-block-id="4lhd5">For example, if a noindexed filter page points to the wrong canonical, it could confuse Google about which product page should be authoritative.</p>
<h3 id="4f8hl" data-block-id="4f8hl">3. Crawl Budget and Server Strain</h3>
<p data-block-id="3ogu3">It’s no secret that JavaScript rendering is resource-intensive. On small sites, this may be negligible. But on <a href="https://prerender.io/blog/ecommerce-requests-wasting-your-crawl-budget/">large ecommerce sites</a> with thousands of URLs, the extra load can waste both your server capacity and crawl budget.</p>
<p data-block-id="d140n">For example, imagine an ecommerce site with 50,000 noindex filter URLs. If each requires a few seconds of JS execution, that could mean dozens of wasted rendering hours every crawl cycle. Multiply that across thousands of fetches, and you’re looking at serious server strain and crawl budget drain that could have been allocated to indexable revenue-driving pages.</p>
<h3 id="5ichn" data-block-id="5ichn">4. Reporting Complexity</h3>
<p data-block-id="8q8vl">Because Google renders these pages, diagnostic tools may flag irrelevant issues, creating noise in your SEO reporting. As a result, you’ll need to filter out that noise to focus on the rendering issues that genuinely impact site health.</p>
<h2 id="ch7uk" data-block-id="ch7uk">How Can Marketers and SEOs Handle This New Behavior?</h2>
<p data-block-id="bjfgt">The impacts we’ve covered clearly point to this: marketers and SEO specialists can no longer treat noindex as a simple “end of the line.” Instead, they must be deliberate in optimizing what Google sees and processes, especially when resources and crawl budget management matter.</p>
<h3 id="fmd8f" data-block-id="fmd8f">Apply the Proper Directive Strategy</h3>
<p data-block-id="erk2n">Whether you reach for a noindex directive or a robots.txt rule, a clean directive strategy prevents wasted crawl budget and helps Google allocate resources more efficiently across your site. Too often, though, the two get lumped together as if they solve the same problem; they do not.</p>
<ul>
<li><strong>Noindex: </strong>best for pages that can be crawled and rendered, but should not appear in Google’s search results (e.g., thank-you pages or private members-only content).</li>
<li><strong>Disallow (robots.txt): </strong>best for pages that you want Google to completely ignore crawling (e.g., admin areas, private user data, or staging environments). Google will not fetch or render these pages.</li>
</ul>
<p data-block-id="bpdce"><strong>Note: </strong>never combine noindex with a robots.txt disallow on the same URL. The disallow stops Google from fetching the page at all, so the noindex instruction is never seen. Check out our guide for detailed instructions on <a href="https://prerender.io/blog/robots-txt-and-seo/">how to apply robots.txt directives to your website</a>.</p>
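<p>A minimal robots.txt sketch of the disallow side (the paths are placeholders):</p>

```
# robots.txt — keep search engines from fetching these areas at all
User-agent: *
Disallow: /admin/
Disallow: /staging/

# Don't also Disallow a URL that carries noindex: a blocked page is
# never fetched, so its noindex directive is never seen.
```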
<h3 id="b2okm" data-block-id="b2okm">Expand Your Technical Audit Requirements</h3>
<p data-block-id="av1gs"><a href="https://prerender.io/blog/how-to-conduct-a-technical-seo-audit/">SEO audits</a> that only check whether a page is “indexed or not” are no longer enough. Audits now need to:</p>
<ul>
<li><strong>Analyze server logs or GSC Crawl Stats </strong>to identify how much Googlebot activity is spent on noindex pages.</li>
<li><strong>Compare raw HTML vs. rendered HTML </strong>to identify signals or errors hiding behind JS execution. Tools like <a href="https://prerender.io/prerender-vs-screaming-frog/" target="_blank" rel="noopener noreferrer">Screaming Frog</a>, Sitebulb, or Google’s URL Inspection tool can help you detect mismatches.</li>
<li><strong>Check for conflicting directives</strong>, like pages marked noindex but also blocked in robots.txt, or canonical tags pointing to noindexed pages.</li>
<li><strong>Evaluate internal linking and sitemaps </strong>to ensure noindex pages are not heavily linked from your main navigation or included in XML sitemaps.</li>
</ul>
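<p>As an illustration of the raw-HTML side of that comparison, a small helper (hypothetical, not from any specific tool) can collect the robots directives visible before JavaScript runs, ready to diff against the rendered output:</p>

```javascript
// Hypothetical audit helper: extract robots directives from a page's raw
// HTML and its HTTP response headers, so raw vs. rendered output can be
// diffed for conflicting or JS-injected directives.
function robotsDirectives(html, headers = {}) {
  const directives = new Set();

  // <meta name="robots" content="noindex, nofollow"> in the raw HTML
  const metaRe = /<meta[^>]+name=["']robots["'][^>]*content=["']([^"']+)["']/gi;
  for (const m of html.matchAll(metaRe)) {
    m[1].split(",").forEach((d) => directives.add(d.trim().toLowerCase()));
  }

  // X-Robots-Tag: noindex (HTTP header variant)
  const headerVal = headers["x-robots-tag"];
  if (headerVal) {
    headerVal.split(",").forEach((d) => directives.add(d.trim().toLowerCase()));
  }

  return [...directives];
}
```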
<h3 id="eb28m" data-block-id="eb28m">Implement a JS Prerendering Solution</h3>
<p data-block-id="5p247">Several approaches, such as <a href="https://prerender.io/blog/what-is-srr-and-why-do-you-need-to-know/" target="_blank" rel="noopener noreferrer">self-built SSR</a>, static rendering, and hydration, can improve how search engines process JavaScript, but they’re costly to build and maintain. Instead, a prerendering solution like <strong>Prerender.io</strong> offers a more powerful and efficient alternative for saving crawl budget.</p>
<p data-block-id="4lqdo"><a href="http://prerender.io/">Prerender.io</a> is a dynamic JS rendering solution that serves fast, static HTML versions of your JS-heavy pages directly to search engine crawlers such as Googlebot and Bingbot. This improves your crawl efficiency, reduces server load, and avoids the high costs of maintaining an in-house rendering setup.</p>
<p data-block-id="aomcj">It also helps ensure consistent handling of noindex directives, so your pages are rendered and processed exactly as intended. In other words, as Google’s rendering behavior evolves, Prerender.io gives you stability and technical control.</p>
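<p>Conceptually, dynamic rendering boils down to user-agent routing. A generic sketch of the idea (not Prerender.io’s actual middleware; bot patterns are illustrative):</p>

```javascript
// Generic dynamic-rendering sketch: crawlers get a cached static
// snapshot, regular browsers get the normal client-side JS app.
const BOT_PATTERNS = [/googlebot/i, /bingbot/i, /yandex/i, /duckduckbot/i];

function isSearchBot(userAgent = "") {
  return BOT_PATTERNS.some((re) => re.test(userAgent));
}

function chooseResponse(userAgent) {
  // In a real setup the snapshot would come from a prerender cache keyed
  // by URL; here we only name which branch would be served.
  return isSearchBot(userAgent) ? "prerendered-html" : "client-side-app";
}
```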
<p data-block-id="62uc3"><a href="https://prerender.io/blog/a-guide-to-prerender-process-and-benefits/">Learn how Prerender.io works</a> in more detail and <a href="https://prerender.io/comparison/">how Prerender.io compares to other solutions</a>.</p>
<figure class="image strchf-type-image undefined strchf-size-undefined strchf-align-center"><picture><source srcset="https://images.storychief.com/account_57345/iqnqvyvkzmcxebjava9oesncese4ylxohhgfn2giuhz1dihgr3xatrxzrlagltskbvavokeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(max-width: 768px)" /><source srcset="https://images.storychief.com/account_57345/iqnqvyvkzmcxebjava9oesncese4ylxohhgfn2giuhz1dihgr3xatrxzrlagltskbvavokeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg 1x" media="(min-width: 769px)" /><img loading="lazy" decoding="async" class="alignnone" src="https://images.storychief.com/account_57345/iqnqvyvkzmcxebjava9oesncese4ylxohhgfn2giuhz1dihgr3xatrxzrlagltskbvavokeytrtfpnx6cs-zylsao4nazq_e942b39c8ad4703bc0d719f3458ad2e1_800.jpg" alt="Prerender vs. SSR vs. Static Rendering vs. Hydration" width="800" height="563" /></picture><figcaption></figcaption></figure>
<h2 id="476qt" data-block-id="476qt">Improve Website JavaScript Rendering With Prerender.io</h2>
<p data-block-id="blfnj">Again, while Google itself hasn’t officially confirmed this new rendering pattern (and it’s possible the behavior could evolve again), evidence from multiple SEO experts has pointed to it, and site owners can’t afford to ignore it.</p>
<p data-block-id="bvsfk">The takeaway is simple: <strong>refine your directive strategy, broaden your audit scope, and make sure Google’s resources are spent on the pages that drive real value through effective dynamic content indexing.</strong></p>
<p data-block-id="3pvlr">Also, by integrating <strong>Prerender.io</strong>, you can easily solve your site’s technical SEO issues, bid crawl budget troubles goodbye, and future-proof your SEO strategy. It is <a href="https://prerender.io/blog/how-to-install-prerender/">easy to install</a> and use, and fits right into your <a href="https://prerender.io/framework/">JS-based framework</a>, so there’s no need to change any of your tech stack.</p>
<p data-block-id="88g32">Enjoy the rendering benefits for your JavaScript website <a href="https://prerender.io/pricing/">by adopting Prerender.io for free today</a>!</p>
]]></content:encoded>
</item>
<item>
<title>6 Best Practices for Managing Crawl Budgets on Large Sites</title>
<link>https://prerender.io/blog/crawl-budget-management-for-large-websites/</link>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Wed, 26 Feb 2025 07:00:00 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[technical seo]]></category>
<guid isPermaLink="false">https://prerender.io/?p=5086</guid>
<description><![CDATA[Discover proven tactics to optimize crawl budgets on large sites.]]></description>
<content:encoded><![CDATA[ <div data-elementor-type="wp-post" data-elementor-id="5086" class="elementor elementor-5086" data-elementor-post-type="post">
<div class="elementor-element elementor-element-14abdd91 e-flex e-con-boxed e-con e-parent" data-id="14abdd91" data-element_type="container" data-e-type="container" data-settings="{"jet_parallax_layout_list":[]}">
<div class="e-con-inner">
<div class="elementor-element elementor-element-1a8e4526 elementor-widget elementor-widget-text-editor" data-id="1a8e4526" data-element_type="widget" data-e-type="widget" data-widget_type="text-editor.default">
<div class="elementor-widget-container">
<p></p>
<p>For large websites, getting discovered and indexed promptly by search engines can be challenging. No matter how great your content is, if search engine crawlers can’t access and process your pages because the crawl budget has been depleted, they won’t show up on the SERPs. This is why smart crawl budget management is a must for enterprise-size websites.</p>
<p></p>
<p>This guide shares crawl budget best practices, focusing on newly emphasized website indexing strategies and practical tips based on the latest additions to Google’s workflows and documentation. You’ll discover how to reduce crawl waste, refine URL parameters, leverage separate subdomains for resource-heavy content, and more.</p>
<p></p>
<h2 class="wp-block-heading">Why Are Large Sites More Vulnerable to Poor Crawl Budget Management?</h2>
<p></p>
<p>Managing crawl budgets effectively becomes even more pressing when your site is massive. This is because crawl budget doesn’t scale proportionally with website size; it is determined by your website’s <strong>crawl rate</strong> and <strong>crawl demand</strong>.</p>
<p></p>
<ul class="wp-block-list">
<li>Crawl rate refers to the number of connections Googlebot can use to crawl your website. It is highly dependent on the speed of your servers.</li>
<li>Crawl demand refers to how many pages Google wants and can crawl. It is highly dependent on your content quality, website popularity, and content update frequency.</li>
</ul>
<p></p>
<p>Now, if you have a small website with static content, you’re likely to have plenty of crawl budget available for Google to effectively crawl and index the pages. However, if you have an ecommerce website with thousands of product pages, a slow server, poor content quality, and frequent content updates, your crawl budget is probably low and likely to be emptied out fast.</p>
<p></p>
<p>Need a refresher on the crawl budget role in content crawling and indexing? Our <a href="https://prerender.io/resources/free-downloads/white-papers/crawl-budget-guide/">crawl budget optimization guide</a> can help.</p>
<p></p>
<h2 class="wp-block-heading">6 Best Practices to Optimize Crawl Budget Efficiency for Enterprise Websites</h2>
<p></p>
<figure><img decoding="async" src="https://prerender.io/wp-content/uploads/image-131-1024x576.png" alt="6 Best Practices to Optimize Crawl Budget Efficiency for Enterprise Websites" /></figure>
<p></p>
<h3 class="wp-block-heading">1. Minimize Crawl Budget Resource Usage for Page Rendering </h3>
<p></p>
<p>The first and most crucial step in managing crawl budgets for large websites is ensuring <strong>crawlers quickly load and understand each page</strong>. There are a few strategies you can use:</p>
<p></p>
<ul>
<li><strong>Prioritize critical rendering paths</strong>: delay non-essential scripts to load after the initial render to ensure above-the-fold content is displayed quickly.</li>
<li><strong>Optimize images</strong>: use next-generation formats like WebP and compress images to reduce file sizes without sacrificing quality. Head out to our <a href="https://prerender.io/blog/how-to-optimize-media-for-javascript-seo-and-spas/">media files optimization guide</a> to learn how.</li>
<li><strong>Serve minified assets</strong>: deliver minified CSS and JavaScript files through a reliable CDN to decrease page load times.</li>
<li><strong>Bundle and chunk scripts</strong>: organize scripts so that each page loads only the resources it needs, minimizing unnecessary overhead.</li>
</ul>
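<p>As a quick illustration of deferring non-essential scripts while prioritizing critical assets (file names are placeholders):</p>

```
<!-- Load critical CSS early so above-the-fold content renders fast -->
<link rel="preload" href="/assets/styles_v2.css" as="style">
<link rel="stylesheet" href="/assets/styles_v2.css">

<!-- defer: download in parallel, execute only after HTML parsing -->
<script src="/assets/app.min.js" defer></script>
```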
<p></p>
<h3 class="wp-block-heading">2. Avoid Cache-Busting Parameters and Duplicate URLs</h3>
<p></p>
<p>Nothing can inflate your indexable URL count and devour your crawl budget faster than random or unnecessary query strings. Cache-busting parameters (such as ‘<em>?v=1234</em>’ appended to CSS or JavaScript files) can create new URLs every time they change, potentially confusing search engines and prompting them to request the same resource repeatedly.</p>
<p></p>
<p>To avoid wasting your crawl budget on unnecessary URLs, <strong>keep your URL structure simple and consistent</strong>. Instead of using versioning in parameters like <em>styles.css?v=2</em>, include the version in the file name itself, like <em>styles_v2.css</em>.</p>
<p></p>
<p>Another smart practice is using canonical tags to tell search engines which version of a page or resource is the main one so they know exactly what to prioritize. </p>
<p></p>
<p>Note that Google retired the legacy “URL Parameters” tool from Search Console in 2022, so parameter handling now relies on canonical tags, robots.txt rules, and consistent internal linking. These small changes can make a big difference by cutting down duplicate URLs, reducing confusion for crawlers, and helping your crawl budget focus on what matters.</p>
<p></p>
<p><strong><em>Pro tip</em></strong><em>: while simplifying your site’s URL structure by removing cache-busting query strings is a good practice, thorough testing is crucial. Be aware of edge cases: some parameters do affect page content and should either be fully crawlable or consolidated with canonical tags.</em></p>
<p></p>
<h3 class="wp-block-heading">3. Streamline Site Architecture and Navigation</h3>
<p></p>
<figure><img decoding="async" src="https://prerender.io/wp-content/uploads/image-132-1024x576.png" alt="Hierarchial Sitemap Example" /></figure>
<p></p>
<p>Optimizing crawl efficiency means keeping the structure of your site architecture as flat as possible so that Googlebot won’t spend a lot of crawl budget navigating through your pages.</p>
<p></p>
<p>When pages are logically organized and interlinked, search engine crawlers can quickly discover and revisit high-value content without wasting resources on complex crawling paths. This ensures that <strong>your crawl budget focuses on fresh or critical URLs rather than being wasted on low-value or duplicate pages</strong>.</p>
<p></p>
<p>Here’s a few pointers to keep in mind:</p>
<p></p>
<ul class="wp-block-list">
<li><strong>Lean site structure</strong>: limit the number of clicks required to reach any critical page, ideally no more than two clicks from the homepage.</li>
<li><strong>Implement site breadcrumbs</strong>: use breadcrumbs to clearly show how pages fit into the site hierarchy, helping both users and crawlers navigate effectively without unnecessary detours. </li>
<li><strong>Set up hub pages</strong>: create hub pages for broad categories instead of cluttering the homepage with countless product or subcategory links. By consolidating related pages under these hubs, you guide crawlers toward the most important parts of your site and ensure they prioritize high-value content, maximizing your crawl budget’s impact.</li>
</ul>
<p></p>
<h3 class="wp-block-heading">4. Leverage HTTP/2 and Server Push</h3>
<p></p>
<p>Adopting <a href="https://www.wallarm.com/what/what-is-http-2-and-how-is-it-different-from-http-1" target="_blank" rel="noreferrer noopener">HTTP/2</a> is a vital technical step for accelerating page load times for both end users and search engine bots. By enabling multiplexing, HTTP/2 consolidates multiple file requests—such as CSS, JavaScript, or images—into a single connection. This approach <strong>reduces latency and allows crawlers to process your site more efficiently</strong>, which is particularly beneficial if your pages rely on a large number of assets.</p>
<p></p>
<p>Complementing HTTP/2 multiplexing is <a href="https://www.ioriver.io/terms/http-2-server-push" target="_blank" rel="noreferrer noopener">Server Push</a>, which proactively delivers critical resources to the client before they are requested. Be aware, however, that major browsers have since deprecated Server Push (Chrome removed support in 2022), so treat it as a legacy optimization; preload hints and 103 Early Hints are the modern alternatives.</p>
<p></p>
<p>Together, these protocol-level optimizations let crawlers <strong>quickly access and index your content</strong>, leading to a more efficient use of your crawl budget.</p>
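<p>Enabling HTTP/2 is usually a one-line server change. For example, on nginx (certificate paths are placeholders):</p>

```
# nginx: enable HTTP/2 on the TLS listener
server {
    listen 443 ssl http2;
    server_name example.com;
    ssl_certificate     /etc/ssl/example.com.pem;
    ssl_certificate_key /etc/ssl/example.com.key;
}
```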
<p></p>
<h3 class="wp-block-heading">5. Keep Sitemaps Accurate and Up-to-Date</h3>
<p></p>
<p>Sitemaps are essential for search engine optimization. They provide a clear structure, making it easier for search engines to find and index your content. This is particularly beneficial for large websites with numerous products or a large blog.</p>
<p></p>
<p>Google limits each sitemap file to 50 MB (uncompressed) and 50,000 URLs. This means that for enterprise-size sites, you’ll likely need multiple sitemaps, each dedicated to a specific section of the site, such as products, blog posts, or category pages. This segmentation helps search engines parse your sitemaps more efficiently and zero in on the most relevant areas of your site.</p>
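<p>In practice, this means submitting a sitemap index that references the per-section files, along these lines (URLs are placeholders):</p>

```
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-categories.xml</loc></sitemap>
</sitemapindex>
```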
<p></p>
<p>From a crawl budget management perspective, well-maintained sitemaps directly influence how efficiently search engines allocate their crawling resources. By keeping your sitemaps updated—ideally through an automated generation process that reflects any new pages or recent updates—<strong>you signal to search engines which URLs merit the most attention</strong>. </p>
<p></p>
<p>Submitting these up-to-date sitemaps to Google Search Console or other search engines helps prevent them from crawling outdated or irrelevant URLs, freeing more of your available crawl budget for fresh, high-priority pages. This targeted approach ensures that <strong>crucial sections of your site get indexed promptly while low-value URLs are not repeatedly re-crawled.</strong></p>
<p></p>
<p><strong>Related: </strong>it’s recommended that you audit your content and clean up low-value pages. You don’t want your site to experience ‘index bloat.’ Learn more about what it is and how to mitigate this indexing issue <a href="https://prerender.io/blog/how-to-fix-index-bloating-seo/">here</a>.</p>
<p></p>
<h3 class="wp-block-heading">6. Adopt Prerender.io as a Sustainable Crawl Budget Management Tool</h3>
<p></p>
<figure><img decoding="async" src="https://prerender.io/wp-content/uploads/image-133-1024x457.png" alt="Prerender.io Crawl Budget Management" /></figure>
<p></p>
<p>We’ve discussed several proven methods to optimize crawl efficiency for large websites. They work, but they require ongoing manual effort to implement and monitor. If you’re looking for a crawl budget solution that works more sustainably, Prerender.io is an excellent option.</p>
<p></p>
<p>Prerender.io is built to improve your JavaScript SEO performance by pre-rendering your JavaScript pages into static HTML. That frees Google from rendering and processing them, minimizing the crawl budget spent crawling and indexing your site for the SERPs.</p>
<p></p>
<p>Does this mean Prerender.io is only suitable for static content? No. You can set up the <a href="https://docs.prerender.io/docs/modify-the-cache-expiration" target="_blank" rel="noreferrer noopener">cache intervals</a> to match your content update needs. For instance, if you own an online shop in which products’ stock and prices change frequently, you can set up Prerender.io to re-render/re-cache these specific product pages every 24 hours. This allows Google to always pick up the latest update on your content and present it in the search results.</p>
<p></p>
<p>Other benefits of pre-rendering your JavaScript content include faster indexing, better link previews, more visibility in AI search engine results, and the potential to mitigate sudden drops in traffic.</p>
<p><iframe title="Seeing A Sudden Drop In Traffic? This Technical SEO Issue Could Be Why" width="640" height="360" src="https://www.youtube.com/embed/ToGusQ8qqcE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></p>
<p></p>
<p>Want to see how else Prerender can help you improve your online visibility? Explore <a href="https://prerender.io/resources/case-studies/">our case studies</a> for real-world examples.</p>
<h2>How to Check Your Site’s Crawl Budget Performance</h2>
<h3>Option 1: Manual Crawl Budget Site Audit with Google Search Console</h3>
<p>Google and other search engines don’t disclose how much crawl budget a website gets, but you can see how many pages Google has crawled on your website, which gives you a ballpark figure.</p>
<p>To do this, log in to your Google Search Console account and go to Settings > Crawling > Crawl Stats. You can see the number of crawl requests Google made in the last 90 days. In the example below, Google made 17,900 requests in 90 days, or almost 199 pages per day.</p>
<p><img loading="lazy" decoding="async" class="aligncenter wp-image-5712 size-large" src="https://prerender.io/wp-content/uploads/How-to-check-crawl-budget-1024x368.png" alt="How to check crawl budget - Prerender.io" width="640" height="230" srcset="https://prerender.io/wp-content/uploads/How-to-check-crawl-budget-1024x368.png 1024w, https://prerender.io/wp-content/uploads/How-to-check-crawl-budget-300x108.png 300w, https://prerender.io/wp-content/uploads/How-to-check-crawl-budget-768x276.png 768w, https://prerender.io/wp-content/uploads/How-to-check-crawl-budget-1536x552.png 1536w, https://prerender.io/wp-content/uploads/How-to-check-crawl-budget.png 1600w" sizes="(max-width: 640px) 100vw, 640px" /></p>
<p>Knowing your crawl budget number is gold for SEO success. It will tell you whether your website suffers from a crawl budget deficiency (which may explain why it hasn’t reached its full potential). To check this, you also need to know how many pages your website has.</p>
<p>You can get this data from your XML sitemap. Run a site query with Google using ‘site:yourdomain.com,’ or use an SEO tool like Screaming Frog. Then, calculate your crawl budget efficiency using the following formula:</p>
<blockquote>
<p><strong>Total Number of Pages / Average Pages Crawled Per Day = Crawl Budget Efficiency</strong></p>
</blockquote>
<p>If the result is less than 3, your crawl budget is already optimal. If the result is higher than 10, you have 10x more pages on your site than what Google crawls per day. This means that crawl budget optimization is needed.</p>
<p><em>Disclaimer: The calculation provides a rough illustration of your crawl budget status, not an exact or official conclusion.</em></p>
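<p>The formula above can be expressed as a short script. The request count comes from the Crawl Stats report and the page total from your sitemap or a crawler; the figures below are illustrative only:</p>

```python
def crawl_budget_efficiency(total_pages, crawl_requests, period_days=90):
    """Rough crawl budget efficiency: total pages on the site divided by
    the average number of pages crawled per day (GSC reports 90 days)."""
    avg_pages_per_day = crawl_requests / period_days
    return total_pages / avg_pages_per_day

# Using the 17,900 requests / 90 days from the screenshot above for a
# hypothetical site with 1,200 pages:
ratio = crawl_budget_efficiency(total_pages=1200, crawl_requests=17900)
print(round(ratio, 1))  # 6.0 -> between 3 and 10, worth monitoring
```

A result under 3 suggests the budget is comfortable; over 10 suggests optimization is needed, per the rule of thumb above.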
<h3>Option 2: Use Prerender.io Free Site Audit Tool</h3>
<p>By analyzing the crawl data from Google Search Console (GSC) and comparing it with the pages listed in your XML sitemap, you can obtain a general estimate of your crawl budget’s health. This method, however, primarily monitors how your crawl budget is being spent.</p>
<p>For a more in-depth technical SEO audit that covers page delivery speed, indexing status, and Lighthouse performance metrics, use <a href="https://prerender.io/resources/free-downloads/site-audit-tool/">Prerender.io’s free site audit tool</a>. Simply submit your website’s URL to Prerender.io, and you will receive a comprehensive technical SEO audit along with optimization recommendations via email. Take the first step to assess your site’s crawl budget health today!</p>
<div class="hs-cta-embed hs-cta-simple-placeholder hs-cta-embed-230694382796" style="max-width: 100%; max-height: 100%; width: 600px; height: 200px;" data-hubspot-wrapper-cta-id="230694382796"><a href="https://cta-eu1.hubspot.com/web-interactives/public/v1/track/redirect?encryptedPayload=AVxigLJl4H74VYUV2nTr1nwqGhQ7sFnajeA42EvtZE7BtXb6sBMm4q4BPs19sKq3Jwf3ehe%2Fv3jC3LcbgICf%2Fjb3sSzDKqYQa2zchMbgMJS7ZmTRkO%2BXk8HQ7otuGnm%2BqfeSvuWtGHonMAIXKyNad35que9maRNbtrRX5e19h8y7Pr9dXqg0dh9k5wG6oManWX8Btbj4H49pdNmRohUkLLkJZEv0&webInteractiveContentId=230694382796&portalId=25756777" target="_blank" rel="noopener"> <img decoding="async" style="height: 100%; width: 100%; object-fit: fill;" src="https://hubspot-no-cache-eu1-prod.s3.amazonaws.com/cta/default/25756777/interactive-230694382796.png" alt="Site Audit HS banner for blogs" /> </a></div>
<p></p>
<h2 class="wp-block-heading">Common Pitfalls in Crawl Budget Management</h2>
<p></p>
<p>Even with the most carefully designed website indexing strategies, large websites can run into specific roadblocks that drain or misallocate their crawl budget. Here are some common examples.</p>
<p></p>
<h3 class="wp-block-heading">A. Over-Reliance on Robots.txt</h3>
<p></p>
<p>Relying solely on <a href="https://prerender.io/blog/robots-txt-and-seo/">robots.txt to manage crawling</a> can lead to unintended consequences. When a URL has already been crawled or indexed, simply blocking it in robots.txt will not remove it from the index. Moreover, blocked pages may still appear in search results if there are external links pointing to them. </p>
<p></p>
<figure><img decoding="async" src="https://prerender.io/wp-content/uploads/image-134.png" alt="7 robots.txt files best practices" /></figure>
<p></p>
<p>If you want to remove sensitive or obsolete pages from search results, it’s more effective to use the <em>noindex</em> directive or remove those pages entirely from your server. In other words, robots.txt is best used to block access to certain sections, such as admin folders or duplicate staging environments, but <strong>should not be the only line of defense for controlling which URLs appear in the index</strong>.</p>
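<p>A detail worth noting: for a noindex directive to work, the page must remain crawlable, since a crawler can only obey a directive it is allowed to see. The standard markup looks like this:</p>

```html
<!-- In the <head> of the page to be dropped from the index -->
<meta name="robots" content="noindex">

<!-- Or, for non-HTML resources such as PDFs, the equivalent HTTP
     response header:  X-Robots-Tag: noindex -->
```

If the same URL is also disallowed in robots.txt, the noindex tag will never be fetched and the URL can linger in the index.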
<p></p>
<p><strong>Related</strong>: How do noindex and nofollow affect crawling efficiency and SEO? Let’s examine <a href="https://prerender.io/blog/impact-of-noindex-vs-nofollow-tags/">the differences between noindex and nofollow</a>.</p>
<p></p>
<h3 class="wp-block-heading">B. Ignoring Mobile-First Indexing</h3>
<p></p>
<p>With Google’s shift to <a href="https://prerender.io/blog/mobile-first-indexing-for-javascript/">mobile-first indexing</a>, the mobile version of your site often determines how your content is crawled and ranked. Large sites that neglect mobile optimization may experience slow page loads or incomplete rendering on smartphones, limiting how efficiently crawlers can traverse the site. </p>
<p></p>
<p>If you want your site to perform better, ensure that your mobile pages mirror the content and functionality of your desktop site.<strong> Implement responsive designs, compress media for faster loading, and test mobile usability frequently</strong>. By providing a robust mobile experience, you help search engines crawl your site more effectively, ultimately preserving your crawl budget for pages with true ranking potential.</p>
<p></p>
<h3 class="wp-block-heading">C. Mishandling Temporarily Unavailable Content</h3>
<p></p>
<p>Large websites, especially ecommerce platforms, frequently deal with items going in and out of stock or seasonal pages appearing and disappearing. When these pages remain live but are effectively “dead ends,” crawlers may re-visit them unnecessarily. </p>
<p></p>
<p>One solution is to use appropriate HTTP status codes or structured data to indicate when content is no longer available. If the page is permanently gone, a 404 or 410 status code can let search engines know to remove it from active indexing. </p>
<p></p>
<p>However, if you plan to restock or reuse the page, consider adding clear messages for users while preserving essential metadata. Properly handling these temporary or seasonal pages prevents search engines from repeatedly crawling low-value or empty content.</p>
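<p>Returning the right status code is typically a server-level rule. A sketch in nginx, with a hypothetical path for permanently retired listings:</p>

```nginx
# 410 tells crawlers the page is gone for good and will not return,
# which generally prompts faster de-indexing than a generic 404.
location ^~ /listings/expired/ {
    return 410;
}
```

For pages that will come back (restocks, seasonal campaigns), leave them live with a 200 status instead, so they keep their indexing history.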
<p></p>
<p>Learn more about <a href="https://prerender.io/blog/how-to-manage-expired-listings-and-old-content/">how to manage old content and expired listings for ecommerce SEO here</a>.</p>
<p></p>
<h2 class="wp-block-heading">Improve Your Crawl Budget Efficiency with Prerender.io</h2>
<p></p>
<p>When dealing with SEO for large sites, crawl efficiency optimization is no longer optional—it can make the difference between thriving and failing in search rankings. Every bit of wasted crawl capacity is a missed opportunity for your key content to be discovered.</p>
<p></p>
<p>Focus on these strategies to maximize crawl efficiency and guide search engines to your key content: reducing resource consumption, simplifying site architecture, effectively managing URL parameters, and consistently updating sitemaps.</p>
<p></p>
<p>Another option is to use <a href="https://prerender.io/">Prerender.io</a>. This dynamic rendering tool keeps search engine crawlers from spending your valuable crawl budget on JavaScript execution. Serving index-ready content also improves your server’s response time for bots and crawlers and ensures Google picks up all your SEO and web page elements, boosting your page ranking.</p>
<p></p>
<p><strong><a href="https://prerender.io/pricing/">Sign up for Prerender.io today</a> to get started with 1,000 free renders!</strong></p>
<p></p>
<h2 class="wp-block-heading">FAQs – Crawl Budget Management Best Practices</h2>
<p></p>
<h3 class="wp-block-heading">How Do I Know If I Have Crawl Budget Issues?</h3>
<p></p>
<p>Check Google Search Console’s Crawl Stats report. Signs of crawl budget issues include:</p>
<p></p>
<ul class="wp-block-list">
<li>High number of unindexed pages</li>
<li>Slow indexing of new content</li>
<li>Excessive crawling of non-important pages</li>
<li>Server errors during crawling</li>
</ul>
<p></p>
<p></p>
<h3 class="wp-block-heading">How Does JavaScript Affect Crawl Budget Consumption?</h3>
<p></p>
<p>JavaScript execution consumes more crawl budget because additional resources need loading, API calls require extra processing, and DOM updates need more rendering time.</p>
<p></p>
<h3 class="wp-block-heading">How Can Prerender.io Help With Crawl Budget Optimization?</h3>
<p></p>
<p>Prerender serves pre-rendered HTML versions of your JavaScript pages to search engines, making crawling more efficient. This helps reduce server load, speed up crawling, improve indexing efficiency, and maximize crawl budget usage. This also helps you save on costs compared to in-house server-side rendering.</p>
<p></p>
<p><strong>Interested in further reading? You might also enjoy:</strong></p>
<p></p>
<ul class="wp-block-list">
<li><a href="https://prerender.io/blog/sudden-drop-in-traffic/">Experienced a Sudden Drop in Traffic? Here’s How to Fix It</a></li>
<li><a href="https://prerender.io/blog/how-to-get-indexed-on-ai-platforms/">How to Get Your Website ‘Indexed’ By ChatGPT and Other AI Search Engines</a></li>
<li><span style="color: #800080;"><a style="color: #800080;" href="https://prerender.io/blog/types-of-crawl-errors/">5 Types of Crawl Errors and How to Fix Them</a></span></li>
</ul>
<p></p>
<p></p> </div>
</div>
</div>
</div>
</div>
]]></content:encoded>
</item>
<item>
<title>Crawl Budget Optimization Tips for Career Sites or Job Boards</title>
<link>https://prerender.io/blog/crawl-budget-optimization-tips-for-career-sites-or-job-boards/</link>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Thu, 31 Oct 2024 18:52:51 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[Technical SEO]]></category>
<category><![CDATA[crawl budget optimization]]></category>
<category><![CDATA[indexing]]></category>
<category><![CDATA[javascript]]></category>
<category><![CDATA[javascript seo]]></category>
<guid isPermaLink="false">https://prerender.io/?p=4403</guid>
<description><![CDATA[Optimize your job board's crawl budget to ensure fast indexing of job listings.]]></description>
<content:encoded><![CDATA[ <div data-elementor-type="wp-post" data-elementor-id="4403" class="elementor elementor-4403" data-elementor-post-type="post">
<div class="elementor-element elementor-element-4bb80c17 e-flex e-con-boxed e-con e-parent" data-id="4bb80c17" data-element_type="container" data-e-type="container" data-settings="{"jet_parallax_layout_list":[]}">
<div class="e-con-inner">
<div class="elementor-element elementor-element-15c79cf9 elementor-widget elementor-widget-text-editor" data-id="15c79cf9" data-element_type="widget" data-e-type="widget" data-widget_type="text-editor.default">
<div class="elementor-widget-container">
<p>Career sites and job boards, such as Indeed and Glassdoor, often face unique SEO challenges due to their dynamic content and the large number of hosted pages. Limited crawl budgets further complicate matters, as it becomes more difficult to ensure new or updated job listings are crawled and indexed promptly before they become outdated.</p>
<p>In this crawl budget guide, we’ll explore effective crawl budget optimization strategies specifically designed to enhance SEO for career websites and job boards. These will help you solve JS indexing issues, maximize your site’s visibility, and ensure that your job listings are always providing the most updated job vacancies in SERPs.</p>
<h2 class="wp-block-heading">Why is Managing Crawl Budget a Challenge for Job Board Websites?</h2>
<p>Crawl budget refers to the number of pages Googlebot crawls and indexes on your site within a specific timeframe. It is heavily influenced by factors like the size of your site, the number of requests Googlebot receives, your server’s capacity, and the frequency of content updates.</p>
<p><strong>Related</strong>: Need a refresher about the role of crawl budget in SEO? Download our <a href="https://prerender.io/resources/free-downloads/white-papers/crawl-budget-guide/" target="_blank" rel="noreferrer noopener">FREE crawl budget whitepaper</a> now.</p>
<p>As mentioned, the challenge of managing your crawl budget for job marketplaces or career websites is amplified by two main factors:</p>
<ul class="wp-block-list">
<li><strong>The sheer volume of pages</strong>: including job listings, employer profiles, categories, etc.</li>
<li><strong>Quickly expiring content</strong>: job postings are added, changed, or removed within a short period of time.</li>
</ul>
<p>This urgency demands efficient crawling and indexing to ensure that up-to-date job postings are always presented to users. Outdated or expired job listings can negatively impact user experience and hurt the job board site’s relevance in search results. There’s also the problem of <a href="https://jobs.google.com/about/" target="_blank" rel="noreferrer noopener">Google for Jobs (GFJ)</a>. GFJ aggregates job postings and displays them in a prominent section within the SERPs, meaning they often take precedence and appear before traditional organic results.</p>
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="658" height="532" class="wp-image-4405" src="https://prerender.io/wp-content/uploads/SERPs-screenshot-of-the-Google-for-Jobs-GFJ-listings.png" alt="Screenshot of Google for Jobs (GFJ) search results, displaying job listings with details like job titles, companies, and locations." srcset="https://prerender.io/wp-content/uploads/SERPs-screenshot-of-the-Google-for-Jobs-GFJ-listings.png 658w, https://prerender.io/wp-content/uploads/SERPs-screenshot-of-the-Google-for-Jobs-GFJ-listings-300x243.png 300w" sizes="(max-width: 658px) 100vw, 658px" /></figure>
<p>This means that to effectively reach job seekers, job and career sites must optimize not only for traditional SERPs but also for Google for Jobs. This requires strategic management of crawl budget resources. If search engine bots exceed their crawl limit without indexing your most important content, it can hinder your job SEO efforts and reduce your visibility to potential job seekers.</p>
<h2 class="wp-block-heading">6 Strategies on How to Optimize Crawl Budget for Career Sites</h2>
<p>Optimizing your crawl budget here means ensuring that Google prioritizes crawling high-value pages (job listings) over less important content. Here are a few tips on how to make the most out of your limited crawl budget.</p>
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="936" height="526" class="wp-image-4406" src="https://prerender.io/wp-content/uploads/6-Strategies-on-How-to-Optimize-Crawl-Budget-for-Career-Sites.png" alt="Infographic titled '6 Strategies on How to Optimize Crawl Budget for Career Sites,' featuring tips on prioritizing high-value pages like job listings for improved search engine crawling." srcset="https://prerender.io/wp-content/uploads/6-Strategies-on-How-to-Optimize-Crawl-Budget-for-Career-Sites.png 936w, https://prerender.io/wp-content/uploads/6-Strategies-on-How-to-Optimize-Crawl-Budget-for-Career-Sites-300x169.png 300w, https://prerender.io/wp-content/uploads/6-Strategies-on-How-to-Optimize-Crawl-Budget-for-Career-Sites-768x432.png 768w" sizes="(max-width: 936px) 100vw, 936px" /></figure>
<h3 class="wp-block-heading">1. Reduce JavaScript Rendering Delays</h3>
<p>Most job boards and career sites use JavaScript to build websites and add dynamic interactions. While this is great for UX, it introduces significant delays in crawling and indexing, primarily because <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/" target="_blank" rel="noreferrer noopener">Googlebot needs to execute the JavaScript to render the content</a> before it can evaluate what to index.</p>
<p>Some websites use SSR or client-side techniques like lazy loading to overcome this challenge, but these have significant drawbacks in terms of cost and efficiency. For instance, <a href="https://prerender.io/comparison/" target="_blank" rel="noreferrer noopener">SSR can cost you $120,000</a> for building the framework alone, not to mention the maintenance and scaling costs. The best way to fix this issue is to use a prerendering tool like Prerender.io.</p>
<p><strong>Top tip</strong>: Learn the financial and engineering benefits you’ll get when you <a href="https://prerender.io/blog/in-house-ssr-vs-prerender-io-a-benefit-and-cost-comparison/" target="_blank" rel="noreferrer noopener">adopt Prerender.io compared to building an SSR</a>.</p>
<p>Prerender.io is a crawl budget optimization cheat code for anyone who wants to hack SEO for career websites. It acts as a bridge between your website and search engines: when a search engine bot, like Googlebot, requests a page, Prerender.io intercepts the request and delivers a pre-rendered, static HTML version.</p>
<p>This pre-rendered HTML eliminates the need for Googlebot to spend time processing your career website’s JavaScript. Because of this, your job postings are crawled and indexed quicker, increasing their visibility before the listings expire. Importantly, human users continue to experience the full functionality of your dynamic JavaScript-powered website. This means you achieve improved SEO without sacrificing user experience.</p>
<p>Learn more about <a href="https://prerender.io/blog/a-guide-to-prerender-process-and-benefits/" target="_blank" rel="noreferrer noopener">how Prerender works and the benefits</a>.</p>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">https://www.youtube.com/watch?v=Kc20MvUvMM4</div>
</figure>
<h3 class="wp-block-heading">2. Implement Structured Data Correctly</h3>
<p>Google for Jobs uses <a href="https://schema.org/JobPosting" target="_blank" rel="noreferrer noopener">JobPosting schema markup</a> to pull job listings into its interface. Proper implementation of structured data is crucial, as it ensures that Google correctly interprets your job postings and includes them in GFJ results.</p>
<p>Ensure your JobPosting schema includes essential fields such as job title, job location, job description, date posted, validThrough (expiration date), and application process. This reduces confusion for Googlebot and helps with more efficient crawling and indexing. Follow our <a href="https://prerender.io/blog/5-types-of-schema-markup-dynamic-websites-should-implement-including-a-tutorial/" target="_blank" rel="noreferrer noopener">JobPosting schema markup tutorial</a> to implement it correctly.</p>
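<p>A minimal JobPosting block covering those fields might look like the following (all values are placeholders; see schema.org/JobPosting for the full vocabulary):</p>

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "JobPosting",
  "title": "Senior Frontend Engineer",
  "description": "<p>Build and maintain our careers site frontend.</p>",
  "datePosted": "2024-10-01",
  "validThrough": "2024-11-30T00:00:00+00:00",
  "employmentType": "FULL_TIME",
  "hiringOrganization": {
    "@type": "Organization",
    "name": "Example Corp"
  },
  "jobLocation": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "Austin",
      "addressRegion": "TX",
      "addressCountry": "US"
    }
  }
}
</script>
```

The validThrough date is especially important for job boards: it tells Google when the listing expires, so stale postings can be dropped.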
<p>Errors in your structured data can also lead to wasted crawl budget since Googlebot may spend time trying to process pages with faulty or incomplete markup. You have to regularly validate your schema using tools like <a href="https://search.google.com/test/rich-results" target="_blank" rel="noreferrer noopener">Google’s Rich Results Test</a> and monitor your Google Search Console for any issues.</p>
<h3 class="wp-block-heading">3. Optimize Pagination with <em>rel=”next”</em> and <em>rel=”prev”</em></h3>
<p>To ensure that paginated job listings are crawled efficiently, implement <em>rel=”next”</em> and <em>rel=”prev”</em> tags on your paginated series. These tags help Google understand the relationship between paginated pages, improving the chances of discovering and indexing all relevant listings without wasting resources on every single page.</p>
<p>Also, consider grouping listings by date or category to reduce the number of paginated pages Google needs to crawl. For example, you can show all job listings for a specific week on one page instead of spreading them across several pages.</p>
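<p>The markup itself is lightweight. On page 2 of a hypothetical /jobs series, the tags sit in the page’s head:</p>

```html
<!-- On https://example.com/jobs?page=2 -->
<link rel="prev" href="https://example.com/jobs?page=1">
<link rel="next" href="https://example.com/jobs?page=3">
```

The first page carries only rel="next", and the last only rel="prev".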
<h3 class="wp-block-heading">4. Monitor Crawl Errors and Fix Them Promptly</h3>
<p>Google Search Console provides insights into crawl errors that may be affecting your site’s performance, so regularly monitor your crawl stats to identify issues. If expired jobs are returning 404 errors, make sure to redirect them to relevant pages, such as a similar listing or category page. This preserves link equity and helps Googlebot avoid wasting crawl budget on dead-end pages. Learn <a href="https://prerender.io/blog/fix-404-errors-on-spas/" target="_blank" rel="noreferrer noopener">how to fix 404 errors</a> here.</p>
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="936" height="538" class="wp-image-4407" src="https://prerender.io/wp-content/uploads/Google-Search-Console-Crawl-Errors-Dashboard.png" alt="Google Search Console Crawl Errors Dashboard displaying insights on website crawl issues, including error types and affected URLs." srcset="https://prerender.io/wp-content/uploads/Google-Search-Console-Crawl-Errors-Dashboard.png 936w, https://prerender.io/wp-content/uploads/Google-Search-Console-Crawl-Errors-Dashboard-300x172.png 300w, https://prerender.io/wp-content/uploads/Google-Search-Console-Crawl-Errors-Dashboard-768x441.png 768w" sizes="(max-width: 936px) 100vw, 936px" /></figure>
<p>You can also use GSC to monitor <a href="https://prerender.io/blog/soft-404/" target="_blank" rel="noreferrer noopener">Soft 404 errors</a> (pages that return a 200 status but contain no content) and infinite redirect loops. These issues can cause Google to get stuck crawling unnecessary pages.</p>
<h3 class="wp-block-heading">5. Adjust Crawl Rate Settings</h3>
<p>For large career sites and job boards, managing crawl rate settings is critical to balancing server performance with efficient crawling by Googlebot. Since job boards often feature thousands or millions of job listings, search engines must crawl pages quickly and frequently to capture the most up-to-date listings before expiration.</p>
<p>However, allowing Google to crawl too much at once can overload your server, impacting both your site’s performance and user experience.</p>
<p>One effective strategy for optimizing the crawl rate on large career sites is to adjust Googlebot’s crawl activity based on the time of day. During peak hours (when user traffic is high), an increased crawl rate could slow down the site. However, during non-peak hours, such as overnight or early in the morning, your server may have enough capacity to handle an increased crawl rate without affecting site performance.</p>
<h3 class="wp-block-heading">6. Optimize for Mobile User Job Seekers</h3>
<p>Optimizing job board sites for mobile devices is crucial for improving crawl budget efficiency, as mobile-optimized pages tend to have faster load times and fewer elements for Google to process. A <a href="https://prerender.io/blog/mobile-first-indexing-for-javascript/" target="_blank" rel="noreferrer noopener">mobile-first design</a> typically leads to a shallow site structure, reducing the number of clicks Googlebot needs to access critical pages like job listings. This minimizes crawl depth, allowing search engines to focus on indexing high-priority pages such as new or soon-to-expire job postings.</p>
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="300" height="668" class="wp-image-4409" src="https://prerender.io/wp-content/uploads/Indeed-job-search-on-mobile-device.jpg" alt="Screenshot of the Indeed jobs search interface on a mobile device, showcasing job listings, filters, and search options." srcset="https://prerender.io/wp-content/uploads/Indeed-job-search-on-mobile-device.jpg 300w, https://prerender.io/wp-content/uploads/Indeed-job-search-on-mobile-device-135x300.jpg 135w" sizes="(max-width: 300px) 100vw, 300px" /></figure>
<p><a href="https://www.indeed.com" target="_blank" rel="noreferrer noopener nofollow">Source</a></p>
<p>A streamlined mobile experience also means fewer duplicate content issues, as both desktop and mobile versions are consolidated, which prevents crawl budget waste on redundant pages. Discover some <a href="https://prerender.io/blog/how-to-fix-duplicate-content-issues/" target="_blank" rel="noreferrer noopener">best practices to find, prevent, and fix content duplication</a>.</p>
<h2 class="wp-block-heading">Healthier Crawl Budget Means Better Job Listing Indexing</h2>
<p>Given the dynamic nature of these job marketplaces and career sites, where job postings are frequently updated or removed, effective management of the crawl budget is crucial for maintaining visibility in search results. The techniques we showed can ensure a more efficient crawling process.</p>
<p>While all of the crawl budget optimization techniques listed above are great, if we had to pick the most important, it’d be reducing JS indexing delays with Prerender.io. By generating static HTML snapshots of your dynamic pages, Prerender.io enables bots to crawl your site faster, without the delays caused by rendering JavaScript. This ensures that your most important content, like new job listings, is indexed promptly.</p>
<p><a href="https://prerender.io/pricing/" target="_blank" rel="noreferrer noopener">Sign up for free</a> and start improving your job site’s crawl budget spending!</p>
<h2>FAQs</h2>
<h3>Do Expired Job Listings Affect Crawl Budget?</h3>
<p>Yes, expired job listings can waste crawl budget. Properly handle expired listings by removing them, redirecting them, or using the noindex tag to prevent search engines from crawling them unnecessarily.</p>
<h3>How Can I Increase My Site’s Crawl Budget?</h3>
<p>To increase crawl budget, optimize site speed, improve internal linking, use a sitemap, remove or noindex low-value pages, and ensure your robots.txt file is configured correctly.</p>
<h3>How Do I Know If Poor JS Rendering Is Affecting My Website?</h3>
<p>Poor JavaScript rendering can cause reduced visibility, slower indexing, inconsistent user experience, and/or lower search rankings. If you’re seeing any of these symptoms, poor JS rendering may be the culprit.</p>
<h3>What Are Technical SEO Tips for Job Boards?</h3>
<p>Some include: optimizing for mobile, writing compelling meta descriptions, implementing proper structured data, and improving your JavaScript site for SEO with a tool like Prerender.io. <a href="https://prerender.io/pricing/">Try for free</a>. </p>
</div>
</div>
</div>
</div>
</div>
]]></content:encoded>
</item>
<item>
<title>Save Your Crawl Budget: The Impact of Noindex vs. Nofollow Tags</title>
<link>https://prerender.io/blog/impact-of-noindex-vs-nofollow-tags/</link>
<comments>https://prerender.io/blog/impact-of-noindex-vs-nofollow-tags/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Thu, 20 Jun 2024 14:38:52 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawl budget]]></category>
<category><![CDATA[crawl budget optimization]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[technical seo]]></category>
<guid isPermaLink="false">https://prerender.io/?p=4034</guid>
<description><![CDATA[How noindex and nofollow tags affect your website's crawl budget.]]></description>
<content:encoded><![CDATA[
<p>You may be familiar with the roles the noindex and nofollow tags play in content indexing and link equity, but do you know how they affect crawl budget spending? Understanding the impact of noindex vs. nofollow tags is crucial for an effective SEO strategy.</p>
<p>When used strategically, noindex and nofollow tags can help you conserve your limited crawl budget, prioritizing the content that matters most for indexing. Before diving into optimization, let’s establish the foundation. In this article, we’ll explore what noindex and nofollow tags are and their role in crawling and indexing, then cover tips on how to leverage them to save your crawl budget.</p>
<p>Need a refresher on crawl budgets? Our <a href="https://prerender.io/resources/free-downloads/">FREE white paper</a> explains everything you need to know about crawl budgets. </p>
<h2 class="wp-block-heading">Nofollow and Noindex Tags Explained</h2>
<h3 class="wp-block-heading">What Is a Nofollow Tag?</h3>
<p>Nofollow<strong> </strong>tags tell search engines <strong>not to follow the links on a specific webpage</strong>, meaning the linked pages won’t receive any ranking benefit (<a href="https://www.seobility.net/en/wiki/Link_Juice">link juice</a>) from your page. Googlebot will see the link, but it won’t “travel” down that path to analyze the linked page for ranking purposes. </p>
<p>Nofollow tags are typically used for sponsored links (ads), user-generated content (comments, forum posts), or affiliate links (depending on search engine guidelines).</p>
<h3 class="wp-block-heading">What Is a Noindex Tag?</h3>
<p>The noindex tag instructs search engines to <strong>exclude a specific webpage from their index</strong>. Think of it as a polite “don’t list this page” sign for Googlebot: the page can still be crawled, but it won’t appear in search results.</p>
<p>Common use cases for noindex include thin content pages with minimal value, login or temporary pages, and duplicate content across your site.</p>
<h3 class="wp-block-heading">Nofollow vs. Noindex Tags: What’s The Difference?</h3>
<p>The noindex tag <strong>directly affects the search engine’s index</strong>, while the nofollow tag <strong>does not</strong>. </p>
<p>When a search engine encounters a noindex tag on a webpage, that page is excluded from the index, so it won’t show up in search results. The nofollow tag, on the other hand, only tells search engines not to follow the page’s links or pass link equity through them.</p>
<h2 class="wp-block-heading">Impacts of Nofollow and Noindex Tags on Crawl Budget</h2>
<p>Now, there’s a common misconception that noindex and nofollow tags directly save crawl budget. While it seems logical (why crawl something you don’t want indexed?), the reality is a bit more nuanced.</p>
<p>According to John Mueller of Google’s Search Relations team, <a href="https://www.searchenginejournal.com/google-noindexed-pages-do-not-impact-crawl-budget/472870/#:~:text=%E2%80%9CNo%2C%20there%20is,an%20SEO%20one.%E2%80%9D">noindexed pages <strong>do not </strong>directly impact your crawl budget</a>. This is because search engines <strong>still need to crawl a page to see the noindex tag</strong> before deciding to leave it out of the index. That said, search engines like Google are constantly learning and adapting: over time, they may crawl noindexed pages less frequently, saving your crawl budget for more important content.</p>
<p>With the nofollow tag, things get a little more complicated. Nofollow tags <strong>do not</strong> <strong>prevent crawling</strong>. Googlebot will still visit the page containing the nofollow link, but it simply won’t follow the link itself. This means the linked page won’t receive any ranking benefit (link juice) from your page, but<strong> the crawl budget is still spent on reaching the nofollow page.</strong></p>
<p>In short, the nofollow tag only instructs the crawler not to follow the link or pass ranking signals; it doesn’t prevent the page from being crawled or even included in the search engine’s index.</p>
<p>If you have pages on your site that you don’t want crawled or indexed, it’s better to use the noindex tag or block them via robots.txt or other server-side techniques rather than relying solely on nofollow links.</p>
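<p>For the robots.txt route mentioned above, Python’s standard library can verify what a given crawler is allowed to fetch before you deploy the rules; the rules and URLs below are hypothetical:</p>

```python
import urllib.robotparser

# Hypothetical robots.txt rules blocking low-value sections from all crawlers.
rules = [
    "User-agent: *",
    "Disallow: /cart/",
    "Disallow: /search",
]

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# A blocked path is never crawled, so it spends no crawl budget at all.
print(rp.can_fetch("Googlebot", "https://example.com/cart/checkout"))
print(rp.can_fetch("Googlebot", "https://example.com/products/shoe"))
```

<p>Note the contrast with noindex: a robots.txt-disallowed page is never fetched, whereas a noindexed page must still be fetched at least occasionally for the tag to be seen.</p>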
<p>Here’s the summary of noindex and nofollow tags impacts on crawl budget spending:</p>
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://prerender.io/wp-content/uploads/Prerender_Blogs-5-1024x576.png" alt="Noindex vs nofollow tags explained in a table format" class="wp-image-4092" srcset="https://prerender.io/wp-content/uploads/Prerender_Blogs-5-1024x576.png 1024w, https://prerender.io/wp-content/uploads/Prerender_Blogs-5-300x169.png 300w, https://prerender.io/wp-content/uploads/Prerender_Blogs-5-768x432.png 768w, https://prerender.io/wp-content/uploads/Prerender_Blogs-5-1536x864.png 1536w, https://prerender.io/wp-content/uploads/Prerender_Blogs-5.png 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
<p><strong>Related</strong>: Discover some <a href="https://prerender.io/blog/robots-txt-and-seo/">common mistakes of using robots.txt file</a> and how to boost its impact on indexing performance.</p>
<h2 class="wp-block-heading">How to Optimize Crawl Budget with Nofollow and Noindex Tags</h2>
<p>Now that you know how noindex and nofollow tags relate to crawl budget, can you still use them to manage it better? The answer is yes, but indirectly. Here’s how.</p>
<h3 class="wp-block-heading">Using Noindex Tags</h3>
<ul class="wp-block-list">
<li><strong>Step 1: Audit your pages</strong></li>
</ul>
<p>Identify low-quality content that does not contribute to your bottom line but only consumes your crawl budget. This includes thin and <a href="https://prerender.io/blog/how-to-fix-duplicate-content-issues/">duplicate content</a> and temporary login or thank you pages. You can merge, delete, or apply noindex or nofollow tags to them.</p>
<ul class="wp-block-list">
<li><strong>Step 2a: Use noindex tags</strong></li>
</ul>
<p>Let’s say you apply noindex tags. This will instruct search engine crawlers not to index them, potentially reducing the frequency of crawling over time and saving your valuable crawl budget. </p>
<p>Keep in mind that Googlebot might still crawl noindex pages to discover the tag itself (at least initially) and check for changes. However, Google’s algorithms will learn to prioritize the crawling of indexed pages you actively want to rank.</p>
<h3 class="wp-block-heading">Using Nofollow Tags</h3>
<p>We recommend using nofollow tags on sponsored links and user-generated content to direct Googlebot’s attention toward the internal links that matter most for your SEO strategy. By controlling link juice flow and directing crawl focus towards valuable internal links, you’re essentially optimizing how Googlebot utilizes its crawl budget on your website.</p>
<h3 class="wp-block-heading">Using Both Noindex and Nofollow Tags</h3>
<p>For paid advertising placements on your website, consider using both noindex and nofollow tags. This prevents the page from being indexed and ensures the ad links don’t receive link juice, which could potentially harm your own website’s ranking.</p>
<h2 class="wp-block-heading">Alternatives to Noindex and Nofollow Tags to Save Crawl Budget</h2>
<p>Using noindex and nofollow is a great starting point for optimizing your crawl budget. For more impactful results, consider complementing them with these crawl budget optimization techniques. </p>
<h3 class="wp-block-heading">Optimize Sitemap</h3>
<p>A well-structured sitemap is a roadmap for Googlebot, highlighting the most valuable content you want Google to index. This includes:</p>
<ul class="wp-block-list">
<li><strong>Priority pages:</strong> URLs that return a 200 status code, indicating they load successfully.</li>
<li><strong>Canonical versions:</strong> the definitive versions of important pages to avoid duplicate content issues.</li>
<li><strong>SEO all-stars:</strong> pages strategically targeted in your SEO plan.</li>
<li><strong>Fresh content hubs:</strong> pages that are frequently updated and need Googlebot to visit often.</li>
</ul>
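<p>The checklist above can be turned into a sitemap file with a few lines of standard-library Python; the URLs and change frequencies are placeholders you would pull from your own CMS:</p>

```python
import xml.etree.ElementTree as ET

# Placeholder priority pages; in practice, source these from your CMS or router.
pages = [
    ("https://example.com/", "daily"),
    ("https://example.com/blog/crawl-budget/", "weekly"),
]

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
for loc, changefreq in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "changefreq").text = changefreq

sitemap_xml = ET.tostring(urlset, encoding="unicode")
print(sitemap_xml)
```

<p>Only canonical, 200-status URLs should go into the list; including redirects or noindexed pages sends Googlebot mixed signals.</p>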
<h3 class="wp-block-heading">Optimize Page Speed</h3>
<p>In principle, the faster your pages load, the less crawl budget Google spends crawling and indexing them. Here are things you can do to increase your page loading speed:</p>
<ul class="wp-block-list">
<li><strong>Compress images</strong>: resize and compress content images so they load quickly without a visible loss in quality.</li>
<li><strong>Utilize browser caching</strong>: save frequently accessed web files in caches so users don’t have to download them again on subsequent visits. This guide explains <a href="https://prerender.io/blog/caching-in-javascript-and-how-it-affects-seo-performance/">how to cache JavaScript content and how it affects SEO performance</a>.</li>
<li><strong>Use Content Delivery Networks (CDNs)</strong>: A CDN distributes your website’s content across geographically dispersed servers. This ensures users access the content from the nearest server, minimizing load times.</li>
<li><strong>Adopt a pre-rendering solution</strong>: discussed below.</li>
</ul>
<h3 class="wp-block-heading">Implement A Pre-rendering Solution</h3>
<p>JavaScript is the biggest crawl budget killer. Due to the <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">two-step JavaScript indexing process</a>, search engine crawlers often struggle to process complex JavaScript, leading to incomplete indexing and a wasted crawl budget.</p>
<p>Several approaches, such as <a href="https://prerender.io/comparison/">self-built SSR, static rendering, and hydration</a>, can help with JavaScript SEO, but pre-rendering solutions like <a href="https://prerender.io/">Prerender</a> offer a much more powerful and efficient way to save your crawl budget. Here’s why.</p>
<p>Pre-rendering creates pre-built versions of your JavaScript pages in plain HTML. These pre-rendered versions are readily available for search engines to crawl and index, ensuring they can access all your content during the JavaScript SEO crawling process. </p>
<p>As your content is 100% indexed, this leads to better SEO performance and visibility. Additionally, the rendered content eliminates the client-side JavaScript rendering time, improving website loading speed for users.</p>
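<p>Conceptually, a dynamic-rendering setup routes crawler requests to the pre-rendered HTML while regular visitors get the normal JavaScript app. A minimal sketch of that routing decision follows; the user-agent token list and function name are illustrative, not Prerender’s actual implementation:</p>

```python
# Illustrative bot user-agent tokens; real middleware maintains a longer list.
BOT_TOKENS = ("googlebot", "bingbot", "duckduckbot", "yandexbot")

def should_serve_prerendered(user_agent: str) -> bool:
    """Return True when the request looks like a search engine crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)

crawler_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
human_ua = "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"

print(should_serve_prerendered(crawler_ua))  # crawler: serve cached, pre-rendered HTML
print(should_serve_prerendered(human_ua))    # human visitor: serve the JavaScript app
```

<p>Because the crawler branch returns flat HTML, the search engine skips the expensive JavaScript rendering step entirely, which is where the crawl budget savings come from.</p>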
<h2 class="wp-block-heading">Manage Your Crawl Budget Smartly with Nofollow and Noindex Tags</h2>
<p>While noindex and nofollow tags don’t directly save crawl budget, they are still valuable tools for improving SEO performance. By identifying and noindexing low-value pages and strategically using nofollow on specific links, you can indirectly influence how search engines crawl your website.</p>
<p>However, for a full-force crawl budget optimization, you’ll need to do more than just implement noindex and nofollow tags. You need to create a well-structured sitemap, prioritize page speed optimization, and implement <a href="https://prerender.io/">Prerender</a> to address JavaScript rendering challenges. </p>
<p>By combining these strategies, you ensure search engines efficiently crawl and index your most valuable content, ultimately leading to a stronger SEO performance.</p>
<p><a href="https://prerender.io/pricing/">Sign up with Prerender for free to try it out.</a></p>
<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="wp-block-button"><a class="wp-block-button__link has-white-color has-text-color has-background has-link-color wp-element-button" href="https://auth.prerender.io/auth/realms/prerender/protocol/openid-connect/registrations?client_id=prerender-frontend&response_type=code&scope=openid%20email&redirect_uri=https://dashboard.prerender.io/integration-wizard&_gl=1*7y7laa*_gcl_au*ODIxODI0MzUyLjE3MzkzNzAxOTQ.*_ga*MjAxMjc3MTgxOC4xNzIzODAxMTYz*_ga_5C99FX76HR*MTczOTYwNDI4MC4zODguMS4xNzM5NjA4NDYzLjYwLjAuMA.." style="background-color:#1f8511">Create Free Account</a></div>
</div>
<h2 class="wp-block-heading">FAQs</h2>
<h3 class="wp-block-heading">What Is The Main Difference Between Noindex And Nofollow?</h3>
<p>Noindex tells search engines not to include a page in their index, while nofollow indicates that search engines shouldn’t follow the links on that page or pass PageRank through specific links.</p>
<h3 class="wp-block-heading">When Should I Use Noindex Instead Of Nofollow?</h3>
<p>Use noindex for pages you don’t want appearing in search results (like admin pages or duplicate content), and use nofollow for links you don’t want to endorse or pass authority to (like user-generated content or paid links).</p>
<h3 class="wp-block-heading">Do Noindex Tags Affect My Overall SEO?</h3>
<p>While noindex tags prevent specific pages from appearing in search results, they can be beneficial for SEO by helping search engines focus on your important content and preventing duplicate content issues.</p>
<h3 class="wp-block-heading">Can I Use Both Noindex And Nofollow Together?</h3>
<p>Yes, you can use both tags together when you want to both prevent a page from being indexed and stop search engines from following its links, though this is less common.</p>
<h3 class="wp-block-heading">How Long Does It Take For Noindex Tags To Take Effect?</h3>
<p>There’s no set time. Depending on your site and crawl frequency, it can typically take several days to weeks for search engines to recognize and implement noindex tags.</p>
<h3 class="wp-block-heading">Will Nofollow Tags Affect My Internal Link Structure?</h3>
<p>Nofollow tags primarily impact how PageRank flows through your site, but they don’t prevent search engines from discovering linked pages through other means.</p>
<h3 class="wp-block-heading">How Do I Know If My Noindex Tags Are Working Properly?</h3>
<p>You can verify using Google Search Console’s URL Inspection tool or monitor your indexed pages report to ensure tagged pages are being removed from the index.</p>
<h3 class="wp-block-heading">Should I Use Noindex On My JavaScript-Generated Content?</h3>
<p>For JavaScript-generated content, ensure your noindex tags are properly rendered. Using Prerender.io can help ensure search engines correctly interpret these directives on dynamic content.</p>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/impact-of-noindex-vs-nofollow-tags/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>How to Help Google Understand (and Crawl) Your Ecommerce Website Structure</title>
<link>https://prerender.io/blog/how-to-help-google-understand-and-crawl-your-ecommerce-website-structure/</link>
<comments>https://prerender.io/blog/how-to-help-google-understand-and-crawl-your-ecommerce-website-structure/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Mon, 17 Jun 2024 13:13:19 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawl budget]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[ecommerce seo]]></category>
<category><![CDATA[website navigation]]></category>
<guid isPermaLink="false">https://prerender.io/?p=4006</guid>
<description><![CDATA[Learn how to optimize your ecommerce site structure for crawling.]]></description>
<content:encoded><![CDATA[
<p>Have you ever been lost in an online store trying to find a product? That’s the same experience search engine bots have when crawling ecommerce websites with poor site structures.</p>
<p>Ecommerce websites typically have thousands of product pages that need to be logically structured. Without proper organization, it becomes difficult for Google to crawl and understand your product pages and their SEO elements.</p>
<p>So, what’s the best way to structure your ecommerce website? This Prerender.io blog covers best practices for structuring your ecommerce website and choosing <a href="https://prerender.io/blog/types-of-navigation-for-dynamic-sites-worth-optimizing/">the right site navigation type</a> to meet Google’s requirements.</p>
<p>But first, let’s cover the basics of ecommerce crawling and indexing.</p>
<h2 class="wp-block-heading">What Makes Ecommerce Websites Prone to Crawling and Indexing Issues?</h2>
<p>Compared to “normal” websites, ecommerce sites have unique characteristics that often prevent Google from crawling and indexing them effectively. Some of them are: </p>
<ul class="wp-block-list">
<li>Most ecommerce pages are built with a JavaScript framework, which is notoriously difficult and slow to index. This blog explains <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">the challenges search engines face when processing JS content</a>. </li>
<li>They host thousands of product pages that are rapidly and dynamically updated (e.g., product stocks, prices, and customer reviews).</li>
<li>They have a complex site structure due to the various product categories and types.</li>
<li>They are prone to duplicate content.</li>
</ul>
<p>If not specifically addressed, these issues can significantly impact your website’s visibility, potentially leading to missed traffic and sales.</p>
<h2 class="wp-block-heading">How Google Crawls and Indexes Ecommerce Websites </h2>
<p>In principle, Google crawls and indexes ecommerce product pages the same way it processes “normal” pages. It still looks at content quality and relevance, as well as on-page SEO (e.g., metadata) and technical SEO health (e.g., page speed and internal linking). </p>
<p><strong>Related</strong>: Discover <a href="https://prerender.io/blog/the-most-important-ranking-factors-for-2023/">some important Google ranking factors</a> to ensure your product pages land at the top SERPs here.</p>
<p>That said, some key factors are emphasized when crawling and indexing ecommerce product pages, as explained below.</p>
<h3 class="wp-block-heading">1. <strong>Product information</strong></h3>
<p>To understand an ecommerce product page, Google “reads” the product information, including the page title, descriptions, and images. When Google finds this content too similar to other pages (or duplicated), your content quality score suffers, and the page will likely be deprioritized for indexing.</p>
<h3 class="wp-block-heading">2. <strong>Mobile-first indexing</strong></h3>
<p>With the surge in mobile shopping, Google prioritizes ecommerce sites that deliver a seamless mobile user experience through responsive design. Optimizing your site for mobile is no longer optional, it’s essential.</p>
<p><strong>Pro tip</strong>: Follow these <a href="https://prerender.io/blog/optimizing-ecommerce-mobile-ux/">8 easy steps to improve your ecommerce site mobile user experience</a>.</p>
<h3 class="wp-block-heading">3. <strong>Product reviews</strong></h3>
<p>Positive reviews and high ratings on your product pages and online store can build trust with your customers and Google. Stellar reviews signal Google that your site is trustworthy and provides a good user experience.</p>
<h3 class="wp-block-heading">4. <strong>Website structure</strong></h3>
<p>When crawling a website, Google follows its structure, tracing links to understand each page’s role and information. The cleaner and more logical your ecommerce site structure is, the more easily and accurately Google processes it.</p>
<h2 class="wp-block-heading">6 Tips to Help Google Understand Your Ecommerce Website Structure</h2>
<p>We now know that a well-organized site structure is crucial for search engines to index your ecommerce store effectively. This, in turn, can significantly boost your site’s visibility and ranking potential. Here are 6 best practices to improve your site’s organization and maximize your sales potential.</p>
<h3 class="wp-block-heading">1. Use a Logical Information Architecture (IA) and Internal Navigation</h3>
<p>A logical information architecture is essential to help Google understand your site structure because it guides customers and crawlers to the right places.</p>
<p>When you hierarchically structure your categories, subcategories, and individual product pages, Google receives clear signals about how all the content fits together. For instance, you wouldn’t want ‘shoes’ and ‘electronics’ lumped together under one umbrella category, but you’ll want to group various types of products in the same category hierarchically. For example, ‘Women > Shoes > Basketball Shoes’ and ‘Women > Shoes > Running Shoes’.</p>
<figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfe43ARK2Pp54XJqgVFz2lBAIDVVfz4pg4auJWArEH3uno0HOW98VdYtx32RMF5XwPKWwUZ4GyjikHfve3FyEBY2WFJLdWB0i5X7p2fEPFalxgMYBTS4cp3NLGEKWlkeF7JdiJI8VTiIkyD4ICDwsUQRg6j0RDTWOF_PKH-eQ?key=XLlRisAXuk1aOHtwrOKN0Q" alt="Example of Nike shoe navigation. "/></figure>
<p><a href="https://www.nike.com/w/womens-shoes-5e1x6zy7ok">Source</a></p>
<p>To accomplish this, optimize your internal navigation system. Internal navigation allows you to link pages A and B, enabling Google to understand that these two pages are related. This also helps Google crawl deeper pages within the categories, helping the overall indexing process.</p>
<p>This guide will tell you more about <a href="https://prerender.io/blog/types-of-navigation-for-dynamic-sites-worth-optimizing/">internal navigation and how to implement it properly</a>.</p>
<h3 class="wp-block-heading">2. Amp Up Your PageSpeed</h3>
<p>Fast page speed can lead to a faster and more thorough indexing of the site’s structure and content. Slow page speed, on the other hand, can negatively impact Google’s ability to understand your ecommerce product pages. It also results in poor UX and high bounce rates since users quickly abandon slow sites.</p>
<p>For large ecommerce sites, adopting an SEO rendering strategy focused on <a href="https://prerender.io/benefits/better-response-times/">optimizing server response time (SRT)</a> can significantly enhance SEO performance. <strong>Prerender</strong>—an enterprise SEO rendering ecommerce solution—is a highly recommended tool for handling this job.</p>
<p>Prerender’s dynamic rendering empowers your online store to <strong>enjoy an SRT under 50 milliseconds</strong>. And with server responses that fast, Google takes less time to crawl and process your pages, leading to more pages being indexed, more thoroughly. If you want to experience this on your ecommerce site, <a href="https://prerender.io/pricing/">get started with Prerender today</a>.</p>
<h3 class="wp-block-heading">3. Implement Proper URL Structures</h3>
<p>At first glance, URLs may seem like nothing more than a web address for locating a page, but they’re essential in structuring your site for search engine optimization. URLs or links create a clear path for Google to understand your site’s page content and hierarchy, leading to improved site crawling, indexing, and ranking. </p>
<p>Some of the best practices in creating a proper URL structure are:</p>
<ul class="wp-block-list">
<li>Use descriptive keywords in the URL paths, such as ‘domain.com/category/subcategory/product-name’. This tells Google exactly what the page is about.</li>
<li>Keep the URL simple, short, and clean.</li>
<li>Use URL subdirectories instead of query parameters where possible. For example, ‘category/dresses/’ rather than ‘category?type=dresses’.</li>
<li>Keep URLs consistent across the site architecture. Be careful not to mix parameters and subdirectories on different pages.</li>
<li>Hyphens ‘-’ are more readable and should be used to separate words instead of underscores ‘_’.</li>
</ul>
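<p>The URL guidelines above (descriptive keywords, subdirectories, hyphens) can be encoded in a small template helper; a sketch with hypothetical function and field names:</p>

```python
import re

def product_url(category: str, subcategory: str, name: str) -> str:
    """Build a clean, hyphenated subdirectory URL from product fields."""

    def slugify(text: str) -> str:
        text = text.lower().strip()
        text = re.sub(r"[^a-z0-9]+", "-", text)  # non-alphanumerics -> hyphens
        return text.strip("-")

    return "/" + "/".join(slugify(part) for part in (category, subcategory, name)) + "/"

print(product_url("Women", "Shoes", "Pegasus 41 Running Shoe"))
# -> /women/shoes/pegasus-41-running-shoe/
```

<p>Generating every URL through one helper like this also keeps the structure consistent site-wide, avoiding the mix of parameters and subdirectories warned about above.</p>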
<h3 class="wp-block-heading">4. Optimize the Product Pages’ Meta Tags</h3>
<p>Though invisible on your webpage, meta tags live in the page’s head section, acting as concise summaries for search engines. Optimized meta tags give Googlebot clear, informative signals about your ecommerce pages, helping it understand your content quickly and accurately.</p>
<p>Now, there are a few meta tags every ecommerce business must focus on.</p>
<ul class="wp-block-list">
<li><strong>Title tag (max 60 characters)</strong></li>
</ul>
<p>This is the headline of your page in search results. You should include relevant keywords and accurately reflect the content, but keep it concise and user-friendly.</p>
<ul class="wp-block-list">
<li><strong>Meta description (max 160 characters)</strong></li>
</ul>
<p>This is a short blurb displayed under your title tag in search results. Highlight benefits and summarize content to entice users to click. You should keep it unique to differentiate yourself from competitors.</p>
<ul class="wp-block-list">
<li><strong>Meta keyword tag</strong></li>
</ul>
<p>Google ignores the meta keywords tag, so it won’t boost your rankings. Rather than filling it, place relevant keywords naturally in your titles, descriptions, and body copy, avoid keyword stuffing, and focus on user intent.</p>
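<p>The title and meta description length limits above (roughly 60 and 160 characters) are easy to enforce at template time. A sketch with hypothetical product copy; the word-boundary trimming logic is one reasonable approach, not a Google requirement:</p>

```python
TITLE_MAX, DESC_MAX = 60, 160

def clip(text: str, limit: int) -> str:
    """Trim text to the limit on a word boundary, adding an ellipsis if cut."""
    if len(text) <= limit:
        return text
    return text[: limit - 1].rsplit(" ", 1)[0] + "…"

title = clip("Pegasus 41 Women's Road Running Shoes | Example Store", TITLE_MAX)
description = clip(
    "Responsive cushioning in the Pegasus provides an energized ride for "
    "everyday road running. Experience lighter-weight energy return for "
    "all distances.",
    DESC_MAX,
)

print(title)
print(description)
```

<p>Running this check in your templates (or a CI step) catches over-long tags before Google truncates them in the SERP for you.</p>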
<figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfgIfljbfuhNpkVQbqTH_9P0zRAnSqZO2y-8y2Spb60qeIwS1RjW1KWyZklSt_Ug3879uOvezqqhgA9rq7jdwNo6arNoMZLCecOHd-vhLyu8PursyW5CIANkbpXcx_yjVw1gRkReUyumTv9tMWSjBMFx_haiMVZ8eq7GYZWAQ?key=XLlRisAXuk1aOHtwrOKN0Q" alt="Example of Nike meta data. "/></figure>
<h3 class="wp-block-heading">5. Use XML Sitemaps</h3>
<p>XML sitemaps act as a roadmap for Google, listing all the important pages on your ecommerce site along with key metadata. Sitemaps clarify the relationship between each URL on your site, preventing duplicates and ensuring they are correctly indexed and categorized. They also highlight blog posts, videos, and images you want indexed that Google may miss. </p>
<p>By providing a clear, up-to-date structure in the proper XML format, you significantly simplify Google’s work, and your site enjoys faster, more comprehensive indexing and, of course, better SEO results.</p>
<h3 class="wp-block-heading">6. Make Each Product Description Unique</h3>
<figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/docsz/AD_4nXcwxYc0ddaYeLzXiEjUIC1B6cGuhld02nm4kcY0huSjE9Fwjdppgpu7CTjCMMQjFiM_wyuc5-3nGYAfwIdnqX1ULDBFmVw5VuVMaAEjtAeePQQzUKRr8dArCUhvHty0g2SSuhlovY7drne5ZJoh4nyaKsNr43D-yf3xjwC7fg?key=XLlRisAXuk1aOHtwrOKN0Q" alt="Example of a Nike product description. "/></figure>
<p><a href="https://www.nike.com/t/pegasus-41-womens-road-running-shoes-tSbZGh/FD2723-701">Source </a></p>
<p>Well-written descriptions packed with relevant keywords help Google understand what you’re selling and categorize your products effectively—and this goes beyond just the title. Detailed information about features, materials, and uses paints a much clearer picture for search engines.</p>
<p>Remember, Google rewards originality. So, ditch the generic fluff and craft engaging descriptions that showcase your product’s value proposition. Think like a shopper and a search engine bot at the same time. Include details that would be helpful to both, and don’t forget descriptive alt text for your product images.</p>
<p>By investing in high-quality product descriptions, you’ll not only inform your customers but also help search engines understand your products, potentially leading to faster crawling, better categorization, and ultimately, improved SEO performance.</p>
<h2 class="wp-block-heading">Optimized Ecommerce Site Structure Improves Google Indexing Performance </h2>
<p>Ecommerce websites built with JavaScript can challenge Google’s crawling and indexing process. To address this, create a logically structured website using the six tips mentioned above.</p>
<p>To complement this approach, consider adopting <a href="https://prerender.io/">Prerender</a>—an enterprise SEO rendering solution. Prerender works by dynamically rendering your product pages ahead of time and feeding them to search engine crawlers upon request. This reduces Server Response Time (SRT) and ensures that all your SEO elements and the page’s content are <strong>100% indexed</strong>. Ultimately, this allows you to leverage your SEO efforts on the product pages to improve their visibility and potentially boost their ranking in search results.</p>
<p><a href="https://prerender.io/pricing/">Sign up to Prerender today</a> and get 1,000 free monthly renders!</p>
<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="wp-block-button"><a class="wp-block-button__link has-white-color has-text-color has-background has-link-color wp-element-button" href="https://auth.prerender.io/auth/realms/prerender/protocol/openid-connect/registrations?client_id=prerender-frontend&response_type=code&scope=openid%20email&redirect_uri=https://dashboard.prerender.io/integration-wizard&_gl=1*19myl2i*_gcl_au*ODIxODI0MzUyLjE3MzkzNzAxOTQ.*_ga*MjAxMjc3MTgxOC4xNzIzODAxMTYz*_ga_5C99FX76HR*MTczOTYxNzE3NC4zODkuMS4xNzM5NjIwNDg2LjYwLjAuMA.." style="background-color:#1f8511">Create Free Account</a></div>
</div>
<h2 class="wp-block-heading">FAQs</h2>
<h3 class="wp-block-heading">How Can I Get Google To Crawl My Ecommerce Site Better?</h3>
<p>There are a few ways you can ensure that your ecommerce site gets crawled—and eventually indexed—more efficiently. These include:</p>
<ul class="wp-block-list">
<li>Use a clear site structure</li>
<li>Create optimized XML sitemaps</li>
<li>Use proper internal linking</li>
<li>Ensure efficient JavaScript rendering with Prerender.io</li>
<li>Manage faceted navigation well</li>
</ul>
<h3 class="wp-block-heading">Why Is Google Not Indexing All My Product Pages?</h3>
<p>Common reasons for product page indexing issues include JavaScript rendering problems, poor site architecture, crawl budget limitations, and duplicate content. Learn more about how you can get your <a href="https://prerender.io/blog/10-ways-to-get-your-website-indexed-faster/">webpages indexed faster</a>.</p>
<h3 class="wp-block-heading">How Often Should Google Crawl My Ecommerce Site?</h3>
<p>Crawl frequency depends on:</p>
<ul class="wp-block-list">
<li>Site size and updates</li>
<li>Content freshness</li>
<li>Site performance</li>
<li>Technical optimization</li>
</ul>
<p>Large ecommerce sites should aim for daily crawling of important pages. </p>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/how-to-help-google-understand-and-crawl-your-ecommerce-website-structure/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>Ecommerce Requests Wasting Your Crawl Budget—And How to Optimize Them</title>
<link>https://prerender.io/blog/ecommerce-requests-wasting-your-crawl-budget/</link>
<comments>https://prerender.io/blog/ecommerce-requests-wasting-your-crawl-budget/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Fri, 10 May 2024 20:06:27 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawl budget]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[ecommerce seo]]></category>
<category><![CDATA[indexing]]></category>
<guid isPermaLink="false">https://prerender.io/?p=3946</guid>
<description><![CDATA[Find out what requests drain your ecommerce crawl budget.]]></description>
<content:encoded><![CDATA[
<p>Getting your product listings indexed and ranked on SERPs requires crawling. To recap: before anything else, bots must crawl your pages, spending your (already limited) crawl budget to “understand” your content. This can be a complex process for standard sites, but with ecommerce, it’s even more complicated.</p>
<p>Due to the dynamic nature of their content, ecommerce sites are especially prone to depleted or poorly managed crawl budgets. Find out which crawl requests drain your budget, plus some tips for managing your crawl budget more efficiently.</p>
<p><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Want to know more about crawl budgets?</strong> Download the <a href="https://prerender.io/blog/6-step-recovery-process-after-getting-deindexed/#:~:text=%F0%9F%94%8D%C2%A0Want%20to%20boost%20your%20SEO%3F%20Download%20the%20free%20technical%20SEO%20guide%20to%C2%A0crawl%20budget%20optimization%C2%A0and%20learn%20how%20to%20improve%20your%20site%E2%80%99s%20visibility%20on%20search%20engines.">free technical SEO guide to crawl budget optimization</a> and learn how to improve your site’s visibility on search engines.</p>
<h2 class="wp-block-heading">Why Ecommerce Sites Should Care About Crawl Budget Optimization</h2>
<p>Ecommerce websites are beasts. Hundreds of thousands of product pages constantly change with updates to availability, prices, and reviews, continually reshaping the <em>ecommerce website structure</em>. This generates a massive number of URLs and a surge in crawl requests.</p>
<p>Here’s the problem: every (re-)crawl and (re-)index uses up your crawl budget. When hundreds of product listings need attention, your crawl budget quickly depletes. This leads to slow indexation (outdated content) and potentially <a href="https://prerender.io/benefits/no-more-missing-content/">missed content</a> altogether.</p>
<p>Unfortunately, high product page volume isn’t the only factor that can easily deplete your ecommerce crawl budget. There are other “ecommerce requests” that are often overlooked but can highly influence your crawl rate performance. </p>
<h2 class="wp-block-heading">What are Ecommerce Requests?</h2>
<p>Ecommerce requests, in the context of crawl budget, refer to the various types of HTTP requests that are generated when crawlers (like Googlebot) interact with an ecommerce website.</p>
<p>Search engine crawlers don’t modify website data; they simply read and analyze it. So the most common ecommerce request that search crawlers make is <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/GET">GET requests</a>, which are designed to retrieve information from a server.</p>
<p>These ecommerce requests, while crucial for your website&#8217;s functionality and user experience, can significantly deplete your crawl budget if you&#8217;re not proactive in <em>managing your crawl budget</em>.</p>
<h2 class="wp-block-heading">3 Common Ecommerce Crawl Budget Drainers and How to Optimize Them</h2>
<p>Below are three common ecommerce technical SEO challenges that can easily zero out your retail store crawl budget. </p>
<h3 class="wp-block-heading">1. Duplicate Content</h3>
<p><strong><em>PROBLEM</em></strong></p>
<p>One of the biggest crawl budget killers for ecommerce sites is duplicate content. Think about product variations based on color, size, or other minor details. Each variation often gets its own unique URL, resulting in <strong>multiple pages with nearly identical content</strong>.</p>
<p>When search engines encounter multiple URLs leading to essentially the same content, it creates a dilemma. Which version should they index as the main page? How should they allocate their finite crawling resources? Consequently, Googlebot may spend your crawl limit indexing all variations, wasting your valuable crawl budget.</p>
<p>This problem is constantly occurring in retail sites for a few key reasons:</p>
<ul class="wp-block-list">
<li><strong>Product Variations</strong></li>
</ul>
<p>A single product can have countless SKUs (Stock Keeping Units) with unique URLs due to size, color, material, etc. While these variations offer valuable options to customers, they create technically distinct pages presenting the same core content for search engine crawlers.</p>
<figure class="wp-block-image"><img loading="lazy" decoding="async" width="1188" height="798" src="https://prerender.io/wp-content/uploads/image-75.png" alt="Example of product variations on ecommerce sites
" class="wp-image-3952" srcset="https://prerender.io/wp-content/uploads/image-75.png 1188w, https://prerender.io/wp-content/uploads/image-75-300x202.png 300w, https://prerender.io/wp-content/uploads/image-75-1024x688.png 1024w, https://prerender.io/wp-content/uploads/image-75-768x516.png 768w" sizes="(max-width: 1188px) 100vw, 1188px" /></figure>
<p><a href="https://www.nike.com/t/dunk-low-big-kids-shoes-S3lSGW/CW1590-100">Source</a></p>
<ul class="wp-block-list">
<li><strong>Session IDs and Tracking Parameters</strong></li>
</ul>
<p>Dynamic URLs, such as session IDs, affiliate codes, and tracking parameters, often get appended to product and category URLs. This creates a massive number of seemingly identical URLs from a search engine’s perspective. </p>
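<p>As a rough illustration of how such parameters multiply URLs, the sketch below collapses parameterized URLs back to a single canonical form using Python&#8217;s standard library. The parameter names are illustrative; adjust the list to whatever your analytics and session stack actually appends.</p>

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Tracking/session parameters to strip (illustrative; adjust to your site)
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}

def canonicalize(url: str) -> str:
    """Drop tracking parameters so URL variants collapse to one canonical URL."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))
```

<p>Running a crawl export through a function like this quickly shows how many &#8220;distinct&#8221; URLs are really the same page.</p>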
<ul class="wp-block-list">
<li><strong>Faceted Navigation</strong></li>
</ul>
<p>Modern ecommerce platforms offer robust filtering and faceting options to refine product searches. However, this functionality can present technical SEO problems as each filter combination applied to a category page generates a new URL, potentially leading to duplicate content.</p>
<p>All three factors above make your ecommerce site generate hundreds or even thousands of URLs that are essentially identical from a search engine’s perspective. This not only dilutes the link equity and authority of your most valuable product and category pages, but it also reduces ecommerce page speed and forces crawlers to waste precious time sifting through the redundant content.</p>
<p><strong><em>SOLUTION</em></strong></p>
<p>Deleting or merging duplicate content is the best way to reclaim your crawl budget, and this sometimes means guiding Googlebot on which URLs to crawl. You can do that with the following steps:</p>
<p><strong>Step 1: Identify Duplicate Content</strong></p>
<p>Use website crawling tools like Screaming Frog or SEMrush Site Audit to identify duplicate pages. The free version of Screaming Frog might be sufficient for smaller websites with limited budgets, but if you need a more comprehensive SEO audit with advanced reporting and prioritization, SEMrush Site Audit could be a better choice.</p>
<p><strong>Related</strong>: Not a fan of Screaming Frog? These <a href="https://prerender.io/blog/screaming-frog-alternative/">10 free and paid Screaming Frog alternatives</a> may suit you better.</p>
<p><strong>Step 2: Consolidate Duplicate Pages</strong></p>
<p>Instead of separate URLs for product variations (color, size), consider consolidating them onto a single page with clear filtering options. This reduces the number of URLs competing for crawl budget and ensures unique content gets prioritized.</p>
<p><strong>Step 3: Leverage “Noindex”</strong></p>
<p>Use a &#8220;noindex&#8221; meta robots tag to keep out-of-stock product pages and other low-value pages out of the index, and use robots.txt to stop crawlers from fetching pages that should never be crawled at all. Keep in mind that robots.txt blocks crawling, not indexing: a page that must drop out of the index needs a crawlable &#8220;noindex&#8221; tag. This prevents Googlebot from wasting its crawl budget on irrelevant content. Visit our guide for detailed instructions on <a href="https://prerender.io/blog/robots-txt-and-seo/">how to apply robots.txt directives to your website</a>.</p>
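<p>For instance, a minimal robots.txt fragment that keeps crawlers away from session-parameter URLs and cart pages might look like this (the paths and parameter name are illustrative, not a template for every store):</p>

```
# robots.txt at the site root -- illustrative paths
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /*?sessionid=
```

<p>For pages that must drop out of the index rather than just stay uncrawled, a <code>&lt;meta name="robots" content="noindex"&gt;</code> tag in the page head is the appropriate signal.</p>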
<h3 class="wp-block-heading">2. Infinite Scrolling</h3>
<p><strong><em>PROBLEM</em></strong></p>
<p>Infinite scrolling creates a seamless user experience by loading more products as the user scrolls down a page. While great for user engagement, it presents a challenge for search engine crawlers. </p>
<p>Search engines primarily rely on the initial content loaded on a webpage for indexing. They send a GET request for this content, which includes product information, descriptions, and category listings.</p>
<p>With infinite scrolling, additional product information and listings are loaded dynamically as the user scrolls down. Since Googlebot prioritizes what it sees on the first page load, <strong>important products buried deeper within the infinite scroll can be missed entirely</strong>. </p>
<p>This translates to missing content, as products loaded further down the infinite scroll become invisible to search engines. The overall picture of your offerings presented to search engines also becomes incomplete, leading to inaccurate search results and hindering your organic reach.</p>
<p><strong><em>SOLUTION</em></strong></p>
<p>There are a couple of ways to fix this. </p>
<ul class="wp-block-list">
<li><strong>Implement “Load More” Functionality</strong></li>
</ul>
<p>Instead of infinite scrolling, use a “Load More” button that dynamically loads additional products without generating new URLs.</p>
<ul class="wp-block-list">
<li><strong>Use Canonical Tags</strong></li>
</ul>
<p>Ensure that the main category page URL is set as canonical for all paginated URLs, consolidating link equity and crawl budget.</p>
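<p>In practice, this is a single tag in the head section of each paginated or filtered URL. The URLs below are illustrative:</p>

```html
<!-- On https://www.example.com/dresses/?page=3 (illustrative URL) -->
<link rel="canonical" href="https://www.example.com/dresses/">
```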
<ul class="wp-block-list">
<li><strong>Pagination</strong></li>
</ul>
<p>While not as user-friendly as infinite scrolling, pagination allows Googlebot to see all available product listings on separate pages, guaranteeing complete indexing.</p>
<p>However, to save the crawl budget, consider setting a sensible limit on the number of paginated URLs generated, such as only crawling the first 10-20 pages. Also, use robots.txt or meta robots tags to prevent search engines from crawling deeper paginated URLs.</p>
<figure class="wp-block-image is-resized"><img loading="lazy" decoding="async" width="1130" height="678" src="https://prerender.io/wp-content/uploads/image-74.png" alt="Pagination on ecommerce sites." class="wp-image-3951" style="width:840px;height:auto" srcset="https://prerender.io/wp-content/uploads/image-74.png 1130w, https://prerender.io/wp-content/uploads/image-74-300x180.png 300w, https://prerender.io/wp-content/uploads/image-74-1024x614.png 1024w, https://prerender.io/wp-content/uploads/image-74-768x461.png 768w" sizes="(max-width: 1130px) 100vw, 1130px" /></figure>
<p><a href="https://www.zalando.co.uk/womens-clothing-dresses/">Source</a></p>
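<p>One hedged way to implement that limit is to serve a meta robots tag on paginated URLs beyond your chosen threshold (the threshold of 20 here is illustrative):</p>

```html
<!-- Served in the head of /dresses/?page=21 and deeper (illustrative) -->
<meta name="robots" content="noindex, follow">
```

<p>Keeping <code>follow</code> lets link equity keep flowing through deep pages even though they stay out of the index.</p>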
<h3 class="wp-block-heading">3. Unoptimized Ecommerce JavaScript Content</h3>
<p><strong><em>PROBLEM</em></strong></p>
<p>Another major factor depleting ecommerce crawl budgets is JavaScript-generated content.</p>
<p>Modern ecommerce sites rely heavily on JavaScript frameworks like React, Angular, and Vue for dynamic content, interactive features, and personalization. While JavaScript (JS) is great for user experience (UX), it’s problematic for search engine crawlers because they have difficulty fully parsing and understanding the rendered JS content.</p>
<p>This complexity means search engine crawlers demand a larger crawl budget to crawl and index JS-based content. Consequently, there’ll be little left of your crawl budget. Pages with complex JavaScript can also take longer to load fully, and that’s a known negative ranking factor for search engines.</p>
<p><strong><em>SOLUTION</em></strong></p>
<p>There are a couple of strategies you could use to tackle JavaScript eating up your crawl budget, from minifying JavaScript to using server-side rendering (SSR)—but the best method for JavaScript crawl budget optimization in ecommerce sites is prerendering JavaScript. </p>
<p>Prerendering, using a tool like <a href="https://prerender.io/">Prerender</a>, means preparing a static version of your content. Essentially, Prerender renders your JS content ahead of time and feeds this static version to crawlers. This brings in several technical SEO benefits for your ecommerce site:</p>
<ul class="wp-block-list">
<li><strong>Save Your Valuable Crawl Budget</strong></li>
</ul>
<p>Since Prerender renders your JS content, Googlebot will use less crawl budget to process your pages, saving your crawl limit.</p>
<ul class="wp-block-list">
<li><strong>No More Missing Content</strong></li>
</ul>
<p>Each of your SEO elements and other valuable product information will be 100% indexed, even if the pages rely heavily on the most complex JavaScript code. </p>
<ul class="wp-block-list">
<li><strong>Faster Response Time</strong></li>
</ul>
<p>Prerender not only solves JavaScript SEO problems for ecommerce sites, but it also solves the slow page load issue by reducing your server response time (SRT) to less than 50 milliseconds. </p>
<p>Crawl rate optimization is vital for SEO success in ecommerce sites built with JavaScript, and Prerender makes it easy. The best part is <a href="https://prerender.io/pricing/">you can get started with Prerender now for free</a>!</p>
<p><strong>Related</strong>: <a href="https://prerender.io/blog/other-rendering-options-vs-prerendering/">How is prerendering different from other rendering options?</a> Learn here.</p>
<h2 class="wp-block-heading">Fast Ecommerce Indexation Starts with Optimized Crawl Budget</h2>
<p>A healthy crawl budget is the foundation for robust organic visibility, so you need to manage your crawl budget wisely. By understanding the crawl budget drainers specific to ecommerce and implementing the optimization strategies outlined in this article, you can take back control.</p>
<p>To recap, the main ecommerce requests that drain your crawl budget stem from duplicate content, infinite scrolling, and unoptimized JavaScript content. </p>
<p>The path to reclaim your crawl budget may not be easy, but the result? Improved search engine visibility, increased organic traffic, and ultimately, more customers finding the products they need on your ecommerce website. That just makes everything worth it in the end.</p>
<p>This article outlines several solutions, but if you have difficulty implementing all of them, you can simply use Prerender to solve all of these issues in one go and save your crawl budget. <a href="https://prerender.io/blog/how-to-install-prerender/">It’s easy to install</a> and use, so <a href="https://prerender.io/pricing/">you can get started for free now</a> and say goodbye to crawl budget troubles!</p>
<p><em>Looking for more technical SEO for ecommerce best practices? These blogs may interest you:</em></p>
<ul class="wp-block-list">
<li><a href="https://prerender.io/blog/rich-snippets-for-ecommerce-seo/"><em>How to Get Ecommerce Product Snippets to Show up in SERPs</em></a></li>
<li><a href="https://prerender.io/blog/enterprise-guide-to-finding-pages-that-deplete-your-crawl-budget/"><em>Enterprise Guide to Finding Pages That Deplete Your Crawl Budget</em></a></li>
<li><a href="https://prerender.io/blog/ecommerce-products-not-ranking/"><em>6 Reasons Why Your Ecommerce Products Aren’t Ranking</em></a></li>
</ul>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/ecommerce-requests-wasting-your-crawl-budget/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>Index Bloating: What It Means for Crawl Budgets and How to Fix It</title>
<link>https://prerender.io/blog/how-to-fix-index-bloating-seo/</link>
<comments>https://prerender.io/blog/how-to-fix-index-bloating-seo/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Wed, 28 Feb 2024 10:52:07 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[indexing]]></category>
<category><![CDATA[seo]]></category>
<category><![CDATA[technical seo]]></category>
<guid isPermaLink="false">https://prerender.io/?p=3703</guid>
<description><![CDATA[As an SEO professional, you may have spent a lot of time optimizing new content but still fail to meet your ranking potential. It can be because you ignore one invisible threat: index bloating.]]></description>
<content:encoded><![CDATA[ <div data-elementor-type="wp-post" data-elementor-id="3703" class="elementor elementor-3703" data-elementor-post-type="post">
<div class="elementor-element elementor-element-fe781d7 e-flex e-con-boxed e-con e-parent" data-id="fe781d7" data-element_type="container" data-e-type="container" data-settings="{"jet_parallax_layout_list":[]}">
<div class="e-con-inner">
<div class="elementor-element elementor-element-fc67d2b elementor-widget elementor-widget-text-editor" data-id="fc67d2b" data-element_type="widget" data-e-type="widget" data-widget_type="text-editor.default">
<div class="elementor-widget-container">
<p>As an SEO professional, you may have spent a lot of time optimizing new content but still fail to meet your ranking potential. It can be because you ignore one invisible threat: <strong>index bloating</strong>.</p>
<p>Index bloating presents challenges for both search engines and website owners. It complicates search engine algorithms’ task of identifying valuable content, leading to fewer site crawls. Additionally, it buries high-quality content under less valuable pages, lowering the visibility and overall rank potential of your site.</p>
<p>Fortunately, solving index bloating is easy. In this article, we’ll look at what index bloating is, how it affects crawl budgets, and explore practical ways to solve this issue to improve your website’s online visibility.</p>
<h2 class="wp-block-heading">What is Index Bloating?</h2>
<p>Index bloating occurs when your website gets overwhelmed with low-quality indexed pages, forcing Google&#8217;s crawlers to spend valuable time combing through these instead of focusing on more important pages.</p>
<p>This bloating wastes precious crawl budget (the limited resources search engines allocate to crawl a website), making indexing less efficient. Consequently, it can negatively impact your technical SEO scores, rankings, and user experience.</p>
<p><strong>Related</strong>: Follow our <a href="https://prerender.io/resources/free-downloads/white-papers/crawl-budget-guide/">crawl budget optimization guide</a> to speed up the Google indexing process.</p>
<p>Simply put, index bloating happens when the quantity of pages in your indexing list massively outpaces the quality and usefulness of those pages.</p>
<p>Index bloating is especially common on e-commerce websites that house hundreds of thousands of products, categories, and customer reviews. Generally, your site’s index can become bloated due to various reasons:</p>
<ul class="wp-block-list">
<li><strong>Thin content:</strong> Pages with short or low-quality content provide low value to users and may be regarded as poor quality by search engines. However, they can still be indexed, especially if they are automatically generated or are extras from site updates.</li>
<li><strong>Duplicate or near-duplicate content:</strong> When numerous pages on your website have identical or extremely similar information published across multiple URLs, search engines may index them all, resulting in duplicate content issues.</li>
<li><strong>Faceted Navigation and parameters:</strong> E-commerce sites and other platforms with faceted navigation often generate numerous URL variations based on filters, sorting options, etc., resulting in many near-duplicate pages that are indexed unnecessarily.</li>
<li><strong>Media pages:</strong> Excessive image galleries and video collections with poor metadata.</li>
<li><strong>Tag Pages and archives:</strong> While these pages serve organizational purposes, they may not offer unique value and can contribute to index bloating if not properly managed.</li>
<li><strong>Missing robots.txt file:</strong> The robots.txt file is a text file located at the root of a website&#8217;s domain that tells web crawlers which pages they should and should not crawl. When it is missing, search engine bots may crawl (and subsequently index) pages that should not be included, resulting in index bloating.</li>
</ul>
<p><strong>Related</strong>: Learn 6 best practices for <a href="https://prerender.io/blog/robots-txt-and-seo/">optimizing robots.txt files to enhance SEO performance</a>.</p>
<h2 class="wp-block-heading">How Index Bloating Affects SEO Performance</h2>
<p>With over <a href="https://www.forbes.com/advisor/in/business/software/website-statistics/#:~:text=Show%20less-,General%20Website%20Statistics,200%20million%20websites%20are%20active.">1.13 billion websites online</a>, search engines have a limited “crawl budget” for each website, meaning they can only visit and process a certain number of pages within a timeframe. </p>
<p>With index bloat, your site&#8217;s important pages may be crawled but not indexed, and once your budget is depleted, the indexing process stops. This delays your content from showing up on SERPs, potentially hurting your website rankings and lowering conversion rates. </p>
<p>Besides the crawl budget limitations, Google is known to only index a certain number of pages from your website. This leaves valuable content unexplored and potentially underexposed. A high-quality page that typically gets 7,000 visits may only get 2,500 if Google indexes the undesirable pages competing for the same traffic.</p>
<p>Index bloating can also decrease click-through rates (CTR) and create a poor user experience (UX). When searchers are presented with pages from a bloated index, they have to sift through more low-quality results to find what they want, leading to more bounces and fewer clicks on your pages. Over time, this drives down your CTR, causing Google to trust your site less and rank it lower. </p>
<p>Here’s an overview of how index bloating impacts SEO health:</p>
<ul class="wp-block-list">
<li>Wasting valuable crawl budget on pages that bring nothing to your business growth</li>
<li>Hurting rankings, lowering traffic, and ultimately decreasing conversion rates</li>
<li>Decreasing CTR and generating poor UX</li>
</ul>
<p>In short, index bloating dramatically slows down your SEO progress while subtly diluting the power of your best content. It is like trying to get out of quicksand; it drags you down at every step.</p>
<h2 class="wp-block-heading">How to Identify Index Bloating</h2>
<p>To know if your website has index bloating, you have to assess the total number of indexed pages on your website. You can do this by going to Google Search Console and checking the <a href="https://support.google.com/webmasters/answer/7440203?hl=en">Index Coverage Report.</a> </p>
<p>The report provides key index coverage insights by displaying:</p>
<ul class="wp-block-list">
<li>The total number of your web pages Google has included in its search results database</li>
<li>The current indexing status of those pages, and</li>
<li>Crawl activity detailing whether Google’s bots have visited each URL to assess its content. </li>
</ul>
<p>Compare the number of &#8220;Valid&#8221; pages to the number of pages you want indexed and submitted in your sitemap. If you find a significant difference, you likely have index bloating. Additionally, monitor overall crawl activity and watch for unexpected spikes that might suggest excessive crawling of low-quality pages.</p>
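<p>Conceptually, that comparison is just set arithmetic. The sketch below assumes you have exported both lists; all URLs shown are illustrative:</p>

```python
# Sketch: compare sitemap URLs against URLs reported as indexed
# (both lists are illustrative; export yours from your sitemap and Search Console).
sitemap_urls = {
    "https://example.com/",
    "https://example.com/products/",
    "https://example.com/about/",
}
indexed_urls = {
    "https://example.com/",
    "https://example.com/products/",
    "https://example.com/products/?sessionid=42",   # bloat: parameter variant
    "https://example.com/tag/misc/",                # bloat: thin tag page
}

bloat = indexed_urls - sitemap_urls       # indexed but not wanted
missing = sitemap_urls - indexed_urls     # wanted but not yet indexed
```

<p>Anything in the first set is a pruning candidate; anything in the second is content the bloat may be crowding out.</p>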
<h2 class="wp-block-heading">6 Proven SEO Methods to Fix Index Bloating</h2>
<p>You now understand how to identify pages that cause index bloat. The next and most important step is to solve it. Here are 6 ways to effectively do it.</p>
<h3 class="wp-block-heading">1. Conduct An Index Audit</h3>
<p>Dig into Search Console and Google Analytics to classify the value of indexed pages. Sort into:</p>
<ul class="wp-block-list">
<li>Cornerstone content to keep</li>
<li>Middling fluff to beef up or consolidate</li>
<li>Useless zombie pages to axe or redirect</li>
</ul>
<p>Segmenting pages in this manner highlights consolidation and pruning opportunities, allowing legacy content equity and ongoing link flow to get efficiently transferred to areas of your site best serving user needs. This process will also reveal site architecture gaps that need new content allocation.</p>
<h3 class="wp-block-heading">2. Remove Internal Links</h3>
<p>Examine the internal linking configuration of your website and pinpoint pages that are low quality, redundant, or no longer necessary. Remove internal links to these pages to prevent search engine crawlers from discovering and indexing them. Prioritize directing internal link equity to important pages to improve their indexing and ranking potential. </p>
<h3 class="wp-block-heading">3. Set up the Proper HTTP Status Code</h3>
<p>To enhance site authority and <a href="https://prerender.io/blog/fix-404-errors-on-spas/">reduce 404 errors</a>, remove thin-content pages by redirecting them with a 301 redirect to relevant content on the site. This preserves backlink value and minimizes errors. Return an HTTP status code of 410 (Gone) for content that is permanently removed, so search engines drop it from their indexes quickly.</p>
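<p>On an Apache server, for example, both cases are one-line mod_alias rules (the paths here are illustrative):</p>

```
# .htaccess -- illustrative paths
# Thin page folded into a stronger one: 301 preserves backlink value
Redirect 301 /old-thin-page/ /consolidated-guide/
# Content that is gone for good: 410 gets it dropped from the index faster
Redirect gone /discontinued-product/
```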
<h3 class="wp-block-heading">4. Set up Proper Canonical Tags</h3>
<p>Search engines like Google prioritize the URL declared by a canonical tag in the page&#8217;s head section (<code>&lt;link rel="canonical" href="URL of the original page"&gt;</code>) for indexing. This not only prevents duplicate pages from being indexed but also consolidates link equity and redirects it to the main page.</p>
<h3 class="wp-block-heading">5. Update the robots.txt File to “Disallow”</h3>
<p>The robots.txt file instructs search engines on which pages to crawl or avoid. By selectively using the &#8220;Disallow&#8221; directive in this file, you can stop Google from crawling certain pages. This blocks unwanted pages from entering the index waiting list in the first place, allows the removal of pages as a group, and helps free up the crawl budget. </p>
<p>However, blocking pages via robots.txt doesn&#8217;t always remove them from Google&#8217;s index, especially if they&#8217;re already indexed or linked internally. To completely disable indexing, use a &#8220;noindex&#8221; meta tag in the page&#8217;s head section, and keep the page crawlable so Google can see the tag.</p>
<h3 class="wp-block-heading">6. Use the URL Removals Tool in Google Search Console</h3>
<p>If you’re certain that certain pages were incorrectly indexed and should not appear in search results, use <a href="https://search.google.com/search-console/about">Google Search Console’s URL</a> removal tool (or similar tools for other search engines) to request their removal from the index.</p>
<h2 class="wp-block-heading">Minimizing Index Bloating and Crawl Budget Spent With Prerender</h2>
<p>Due to limited crawl budgets for websites, it’s crucial to guide crawlers towards high-value pages first. However, many modern sites use complex JavaScript rendering to show dynamic content, causing bots to index placeholder pages that lack useful content, leading to index bloating.</p>
<p>To optimize crawl budgets and <a href="https://prerender.io/blog/10-ways-to-get-your-website-indexed-faster/" target="_blank" rel="noopener">improve JS site indexing for Google</a>, you need to incorporate <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">JavaScript SEO</a> practices. Although Google has improved JS indexing, delays still exist and can lead to index bloating if pages are crawled prematurely.</p>
<p>That’s where Prerender comes in.</p>
<p><a href="https://prerender.io/">Prerender</a> is an efficient solution to JavaScript SEO and index bloating issues. It generates static HTML versions of your dynamic content and serves them to search engines. This way, Google&#8217;s crawlers no longer need to wait for pages to load or handle the complexities of rendering JavaScript. </p>
<p>As a result, your pages get indexed faster and perfectly all the time, boosting your SEO performance and visibility. If you’d like that for your website, <a href="https://prerender.io/pricing/">you can start with Prerender now</a> to enjoy all its benefits.</p>
<h2 class="wp-block-heading">Prevent Index Bloating For Smooth Google Indexing Process </h2>
<p>At the end of the day, index bloating subtly sabotages sites through a thousand small cuts that drag down performance. What may seem like SEO victories actually undermine your crawling capacity and ability to outrank competitors.</p>
<p>If your website has been around for a while, it’s best to do a <a href="https://prerender.io/blog/best-technical-seo-tools/">full website audit and maintenance check</a> yearly. Go through all your pages with a fine-tooth comb – are they still relevant, helpful, and up-to-date? Or are some of them outdated, thin, or duplicative?</p>
<p>To prevent index bloating and other JavaScript SEO problems obstructing your content from ranking high, adopt Prerender. We have helped 2.7 billion web pages to be crawled 20x faster. </p>
<p><a href="https://prerender.io/pricing/">Sign up today and get 1,000 FREE renders to get started!</a></p>
</div>
</div>
<div class="elementor-element elementor-element-3c38a90 elementor-tablet-align-center elementor-align-center elementor-widget elementor-widget-button" data-id="3c38a90" data-element_type="widget" data-e-type="widget" data-widget_type="button.default">
<div class="elementor-widget-container">
<div class="elementor-button-wrapper">
<a class="elementor-button elementor-button-link elementor-size-sm" href="https://prerender.io/pricing/">
<span class="elementor-button-content-wrapper">
<span class="elementor-button-text">Try Prerender for Free</span>
</span>
</a>
</div>
</div>
</div>
</div>
</div>
</div>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/how-to-fix-index-bloating-seo/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>How to Find Pages That Deplete Your Crawl Budget (SEO Guide)</title>
<link>https://prerender.io/blog/enterprise-guide-to-finding-pages-that-deplete-your-crawl-budget/</link>
<comments>https://prerender.io/blog/enterprise-guide-to-finding-pages-that-deplete-your-crawl-budget/#respond</comments>
<dc:creator><![CDATA[Leo Rodriguez]]></dc:creator>
<pubDate>Mon, 29 Jan 2024 13:52:00 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[Enterprise SEO]]></category>
<category><![CDATA[crawl budget]]></category>
<category><![CDATA[dynamic rendering]]></category>
<category><![CDATA[enterprise seo]]></category>
<category><![CDATA[technical seo]]></category>
<guid isPermaLink="false">https://prerender.io/?p=3132</guid>
<description><![CDATA[Learn about the types of pages depleting your crawl budget. ]]></description>
<content:encoded><![CDATA[ <div data-elementor-type="wp-post" data-elementor-id="3132" class="elementor elementor-3132" data-elementor-post-type="post">
<div class="elementor-element elementor-element-4189c2a8 e-flex e-con-boxed e-con e-parent" data-id="4189c2a8" data-element_type="container" data-e-type="container" data-settings="{"jet_parallax_layout_list":[]}">
<div class="e-con-inner">
<div class="elementor-element elementor-element-60a114db elementor-widget elementor-widget-text-editor" data-id="60a114db" data-element_type="widget" data-e-type="widget" data-widget_type="text-editor.default">
<div class="elementor-widget-container">
<p><span style="font-weight: 400;">Enterprise-level websites often struggle with crawl budget management, which hinders search engines like Google from timely indexing new content. Effective crawl budget optimization is, therefore, essential to ensure your pages appear in search results.</span></p>
<p><span style="font-weight: 400;">This crawl budget guide will teach you how to efficiently identify and optimize pages that consume a lot of crawl budget within your large-scale website. By pinpointing these resource-intensive pages, you can allocate a crawl budget more effectively and improve your website’s overall search engine visibility and SEO rankings.</span></p>
<h2 class="wp-block-heading">5 Most Common Pages That Deplete Your Crawl Budget</h2>
<p><span style="font-weight: 400;">Here are five types of pages that can easily drain your crawl budget.</span></p>
<h3 class="wp-block-heading">1. Soft-404 Errors</h3>
<p>Soft 404 errors occur when a page no longer exists (so it should return a 404 Not Found status) but appears to exist because the server returns a 200 OK status code or displays content that shouldn&#8217;t be there anymore.</p>
<p>This error is a mismatch between what the search engine expects and what the server returns.</p>
<p>It can also happen when a page returns a 200 status code but the content is blank; search engines interpret this as a (soft) 404 error.</p>
<p>Finding these pages is actually quite simple because Google Search Console has a specific <a href="https://developers.google.com/search/blog/2010/06/crawl-errors-now-reports-soft-404s#:~:text=A%20soft%20404%20occurs%20when,of%20pages%20with%20unique%20content.">soft 404 error report.</a></p>
<figure class="wp-block-image is-resized"><img loading="lazy" decoding="async" style="width: 840px; height: 668px;" src="https://lh5.googleusercontent.com/9jI8GyGB9u6bRkTwbEf6Z0QWjTvaJEjEb9fsxfisKYI-piPLGHJOx1shOPb9BLKExTrNACA2n_BFmucL-OX4xwiW1gr3uo9m9N52DZrB-Ol8Cj6iie5xfanc3iPkFVS7LKTJh9Xvudk6QdcxeEXXLug" alt="Why Pages Aren't Indexed" width="840" height="668" /></figure>
<p></p>
<p>Although crawlers will eventually stop crawling pages with a “Not found (404)” status code, soft 404 pages will keep wasting your crawl budget until the error is fixed.</p>
<p></p>
<p><strong>Note:</strong> It’s a good idea to check your “Not found (404)” report and make sure that these pages are returning the correct status code, just to be sure.</p>
<p></p>
<p>For a more detailed step-by-step guide, follow our <a href="https://prerender.io/blog/soft-404/">tutorial on how to fix soft 404 errors</a>.</p>
<p></p>
<h3 class="wp-block-heading">2. Redirection Chains</h3>
<p></p>
<p>A redirect chain is created when Page A redirects to Page B, which then redirects to Page C, forcing Googlebot through multiple hops before it reaches the final URL. In the worst case, Page C redirects back to Page A, forming an infinite loop that traps Googlebot and depletes your crawl budget.</p>
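<p>To make the mechanics concrete, here’s a minimal sketch that walks a redirect map (abstracted to a dict for illustration) and distinguishes a single redirect from a chain or a loop. The helper name and return labels are ours:</p>

```python
def trace_redirects(redirect_map, start, max_hops=10):
    """Follow redirects in a {url: target} map and classify the path.

    Returns the hop path plus one of: "ok" (at most one redirect),
    "chain" (multiple hops), "loop" (a URL repeats), or
    "too-many-hops" (gave up after max_hops).
    """
    path = [start]
    seen = {start}
    url = start
    while url in redirect_map:
        url = redirect_map[url]
        if url in seen:
            return path, "loop"
        seen.add(url)
        path.append(url)
        if len(path) > max_hops:
            return path, "too-many-hops"
    return path, ("ok" if len(path) <= 2 else "chain")
```

<p>Crawlers like Screaming Frog do essentially this at scale, following each URL’s <code>Location</code> header until it resolves or loops.</p>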
<p></p>
<p>These chains usually happen during site migrations or large technical changes, so it’s more common than you might think.</p>
<p></p>
<p>To quickly find redirect chains on your site, crawl your site with Screaming Frog and export the redirect chain report.</p>
<p></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh6.googleusercontent.com/T0s2CxgmuXq9-g7tHWEzUdss4byKYSJ9X0W_CmGb2vpqcs0Pmnfkx4sXJfyWVxpFy0bHD-cuUsfkNkgJvU6QoQp4uo87yQgdXUVkrDkS-oOtnXtWT-AtV6l1OXU7AoDN_jz0mdWZvku8rMxLcWXzGfk" alt="Screaming Frog Audit - Dashboard" /></figure>
<p></p>
<p>However, crawling your entire site at once would demand a lot of storage and processing power from your machine.</p>
<p></p>
<p>The best approach is to crawl your site using the sections you already created. This will take some of the pressure off your machine as well as reduce the amount of time it would take to crawl millions of URLs in one go.</p>
<p></p>
<p><strong>Note:</strong> You have to remember that Screaming Frog will also crawl resources like JS files and CSS, taking more time and resources from your machine.</p>
<p></p>
<p>To submit a list for Screaming Frog to crawl, switch to List Mode.</p>
<p></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh5.googleusercontent.com/Jg52q4g0VLdu-oTc1RVGuRIMvcyoYZVbrv6PqmbgkxfPXkYgzypYDRQrTQiXnuwgvm2WXPWC2No1talKZjW3RfO4chg9ez5EMLFCj5CDoJFNXDvnWLd0NGiyaUjyurMPBO7v8-q-izYU1Q17cWWY_xw" alt="Screaming Frog Crawl Data" /></figure>
<p></p>
<p>Follow our tutorial on <a href="https://prerender.io/blog/do-redirect-chains-hurt-seo/">fixing redirect chains</a> for clear step-by-step instructions.</p>
<p></p>
<h3 class="wp-block-heading">3. Duplicate Content</h3>
<p></p>
<p><a href="https://developers.google.com/search/docs/advanced/guidelines/duplicate-content">According to Google</a>, “duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content in the same language or are appreciably similar.”</p>
<p></p>
<p>These pages can take many forms like faceted navigation, several URL versions being available, misimplemented Hreflang alternate URLs for localization, or just generic category templates showing almost the same content.</p>
<p></p>
<p>Because Google perceives them as essentially the same page, it will crawl all of these pages but won’t index them; in the worst case, it may treat them as spam, hurting your overall SEO.</p>
<p></p>
<p>In terms of crawl budget, all of these pages are taking resources away from the main pages on your site, wasting precious crawling time. This gets even worse if the pages flagged as duplicate content require JavaScript rendering, which can make crawling them take up to 9x longer.</p>
<p></p>
<p>The best way to find duplicate pages is by checking Google Search Console’s “duplicate without user-selected canonical” report, which lists all the pages Google assumes are duplicated pages.</p>
<p></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh5.googleusercontent.com/oeAXAIzy0nsSrXajbXf6UG-qNFmBR82cmivx6tbebH1u57_k5AnMlEHM6olk5mqNtHkTJCYS7X1Zmkk6rgv1MtUuUSyahub0NSsqhXYB8qvQgUdaMcU5pNeMuzQ4nqm9AxBb7LH93Y7xZ8OekivZZJQ" alt="Google Search Console Feedback " /></figure>
<p></p>
<p>You can contrast this information with Screaming Frog by crawling your site locally and then checking the “near duplicate” report.</p>
<p></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh6.googleusercontent.com/P12GsWsrqGEDreIB0BSGd8FKAJvM6tp1vLU68EWOFgVC7TvdGaYas3chEb_WLrgxjFbZFxXXmgz8v_WzXr4E-6U4xAjQCGLXj4G6PokKM-vXQMc8u3BGNNyqNBYmbmRlqTAECiwKcPlNsT-nJs9d7kE" alt="Duplicate content audit in Screaming Frog" /></figure>
<p></p>
<p>Check our guide on <a href="https://prerender.io/blog/how-to-fix-duplicate-content-issues/">fixing duplicate content issues</a> for a step-by-step tutorial.</p>
<p></p>
<h3 class="wp-block-heading">4. Slow or Unresponsive URLs</h3>
<p></p>
<p>If we understand crawl budget as the number of HTTP requests Googlebot can send to your website (pages crawled) in a specific period of time (crawling session) without overwhelming your site (crawl capacity limit), then we have to assume page speed plays a big role in crawl budget efficiency.</p>
<p></p>
<p>From the moment Google requests your page to the moment it finally renders and processes the page, we can say it is still crawling that URL. The longer your page takes to render, the more crawl budget it consumes.</p>
<p></p>
<p>There are two main strategies that you can use to identify slow pages:</p>
<h4>A.) Connect PageSpeed Insights with Screaming Frog</h4>
<p></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>Screaming Frog allows you to connect several SEO tools APIs to gather more information from the URLs you’re crawling.</p>
<p>Top tip: Not a fan of Screaming Frog? Discover <a href="https://prerender.io/blog/screaming-frog-alternative/">10 paid and free alternatives to Screaming Frog</a> to support your technical SEO auditing needs.</p>
<p><!-- /wp:paragraph --><!-- wp:image --></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh4.googleusercontent.com/ch2KIkYymGYAWaRE9i6JrteP6rDC5jwxqZo0wExqXUDujgljBr25bsfMNeCeBiuygWcgPLS4wP1S_JPq1asQHtrOFZ0__m3giosRuZm9qe4MibllyDjJB6HZ6C13itoJSslQ6c621y4L2kTsyFxScYc" alt="PageSpeed Insights" /></figure>
<p><!-- /wp:image --><!-- wp:paragraph --></p>
<p>Connecting PageSpeed Insights will provide you with all the performance information at a larger scale and directly on the same crawl report.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>If you’ve already crawled your site section, don’t worry: <a href="https://prerender.io/blog/screaming-frog-tips/#:~:text=use%20the%20request%20data%20api%20function">Screaming Frog allows you to request data from the APIs you’ve connected retroactively</a>, saving you the pain of crawling thousands of URLs again.</p>
<p><!-- /wp:paragraph --><!-- wp:image --></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh4.googleusercontent.com/9nfMLPWelnJcJ8Yb9pOdvSzYp3jZHqDJVeavDPl71egZO8NSdfcaEtQm49nc6mdgXY1XAdE6dGbs1UCwJwwOluN9zki3KdfZrysMbSMwzyh0SFzxamdviMsu9oXQa4a0UlvO1C9RKVHpqpRxJHsYyvE" alt="Example of how to request API data in Screaming Frog" /></figure>
<h4>B.) Identify Slow Pages Through Log File Analysis</h4>
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>Log files contain real-life data from your server activity, including the time it took for your server to deliver your page to search engines.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p><span style="font-weight: 400;">The two metrics you want to pay attention to are:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;"><strong>Average Bytes</strong> – the average size, in bytes, of all log events for the URL.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;"><strong>Average Response Time (ms)</strong> – the average time, in milliseconds, taken to download the URL.</span></li>
</ul>
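<p>If you parse raw access logs yourself, computing those two averages per URL is straightforward. This sketch assumes you have already extracted each log event into a <code>(url, bytes_sent, response_ms)</code> tuple; the function name is illustrative:</p>

```python
from collections import defaultdict

def average_log_metrics(events):
    """Compute Average Bytes and Average Response Time (ms) per URL.

    `events` is an iterable of (url, bytes_sent, response_ms) tuples,
    e.g. extracted from server access-log lines.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # url -> [hits, bytes, ms]
    for url, size, ms in events:
        t = totals[url]
        t[0] += 1   # count the hit
        t[1] += size
        t[2] += ms
    return {url: (b / n, ms / n) for url, (n, b, ms) in totals.items()}
```

<p>Sorting the result by average response time surfaces your slowest URLs first.</p>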
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>As a benchmark, <a href="https://prerender.io/blog/page-speed-tips-and-statistics/">your pages should completely load in two seconds or less</a> for a good user experience, and Googlebot will stop crawling any URL whose server response takes longer than two seconds.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Here’s a <a href="https://prerender.io/blog/7-best-practices-for-creating-mobile-friendly-javascript-sites/">complete guide to page speed optimization for enterprise sites</a> you can follow to start improving your page performance on both mobile and desktop.</p>
<p><!-- /wp:paragraph --><!-- wp:heading {"level":3} --></p>
<h3 class="wp-block-heading">5. Pages with No Search Value in Your Sitemap</h3>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>Your sitemap acts as a priority list for Google. This file is used to tell Google about the most important pages on your site that require its attention.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>However, in many cases, developers and SEOs tend to list all URLs within the sitemap, even those without any relevancy for search.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>It’s important to remember that your sitemap won’t replace a well-planned site structure. For example, sitemaps won’t solve issues like orphan pages (URLs without any internal links pointing to them).</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Also, just because a URL is in your sitemap doesn’t mean it’ll get crawled. So, you have to be mindful about what pages you’re adding to the file, or you could be taking resources away from your priority pages.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Here are the pages you want to identify and remove from your sitemap:</p>
<p><!-- /wp:paragraph --><!-- wp:list --></p>
<h4>I.) Pages Tagged as Noindex</h4>
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>This tag tells Google not to index the URL (it won’t show in search results), but Google will still use the crawl budget to request and render that page before dropping it due to the tag.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>If you’re adding this URL to your sitemap, you’re wasting crawl budget on pages that have no search relevance.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>To find them, you can crawl your site using Screaming Frog and go to the directive report:</p>
<p><!-- /wp:paragraph --><!-- wp:image --></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh6.googleusercontent.com/9UBGq-5wXt1ujzhz2ysK-D7urdZR6swO-_sQ9ubidrEgCfIDofqGBtnTd2n6oxsHESuBpePkoZrSMTenn4sawHkbhZw0H9u9ZpMgrpsT5G_4gn_wc_8-52gIaePBTvq0eSvmwGA0uuPeawU6TP0Hx5k" alt="Noindex report in Screaming Frog" /></figure>
<p><!-- /wp:image --><!-- wp:paragraph --></p>
<p>You can also find these pages in the Search Console’s indexing report.</p>
<p><!-- /wp:paragraph --><!-- wp:image --></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh4.googleusercontent.com/_mbXMWmcu5Zb-WoGbOqYlyeQxEaxBcubqB97ZWMJqPW4ggwcGbFMAycEo2yar98srIZGLF07CRwzJjKgAAap1dcntkqJ2YCcSwsxxpPRFNcuqRXeEvdYf_FqI1Wjtq3nWUlmzVbB5MhKX8bBRl_WSaw" alt="Page indexing feedback in Google Search Console" /></figure>
<p><!-- /wp:image --><!-- wp:list --></p>
<h4>II.) Landing Pages and Content Without Ranking Potential</h4>
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>There are landing pages and content that just won’t rank, and from millions of URLs, you are bound to have a few of those.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>It doesn’t mean there’s no value in having them indexed. It just means there’s no value in having them discovered through the sitemap.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Your sitemap should be a curated list of highly valuable pages for search. Those that move the needle in traffic and conversions.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>A great way to find underperforming pages at scale is by connecting your Google Analytics and Search Console directly to Screaming Frog. This will allow you to spot pages with zero rankings quickly.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>You can also connect Ahrefs’ API to get backlink data on the URLs you’re analyzing.</p>
<p><!-- /wp:paragraph --><!-- wp:list --></p>
<h4>III.) Error Pages, Faceted Navigation, and Session Identifiers</h4>
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>It goes without saying, but you also need to clean your sitemap of any pages returning 4XX and 5XX status codes, URLs with filters, session identifiers, canonicalized URLs, etc.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p><strong>Note:</strong> You don’t need to block canonicalized pages through the robots.txt file.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>For example, adding the entire pagination of a product category is not valuable. You only need the first page, and Google will find the rest while crawling it.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>You’ve already found most of these pages in the previous steps. Now check if any of them are in your sitemap and get them out.</p>
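<p>That cleanup can be sketched as a simple filter. Given a list of sitemap URLs and the status codes you’ve collected for them, dropping error pages and parameterized URLs might look like this (the helper and its heuristics are illustrative, not a complete rule set):</p>

```python
from urllib.parse import urlparse, parse_qs

def clean_sitemap(urls, status_by_url):
    """Drop URLs that shouldn't be in a sitemap: non-200 responses
    and URLs carrying query parameters (filters, session IDs)."""
    keep = []
    for url in urls:
        if status_by_url.get(url, 0) != 200:
            continue  # drop 4XX/5XX and URLs with unknown status
        if parse_qs(urlparse(url).query):
            continue  # drop faceted-navigation / session-ID URLs
        keep.append(url)
    return keep
```

<p>Noindexed and canonicalized URLs would need extra inputs (the page’s meta robots and canonical tags), but the filtering pattern is the same.</p>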
<p><!-- /wp:paragraph --><!-- wp:heading --></p>
<h2 class="wp-block-heading">Challenges of Crawl Budget Optimization for Enterprise Websites</h2>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>Although enterprise sites must follow the same best practices as any other type of website, <a href="https://prerender.io/blog/what-is-enterprise-seo-with-suggested-tools-to-use/#:~:text=what%20makes%20enterprise%20seo%20different">enterprise sites represent a unique challenge</a> because of their large scale and complex structure. To ensure we’re on the same page, here are the three main challenges you’ll encounter during your enterprise crawl budget optimization process:</p>
<p><!-- /wp:paragraph --><!-- wp:list --></p>
<h3>A High Number of URLs</h3>
<p>One common denominator among enterprise sites is their size. In most cases, enterprise websites have over 1M URLs, making it hard for any team to stay on top of every single one of them and easy to overlook URLs or make mistakes.</p>
<p><!-- /wp:list-item --><!-- wp:list-item --></p>
<h3>Deadlines and Resources</h3>
<p>Something that’s very common in this kind of project is having unrealistic deadlines with limited resources: time, people, and tools. Crawl budget optimization is a highly technical and collaborative endeavor; to manage it successfully, teams need to automate certain tasks with tools, rely on different experts like writers, SEOs, and developers, and have enough time to implement changes safely.</p>
<p><!-- /wp:list-item --><!-- wp:list-item --></p>
<h3>Implementation and Testing</h3>
<p>Connected to the previous point, teams usually don’t have enough time for testing, which can have terrible results if your teams have to rush through implementation. Because of the complex structure of your site, even a small change might affect hundreds or thousands of URLs, creating issues like rank drops.</p>
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:paragraph --></p>
<p>That said, making changes to your site shouldn’t be scary as long as you develop a solid plan, provide the right resources (including time, tools, and experts), and accept that testing is a necessary part of the process to avoid losing traffic and conversions.</p>
<p><!-- /wp:paragraph --><!-- wp:heading --></p>
<h2 class="wp-block-heading">3 Tips to Plan Your Enterprise Crawl Budget Audit</h2>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>Every plan has to be tailor-made to fit your own organization, business needs, and site structure, so there’s no one-size-fits-all strategy you can find online.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Still, there are three simple tips you can use to make your crawl budget audit easier to manage and implement:</p>
<p><!-- /wp:paragraph --><!-- wp:heading {"level":3} --></p>
<h3 class="wp-block-heading">1. Find a Logical Way to Break Down Your Site</h3>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>You can group similar pages following your website structure (the hierarchical order in which pages link to each other in relation to the homepage), creating individual sections that are easier to understand and evaluate.</p>
<p><!-- /wp:paragraph --><!-- wp:image --></p>
<figure class="wp-block-image"><img decoding="async" src="https://lh4.googleusercontent.com/NNdQ2djjXti2PicAkORf82wTwBxqgRpJCnT9_6dAJ5135_uM04tTie3oeyc_NV5F2lm5_YzIf2RINlXrqsq9MqseUnmfXZElQJmwpKPTk-qo3b3LT_yyGAj-LWsqLkqRknWS8jb71nQOyuMyr5d63fM" alt="Web Structure Example" /></figure>
<p><!-- /wp:image --><!-- wp:paragraph --></p>
<p>Because every website has a unique structure, there isn’t a single best way to separate every site. Nevertheless, you can use any of the following ideas to get started:</p>
<p><!-- /wp:paragraph --><!-- wp:list --></p>
<ul>
<li><strong>Group your pages by type</strong>—not all pages are built equally, but there are groups of pages sharing a similar approach. For example, your main landing pages might use the same template across all categories, so evaluating these pages together will give you insights into a bigger portion of your site faster than analyzing them one by one.</li>
</ul>
<p><!-- /wp:list-item --><!-- wp:list-item --></p>
<ul>
<li><strong>Group your pages by cluster</strong>—another way to do this is to identify topic clusters. For example, you might have category pages, landing pages, and content pages talking about similar topics and, most likely, linking to each other, so it makes sense for a team to focus on the whole.</li>
</ul>
<p><!-- /wp:list-item --><!-- wp:list-item --></p>
<ul>
<li><strong>Group your pages by hierarchy—</strong>as you can see in the image above, your structure has a “natural” flow based on how you’ve built your website. By creating a graphical representation of your pages, you can separate the different sections based on this structure. However, for this to work, your website needs a clear structure; if it doesn’t have one, it’s better to go with either of the previous ideas.</li>
</ul>
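<p>Whichever grouping you choose, you can bootstrap it programmatically. This sketch groups URLs by their first path segment as a rough proxy for site section (a simplification; real hierarchies may need sitemap or internal-link data to split correctly):</p>

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_section(urls):
    """Group URLs by their first path segment, a rough proxy for
    site section, so each group can be audited separately."""
    groups = defaultdict(list)
    for url in urls:
        first_segment = urlparse(url).path.strip("/").split("/")[0]
        groups[first_segment or "root"].append(url)
    return dict(groups)
```

<p>Each resulting group can then be exported as a URL list and crawled separately (e.g. via Screaming Frog’s List Mode).</p>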
<p><!-- /wp:list-item --></p>
<p><!-- /wp:list --><!-- wp:heading {"level":3} --></p>
<h3 class="wp-block-heading">2. Delegate Each Site Section to an SEO Team</h3>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>Once you know how you want to divide your website, it’s time to get all URLs into a central spreadsheet and then allocate each site section to a particular team – you can copy all the URLs assigned to a team into a different spreadsheet.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>When we talk about teams, we don’t necessarily mean having three different experts per section of your page—although that would be ideal.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Instead, you’ll want to have one SEO professional for each section of the site do the audit. These professionals will provide all the guidelines, assign tasks, and oversee the entire project.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>For technical optimization like page speed, you’ll want one or two dedicated developers. This will ensure that you’re not taking time away from the product.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Lastly, you’ll want at least two writers to help you merge content where needed.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>By creating these teams, you can have a smoother process that doesn’t take resources from other marketing areas or engineering. Because they’ll focus solely on this project, it can be managed faster.</p>
<p><!-- /wp:paragraph --><!-- wp:heading {"level":3} --></p>
<h3 class="wp-block-heading">3. Start Testing</h3>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>You can use a project management tool like Asana or Trello to create a centralized operation that allows data to flow throughout the project.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Once the first tasks are assigned (implementation tasks), it’s important to give a testing period to see how changes affect your rankings. In other words, instead of rolling all changes at the same time, it’s better to have a couple of months to take a small sample of URLs and make the changes.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>If the effects are negative, you’ll have the time to reverse the changes and study what happened without risking the entire site.</p>
<p><!-- /wp:paragraph --><!-- wp:heading --></p>
<h2 class="wp-block-heading">A Better Crawl Budget Management for Your Large-Scale Websites</h2>
<p><!-- /wp:heading --><!-- wp:paragraph --></p>
<p>Rendering has a direct impact on your crawl budget, and it’s arguably the biggest one. When Google finds a JS file within a page, it needs to add an extra rendering step to access the URL’s full content.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>This rendering step makes the crawling process take 9x longer than for static HTML pages and can deplete your crawl budget on just a few URLs, leaving the rest of your site undiscovered.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Furthermore, many things can go wrong during the rendering process, like timeouts or partially rendered content, which create all sorts of SEO issues like missing meta tags, missing content, thin pages, or URLs getting ignored.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>To find these pages, you’ll need to identify all URLs using JS to inject or modify the pages’ content. Most commonly, these are pages showing dynamic data, product listings, and dynamically generated content.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>However, if your website is built using a JavaScript framework like React, Angular, or Vue, you’ll experience this rendering issue across the entire site.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>The good news is that Prerender can solve these rendering issues at the root, increase your crawl budget, and speed up the crawling process with a simple plug-and-play installation.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Prerender crawls your pages and generates and caches a 100% index-ready snapshot – a completely rendered version – of your page.</p>
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>When a search engine requests your URL again, Prerender will deliver the snapshot in 0.03 seconds, taking the rendering process off Google’s shoulders and allowing it to crawl more pages faster.</p>
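<p>Conceptually, the middleware decision is simple: if the requester looks like a known crawler and a cached snapshot exists, serve the snapshot instead of the JS-heavy page. This toy sketch shows the idea only; it is not Prerender’s actual implementation, and the bot list is illustrative:</p>

```python
BOT_AGENTS = ("googlebot", "bingbot", "duckduckbot")  # illustrative list

def should_serve_snapshot(user_agent, path, snapshot_cache):
    """Return True when a request should get a pre-rendered snapshot:
    the user agent looks like a known crawler and a cached snapshot
    exists for the requested path."""
    ua = (user_agent or "").lower()
    is_bot = any(bot in ua for bot in BOT_AGENTS)
    return is_bot and path in snapshot_cache
```

<p>Regular visitors still get the normal JavaScript app; only crawlers are short-circuited to the static snapshot.</p>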
<p><!-- /wp:paragraph --><!-- wp:paragraph --></p>
<p>Explore more about crawl budget optimization:</p>
<ul>
<li><a href="https://prerender.io/blog/understanding-googles-15mb-crawl-limit/">Understanding Google’s 15MB Crawl Limit</a></li>
<li><a href="https://prerender.io/blog/how-to-optimize-your-crawl-budget-with-internal-links/">4 Ways to Optimize Your Crawl Budget with Internal Links</a></li>
<li><a href="https://prerender.io/blog/how-to-optimize-your-crawl-budget-with-internal-links/">How to Avoid Missing Content in Web Crawls</a></li>
<li><a href="https://prerender.io/blog/how-to-use-log-file-analysis-to-optimize-your-crawl-budget/">5 Ways to Use Log File Analysis to Optimize Your Crawl Budget</a></li>
</ul>
<p><!-- /wp:paragraph --></p> </div>
</div>
<div class="elementor-element elementor-element-6199d46 elementor-tablet-align-center elementor-align-center elementor-widget elementor-widget-button" data-id="6199d46" data-element_type="widget" data-e-type="widget" data-widget_type="button.default">
<div class="elementor-widget-container">
<div class="elementor-button-wrapper">
<a class="elementor-button elementor-button-link elementor-size-sm" href="https://prerender.io/pricing/">
<span class="elementor-button-content-wrapper">
<span class="elementor-button-text">Try Prerender for Free</span>
</span>
</a>
</div>
</div>
</div>
</div>
</div>
</div>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/enterprise-guide-to-finding-pages-that-deplete-your-crawl-budget/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>Understanding Google’s 15MB Crawl Limit</title>
<link>https://prerender.io/blog/understanding-googles-15mb-crawl-limit/</link>
<comments>https://prerender.io/blog/understanding-googles-15mb-crawl-limit/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Mon, 29 Jan 2024 13:46:07 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[dynamic rendering]]></category>
<category><![CDATA[prerendering]]></category>
<guid isPermaLink="false">https://prerender.io/?p=3122</guid>
<description><![CDATA[What does Google's 15MB crawl limit mean, and how to manage it.]]></description>
<content:encoded><![CDATA[
<p><a href="https://developers.google.com/search/blog/2022/06/googlebot-15mb">Google’s 15MB crawl limit</a> is a crucial factor to consider when optimizing your website.</p>
<p>But what exactly does this limit mean, and how does it affect your website’s performance in search results?</p>
<p>In this blog post, we’ll delve into these questions and more, providing you with a comprehensive understanding of Google’s 15MB crawl limit and how to effectively manage it.</p>
<h2 class="wp-block-heading">What is Google’s 15MB Crawl Limit?</h2>
<p>This limit refers to the maximum size of a web page that Googlebot will download and process for indexing. If a web page exceeds this limit, Googlebot will only process the first 15MB of the page.</p>
<p>It is applied on a per-resource basis, meaning each HTML file, JavaScript file, and CSS file on your site has its own 15MB limit. Despite each resource having its own limit, you still need to ensure your money-maker or “golden egg” content is considered. For <a href="https://prerender.io/benefits/bigger-crawl-budget/">JavaScript-heavy websites</a>, the impact is even bigger: JS files tend to be large and can quickly exceed the 15MB limit.</p>
<p>The primary reason for Google implementing this rule is to manage the resources used during the crawling and indexing process. While it helps Googlebot crawl and index the vast number of web pages on the internet, it’s not always a win for your site.</p>
<p>Note that this limit only applies to the content retrieved for the initial request made by Googlebot, not the referenced resources inside the page.</p>
<h2 class="wp-block-heading">Is the 15MB Limit the Same as Crawl Budget?</h2>
<p>The limit is separate from but related to Google’s <a href="https://prerender.io/blog/crawl-budget-seo/">crawl budget</a>. Your crawl budget refers to the number of pages Googlebot can crawl on your site within a certain timeframe. If a page is close to or exceeds the 15MB limit, Googlebot may use up more of your allocated crawl budget to download and process that page. This <strong>leaves fewer resources for crawling other pages</strong> on your site.</p>
<p>Related: <a href="https://prerender.io/blog/how-to-guide-googlebot-in-crawling-important-urls/">How to Guide Googlebot in Crawling Important URLs</a></p>
<p>Crawl budgets are key for indexing. Once a fetch exceeds the 15MB limit, all content after that point is dropped by Googlebot. Remember that this limit only applies to fetches made by Googlebot. Optimizing your crawl budget will be a determining factor in how much online visibility your website can achieve.</p>
<p>More Reading: <a href="https://prerender.io/resources/free-downloads/white-papers/crawl-budget-guide/">Crawl Budget Optimization Guide</a></p>
<p>The limit includes all resources that are embedded in the HTML of a webpage. This includes text content, HTML markup, CSS, and JavaScript. External resources like images and videos are <em>not</em> counted towards the 15MB limit <em>unless</em> their data is embedded directly in the HTML.</p>
<p>While images and videos are not counted towards the 15MB limit, large images and videos can still impact a page’s loading time, which can affect Googlebot’s ability to efficiently crawl the page.</p>
<p>Related: <a href="https://prerender.io/blog/page-speed-tips-and-statistics/">PageSpeed Explained</a></p>
<p>Is it possible to hit a 15MB file size for HTML pages? For most websites, it is not reasonable or necessary to have HTML pages that approach or exceed the 15MB limit; most web pages are far smaller. However, JavaScript-heavy websites or pages with large JS-based elements can exceed it.</p>
<h2 class="wp-block-heading">Strategies and Techniques to Avoid the 15MB Limit</h2>
<p>There are several strategies you can employ to avoid this limit. Some of them include optimizing your HTML, CSS, and JavaScript files to reduce their size, using external resources instead of embedding large amounts of data within your HTML, and implementing techniques like lazy loading for images and videos. Let’s go into detail.</p>
<h3 class="wp-block-heading">1. Server-Side Rendering (SSR)</h3>
<p>Server-side rendering (SSR) can be used to process JavaScript, serving crawlers a fully rendered HTML version. Additionally, server-side optimizations can include techniques like <a href="https://www.imperva.com/learn/performance/minification">code minification</a> and compression, which reduce the size of HTML, CSS, and JavaScript files.</p>
<p>However, it’s important to note that <strong>server-side rendering is not the optimal choice</strong> for every website.</p>
<p>SSR requires a significant amount of server resources. This can lead to increased server load and potentially slower response times, especially for websites with heavy traffic or complex JavaScript applications. Additionally, implementing SSR can be a costly, complex process that requires significant changes to your website’s architecture and codebase.</p>
<p>Dynamic rendering offers similar benefits at a fraction of the cost. A tool like <a href="https://docs.prerender.io/docs/how-does-prerender-work">Prerender</a>, for example, helps Google to easily crawl and index a website by generating a static HTML version of each page.</p>
<p>Related: <a href="https://prerender.io/blog/when-you-should-consider-dynamic-rendering/">When You Should Consider Dynamic Rendering</a></p>
<h3 class="wp-block-heading">2. Determining and Tracking Your Website’s Size</h3>
<p>You can determine the size of your website using various tools and techniques.</p>
<p>One common method is to use an auditing tool to crawl your site and provide information about the size of each page. You can also manually check the size of your HTML, CSS, and JavaScript files.</p>
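<p>As a quick illustration of that manual check, comparing a fetched resource’s size against Google’s documented cap might look like this (the function name and report shape are ours):</p>

```python
LIMIT_BYTES = 15 * 1024 * 1024  # Googlebot's documented per-resource cap

def check_crawl_limit(resource_bytes):
    """Report how a resource's size compares to the 15MB fetch limit."""
    size = len(resource_bytes)
    return {
        "size_bytes": size,
        "within_limit": size <= LIMIT_BYTES,
        "truncated_bytes": max(0, size - LIMIT_BYTES),
    }
```

<p>Run it over the raw bytes of each HTML, CSS, and JS file; anything with a nonzero <code>truncated_bytes</code> is content Googlebot would never see.</p>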
<p><a href="https://search.google.com/search-console/about">Google Search Console</a> provides detailed information about how Googlebot interacts with your site. Other tools like <a href="https://prerender.io/blog/screaming-frog-tips/">Screaming Frog</a> can mimic the behavior of web crawlers, allowing you to diagnose potential issues.</p>
<h3 class="wp-block-heading">3. Make Use of Embedded or Linked SVGs</h3>
<p><a href="https://www.freecodecamp.org/news/use-svg-images-in-css-html/">Including SVGs as image tags</a> can help manage the page’s size, as the data for the image is not embedded in the HTML. However, this can increase the number of HTTP requests the page makes, which can impact load time. The best approach depends on the needs and constraints of your website.</p>
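<p>For example, the same icon can be linked as an external file (kept out of the HTML payload) or embedded inline (no extra request); the file path below is illustrative:</p>

```html
<!-- Linked: the SVG data stays out of the HTML, at the cost of one HTTP request -->
<img src="/assets/logo.svg" alt="Site logo" width="48" height="48">

<!-- Embedded: no extra request, but the markup counts toward the page's size -->
<svg width="48" height="48" viewBox="0 0 48 48" role="img" aria-label="Site logo">
  <circle cx="24" cy="24" r="20" fill="currentColor"/>
</svg>
```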
<h2 class="wp-block-heading">Final Thoughts</h2>
<p>Beyond staying under the 15MB limit, optimizing your crawl budget helps ensure your most important pages get crawled and indexed by Google every time.</p>
<p>Struggling to get indexed? <a href="https://prerender.io/pricing/">Get started with 1,000 URLs free.</a></p>
<h2>FAQs on Crawl Limit</h2>
<p>1.) <strong>What is a crawl limit?</strong></p>
<p>Crawl limit refers to the number of pages on your website that a search engine bot (like Googlebot) is allowed to visit and process within a specific timeframe. It’s like a quota for Googlebot visits.</p>
<p>2.) <strong>How does Crawl limit affect my website’s SEO?</strong></p>
<p>Crawl limit can indirectly affect your website’s SEO in a few ways:</p>
<ul>
<li><strong>Limited Crawling, Limited Indexing:</strong> If Googlebot can’t crawl all your website’s pages due to crawl limit restrictions, important content might not get indexed. This means those pages wouldn’t appear in search results, potentially hurting your website’s visibility for relevant keywords.</li>
<li><strong>Prioritization:</strong> Google prioritizes crawling well-structured, high-quality content. If your website has crawl errors, slow loading speeds, or irrelevant content, Googlebot might spend its crawl budget on other websites, leaving your valuable content unindexed.</li>
</ul>
<p>3.) <strong>Is there a way to check my website’s crawl limit?</strong></p>
<p>No, there isn’t a way to directly check your website’s specific crawl limit set by Google. Crawl limit is an internal metric that Google uses to manage its crawling resources efficiently.</p>
<p>However, you can use tools and strategies to understand how Google interacts with your website and identify potential crawl limit issues.</p>
<p>4.) <strong>Can I increase my website’s crawl limit?</strong></p>
<p>While you can’t directly increase your website’s crawl limit set by Google, there are ways to indirectly influence how Googlebot allocates its crawl budget for your site. The key is website optimization:</p>
<ul>
<li><strong>Prioritize Important Pages:</strong> Ensure the most SEO-critical pages (product pages, service pages, blog posts) are well-structured, load quickly, and free of crawl errors. Googlebot is more likely to spend its crawl budget on these valuable pages.</li>
<li><strong>Optimize for Speed:</strong> Fast loading times encourage Googlebot to crawl more pages within its allotted time. Consider image optimization, caching mechanisms, and code minification to improve website speed.</li>
<li><strong>High-Quality Content:</strong> Create valuable, informative content that keeps users engaged. Google prioritizes crawling websites with fresh, relevant content that users find useful. Regularly update your website with new content and maintain existing content for accuracy.</li>
<li><strong>Fix Crawl Errors:</strong> Address any crawl errors identified by Google Search Console. Crawl errors can waste crawl budget on inaccessible pages. Fixing these errors ensures Googlebot spends its resources efficiently on valuable content.</li>
</ul>
</ul>
<p><strong>Explore more about crawl budget optimization:</strong></p>
<ul class="wp-block-list">
<li><a href="https://prerender.io/blog/how-to-optimize-your-crawl-budget-with-internal-links/">4 Ways to Optimize Your Crawl Budget with Internal Links</a></li>
<li><a href="https://prerender.io/blog/how-to-avoid-missing-content-in-web-crawls/">How to Avoid Missing Content in Web Crawls</a></li>
<li><a href="https://prerender.io/blog/how-to-use-log-file-analysis-to-optimize-your-crawl-budget/">5 Ways to Use Log File Analysis to Optimize Your Crawl Budget</a></li>
</ul>
<a href="https://prerender.io/pricing/">
Try Prerender for Free
</a>]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/understanding-googles-15mb-crawl-limit/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
<item>
<title>How to Optimize Single-Page Applications (SPAs) for SEO</title>
<link>https://prerender.io/blog/how-to-optimize-single-page-applications-spas-for-crawling-and-indexing/</link>
<comments>https://prerender.io/blog/how-to-optimize-single-page-applications-spas-for-crawling-and-indexing/#respond</comments>
<dc:creator><![CDATA[Prerender]]></dc:creator>
<pubDate>Mon, 18 Dec 2023 12:23:47 +0000</pubDate>
<category><![CDATA[Crawl Budget]]></category>
<category><![CDATA[crawling]]></category>
<category><![CDATA[indexing]]></category>
<category><![CDATA[javascript]]></category>
<category><![CDATA[single page applications]]></category>
<guid isPermaLink="false">https://prerender.io/?p=3494</guid>
<description><![CDATA[Why SPAs cause JavaScript SEO problems, and how to solve them.
]]></description>
<content:encoded><![CDATA[
<p>What are the best practices for optimizing single-page applications (SPAs) for SEO?</p>
<p>SPAs are notorious for not being SEO-friendly, causing many SPA websites to experience poor SEO performance and reduced content visibility on SERPs. <br><br>In this blog post, we dive deep into SEO for single-page applications: what they are, why they cause JavaScript SEO issues, and how to optimize SPAs for improved SEO and rankings. </p>
<h2 class="wp-block-heading">Why and How SPAs Affect SEO Performance</h2>
<p>Before we learn why SPAs can jeopardize your SEO efforts, let’s cover the basics: what is a single-page application (SPA)?</p>
<p>An SPA is a JavaScript-based web application that dynamically updates an existing page, as opposed to fetching complete new pages from the server. </p>
<p>Unlike traditional applications (common websites) that store separate pages as distinct HTML files, SPAs utilize <em>one</em> single-page template to render dynamic web pages through AJAX calls. Hence the name: <em>single</em>-page application.</p>
<p>Since an SPA uses a single-page template to house all your content, it eliminates the need for additional page loads after the initial loading process. Consequently, this contributes to a more seamless and responsive user experience (UX) and faster page loading time. However, this reliance on JavaScript comes at a cost. SPAs inherit the crawling and indexing challenges of JS, causing some SEO problems.</p>
<p>Pro tip: Learn <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">how Google and other search engines index JavaScript pages</a> in this blog.</p>
<h2 class="wp-block-heading">Why SPAs Cause JavaScript SEO Problems</h2>
<p>For a search engine to crawl a page, it must first discover the page, render and crawl the content to understand it, and then index it. </p>
<p>The problem with SPA indexing is that SPAs only display page content and elements through a dynamic API call. This means crawlers only see an empty container when they visit SPAs. And without any content to crawl, there’s nothing to index, and your page won’t show up on Google SERPs.</p>
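<p>In practice, the initial HTML a crawler receives from a client-side-rendered SPA often looks like this simplified sketch: an empty mount point, with every piece of content hidden behind the JavaScript bundle:</p>

```html
<!-- Simplified sketch: what a crawler initially receives from a
     client-side-rendered SPA (no content, just an empty mount point) -->
<body>
  <div id="app"></div>
  <script src="/static/js/bundle.js"></script>
</body>
```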
<p>Here is what an SPA page looks like from the user side:</p>
<figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/zdS9h7FrQ-O1I9nKGgwGQJS_XGUlyN4OC0FeN3RYdBvRliuJoxxG8iwdDz0v4yD7WOOZEg84cefP7McdAUOWiv45KJfgwSWPt6x-DLxPmHtJwOmaEinLWTa_YT9OZKqkvz_7UoYeRUQ8umVxKqeeRNk" alt="Example of a Single Page Application"/></figure>
<p>Here’s what Googlebot would see when accessing the page:</p>
<figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/rgG0k5qvE0-7skOTMJN34AFjdMhc5AEhpkiuY1f0RkjL9AYIk_2VTV59yEr_yDkeWoEbBI3F6MASOZ5KmZvg7fi5gAKcDEmfGBYWkqTvxD3nlq4gM6uBG0Q_WB4RGZ_7WuEVg8z9bVXAMYAA8b3RCUs" alt="Example of SPA from Google's perspective"/></figure>
<p>While SPAs have significant benefits, they won’t bring you traffic if you don’t know how to optimize them. Luckily, we’re here to show you exactly how to get your SPA pages fully crawled and indexed.</p>
<h2 class="wp-block-heading">10 Solutions to Improve the SEO Performance of Single-Page Applications (SPAs)</h2>
<h3 class="wp-block-heading">1. Server-Side Rendering (SSR)</h3>
<p>Server-side rendering<a href="https://prerender.io/blog/what-is-srr-and-why-do-you-need-to-know/"> (SSR)</a> is a rendering technique in which all rendering happens on the server before the page is sent to the client (browser). How does this affect SPA websites in terms of crawling and indexing?</p>
<p>With SSR, SPAs render their JavaScript files on the server. When search engine crawlers request a page, they receive a fully rendered <a href="https://prerender.io/blog/seo-for-static-vs-dynamic-webpages/">static HTML page</a>. This results in much faster loading times, as well as faster crawling and indexing. And since the browser gets the content quicker, your rankings can benefit too.</p>
<p>However, while SSR is a great way to optimize single-page application websites, there’s a significant reason why it’s not widely used: it’s expensive and difficult to implement. For SSR to work, you must invest around $120k upfront in servers, engineering hours, and expertise. Then there’s still the problem of <a href="https://prerender.io/blog/spa-javascript-seo-challenges-and-solutions/#:~:text=Although%20it%20technically,for%20the%20project.">scalability</a> and maintenance. SSR may be a good way to optimize an SPA, but it’s not a cure-all solution.</p>
<h3 class="wp-block-heading">2. Implement SEO-friendly URLs</h3>
<p>Another way to optimize SPAs is to implement SEO-friendly URLs. This method gives search engine crawlers clear paths through a structured, easily navigable hierarchy. As a result, the crawling process is streamlined, ensuring that search engines can efficiently explore all elements of your single-page application website.</p>
<p>To create SEO-friendly URLs, it is fundamental to set up your URL router properly. If your router operates in hash mode, it appends #hash fragments to your home page URL. This will cause crawlers to ignore different app views of your SPA because the crawlers see hashed URLs as different parts of the same page. </p>
<p>To achieve clean and SEO-friendly URLs while mitigating the risk of 404 errors, it’s essential to establish a fallback route on your server. This route redirects requests to your index.html page, where your app resides. Although this involves additional steps, popular JavaScript frameworks and libraries offer options for implementing this redirect.</p>
<p>A common mistake some SPAs make is using a single URL for everything on the app. This is a bad practice. When there’s just a single URL for everything, crawlers only see the home page and will not understand what the whole site is about. Therefore, you must treat views as URLs and change the URL anytime the app view changes. </p>
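<p>The history-mode idea can be sketched in a few lines of plain JavaScript (the helper name and <em>nav</em> parameter are illustrative; real routers such as Vue Router or React Router handle this for you). The point is that each view change calls <em>pushState</em> with a clean path instead of a #hash fragment:</p>

```javascript
// Sketch of history-mode navigation: every app view gets its own clean,
// crawlable URL via the History API instead of a #hash fragment.
// `nav` stands in for window.history so the helper stays testable.
function navigate(nav, path, render) {
  nav.pushState({ path }, "", path); // e.g. "/products/42", not "/#/products/42"
  render(path);                      // update the view client-side
}
```

<p>Remember that with clean URLs like these, your server needs the fallback route described above, so a direct request to a deep link serves index.html instead of returning a 404.</p>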
<h3 class="wp-block-heading">3. Dynamic Rendering with Prerender.io</h3>
<p><a href="https://prerender.io/blog/how-to-be-successful-with-dynamic-rendering-and-seo/">Dynamic rendering</a> provides a static HTML version of your pages for search engine bots and dynamic JavaScript content for human users. One of the top solutions available on the market is <a href="https://prerender.io/">Prerender.io</a>—a much more cost-effective alternative to options like in-house server-side rendering.</p>
<p>Since Prerender renders your website pages in advance, it’s able to <a href="https://prerender.io/benefits/nicer-user-experience/">deliver your pages within 0.03 seconds on average</a>, dramatically improving your server response time. Clients also see other SEO benefits from Prerender, including improved <a href="https://prerender.io/resources/free-downloads/white-papers/crawl-budget-guide/">crawl budget management</a>, better PageSpeed scores, and more organic traffic.</p>
<p>Check out the results from <a href="https://prerender.io/resources/case-studies/improved-pagespeed-and-boosting-page-indexing-to-optimize-webshop/">Haarshop—a brand with 35,000+ products—after installing Prerender</a>: </p>
<figure class="wp-block-image size-large"><a href="https://prerender.io/resources/case-studies/improved-pagespeed-and-boosting-page-indexing-to-optimize-webshop/"><img loading="lazy" decoding="async" width="1024" height="581" src="https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1-1024x581.png" alt="organic traffic levels from haircare webshop haarshop after installing prerender.io" class="wp-image-4835" srcset="https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1-1024x581.png 1024w, https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1-300x170.png 300w, https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1-768x436.png 768w, https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1-1536x872.png 1536w, https://prerender.io/wp-content/uploads/haarshop-traffic-before-after-prerender-1.png 1712w" sizes="(max-width: 1024px) 100vw, 1024px" /></a></figure>
<p><strong>Related</strong>: read on to see <a href="https://prerender.io/blog/how-prerender-renders-javascript-websites/">how Prerender works</a> or watch the video below.</p>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe title="Why Use Prerender.io? Improve Crawling and Indexing to See More Traffic" width="640" height="360" src="https://www.youtube.com/embed/xEPE_UaeW18?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>
<h3 class="wp-block-heading">4. Ensure Your SPAs are Mobile Friendly</h3>
<p>If your SPA isn’t mobile-friendly, you have a lower chance of ranking high on SERPs. Google prioritizes mobile-first indexing, which means it mainly looks at your site’s mobile version when deciding rankings. This means that if your SPA loads slowly and/or has unoptimized layouts and font sizes, it’s going to hurt both your SEO and user engagement.</p>
<p>So, how do you make sure your SPA is mobile-friendly?</p>
<p>Start with responsive design. Your site should adjust smoothly to any screen size, whether it’s a phone, tablet, or desktop. Luckily, frameworks like React, Vue.js, and Angular make this easier with flexible layouts and built-in tools for responsiveness, but we’ll talk about that later.</p>
<p>Then there’s navigation. Ever tapped a button on a site and accidentally clicked the wrong thing? Frustrating, right? That’s why it’s important to use touch-friendly buttons, properly spaced menu items, and an intuitive layout that works well on smaller screens.</p>
<p>Finally, don’t just assume everything works—test it. Use Google’s Mobile-Friendly Test and Lighthouse audits to see where your site might be falling short. Fixing any issues early on will keep visitors happy and help your rankings.</p>
<p>Check out our <a href="https://prerender.io/blog/how-to-make-a-website-mobile-friendly/">comprehensive ‘How to make a site mobile-friendly’ guide</a> to learn the details.</p>
<h3 class="wp-block-heading">5. Use Hreflang for Multi-Language SPAs</h3>
<p>If your SPA supports multiple languages but isn’t using hreflang, you might be confusing search engines and frustrating users. Visitors have to hunt for language-selection settings, and Google relies on hreflang tags to serve the right language version of your site to the right audience. Without them, your content might show up in the wrong language or, even worse, get flagged as duplicate content, which can hurt your SPA SEO.</p>
<p>So, how do you make sure Google understands your multi-language SPA? Start by giving each language its own unique URL. Even though SPAs rely on JavaScript, search engines still need clear, distinct URLs to index different versions properly. A good setup looks something like this:</p>
<ul class="wp-block-list">
<li>example.com/en/ for English</li>
<li>example.com/es/ for Spanish</li>
<li>example.com/fr/ for French</li>
</ul>
<p>Next, you need to actually <strong>add hreflang tags</strong> to tell Google which page belongs to which language. You can include these in the <em><</em><strong><em>head</em></strong><em>></em> section of your site or, for larger sites, in your XML sitemap. This helps Google understand that, for example, <strong>example.com/es/</strong> is the Spanish version of <strong>example.com/en/</strong>, so it won’t mix them up.</p>
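<p>Using the URL structure above, the hreflang annotations would look something like this (each language version should list all alternates, including itself, plus an x-default fallback):</p>

```html
<link rel="alternate" hreflang="en" href="https://example.com/en/">
<link rel="alternate" hreflang="es" href="https://example.com/es/">
<link rel="alternate" hreflang="fr" href="https://example.com/fr/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en/">
```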
<h3 class="wp-block-heading">6. Page Splitting and Lazy Loading Your SPAs</h3>
<p>Page splitting (also called code splitting) breaks your SPA into smaller chunks instead of forcing the browser to load everything upfront. This means your site loads faster, especially for first-time visitors. Most modern JavaScript frameworks—like React, Vue, and Angular—make this easy with built-in support for dynamic imports.</p>
<p>Lazy loading, on the other hand, defers images or other content on your SPA until users scroll to it. Instead of loading every single image, script, or video right away, lazy loading delays non-essential content until it’s actually needed. This speeds up page load times and saves bandwidth, making your site feel more responsive and boosting the overall SPA SEO health.</p>
<p>Here’s how to lazy load SPAs:</p>
<ul class="wp-block-list">
<li>Use dynamic imports in your code so that different parts of your app load only when a user actually needs them.</li>
<li>Lazy load images and videos with the <em>loading=”lazy”</em> attribute in HTML or use a JavaScript library like <em>lazysizes</em> to do the heavy lifting.</li>
<li>Prioritize above-the-fold content so that users see important elements immediately, while the rest loads in the background.</li>
</ul>
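<p>As a minimal sketch of the markup side (the helper name and the stand-in <em>doc</em> parameter are illustrative, not a specific library API):</p>

```javascript
// Sketch: create images with the native loading="lazy" attribute so the
// browser defers offscreen downloads until the user scrolls near them.
// `doc` stands in for `document` so the helper can run outside a browser.
function createLazyImage(doc, src, alt) {
  const img = doc.createElement("img");
  img.setAttribute("loading", "lazy");
  img.setAttribute("src", src);
  img.setAttribute("alt", alt);
  return img;
}
```

<p>Above-the-fold images should skip the attribute so they render immediately; reserve lazy loading for content below the first screen.</p>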
<h3 class="wp-block-heading">7. Use SEO-Friendly JavaScript Frameworks</h3>
<p>Not all JavaScript frameworks are built with SEO in mind. Some make it harder for search engines to crawl and index content, which can hurt your rankings. If your SPA relies heavily on JavaScript but isn’t optimized for SEO, you could be invisible to Google, no matter how great your content is.</p>
<p>To avoid this, use an SEO-friendly JavaScript framework or tweak your existing setup to improve crawlability. The best frameworks to use are: </p>
<ul class="wp-block-list">
<li><strong>Next.js (for React)</strong><br>Offers server-side rendering (SSR) and static site generation (SSG) out of the box, making it one of the best choices for SEO. It ensures that search engines can crawl and index content efficiently while improving page speed. Learn <a href="https://prerender.io/blog/reactjs-pros-and-cons/">how to optimize Next.js for SEO here</a>.<br></li>
<li><strong>AngularJS</strong><br>Since AngularJS relies heavily on client-side rendering, it’s not inherently SEO-friendly. However, AngularJS can be optimized using a prerendering tool like Prerender.io to ensure that search engines can properly index the content.<br></li>
<li><strong>Vue.js (with Nuxt.js)</strong><br>Nuxt.js extends Vue with built-in SSR and SSG capabilities, improving performance and making Vue-based SPAs more search-engine friendly. You can easily SEO-optimize any SPAs built with Vue.js by following this <a href="https://prerender.io/blog/vue-js-pros-and-cons/">VueJS SPA optimization tutorial</a>.</li>
</ul>
<h3 class="wp-block-heading">8. Schema Markup for SPA Rich Snippets</h3>
<p>If you want your SPAs to stand out in search results, schema markup is an excellent technique. Schema markup helps search engines understand your content better and can earn you rich snippets, such as product ratings, prices, and FAQs for popular queries. These not only make your listing more attractive but also boost click-through rates (CTR).</p>
<figure class="wp-block-image"><img decoding="async" src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXc5GII1ThbKu6u5FHZYnGE5VccIL9xbQodymSGzKIyqpkZ8Je2YkXHF2HvQTHPHSKvn8MsNbH4nGF--RE5q7OuH9UWTBIKC1PaSPKwmKv0cOne95l6t1saNQRD58uLc2HP5IIY84g?key=6D-a-ad4a1RlEWMGTctVLbaL" alt="rich snippet example using schema markup"/></figure>
<p>Additionally, Schema markup provides structured data that tells Google what the content hosted on your SPAs is about. This means search engines don’t have to guess, making it easier for them to index your pages accurately. Rich snippets can appear for various content types, including:</p>
<ul class="wp-block-list">
<li><strong>Articles and blogs</strong>: Enhances visibility with author names, dates, and images.</li>
<li><strong>Products and reviews</strong>: Displays ratings, prices, and availability directly in search results.</li>
<li><strong>FAQs and how-to guides</strong>: Helps you land in Google’s “People Also Ask” section.</li>
<li><strong>Events and recipes</strong>: Highlights event dates, locations, ingredients, and preparation steps.</li>
</ul>
<p>Because SPAs dynamically load content with JavaScript, you’ll need to ensure that search engines can detect the structured data properly. To do this, you can dynamically inject JSON-LD schema markup into the <em><</em><strong><em>head</em></strong><em>></em><em> </em>or append it to the DOM on load. If using Next.js or Nuxt.js, leverage server-side rendering (SSR) or pre-rendering to ensure crawlers can access structured data instantly.</p>
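<p>A minimal sketch of the injection step (the helper name, stand-in <em>doc</em> parameter, and Product payload are illustrative):</p>

```javascript
// Sketch: dynamically inject a JSON-LD script tag into the page head so
// crawlers that execute JavaScript can read the structured data.
// `doc` stands in for `document` so the helper can run outside a browser.
function injectJsonLd(doc, data) {
  const script = doc.createElement("script");
  script.type = "application/ld+json";
  script.textContent = JSON.stringify(data);
  doc.head.appendChild(script);
  return script;
}

// Illustrative Product payload following schema.org vocabulary
const productSchema = {
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": { "@type": "Offer", "price": "19.99", "priceCurrency": "USD" }
};
```

<p>Call the helper on each route change so the structured data always matches the currently displayed view.</p>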
<p><strong>Related</strong>: Which is better, building an <a href="http://prerender.io">in-house SSR or adopting Prerender.io</a> as your JavaScript rendering solution? Our blog has the answer.</p>
<h3 class="wp-block-heading">9. Use Canonical Tags to Avoid Duplicate Content Issues</h3>
<p>SPAs can be a mess when it comes to URLs. Because of how JavaScript handles routing, your site might generate multiple URLs for the same content—whether it’s through query parameters, session-based links, or dynamic filtering. This can confuse search engines, making them think you have duplicate content, which can hurt your SEO rankings.</p>
<p>A canonical tag (<em>rel=”canonical”</em>) tells search engines, <em>“Hey, this is the main version of this page—ignore the rest.”</em> It helps:</p>
<ul class="wp-block-list">
<li>Avoid duplicate content penalties</li>
<li>Make sure link juice (SEO value) isn’t spread across multiple versions of the same page</li>
<li>Tell search engines which URL to rank</li>
</ul>
<p>Since SPAs load content dynamically, canonical tags need to update as users navigate. Rather than hardcoding a single tag, you can use JavaScript to update it on each route change:</p>
<pre class="wp-block-code"><code>function updateCanonical(url) {
  let link = document.querySelector("link[rel='canonical']");
  if (!link) {
    link = document.createElement("link");
    link.rel = "canonical";
    document.head.appendChild(link);
  }
  link.href = url;
}

updateCanonical(window.location.href);</code></pre>
<h3 class="wp-block-heading">10. Use Server-Side Redirects Instead of Client-Side Redirects</h3>
<p>When your SPA relies on client-side redirects, you’re making search engines (and, by extension, users) work harder than they need to. Google expects proper server-side redirects, which ensure the right pages get indexed and ranked. If your redirects depend on JavaScript, search engines may fail to execute and follow them, leading to crawling issues, lost rankings, and slower page loads.</p>
<p>So, how do you make sure your SPA handles redirects the right way? Start with <a href="https://httpstatus.io/learn/client-side-vs-server-side-redirects">server-side redirects</a>. These ensure that when a page moves or is replaced, both search engines and users are seamlessly taken to the correct destination without delay. </p>
<p><strong><em>Pro tip: Most backend frameworks, including Node.js, Express, and Next.js, offer built-in support for proper redirect handling.</em></strong></p>
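<p>The server-side idea can be sketched framework-free (the URL mapping and function name are illustrative): the server answers with a 301 status and a Location header before any JavaScript runs, so crawlers never have to execute your app to find the new page.</p>

```javascript
// Sketch of a server-side 301 redirect decision, framework-free.
// The /old-page -> /new-page mapping is illustrative.
const REDIRECTS = { "/old-page": "/new-page" };

function resolveRedirect(path) {
  const target = REDIRECTS[path];
  // 301 = permanent redirect; search engines transfer ranking signals
  // to the target URL. Return null to fall through to SPA handling.
  return target ? { status: 301, headers: { Location: target } } : null;
}
```
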
<h2 class="wp-block-heading">Solve Your Single Page Application SEO Problems With Prerender</h2>
<p>By reducing round trips between the server and browser, SPAs can greatly enhance PageSpeed and UX. As the internet continues to evolve, SPAs will likely become one of the most popular types of web applications. </p>
<p>We’ve gone over important tips for optimizing SPAs in this blog. While all of these methods can yield significant SEO results, the most resource-efficient strategy is setting up a pre-built rendering system with Prerender. Trusted by 100,000+ brands worldwide and known for its <a href="https://prerender.io/blog/how-to-install-prerender/">easy installation</a>, our platform can help your SPA see better SEO within weeks.</p>
<p>Think it might be right for you? <a href="https://prerender.io/pricing/">Sign up today and get 1,000 free renders</a>.</p>
<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="wp-block-button"><a class="wp-block-button__link has-white-color has-text-color has-background has-link-color wp-element-button" href="https://auth.prerender.io/auth/realms/prerender/protocol/openid-connect/registrations?client_id=prerender-frontend&response_type=code&scope=openid%20email&redirect_uri=https://dashboard.prerender.io/integration-wizard&_gl=1*28g4m4*_gcl_aw*R0NMLjE3Mjg2NDU2NTIuQ2p3S0NBandsYnUyQmhBM0Vpd0EzeVh5dXd3WEFRaVZzc25NcWlLZF91eU5rT0ZBNThyM2lpVXUwR2N0R2RLejRyalJoN1NNOTM0THZSb0NpMTRRQXZEX0J3RQ..*_gcl_au*MzE2MDU0MDY5LjE3MzE1ODYxMjY.*_ga*MjAxMjc3MTgxOC4xNzIzODAxMTYz*_ga_5C99FX76HR*MTczNDUyODc3Ny4yOTIuMS4xNzM0NTM4NTIzLjYwLjAuMA.." style="background-color:#1f8511">Start For Free</a></div>
</div>
<h2 class="wp-block-heading">Further Reading on SEO for SPAs</h2>
<p>If you’d like to learn more about how to improve your SEO for your single-page application, take a look at these resources:</p>
<ul class="wp-block-list">
<li><a href="https://prerender.io/blog/fix-404-errors-on-spas/">How to Fix 404 Errors on SPAs</a></li>
<li><a href="https://prerender.io/blog/spa-javascript-seo-challenges-and-solutions/">SPA Survival Guide—JavaScript SEO Challenges and Solutions</a></li>
</ul>
<h2 class="wp-block-heading">FAQs on Single Page Applications (SPAs) Optimization for Healthier SEO</h2>
<p>Answering some common questions about SPA SEO, crawling and indexing in SPAs, and Prerender.io.</p>
<h3 class="wp-block-heading">1. Are There SEO Advantages to Using a SPA Over a Traditional Multi-Page Application?</h3>
<p>When implemented correctly and well-optimized for search, SPAs can offer several SEO advantages:</p>
<ul class="wp-block-list">
<li><strong>Faster perceived page loads after initial load</strong>, potentially improving user engagement metrics</li>
<li><strong>Easier implementation of dynamic content updates</strong> without full page reloads</li>
<li><strong>Improved UX</strong>, which can indirectly benefit SEO through better engagement metrics</li>
<li>And the <strong>potential for better mobile performance </strong></li>
</ul>
<p>However, these <em>only</em> apply if the SPA is well-optimized for search engines. Without proper optimization, traditional multi-page apps can have a leg up regarding crawlability and indexability. The key is to leverage the benefits of SPAs while ensuring they’re fully accessible to search engine bots.</p>
<h3 class="wp-block-heading">2. How Do Search Engines Handle JavaScript in SPAs?</h3>
<p>While search engines have improved their ability to render JavaScript, they still face challenges with complex SPAs. Search engines rely on HTML for indexing, but JavaScript-heavy SPAs load content dynamically, making it harder for crawlers to access key information. Googlebot processes JavaScript in two waves—initial HTML parsing and delayed JS rendering—but if rendering fails or times out, content may go undiscovered.</p>
<h3 class="wp-block-heading">3. What Tools Can I Use to Test and Debug SPA SEO Issues?</h3>
<p>You can use the following tools to test and solve SPA SEO issues:</p>
<ul class="wp-block-list">
<li>Google Search Console: shows indexing issues and lets you see how Googlebot views your pages.</li>
<li>Lighthouse: helps audit SEO performance and JavaScript execution.</li>
<li>Google’s Mobile-Friendly and Rich Results tests: ensure your content works well on mobile and supports structured data.</li>
<li>Prerender.io: helps JavaScript-heavy SPAs generate static HTML versions to make crawling easier.</li>
<li>Screaming Frog: its SEO spider can mimic a search bot to spot missing content.</li>
</ul>
<h3 class="wp-block-heading">4. Which Rendering Approach is Best for SPA SEO: Client-Side Rendering, Server-Side Rendering, or Pre-rendering?</h3>
<p>If you have a single-page application (SPA) and want the best SEO results without the complexity of server-side rendering, pre-rendering with Prerender.io is the way to go. Unlike client-side rendering (CSR), which can leave search engines struggling to index JavaScript-heavy content, Prerender.io generates static HTML snapshots that search bots can easily crawl. It delivers the benefits of SSR without the extra development overhead, making it a simple, scalable, and effective SEO solution for SPAs.<br>See a detailed comparison of <a href="http://prerender.io">Prerender.io vs. SSR vs. CSR</a>.</p>
]]></content:encoded>
<wfw:commentRss>https://prerender.io/blog/how-to-optimize-single-page-applications-spas-for-crawling-and-indexing/feed/</wfw:commentRss>
<slash:comments>0</slash:comments>
</item>
</channel>
</rss>