Render Budget, or: How I Stopped Worrying and Learned to Render Server-Side

Full client-side rendering in SPAs can still hinder the searchability of a site, especially for large-scale web services. By paying attention to 'render budget', owners of large-scale web services can help Google index their pages.

Hi! This is Kazushi from JADE K.K.

5 years ago, when I was a Googler, I wrote a blog article titled “Understanding Web Pages Better”. We announced that Google Search started to render web pages using JavaScript, which was exciting news back then. Pages that weren’t previously indexable started to show up in the search results, improving the searchability of the web by a considerable margin. We started by rendering a small part of the index and gradually scaled it up, finally letting us deprecate the longstanding AJAX crawling scheme.

Renderbot-kun
Renderbot-kun

5 years on, many people now take it for granted that Google renders web pages by executing JavaScript. It has evolved a ton thanks to the hard work of great engineering teams at Google; it’s a result that I’m personally very proud of. Unfortunately, this doesn’t mean anyone can use JavaScript to create websites that rely entirely on client-side rendering and expect Google to index all their pages.

Recently, Martin Splitt from Google’s Webmaster Trends Analyst (WTA) team released an article titled Understand the JavaScript SEO basics, which provides a great explanation of how Google’s rendering processes are set up. I’ve refrained from talking publicly about these points after leaving Google. Still, now that Google has officially released detailed information about the system, I’d like to make a few quick comments about the setup.

tl;dr: Full client-side rendering can still hinder a site's searchability, especially for large-scale web services.

Why? Martin's diagram is self-explanatory: rendering is designed as an additional step before indexing happens.

https://developers.google.com/static/search/docs/images/googlebot-crawl-render-index.png

As the diagram suggests, if Google notices that a page needs to be rendered, it will be added to the Render Queue to wait for the Renderer to process it and return rendered HTML. Google needs to judge the importance of the page's content before actually seeing what the content is. Even though it hasn’t rendered the page yet, it needs to develop a sophisticated guess of how much value will be added to the index by rendering the content.

This rendering problem resembles the crawling problem in structure. In crawling, Google also needs to make an educated guess about the importance of the page it discovered on the web before completing the crawl.

What does this mean for webmasters? We should know that Render Budget exists, just like Crawl Budget.

Gary Illyes, also a Google WTA, wrote an article titled “What Crawl Budget Means for Googlebot” in 2017. It’s a really good article; you should read it if you haven’t already. Shortly, Google decides the crawl priority of pages based on site popularity, content freshness, and server resources.

In crawling, the biggest bottleneck is usually the server resource on the websites being crawled. Google must make good predictions on how much load the host can tolerate and decide how fast it crawls the site, striking a good balance between user needs and update frequency. This is a very sophisticated system that the great crawl engineering teams have built, even though it receives little attention from external parties.

On the other hand, the biggest bottleneck in rendering is the server resources on the Google side. Of course, there are new resources like JavaScript and JSON files that Google needs to crawl, which will add to the ‘crawl budget’. However, while some resources can be cached to minimize the crawl, pages that need rendering must be rendered from scratch every time. Even Google can’t use its computing resources freely to have an index with fully-rendered, fully-fresh pages from the entire Web.

Google must decide which pages to render, when, and how much. Like the crawl budget, it uses signals to compute priorities. In this case, the most important ones are the site's popularity and the number of URLs that need rendering.

If a site is popular and needs a lot of rendering, you’d assign more resources. If no rendering is needed, no resources are allocated. If a website needs a lot of rendering but is not very popular, Google will assign some resources to render some of the pages deemed necessary within the website. Sounds all-natural, right?

Let’s apply this to a scenario where you’ve created a new single-page application that fully renders on the client side. You return templated HTML and send content separately in JSON files. How much of a render budget will you have assigned from Google? There are a lot of cases where the site doesn’t have crawlable URLs (you need to give permanent URLs to states you want to be crawled!), but even if you did it right, it's quite difficult for Google to figure out how important your new site is for the users. Most likely, it’ll slowly give you a budget to test the waters while trying to gather signals about your site. As a result, not all of your URLs will be indexed at a first pass, and it may take a very long time for it to be fully indexed.

“My site has been in the index for a long time, and Google knows how popular it is”, you say? Does that mean it’s OK to entirely revamp the site to render it client-side? No. If you’ve been returning HTML that doesn’t need a lot of rendering, then suddenly changed all of your URLs to render client-side, Google needs to recrawl AND render all URLs. Both crawl budget and render budget impact your site negatively in this setup. As a result, it might take months, even over a year, to get all of your URLs to the index after the site revamp.

In conclusion, client-side rendering in large-scale websites is still unrealistic if you want your URLs to be indexed efficiently. The ideal setup is to just return plain HTML that doesn’t need any JavaScript execution. Rendering is an additional step that Google supports - but that means it’s forcing Google to make extra efforts to understand your content. It’s time- and resource-consuming, which can slow down the indexing process. Suppose you implement dynamic rendering, returning server-rendered content to Googlebot while making users render client-side. In that case, it will become easier for Google to process your site, so you’re just making your servers do the work for Google.

In the end, returning static, simple HTML continues to be the speediest and most accurate if you want your site indexed fast and accurately at a massive scale. Client-side rendering can hinder your website's performance in the SERPS, especially for news sites that need to be indexed fast or large-scale web platforms that produce hundreds of thousands of new URLs daily.

Google has been working hard to recognise JavaScript pages, making it years ahead of other search engines in understanding today’s web. It’s working great even looking at it from an external perspective, up to mid-scale websites. It’s guaranteed to evolve even more.

However, the current system isn’t perfect—just like any engineered system. Where Google fails, webmasters need to chime in for help from the other side. If your service produces hundreds of thousands of URLs daily, then you probably want to pay attention to these details and proceed with caution.

All participants of the web ecosystem can contribute to enabling a world where information is delivered swiftly to the users who need it. Let’s keep building a better web for a better world!