Improving Search with rigorous testing

Our goal is always to provide you with the most useful and relevant information. Every change we make to Search is aimed at improving the usefulness of the results you see.

[Illustration: a world map with pins on every continent, each showing a different avatar]


Testing for usefulness

Search has changed over the years to meet the evolving needs and expectations of the people who use Google. From innovations like the Knowledge Graph to updates that keep our systems highlighting relevant content, our goal is always to improve the usefulness of your results. That is why, while advertisers can pay to be displayed in clearly marked ad sections, no one can buy better placement in the Search results.

We put every proposed change to Search through a rigorous evaluation process to analyze metrics and decide whether to implement it. Data from these evaluations and experiments goes through a thorough review by experienced engineers and search analysts, as well as legal and privacy experts, who then determine whether the change is approved to launch. In 2023, we ran over 700,000 experiments that resulted in more than 4,000 improvements to Search.

We evaluate Search in multiple ways. In 2023, we ran:

4,781 launches

Every proposed change to Search goes through a review by our most experienced engineers and data scientists, who carefully analyze the data from all the different experiments to decide whether the change is approved to launch. If we can't show that a change actually makes things better for people, we don't launch it.
[Illustration: three potential changes rated good, neutral, and poor]
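
To make that decision criterion concrete, here is a minimal, hypothetical sketch of the kind of check an analyst might run on experiment metrics: a two-proportion z-test asking whether a proposed change beats the control on some success measure. The metric, sample sizes, and threshold are illustrative assumptions, not Google's actual launch criteria.

```python
# A minimal, hypothetical sketch (not Google's actual launch criteria):
# a two-proportion z-test asking whether a proposed change performs
# measurably better than the control on some success metric.
import math

def shows_improvement(successes_control: int, n_control: int,
                      successes_treatment: int, n_treatment: int,
                      z_threshold: float = 1.96) -> bool:
    """Return True if the treatment's success rate is significantly
    higher than the control's at roughly the 95% confidence level."""
    p_control = successes_control / n_control
    p_treatment = successes_treatment / n_treatment
    # Pooled proportion under the null hypothesis of "no difference".
    pooled = (successes_control + successes_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_treatment - p_control) / std_err
    return z > z_threshold  # no clear improvement means no launch

# Illustrative numbers: 10,200 of 20,000 control queries rated "good"
# versus 10,600 of 20,000 with the proposed change enabled.
print(shows_improvement(10_200, 20_000, 10_600, 20_000))  # True
```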

16,871 live traffic experiments

We conduct live traffic experiments to see how real people interact with a feature before launching it to everyone.
[Illustration: a page comparison with graphs showing rates of user interaction]
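
As an illustration of the general technique, the sketch below shows how a live traffic experiment might deterministically divert a small, fixed slice of traffic to a new feature, so that the same anonymous ID always sees the same variant. The diversion rate, the ID scheme, and the function names are assumptions for illustration, not Google's actual mechanism.

```python
# A hypothetical sketch of traffic diversion for a live experiment.
# The 1% diversion rate, the ID scheme, and the function names are
# illustrative assumptions, not Google's actual mechanism.
import hashlib

def in_experiment(anonymous_id: str, experiment_name: str,
                  diversion_rate: float = 0.01) -> bool:
    """Deterministically assign a fixed fraction of traffic to the
    experiment arm, so the same ID always sees the same variant."""
    digest = hashlib.sha256(f"{experiment_name}:{anonymous_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # maps the hash into [0, 1]
    return bucket < diversion_rate

# The serving side would then log interactions (clicks, query refinements,
# time to result) separately for each arm so analysts can compare them.
arm = "experiment" if in_experiment("session-123", "new-snippet-layout") else "control"
print(arm)
```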

719,326 search quality tests

[Illustration: search results with a checklist indicating high-quality results]
We work with external Search Quality Raters to measure the quality of Search results on an ongoing basis. Raters assess how well website content fulfills a search request, and evaluate the quality of results based on the expertise, authoritativeness, and trustworthiness of the content. These ratings do not directly impact ranking, but they do help us benchmark the quality of our results and make sure they meet a high bar all around the world.

To ensure a consistent approach, we publish Search Quality Rater Guidelines that give Raters guidance and examples for rating websites appropriately. While evaluating the quality of search results might sound simple, there are many tricky cases to think through, so this feedback is critical to ensuring we maintain high-quality results for users.
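
As a rough illustration of how ratings can become a benchmark, the sketch below averages per-result scores from several independent Raters into a single quality number that can be tracked over time. The 0-to-1 scale and the aggregation are assumptions for illustration; the real rating tasks are defined in the published guidelines.

```python
# A rough, hypothetical sketch of turning Rater scores into a benchmark.
# The 0-to-1 scale and the aggregation are illustrative assumptions; the
# real rating tasks are defined in the Search Quality Rater Guidelines.
from statistics import mean

# Ratings from three independent Raters for each (query, result) pair.
ratings = {
    ("best hiking boots", "result-a"): [0.8, 1.0, 0.8],
    ("best hiking boots", "result-b"): [0.4, 0.6, 0.4],
    ("python csv tutorial", "result-c"): [1.0, 0.8, 1.0],
}

# Average across Raters per result, then across results, yielding one
# number that can be tracked over time and compared across locales.
per_result = [mean(scores) for scores in ratings.values()]
print(f"quality benchmark: {mean(per_result):.2f}")  # 0.76
```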

Surfacing Google Search results

No one at Google hand-picks or arranges individual results on Search. Google Search automatically surfaces the most useful and reliable content for a given query. We process billions of searches per day, and automation is the only way to handle that immense scale. These systems consider many factors, including the words in your query, the content of pages, the expertise of sources, and your language and location.
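
To give a flavor of what "considering many factors" could look like, here is a deliberately simplified sketch that combines a few per-page signals into one score and sorts by it. The signal names and weights are invented for illustration; real ranking systems weigh far more signals in far more sophisticated ways.

```python
# A deliberately simplified sketch of automated ranking: combine a few
# per-page signals into one score and sort by it. The signal names and
# weights are invented for illustration, not Google's actual signals.
from dataclasses import dataclass

@dataclass
class PageSignals:
    url: str
    query_term_match: float   # how well the page's words match the query
    source_expertise: float   # assessed expertise of the source
    locale_relevance: float   # fit to the searcher's language and location

WEIGHTS = {"query_term_match": 0.5, "source_expertise": 0.3, "locale_relevance": 0.2}

def score(page: PageSignals) -> float:
    return (WEIGHTS["query_term_match"] * page.query_term_match
            + WEIGHTS["source_expertise"] * page.source_expertise
            + WEIGHTS["locale_relevance"] * page.locale_relevance)

pages = [
    PageSignals("example.com/a", 0.9, 0.6, 0.8),
    PageSignals("example.com/b", 0.7, 0.9, 0.9),
]
for page in sorted(pages, key=score, reverse=True):
    print(f"{score(page):.2f}  {page.url}")
```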

When we identify places where we are not delivering high-quality results, we investigate to understand the broader issue, and we take a scalable approach that improves results not just for one query but for many like it.

We are constantly improving Google Search. In limited and well-defined situations, we may bring in humans to manually block policy-violating or illegal content. You can learn more about our policies that apply to Google Search.

124,942 side-by-side experiments

Search isn't static. We're constantly improving our systems to return better results, and Search Quality Raters play an important role in the launch process. In a side-by-side experiment, we show Raters two different sets of Search results: one with the proposed change implemented and one without. We ask them which results they prefer and why.
[Illustration: a Search Quality Rater comparing two styles of Search results and picking the better option]
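
As a simple illustration, the sketch below tallies hypothetical side-by-side votes into a preference rate for the proposed change, setting aside "no preference" votes. The vote data and the 60% threshold are illustrative assumptions, not Google's actual analysis.

```python
# A hypothetical sketch of tallying a side-by-side experiment. Each vote
# records which result set a Rater preferred; the data and the 60%
# threshold are illustrative assumptions, not Google's actual analysis.
from collections import Counter

votes = ["proposed", "proposed", "control", "proposed",
         "no_preference", "proposed", "control", "proposed"]

tally = Counter(votes)
decided = tally["proposed"] + tally["control"]  # set aside "no preference"
win_rate = tally["proposed"] / decided

print(f"proposed change preferred in {win_rate:.0%} of decided comparisons")
if win_rate > 0.6:  # a clear majority counts as evidence in favor
    print("evidence favors the proposed change")
```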