BERT Algorithm Update

By Chris Yamamoto

On October 25, 2019, Google announced a major algorithm update: BERT. No, not the unibrowed Muppet of our collective childhood. BERT stands for Bidirectional Encoder Representations from Transformers; it’s an open-source, neural-network-based technique for natural language processing.

Neural Network What Now?

In layman’s terms: Google just overhauled the way its search algorithm understands language and search queries, using machine learning.

Come again… again? Okay. Consider a simple example: how many of you, when hungry, have searched Google for “food naer me”? It seems simple, but Google has to work out that by ‘food’ you mean restaurants and fast-food joints, and that ‘naer me’ is a misspelling of ‘near me,’ which really means ‘please access my location and find restaurants close by.’ Got it? Wait, that’s not all.

On top of this, Google also has to NOT misidentify the question. You don’t mean: which food items are close to the word ‘me,’ like ‘meat’ or ‘lime.’ And you ALSO don’t mean: food near Maine (state code ME).
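To make that concrete, the spelling half of the job can be sketched with simple fuzzy matching. This is a toy illustration only: the vocabulary and the difflib-based matcher below are stand-ins, not how Google actually does it:

```python
import difflib

# Toy vocabulary; a real engine would use a massive dictionary plus
# query logs, context, and location signals.
VOCAB = ["food", "near", "me", "restaurants", "maine"]

def correct_word(word):
    """Snap a word to its closest vocabulary entry, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), VOCAB, n=1, cutoff=0.6)
    return matches[0] if matches else word

def correct_query(query):
    """Correct a query one word at a time: spelling only, no understanding."""
    return " ".join(correct_word(w) for w in query.split())

fixed = correct_query("food naer me")  # "food near me"
```

The hard part Google handles, and this toy doesn’t, is knowing that the corrected phrase means “restaurants near my location” rather than anything to do with Maine.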

This is why understanding language and search queries matters, and the impact of BERT on SEO is massive. In fact, Google says this shift will affect about 1 in 10 searches (English-language queries in the US, at launch).

Decoding Human Language

People often misunderstand each other over text. It’s difficult, without exceptional writing skills, to match intention with vocabulary every time. Often we lose ourselves in our words, and the tone in which others read our message changes its meaning, misconstruing the original intention. Whether that’s the fault of the reader or the writer is up for debate.

What’s not up for debate is that Google’s search algorithm is surprisingly good at interpreting what we want from what we search.

  • We search “I have the cold” and Google responds with WebMD pages and home remedy articles.
  • We search “Can we live on mars” and Google doesn’t give you nutrition articles about subsisting on Mars candy bars—it gives you articles about surviving the rough conditions of a planet with almost no atmosphere.

Here’s something crazy to zap into your noggin: Billions of search queries hit Google daily. And about 15% of those queries are considered novel, written and searched for the first time. 

This raises the question: When Google is hit with a string of words it has never seen before, how does it still produce quality results?

Mimicking the Brain: How Machines Read and Understand Search Queries

The key is understanding how the human mind reads and parses sentences, then building a machine-learning algorithm that mimics the process. Take the following misspelled sentence; most people won’t have trouble deciphering what it says:

  • Tihs is a setnence that you can raed even thouhg wrods are misspeled.

A machine algorithm can mimic this easily: one by one, it corrects each word in the sentence, then runs the corrected phrase as the search query. Easy enough, but this is just spelling. What happens when you break conventional grammar and produce a sentence like this:

  • Sentence hard to read but still can understand perfectly fine.

What’s your brain doing when you read that sentence? It groups words and connects them to create plausibility. “Sentence hard to read” and “still can understand perfectly fine” are connected by the transitional word “but”; this tells the brain there are two separate ideas that are related. “Hard to read” refers to the sentence, while “perfectly fine” refers to the ability to understand it.

But how would a machine algorithm know to do this? Why wouldn’t it just as easily flip the descriptor and the subject? Before BERT, there was no guarantee Google’s algorithm would parse such a sentence correctly.

The BERT Difference

Okay, the groundwork is laid. Now let’s discuss what BERT does and what it fixes. According to Google:

“[BERT] was the result of Google research on transformers: models that process words in relation to all the other words in a sentence, rather than one-by-one in order. BERT models can, therefore, consider the full context of a word by looking at the words that come before and after it—particularly useful for understanding the intent behind search queries.”

Instead of just running the search query from front to back, there is now a mapping element: each word is compared and grouped with every other word in the sentence, so intention can be deduced from the relationships of words and phrases to the rest of the query.
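That mapping element is the attention mechanism at the heart of transformers. Here’s a stripped-down, single-pass sketch with toy two-dimensional word vectors; real models add learned projections, many attention heads, and hundreds of dimensions:

```python
import math

def softmax(scores):
    # Stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """One attention pass: each word's vector becomes a blend of every
    word's vector, weighted by dot-product similarity."""
    outputs = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) for key in vectors]
        weights = softmax(scores)
        blended = [sum(w * vec[dim] for w, vec in zip(weights, vectors))
                   for dim in range(len(query))]
        outputs.append(blended)
    return outputs

# Toy 2-D "embeddings" for a three-word query: the first two words are
# similar to each other, the third is unrelated.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
out = self_attention(vecs)
```

Each output row is a weighted blend of every input vector, so the representation of each word now carries information about all the words around it—before and after—rather than being processed one-by-one in order.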

What BERT Looks Like in Practice

Google provided examples of how BERT would positively impact search queries on its platform. Take these searches, for example: 

  • “2019 brazil traveler to usa need a visa.”
    • Before BERT: Google had a hard time registering the direction of travel. “Brazil,” “traveler,” and “usa” all indicate travel plans between the two countries, but what’s easily missed is the weight we humans put on the word “to,” which signals going from Brazil to the USA. Because articles about traveling to Brazil from the USA are more common in SEO-driven posts, the results often matched the opposite of the search intention.
    • After BERT: Because each word is now grouped and decoded with the others, the directional word “to” is contextualized. The same search now turns up visa information for Brazilian travelers headed to the US. Something minuscule, but oh so important.
  • “Parking on a hill with no curb.”
    • Before BERT: Another interesting case. There’s a breadth of information about parking on a hill, and nearly all of it tells you which way to angle your tires into the curb. So where does that leave your query? What about those other “filler” words, “with no”? Before BERT, they’d simply be ignored, surfacing exactly the wrong information.
    • After BERT: Because “no curb” can be combined and placed in relationship to “parking on a hill,” a more accurate representation of the search query and the intention can be deduced.
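Both failure modes above can be reproduced with a crude keyword pipeline of the kind pre-BERT intuition suggests: strip the “filler” words, then treat what’s left as an unordered bag of terms. (The stopword list and pipeline here are hypothetical simplifications, not Google’s actual code.)

```python
from collections import Counter

# Hypothetical stopword list for a naive keyword pipeline.
STOPWORDS = {"a", "on", "with", "no", "to"}

def keywords(query):
    """Reduce a query to an unordered bag of non-'filler' words."""
    return Counter(w for w in query.lower().split() if w not in STOPWORDS)

# Dropping "no" makes the curbless query look like the ordinary one:
hill_no_curb = keywords("parking on a hill with no curb")
hill_curb = keywords("parking on a hill with a curb")

# Dropping "to" and ignoring word order erases the direction of travel:
to_usa = keywords("brazil traveler to usa")
to_brazil = keywords("usa traveler to brazil")
```

Once “no” and “to” are gone and order is ignored, the curbless query is indistinguishable from the ordinary one, and the direction of travel vanishes entirely—which is exactly the context BERT preserves.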

Acknowledging the flaws in its search algorithm and working to address them is part of Google’s mission with the public rollout of BERT. Plus, the fact that it’s open source means additional resources can be funneled into its development moving forward.

Benefits of Open Source Algorithms

Apologies to the tech-savvy folks for covering familiar ground, but we’re at a crossroads and we need a signpost: what does “open source” mean? It means that anybody—be it your grandma, your genius cat, or you—can access the source code behind BERT. From there, anybody can develop, upgrade, or reimagine that code to build better-functioning systems.

Because language is intricate, nuanced, and multi-layered, capturing everything language does beyond the strictures of “proper language” is more than one team can handle. Thus, BERT was made open source so that users across the globe could build their own systems. Google wrote at BERT’s initial release:

“We open sourced a new technique for NLP pre-training called Bidirectional Encoder Representations from Transformers, or BERT. With this release, anyone in the world can train their own state-of-the-art question answering system (or a variety of other models) in about 30 minutes on a single Cloud TPU, or in a few hours using a single GPU. The release includes source code built on top of TensorFlow and a number of pre-trained language representation models.”
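The pre-training this quote refers to is “masked language modeling”: hide some tokens and train the model to predict them from the context on both sides, which is exactly what makes BERT bidirectional. Here’s a simplified sketch of just the masking step; real BERT masks about 15% of its subword tokens and adds a few twists (like sometimes substituting random words) that this toy version skips:

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, rate=0.15, seed=1):
    """Replace a random subset of tokens with [MASK], returning the masked
    sequence plus the hidden originals the model must predict back."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked.append(MASK)
            targets[i] = tok  # remember what was hidden at this position
        else:
            masked.append(tok)
    return masked, targets

sentence = "the man went to the store to buy a gallon of milk".split()
masked, targets = mask_tokens(sentence)
```

During pre-training, the model sees the masked sequence and is scored on recovering the hidden words—a task it can only do well by using words on both sides of each blank.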

All the experimentation and fine-tuning BERT has attracted across hundreds of thousands of individual examples has helped shape it into the capable system it is today.

How Will BERT Impact Search Moving Forward?

Google is the first to admit that neither BERT nor the next iteration of machine learning will be perfect. Perfection isn’t the point; better, more accurate results are, and on that front Google has succeeded. Each step forward moves Google’s AI closer to genuine syntactic comprehension and intent deduction in search.

This is especially true for longer, more nuanced queries. The results will be most visible in Google’s featured snippets (also known as answer boxes): the boxes above the organic results that try to answer your question directly with a relevant paragraph or a built-in tool (like the calculator that appears for a queried math problem). And this isn’t just an English-language rollout; Google is also using BERT to improve featured snippets in the two dozen countries where that feature is available.

Finally, Google believes BERT’s syntactic capabilities will extend beyond typed searches and become the bridge between voice search and text search, which today are two vastly different worlds.

How to Optimize for BERT?

While BERT is fascinating in its own right (and appreciated), the glaring SEO question every marketer asks next is: how does one optimize for BERT? Thankfully (or un-thankfully, depending on your position), nothing needs to be done differently. BERT simply matches quality content to relevant searches more accurately.

Which means: as long as you’re producing quality content, congrats, you’re already optimizing for BERT.

But don’t take our word for it; take it from Google’s Danny Sullivan: “There’s nothing to optimize for with BERT, nor anything for anyone to be rethinking. The fundamentals of us seeking to reward great content remain unchanged.”

Sources:

Search Engine Roundtable. Google BERT Update Impacts 10% Queries & Has Been Rolling Out All Week. https://www.seroundtable.com/google-bert-update-28427.html

Google. Understanding searches better than ever before. https://www.blog.google/products/search/search-language-understanding-bert/

Google AI Blog. Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing. https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html

Search Engine Land. Welcome BERT: Google’s latest search algorithm to better understand natural language. https://searchengineland.com/welcome-bert-google-artificial-intelligence-for-understanding-search-queries-323976

Search Engine Land. Why you may not have noticed the Google BERT update. https://searchengineland.com/why-you-may-not-have-noticed-the-google-bert-update-324103