Oh BERT, BERT, wherefore art thou, BERT?

Yesterday Google posted this.
Tl;dr, they are improving their AI's capacity to understand natural language. (BTW, I still wonder why people put their tl;dr underneath the text: I mean, if I wanted a tl;dr, it's because I DIDN'T want to scroll all the way down, isn't it? Bah)

Natural Language Processing (NLP) is one of those (many) buzzwords in web marketing that many use and few understand. It looks cool to say it, just like "Search Experience Optimisation" or "entrepreneur".


The first thing I'd like to point out is that I finally got my 15% bit of data confirmed: 15% of all searches on Google, day in day out, have NEVER been searched before. Fifteen percent. Quindici percento. Let that sink in.

It's a LOT. The number will eventually tend to decrease, but it's been holding for years already. The reason I find this interesting is that we, as a community, have never been ready to fully understand what Google Search is doing.

Think about it: we struggle every day to assign and track pools of 1 or 10 keywords across groups of 100 web pages, which is orders of magnitude away from what the search engine we are trying to emulate is doing; and every single day, it moves further out of our reach. We are working with a high-school calculator, pretending to be able to predict what a small brain is doing. And that's what we are selling to our clients. Basically, bullshit.

Translators and the machine

Quite some time ago, I was an awful student on an Interpretation and Translation course at an Italian university (embracing my faults as a student is one of the best things that ever happened to me, but that's for another post). Back then, my colleagues and I treated Google Translate as the paradigm of "shitty translations". GT was the devil for us, with more than a little NIMBY syndrome ("Not In My Back Yard": the tendency to oppose anything that might alter your own current life, even if it might mean greater benefits for humanity at large in the long run).

Its translation capabilities were risible at best. As a student and lover of the marvellous English and Spanish languages, I was often appalled by what GT produced, and I would feed it pieces of text with the same guilty, obscure fetish for horror that drives people to open that disgusting image of a purulent wound a doctor friend posted on Facebook, or to listen to the Spice Girls.

But take a look at it now. It's little short of miraculous. If understanding something means comparing it with instances of similar things we saw in the past, and recognising its meaning as a symbol rather than mere letters, then GT understands the text, albeit with obvious limits. Look at this:

Now, I don't want to play linguist and make this harder than it really is: the Italian idiom "prendere due piccioni con una fava" and the English "to kill two birds with one stone" mean the very same thing, but the individual words do not. Literally translated, as GT would probably have done not two years ago, the Italian reads "to take two pigeons with one broad bean". If I only input "fava", GT translates it as "broad bean". But given the context (Italian and English speakers using the idiom to express the same idea), it changes to "stone". How? See the little "Verified" tick on the right-hand side: it means someone has confirmed this exact sentence manually.

Feeding this type of info into Google is exactly what we do when we say the sentence in front of our kid, and they grasp its underlying meaning and start using it, for their own kids to listen to and repeat some day in the future. So not only are we providing Google with meanings for successions of letters on specific occasions; we are also (and more importantly) providing it with probabilities ("fava" means "broad bean" X% of the time, "stone" Y%). We contribute several million of these pieces of information every day. This example is cross-language, but it describes exactly what Search does within a single language, too.
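The idea above can be sketched in a few lines of code. To be clear, the words, contexts, and probabilities here are invented for illustration, and this is in no way how Google Translate works internally; it only shows the shape of the reasoning: given a source word and a context, pick the translation with the highest estimated probability.

```python
# Toy sketch of context-dependent translation choice.
# All probabilities below are made up; imagine them estimated from
# millions of user-verified sentence pairs.
TRANSLATION_PROBS = {
    ("fava", "botany"): {"broad bean": 0.97, "stone": 0.03},
    ("fava", "idiom:two_pigeons"): {"stone": 0.92, "broad bean": 0.08},
}

def best_translation(word: str, context: str) -> str:
    """Pick the most probable translation of `word` in `context`."""
    probs = TRANSLATION_PROBS[(word, context)]
    return max(probs, key=probs.get)

print(best_translation("fava", "botany"))             # -> broad bean
print(best_translation("fava", "idiom:two_pigeons"))  # -> stone
```

Same word, different context, different output: that is the whole trick, and the "Verified" ticks are one of the ways those probability tables get nudged toward what humans actually mean.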

This is what "Natural Language" is: most of the time we do not use words for what they literally mean, and even less often do we use fully grammatical sentence structures. This, by the way, is even more evident when "talking to a machine": when requesting info from a machine, we are naturally biased to simplify the query so it can understand. Now, consider this:

A matter of statistics

I don't know how many searches are performed on Google Translate every day; on Google Search, the count runs into the trillions per year (the data is uncertain at best). The exact ratio doesn't matter; what matters is the conclusion that the probability of "guessing" a good answer to a query depends on big data and on how the AI handles it: if the numbers are big enough to give impressively good results on Translate, just think how accurate it can be on Search! (I don't see any reason why the latter shouldn't feed on the former, too; but of course cross-language searches are but a fraction of the whole: as always, a matter of statistics.)

SOOO, where are you going with this, Chiodo?

I've seen dozens of people giving their own interpretation of what BERT means for the market: most have no idea whatsoever what natural language, semantics, or machine learning even are in the first place. We are reading Shakespeare without a solid grasp of the English alphabet. We are English students on day one, listening to a Gallagher brothers interview.

But more generally: in life, we should not try to simplify everything. We humans do that out of fear. We ancestrally struggle to accept that it's OK not to know something: not knowing pushes you to search. To move forward.

Nothing in how the brain – or imitations thereof – works is easy.


When we calculate the relevance of our site for keywords like "insurance" or "shoes"; or we optimise a site by changing one or two <title> tags; or we give percentages of how many searches are going to be impacted by this or that algorithmic change ("It's 10%! GooGlESAiDTeNpErCEnT!!1!"), we are doing nothing different from when we were debating right/wrong keyword ratios in a text, or putting hidden anchors in the footer. Remember those times?
It’s bullshit: oversimplified, reductive, obsolete bullshit.

We could all use some study of psychology, says I. And translation. And linguistics. And statistics. And copywriting. And machine learning.

So all in all: we know nothing, study more and stfu 😀

Soundtrack: In cauda venenum by Opeth