Oh BERT, BERT, wherefore art thou, BERT?

Yesterday Google posted this.
Tl;dr, they are improving their AI’s capacity to understand natural language. (BTW, I still wonder why ppl put their tldr’s underneath the text: I mean, if I wanted a tldr it’s because I DIDN’T want to scroll all the way down, isn’t it? Bah)

Natural Language Processing (NLP) is one of those (many) buzzwords in web marketing that many use and few understand. It looks cool to say it, just like “Search Experience Optimisation” or “entrepreneur”.


The first thing I’d like to point out is, I finally got my 15% bit of data confirmed. 15% of all searches on Google, day in day out, were NEVER searched for before. Fifteenpercent. Quindicipercento. Let that sink in.

It’s a LOT. This number will eventually tend to decrease in a Moore’s law fashion, but it’s been going on for years already. The reason I find this interesting is that we are not ready, and never have been, as a community, to fully understand what Google Search is doing. Think about it: we struggle every day trying to assign and track pools of 1 or 10 keywords to groups of 100 web pages, which is orders of magnitude away from what the search engine we are trying to emulate is doing; and every single day, they move further out of our reach. We are working with a high school calculator, pretending to be able to predict what a small brain is doing. And that’s what we are selling to our clients. Basically, bullshit.

Translators and the machine

Quite some time ago, I was an awful student at an Interpretation and Translation Uni course in Italy (embracing my faults as a student is one of the best things that ever happened to me, but this is for another post). Back then, my colleagues and I used to use Google Translate as a paradigm of “shitty translations”. GT was the devil for us, with more than a little bit of NIMBY syndrome (“Not In My Back Yard”, the tendency to oppose everything that might alter my own current personal life, even if it might mean greater benefits for humanity at large in the long run).

Its translation capacities were risible at best. As a student and lover of the marvelous English and Spanish languages, I was often appalled by what GT would translate, and would feed it pieces of text with the same guilty, obscure fetish for horror that people have when they choose to open that disgusting image of a purulent wound that a doctor friend posted on Facebook, or when they listen to Spice Girls.

But take a look at it now. It’s little short of miraculous. If understanding something means to compare it with instances of similar things we saw in the past, and to recognise its meaning as a symbol rather than mere letters, then GT understands the text – if with obvious limits. Look at this:

Now, I don’t want to play linguist and make this harder than it really is: simply, these two sentences mean the very same thing, but each single word does not. Literally translated, as GT would’ve probably done not two years ago, it’s “to take two pigeons with one broad bean”. If I only input “fava”, GT will translate “broad bean”. But given the context (Italian and English speakers using it to express the same idea), it changes to “stone”. How? See the little “Verified” tick on the right-hand side: it means someone’s confirmed this very exact sentence manually. Us feeding this type of info into Google is exactly what we do when we say the sentence in front of our kid, and they understand its underlying meaning and start using it, for their own kids to listen to, and repeat, some day in the future. So not only are we providing Google with meanings for successions of letters on specific occasions; we are also (and more importantly) providing them with probabilities (“fava” means “broad bean” X% of the time, “stone” Y%). We contribute several million of these pieces of information every day. This example is cross-language, but describes exactly what Search does in a single language, too.
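To make the X% / Y% idea a bit more tangible, here is a minimal sketch of how confirmed translations could be turned into probabilities, overall and per context. Everything in it is an assumption of mine for illustration: the records, the counts and the `idiom` / `isolated` context labels are invented, and this is certainly not how Google Translate actually works.

```python
from collections import Counter

# Hypothetical "verified" records: (source word, context label, confirmed translation).
# The data is invented purely for illustration.
verified = [
    ("fava", "idiom", "stone"),
    ("fava", "idiom", "stone"),
    ("fava", "isolated", "broad bean"),
    ("fava", "isolated", "broad bean"),
    ("fava", "isolated", "broad bean"),
    ("fava", "isolated", "broad bean"),
]

def translation_probabilities(records, word, context=None):
    """Return {translation: probability} for `word`, optionally within one context."""
    counts = Counter(
        target
        for source, ctx, target in records
        if source == word and (context is None or ctx == context)
    )
    total = sum(counts.values())
    return {target: n / total for target, n in counts.items()} if total else {}

overall = translation_probabilities(verified, "fava")
in_idiom = translation_probabilities(verified, "fava", context="idiom")
print(overall)   # "broad bean" dominates when we ignore context
print(in_idiom)  # inside the idiom, "stone" takes over
```

The point is only that the same word gets a different most-likely meaning once the context narrows down which confirmations count.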

This is what “Natural Language” is: most times we do not use words for what they mean, and even less often do we use a full grammatically accurate structure in our sentences. This, by the way, is even more evident when “talking to a machine”: when requesting info from a machine, we are naturally biased to simplify the query for it to understand. Now, consider this:

A matter of statistics

I don’t know how many searches on Google Translate are performed every day: however, billions are made on Google Search, which adds up to trillions a year (data is uncertain at best). It doesn’t matter what the exact ratio is, what matters is the conclusion that the probability of “guessing” a good answer to a query depends on big data and how the AI handles it: if the numbers are big enough to give impressively good results on Translate, just think how accurate it can be on Search! (I don’t see any reason why the latter shouldn’t feed on the former, too: but of course cross-lingual searches are but a fraction of the whole: as always, a matter of statistics).

SOOO, where are you going with this, Chiodo?

I’ve seen dozens of people giving their own interpretation of what BERT means for the market: most do not have any idea whatsoever what natural language, semantics, or machine learning even are in the first place. We are reading Shakespeare without a solid grasp of the English alphabet. We are English students on day one, listening to a Gallagher brothers interview.
But more generally than that: in life, we should not try to simplify everything. We do that, us humans, we do that because of fear. We ancestrally struggle to accept that it’s OK not to know something: it pushes you to search. To move forward.

Nothing in how the brain – or imitations thereof – works is easy.


When we calculate the relevance of our site on keywords like “insurance” or “shoes”; or we optimise a site by changing one or two <titles>; or we give percentages of how many searches are going to be impacted by this or that algorithmic modification (“It’s 10%! GooGlESAiDTeNpErCEnT!!1!”), we are doing nothing different from when we were discussing the right/wrong keyword ratio in a text, or putting hidden anchors in the footer. Remember those times?
It’s bullshit: oversimplified, reductive, obsolete bullshit.

We could all use some study of psychology, says I. And translation. And linguistics. And statistics. And copywriting. And machine learning.

So all in all: we know nothing, study more and stfu 😀

Soundtrack: In cauda venenum by Opeth

Artificial Intelligence and SEO: signals and probability

Someone’s asked a couple of questions about this post, and I realised I find it very hard to express my knowledge and opinions about Artificial Intelligence and its relationship with web marketing. This immediately triggers two thoughts: on the one hand, I understand the topic way less than I would like to; on the other, it is a field in which most people don’t have a strong understanding, either. Probably it is because most experts in the field consider the marketing industry trivial (I know I would if I were them), and marketers prove them right by not knowing anything about it, nor caring to know. Which is a mistake per se.

Let me add something on top of this specific point: artificial intelligence, and machine learning in particular, is going to have a strong, strong impact on our job, as on every single job. Even better, it ALREADY DOES. How can you not see how featured snippets are generated? Do you think your developer can implement that? No they cannot. Every day, we talk about billions and trillions of pages, queries, and detected user intents. Only a machine can handle that amount of information and put it in order, and this, this is what fascinates me. This is the only reason why it still makes sense to have an organic traffic strategy, 10 years or so after SEO died. Because bear in mind: if you don’t approach and work with AI, then it is dead.
A whole book might be written on the correlation of the verb I just used, “to think”, with a machine. Is the search engine really thinking? It’s a topic for which neither I nor anyone else could possibly have all the answers, ranging from philosophy to advanced engineering: what is it “to think”? While I do think I can provide an opinion on this, as valuable as anyone else’s, I reckon this is not the place to do so: indeed, it’s irrelevant. Whether what G does is actually “thinking”, or it’s just imitating us, mirroring what it sees like a parrot, it is the final result that matters now in this specific argument. G engineers do not have all the answers (and more often than not lie about it) when it comes to understanding why the SERP (the Search Engine Results Page) looks the way it does: JohnMu doesn’t know, Larry Page most definitely doesn’t know. Matt Cutts never knew.
Believing that anyone at Google knows why the search engine behaves a certain way at a given moment is as irrational as believing that the IBM team that engineered Deep Blue is able to defeat Garry Kasparov in a game of chess.
They cannot.

Somehow, building upon a huge number of matches and moves and “observing” real-life champions, Deep Blue learned (in a loose sense: much of its strength was raw search, but its evaluation of positions was tuned on grandmaster games). It learned from experience, which is what we do as kids, isn’t it? Regardless, it put information together and learned to use it properly in new, unexpected situations: that’s exactly what we do as little kids, and what Google’s search engine does now on a daily basis.

The AI learning process is well explained in many a TED Talk by better people than yrstruly. But let me try to wrap it up as best I can. A computer’s speed and memory are both better than ours. Better by orders of magnitude. What it’s (still) worse than us at is recognising connections between dots. Our brains are extremely talented at recognising patterns, which is pretty much what I poorly try to explain with my pen example: you might have never seen this specific object, but you’ve seen dozens of similar ones. It’s got these and those characteristics. The environment you find it in is thus and thus. All these bits and pieces make you recognise it as a pen, even though it might be something else: putting all of them together, it’s likely a pen.

Using signals

So how does Google know that Wikipedia’s information is to be shown on top of a SERP? It’s because of thousands of instances in which:
1. people searched for “X”, came back to the SERP, searched for “X wiki”
2. people linked from their site to the Wiki page dedicated to “X”
3. websites were created, in which the sole textual content is copypasted from Wiki
There are many more “political” reasons, but you get the point: G used signals. With a statistically significant amount of inputs (signals), it’s able to recognise how likely something is to answer the user’s needs.
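A toy sketch of what “turning signals into a likelihood” could look like, in Python. Loud disclaimer: the signal names, the weights and the bias are hypothetical numbers I made up, and the logistic formula is just a common way of squashing evidence into a 0-to-1 score; nothing here is Google’s actual method.

```python
import math

# Hypothetical weights: how much each kind of signal counts as evidence (invented).
WEIGHTS = {"refined_to_wiki": 1.5, "inbound_links": 1.0, "copied_content": 0.5}
# Hypothetical bias: with no signals at all, a page is unlikely to be the best answer.
BIAS = -6.0

def likelihood(signal_counts):
    """Combine log-scaled signal counts into a 0..1 score with a logistic function."""
    score = BIAS + sum(
        WEIGHTS[name] * math.log1p(count) for name, count in signal_counts.items()
    )
    return 1 / (1 + math.exp(-score))

# Invented counts for a heavily signalled page and an obscure one.
wiki_page = {"refined_to_wiki": 5000, "inbound_links": 1200, "copied_content": 300}
obscure_page = {"refined_to_wiki": 0, "inbound_links": 3, "copied_content": 0}

print(round(likelihood(wiki_page), 3))
print(round(likelihood(obscure_page), 3))
```

No single signal decides anything here: it’s the pile of them, taken together, that pushes the score towards “this probably answers the query”.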


Relevantly, an AI is not merely based on an I/O system. My machine at home goes by on/off, Y/N, positive/negative. It’s got no grey areas. A Deep Learning system does the same, but with such magnitude that it’s able to assign probability. For a traditional computer, either X is a pen, or it’s not. For AI, X is likely to be a pen: for lack of a better option, it is a pen.

If we suggest that it is not, in fact, a pen, an AI-powered system is able to suggest another option (whatever was second-most-likely). It’s also going to learn that the likelihood of that specific combination of characteristics belonging to a pen is lower than it previously believed, and will keep this in mind in the next instance. Every input teaches it something.
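Here is a minimal Python sketch of that loop: keep a ranked list of guesses, fall back on the runner-up when corrected, and lower the belief in the rejected guess. The starting beliefs and the `penalty` factor are invented numbers, and a real system would update its internal weights rather than a small dictionary; this only shows the principle.

```python
def rank(beliefs):
    """Sort candidate labels from most to least likely."""
    return sorted(beliefs, key=beliefs.get, reverse=True)

def correct(beliefs, wrong_label, penalty=0.25):
    """Shrink the belief in a rejected label, then renormalise so beliefs sum to 1."""
    updated = dict(beliefs)
    updated[wrong_label] *= penalty
    total = sum(updated.values())
    return {label: p / total for label, p in updated.items()}

# Hypothetical starting beliefs about the object X.
beliefs = {"pen": 0.6, "laser pointer": 0.3, "bomb": 0.1}

first_guess, second_guess = rank(beliefs)[:2]
print(first_guess, "/ runner-up:", second_guess)

# Someone tells the system "no, that's not a pen".
after_feedback = correct(beliefs, "pen")
print(rank(after_feedback)[0])
```

After the correction the old runner-up becomes the top guess, and “pen” starts from a lower belief the next time the same characteristics show up.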

Finally, let me say that I’m fully aware that I do not have a complete grasp of the topic: any input is very much appreciated because I believe this is something web marketers should talk about.

Soundtrack’s pretty easy to choose this time: Fear Inoculum by Tool

Artificial Intelligence and SEO: the pen example

AI and machine learning are fascinating and ever-evolving topics, and I spend a lot of time studying and thinking about them. This is going to be the first of a series of articles from a web marketing perspective, which is all I have knowledge about. If you are into AI, I recommend finding your own way into it, and a better teacher than myself.

During presentations, be it in a course or with clients, I always use the Tratto Pen example to illustrate the concept of a Search Engine behaving like a human, and not like a computer.

This is a Tratto Pen: it’s my favourite pen and I’ve yet to find an equivalent here in the UK. Its quality is irrelevant though, I only use it to prove a point:

I normally do this with a real pen (which is also useful to avoid Italian over-handgesturing): I display the object in my hands and ask: “do you know what this is?”. They immediately go “a pen?”.

There you have it: how do you know it’s a pen? I think it’s not plausible you’ve seen this brand before because it might not even exist here, not to mention this specific pen. You most definitely never really saw this before. So how did your brain come up with the immediate idea that it was a pen? Well, it’s all a matter of probability: your senses scan the world around you, gathering information; your brain compares the surroundings with what it’s seen in the past (memory) and assumes. Now, you are seeing an object, its dimensions are X*Y*Z, its colour black. We are in an office, or in a similar space. I’m handling it, pointing at things with it. It appears to have a cap. About 80% it’s a pen. It might be a bomb or a laser pointer or a car, but it’s simply less likely (respectively, about 1%, 19%, 0%). So even if you’ve never seen this specific item before, and I’ve never mentioned its name, you can assume it’s a pen.
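If you like seeing this as arithmetic, here is a small Python sketch of that reasoning as a “naive Bayes” guess: start from how common each object is, multiply by how well each clue fits it, and normalise. Every number below is a hypothetical value I invented for the example, and so are the clue names; your brain (and Google) obviously work with far more clues than three.

```python
# Hypothetical priors: how common each object is in an office (invented numbers).
PRIORS = {"pen": 0.90, "laser pointer": 0.09, "bomb": 0.01}

# Hypothetical likelihoods: P(clue | object), also invented.
LIKELIHOODS = {
    "pen": {"has a cap": 0.8, "used to point at things": 0.3, "small and black": 0.5},
    "laser pointer": {"has a cap": 0.2, "used to point at things": 0.9, "small and black": 0.5},
    "bomb": {"has a cap": 0.1, "used to point at things": 0.01, "small and black": 0.2},
}

def posterior(clues):
    """Multiply each prior by the likelihood of every observed clue, then normalise."""
    scores = {}
    for label, prior in PRIORS.items():
        score = prior
        for clue in clues:
            score *= LIKELIHOODS[label][clue]
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

guess = posterior(["has a cap", "used to point at things", "small and black"])
best = max(guess, key=guess.get)
print(best, guess)
```

Notice that “used to point at things” on its own would favour the laser pointer: it’s the combination of clues plus the prior that makes “pen” the winner.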

In Google’s case, its senses are the crawlers (spiders, bots, whatever you call them). It’s got memory (way more memory than a human’s, in fact). It does not need mentions: it can assume based on clues. It will understand that this blog is about SEO, even if I were to never even mention SEO at all. Because the words I use are normally used, with high frequency, by other SEO blogs; because when people talk about me they do it in association with that set of words; because using the terms SEO or Search Engine Optimisation is largely the same for humans, so it’s become the same for the Search Engine. And so on: there are plenty of examples of how this works, my favourite being how Google Translate has evolved in the past couple of years. When I was studying translation, Google Translate was a synonym for “crappy quality”, now it’s just impressive. This is an interesting article about it: btw, I’m not using a relevant anchor text to link to that page, do you think Google will have trouble understanding what I’m giving that page relevance for? Of course, linking is influenced by this, too.

So in the end: I don’t care at all if you link towards my site with a relevant anchor; hell, I barely care if you actually link at all! Being linked to is now, and will be ever more, only a way of flexing to your client or boss or colleagues. Truth is, a link is but a consequence of people giving signals. Signals are what’s important for the search engine, because it’s using them to create a map of the world.

Soundtrack: Act IV by The Dear Hunter

PS. I SWEAR, I was not ever paid by Tratto Pen/Fila for this post (but hey you guys, if you wanted to send me a box with 100 black Trattos I wouldn’t mind really…)