# Empirically Untangling the Meaning of Grok

Heinlein’s 1961 novel “Stranger in a Strange Land” is one of the most influential (and controversial) sci-fi stories of the previous century. The protagonist is Valentine Michael Smith, a man who was raised by aliens on Mars. The story starts when he returns to Earth and is confronted with human culture. The second half of the book in particular has drawn a lot of criticism (spoiler alert: Michael turns out to be some kind of Messiah, and he founds a sex cult that gives you magical powers), but the book nonetheless contains a lot of interesting and progressive concepts that were ahead of their time. The most influential is probably “grok”.

“Grok” is a word that Michael (and Martians in general) use frequently. Heinlein does not deem it necessary to introduce it when Michael starts using it, but eventually we learn that the literal meaning of the Martian word is “to drink”. The term is used in a much broader way, however: the drinking is a metaphor for a kind of absorption of knowledge. Hence it might be better to translate it as “to understand”, but the book emphasizes that grokking is a deeper concept than understanding. It is a central concept in Martian culture. Some passages from the book:

Grok means “to understand,” of course, but Dr. Mahmoud, who might be termed the leading Terran expert on Martians, explains that it also means, “to drink” and “a hundred other English words, words which we think of as antithetical concepts. ‘Grok’ means all of these. It means ‘fear,’ it means ‘love,’ it means ‘hate’—proper hate, for by the Martian ‘map’ you cannot hate anything unless you grok it, understand it so thoroughly that you merge with it and it merges with you—then you can hate it. By hating yourself. But this implies that you love it, too, and cherish it and would not have it otherwise.

‘Grok’ means to understand so thoroughly that the observer becomes a part of the observed—to merge, blend, intermarry, lose identity in group experience. It means almost everything that we mean by religion, philosophy, and science and it means as little to us as color does to a blind man.

However much Heinlein wants to shroud the definition of the word in mystery, the meaning of a word is a function of its usage, not its definition. Hence, I would like to try to reveal its functional meaning empirically, using BERT.

BERT is a masked language model, meaning that it predicts how likely each word is to appear at a masked position in a given context.

This allows us to approximate which English words are used most similarly to “grok”. For example, if we mask “grok” in the following context:

“But, Jubal, don’t make a suggestion like that to Mike. He wouldn’t [MASK] that you were joking - and you might have a corpse on your hands.”

BERT’s top 2 predictions are: “believe” and “understand”.
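A prediction like this can be reproduced roughly as follows. This is a minimal sketch, not the notebook’s actual code: it assumes the Hugging Face `transformers` fill-mask pipeline with `roberta-large` (the model named under Experiments), and `mask_first`, `top_words`, `load_fill_mask` and `main` are hypothetical helper names. The model is loaded lazily, so the helpers also work with any callable that returns the same list-of-dicts shape.

```python
import re


def mask_first(text, word, mask_token):
    """Replace the first whole-word occurrence of `word` with `mask_token`."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    # A function replacement keeps the mask token literal (no escape handling).
    masked, n = pattern.subn(lambda m: mask_token, text, count=1)
    if n == 0:
        raise ValueError(f"{word!r} not found in text")
    return masked


def top_words(fill_mask, masked_text, k=2):
    """Return the k most likely fillers as (word, probability) pairs."""
    predictions = fill_mask(masked_text, top_k=k)
    # token_str can carry a leading space with RoBERTa-style tokenizers.
    return [(p["token_str"].strip(), p["score"]) for p in predictions]


def load_fill_mask(model_name="roberta-large"):
    """Load a fill-mask pipeline (downloads the model on first use)."""
    from transformers import pipeline  # lazy import: heavy dependency
    return pipeline("fill-mask", model=model_name)


def main():
    """Mask "grok" in the example sentence and print the top 2 fillers."""
    fill_mask = load_fill_mask()
    sentence = ("But, Jubal, don't make a suggestion like that to Mike. "
                "He wouldn't grok that you were joking - and you might "
                "have a corpse on your hands.")
    # RoBERTa uses "<mask>" rather than "[MASK]", so ask the tokenizer.
    masked = mask_first(sentence, "grok", fill_mask.tokenizer.mask_token)
    for word, prob in top_words(fill_mask, masked, k=2):
        print(f"{word}\t{prob:.2%}")
```

Calling `main()` downloads the model and prints the two most likely fillers. The exact words and probabilities depend on the model and on how much context is passed in.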

# Experiments

We can get a more complete picture by averaging the results over all 305 occurrences of “grok” in “Stranger in a Strange Land” (the original, uncut version). These predictions were generated using the pre-trained Hugging Face model roberta-large (a variant of BERT).

| word | probability |
|---|---|
| know | 11.67% |
| understand | 10.96% |
| think | 7.28% |
| see | 3.73% |
| do | 2.78% |
| get | 1.48% |
| believe | 1.39% |

I only included the words that have probabilities above 1.2%. We can repeat the same exercise with “grokked” (102 occurrences):

| word | probability |
|---|---|
| knew | 7.07% |
| understood | 6.90% |
| saw | 4.29% |
| realized | 3.01% |
| thought | 2.72% |
| felt | 2.24% |
| learned | 2.03% |
| seen | 1.99% |
| known | 1.59% |

Reassuringly, these give very similar results. Lastly, let’s check the results for “grokking” (44 occurrences):

| word | probability |
|---|---|
| it | 3.91% |
| understanding | 3.46% |
| learning | 2.98% |
| living | 2.35% |
| shape | 1.89% |
| shape | 1.47% |
| moment | 1.33% |
| natural | 1.31% |
| doing | 1.29% |
| meaning | 1.22% |
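The averaging behind these tables can be sketched as below. This is an illustration under stated assumptions rather than the notebook’s code: the context window of 300 characters per side, the `top_k` of 50, and the helper names `occurrence_contexts`, `average_predictions` and `report` are all hypothetical. `fill_mask` is any callable that behaves like a Hugging Face fill-mask pipeline.

```python
import re
from collections import defaultdict


def occurrence_contexts(text, word, mask_token, window=300):
    """Yield one masked context per whole-word occurrence of `word`.

    `window` is the number of characters kept on each side. The value is
    an assumption; the post does not say how much context was used.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - window):m.start()]
        right = text[m.end():m.end() + window]
        yield left + mask_token + right


def average_predictions(fill_mask, contexts, top_k=50):
    """Average filler probabilities over all contexts.

    A word missing from a context's top_k counts as probability 0 there,
    so the averages are slight underestimates.
    """
    totals = defaultdict(float)
    n = 0
    for context in contexts:
        n += 1
        for p in fill_mask(context, top_k=top_k):
            # Stripping merges tokens that differ only in surrounding
            # whitespace; that is a choice made in this sketch.
            totals[p["token_str"].strip()] += p["score"]
    if n == 0:
        return {}
    return {word: total / n for word, total in totals.items()}


def report(averages, threshold=0.012):
    """Return (word, probability) rows above `threshold`, most likely first."""
    rows = [(w, p) for w, p in averages.items() if p > threshold]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The default `threshold` of 0.012 mirrors the 1.2% cut-off used for the tables. Matching is case-sensitive and uses word boundaries, so “grok” does not also count “grokked” or “grokking”; each form is averaged separately.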

# Conclusions

The results for “grok” and “grokked” seem to agree with the definition in the Merriam-Webster dictionary: “to understand profoundly and intuitively”. This holds especially for the past tense. Only in the results for “grokking” do we get any hint of a mystical component: this form is related to e.g. “living”, “shape” and “moment”. Note furthermore that the absolute probabilities for the best predictions are much lower. This is what you would expect if there is no single English word with a consistent match.

# Appendix

As a kind of control, I checked what the model predicts for regular English words. If there are enough occurrences, the model predicts the word itself by quite a large margin, followed by some related words. For example, for “have” (1062 occurrences):

| word | probability |
|---|---|
| have | 77.36% |
| ‘ve | 4.03% |
| need | 1.52% |
| get | 1.32% |

Or for “understand” (85 occurrences):

| word | probability |
|---|---|
| understand | 36.75% |
| know | 11.70% |
| see | 4.52% |
| get | 3.81% |
| hear | 3.05% |
| believe | 2.40% |

Notice that, especially towards the bottom of the table, entries start to appear that have little to do with “understand”, like “hear”. It’s important to remember the semantics of this table: it’s “can be used in the same context”, not “means the same thing”. Of course, as the contexts increase in size, and if the models are good enough, these should largely converge. But alas, current models and contexts are not perfect. As an example, observe the output for “first” (245 occurrences):

| word | probability |
|---|---|
| first | 53.93% |
| last | 2.54% |
| next | 1.52% |
| second | 1.18% |

Although it rightly guesses “first” by a big margin, the next entries are not synonyms at all. Let’s take an example from the book:

Meet me at the desk; I’ll be paying the bill.” She left very suddenly.. . They went to the town’s station flat and caught the [MASK] Greyhound going anywhere.

Observe how in this case both “first” and “last” are completely sensible options; there is not enough context to decide. This explains why “it” is the top prediction for “grokking”: it can be used as a generic placeholder when there is no better alternative. Similarly, “get” occurs quite often.

# Replication

All source code can be found in this Colab notebook. To replicate, you only need to upload “Stranger in a Strange Land” in EPUB format and rename it book.epub. After this you can just run all cells.
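For readers who want to inspect the text outside the notebook, here is a rough sketch of how plain text could be pulled out of book.epub using only the Python standard library. It is not the notebook’s method: it assumes only that an EPUB is a zip archive of (X)HTML files, and `epub_to_text` is a hypothetical helper name.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def epub_to_text(path):
    """Concatenate the text of every (X)HTML file inside an EPUB archive.

    Files are read in archive order, which need not be reading order;
    that is fine for counting word occurrences.
    """
    parts = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="ignore"))
                parser.close()
                # Joining with spaces keeps adjacent paragraphs apart.
                parts.append(" ".join(parser.chunks))
    return "\n".join(parts)
```

`epub_to_text("book.epub")` returns one long string. Different editions and extraction methods may give slightly different occurrence counts than the ones reported above.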