I’m obsessed with the idea that we have a lot of mathematical descriptions for what happens to atoms and energy in our physical world, but our descriptions of how information is moved, changed, and conveyed don’t have so many “big theories”. It’s said that John von Neumann suggested that Claude Shannon use the word “entropy” to describe the amount of information conveyed in a message, “because the formula looks the same, and nobody understands what the heck is Entropy anyways”.

There are some other laws, but they can all be seen as corollaries of Landauer’s principle (erasing a bit of information releases at least a tiny, fixed amount of heat). And that’s pretty much it; although we could arguably say that all of mathematics is the expression of how certain abstract properties behave when transformed through equivalences, thus making all math information theory.
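To put a number on Landauer’s principle: the minimum heat released when one bit is erased is k_B · T · ln 2, where k_B is the Boltzmann constant and T the temperature. A quick back-of-the-envelope sketch in Python (room temperature of 300 K is my assumption here):

```python
import math

# Landauer limit: minimum heat dissipated when one bit is erased,
# E = k_B * T * ln(2).
BOLTZMANN = 1.380649e-23  # J/K (exact value under the 2019 SI definition)
T = 300.0                 # kelvin, roughly room temperature (assumed)

energy_per_bit = BOLTZMANN * T * math.log(2)
print(f"{energy_per_bit:.3e} J per erased bit")  # about 2.9e-21 J
```

About 3 zeptojoules per bit: a vanishingly small amount, which is why this bound only matters at the extreme frontier of computing efficiency.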

What Shannon called “bits of information” I would have called just “bits of data”. The difference between the two is how much subjective meaning you give to them. Without meaning, data is worthless: learning who the next president is on election night in a big two-party democracy like the United States conveys about 1 bit of information. In the Shannon sense, the result of a coin toss also carries exactly 1 bit. Randomness is indistinguishable from meaningful surprise. From a societal and psychological point of view, we are lacking some extra vocabulary about information to make it more useful, and to distinguish one from the other.
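That equivalence is easy to check against Shannon’s formula, H = −Σ pᵢ log₂ pᵢ. A minimal sketch in Python (the function name is mine, not Shannon’s):

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin toss and a dead-even two-candidate election both carry 1 bit.
print(shannon_entropy([0.5, 0.5]))   # 1.0
# A lopsided race whose outcome you already expected carries far less.
print(shannon_entropy([0.9, 0.1]))   # ~0.47
```

The formula only sees probabilities, never stakes: it assigns the coin and the election the same number, which is exactly the gap between “data” and “information” this essay is pointing at.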

Shannon knew this. He deliberately avoided that discussion, because it’s a much harder and more nuanced problem. The second paragraph of A Mathematical Theory of Communication makes this explicit:

Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.

“Irrelevant to the engineering problem” is such an elegant sidestep: Shannon built the foundation for the entire digital age by explicitly ignoring the hardest part of the question, a masterful simplification. In 2026, now that we need to build systems that have some sort of intelligence, we have no mathematical frameworks for meaning. We need a Shannon for this century.

The lack of a formal definition of meaning might be beyond the ability of our language systems; it’s quite an epistemological challenge. But I have a strong sense that some laws and theories lie out there that can explain mathematically what we mean by meaning.

There are some people out there trying to fill in that missing piece of the puzzle:

As I mentioned at the beginning, I’m quite obsessed with this subject, and baffled by my lack of answers, so please reach out if you find more!