[Fis] Is it Shannon entropy?

From: Michael Devereux <dbar_x@cybermesa.com>
Date: Mon 07 Jun 2004 - 07:47:56 CEST

Dear Aleks, Loet and colleagues,

I think many of us are using the same words to describe quite different
things. It seems to me that the words information, entropy, Shannon
entropy and Shannon information are being applied to properties that are
not even under consideration across all the disciplines represented in
this forum. I’ve never considered the application of
Shannon’s formula to Loet’s currency transactions, or to Aleks’ medical
patients, for example. And, I’m only slightly more familiar with how one
might try to analyze information processing, or transfer, in living
cells and organisms.

I’m now sure, from what I read in our forum, that the Shannon formula has
a very extensive history of use in economics, medicine, biology, and
many, many other disciplines. And it appears that, by the traditions and
customs outside physics, if the Shannon formula is used to portray some
pertinent property, then that property may be called Shannon entropy or
Shannon information.

But Shannon himself had something very specific in mind which he wished
to describe mathematically. May I review again the actual history of
Shannon’s work? I trust others may also recall that Shannon derived his
famous entropy formula analytically from three initial postulates (Bell
Sys. Tech. J. 27, 3, p. 379, 1948). Shannon said that the physical
property he wished to describe mathematically (which he called entropy,
“how uncertain we are of the outcome”) satisfied these three
conditions. (Otherwise, of course, it’s not Shannon entropy.)

The Shannon formula, H, is a statement of uniqueness. His equation is
the only one that describes the physical property (entropy) which has
the three characteristics laid down in his mathematical postulates. The
measure of how uncertain we are of an outcome, given probabilities,
p-sub-i, for each possible result, must uniquely be the function
H = -K (Sum over i of p-sub-i log p-sub-i), where K is a positive constant.
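
For anyone who would like to see this in numbers, here is a small Python
sketch of my own (the function name and the example distributions are
only illustrations, nothing taken from Shannon's paper):

import math

def shannon_entropy(probs, k=1.0, base=2):
    # H = -K * (Sum over i of p_i log p_i); zero-probability terms contribute nothing
    return -k * sum(p * math.log(p, base) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # a fair coin: 1.0 bit of uncertainty
print(shannon_entropy([0.9, 0.1]))   # a biased coin: about 0.47 bits, less uncertain

With K = 1 and base-2 logarithms H is measured in bits; any other
positive K merely changes the unit of measure.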

So, what are these three postulates that uniquely determine the form of
H? First, Shannon entropy will be continuous in the variables, p-sub-i.
Second, in the simple case of maximum uncertainty, where all p-sub-i are
equal to 1/N, Shannon entropy will be described as a MONOTONIC
INCREASING FUNCTION OF N. The third postulate says that when a choice is
broken down into two successive choices, the original entropy must be
the weighted sum of the individual entropy values of those choices.
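
To make the second and third postulates concrete, here is another small
Python check (my own illustration; the decomposition of {1/2, 1/3, 1/6}
into two successive choices is the one Shannon himself uses as an
example, if I recall the paper correctly):

import math

def H(probs, base=2):
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Second postulate: with N equally likely outcomes (each p = 1/N),
# H equals log2(N) and so increases monotonically with N.
for n in (2, 4, 8, 16):
    print(n, H([1.0 / n] * n))          # 1.0, 2.0, 3.0, 4.0 bits

# Third postulate: the choice among {1/2, 1/3, 1/6} may be broken into a
# choice between two equally likely alternatives, followed (half the
# time) by a choice between {2/3, 1/3}; the weighted sum reproduces the
# original entropy.
direct = H([1/2, 1/3, 1/6])
staged = H([1/2, 1/2]) + (1/2) * H([2/3, 1/3])
print(abs(direct - staged) < 1e-9)      # True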

I’d emphasize that if the property being described does not increase
monotonically (with N, in the equiprobable case), then that property of
the system is not Shannon’s entropy, even though one may be using
Shannon’s equation to describe it. (And that property of the system must
also satisfy the other two postulates if it really is Shannon entropy.)
I mentioned in an earlier posting that at one end of a Shannon
communication channel, the amount of information received can never be
greater than the amount sent, though it may be less if noise is present.
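
As a toy illustration of that channel claim (entirely my own sketch, not
anything in Shannon's text: a binary symmetric channel with a
hypothetical crossover probability eps), one can compute how much of the
transmitted uncertainty actually survives the noise:

import math

def H(probs, base=2):
    return -sum(p * math.log(p, base) for p in probs if p > 0)

def received_information(p1, eps):
    # Hypothetical binary symmetric channel: source X sends 1 with
    # probability p1, and the channel flips each bit with probability eps.
    # The information received about X is I(X;Y) = H(Y) - H(Y|X).
    py1 = p1 * (1 - eps) + (1 - p1) * eps   # P(Y = 1)
    return H([py1, 1 - py1]) - H([eps, 1 - eps])

p1 = 0.5
print(H([p1, 1 - p1]))                  # 1.0 bit sent per symbol
print(received_information(p1, 0.0))    # noiseless channel: 1.0 bit received
print(received_information(p1, 0.1))    # noisy channel: about 0.53 bits received

The received amount never exceeds the bit that was sent, and it drops as
the noise grows, which is all I meant by the remark above.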

Aleks wrote “we cannot infer ever-increasing entropy for Shannon's
probabilistic model of communication, because this model has nothing to
do with thermodynamical models.” I quoted Shannon previously in this
forum, Aleks. Shannon wrote that “Quantities of the form H = - Sum p log p
play a central role in information theory as measures of information,
choice and uncertainty. The form of H will be recognized as that of
entropy as defined in certain formulations of statistical mechanics....
H is then, for example, the H in Boltzmann’s famous H theorem.” (p. 393)
Perhaps we must all accept Shannon’s own words as the authoritative and
definitive resolution of this question.
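
Written side by side (my own transcription, in LaTeX-style notation; the
identification holds up to the choice of the constant and of the
logarithm’s base):

    S = -k_B \sum_i p_i \ln p_i    (Gibbs’ statistical-mechanical entropy)
    H = -K \sum_i p_i \log p_i     (Shannon’s H)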

Aleks also wrote “Other fields introduce entropy and refer to
thermodynamical laws, but often neglect to show that their underlying
models show the same properties as those of statistical mechanics.” Of
course, one can’t legislate the use of the term entropy only for a
property of a system that satisfies all three of Shannon’s postulates.
But I believe that if it doesn’t satisfy them, we must, as committed
researchers, accept that such a thing is not Shannon entropy. And if it
does satisfy Shannon’s postulates, it is identical, as Shannon told us,
with the entropy described by Clausius, Boltzmann and Planck.

I don’t, for a moment, dismiss the extensive use and value of Shannon’s
formula in economics, linguistics, biology, or any other
discipline. Colleagues here have all made me aware of that history and
tradition, and of the wide application of this formula to daunting,
interesting problems. But, I would maintain that if Shannon’s own work
concludes that some system characteristic isn’t actually Shannon
entropy, then it’s not. Or, if it satisfies Shannon’s criteria, then it
must be Clausius’ and Boltzmann’s entropy, as well as Shannon’s.

I look forward to more of the valuable and considered responses that so
many here have already offered.

Cordially,
Michael Devereux
