Thank you all for the examples and thought-provoking discussion. Having
read the references that Aleks mentioned, I have a couple of thoughts.
Aleks wrote:
> Actually, I was not writing about mutual information. Both mutual and
> conditional mutual information were given already by Shannon, and as almost
> everyone knows that, it would be wrong to claim independent reinvention.
> What I was referring to is interaction information, a generalization of
> mutual information to more than two variables. It was independently
> reinvented several times, judging by the number of different names that
> different authors attached to the same quantity, and by the fact that they
> all considered it novel. In their respective fields, they also found it
> useful for some purpose, indicating that 3-way interactions are a practical
> conception. Another such concept is total correlation. If anyone is
> interested in this review, it is on pages 8-10 of my paper at
> http://arxiv.org/abs/cs.AI/0308002
Yes, I can see why the concept of interaction information should be widely
applicable across interdisciplinary boundaries. It got me thinking. The
multivariate normal distribution is important in statistics because of the
central limit theorem. But it has the interesting property that all the
higher-order correlations are functions of the pairwise covariances (in the
jargon of statistics, the sample means and covariances are sufficient
statistics). This means that if all the pairwise correlations are zero, so
are all the higher-order correlations; indeed, jointly normal variables
with zero pairwise correlation are fully independent. My hunch is that one
should therefore be able to prove that the interaction information is zero
whenever all the pairwise mutual informations are zero. Perhaps it is even
true that the interaction information amongst any three or more variables
is always zero for multivariate normal random variables, which would be
interesting. If so, that would be a nice way of characterizing why it is
bad to use the multivariate normal distribution when not all interactions
are reducible to pairwise dependencies.
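To make the hunch concrete, here is a rough numerical sketch (my own, in
Python; not from Aleks's paper). For a multivariate normal, the differential
entropy of any subset S of the variables has the closed form
H(S) = (1/2) log((2*pi*e)^|S| det(Sigma_S)), so the interaction information
I(X;Y;Z) = I(X;Y) - I(X;Y|Z) reduces to log-determinants of sub-covariance
matrices:

    import numpy as np
    from itertools import combinations

    def gauss_entropy(cov, idx):
        """Differential entropy (nats) of the normal marginal over idx."""
        sub = cov[np.ix_(idx, idx)]
        k = len(idx)
        return 0.5 * np.log((2 * np.pi * np.e) ** k * np.linalg.det(sub))

    def interaction_info(cov):
        """I(X;Y;Z) expanded in joint entropies (one common sign convention)."""
        singles = sum(gauss_entropy(cov, [i]) for i in range(3))
        pairs = sum(gauss_entropy(cov, list(p))
                    for p in combinations(range(3), 2))
        triple = gauss_entropy(cov, [0, 1, 2])
        return singles - pairs + triple

    # Zero pairwise correlation: jointly normal variables are then fully
    # independent, and the interaction information is exactly zero.
    print(interaction_info(np.eye(3)))  # -> 0.0 (up to rounding)

    # Any other covariance matrix can be plugged in to probe the stronger
    # conjecture about multivariate normals numerically.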
Another example in which variables are pairwise independent, but the
interaction information is non-zero, is the following generalization of the
XOR example. Suppose that there are three light bulbs that each flash red
or green 50% of the time, but that there is always an odd number of red
flashes. Then if you observe any pair of the bulbs, they will be
statistically independent (like tosses of a pair of fair coins). But the
third bulb always flashes red if and only if the first two both flash the
same color.
This happens to be a real example in quantum physics--what is called the
Greenberger-Horne-Zeilinger (GHZ) version of the Bell experiment for an
entangled triplet of spin-1/2 particles. See:
Mermin, N. D. (1990): "Quantum Mysteries Revisited." American Journal of
Physics 58 (8): 731-734.
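The light-bulb example is small enough to check by brute force. Here is a
sketch of mine (Python) that enumerates the four equally likely outcomes
with an odd number of red flashes and confirms both claims: every pair of
bulbs is independent, yet the three-way interaction information is a full
bit in magnitude:

    from itertools import product, combinations
    from math import log2

    # 1 = red, 0 = green; keep only triples with an odd number of reds.
    outcomes = [t for t in product([0, 1], repeat=3) if sum(t) % 2 == 1]
    p = 1.0 / len(outcomes)  # the four admissible outcomes are equally likely

    def H(idx):
        """Joint entropy (bits) of the bulbs listed in idx."""
        counts = {}
        for t in outcomes:
            key = tuple(t[i] for i in idx)
            counts[key] = counts.get(key, 0.0) + p
        return -sum(q * log2(q) for q in counts.values())

    # Every pair is independent: I(X;Y) = H(X) + H(Y) - H(X,Y) = 0.
    for i, j in combinations(range(3), 2):
        print(i, j, H([i]) + H([j]) - H([i, j]))  # all 0.0

    # Interaction information I(X;Y;Z) = I(X;Y) - I(X;Y|Z); sign conventions
    # differ between authors, but the magnitude here is exactly 1 bit.
    print(H([0]) + H([1]) + H([2])
          - H([0, 1]) - H([0, 2]) - H([1, 2])
          + H([0, 1, 2]))  # -> -1.0 in this convention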
Aleks wrote:
> Agreed, there has been too much fixation on linear multivariate normal
> models in statistics. My current work attempts to reconcile statistics,
> both frequentist and Bayesian, with these information-theoretic notions,
> by interpreting entropy either as a utility/loss function or as
> log-likelihood. This way we can also put error bars and confidence
> intervals on information-theoretic quantities, develop unbiased estimators
> of entropy and mutual information, and test the significance of the
> hypothesis of positive mutual information.
It is probably not what Aleks has in mind, but the idea of using the
Kullback-Leibler information as a utility/loss function in statistics has
been widely discussed during the past 30 years. Seminal references on the
subject are:
Akaike, H. (1973): "Information Theory and an Extension of the Maximum
Likelihood Principle." In B. N. Petrov and F. Csaki (eds.), 2nd International
Symposium on Information Theory: 267-81. Budapest: Akademiai Kiado.
Akaike, H. (1985): "Prediction and Entropy." In A. C. Atkinson and S. E.
Fienberg (eds.), A Celebration of Statistics: 1-24. New York: Springer.
When statisticians talk about predictive accuracy, they most often have this
information-theoretic notion in mind. Numerous papers on my website discuss
this idea.
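As a tiny illustration of the connection (my gloss, not Akaike's own
derivation): the expected log-loss of a predictive distribution q exceeds
that of the true distribution p by exactly the Kullback-Leibler divergence
KL(p||q), so maximizing expected log-likelihood and minimizing KL come to
the same thing:

    from math import log2

    p = {"red": 0.5, "green": 0.5}    # true distribution (illustrative)
    q = {"red": 0.75, "green": 0.25}  # candidate predictive model

    kl = sum(p[x] * log2(p[x] / q[x]) for x in p)
    cross_entropy = -sum(p[x] * log2(q[x]) for x in p)  # expected log-loss of q
    entropy = -sum(p[x] * log2(p[x]) for x in p)        # expected log-loss of p

    print(kl, cross_entropy - entropy)  # equal: KL = cross-entropy - entropy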
Aleks wrote:
> True, and these divisions in mathematics run deep trenches in other
> fields: frequentist probability theorists perceive probability as
> something you do with objects in sets, whereas statistics is more about
> probability as points on a line (to be precise, measure theory dominates
> as the paradigm of choice). The chasm between quantum gravity and
> relativity theory is also due to this: relativity is geometric (points on
> a line, spacetime), while quantum gravity is algebraic (objects in a set,
> discrete lattices). And I could go on to Shannon's entropy, which is an
> essentially algebraic conception, where you sum, not integrate: it gets
> messy if you try to apply it to geometric notions, such as symmetry. In
> this context, I'd recommend reading the transcript of M. Atiyah's lecture
> at http://duch.mimuw.edu.pl/~sjack/atiyah.ps
>
> The main message here is that quantization and continuity cannot be easily
> reconciled: they are separate views of reality, sometimes complementary,
> sometimes redundant.
It is true that quantization and continuity cannot be easily reconciled. If
a quantum mechanical observable (such as position) can take on a continuum
of values, then the dimension of the associated Hilbert space must be
infinite (because the position operator has a continuum of generalized
eigenvalues). Infinite-dimensional spaces are mathematically difficult to
comprehend, but they are being explored by mathematicians. A recent
reference is:
Kryukov, Alexey (2004): "On the Problem of Emergence of Classical
Space-Time: The Quantum-Mechanical Approach." Foundations of Physics 34
(8): 1225-1248.
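Incidentally, Aleks's point that Shannon's entropy "gets messy" when carried
over to the continuum is easy to make concrete (a small check of my own):
discrete entropy is never negative, but differential entropy can be, and it
shifts under a mere rescaling of the variable:

    from math import log2

    # Discrete: a fair coin has H = 1 bit, and H >= 0 for any distribution.
    print(-sum(p * log2(p) for p in [0.5, 0.5]))  # -> 1.0

    # Continuous: a uniform density on [0, w] has differential entropy
    # log2(w) bits, which is negative for w < 1 and changes with the units.
    for w in [2.0, 1.0, 0.25]:
        print(w, log2(w))  # -> 1.0, 0.0, -2.0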
Malcolm
http://philosophy.wisc.edu/forster