The Language of Cells

From: Pedro C. Marijuan <marijuan@posta.unizar.es>
Date: Thu 26 Feb 1998 - 09:38:00 CET

THE LANGUAGE OF CELLS

Pedro C. Marijuan & José Pastor
Department of Electronics and Communications Engineering
Universidad de Zaragoza, Zaragoza 50015, Spain

1. Introduction: Positional versus Compositional Information

The notion of a "language" of cells does not seem consistent with the
standard views of Information Theory applied to biology. Although
Shannon and Wiener (1948) distinguished between discrete, continuous, and
mixed information sources, the standard application of their ideas to the
biology of the cell has been heavily influenced by the DNA and RNA
sequential structure, and only the discrete-positional case has been
traditionally considered (e.g., Gatlin, 1972). As a consequence, the lack
of distinction between "positional" and "compositional" forms of
information, and the subsequent neglect of the latter, have led (we are
going to argue here) to an analytical dead end concerning the
possibilities of elucidating formal mechanisms of cellular languages.

In biology and in human-to-human communication, the assumed preconditions
for information transmission, and for any workable language, refer to
sequences of messages containing combinations of symbols which will be
accepted (or emitted, or read, or transmitted, or deciphered) always
following a positional order. Therefore, Shannon's formula appears as the
natural way of measuring the average combinatory content of these messages,
and of establishing their relative index of surprise, in order to design
appropriate channels, codes, etc. A workable language can be created,
subsequently, as a series of grammatical (Markovian) rules to abide by when
connecting successive positional messages comprised within the dictionary
scope of the language.

However, one can point at a number of instances in natural and social
communication where symbols are used in a rather different way. Instead of
a "positional" context, a "compositional" one is the case. In this
alternative context, messages are exchanged as presences or absences of
symbols which have been accumulated upon predetermined sets of objects. No
meaningful positional relationships among the objects within the set or
among the symbols accumulated upon the objects are assumed. For example,
several glasses on a tray may contain a variable number of different
symbolic items (ice cubes, soda, vermouth, olives, cherries). The set of
glasses on the tray becomes the message, each glass being an individual
object that accumulates several symbols which precisely configure it as a
distinguishable object. Communication between two subjects could be
established by the exchange of trays, with a variable number of glasses and
variable contents within them. Karl Javorszky (1996) has postulated that,
in theory, messages can be reliably distinguished and transmitted by this
method "of concurrent processing of discrete states of media". The same
author has envisioned a whole body of partitional calculus (or granularity
algebra) (Javorszky, 1995).
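
The tray example can be made concrete with a minimal Python sketch. The
helper names glass() and tray() are hypothetical, introduced only for
illustration; the point is that a message modeled as an unordered multiset
of objects, each an unordered multiset of symbols, is unchanged by any
rearrangement.

```python
from collections import Counter

def glass(*symbols):
    """One object (a glass): the multiset of symbols accumulated upon it."""
    return frozenset(Counter(symbols).items())

def tray(*glasses):
    """One message (a tray): an unordered multiset of glasses."""
    return Counter(glasses)

a = tray(glass("ice", "ice", "soda"), glass("vermouth", "olive"))
b = tray(glass("olive", "vermouth"), glass("soda", "ice", "ice"))
c = tray(glass("ice", "soda"), glass("vermouth", "olive"))

# Rearranging glasses on the tray, or symbols within a glass, leaves the
# message unchanged; changing what is present (one ice cube fewer) does not.
same_ab = (a == b)
same_ac = (a == c)
```

Here a and b compare equal although they were written in a different
order, while c differs from a because its composition differs.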

More realistic examples of the compositional mode of information exchange
may be found in the communicational use of colors, odors and tastes by
individual organisms; also in social insects' pheromones; anecdotally in
the etiquette "language of flowers"; and perhaps within musical
compositions and within the formant frequencies of vowels and consonants
of our own spoken languages as well. The "language of cells", which we
discuss here, may be one of the most interesting instances of communication
by means of such compositional tools; and it has been the forerunner of
every later means of biological communication.

Marshall McLuhan's famous dictum "the medium is the message" and the
particular disdain this author showed for Shannon's information theory
(McLuhan, 1968) are worth remembering when considering this fundamental
distinction between positional and compositional forms of information
exchange.

2. Analyzing a Compositional Message

When receiving a compositional message encrusted upon a set of N elements,
there is very little a subject can do: just count the presence of the
different symbols on each element of the set. Hence the elements can be
grouped in homogeneous classes of overlapping or non-overlapping nature
(each class is defined by the presence of a specific symbol, and it
actually demarcates a partition of the set N). After the classes defined by
single symbols, the more complex coincidences of combinations of symbols
(class overlaps) among the elements can be counted. It can be easily proved
that all the possible countings of symbolic presences among the N elements
of the set, in the first case of linear or one-dimensional partitions for
single symbols, lead to the whole set of partitions of N, known as E(N)
(Javorszky, 1995). The coincidences or overlaps of successive combinations
of two, three, four symbols, etc., can in turn be considered as partitions
of second order, third order, fourth order, etc.: E2(N), E3(N), E4(N)...
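
The counting step can be sketched in Python as follows. This is only one
possible reading of the procedure: the function names symbol_classes() and
overlaps() and the sample data are assumptions made for illustration, not
Javorszky's formalism.

```python
from itertools import combinations

def symbol_classes(elements):
    """Map each symbol to the set of element indices that carry it."""
    classes = {}
    for index, symbols in enumerate(elements):
        for symbol in symbols:
            classes.setdefault(symbol, set()).add(index)
    return classes

def overlaps(classes, order):
    """Count elements carrying every symbol of each `order`-symbol combination."""
    result = {}
    for combo in combinations(sorted(classes), order):
        shared = set.intersection(*(classes[s] for s in combo))
        if shared:
            result[combo] = len(shared)
    return result

# A hypothetical message on N = 5 elements (illustrative data only).
elements = [{"ice", "soda"}, {"ice", "olive"}, {"vermouth", "olive"},
            {"ice"}, {"soda"}]
classes = symbol_classes(elements)

# First order: size of the class defined by each single symbol.
first_order = {s: len(members) for s, members in classes.items()}
# Second order: coincidences of two symbols on the same element.
second_order = overlaps(classes, 2)
```

Each single-symbol class splits the N elements into those that carry the
symbol and those that do not; overlaps(classes, 3) would count the
third-order coincidences in the same way.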

Mathematically, partitions are a very straightforward concept: the additive
decompositions of natural numbers. For instance, the set { (5), (4,1),
(3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1) } contains all the
one-dimensional partitions of 5. By adhering to this mathematical treatment,
one can use the well-known partitional properties of numbers to discuss the
most probable logical states of a compositional message encrusted upon the
elements of the set N.
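
The partitions of a small number can be enumerated with a short recursive
generator. The sketch below is a standard textbook construction, not taken
from the cited works; the name partitions() is our own.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    # Choose the first (largest) summand, then partition the remainder
    # using summands no larger than it.
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

E5 = list(partitions(5))
```

For n = 5 the generator yields the seven partitions listed above.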

The set of partitions E(N) can be turned directly into a probability
space. The probability that the set is in the state described by a
specific partition is given by the relative frequency of that partition
among all partitions. For instance, on E(5) the probabilities are

        1/7 - for each of the states (5), (2,1,1,1) and (1,1,1,1,1),
        2/7 - for states with 2 summands, and likewise for states with 3,
        15/20 - for any summand to be an odd number (15 of the 20 summands
                appearing across the seven partitions are odd).

Kmax is the number of summands k (between 1 and N) for which the
partitions of N into k summands are most numerous. In this most probable
partitional state, the set shows Kmax distinct summands with respect to a
single describing dimension. In the case of E(5), Kmax is shared by k = 2
and k = 3.
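
The figures quoted for E(5) can be checked with a small Python sketch. It
assumes, as the text does, that every partition is equally likely; the
function names are hypothetical and chosen for illustration.

```python
from collections import Counter
from fractions import Fraction

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def summand_count_distribution(n):
    """Probability that a uniformly chosen partition of n has k summands."""
    parts = list(partitions(n))
    counts = Counter(len(p) for p in parts)
    return {k: Fraction(c, len(parts)) for k, c in sorted(counts.items())}

def k_max(n):
    """All values of k for which the partitions of n into k summands are most numerous."""
    counts = Counter(len(p) for p in partitions(n))
    best = max(counts.values())
    return [k for k in sorted(counts) if counts[k] == best]

def odd_summand_fraction(n):
    """Fraction of odd summands among all summands of all partitions of n."""
    summands = [s for p in partitions(n) for s in p]
    return Fraction(sum(s % 2 for s in summands), len(summands))

dist5 = summand_count_distribution(5)
kmax5 = k_max(5)
odd5 = odd_summand_fraction(5)   # Fraction reduces 15/20 to 3/4
```

For N = 5 this gives 1/7 for k = 1, 4 and 5, 2/7 for k = 2 and k = 3, a
Kmax shared by 2 and 3, and 15/20 (reduced to 3/4) for odd summands.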

Heuristically, it appears that a compositional message can be univocally
described by its corresponding "trace" of unidimensional partitions
(Javorszky, 1996), if a few additional statistical measures that act as a
sort of context or shared background in the communicational process have
been previously established: most probable message length, ratio of
symbols/elements, structural depth, shallowness, etc. The use of
partitions of higher order (second, third, etc.) then becomes redundant,
and their inclusion would notably complicate the mathematical description
of the message.

Additionally, the Kmax dimension of every property or symbolic presence
may be used as an origin or natural canon to which the respective
deviations of successive messages can refer. This further simplifies the
partitional "trace" describing a specific message in the context of an
ongoing communication process.

Karl Javorszky (1996) has argued that an efficient, massively parallel
communication procedure can be built around such minimized partitional
traces or message simplifications. It seems to work particularly well with
data sets of moderate size, which are preferably prestructured and arrive
in a quasi-continuous stream, so that the number of possible symbols is
always kept small. Although symbols might come from an infinite multitude,
a relatively small collection of distinguishing items should be employed
in any communicational session, and their group relations should not
generate a cardinality overstatement (symbols/elements) above a certain
limit.

To the extent that Javorszky's estimates are correct, the overall capacity
of a compositional channel making use of discrete states of media can be
generically expressed as

T(N) = E(N) ^ ln E(N),

that is, E(N) raised to the power ln E(N), understanding T(N) as the
number of different logical states which can be distinguished by means of
collections of symbols put on the elements of the set N. Only
non-redundant states are counted, for redundant symbol groups can always
be substituted by single symbols, coalescing into a unique logical state.
E(N) is the already mentioned number of partitions of the set N.

Calculating the respective probabilities and taking logarithms, one could
obtain an expression in bits for the compositional-channel capacity (or the
entropy of a compositional information-source) in some way paralleling the
famous Shannonian entropy.

The comparison between T(N) and the strictly positional use of the same
elements of the set N in a combinatory way (which, in principle, should
provide a total of N! different messages or logical states) is also
interesting. In this comparison, T(N) yields a larger number of logical
states than N! for values of N between 31 and 95, with a maximum around
63-64. However, for N = 12, the number of combinations N! reaches a maximum
with respect to T(N).
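
The comparison can be explored numerically with the sketch below. It rests
on two assumptions of ours: that the capacity formula reads as E(N) raised
to the power ln E(N), and that E(N) is the ordinary partition-counting
function. Everything is kept in natural logarithms to avoid very large
numbers, and the function names are hypothetical. The exact crossover
points depend on these assumptions and are not asserted here.

```python
import math

def partition_count(n):
    """Number of partitions of n, by the standard coin-change dynamic programme."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

def log_T(n):
    """ln T(N) under the assumed reading T(N) = E(N) ** ln E(N), i.e. (ln E(N)) ** 2."""
    return math.log(partition_count(n)) ** 2

def log_factorial(n):
    """ln N!, computed through the log-gamma function."""
    return math.lgamma(n + 1)

def capacity_bits(n):
    """log2 T(N): the assumed compositional capacity expressed in bits."""
    return log_T(n) / math.log(2)

def log_ratio(n):
    """ln(T(N) / N!): positive where the compositional count exceeds N!."""
    return log_T(n) - log_factorial(n)

# Tabulate the comparison for a few values of N.
table = {n: round(log_ratio(n), 3) for n in (12, 31, 63, 95, 150)}
```

Under this reading, log_ratio(63) is positive while log_ratio(12) and
log_ratio(150) are negative, in line with the qualitative picture
described above.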

Seemingly, several parameters of the genetic code would correspond to
such max./min. extremes that characterize the compositional-positional
interrelationship (see Javorszky, 1995, for the detailed expression of all
these formulae and calculations).

3. A Partitional Approach to Cellular Communication

How can the above compositional considerations be applied to the analysis
of inter- and intra-cellular processes? Instead of DNA sequences, it seems
that the natural target of this new approach should be the "mysterious"
processing operations performed by the cellular signaling system.

The basic idea to play with is that any array (sample) of chemical
compositions detected by the set of cellular receptors may be recast as a
partitional state, for we may consider that it has been obtained by
distinguishing the presence of a series of specific chemical signs within
a hypothetical set of N elements or receptor complexes--it is a transient
message coming from the environment through an ongoing compositional
communication process.

Then, an immediate question arises. Could the system of receptors,
membrane-bound enzyme and protein complexes, second messengers, and the
dedicated kinase and phosphatase chains, be understood as an abstract
partitional processing-system capable of extracting the minimized "trace"
or relative information differences within the stream of incoming
compositional messages, and of physically transporting these differences
down to final effectors at the nucleus, cytoplasm, or membrane? That is the
basic hypothesis the two authors are presently trying to explore.

If (and what a big "if") cells made use of formal tools of partitional
nature when managing, by means of the cellular signaling system, the
compositional messages they receive, then the notion of a genuine cellular
language, with specific dialects for every organismic tissue, could be
seriously argued. And perhaps more interesting than that, quite a few
other bizarre aspects of the signaling system could receive some more
formal (and simpler) treatment: the cross-talk between
signaling pathways, the checkpoints relating signaling operations with
cell-cycle stages, and even the widespread formation of aggregates and
complexes among signaling components...

Partitions are very direct formal tools, but at the same time sophisticated
ones. For instance, if natural numbers are left to "oscillate" and
simultaneously their corresponding partitions are allowed to grow and
shrink (Javorszky, 1998), then there emerges a variety of self-organization
patterns and internally-driven numerical processes which seemingly parallel
the constructive/degradative aspects so prevalent in biological and social
occurrences (and even physical ones).

The studies by Caianiello (1987) on the partitional dynamics inherent in
monetary systems and the suggestion by one of us (Marijuan, 1998) about the
"currency" role actually played by the set of second messengers in the
internal measurement of cellular function might finally be stepping stones
pointing in the same direction.

----------------

-----------------------------------------------------------
Pedro C. Marijuan --FAX 34 976 761 861 --TEL 34 976 761 927
Dto. Ingenieria Electronica y Comunicaciones
CPS, Universidad de Zaragoza
Zaragoza 50015, SPAIN
Received on Thu Feb 26 10:00:31 1998
