Foundations of Information Science : stdin : [Fis] ON MOLECULAR BIONETWORKS

From: by way of Pedro Marijuan <Jerry.LR.Chandler@Cox.net>
Date: Tue 08 Nov 2005 - 11:20:33 CET

9th FIS Discussion Session:

ON MOLECULAR BIONETWORKS

Jerry L.R. Chandler
Research Professor
Krasnow Institute for Advanced Study
George Mason University

Kevin G. Kirby
Department of Computer Science,
Northern Kentucky University
Highland Heights, Kentucky

Introductory Remarks 1:

12 Questions about relations between molecular bionetworks and information
theory

(by Jerry L.R. Chandler)

I agreed to write a brief introduction to the subject, knowing that I could
not do justice to such a complex topic in such a short time. As little is
known about the subject, I chose to introduce topics and to pose questions
about potential relations. The questions are significant but not ordered
by importance or priority. Hopefully, participants will contribute to
exploring the meanings of the questions and addressing conceivable
approaches to answers. Perhaps one objective could be to discuss the
relative importance of the 12 questions.

The concept of molecular bionetworks is a modern concept. The
development of this concept will continue into the indefinite future. That
is to say, the concept is, at present, well defined but in a primitive or
early stage of development.

1. From an informational perspective what would constitute a definitive
theory of the information content of molecular bionetworks?

Historically, Kirchhoff’s theory of flow in electrical networks, dating
from the 1840’s, grounds the metaphor behind the conceptual framework for
molecular bionetworks. However, the relative simplicity of the flow of
electrons in comparison to the flow of life restricts the analogy
severely. Nevertheless, biomolecules are composed from electrical
particles and certain electron flow patterns are intrinsic to metabolic
networks in living systems.

2. From the perspective of information, how is it possible that bionetworks
construct the informed flows of electrical current, and how are these flows
related to metabolic flows?

Historically, the investigation of the empirical basis of molecular
bionetworks (metabolism) started shortly after Pasteur’s pioneering
experiments on the causes of fermentation (1870’s). At that time, the
fermentation of grape juice was difficult to control but of great economic
importance. (Bad wine sells cheaply!) Quantitative analysis of yeast
fermentations showed that each molecule of glucose generated two molecules
of the 2-carbon alcohol and two molecules of carbon dioxide. The yeast
cell informed the specific destruction of the sugar. (If the 6-carbon
sugar were merely burned, it would produce exactly six molecules of carbon
dioxide.) Thus arose the thermodynamic question of why the yeast cell did
not burn the 6-carbon sugar entirely into carbon dioxide.
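
The two stoichiometries described above can be checked by counting atoms.
The following is only a minimal sketch in Python, assuming the standard
molecular formulas for glucose, ethanol, carbon dioxide, oxygen and water;
the helper names (total_atoms, is_balanced) are invented for illustration
and are not taken from any chemistry library.

```python
from collections import Counter

# Atom counts per molecule (standard molecular formulas).
GLUCOSE = Counter({"C": 6, "H": 12, "O": 6})
ETHANOL = Counter({"C": 2, "H": 6, "O": 1})
CO2 = Counter({"C": 1, "O": 2})
O2 = Counter({"O": 2})
WATER = Counter({"H": 2, "O": 1})


def total_atoms(side):
    """Sum atom counts over a list of (coefficient, molecule) pairs."""
    total = Counter()
    for coefficient, molecule in side:
        for element, count in molecule.items():
            total[element] += coefficient * count
    return total


def is_balanced(reactants, products):
    """True when both sides carry the same number of each kind of atom."""
    return total_atoms(reactants) == total_atoms(products)


# Fermentation: one glucose gives two ethanol and two carbon dioxide.
fermentation = ([(1, GLUCOSE)], [(2, ETHANOL), (2, CO2)])
# Complete burning: one glucose plus six oxygen gives six CO2 and six water.
combustion = ([(1, GLUCOSE), (6, O2)], [(6, CO2), (6, WATER)])

print(is_balanced(*fermentation))
print(is_balanced(*combustion))
```

Both reactions balance, yet only two of the six carbon atoms leave the
yeast cell as carbon dioxide in fermentation, against all six in burning.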

3. From an informational perspective, what does this incomplete
fermentation process of glucose suggest about the role of thermodynamics in
living processes?

Continuing with the historical roots of biomolecular networks, early in the
20th century, the exact chemical pathway between glucose and alcohol and
CO2 was decoded. The code of the cell for fermentation was that each
chemical reaction was catalysed by an informed agent,
termed a “zwischenferment”. These agents are now known as enzymes. Thus,
the question of how a cell knows how to ferment sugar was regressed to a
deeper informational question: How does an enzyme know how to inform a
chemical change?

4. From an informational perspective, what is the nature of the code that
an enzyme contains such that it conducts an informed catalytic process?

Continuing with the historical perspective, by the 1950s, it was known that
biological information was carried in chemical sequences, that enzymes were
composed from amino acid sequences and that nucleic acids were composed
from nucleotides. In the 1960’s, the convergence of experimental work in
genetics, biochemistry, x-ray crystallography and nutrition with Shannon’s
coding theory generated the hypothesis that biological information is
encoded in chemical structures. This hypothesis remains intact
today. During the period from 1970 to the present, methods for determining
the chemical sequences of biomolecules were developed and
automated. Today, sequencing of microgram quantities of biomolecules is
routine. The complete DNA sequences of hundreds of biological species are
known. The sequences of tens of thousands of proteins are known. The
proteins, the so-to-speak “cybernetic agents,” are often hundreds of
times as large as the small molecules that they control. Interpolation
between protein sequences and DNA sequences is routinely conducted via
computer programs. The relation between the information in DNA or in
proteins and the substrates for catalytic reactions is deeply mysterious.

5. From an informational perspective, given solely the DNA and protein
sequences of an organism, what can we say about the biological functioning
of the organism?

The classification of chemical structures has been given a new set of names
in the past decade. Genomics is the study of the genome, the DNA
sequences and their properties; proteomics is the study of protein
structures and properties; metabolomics is the study of “small”
molecules where the sequences of atoms are relatively short. (The use of
the Greek root, “nomos”, as in law (for example, in autonomy), is somewhat
misleading in this context.)

Biomolecular networks consist of all the molecules in a cell. One crucial
feature of such networks is the capacity to generate biological functions.

6. From an informational perspective, what is the nature of the information
content of a collection of molecules such that they generate biological
function?

A second crucial feature of such networks is the capacity, in an
appropriate ecosystem, to reproduce themselves with exactly the same
chemical structures.

7. From an informational perspective, what is the nature of the relational
interactions between the interior and the exterior of an organism such that
the information is reproduced?

8. From an informational perspective, is the reproduction of a cell a
mathematical calculation?

Biomolecular networks function in time. In humans, the generation of
temporal networks of the heart, the brain, the menses, and so forth are
intricate flows of orderly information that manifest themselves in regular
and chaotic rhythms. In simple unicellular organisms, the growth rate and
cellular division can follow strict rhythms. Such rhythms can be
modulated, altered, extended or stopped by chemical agents that are foreign
to the bionetwork.

9. From an informational perspective, what are the relations between
chemical structures and temporal rhythms? In what sense are the rhythms of
temporal chemical relations the source of biological information? What is
the information content of a chemical rhythm?

Early in the history of biology, organisms were classified based on
categories, more or less following the Aristotelian conceptualization of
Porphyrian trees, choices of potential properties. The ten Aristotelian
categories, the ten highest genera, are listed as: substance, quantity,
quality, relation, place, time, position, state, action, and being acted
on. Biological function is related to these categories. I presume that a
property of the molecular bionetwork encodes and decodes information
concerning these genera.

10. From modern information theory, how can we improve the Aristotelian
categorization of properties of organisms, based on the qualitative and
quantitative attributes of the biomolecular networks? How should the
concept of information be conceptualized such that it will illuminate
communication about molecular bionetworks or biological function?

Each molecular bionetwork is specific to the particular species in which it
exists. However, the similarity of bionetworks roughly parallels the
separation of species in the “tree of life.” Virtually all
self-reproducing species include a common set of small molecules: amino
acids, fats, sugars, vitamins, and minerals. The degree of similarity of
sequences of proteins, RNA and DNA varies with the “tree of life”
separation. Indeed, classification of species, genera and families is now
based on calculations of sequence similarity.
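
One of the simplest calculations of sequence similarity is percent
identity between two aligned sequences. The sketch below is a bare
illustration in Python and not the method of any particular classification
pipeline. It assumes the sequences are already aligned to equal length,
that gaps are written as "-", and that gapped positions are skipped; the
function name is invented here.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences agree.

    Assumes the sequences are already aligned and of equal length;
    positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # a gap carries no residue to compare
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared
```

For example, percent_identity("ACGT", "ACGA") compares four positions and
finds three in agreement. Real comparisons first need an alignment step,
which is omitted here.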

11. From an informational perspective, what is the significance of the
sequence similarities for the theory of biological information? For the
concepts of encoding and decoding of molecular bionetworks?

One approach to the partial analysis of the meaning of the genetic code was
developed from Shannon’s theory by Dr. Thomas Schneider.
http://www.lecb.ncifcrf.gov/~toms/
The theory is specific for one aspect of the molecular bionetwork, namely
binding of proteins to DNA. His website provides a detailed account of
his philosophy.
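
To give a flavor of how Shannon’s theory can be applied to protein binding
sites on DNA, the sketch below computes, for each position in a set of
aligned sites, the maximum uncertainty (2 bits for four bases) minus the
observed Shannon uncertainty at that position. This is a simplified
illustration in Python and not a reproduction of Schneider’s published
procedure; in particular it assumes equally likely background bases and
applies no correction for small sample sizes. The function name and the
example sites are invented.

```python
import math
from collections import Counter


def site_information(aligned_sites, alphabet="ACGT"):
    """Information (in bits) at each position of aligned binding sites.

    Simplifying assumptions: all sites have the same length, the
    background bases are equally likely, and no small-sample
    correction is applied.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    h_max = math.log2(len(alphabet))  # 2 bits for a four-letter alphabet
    bits = []
    for position in range(length):
        counts = Counter(site[position] for site in aligned_sites)
        total = sum(counts.values())
        # Shannon uncertainty of the observed base frequencies.
        uncertainty = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        bits.append(h_max - uncertainty)
    return bits


# Hypothetical sites: position 0 is fully conserved, position 1 is not.
example_sites = ["AC", "AG", "AT", "AA"]
print(site_information(example_sites))
```

A fully conserved position contributes the full 2 bits, while a position
where all four bases occur equally often contributes none.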

12. From an informational perspective, the encoding processes for
DNA-protein relationships are somehow related to Shannon’s theory. What are
the relations between this sort of encoding and other metabolic
encoding? In particular, can we imagine a catalytic type of encoding
that parallels the genetic encoding?

----------------------------------------------------------------------------------------------------------------------------

Introductory Remarks 2:

The Natural Computation Perspective

(by Kevin Kirby)

If anything can be called a "science of the artificial" it is Computer
Science, or so one might think. In fact, such thinking is incorrect. And
it is precisely in the field of biology -- with molecular biology as the
best exemplar -- where a clearer understanding of Natural Computation will
be so valuable.

The initial encounter between computing and nature happens in the
computational modeling of natural systems. For example, we now see many
object-oriented models of ecosystems. More deeply into computer science, we
find questions such as, for example, whether a natural language has
such-and-such a computational property (e.g. whether Dutch is a
context-free language). But beyond providing a way to talk about models of
nature, some have viewed computer science as saying something about nature
itself, constraining it in some way. A very strong form of the
Church-Turing thesis can be taken as a physical law. And the theory of
quantum information has made the "it from bit" slogan quite precise. It is
in these latter senses that a true “natural computer science” begins to
take shape.

One way to further this approach is to turn to biological systems, and take
a look at the simulation relation. How do we say a biological system
computes X? Well, we see if there is a dynamics-preserving mapping between
inputs and states of the biological system and a given formal system for X.
This relation is usually written as a commutative diagram. The simulation
relation is central in automata theory and was recast into a
category-theoretic framework by Arbib and Goguen (taking different
approaches). But the notion of a mapping between a biological system and
a formal system, seems to be, at first glance, a category mistake! As soon
as one identifies a fragment of nature as a system, one has locked in some
set of states, and it is hard to separate the true computational power of a
living system from what accrues merely to our conventional state
assignment. This is taken up nicely by the philosopher David Chalmers in a
response to a very strong statement at the conventionality end by Hilary
Putnam. (One could see this as a recasting of the debate in Plato's
Cratylus in computational terms!)
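
The commutativity condition behind the simulation relation can be made
concrete with a toy example. The sketch below, in Python, checks whether
mapping a system state and then stepping the formal system gives the same
result as stepping the system and then mapping. Everything in it is an
invented illustration: the "material" system is just a counter on a ring
of n states, the formal system is a one-bit register, and the state
assignment reads each state as its parity.

```python
from itertools import product


def commutes(states, inputs, step_system, step_formal, mapping):
    """Check mapping(step_system(s, x)) == step_formal(mapping(s), x)
    for every state s and every input x."""
    return all(
        mapping(step_system(s, x)) == step_formal(mapping(s), x)
        for s, x in product(states, inputs)
    )


def parity_flip(bit, x):
    """Formal system: a one-bit register that flips when x is 1."""
    return (bit + x) % 2


def parity(state):
    """State assignment: read each system state as its parity."""
    return state % 2


def ring(n):
    """Toy 'material' system: a counter stepping around n states."""
    return lambda state, x: (state + x) % n


print(commutes(range(6), (0, 1), ring(6), parity_flip, parity))
print(commutes(range(5), (0, 1), ring(5), parity_flip, parity))
```

With six states the diagram commutes, so the ring can be said to compute
parity under this assignment. With five states it fails at the wrap-around
from state 4 to state 0. Whether the failure belongs to the system or to
our choice of state assignment is exactly the conventionality worry
raised above.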

This tension between the formal and the material seems to lie at the heart
of the field of natural computing. The work of Michael Conrad emphasized
the special role of biological material to explain the fantastic outcomes
of evolution, as opposed to any power inhering in the class of relatively
simple Darwinian algorithms. In the mutation-absorption model of the
enzyme, we can begin to see how computational power emerges from the
breakdown in the simulation relation, by the failure of commutativity (and,
in the mathematical sense, the creation of torsion). This seems to be the
vexing locus of this new field: clarifying precisely what happens when
formal systems fail to track changes in fragments of nature.

In this FIS session on molecular bionetworks, I think we may have the
opportunity to find a clearer understanding here. Jerry Chandler's work
deals with the connection between the formal (or symbolic) and the
biological, and it seems these two perspectives may be mutually
illuminating, and give us all the chance for some brainstorming in this
largely unexplored area. Perhaps natural computing takes us back to the
medieval meaning of natural as "sublunar" (to bring up a term recirculated
lately by Pedro Marijuan), dealing with the mutable substances down here
under the orbit of the moon, illuminated by, but separated from, the
immutable forms of the celestial world.

One final introductory thought. In imputing computing to a fragment of
nature -- a ribosome, say -- we should not view ourselves as reducing it to
the apparently impoverished level of a Turing machine or a Pentium 4.
Computer Science is in its infancy; the ribosome, the cell, the ecosystem:
these are all better exemplars of computational wonder than our humble
devices. If only we had a theory…

____________________________________
Received on Tue Nov 8 11:15:52 2005