Aug 2005
Ordering and Information
Introduction
Bit fields can be described as an ordered set of bits, and can be naivly implemented as an array of booleans. Since the DBOS does not have primitive support for ordered structures, the need for an ordered primitive is imperative.
Ordering is used to transmit and receive over communication lines. It is extensivly used in existing operating systems and hardware, like files and machine code respectivly. The DBOS must have a technique for dealing with these ordered structures. The bit field is the solution, and in this way bit fields are one of the most fundamental logical structures used in current computation models. They are an atomic structure that encapsulate the concept of information storage.
Information is in Order
The simplest object capable of storing information is the bit. But it is the ordering of bits that allows a small number of bits to store a large amount of information. In a high level way, information is stored in the ordering of objects. We want to prove this by showing that more information can be stored in an ordered set of objects than an unordered set of objects.
The number of combinations given n ordered slots, and m objects to choose from allowing duplication is:
n
Co(n,m)= m
The same but with the n slots unordered, stores an information count given by
(n + m - 1)!
Cu(n,m)= -------------
n! (m-1)!
To start our proof, the base case must be analysed. We expose the base case where n=1 and m=2. Clearly Co >= Cu.
Co(1,2)= 2 Cu(1,2)= 2
We now prove that, given any fixed m, that Co(n,m) > Cu(n,m) if Co(n-1,m)>=Cu(n-1,m) for n>=2 and m>=2. But first observe the following identities:
n n-1
Co(n,m)= m = m m = m Co(n-1,m)
(n+m-1)! (n+m-1)(n-1+m-1)! (n+m-1)
Cu(n,m)= ----------- = ------------------- = ------- Cu(n-1,m)
n! (m-1)! n(n-1)!(m-1)! n
When n and m are both greater, or equal to 2 we note that n+m-1 < n*m. Therefore
(n+m-1)
Co(n,m) = m Co(n-1,m) > ------ Cu(n-1,m) = Cu(n,m)
n
Comparing, we see that order/unorder > 1 for all n>=2 and m>=2, concluding that more information can be stored in an ordered set of objects. Below is a grid of comparisons for the first several n and m values.
Cu/Co (Unordered/Ordered)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
0.010 |
n is the number slots, m is the number of objects
In ths case of bit fields we are interested in the case of m=2 (binary), and we see that at n=6 unordered bits can store only 10.9% the information that ordered bits can.
Another interesting situation is when m=n. This represents a multi-set of objects stored in a list of slots. Many bits worth of information can be stored in the ordering of the objects. For example when m=n=6 we have Cu/Co=0.010, which is just under 7 bits of additional information storage.
Order is Information Hiding
A consequence of being able to put information into ordering also allows ordering to be a form of information hiding. The interpretation of symbols in a sequence requires the reader to know the context under which the ordering was made. If this context is not explicit, or unknown, then the meaning of the symbol sequence is hidden. Even if the context is known, there still needs to be interpretation.
Although highly contextual serialization improves transmission speed and data storage, it is not a good thing for interpretation. To prevent information hiding in DBOS design, and reduce the interpretation times. We avoid the uses of order where ever we can. Strings are used exclusivly for human interaction, no internal logical representations use strings. Method parameters use keywords instead of an ordered list. Program specifications are encouraged toward unordered clauses instead of instruction sequences (eg. SQL over procedural).
Ordered sequences can never be removed completely. Our machines work on ordered instructions, and many problems are ordering problems. Therefore there will always be a need for languages, or language aspects, that are ordered.
June 2000, initial revision
July 10th 2000, added the technical comparison of ordered to unordered
July 16th 2001, made more readable, better introduction
Jan 26th 2002, Added order is information hiding
Aug 3, 2005: Added m=n situation
.