Feb 2000
UIDs and DBOS
In the DBOS, every record in every table has a Unique Identity (UID) attribute. The use of this attribute "with no business meaning" is quite common in DB design, but the method of assigning UIDs can vary. I like to think of this UID as an indicator of the location the object occupies in an object space.
A UID authority is an object or function the generates keys for the system to use. It is the responsibility of the UID authority is to guarantee that the UIDs truly are unique.
There are three main ways that object UIDs are stored in databases.
- Method 1, unique by table:
Used mostly when the class types are unrelated, and there are no 1-1 relations. This type of uniqueness has the advantage of not having to go to a central authority that assigns UIDs. Each table can have its own, allowing for greater parallelism. If the number of records in the table is small then, the key used can also be small. One to One relations have to be stored as separate attributes, so may cause a waste of space, and a complication of queries.
- Method 2, unique by object:
When class hierarchies are still relatively flat, or there are many 1-1 relations, it serves well to have each hierarchy have its own UID authority. The demand on the UID authority is greater then method 1, but this could be trivial depending on depth.
- Method 3, globally unique GUID (Keys):
If there is only one class hierarchy, or multiple instantiation is allowed then the number of 1-1 relations is large: each object can have many records in many tables. To keep the number of 1-1 fields low there should be just a a single key field. But having generating keys puts a severe demand on the UID authority; to the point where key generation could be a bottleneck. The number of possible keys that could be required can get quite large, forcing a large number of bits to keep each unique. Large keys can be the dominant portion of a record in tables that have few attributes. The benefits of GUID are great, especially if object can change class on the fly. Solutions have arisen that try and solve the problems.
Generation of Keys
Having a single authority distributing keys is ineffective, and a bottleneck. Here of some of the common solutions for replacing a single authority. Many solutions would mix the various types.
- Method 1, random luck
Definitely the easiest to implement, but there may be conflicts. The number of bits used may have to be unusually large to accommodate a low probability of a duplicate. To have a one in one million chance of a duplicate there will be a waste of 20bits per key!
- Method 2, multi-level generation
This breaks the problem of generating Keys into high and low order bits. It uses active assigning, where the high order bits are assigned by a single UID authority and the low order bits are assigned by one of many local UID authorities. This releases the pressure off the major authority, by making it appear to deliver more keys per request. The entire key space can be used, but because of the arbitrary high/low points in the key some processes may never use them all, others may use so many that they put considerable strain on the major authority. These two extremes have to be balanced.
- Method 3, physical attributes
Keys can be generated by concatenating many near-unique parameters together. Some of these parameters could be Ethernet card ID, or time-of-day. This method has the advantage of generating keys almost as easily as random luck, with an added benefit of no duplicates. It has drawbacks with respect to GUID waste (clock-times that no GUID is requested, or Ethernet cards never used) this waste could be in the 10s of bits per UID. Second, this method may not be able to generate enough keys for fast machines where the GUID requests are coming in faster then the clock accuracy.
- Method 4, key servers
Keys are served in consecutive blocks to other key servers, and eventually to individual processes. Large and small demands alike can be satisfied. The drawbacks to this method include lost keys due to an aborted process, or a lost machine. Also, keys can not be used unless there is key server, with keys, to serve.
Arcavia plans to have DBOS use key servers. There are issues about single machines, with no access to keys. It may be solved by tracking an internal and external set of keys until true keys are available.