Management of the CruzDB database catalog

Welcome back to our series on the architecture of CruzDB. In the previous post we discussed how afterimages and persistent pointers are managed to enable parallel I/O. Today’s post is short and covers the implementation of the database catalog.

The database catalog

In the third post in this series we discussed transaction processing. One aspect covered there was conflict analysis, which involves identifying and analyzing the committed intentions in a specific range of the log. That post described how the database catalog is used to store an index of committed intentions in the log, but didn’t discuss how the catalog is actually implemented or managed.

In addition to the committed intention index, CruzDB may maintain a variety of other metadata, such as named snapshots, statistics, and schema information. The catalog serves as a system-level service for managing this metadata. It is stored in the database itself: disjoint namespaces are constructed by prepending a distinct key prefix for each type of information being stored. The following diagram depicts the committed_intention_<position> namespace alongside the namespace used for application records.

In practice the prefix is lightweight: a 0-terminated string, typically a single letter. The internal transaction interface accepts a prefix when running operations, and the user-facing interfaces automatically insert the application prefix.
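To make the key layout concrete, here is a minimal sketch of prefixed keys. It is not CruzDB's actual code. The prefix letters (`a` for application records, `c` for committed intentions), the function names, and the fixed-width big-endian encoding of the log position are all assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical single-letter prefixes; the letters CruzDB really uses are not given here.
APP_PREFIX = b"a"
INTENTION_PREFIX = b"c"
TERMINATOR = b"\x00"


def encode_key(prefix: bytes, key: bytes) -> bytes:
    """Build a namespaced key: prefix, a 0 byte terminator, then the key itself."""
    if TERMINATOR in prefix:
        raise ValueError("prefix must not contain a 0 byte")
    return prefix + TERMINATOR + key


def decode_key(raw: bytes) -> Tuple[bytes, bytes]:
    """Split a namespaced key at the first 0 byte into (prefix, key)."""
    prefix, sep, key = raw.partition(TERMINATOR)
    if not sep:
        raise ValueError("key has no prefix terminator")
    return prefix, key


def intention_key(position: int) -> bytes:
    """Key for a committed intention at a log position (assumed encoding).

    Fixed-width big-endian bytes make byte order agree with numeric order.
    """
    return encode_key(INTENTION_PREFIX, position.to_bytes(8, "big"))
```

Because every key in a namespace shares the same leading bytes, the keys of that namespace sort next to each other. In this sketch `encode_key(APP_PREFIX, b"user:1")` produces `b"a\x00user:1"`, and all `a` keys order before all `c` keys.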

To make dealing with prefixed keys easier, CruzDB internally provides three prefix-aware iterators that handle prefixes differently. The first, the raw iterator, iterates over the entire database and returns keys with their prefix intact. The second, the raw prefix iterator, also exposes the prefix, but iterates only over the namespace defined by a given prefix. The third, the filtered prefix iterator, operates within a specific namespace and also removes the prefix automatically. When an application opens an iterator, it is given an instance of the filtered prefix iterator with the prefix set to the application prefix. As a result, the application sees only the key space that it has created.
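The three iterator types can be sketched as generators over a sorted key space. This is an illustration only: a plain dictionary stands in for the database, and the function names and the prefix letters `a` and `c` are assumptions, not CruzDB's interface.

```python
from typing import Dict, Iterator, Tuple

TERMINATOR = b"\x00"  # separates the prefix from the rest of the key

Entry = Tuple[bytes, bytes]


def raw_iterator(db: Dict[bytes, bytes]) -> Iterator[Entry]:
    """Iterate the whole database in key order; keys keep their prefix."""
    for key in sorted(db):
        yield key, db[key]


def raw_prefix_iterator(db: Dict[bytes, bytes], prefix: bytes) -> Iterator[Entry]:
    """Iterate only one namespace; keys still keep their prefix."""
    lead = prefix + TERMINATOR
    for key, value in raw_iterator(db):
        if key.startswith(lead):
            yield key, value


def filtered_prefix_iterator(db: Dict[bytes, bytes], prefix: bytes) -> Iterator[Entry]:
    """Iterate only one namespace and strip the prefix from each key."""
    lead = prefix + TERMINATOR
    for key, value in raw_prefix_iterator(db, prefix):
        yield key[len(lead):], value


# Sample data: two application records and one catalog entry (hypothetical prefixes).
db = {
    b"a\x00banana": b"2",
    b"c\x00" + (7).to_bytes(8, "big"): b"intention",
    b"a\x00apple": b"1",
}

# What an application would see: its own keys, with no prefix.
app_view = list(filtered_prefix_iterator(db, b"a"))
```

Here `app_view` contains only the `apple` and `banana` entries, without the prefix, while the catalog entry under `c` stays hidden. The sketch scans every key for simplicity; an implementation over an ordered structure would presumably seek to the first key of the namespace and stop at its end.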

In the next post we’ll take a look at garbage collection and free space management for the underlying log.

And a special thanks to Jeff LeFevre for providing feedback and edits on this post.