TinyCDB is a very fast and simple package for creating and reading constant databases, a data structure introduced by Dan J. Bernstein in his cdb package. It can be used to speed up searches in a sequence of (key, value) pairs with a very large number of records. One example is indexing a big list of users, where a search would otherwise require a linear read of a large /etc/passwd file; there are many other uses. Its usage and API are similar to those of BerkeleyDB, gdbm and the traditional *nix dbm/ndbm libraries, and it is compatible to a great extent with the cdb-0.75 package by Dan Bernstein.
CDB is a constant database: it cannot be updated at runtime, only rebuilt. Rebuilding is an atomic operation and is very fast, much faster than with many other similar packages. Once created, a CDB may be queried, and a query takes very little time to complete.
The library, -lcdb, provides two interfaces: a create interface, which is used to create a CDB file, and two variants of a query interface. A program using any of the routines should #include the <cdb.h> header file, which holds all definitions the library requires. More information, together with a detailed description of every routine, is available in the manual page inside the TinyCDB package.
TinyCDB is different from Dan's cdb-0.75 in the following ways:
The create interface is built around struct cdb_make, which is an opaque type. The following sequence of actions should be performed in order to create a CDB file (error handling is omitted):

    struct cdb_make cdbm;
    int fd;
    char *key, *val;
    unsigned klen, vlen;

    fd = open(tmpfile, O_RDWR|O_CREAT, 0644);
      /* open temporary file; O_CREAT requires a mode, 0644 is only an example */
    cdb_make_start(&cdbm, fd);
      /* initialize structure */
    cdb_make_add(&cdbm, key, klen, val, vlen);
      /* add as many records as needed */
    cdb_make_put(&cdbm, key, klen, val, vlen, flag);
      /* alternative interface. flag is one of:
           CDB_PUT_ADD      add a new record unconditionally, like cdb_make_add()
           CDB_PUT_REPLACE  if the key already exists, replace the record
           CDB_PUT_INSERT   add a record only if the key does not already exist */
    cdb_make_exists(&cdbm, key, klen);
      /* test whether a given key already exists */
    cdb_make_finish(&cdbm);
      /* final stage: write indexes to the CDB file */
    rename(tmpfile, cdbfile);
      /* atomically replace the CDB file with the newly built one */
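The last step of the sequence above, building into a temporary file and then calling rename(), is what makes the rebuild atomic for readers. A minimal sketch of that pattern in plain POSIX C follows. It does not use TinyCDB at all: replace_file_atomically() is a hypothetical helper name, the payload is an arbitrary byte buffer standing in for the data cdb_make_finish() would have written, and the fsync() call and the 0644 mode are assumptions about what a careful caller might want, not something the package prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of TinyCDB): write len bytes of data to
 * tmp_path, then rename it over final_path. Both paths are assumed to be
 * on the same filesystem, otherwise rename() fails.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
static int replace_file_atomically(const char *tmp_path, const char *final_path,
                                   const void *data, size_t len)
{
    const char *p = data;
    int fd;

    assert(tmp_path != NULL && final_path != NULL);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the write */
            close(fd);
            unlink(tmp_path);       /* discard the partial file */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Flush the data before the rename makes the new file visible. */
    if (fsync(fd) < 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp_path);
        return -1;
    }

    /* Readers see either the old file or the complete new one. */
    if (rename(tmp_path, final_path) < 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

A process that already has the old file open keeps reading the old contents; only a later open() picks up the new file.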
There are two variants of the query interface: one as found in cdb-0.75, and another as found in earlier versions of cdb (cdb-0.6x).
The first variant is built around struct cdb, which is opaque to the application. It is designed to be efficient for many queries; for a single query, the second variant may be more efficient. The following sequence of calls performs a query for a value in a CDB file:

    int fd;
    struct cdb cdb;
    char *key, *val;
    unsigned klen, vlen, vpos;

    fd = open(cdbfile, O_RDONLY);
    cdb_init(&cdb, fd);                  /* initialize internal structure */
    if (cdb_find(&cdb, key, klen) > 0) { /* if the search was successful */
      vpos = cdb_datapos(&cdb);          /* position of data in the file */
      vlen = cdb_datalen(&cdb);          /* length of data */
      val = malloc(vlen);                /* allocate memory */
      cdb_read(&cdb, val, vlen, vpos);   /* read the value into the buffer */
      /* ... handle the value ... */
      free(val);
    }

Here is what is needed to enumerate all values associated with a given key:

    struct cdb_find cdbf;  /* structure to hold the current find position */

    cdb_findinit(&cdbf, &cdb, key, klen);  /* initialize search of key */
    while (cdb_findnext(&cdbf) > 0) {
      vpos = cdb_datapos(&cdb);
      vlen = cdb_datalen(&cdb);
      val = malloc(vlen);
      cdb_read(&cdb, val, vlen, vpos);
      /* handle the value */
      free(val);
    }
Another, simpler query interface exists which is suitable for a single query. The two routines provided work with a single file descriptor opened for reading:

    int fd;
    char *key, *val;
    unsigned klen;
    cdbi_t vlen;

    fd = open(cdbfile, O_RDONLY);  /* open a CDB file */
    if (cdb_seek(fd, key, klen, &vlen) > 0) {
      /* if the key was found, the file is positioned at the
       * start of the data value and its length is placed in vlen */
      val = malloc(vlen);
      cdb_bread(fd, val, vlen);    /* read the value;
                                    * plain read() will do as well. */
      /* handle the value */
    }
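The comment above notes that plain read() can be used in place of cdb_bread(). One caveat is that read() may return fewer bytes than requested or be interrupted by a signal, so a caller that goes this way usually wraps it in a loop. The sketch below shows such a loop under that assumption. read_full() is a hypothetical helper name, not a TinyCDB routine, and it is not claimed to match how cdb_bread() is implemented; reporting premature end of file as EIO is likewise a choice made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of TinyCDB): read exactly len bytes from fd
 * into buf, retrying on short reads and EINTR.
 * Returns 0 on success, -1 on error or premature end of file. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real I/O error, errno set by read() */
        }
        if (n == 0) {
            errno = EIO;        /* file ended before len bytes arrived */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With the variables from the example above, the call would be read_full(fd, val, vlen) right after a successful cdb_seek().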
There's a manpage, cdb.5, included with the package.
Initially it was just "Public domain, that is, you may do anything you want with it." However, because that wording was too vague and questionable, with version 0.81 I changed the license to MIT:
Copyright (C) 2001-2023 Michael Tokarev
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
Latest version is 0.81, released 25 Dec 2023, and can be found
here.
It can be built on systems using the RedHat Package Manager (rpm) with
the -tb option to create an installable .rpm package. On a Debian GNU/Linux
system, the preferred way to install it is to use the standard apt repository.
For other versions of the package and pre-built rpms look
here.
Download
Enjoy. Michael Tokarev, mjt+cdb {at} tls {dot} msk {dot} ru.