cdb(3)			   Library Functions Manual			cdb(3)

NAME
       cdb - Constant DataBase library

SYNOPSIS
	#include <cdb.h>
	cc ... -lcdb

DESCRIPTION
       cdb is a library to create and access Constant DataBase files.  A cdb
       file stores (key,value) pairs and is used to quickly find a value based
       on a given key.  Cdb files are create-once files: once created, a file
       cannot be updated, only recreated from scratch, which is why the
       database is called constant.  A cdb file is optimized for quick access.
       The format of such a file is described in the cdb(5) manpage.  This
       manual page corresponds to version 0.81 of the tinycdb package.

       The library defines two separate interfaces: one for querying an
       existing cdb file (read-only mode) and one for creating such a file
       (almost write-only).  Strictly speaking, each mode also allows a very
       limited set of the opposite operations (e.g. in query mode, it is
       possible to update a key's value).

       All routines in this library are thread-safe, as no global data is
       used except for the errno variable for error indication.

       cdb datafiles may be moved between systems safely, since the format
       does not depend on architecture.

QUERY MODE
       There are two query modes available.  The first uses a structure that
       represents a cdb database, just like the FILE structure in the stdio
       library, and the second works with a plain file descriptor.  The first
       mode is more sophisticated and flexible, and usually somewhat faster.
       It uses mmap(2) internally.  This mode may look more "natural" or
       object-oriented compared to the second one.

       The following routine works with either mode:

       unsigned	cdb_unpack(buf)
	  const	unsigned char buf[4];
              helper routine to convert a 32-bit integer from the internal
              representation to machine format.  May be used to handle
              application integers in a portable way.  There is no error
              return.

   Query Mode 1
       All query operations in the first mode deal with a common data
       structure, struct cdb, associated with an open file descriptor.  This
       structure is opaque to the application.

       The following routines exist for accessing a cdb database:

       int cdb_init(cdbp, fd)
	  struct cdb *cdbp;
	  int fd;
              initializes the structure given by the cdbp pointer and
              associates it with the opened file descriptor fd.  Memory
              allocation for the structure itself (if needed) and the file
              open operation should be done by the application.  File fd
              should be opened at least read-only, and should be seekable.
              Routine returns 0 on success or a negative value on error.

       void cdb_free(cdbp)
	  struct cdb *cdbp;
              frees internal resources held by the structure.  Note that this
              routine does not close the file.

       int cdb_fileno(cdbp)
	 const struct cdb *cdbp;
              returns the file descriptor associated with cdb (as passed to
              cdb_init()).

       int cdb_read(cdbp, buf, len, pos)
       int cdb_readdata(cdbp, buf)
       int cdb_readkey(cdbp, buf)
	  const	struct cdb *cdbp;
	  void *buf;
	  unsigned len;
	  unsigned pos;
              reads len bytes of data from the cdb file, starting at position
              pos, and places the result in buf.  This routine may be used to
              get the actual value found by cdb_find() or other routines that
              return the position and length of data.  Returns 0 on success or
              a negative value on error.  Routines cdb_readdata() and
              cdb_readkey() are shorthands to read the current (after e.g.
              cdb_find()) data and key respectively, using cdb_read().

       const void *cdb_get(cdbp, len, pos)
       const void *cdb_getdata(cdbp)
       const void *cdb_getkey(cdbp)
	  const	struct cdb *cdbp;
	  unsigned len;
	  unsigned pos;
              Internally, the cdb library uses a memory-mapped region to
              access the on-disk database.  cdb_get() allows one to access
              this internal memory in a way similar to cdb_read() but without
              extra copying and buffer allocation.  Returns a pointer to the
              actual data on success or NULL on error (the position points
              outside of the database).  Routines cdb_getdata() and
              cdb_getkey() are shorthands to access the current (after e.g.
              cdb_find()) data and key respectively, using cdb_get().

       int cdb_find(cdbp, key, klen)
       unsigned	cdb_datapos(cdbp)
       unsigned	cdb_datalen(cdbp)
       unsigned	cdb_keypos(cdbp)
       unsigned	cdb_keylen(cdbp)
	  struct cdb *cdbp;
	  const	void *key;
	  unsigned klen;
              attempts to find the key given by the (key,klen) parameters.  If
              the key exists in the database, the routine returns 1 and places
              the position and length of the value associated with this key
              into internal fields inside the cdbp structure, accessible via
              the cdb_datapos(cdbp) and cdb_datalen(cdbp) routines.  If the
              key is not in the database, cdb_find() returns 0.  On error, a
              negative value is returned.  Data pointers (available via
              cdb_datapos() and cdb_datalen()) are updated only on a
              successful search.  Note that cdb_find() can look up only the
              first record with a given key.

       int cdb_findinit(cdbfp, cdbp, key, klen)
       int cdb_findnext(cdbfp)
	 struct	cdb_find *cdbfp;
	 const struct cdb *cdbp;
	 const void *key;
	 unsigned klen;
              sequential-find routines that use a separate structure.  It is
              possible to have more than one record with the same key in a
              database, and these routines allow one to enumerate all of them.
              cdb_findinit() initializes the search structure pointed to by
              cdbfp.  It returns a negative value on error or a non-negative
              value on success.  cdb_findnext() attempts to find the next
              (first when called right after cdb_findinit()) matching key,
              setting the value position and length in the cdbfp structure.
              It returns a positive value if the given key was found, 0 if
              there are no more such keys, or a negative value on error.  To
              access the value position and length after a successful call to
              cdb_findnext() (when it returned a positive result), use the
              cdb_datapos(cdbp) and cdb_datalen(cdbp) routines.  It is an
              error to continue using cdb_findnext() after it returned 0 or an
              error condition (cdb_findinit() should be called again).
              Current data pointers (available via cdb_datapos() and
              cdb_datalen()) are updated only on a successful search.

       void cdb_seqinit(cptr, cdbp)
       int cdb_seqnext(cptr, cdbp)
	 unsigned *cptr;
	 struct	cdb *cdbp;
              sequential enumeration of all records stored in a cdb file.
              cdb_seqinit() initializes the current data pointer cptr to point
              before the first record in the cdb file.  cdb_seqnext() updates
              the data pointers in cdbp to point to the next record and
              updates cptr, returning a positive value on success, 0 on end of
              data and a negative value on error.  After a successful
              operation, the current record is available via cdb_datapos(cdbp)
              and cdb_datalen(cdbp) (for the data) and cdb_keypos(cdbp) and
              cdb_keylen(cdbp) (for the key of the record).  Data pointers are
              updated only on a successful operation.

   Query Mode 2
       In this mode, one needs to open a cdb file using one of the standard
       system calls (such as open(2)) to obtain a file descriptor, and then
       pass that file descriptor to cdb routines.  Available methods to query
       a cdb database using only a file descriptor include:

       int cdb_seek(fd,	key, klen, dlenp)
	 int fd;
	 const void *key;
	 unsigned klen;
	 unsigned *dlenp;
              searches a cdb database (referred to by the fd file descriptor)
              for the key given by (key, klen), and, if found, positions the
              file pointer at the start of the data associated with that key,
              so that the next read operation from this file descriptor will
              read that value, and stores the length of the value, in bytes,
              in the variable pointed to by dlenp.  Returns a positive value
              if the operation was successful, 0 if the key was not found, or
              a negative value on error.  To read the data from a cdb file,
              the cdb_bread() routine below can be used.

       int cdb_bread(fd, buf, len)
	 int fd;
	 void *buf;
	 int len;
              reads len bytes from a file (referred to by the fd file
              descriptor) into the buffer pointed to by buf.  Returns 0 if
              exactly len bytes were read, or a negative value in case of
              error or end-of-file.  This routine ignores interrupt errors
              (EINTR).  Sets the errno variable to EIO in case of an
              end-of-file condition (when there are fewer than len bytes
              available to read).
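
       The read semantics described for cdb_bread() (retry on EINTR, fail
       with EIO on a short read) can be sketched in plain POSIX C.  The
       helper name read_exact() is hypothetical and is not part of the
       library; this is only an illustration of the described behaviour,
       not the library's actual implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper modelled on the behaviour described for cdb_bread():
 * read exactly len bytes from fd into buf.
 * Returns 0 on success, -1 on error or premature end-of-file. */
int read_exact(int fd, void *buf, int len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n;

        do {
            n = read(fd, p, (size_t)len);
        } while (n < 0 && errno == EINTR);  /* retry interrupted reads */

        if (n < 0)
            return -1;                      /* errno was set by read(2) */
        if (n == 0) {                       /* EOF before len bytes arrived */
            errno = EIO;
            return -1;
        }
        p += n;
        len -= (int)n;
    }
    return 0;
}
```

       A caller would typically invoke such a helper right after a
       successful cdb_seek(), passing the length stored through dlenp.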

   Notes
       Note that the value of any given key may be updated in place by
       another value of the same size, by writing to the file at the position
       found by cdb_find() or cdb_seek().  However, one should be very careful
       when doing so, since the write operation may not succeed in case of
       e.g. power failure, thus leaving corrupted data.  When a database is
       (re)created, one can guarantee that no incorrect data will be written
       to the database, but not with an in-place update.  Note also that it is
       not possible to update any key or to change the length of a value.
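
       As a rough illustration of the in-place update described above, the
       following sketch overwrites a value only when the new value has
       exactly the same length.  The helper overwrite_same_size() is
       hypothetical; pos and old_len stand for the values an application
       would obtain from cdb_datapos() and cdb_datalen() (or from
       cdb_seek()).  It gives no protection against a partial write on power
       failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* open(2) flags, needed by callers */
#include <string.h>     /* memcmp(3), handy when verifying the result */
#include <unistd.h>

/* Hypothetical helper: overwrite the old_len bytes stored at offset pos
 * with val, refusing any change of length.
 * Returns 0 on success, -1 on error (errno is set). */
int overwrite_same_size(int fd, off_t pos, unsigned old_len,
                        const void *val, unsigned new_len)
{
    if (new_len != old_len) {   /* the length of a value cannot change */
        errno = EINVAL;
        return -1;
    }
    if (pwrite(fd, val, new_len, pos) != (ssize_t)new_len)
        return -1;              /* short or failed write */
    return 0;
}
```

       The file descriptor must be opened read-write for this to work, which
       differs from the read-only open used for ordinary queries.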

CREATING MODE
       A cdb database file should usually be created in two steps: first, a
       temporary file is created and written to disk, and second, that
       temporary file is renamed to its permanent place.  The Unix rename(2)
       call is an atomic operation: it removes the destination file, if any,
       AND renames the other file in one step.  This way it is guaranteed
       that readers will not see an incomplete database.  To prevent multiple
       simultaneous updates, locking may also be used.
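
       The two-step pattern can be sketched with plain POSIX calls.  The
       helper publish_file() and its parameters are hypothetical; in a real
       application the write(2) call would be replaced by the
       cdb_make_start() ... cdb_make_finish() sequence described below.
       Both paths are assumed to be on the same filesystem, since rename(2)
       cannot move files across filesystems.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>      /* rename(2) is declared here */
#include <unistd.h>

/* Hypothetical helper: write len bytes to tmp_path, flush them to disk,
 * then atomically move the file to final_path.
 * Returns 0 on success, -1 on error (the temporary file is removed). */
int publish_file(const char *tmp_path, const char *final_path,
                 const void *data, size_t len)
{
    int fd = open(tmp_path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    /* step 1: produce the complete file under a temporary name
     * (a short write is treated as a failure in this sketch) */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* step 2: readers see either the old file or the new one */
    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```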

       All routines used to create a cdb database work with a struct cdb_make
       object that is opaque to the application.  Application may assume that
       struct cdb_make has at least the same member(s) as published in struct
       cdb above.

       int cdb_make_start(cdbmp, fd)
	  struct cdb_make *cdbmp;
	  int fd;
	      initializes structure to create a	database.  File	fd  should  be
	      opened  read-write and should be seekable.  Returns 0 on success
	      or negative value	on error.

       int cdb_make_add(cdbmp, key, klen, val, vlen)
	  struct cdb_make *cdbmp;
	  const	void *key, *val;
	  unsigned klen, vlen;
              adds a record with key (key,klen) and value (val,vlen) to the
              database.  Returns 0 on success or a negative value on error.
              Note that this routine does not check whether the given key
              already exists, but cdb_find() will not see a second record with
              the same key.  It is not possible to continue building a
              database if cdb_make_add() returned an error indicator.

       int cdb_make_finish(cdbmp)
	  struct cdb_make *cdbmp;
              finalizes the database file, constructing all needed indexes,
              and frees memory structures.  It does not close the file
              descriptor.  Returns 0 on success or a negative value on error.

       int cdb_make_exists(cdbmp, key, klen)
	  struct cdb_make *cdbmp;
	  const	void *key;
	  unsigned klen;
              This routine attempts to find the key given by (key,klen) in a
              not-yet-complete database.  It may significantly slow down the
              whole process, and currently it flushes the internal buffer to
              disk on every call with a key whose hash value already exists in
              the db.  Returns 0 if such a key doesn't exist, 1 if it does, or
              a negative value on error.  Note that the database file should
              be opened read-write (not write-only) to use this routine.  If
              cdb_make_exists() returned an error, it may not be possible to
              continue constructing the database.

       int cdb_make_find(cdbmp,	key, klen, mode)
	  struct cdb_make *cdbmp;
	  const	void *key;
	  unsigned klen;
	  int mode;
              This routine attempts to find the key given by (key,klen) in the
              database being created.  If the given key already exists, an
              action specified by mode will be performed:

	      CDB_FIND
                     checks whether the given record is already in the
                     database.

	      CDB_FIND_REMOVE
		     removes all matching records by re-writing	 the  database
		     file accordingly.

	      CDB_FIND_FILL0
                     fills all matching records with zeros and removes them
                     from the index so that the records in question will not
                     be findable with cdb_find().  This is faster than
                     CDB_FIND_REMOVE, but leaves zero "gaps" in the database.
                     The most recently inserted records, if matched, are
                     always removed.

              If no matching keys were found, the routine returns 0.  If at
              least one record has been found/removed, a positive value is
              returned.  On error, a negative value is returned and errno is
              set appropriately.  When cdb_make_find() returns a negative
              value, it is not possible to continue constructing the
              database.

	      cdb_make_exists()	 is  the  same as calling cdb_make_find() with
	      mode set to CDB_FIND.

       int cdb_make_put(cdbmp, key, klen, val, vlen, mode)
	  struct cdb_make *cdbmp;
	  const	void *key, *val;
	  unsigned klen, vlen;
	  int mode;
              This routine is, roughly, a combination of cdb_make_exists() and
              cdb_make_add().  The mode argument controls how repeated
              (already existing) keys will be treated:

	      CDB_PUT_ADD
                     no duplicate checking will be performed.  This mode
                     behaves the same as the cdb_make_add() routine.

	      CDB_PUT_REPLACE
		     If	the key	already	exists,	it will	be  removed  from  the
		     database before adding new	key,value pair.	 This requires
		     moving  data  in  the  file, and can be quite slow	if the
		     file is large.  All matching old records will be  removed
		     this  way.	  This	is the same as calling cdb_make_find()
		     with CDB_FIND_REMOVE mode argument	 followed  by  calling
		     cdb_make_add().

	      CDB_PUT_REPLACE0
                     If the key already exists and it isn't the last record in
                     the file, the old record will be zeroed out before adding
                     the new key,value pair.  This is a lot faster than
                     CDB_PUT_REPLACE, but some extra data will still be
                     present in the file.  The data -- old record -- will not
                     be accessible by normal searches, but will appear in
                     sequential database traversal.  This is the same as
                     calling cdb_make_find() with CDB_FIND_FILL0 mode argument
                     followed by cdb_make_add().

	      CDB_PUT_INSERT
                     adds the key,value pair only if such a key does not exist
                     in the database.  Note that since a query (see query mode
                     above) will find the first added record, this mode is
                     somewhat useless (but allows one to reduce database size
                     in case of repeated keys).  This is the same as calling
                     cdb_make_exists(), followed by cdb_make_add() if the key
                     was not found.

	      CDB_PUT_WARN
                     adds the key,value pair unconditionally, but also checks
                     whether this key already exists.  This is equivalent to
                     calling cdb_make_exists() to check for the existence of
                     the given key, unconditionally followed by
                     cdb_make_add().

              If any error occurs during the operation, the routine returns a
              negative integer and sets the global variable errno to indicate
              the reason for the failure.  On success with no duplicates
              found, the routine returns 0.  If any duplicates have been found
              or removed (which, in CDB_PUT_INSERT mode, indicates that the
              new record was not added), the routine returns a positive value.
              If cdb_make_put() returned a negative value, it is not possible
              to continue the database construction process.

              As with cdb_make_exists() and cdb_make_find(), using this
              routine with any mode other than CDB_PUT_ADD can significantly
              slow down the database creation process, especially when mode is
              CDB_PUT_REPLACE, which has to move data in the file.

       void cdb_pack(num, buf)
	  unsigned num;
	  unsigned char	buf[4];
              helper routine, used internally, to convert the machine integer
              num to the internal form stored in the datafile.  A 32-bit
              integer is stored in 4 bytes in little-endian byte order.  May
              be used to handle application data.  There is no error return.
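
       The conversion performed by cdb_pack() and cdb_unpack() can be
       illustrated with a standalone sketch.  The functions sketch_pack()
       and sketch_unpack() are hypothetical stand-ins, and the byte layout
       shown (least significant byte first) is an assumption based on the
       little-endian encoding used by the cdb file format; consult cdb(5)
       for the authoritative description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cdb_pack(): store num in 4 bytes,
 * least significant byte first (assumed little-endian layout). */
void sketch_pack(uint32_t num, unsigned char buf[4])
{
    buf[0] = (unsigned char)(num & 0xff);
    buf[1] = (unsigned char)((num >> 8) & 0xff);
    buf[2] = (unsigned char)((num >> 16) & 0xff);
    buf[3] = (unsigned char)((num >> 24) & 0xff);
}

/* Hypothetical stand-in for cdb_unpack(): the inverse operation.
 * The result is the same on any host, whatever its native byte order. */
uint32_t sketch_unpack(const unsigned char buf[4])
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

       Because both directions work on individual bytes rather than on the
       in-memory representation of an integer, a value packed on one
       architecture unpacks to the same number on another.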

       unsigned	cdb_hash(buf, len)
	  const	void *buf;
	  unsigned len;
	      helper  routine  that  calculates	cdb hash value of given	bytes.
	      CDB hash function	is
		hash[n]	= (hash[n-1] + (hash[n-1] << 5)) ^ buf[n]
	      starting with
		hash[-1] = 5381
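
       The recurrence above translates directly into a short loop.  The
       function name sketch_cdb_hash() is hypothetical, and truncating the
       arithmetic to 32 bits is an assumption (the formula above does not
       state the width); applications should call cdb_hash() itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cdb_hash(), following the formula in the text:
 *   hash[n] = (hash[n-1] + (hash[n-1] << 5)) ^ buf[n],  hash[-1] = 5381
 * Arithmetic wraps modulo 2^32 (an assumption of this sketch). */
uint32_t sketch_cdb_hash(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t hash = 5381;               /* hash[-1] */
    size_t i;

    for (i = 0; i < len; i++)
        hash = (hash + (hash << 5)) ^ p[i];
    return hash;
}
```

       For an empty input the loop never runs, so the result is the
       starting value 5381.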

ERRORS
       The cdb library may set errno to the following values on error:

       EPROTO database file is corrupted in some way

       EINVAL the same as EPROTO above if system lacks EPROTO constant

       EINVAL flag argument for	cdb_make_put() is invalid

       EEXIST flag argument for	cdb_make_put() is CDB_PUT_INSERT, and key  al-
	      ready exists

       ENOMEM not  enough  memory  to  complete	operation (cdb_make_finish and
	      cdb_make_add)

       EIO    set by cdb_bread and cdb_seek if a cdb file is shorter than  ex-
	      pected or	corrupted in some other	way.

EXAMPLES
       Note: in all examples below, error checking is not shown for brevity.

   Query Mode
	int fd;
	struct cdb cdb;
	char *key, *data;
	unsigned keylen, datalen;

	/* opening the database	*/
	fd = open(filename, O_RDONLY);
	cdb_init(&cdb, fd);
	/* initialize key and keylen here */

	/* single-record search. */
	if (cdb_find(&cdb, key,	keylen)	> 0) {
	  datalen = cdb_datalen(&cdb);
	  data = malloc(datalen	+ 1);
	  cdb_read(&cdb, data, datalen,	cdb_datapos(&cdb));
	  data[datalen]	= '\0';
	  printf("key=%s data=%s\n", key, data);
	  free(data);
	}
	else
	  printf("key=%s not found\n", key);

	/* multiple record search */
	struct cdb_find	cdbf;
	int n;
	cdb_findinit(&cdbf, &cdb, key, keylen);
	n = 0;
	while(cdb_findnext(&cdbf) > 0) {
	  datalen = cdb_datalen(&cdb);
	  data = malloc(datalen	+ 1);
	  cdb_read(&cdb, data, datalen,	cdb_datapos(&cdb));
	  data[datalen]	= '\0';
	  printf("key=%s data=%s\n", key, data);
	  free(data);
	  ++n;
	}
	printf("key=%s %d records found\n", key, n);

	/* sequential database access */
	unsigned pos;
	cdb_seqinit(&pos, &cdb);
	n = 0;	/* reuse the counter declared above */
	while(cdb_seqnext(&pos,	&cdb) >	0) {
	  keylen = cdb_keylen(&cdb);
	  key =	malloc(keylen +	1);
	  cdb_read(&cdb, key, keylen, cdb_keypos(&cdb));
	  key[keylen] =	'\0';
	  datalen = cdb_datalen(&cdb);
	  data = malloc(datalen	+ 1);
	  cdb_read(&cdb, data, datalen,	cdb_datapos(&cdb));
	  data[datalen]	= '\0';
	  ++n;
	  printf("record %d: key=%s data=%s\n", n, key, data);
	  free(data); free(key);
	}
	printf("total records found: %d\n", n);

	/* close the database */
	cdb_free(&cdb);
	close(fd);

	/* simplistic query mode */
	fd = open(filename, O_RDONLY);
	if (cdb_seek(fd, key, keylen, &datalen) > 0) {
	  data = malloc(datalen + 1);
	  cdb_bread(fd, data, datalen);
	  data[datalen] = '\0';
	  printf("key=%s data=%s\n", key, data);
	  free(data);
	}
	else
	  printf("key=%s not found\n", key);
	close(fd);

   Create Mode
	int fd;
	struct cdb_make	cdbm;
	char *key, *data;
	unsigned keylen, datalen;

	/* initialize the database */
	fd = open(filename, O_RDWR|O_CREAT|O_TRUNC, 0644);
	cdb_make_start(&cdbm, fd);

	while(have_more_data())	{
	  /* initialize	key and	data */
	  if (cdb_make_exists(&cdbm, key, keylen) == 0)
	    cdb_make_add(&cdbm,	key, keylen, data, datalen);
	  /* or	use cdb_make_put() with	appropriate flags */
	}

	/* finalize and	close the database */
	cdb_make_finish(&cdbm);
	close(fd);

SEE ALSO
       cdb(5), cdb(1), dbm(3), db(3), open(2).

AUTHOR
       The tinycdb package was written by Michael Tokarev
       <mjt+cdb@corpit.ru>.  It is based on ideas from, and shares its file
       format with, the original cdb library by Dan Bernstein.

LICENSE
       Tinycdb is licensed under MIT license.

				   Jun 2006				cdb(3)
