FreeBSD Manual Pages
HASH(3) BSD Library Functions Manual HASH(3) NAME hash_initialise, hash_insert, hash_retrieve, hash_delete, hash_iterator_initialise, hash_fetch_next, hash_iterator_deinitialise, hash_deinitialise -- hash table manipulation functions LIBRARY Hash Library (libhash, -lhash) SYNOPSIS #include <sys/types.h> #include <hash.h> int hash_initialise(hash *h, u_int32_t number_of_buckets, u_int32_t (*hash_function)(void *key, u_int32_t number_of_buckets), int (*compare_keys)(void *key1, void *key2), int (*duplicate_key)(void **destination, void *source), void (*free_key)(void *key), void (*free_value)(void *value)); int hash_insert(hash *h, void *key, void *value); int hash_retrieve(hash *h, void *key, void **value); int hash_delete(hash *h, void *key); int hash_iterator_initialise(hash *h, struct hash_iterator *it); int hash_fetch_next(hash *h, struct hash_iterator *it, void **key, void **value); int hash_iterator_deinitialise(hash *h, struct hash_iterator *it); void hash_deinitialise(hash *h); DESCRIPTION These functions manipulate a hash table, sometimes known as an associa- tive array. A hash table is a data structure that allows you to store "values" indexed by "key". A key can be any kind of data structure but is commonly a string. Indexing by integer is a reasonable way to simulate a sparse array. The data can also be any kind of data structure, user-de- fined or built-in. To allow any kind of data structure to be used as a key or a value a void * is used to represent each. While giving flexibility, this use of void * does force the user to supply functions to perform all manipulation and examination of the keys - it is not possible for libhash to check for equality, copy or delete keys on its own. Pointers to these user nomi- nated functions are provided to the hash_initialise() function, called before any other operations on the hash table are allowed. The hash_initialise() function is called to initialise the hash table pointed to by h. It must be called before any other hash function on the hash table h. The parameter number_of_buckets sets the "size" of the hash table - the number of buckets that the data is distributed over. For the best perfor- mance the general rule is that this number should be prime and approxi- mately twice the size of the expected data set. Chaining is used in case of collisions so there is no actual limit to the amount of data that can be stored, subject to system resource limits. The parameter hash_function is a pointer to the hash function to be used. The hash function's job is to map keys to buckets so it must be aware of the data structure being used for keys and must return a number between 0 and number_of_buckets - 1 inclusive. A good hash function maps the data set evenly over all buckets. See hash_hash_int(3) for some provided hash functions. This argument may be NULL to use a provided function that works with strings. The parameter compare_keys is a pointer to a function that can provide ordering information about 2 keys. This function should return a value less than 0 if key1 is less than key2, greater than 0 if key1 is greater than key2 and 0 if key1 is equal to key2. This argument may be NULL to use a provided function that works with strings. The parameter duplicate_key is a pointer to a function that will allocate memory and make a copy of a key, the original located at the address source and the address of the new copy to be stored at destination. See hash_copy_int(3) for some provided duplicate functions. This argument may be NULL to use a provided function that works with strings. The parameter free_key is a pointer to a function that will free the mem- ory used by a key allocated with duplicate_key. This argument may be NULL if keys don't need "freeing". The parameter free_value is a pointer to a function that will free the memory used by a value stored in the hash table. This argument may be NULL if values don't need "freeing". The ability to pass NULL is useful when data added to the hash table is already part of another data struc- ture or memory management scheme - the hash table being used as an index. The hash_insert() function is used to add key/value pairs to the hash ta- ble. h is a pointer to the hash table to which the pair is to be added. key is the key under which the value is to be indexed under. value is the value to be stored. The hash_insert() function will make a copy of key but only store a pointer to value so it is the caller's responsibil- ity to ensure the memory used by value is not lost prematurely. The hash_retrieve() function is used to retieve data previously stored in the hash table by the function hash_insert(). The parameter h is a pointer to the hash table to retrieve from. The parameter key is a pointer to a key considered equivalent (by the function compare_keys pro- vided to hash_initialise()) to the key used to store the value. The pa- rameter value is a pointer to a location at which to store the address of the data. The hash_delete() function is used to remove a key/value pair from a hash table. The parameter h is a pointer to the hash table from which to re- move the pair. The parameter key is a pointer to a key considered equiva- lent (by the function compare_keys provided to hash_initialise()) to the key used to store the pair. The functions pointed to by the parameters free_key and free_value provided to the hash_initialise() function are used (if not NULL) to free the memory used by the key/value pair. It is important to note that if you are using data retrieved by hash_retrieve() and then call hash_delete() on the same pair, the pointer returned by hash_retrieve() is invalid. The function hash_iterator_initialise() is used to initialise a struct hash_iterator. A struct hash_iterator, in combination with hash_fetch_next(), is used to allow a caller to iterate over all key/value pairs stored in a hash table. The parameter h is a pointer to the hash table to be iterated over. The it parameter is a pointer to the struct hash_iterator. hash_iterator_initialise() must be called before any other functions that use the struct hash_iterator. The hash_fetch_next() function fetches the "next" key/value pair from a hash table. By repeatedly calling it, this function can be used to iter- ate through all key/value pairs stored in a hash table. Note that the pairs are returned in an order which is an artifact of the internal work- ings of the library and probably meaningless to a user. The parameter h is a pointer to the hash table to iterate over (this should be the same as the h argument provided to hash_iterator_initialise()). The parameter it is a pointer to the struct hash_iterator previously initialised with hash_iterator_initialise(). The parameters key and value are pointers to locations to store the addresses of the key and value fetched. If hash_fetch_next() is called after the last pair in the hash table has been fetched then 0 is returned, nothing is stored at key or value, and the struct hash_iterator is reset so the next call to hash_fetch_next() will return the first pair in the hash table. If a pair is added to the table whilst a struct hash_iterator is active it may or may not be re- turned by hash_fetch_next() before the struct hash_iterator is reset. The calling of hash_delete() while a struct hash_iterator is active is safe for any pair that has already been returned by hash_fetch_next(). The function hash_iterator_deinitialise() should be called when finished with a struct hash_iterator. The parameter h is a pointer to the hash table originally passed to hash_iterator_initialise(). The parameter it is the struct hash_iterator to be deinitialised. This function should be called before the memory used by a struct hash_iterator is disposed of. The function hash_deinitialise() is used to dispose of a hash table. The parameter h is the hash table to dispose of. After calling hash_deinitialise() all pointers returned by hash_retrieve() or hash_fetch_next() are invalid. IMPLEMENTATION NOTES Upon collision (where 2 keys are mapped by the hash function to the same bucket) ordered linked lists are used. On retrieval a sequential search is used. On a hash table that is overly full (has a lot of collisions) retrieval time is therefore O(n/(2 * number_of_buckets)) as opposed to O(1) for a sparsely populated table. Therefore it is wise to set num- ber_of_buckets to be large enough and to use a hash function well suited for your data. RETURN VALUES The hash_initialise(), hash_insert(), hash_retrieve(), hash_delete(), hash_iterator_initialise(), hash_fetch_next() and hash_iterator_deinitialise() functions return 0 on failure and set the global variable errno to indicate the error. The hash_deinitialise() function returns no value. EXAMPLES See the tests directory included in the distribution. ERRORS [ENOENT] There was no value stored for the provided key or in the case of hash_fetch_next() the end of the hash table has been reached. [ENOMEM] A call to malloc failed. SEE ALSO dbopen(3), emmao(8), libhash_convenience(3), queue(3) HISTORY The libhash library was written in January 2002 for emmao (a program to kill off rogue processes under Solaris). libhash was written under FreeBSD 4.4. AUTHORS Andrew Stevenson <andrew@ugh.net.au> BUGS Please report them to <andrew@ugh.net.au>. UgH! January 13, 2002 UgH!
NAME | LIBRARY | SYNOPSIS | DESCRIPTION | IMPLEMENTATION NOTES | RETURN VALUES | EXAMPLES | ERRORS | SEE ALSO | HISTORY | AUTHORS | BUGS
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=hash_initialise&sektion=3&manpath=FreeBSD+13.0-RELEASE+and+Ports>