APACHECOUCHDB(1)                 Apache CouchDB                APACHECOUCHDB(1)

NAME
apachecouchdb - Apache CouchDB 3.4.3

INTRODUCTION
CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents with your web browser, via HTTP. Query, combine, and transform your documents with JavaScript. CouchDB works well with modern web and mobile apps. You can distribute your data efficiently using CouchDB's incremental replication. CouchDB supports master-master setups with automatic conflict detection.

CouchDB comes with a suite of features, such as on-the-fly document transformation and real-time change notifications, that make web development a breeze. It even comes with an easy to use web administration console, served directly out of CouchDB! We care a lot about distributed scaling. CouchDB is highly available and partition tolerant, but is also eventually consistent. And we care a lot about your data. CouchDB has a fault-tolerant storage engine that puts the safety of your data first.

In this section you'll learn about every basic bit of CouchDB, see upon what concepts and technologies it is built, and walk through a short tutorial that teaches how to use CouchDB.

Technical Overview

Document Storage
A CouchDB server hosts named databases, which store documents. Each document is uniquely named in the database, and CouchDB provides a RESTful HTTP API for reading and updating (add, edit, delete) database documents.

Documents are the primary unit of data in CouchDB and consist of any number of fields and attachments. Documents also include metadata that's maintained by the database system. Document fields are uniquely named and contain values of varying types (text, number, boolean, lists, etc), and there is no set limit to text size or element count.

The CouchDB document update model is lockless and optimistic. Document edits are made by client applications loading documents, applying changes, and saving them back to the database. If another client editing the same document saves their changes first, the client gets an edit conflict error on save. To resolve the update conflict, the latest document version can be opened, the edits reapplied and the update tried again.

Single document updates (add, edit, delete) are all or nothing, either succeeding entirely or failing completely. The database never contains partially saved or edited documents.

ACID Properties
The CouchDB file layout and commitment system features all Atomic Consistent Isolated Durable (ACID) properties. On-disk, CouchDB never overwrites committed data or associated structures, ensuring the database file is always in a consistent state. This is a crash-only design where the CouchDB server does not go through a shut down process; it's simply terminated.

Document updates (add, edit, delete) are serialized, except for binary blobs, which are written concurrently. Database readers are never locked out and never have to wait on writers or other readers. Any number of clients can be reading documents without being locked out or interrupted by concurrent updates, even on the same document. CouchDB read operations use a Multi-Version Concurrency Control (MVCC) model where each client sees a consistent snapshot of the database from the beginning to the end of the read operation. This means that CouchDB can guarantee transactional semantics on a per-document basis.

Documents are indexed in B-trees by their name (DocID) and a Sequence ID. Each update to a database instance generates a new sequential number.
Sequence IDs are used later for incrementally finding changes in a database. These B-tree indexes are updated simultaneously when documents are saved or deleted. The index updates always occur at the end of the file (append-only updates).

Documents have the advantage of data being already conveniently packaged for storage rather than split out across numerous tables and rows, as in most database systems. When documents are committed to disk, the document fields and metadata are packed into buffers, sequentially one document after another (helpful later for efficient building of views).

When CouchDB documents are updated, all data and associated indexes are flushed to disk and the transactional commit always leaves the database in a completely consistent state. Commits occur in two steps:

1. All document data and associated index updates are synchronously flushed to disk.

2. The updated database header is written in two consecutive, identical chunks to make up the first 4k of the file, and then synchronously flushed to disk.

In the event of an OS crash or power failure during step 1, the partially flushed updates are simply forgotten on restart. If such a crash happens during step 2 (committing the header), a surviving copy of the previous identical headers will remain, ensuring coherency of all previously committed data. Excepting the header area, consistency checks or fix-ups after a crash or a power failure are never necessary.

Compaction
Wasted space is recovered by occasional compaction. On schedule, or when the database file exceeds a certain amount of wasted space, the compaction process clones all the active data to a new file and then discards the old file. The database remains completely online the entire time and all updates and reads are allowed to complete successfully. The old database file is deleted only when all the data has been copied and all users have transitioned to the new file.

Views
ACID properties only deal with storage and updates, but we also need the ability to show our data in interesting and useful ways. Unlike SQL databases, where data must be carefully decomposed into tables, data in CouchDB is stored in semi-structured documents. CouchDB documents are flexible and each has its own implicit structure, which alleviates the most difficult problems and pitfalls of bi-directionally replicating table schemas and their contained data.

But beyond acting as a fancy file server, a simple document model for data storage and sharing is too simple to build real applications on; it simply doesn't do enough of the things we want and expect. We want to slice and dice and see our data in many different ways. What is needed is a way to filter, organize and report on data that hasn't been decomposed into tables.

SEE ALSO: Guide to Views

View Model
To address this problem of adding structure back to unstructured and semi-structured data, CouchDB integrates a view model. Views are the method of aggregating and reporting on the documents in a database, and are built on-demand to aggregate, join and report on database documents. Because views are built dynamically and don't affect the underlying document, you can have as many different view representations of the same data as you like.

View definitions are strictly virtual and only display the documents from the current database instance, making them separate from the data they display and compatible with replication.
CouchDB views are defined inside special design documents and can replicate across database instances like regular documents, so that not only data replicates in CouchDB, but entire application designs replicate too.

JavaScript View Functions
Views are defined using JavaScript functions acting as the map part in a map-reduce system. A view function takes a CouchDB document as an argument and then does whatever computation it needs to do to determine the data that is to be made available through the view, if any. It can add multiple rows to the view based on a single document, or it can add no rows at all.

SEE ALSO: View Functions

View Indexes
Views are a dynamic representation of the actual document contents of a database, and CouchDB makes it easy to create useful views of data. But generating a view of a database with hundreds of thousands or millions of documents is time and resource consuming; it's not something the system should do from scratch each time.

To keep view querying fast, the view engine maintains indexes of its views, and incrementally updates them to reflect changes in the database. CouchDB's core design is largely optimized around the need for efficient, incremental creation of views and their indexes.

Views and their functions are defined inside special design documents, and a design document may contain any number of uniquely named view functions. When a user opens a view and its index is automatically updated, all the views in the same design document are indexed as a single group.

The view builder uses the database sequence ID to determine if the view group is fully up-to-date with the database. If not, the view engine examines all database documents (in packed sequential order) changed since the last refresh. Documents are read in the order they occur in the disk file, reducing the frequency and cost of disk head seeks.

The views can be read and queried simultaneously while also being refreshed. If a client is slowly streaming out the contents of a large view, the same view can be concurrently opened and refreshed for another client without blocking the first client. This is true for any number of simultaneous client readers, who can read and query the view while the index is concurrently being refreshed for other clients without causing problems for the readers.

As documents are processed by the view engine through your map and reduce functions, their previous row values are removed from the view indexes, if they exist. If the document is selected by a view function, the function results are inserted into the view as a new row.

When view index changes are written to disk, the updates are always appended at the end of the file, serving both to reduce disk head seek times during disk commits and to ensure that crashes and power failures cannot cause corruption of indexes. If a crash occurs while updating a view index, the incomplete index updates are simply lost and rebuilt incrementally from the previously committed state.

Security and Validation
To protect who can read and update documents, CouchDB has a simple reader access and update validation model that can be extended to implement custom security models.

SEE ALSO: /{db}/_security

Administrator Access
CouchDB database instances have administrator accounts. Administrator accounts can create other administrator accounts and update design documents. Design documents are special documents containing view definitions and other special formulas, as well as regular fields and blobs.
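As a concrete illustration, the sketch below creates a design document that bundles a map/reduce view with an update validation function, then queries the view. It is illustrative only: the database name demo, the admin credentials, the design document name example, the by_author view and the author field are all made-up placeholders, not names required by CouchDB.

# Hypothetical design document with one view and a validation function.
shell> curl -X PUT http://adm:pass@127.0.0.1:5984/demo/_design/example \
            -H 'Content-Type: application/json' \
            -d '{
                  "views": {
                    "by_author": {
                      "map": "function (doc) { if (doc.author) { emit(doc.author, 1); } }",
                      "reduce": "_sum"
                    }
                  },
                  "validate_doc_update": "function (newDoc, oldDoc, userCtx) { if (!newDoc._deleted && !newDoc.author) { throw({forbidden: \"Documents must have an author field\"}); } }"
                }'

# Query the view; grouping by key returns one row per author with a count.
shell> curl 'http://adm:pass@127.0.0.1:5984/demo/_design/example/_view/by_author?group=true'

Because the design document is itself a regular document, replicating the demo database carries the view and the validation rule along with the data. The validate_doc_update function is the update validation hook described in the next section.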
Update Validation
As documents are written to disk, they can be validated dynamically by JavaScript functions for both security and data validation. When the document passes all the formula validation criteria, the update is allowed to continue. If the validation fails, the update is aborted and the user client gets an error response.

Both the user's credentials and the updated document are given as inputs to the validation formula, and can be used to implement custom security models by validating a user's permissions to update a document. A basic "author only" update document model is trivial to implement, where document updates are validated to check if the user is listed in an author field in the existing document. More dynamic models are also possible, like checking a separate user account profile for permission settings.

The update validations are enforced for both live usage and replicated updates, ensuring security and data validation in a shared, distributed system.

SEE ALSO: Validate Document Update Functions

Distributed Updates and Replication
CouchDB is a peer-based distributed database system. It allows users and servers to access and update the same shared data while disconnected. Those changes can then be replicated bi-directionally later.

The CouchDB document storage, view and security models are designed to work together to make true bi-directional replication efficient and reliable. Both documents and designs can replicate, allowing full database applications (including application design, logic and data) to be replicated to laptops for offline use, or replicated to servers in remote offices where slow or unreliable connections make sharing data difficult.

The replication process is incremental. At the database level, replication only examines documents updated since the last replication. If replication fails at any step, due to network problems or a crash for example, the next replication restarts at the last checkpoint.

Partial replicas can be created and maintained. Replication can be filtered by a JavaScript function, so that only particular documents or those meeting specific criteria are replicated. This can allow users to take subsets of a large shared database application offline for their own use, while maintaining normal interaction with the application and that subset of data.

Conflicts
Conflict detection and management are key issues for any distributed edit system. The CouchDB storage system treats edit conflicts as a common state, not an exceptional one. The conflict handling model is simple and non-destructive while preserving single document semantics and allowing for decentralized conflict resolution.

CouchDB allows any number of conflicting documents to exist simultaneously in the database, with each database instance deterministically deciding which document is the winner and which are conflicts. Only the winning document can appear in views, while losing conflicts are still accessible and remain in the database until deleted. Because conflict documents are still regular documents, they replicate just like regular documents and are subject to the same security and validation rules.

When distributed edit conflicts occur, every database replica sees the same winning revision and each has the opportunity to resolve the conflict. Resolving conflicts can be done manually or, depending on the nature of the data and the conflict, by automated agents.
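A manual or automated resolver might work roughly like the sketch below. The database demo, the document ID some_doc and the revision placeholder are invented for illustration; in practice the losing revision would be taken from the _conflicts array of the first response.

# Ask for a document together with any conflicting revisions it carries.
shell> curl 'http://adm:pass@127.0.0.1:5984/demo/some_doc?conflicts=true'
# A conflicted document lists its losing revisions in a "_conflicts" array.

# Resolve by deleting a losing revision (or by merging its content into the
# winner first and saving that as a new revision).
shell> curl -X DELETE 'http://adm:pass@127.0.0.1:5984/demo/some_doc?rev=<losing-revision>'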
The system makes decentralized conflict resolution possible while maintaining single document database semantics.

Conflict management continues to work even if multiple disconnected users or agents attempt to resolve the same conflicts. If resolved conflicts result in more conflicts, the system accommodates them in the same manner, determining the same winner on each machine and maintaining single document semantics.

SEE ALSO: Replication and conflict model

Applications
Using just the basic replication model, many traditionally single server database applications can be made distributed with almost no extra work. CouchDB replication is designed to be immediately useful for basic database applications, while also being extendable for more elaborate and full-featured uses.

With very little database work, it is possible to build a distributed document management application with granular security and full revision histories. Updates to documents can be implemented to exploit incremental field and blob replication, where replicated updates are nearly as efficient and incremental as the actual edit differences (diffs).

Implementation
CouchDB is built on the Erlang OTP platform, a functional, concurrent programming language and development platform. Erlang was developed for real-time telecom applications with an extreme emphasis on reliability and availability.

Both in syntax and semantics, Erlang is very different from conventional programming languages like C or Java. Erlang uses lightweight processes and message passing for concurrency, it has no shared state threading, and all data is immutable. The robust, concurrent nature of Erlang is ideal for a database server.

CouchDB is designed for lock-free concurrency, in the conceptual model and the actual Erlang implementation. Reducing bottlenecks and avoiding locks keeps the entire system working predictably under heavy loads. CouchDB can accommodate many clients replicating changes, opening and updating documents, and querying views whose indexes are simultaneously being refreshed for other clients, without needing locks.

For higher availability and more concurrent users, CouchDB is designed for "shared nothing" clustering. In a shared nothing cluster, each machine is independent and replicates data with its cluster mates, allowing individual server failures with zero downtime. And because consistency scans and fix-ups aren't needed on restart, if the entire cluster fails (due to a power outage in a datacenter, for example) the entire CouchDB distributed system becomes immediately available after a restart.

CouchDB is built from the start with a consistent vision of a distributed document database system. Unlike cumbersome attempts to bolt distributed features on top of legacy models and databases, it is the result of careful ground-up design, engineering and integration. The document, view, security and replication models, the special purpose query language, the efficient and robust disk layout and the concurrent and reliable nature of the Erlang platform are all carefully integrated for a reliable and efficient system.

Why CouchDB?
Apache CouchDB is one of a new breed of database management systems. This topic explains why there's a need for new systems as well as the motivations behind building CouchDB.

As CouchDB developers, we're naturally very excited to be using CouchDB. In this topic we'll share with you the reasons for our enthusiasm.
We'll show you how CouchDB's schema-free document model is a better fit for common applications, how the built-in query engine is a powerful way to use and process your data, and how CouchDB's design lends itself to modularization and scalability.

Relax
If there's one word to describe CouchDB, it is "relax". It is the byline to CouchDB's official logo and when you start CouchDB, you see:

Apache CouchDB has started. Time to relax.

Why is relaxation important? Developer productivity roughly doubled in the last five years. The chief reason for the boost is more powerful tools that are easier to use. Take Ruby on Rails as an example. It is an infinitely complex framework, but it's easy to get started with. Rails is a success story because of the core design focus on ease of use. This is one reason why CouchDB is relaxing: learning CouchDB and understanding its core concepts should feel natural to most everybody who has been doing any work on the Web. And it is still pretty easy to explain to non-technical people.

Getting out of the way when creative people try to build specialized solutions is in itself a core feature and one thing that CouchDB aims to get right. We found existing tools too cumbersome to work with during development or in production, and decided to focus on making CouchDB easy, even a pleasure, to use.

Another area of relaxation for CouchDB users is the production setting. If you have a live running application, CouchDB again goes out of its way to avoid troubling you. Its internal architecture is fault-tolerant, and failures occur in a controlled environment and are dealt with gracefully. Single problems do not cascade through an entire server system but stay isolated in single requests.

CouchDB's core concepts are simple (yet powerful) and well understood. Operations teams (if you have a team; otherwise, that's you) do not have to fear random behavior and untraceable errors. If anything should go wrong, you can easily find out what the problem is, but these situations are rare.

CouchDB is also designed to handle varying traffic gracefully. For instance, if a website is experiencing a sudden spike in traffic, CouchDB will generally absorb a lot of concurrent requests without falling over. It may take a little more time for each request, but they all get answered. When the spike is over, CouchDB will work with regular speed again.

The third area of relaxation is growing and shrinking the underlying hardware of your application. This is commonly referred to as scaling. CouchDB enforces a set of limits on the programmer. On first look, CouchDB might seem inflexible, but some features are left out by design for the simple reason that if CouchDB supported them, it would allow a programmer to create applications that couldn't deal with scaling up or down.

NOTE: CouchDB doesn't let you do things that would get you in trouble later on. This sometimes means you'll have to unlearn best practices you might have picked up in your current or past work.

A Different Way to Model Your Data
We believe that CouchDB will drastically change the way you build document-based applications. CouchDB combines an intuitive document storage model with a powerful query engine in a way that's so simple you'll probably be tempted to ask, "Why has no one built something like this before?"

Django may be built for the Web, but CouchDB is built of the Web. I've never seen software that so completely embraces the philosophies behind HTTP.
CouchDB makes Django look old-school in the same way that Django makes ASP look outdated.

-- Jacob Kaplan-Moss, Django developer

CouchDB's design borrows heavily from web architecture and the concepts of resources, methods, and representations. It augments this with powerful ways to query, map, combine, and filter your data. Add fault tolerance, extreme scalability, and incremental replication, and CouchDB defines a sweet spot for document databases.

A Better Fit for Common Applications
We write software to improve our lives and the lives of others. Usually this involves taking some mundane information such as contacts, invoices, or receipts and manipulating it using a computer application. CouchDB is a great fit for common applications like this because it embraces the natural idea of evolving, self-contained documents as the very core of its data model.

Self-Contained Data
An invoice contains all the pertinent information about a single transaction: the seller, the buyer, the date, and a list of the items or services sold. As shown in Figure 1. Self-contained documents, there's no abstract reference on this piece of paper that points to some other piece of paper with the seller's name and address. Accountants appreciate the simplicity of having everything in one place. And given the choice, programmers appreciate that, too.

Figure 1. Self-contained documents.

Yet using references is exactly how we model our data in a relational database! Each invoice is stored in a table as a row that refers to other rows in other tables: one row for seller information, one for the buyer, one row for each item billed, and more rows still to describe the item details, manufacturer details, and so on and so forth.

This isn't meant as a detraction of the relational model, which is widely applicable and extremely useful for a number of reasons. Hopefully, though, it illustrates the point that sometimes your model may not fit your data in the way it occurs in the real world.

Let's take a look at the humble contact database to illustrate a different way of modeling data, one that more closely fits its real-world counterpart: a pile of business cards. Much like our invoice example, a business card contains all the important information, right there on the cardstock. We call this "self-contained" data, and it's an important concept in understanding document databases like CouchDB.

Syntax and Semantics
Most business cards contain roughly the same information: someone's identity, an affiliation, and some contact information. While the exact form of this information can vary between business cards, the general information being conveyed remains the same, and we're easily able to recognize it as a business card. In this sense, we can describe a business card as a real-world document.

Jan's business card might contain a phone number but no fax number, whereas J. Chris's business card contains both a phone and a fax number. Jan does not have to make his lack of a fax machine explicit by writing something as ridiculous as "Fax: None" on the business card. Instead, simply omitting a fax number implies that he doesn't have one.

We can see that real-world documents of the same type, such as business cards, tend to be very similar in semantics (the sort of information they carry) but can vary hugely in syntax, or how that information is structured. As human beings, we're naturally comfortable dealing with this kind of variation.
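Stored as CouchDB documents, the two cards might look roughly like the following. Every field name and value here is invented purely for illustration; the point is that neither document has to declare the fields it lacks:

{
  "_id": "card:jan",
  "type": "business-card",
  "name": "Jan",
  "affiliation": "Example GmbH",
  "phone": "+49 555 0100"
}

{
  "_id": "card:jchris",
  "type": "business-card",
  "name": "J. Chris",
  "affiliation": "Example Inc.",
  "phone": "+1 555 0101",
  "fax": "+1 555 0102"
}

Both are valid documents in the same database; the second simply carries one more field than the first.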
While a traditional relational database requires you to model your data up front, CouchDB's schema-free design unburdens you with a powerful way to aggregate your data after the fact, just like we do with real-world documents. We'll look in depth at how to design applications with this underlying storage paradigm.

Building Blocks for Larger Systems
CouchDB is a storage system useful on its own. You can build many applications with the tools CouchDB gives you. But CouchDB is designed with a bigger picture in mind. Its components can be used as building blocks that solve storage problems in slightly different ways for larger and more complex systems.

Whether you need a system that's crazy fast but isn't too concerned with reliability (think logging), or one that guarantees storage in two or more physically separated locations for reliability, but you're willing to take a performance hit, CouchDB lets you build these systems.

There are a multitude of knobs you could turn to make a system work better in one area, but you'll affect another area when doing so. One example would be the CAP theorem discussed in Eventual Consistency. To give you an idea of other things that affect storage systems, see Figure 2 and Figure 3.

By reducing latency for a given system (and that is true not only for storage systems), you affect concurrency and throughput capabilities.

Figure 2. Throughput, latency, or concurrency.

Figure 3. Scaling: read requests, write requests, or data.

When you want to scale out, there are three distinct issues to deal with: scaling read requests, write requests, and data. Orthogonal to all three and to the items shown in Figure 2 and Figure 3 are many more attributes like reliability or simplicity. You can draw many of these graphs that show how different features or attributes pull into different directions and thus shape the system they describe.

CouchDB is very flexible and gives you enough building blocks to create a system shaped to suit your exact problem. That's not saying that CouchDB can be bent to solve any problem (CouchDB is no silver bullet), but in the area of data storage, it can get you a long way.

CouchDB Replication
CouchDB replication is one of these building blocks. Its fundamental function is to synchronize two or more CouchDB databases. This may sound simple, but the simplicity is key to allowing replication to solve a number of problems: reliably synchronize databases between multiple machines for redundant data storage; distribute data to a cluster of CouchDB instances that share a subset of the total number of requests that hit the cluster (load balancing); and distribute data between physically distant locations, such as one office in New York and another in Tokyo.

CouchDB replication uses the same REST API all clients use. HTTP is ubiquitous and well understood. Replication works incrementally; that is, if during replication anything goes wrong, like dropping your network connection, it will pick up where it left off the next time it runs. It also only transfers data that is needed to synchronize databases.

A core assumption CouchDB makes is that things can go wrong, like network connection troubles, and it is designed for graceful error recovery instead of assuming all will be well. The replication system's incremental design shows that best.
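Because replication rides on the same HTTP API, triggering it is just another request. The sketch below is illustrative only: the database names, remote host and credentials are placeholders, and create_target simply asks CouchDB to create the target database if it does not exist yet.

# Replicate a local database to a (hypothetical) remote one.
shell> curl -X POST http://adm:pass@127.0.0.1:5984/_replicate \
            -H 'Content-Type: application/json' \
            -d '{"source": "http://adm:pass@127.0.0.1:5984/demo",
                 "target": "http://adm:pass@remote.example.org:5984/demo-backup",
                 "create_target": true}'

Running the same request again later only transfers the documents that changed since the previous run; adding "continuous": true to the body keeps the replication running and pushing changes as they happen.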
The ideas behind things that can go wrong are embodied in the Fallacies of Distributed Computing:

• The network is reliable.
• Latency is zero.
• Bandwidth is infinite.
• The network is secure.
• Topology doesn't change.
• There is one administrator.
• Transport cost is zero.
• The network is homogeneous.

Existing tools often try to hide the fact that there is a network and that any or all of the previous conditions don't exist for a particular system. This usually results in fatal error scenarios when something finally goes wrong. In contrast, CouchDB doesn't try to hide the network; it just handles errors gracefully and lets you know when actions on your end are required.

Local Data Is King
CouchDB takes quite a few lessons learned from the Web, but there is one thing that could be improved about the Web: latency. Whenever you have to wait for an application to respond or a website to render, you almost always wait for a network connection that isn't as fast as you want it at that point. Waiting a few seconds instead of milliseconds greatly affects user experience and thus user satisfaction.

What do you do when you are offline? This happens all the time: your DSL or cable provider has issues, or your iPhone, G1, or Blackberry has no bars, and no connectivity means no way to get to your data.

CouchDB can solve this scenario as well, and this is where scaling is important again. This time it is scaling down. Imagine CouchDB installed on phones and other mobile devices that can synchronize data with centrally hosted CouchDBs when they are on a network. The synchronization is not bound by user interface constraints like sub-second response times. It is easier to tune for high bandwidth and higher latency than for low bandwidth and very low latency. Mobile applications can then use the local CouchDB to fetch data, and since no remote networking is required for that, latency is low by default.

Can you really use CouchDB on a phone? Erlang, CouchDB's implementation language, has been designed to run on embedded devices magnitudes smaller and less powerful than today's phones.

Wrapping Up
The next document, Eventual Consistency, further explores the distributed nature of CouchDB. We should have given you enough bites to whet your interest. Let's go!

Eventual Consistency
In the previous document, Why CouchDB?, we saw that CouchDB's flexibility allows us to evolve our data as our applications grow and change. In this topic, we'll explore how working "with the grain" of CouchDB promotes simplicity in our applications and helps us naturally build scalable, distributed systems.

Working with the Grain
A distributed system is a system that operates robustly over a wide network. A particular feature of network computing is that network links can potentially disappear, and there are plenty of strategies for managing this type of network segmentation. CouchDB differs from others by accepting eventual consistency, as opposed to putting absolute consistency ahead of raw availability, like RDBMS or Paxos. What these systems have in common is an awareness that data acts differently when many people are accessing it simultaneously. Their approaches differ when it comes to which aspects of consistency, availability, or partition tolerance they prioritize.

Engineering distributed systems is tricky. Many of the caveats and gotchas you will face over time aren't immediately obvious.
We don't have all the solutions, and CouchDB isn't a panacea, but when you work with CouchDB's grain rather than against it, the path of least resistance leads you to naturally scalable applications.

Of course, building a distributed system is only the beginning. A website with a database that is available only half the time is next to worthless. Unfortunately, the traditional relational database approach to consistency makes it very easy for application programmers to rely on global state, global clocks, and other high availability no-nos, without even realizing that they're doing so. Before examining how CouchDB promotes scalability, we'll look at the constraints faced by a distributed system. After we've seen the problems that arise when parts of your application can't rely on being in constant contact with each other, we'll see that CouchDB provides an intuitive and useful way for modeling applications around high availability.

The CAP Theorem
The CAP theorem describes a few different strategies for distributing application logic across networks. CouchDB's solution uses replication to propagate application changes across participating nodes. This is a fundamentally different approach from consensus algorithms and relational databases, which operate at different intersections of consistency, availability, and partition tolerance.

The CAP theorem, shown in Figure 1. The CAP theorem, identifies three distinct concerns:

• Consistency: All database clients see the same data, even with concurrent updates.
• Availability: All database clients are able to access some version of the data.
• Partition tolerance: The database can be split over multiple servers.

Pick two.

Figure 1. The CAP theorem.

When a system grows large enough that a single database node is unable to handle the load placed on it, a sensible solution is to add more servers. When we add nodes, we have to start thinking about how to partition data between them. Do we have a few databases that share exactly the same data? Do we put different sets of data on different database servers? Do we let only certain database servers write data and let others handle the reads?

Regardless of which approach we take, the one problem we'll keep bumping into is that of keeping all these database servers in sync. If you write some information to one node, how are you going to make sure that a read request to another database server reflects this newest information? These events might be milliseconds apart. Even with a modest collection of database servers, this problem can become extremely complex.

When it's absolutely critical that all clients see a consistent view of the database, the users of one node will have to wait for any other nodes to come into agreement before being able to read or write to the database. In this instance, we see that availability takes a backseat to consistency. However, there are situations where availability trumps consistency:

Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you're lost. If you're concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given.

-- Werner Vogels, Amazon CTO and Vice President

If availability is a priority, we can let clients write data to one node of the database without waiting for other nodes to come into agreement.
If the database knows how to take care of reconciling these operations between nodes, we achieve a sort of eventual consistency in exchange for high availability. This is a surprisingly applicable trade-off for many applications.

Unlike traditional relational databases, where each action performed is necessarily subject to database-wide consistency checks, CouchDB makes it really simple to build applications that sacrifice immediate consistency for the huge performance improvements that come with simple distribution.

Local Consistency
Before we attempt to understand how CouchDB operates in a cluster, it's important that we understand the inner workings of a single CouchDB node. The CouchDB API is designed to provide a convenient but thin wrapper around the database core. By taking a closer look at the structure of the database core, we'll have a better understanding of the API that surrounds it.

The Key to Your Data
At the heart of CouchDB is a powerful B-tree storage engine. A B-tree is a sorted data structure that allows for searches, insertions, and deletions in logarithmic time. As Figure 2. Anatomy of a view request illustrates, CouchDB uses this B-tree storage engine for all internal data, documents, and views. If we understand one, we will understand them all.

Figure 2. Anatomy of a view request.

CouchDB uses MapReduce to compute the results of a view. MapReduce makes use of two functions, map and reduce, which are applied to each document in isolation. Being able to isolate these operations means that view computation lends itself to parallel and incremental computation. More important, because these functions produce key/value pairs, CouchDB is able to insert them into the B-tree storage engine, sorted by key. Lookups by key, or key range, are extremely efficient operations with a B-tree, described in big O notation as O(log N) and O(log N + K), respectively.

In CouchDB, we access documents and view results by key or key range. This is a direct mapping to the underlying operations performed on CouchDB's B-tree storage engine. Along with document inserts and updates, this direct mapping is the reason we describe CouchDB's API as being a thin wrapper around the database core.

Being able to access results by key alone is a very important restriction because it allows us to make huge performance gains. As well as the massive speed improvements, we can partition our data over multiple nodes, without affecting our ability to query each node in isolation. BigTable, Hadoop, SimpleDB, and memcached restrict object lookups by key for exactly these reasons.

No Locking
A table in a relational database is a single data structure. If you want to modify a table, say, update a row, the database system must ensure that nobody else is trying to update that row and that nobody can read from that row while it is being updated. The common way to handle this uses what's known as a lock. If multiple clients want to access a table, the first client gets the lock, making everybody else wait. When the first client's request is processed, the next client is given access while everybody else waits, and so on. This serial execution of requests, even when they arrived in parallel, wastes a significant amount of your server's processing power. Under high load, a relational database can spend more time figuring out who is allowed to do what, and in which order, than it does doing any actual work.
NOTE: Modern relational databases avoid locks by implementing MVCC under the hood, but hide it from the end user, requiring them to coordinate concurrent changes of single rows or fields.

Instead of locks, CouchDB uses Multi-Version Concurrency Control (MVCC) to manage concurrent access to the database. Figure 3. MVCC means no locking illustrates the differences between MVCC and traditional locking mechanisms. MVCC means that CouchDB can run at full speed, all the time, even under high load. Requests are run in parallel, making excellent use of every last drop of processing power your server has to offer.

Figure 3. MVCC means no locking.

Documents in CouchDB are versioned, much like they would be in a regular version control system such as Subversion. If you want to change a value in a document, you create an entire new version of that document and save it over the old one. After doing this, you end up with two versions of the same document, one old and one new.

How does this offer an improvement over locks? Consider a set of requests wanting to access a document. The first request reads the document. While this is being processed, a second request changes the document. Since the second request includes a completely new version of the document, CouchDB can simply append it to the database without having to wait for the read request to finish. When a third request wants to read the same document, CouchDB will point it to the new version that has just been written. During this whole process, the first request could still be reading the original version.

A read request will always see the most recent snapshot of your database at the time of the beginning of the request.

Validation
As application developers, we have to think about what sort of input we should accept and what we should reject. The expressive power to do this type of validation over complex data within a traditional relational database leaves a lot to be desired. Fortunately, CouchDB provides a powerful way to perform per-document validation from within the database.

CouchDB can validate documents using JavaScript functions similar to those used for MapReduce. Each time you try to modify a document, CouchDB will pass the validation function a copy of the existing document, a copy of the new document, and a collection of additional information, such as user authentication details. The validation function now has the opportunity to approve or deny the update.

By working with the grain and letting CouchDB do this for us, we save ourselves a tremendous amount of CPU cycles that would otherwise have been spent serializing object graphs from SQL, converting them into domain objects, and using those objects to do application-level validation.

Distributed Consistency
Maintaining consistency within a single database node is relatively easy for most databases. The real problems start to surface when you try to maintain consistency between multiple database servers. If a client makes a write operation on server A, how do we make sure that this is consistent with server B, or C, or D? For relational databases, this is a very complex problem with entire books devoted to its solution. You could use multi-master, single-master, partitioning, sharding, write-through caches, and all sorts of other complex techniques.

Incremental Replication
CouchDB's operations take place within the context of a single document.
As CouchDB achieves eventual consistency between multiple databases by using incremental replication, you no longer have to worry about your database servers being able to stay in constant communication. Incremental replication is a process where document changes are periodically copied between servers. We are able to build what's known as a "shared nothing" cluster of databases where each node is independent and self-sufficient, leaving no single point of contention across the system. Need to scale out your CouchDB database cluster? Just throw in another server.

As illustrated in Figure 4. Incremental replication between CouchDB nodes, with CouchDB's incremental replication, you can synchronize your data between any two databases however you like and whenever you like. After replication, each database is able to work independently.

You could use this feature to synchronize database servers within a cluster or between data centers using a job scheduler such as cron, or you could use it to synchronize data with your laptop for offline work as you travel. Each database can be used in the usual fashion, and changes between databases can be synchronized later in both directions.

Figure 4. Incremental replication between CouchDB nodes.

What happens when you change the same document in two different databases and want to synchronize these with each other? CouchDB's replication system comes with automatic conflict detection and resolution. When CouchDB detects that a document has been changed in both databases, it flags this document as being in conflict, much like it would be in a regular version control system.

This isn't as troublesome as it might first sound. When two versions of a document conflict during replication, the winning version is saved as the most recent version in the document's history. Instead of throwing the losing version away, as you might expect, CouchDB saves this as a previous version in the document's history, so that you can access it if you need to. This happens automatically and consistently, so both databases will make exactly the same choice.

It is up to you to handle conflicts in a way that makes sense for your application. You can leave the chosen document versions in place, revert to the older version, or try to merge the two versions and save the result.

Case Study
Greg Borenstein, a friend and coworker, built a small library for converting Songbird playlists to JSON objects and decided to store these in CouchDB as part of a backup application. The completed software uses CouchDB's MVCC and document revisions to ensure that Songbird playlists are backed up robustly between nodes.

NOTE: Songbird is a free software media player with an integrated web browser, based on the Mozilla XULRunner platform. Songbird is available for Microsoft Windows, Apple Mac OS X, Solaris, and Linux.

Let's examine the workflow of the Songbird backup application, first as a user backing up from a single computer, and then using Songbird to synchronize playlists between multiple computers. We'll see how document revisions turn what could have been a hairy problem into something that just works.

The first time we use this backup application, we feed our playlists to the application and initiate a backup. Each playlist is converted to a JSON object and handed to a CouchDB database. As illustrated in Figure 5.
Backing up to a single database, CouchDB hands back the document ID and revision of each playlist as it's saved to the database.

Figure 5. Backing up to a single database.

After a few days, we find that our playlists have been updated and we want to back up our changes. After we have fed our playlists to the backup application, it fetches the latest versions from CouchDB, along with the corresponding document revisions. When the application hands back the new playlist document, CouchDB requires that the document revision is included in the request.

CouchDB then makes sure that the document revision handed to it in the request matches the current revision held in the database. Because CouchDB updates the revision with every modification, if these two are out of sync it suggests that someone else has made changes to the document between the time we requested it from the database and the time we sent our updates. Making changes to a document after someone else has modified it without first inspecting those changes is usually a bad idea. Forcing clients to hand back the correct document revision is the heart of CouchDB's optimistic concurrency.

We have a laptop we want to keep synchronized with our desktop computer. With all our playlists on our desktop, the first step is to restore from backup onto our laptop. This is the first time we've done this, so afterward our laptop should hold an exact replica of our desktop playlist collection.

After editing our Argentine Tango playlist on our laptop to add a few new songs we've purchased, we want to save our changes. The backup application replaces the playlist document in our laptop CouchDB database and a new document revision is generated.

A few days later, we remember our new songs and want to copy the playlist across to our desktop computer. As illustrated in Figure 6. Synchronizing between two databases, the backup application copies the new document and the new revision to the desktop CouchDB database. Both CouchDB databases now have the same document revision.

Figure 6. Synchronizing between two databases.

Because CouchDB tracks document revisions, it ensures that updates like these will work only if they are based on current information. If we had made modifications to the playlist backups between synchronization, things wouldn't go as smoothly.

We back up some changes on our laptop and forget to synchronize. A few days later, we're editing playlists on our desktop computer, make a backup, and want to synchronize this to our laptop. As illustrated in Figure 7. Synchronization conflicts between two databases, when our backup application tries to replicate between the two databases, CouchDB sees that the changes being sent from our desktop computer are modifications of out-of-date documents and helpfully informs us that there has been a conflict.

Recovering from this error is easy to accomplish from an application perspective. Just download CouchDB's version of the playlist and provide an opportunity to merge the changes or save local modifications into a new playlist.

Figure 7. Synchronization conflicts between two databases.

Wrapping Up
CouchDB's design borrows heavily from web architecture and the lessons learned deploying massively distributed systems on that architecture.
By understanding why this architecture works the way it does, and by learning to spot which parts of your application can be easily distributed and which parts cannot, you'll enhance your ability to design distributed and scalable applications, with CouchDB or without it.

We've covered the main issues surrounding CouchDB's consistency model and hinted at some of the benefits to be had when you work with CouchDB and not against it. But enough theory: let's get up and running and see what all the fuss is about!

cURL: Your Command Line Friend
The curl utility is a command line tool available on Unix, Linux, Mac OS X, Windows, and many other platforms. curl provides easy access to the HTTP protocol (among others) directly from the command line and is therefore an ideal way of interacting with CouchDB over the HTTP REST API.

For simple GET requests you can supply the URL of the request. For example, to get the database information:

shell> curl http://127.0.0.1:5984

This returns the database information (formatted in the output below for clarity):

{
  "couchdb": "Welcome",
  "version": "3.0.0",
  "git_sha": "83bdcf693",
  "uuid": "56f16e7c93ff4a2dc20eb6acc7000b71",
  "features": [
    "access-ready",
    "partitioned",
    "pluggable-storage-engines",
    "reshard",
    "scheduler"
  ],
  "vendor": {
    "name": "The Apache Software Foundation"
  }
}

NOTE: For some URLs, especially those that include special characters such as ampersand, exclamation mark, or question mark, you should quote the URL you are specifying on the command line. For example:

shell> curl 'http://couchdb:5984/_uuids?count=5'

NOTE: On Microsoft Windows, use doubled double-quotes ("") anywhere you see single double-quotes. For example, if you see:

shell> curl -X PUT 'http://adm:pass@127.0.0.1:5984/demo/doc' -d '{"motto": "I love gnomes"}'

you should replace it with:

shell> curl -X PUT "http://adm:pass@127.0.0.1:5984/demo/doc" -d "{""motto"": ""I love gnomes""}"

If you prefer, ^" and \" may be used to escape the double-quote character in quoted strings instead.

You can explicitly set the HTTP command using the -X command line option. For example, when creating a database, you set the name of the database in the URL you send using a PUT request:

shell> curl -X PUT http://adm:pass@127.0.0.1:5984/demo
{"ok":true}

But to obtain the database information you use a GET request (with the return information formatted for clarity):

shell> curl -X GET http://adm:pass@127.0.0.1:5984/demo
{
  "compact_running" : false,
  "doc_count" : 0,
  "db_name" : "demo",
  "purge_seq" : 0,
  "committed_update_seq" : 0,
  "doc_del_count" : 0,
  "disk_format_version" : 5,
  "update_seq" : 0,
  "instance_start_time" : "0",
  "disk_size" : 79
}

For certain operations, you must specify the content type of the request, which you do by specifying the Content-Type header using the -H command-line option:

shell> curl -H 'Content-Type: application/json' http://127.0.0.1:5984/_uuids

You can also submit payload data, that is, data in the body of the HTTP request, using the -d option. This is useful if you need to submit JSON structures, for example document data, as part of the request. For example, to submit a simple document to the demo database:

shell> curl -H 'Content-Type: application/json' \
            -X POST http://adm:pass@127.0.0.1:5984/demo \
            -d '{"company": "Example, Inc."}'
{"ok":true,"id":"8843faaf0b831d364278331bc3001bd8",
 "rev":"1-33b9fbce46930280dab37d672bbc8bb9"}

In the above example, the argument after the -d option is the JSON of the document we want to submit.
The document can be accessed by using the automatically generated document ID that was returned:

shell> curl -X GET http://adm:pass@127.0.0.1:5984/demo/8843faaf0b831d364278331bc3001bd8
{"_id":"8843faaf0b831d364278331bc3001bd8",
 "_rev":"1-33b9fbce46930280dab37d672bbc8bb9",
 "company":"Example, Inc."}

The API samples in the API Basics show the HTTP command, URL and any payload information that needs to be submitted (and the expected return value). All of these examples can be reproduced using curl with the command-line examples shown above.

Security
In this document, we'll look at the basic security mechanisms in CouchDB: Basic Authentication and Cookie Authentication. This is how CouchDB handles users and protects their credentials.

Authentication
CouchDB has the idea of an admin user (e.g. an administrator, a super user, or root) that is allowed to do anything to a CouchDB installation. By default, one admin user must be created for CouchDB to start up successfully.

CouchDB also defines a set of requests that only admin users are allowed to do. If you have defined one or more specific admin users, CouchDB will ask for identification for certain requests:

• Creating a database (PUT /database)
• Deleting a database (DELETE /database)
• Setting up database security (PUT /database/_security)
• Creating a design document (PUT /database/_design/app)
• Updating a design document (PUT /database/_design/app?rev=1-4E2)
• Deleting a design document (DELETE /database/_design/app?rev=2-6A7)
• Triggering compaction (POST /database/_compact)
• Reading the task status list (GET /_active_tasks)
• Restarting the server on a given node (POST /_node/{node-name}/_restart)
• Reading the active configuration (GET /_node/{node-name}/_config)
• Updating the active configuration (PUT /_node/{node-name}/_config/{section}/{key})

Creating a New Admin User
If your installation process did not set up an admin user, you will have to add one to the configuration file by hand and restart CouchDB first. For the purposes of this example, we'll create a default admin user with the password "password".

WARNING: Don't just type in the following without thinking! Pick a good name for your administrator user that isn't easily guessable, and pick a secure password.

To the end of your etc/local.ini file, after the [admins] line, add the text admin = password, so it looks like this:

[admins]
admin = password

(Don't worry about the password being in plain text; we'll come back to this.)

Now, restart CouchDB using the method appropriate for your operating system. You should now be able to access CouchDB using your new administrator account:

> curl http://admin:password@127.0.0.1:5984/_up
{"status":"ok","seeds":{}}

Great!

Let's create an admin user through the HTTP API. We'll call her anna, and her password is secret. Note the double quotes in the following code; they are needed to denote a string value for the configuration API:

> HOST="http://admin:password@127.0.0.1:5984"
> NODENAME="_local"
> curl -X PUT $HOST/_node/$NODENAME/_config/admins/anna -d '"secret"'
""

As per the _config API's behavior, we're getting the previous value for the config item we just wrote. Since our admin user didn't exist, we get an empty string.

Please note that _local serves as an alias for the local node name, so for all configuration URLs, NODENAME may be set to _local, to interact with the local node's configuration.

SEE ALSO: Node Management

Hashing Passwords
Seeing the plain-text password is scary, isn't it?
No worries, CouchDB doesn't show the plain-text password anywhere. It gets hashed right away. Go ahead and look at your local.ini file now. You'll see that CouchDB has rewritten the plain-text passwords so they are hashed:

[admins]
admin = -pbkdf2-71c01cb429088ac1a1e95f3482202622dc1e53fe,226701bece4ae0fc9a373a5e02bf5d07,10
anna = -pbkdf2-2d86831c82b440b8887169bd2eebb356821d621b,5e11b9a9228414ab92541beeeacbf125,10

The hash is that big, ugly, long string that starts out with -pbkdf2-.

To compare a plain-text password during authentication with the stored hash, the hashing algorithm is run and the resulting hash is compared to the stored hash. The probability of two identical hashes for different passwords is too insignificant to mention (cf. Bruce Schneier). Should the stored hash fall into the hands of an attacker, it is, by current standards, way too inconvenient (i.e., it'd take a lot of money and time) to find the plain-text password from the hash.

When CouchDB starts up, it reads a set of .ini files with config settings. It loads these settings into an internal data store (not a database). The config API lets you read the current configuration as well as change it and create new entries. CouchDB writes any changes back to the .ini files. The .ini files can also be edited by hand when CouchDB is not running.

Instead of creating the admin user as we showed previously, you could have stopped CouchDB, opened your local.ini, added anna = secret to the [admins] section, and restarted CouchDB. Upon reading the new line from local.ini, CouchDB would run the hashing algorithm and write back the hash to local.ini, replacing the plain-text password, just as it did for our original admin user. To make sure CouchDB only hashes plain-text passwords and not an existing hash a second time, it prefixes the hash with -pbkdf2-, to distinguish between plain-text passwords and PBKDF2 hashed passwords. This means your plain-text password can't start with the characters -pbkdf2-, but that's pretty unlikely to begin with.

Basic Authentication
CouchDB will not allow us to create new databases unless we give the correct admin user credentials. Let's verify:

> HOST="http://127.0.0.1:5984"
> curl -X PUT $HOST/somedatabase
{"error":"unauthorized","reason":"You are not a server admin."}

That looks about right. Now we try again with the correct credentials:

> HOST="http://anna:secret@127.0.0.1:5984"
> curl -X PUT $HOST/somedatabase
{"ok":true}

If you have ever accessed a website or FTP server that was password-protected, the username:password@ URL variant should look familiar.

If you are security conscious, the missing "s" in http:// will make you nervous. We're sending our password to CouchDB in plain text. This is a bad thing, right? Yes, but consider our scenario: CouchDB listens on 127.0.0.1 on a development box that we're the sole user of. Who could possibly sniff our password?

If you are in a production environment, however, you need to reconsider. Will your CouchDB instance communicate over a public network? Even a LAN shared with other collocation customers is public. There are multiple ways to secure communication between you or your application and CouchDB that exceed the scope of this documentation. CouchDB as of version 1.1.0 comes with SSL built in.

SEE ALSO: Basic Authentication API Reference

Cookie Authentication
Basic authentication that uses plain-text passwords is nice and convenient, but not very secure if no extra measures are taken. It is also a very poor user experience.
If you use basic authentication to identify admins, your applications users need to deal with an ugly, unstylable browser modal dialog that says non-professional at work more than any- thing else. To remedy some of these concerns, CouchDB supports cookie authentica- tion. With cookie authentication your application doesnt have to in- clude the ugly login dialog that the users browsers come with. You can use a regular HTML form to submit logins to CouchDB. Upon receipt, CouchDB will generate a one-time token that the client can use in its next request to CouchDB. When CouchDB sees the token in a subsequent request, it will authenticate the user based on the token without the need to see the password again. By default, a token is valid for 10 minutes. To obtain the first token and thus authenticate a user for the first time, the username and password must be sent to the _session API. The API is smart enough to decode HTML form submissions, so you dont have to resort to any smarts in your application. If you are not using HTML forms to log in, you need to send an HTTP re- quest that looks as if an HTML form generated it. Luckily, this is su- per simple: > HOST="http://127.0.0.1:5984" > curl -vX POST $HOST/_session \ -H 'Content-Type:application/x-www-form-urlencoded' \ -d 'name=anna&password=secret' CouchDB replies, and well give you some more detail: < HTTP/1.1 200 OK < Set-Cookie: AuthSession=YW5uYTo0QUIzOTdFQjrC4ipN-D-53hw1sJepVzcVxnriEw; < Version=1; Path=/; HttpOnly > ... < {"ok":true} A 200 OK response code tells us all is well, a Set-Cookie header in- cludes the token we can use for the next request, and the standard JSON response tells us again that the request was successful. Now we can use this token to make another request as the same user without sending the username and password again: > curl -vX PUT $HOST/mydatabase \ --cookie AuthSession=YW5uYTo0QUIzOTdFQjrC4ipN-D-53hw1sJepVzcVxnriEw \ -H "X-CouchDB-WWW-Authenticate: Cookie" \ -H "Content-Type:application/x-www-form-urlencoded" {"ok":true} You can keep using this token for 10 minutes by default. After 10 min- utes you need to authenticate your user again. The token lifetime can be configured with the timeout (in seconds) setting in the chttpd_auth configuration section. SEE ALSO: Cookie Authentication API Reference Authentication Database You may already note that CouchDB administrators are defined within the config file and are wondering if regular users are also stored there. No, they are not. CouchDB has a special authentication database, named _users by default, that stores all registered users as JSON documents. This special database is a system database. This means that while it shares the common database API, there are some special security-related constraints applied. Below is a list of how the authentication database is different from the other databases. • Only administrators may browse list of all documents (GET /_users/_all_docs) • Only administrators may listen to changes feed (GET /_users/_changes) • Only administrators may execute design functions like views. 
• There is a special design document _auth that cannot be modified.
• Every document except the design documents represents a registered CouchDB user and belongs to that user.
• By default, the _security settings of the _users database disallow users from accessing or modifying documents.
NOTE: Settings can be changed so that users do have access to the _users database, but even then they may only access (GET /_users/org.couchdb.user:Jan) or modify (PUT /_users/org.couchdb.user:Jan) documents that they own. This will not be possible in CouchDB 4.0.
These draconian rules are necessary since CouchDB cares about its users' personal information and will not disclose it to just anyone. Often, user documents contain system information like login, password hash and roles, in addition to sensitive personal information like real name, email, phone, special internal identifications and more. This is not information that you want to share with the world.
Users Documents
Each CouchDB user is stored in document format. These documents contain several mandatory fields that CouchDB needs for authentication:
• _id (string): Document ID. Contains the user's login with a special prefix (see Why the org.couchdb.user: prefix? below).
• derived_key (string): PBKDF2 key derived from prf/salt/iterations.
• name (string): User's name, aka login. Immutable, i.e. you cannot rename an existing user - you have to create a new one.
• roles (array of string): List of user roles. CouchDB doesn't provide any built-in roles, so you're free to define your own depending on your needs. However, you cannot set system roles like _admin there. Also, only administrators may assign roles to users - by default all users have no roles.
• password (string): A plain-text password can be provided, but it will be replaced by hashed fields before the document is actually stored.
• password_sha (string): Hashed password with salt. Used for the simple password_scheme.
• password_scheme (string): Password hashing scheme. May be simple or pbkdf2.
• salt (string): Hash salt. Used for both the simple and pbkdf2 password_scheme options.
• iterations (integer): Number of iterations used to derive the key, for the pbkdf2 password_scheme. See the configuration API for details.
• pbkdf2_prf (string): The PRF to use for pbkdf2. If missing, sha is assumed. Can be any of sha, sha224, sha256, sha384, sha512.
• type (string): Document type. Always has the value user.
Additionally, you may specify any custom fields that relate to the target user.
Why the org.couchdb.user: prefix?
The reason there is a special prefix before a user's login name is to have namespaces that users belong to. This prefix is designed to prevent replication conflicts when you try merging two or more _users databases. For current CouchDB releases, all users belong to the same org.couchdb.user namespace and this cannot be changed. This may be changed in future releases.
Creating a New User
Creating a new user is a trivial operation. You just need to do a PUT request with the user's data to CouchDB.
Lets create a user with lo- gin jan and password apple: curl -X PUT http://admin:password@localhost:5984/_users/org.couchdb.user:jan \ -H "Accept: application/json" \ -H "Content-Type: application/json" \ -d '{"name": "jan", "password": "apple", "roles": [], "type": "user"}' This curl command will produce the following HTTP request: PUT /_users/org.couchdb.user:jan HTTP/1.1 Accept: application/json Content-Length: 62 Content-Type: application/json Host: localhost:5984 User-Agent: curl/7.31.0 And CouchDB responds with: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 83 Content-Type: application/json Date: Fri, 27 Sep 2013 07:33:28 GMT ETag: "1-e0ebfb84005b920488fc7a8cc5470cc0" Location: http://localhost:5984/_users/org.couchdb.user:jan Server: CouchDB (Erlang OTP) {"ok":true,"id":"org.couchdb.user:jan","rev":"1-e0ebfb84005b920488fc7a8cc5470cc0"} The document was successfully created! The user jan should now exist in our database. Lets check if this is true: curl -X POST http://localhost:5984/_session -d 'name=jan&password=apple' CouchDB should respond with: {"ok":true,"name":"jan","roles":[]} This means that the username was recognized and the passwords hash matches with the stored one. If we specify an incorrect login and/or password, CouchDB will notify us with the following error message: {"error":"unauthorized","reason":"Name or password is incorrect."} Password Changing Lets define what is password changing from the point of view of CouchDB and the authentication database. Since users are documents, this opera- tion is just updating the document with a special field password which contains the plain text password. Scared? No need to be. The authenti- cation database has a special internal hook on document update which looks for this field and replaces it with the secured hash depending on the chosen password_scheme. Summarizing the above process - we need to get the documents content, add the password field with the new password in plain text and then store the JSON result to the authentication database. curl -X GET http://admin:password@localhost:5984/_users/org.couchdb.user:jan { "_id": "org.couchdb.user:jan", "_rev": "1-e0ebfb84005b920488fc7a8cc5470cc0", "derived_key": "e579375db0e0c6a6fc79cd9e36a36859f71575c3", "iterations": 10, "name": "jan", "password_scheme": "pbkdf2", "roles": [], "salt": "1112283cf988a34f124200a050d308a1", "type": "user" } Here is our users document. We may strip hashes from the stored docu- ment to reduce the amount of posted data: curl -X PUT http://admin:password@localhost:5984/_users/org.couchdb.user:jan \ -H "Accept: application/json" \ -H "Content-Type: application/json" \ -H "If-Match: 1-e0ebfb84005b920488fc7a8cc5470cc0" \ -d '{"name":"jan", "roles":[], "type":"user", "password":"orange"}' {"ok":true,"id":"org.couchdb.user:jan","rev":"2-ed293d3a0ae09f0c624f10538ef33c6f"} Updated! Now lets check that the password was really changed: curl -X POST http://localhost:5984/_session -d 'name=jan&password=apple' CouchDB should respond with: {"error":"unauthorized","reason":"Name or password is incorrect."} Looks like the password apple is wrong, what about orange? curl -X POST http://localhost:5984/_session -d 'name=jan&password=orange' CouchDB should respond with: {"ok":true,"name":"jan","roles":[]} Hooray! You may wonder why this was so complex - we need to retrieve users document, add a special field to it, and post it back. NOTE: There is no password confirmation for API request: you should imple- ment it in your application layer. 
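If you change passwords regularly, the retrieve-and-update dance is easy to script. Below is a minimal sketch, not an official tool: it assumes the admin credentials used throughout this section, that the jq utility is installed to pull the current revision out of the document, and a hypothetical new password of melon.

   # fetch the current revision of jan's user document
   REV=$(curl -s http://admin:password@localhost:5984/_users/org.couchdb.user:jan | jq -r ._rev)
   # write the document back with the new plain-text password;
   # CouchDB hashes it before the document is stored
   curl -X PUT http://admin:password@localhost:5984/_users/org.couchdb.user:jan \
        -H "Content-Type: application/json" \
        -H "If-Match: $REV" \
        -d '{"name":"jan", "roles":[], "type":"user", "password":"melon"}'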
Authorization Now that you have a few users who can log in, you probably want to set up some restrictions on what actions they can perform based on their identity and their roles. Each database on a CouchDB server can con- tain its own set of authorization rules that specify which users are allowed to read and write documents, create design documents, and change certain database configuration parameters. The authorization rules are set up by a server admin and can be modified at any time. Database authorization rules assign a user into one of two classes: • members, who are allowed to read all documents and create and modify any document except for design documents. • admins, who can read and write all types of documents, modify which users are members or admins, and set certain per-database configura- tion options. Note that a database admin is not the same as a server admin the ac- tions of a database admin are restricted to a specific database. All databases are created as admin-only by default. That is, only data- base admins may read or write. The default behavior can be configured with the [couchdb] default_security option. If you set that option to everyone, HTTP requests that have no authentication credentials or have credentials for a normal user are treated as members, and those with server admin credentials are treated as database admins. You can also modify the permissions after the database is created by modifying the security document in the database: > curl -X PUT http://localhost:5984/mydatabase/_security \ -u anna:secret \ -H "Content-Type: application/json" \ -d '{"admins": { "names": [], "roles": [] }, "members": { "names": ["jan"], "roles": [] } }' The HTTP request to create or update the _security document must con- tain the credentials of a server admin. CouchDB will respond with: {"ok":true} The database is now secured against anonymous reads and writes: > curl http://localhost:5984/mydatabase/ {"error":"unauthorized","reason":"You are not authorized to access this db."} You declared user jan as a member in this database, so he is able to read and write normal documents: > curl -u jan:orange http://localhost:5984/mydatabase/ {"db_name":"mydatabase","doc_count":1,"doc_del_count":0,"update_seq":3,"purge_seq":0, "compact_running":false,"sizes":{"active":272,"disk":12376,"external":350}, "instance_start_time":"0","disk_format_version":6,"committed_update_seq":3} If Jan attempted to create a design doc, however, CouchDB would return a 401 Unauthorized error because the username jan is not in the list of admin names and the /_users/org.couchdb.user:jan document doesnt con- tain a role that matches any of the declared admin roles. If you want to promote Jan to an admin, you can update the security document to add jan to the names array under admin. Keeping track of individual data- base admin usernames is tedious, though, so you would likely prefer to create a database admin role and assign that role to the org.couchdb.user:jan user document: > curl -X PUT http://localhost:5984/mydatabase/_security \ -u anna:secret \ -H "Content-Type: application/json" \ -d '{"admins": { "names": [], "roles": ["mydatabase_admin"] }, "members": { "names": [], "roles": [] } }' See the _security document reference page for additional details about specifying database members and admins. Getting Started In this document, well take a quick tour of CouchDBs features. Well create our first document and experiment with CouchDB views. All Systems Are Go! 
Well have a very quick look at CouchDBs bare-bones Application Program- ming Interface (API) by using the command-line utility curl. Please note that this is not the only way of talking to CouchDB. We will show you plenty more throughout the rest of the documents. Whats interesting about curl is that it gives you control over raw HTTP requests, and you can see exactly what is going on underneath the hood of your database. Make sure CouchDB is still running, and then do: curl http://127.0.0.1:5984/ This issues a GET request to your newly installed CouchDB instance. The reply should look something like: { "couchdb": "Welcome", "version": "3.0.0", "git_sha": "83bdcf693", "uuid": "56f16e7c93ff4a2dc20eb6acc7000b71", "features": [ "access-ready", "partitioned", "pluggable-storage-engines", "reshard", "scheduler" ], "vendor": { "name": "The Apache Software Foundation" } } Not all that spectacular. CouchDB is saying hello with the running ver- sion number. Next, we can get a list of databases: curl -X GET http://admin:password@127.0.0.1:5984/_all_dbs All we added to the previous request is the _all_dbs string, and our admin user name and password (set when installing CouchDB). The response should look like: ["_replicator","_users"] NOTE: In case this returns an empty Array for you, it means you havent finished installation correctly. Please refer to Setup for further information on this. For the purposes of this example, well not be showing the system databases past this point. In your installation, any time you GET /_all_dbs, you should see the system databases in the list, too. Oh, thats right, we didnt create any user databases yet! NOTE: The curl command issues GET requests by default. You can issue POST requests using curl -X POST. To make it easy to work with our termi- nal history, we usually use the -X option even when issuing GET re- quests. If we want to send a POST next time, all we have to change is the method. HTTP does a bit more under the hood than you can see in the examples here. If youre interested in every last detail that goes over the wire, pass in the -v option (e.g., curl -vX GET), which will show you the server curl tries to connect to, the request headers it sends, and response headers it receives back. Great for debugging! Lets create a database: curl -X PUT http://admin:password@127.0.0.1:5984/baseball CouchDB will reply with: {"ok":true} Retrieving the list of databases again shows some useful results this time: curl -X GET http://admin:password@127.0.0.1:5984/_all_dbs ["baseball"] NOTE: We should mention JavaScript Object Notation (JSON) here, the data format CouchDB speaks. JSON is a lightweight data interchange format based on JavaScript syntax. Because JSON is natively compatible with JavaScript, your web browser is an ideal client for CouchDB. Brackets ([]) represent ordered lists, and curly braces ({}) repre- sent key/value dictionaries. Keys must be strings, delimited by quotes ("), and values can be strings, numbers, booleans, lists, or key/value dictionaries. For a more detailed description of JSON, see Appendix E, JSON Primer. Lets create another database: curl -X PUT http://admin:password@127.0.0.1:5984/baseball CouchDB will reply with: {"error":"file_exists","reason":"The database could not be created, the file already exists."} We already have a database with that name, so CouchDB will respond with an error. 
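If you are curious which HTTP status code hides behind that error message, you can ask curl to print it. This is just a quick sketch using curl's -w option; creating a database that already exists yields a 412 Precondition Failed response:

   curl -s -o /dev/null -w '%{http_code}\n' -X PUT http://admin:password@127.0.0.1:5984/baseball
   412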
Lets try again with a different database name: curl -X PUT http://admin:password@127.0.0.1:5984/plankton CouchDB will reply with: {"ok":true} Retrieving the list of databases yet again shows some useful results: curl -X GET http://admin:password@127.0.0.1:5984/_all_dbs CouchDB will respond with: ["baseball", "plankton"] To round things off, lets delete the second database: curl -X DELETE http://admin:password@127.0.0.1:5984/plankton CouchDB will reply with: {"ok":true} The list of databases is now the same as it was before: curl -X GET http://admin:password@127.0.0.1:5984/_all_dbs CouchDB will respond with: ["baseball"] For brevity, well skip working with documents, as the next section cov- ers a different and potentially easier way of working with CouchDB that should provide experience with this. As we work through the example, keep in mind that under the hood everything is being done by the appli- cation exactly as you have been doing here manually. Everything is done using GET, PUT, POST, and DELETE with a URI. Welcome to Fauxton After having seen CouchDBs raw API, lets get our feet wet by playing with Fauxton, the built-in administration interface. Fauxton provides full access to all of CouchDBs features and makes it easy to work with some of the more complex ideas involved. With Fauxton we can create and destroy databases; view and edit documents; compose and run MapReduce views; and trigger replication between databases. To load Fauxton in your browser, visit: http://127.0.0.1:5984/_utils/ and log in when prompted with your admin password. In later documents, well focus on using CouchDB from server-side lan- guages such as Ruby and Python. As such, this document is a great op- portunity to showcase an example of natively serving up a dynamic web application using nothing more than CouchDBs integrated web server, something you may wish to do with your own applications. The first thing we should do with a fresh installation of CouchDB is run the test suite to verify that everything is working properly. This assures us that any problems we may run into arent due to bothersome issues with our setup. By the same token, failures in the Fauxton test suite are a red flag, telling us to double-check our installation be- fore attempting to use a potentially broken database server, saving us the confusion when nothing seems to be working quite like we expect! To validate your installation, click on the Verify link on the left-hand side, then press the green Verify Installation button. All tests should pass with a check mark. If any fail, re-check your instal- lation steps. Your First Database and Document Creating a database in Fauxton is simple. From the overview page, click Create Database. When asked for a name, enter hello-world and click the Create button. After your database has been created, Fauxton will display a list of all its documents. This list will start out empty, so lets create our first document. Click the plus sign next to All Documents and select the New Doc link. CouchDB will generate a UUID for you. For demoing purposes, having CouchDB assign a UUID is fine. When you write your first programs, we recommend assigning your own UUIDs. If you rely on the server to generate the UUID and you end up making two POST requests because the first POST request bombed out, you might gen- erate two docs and never find out about the first one because only the second one will be reported back. Generating your own UUIDs makes sure that youll never end up with duplicate documents. 
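One way to follow that advice from the command line (a sketch only, assuming the jq utility is installed and the hello-world database from this tutorial) is to fetch a UUID first and then PUT the document under that ID yourself; retrying the PUT can then never create a second, forgotten document:

   UUID=$(curl -s http://127.0.0.1:5984/_uuids | jq -r '.uuids[0]')
   curl -X PUT http://admin:password@127.0.0.1:5984/hello-world/$UUID \
        -d '{"hello": "my new value"}'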
Fauxton will display the newly created document, with its _id field. To create a new field, simply use the editor to write valid JSON. Add a new field by appending a comma to the _id value, then adding the text: "hello": "my new value" Click the green Create Document button to finalize creating the docu- ment. You can experiment with other JSON values; e.g., [1, 2, "c"] or {"foo": "bar"}. Youll notice that the documents _rev has been added. Well go into more detail about this in later documents, but for now, the important thing to note is that _rev acts like a safety feature when saving a document. As long as you and CouchDB agree on the most recent _rev of a document, you can successfully save your changes. For clarity, you may want to display the contents of the document in the all document view. To enable this, from the upper-right corner of the window, select Options, then check the Include Docs option. Fi- nally, press the Run Query button. The full document should be dis- played along with the _id and _rev values. Running a Mango Query Now that we have stored documents successfully, we want to be able to query them. The easiest way to do this in CouchDB is running a Mango Query. There are always two parts to a Mango Query: the index and the selector. The index specifies which fields we want to be able to query on, and the selector includes the actual query parameters that define what we are looking for exactly. Indexes are stored as rows that are kept sorted by the fields you spec- ify. This makes retrieving data from a range of keys efficient even when there are thousands or millions of rows. Before we can run an example query, well need some data to run it on. Well create documents with information about movies. Lets create docu- ments for three movies. (Allow CouchDB to generate the _id and _rev fields.) Use Fauxton to create documents that have a final JSON struc- ture that look like this: { "_id": "00a271787f89c0ef2e10e88a0c0001f4", "type": "movie", "title": "My Neighbour Totoro", "year": 1988, "director": "miyazaki", "rating": 8.2 } { "_id": "00a271787f89c0ef2e10e88a0c0003f0", "type": "movie", "title": "Kikis Delivery Service", "year": 1989, "director": "miyazaki", "rating": 7.8 } { "_id": "00a271787f89c0ef2e10e88a0c00048b", "type": "movie", "title": "Princess Mononoke", "year": 1997, "director": "miyazaki", "rating": 8.4 } Now we want to be able to find a movie by its release year, we need to create a Mango Index. To do this, go to Run A Query with Mango in the Database overview. Then click on manage indexes, and change the index field on the left to look like this: { "index": { "fields": [ "year" ] }, "name": "year-json-index", "type": "json" } This defines an index on the field year and allows us to send queries for documents from a specific year. Next, click on edit query and change the Mango Query to look like this: { "selector": { "year": { "$eq": 1988 } } } Then click on Run Query. The result should be a single result, the movie My Neighbour Totoro which has the year value of 1988. $eq here stands for equal. NOTE: Note that if you skip adding the index, the query will still return the correct results, although you will see a warning about not using a pre-existing index. Not using an index will work fine on small databases and is acceptable for testing out queries in development or training, but we very strongly discourage doing this in any other case, since an index is absolutely vital to good query performance. 
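Everything Fauxton does here goes over the HTTP API, so the same index and query can be issued with curl against the /db/_index and /db/_find endpoints. A sketch, assuming the movie documents were stored in the hello-world database; substitute your own database name:

   curl -X POST http://admin:password@127.0.0.1:5984/hello-world/_index \
        -H "Content-Type: application/json" \
        -d '{"index": {"fields": ["year"]}, "name": "year-json-index", "type": "json"}'
   curl -X POST http://admin:password@127.0.0.1:5984/hello-world/_find \
        -H "Content-Type: application/json" \
        -d '{"selector": {"year": {"$eq": 1988}}}'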
You can also query for all movies during the 1980s, with this selector:
{ "selector": { "year": { "$lt": 1990, "$gte": 1980 } } }
The result is the two movies from 1988 and 1989. $lt here means less than, and $gte means greater than or equal to. The latter currently doesn't have any effect, given that all of our movies are more recent than 1980, but it makes the query future-proof and allows us to add older movies later.
Triggering Replication
Fauxton can trigger replication between two local databases, between a local and a remote database, or even between two remote databases. We'll show you how to replicate data from one local database to another, which is a simple way of making backups of your databases as we're working through the examples.
First we'll need to create an empty database to be the target of replication. Return to the Databases overview and create a database called hello-replication. Now click Replication in the sidebar and choose hello-world as the source and hello-replication as the target. Click Replicate to replicate your database.
To view the result of your replication, click on the Databases tab again. You should see that the hello-replication database has the same number of documents as the hello-world database, and it should take up roughly the same amount of space as well.
NOTE: For larger databases, replication can take much longer. It is important to leave the browser window open while replication is taking place. As an alternative, you can trigger replication via curl or some other HTTP client that can handle long-running connections. If your client closes the connection before replication finishes, you'll have to retrigger it. Luckily, CouchDB's replication can take over from where it left off instead of starting from scratch.
Wrapping Up
Now that you've seen most of Fauxton's features, you'll be prepared to dive in and inspect your data as we build our example application in the next few documents. Fauxton's pure JavaScript approach to managing CouchDB shows how it's possible to build a fully featured web application using only CouchDB's HTTP API and integrated web server.
But before we get there, we'll have another look at CouchDB's HTTP API, now with a magnifying glass. Let's curl up on the couch and relax.
The Core API
This document explores the CouchDB API in minute detail. It shows all the nitty-gritty and clever bits. We show you best practices and guide you around common pitfalls.
We start out by revisiting the basic operations we ran in the previous document, Getting Started, looking behind the scenes. We also show what Fauxton needs to do behind its user interface to give us the nice features we saw earlier.
This document is both an introduction to the core CouchDB API as well as a reference. If you can't remember how to run a particular request or why some parameters are needed, you can always come back here and look things up (we are probably the heaviest users of this document). While explaining the API bits and pieces, we sometimes need to take a larger detour to explain the reasoning for a particular request. This is a good opportunity for us to tell you why CouchDB works the way it does.
The API can be subdivided into the following sections. We'll explore them individually:
• Server
• Databases
• Documents
• Replication
• Wrapping Up
Server
This one is basic and simple. It can serve as a sanity check to see if CouchDB is running at all. It can also act as a safety guard for libraries that require a certain version of CouchDB, as sketched below.
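As a sketch of such a guard (assuming a POSIX shell and the jq utility, neither of which CouchDB itself requires), a deployment script could refuse to talk to a server that is too old:

   HOST="http://127.0.0.1:5984"
   version=$(curl -s $HOST/ | jq -r .version)
   case "$version" in
     3.*) echo "CouchDB $version is new enough" ;;
     *)   echo "this script needs CouchDB 3.x, found $version" >&2; exit 1 ;;
   esac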
Were using the curl utility again: curl http://127.0.0.1:5984/ CouchDB replies, all excited to get going: { "couchdb": "Welcome", "version": "3.0.0", "git_sha": "83bdcf693", "uuid": "56f16e7c93ff4a2dc20eb6acc7000b71", "features": [ "access-ready", "partitioned", "pluggable-storage-engines", "reshard", "scheduler" ], "vendor": { "name": "The Apache Software Foundation" } } You get back a JSON string, that, if parsed into a native object or data structure of your programming language, gives you access to the welcome string and version information. This is not terribly useful, but it illustrates nicely the way CouchDB behaves. You send an HTTP request and you receive a JSON string in the HTTP response as a result. Databases Now lets do something a little more useful: create databases. For the strict, CouchDB is a database management system (DMS). That means it can hold multiple databases. A database is a bucket that holds related data. Well explore later what that means exactly. In practice, the terminology is overlapping often people refer to a DMS as a database and also a database within the DMS as a database. We might follow that slight oddity, so dont get confused by it. In general, it should be clear from the context if we are talking about the whole of CouchDB or a single database within CouchDB. Now lets make one! We want to store our favorite music albums, and we creatively give our database the name albums. Note that were now using the -X option again to tell curl to send a PUT request instead of the default GET request: curl -X PUT http://admin:password@127.0.0.1:5984/albums CouchDB replies: {"ok":true} Thats it. You created a database and CouchDB told you that all went well. What happens if you try to create a database that already ex- ists? Lets try to create that database again: curl -X PUT http://admin:password@127.0.0.1:5984/albums CouchDB replies: {"error":"file_exists","reason":"The database could not be created, the file already exists."} We get back an error. This is pretty convenient. We also learn a little bit about how CouchDB works. CouchDB stores each database in a single file. Very simple. Lets create another database, this time with curls -v (for verbose) op- tion. The verbose option tells curl to show us not only the essentials the HTTP response body but all the underlying request and response de- tails: curl -vX PUT http://admin:password@127.0.0.1:5984/albums-backup curl elaborates: * About to connect() to 127.0.0.1 port 5984 (#0) * Trying 127.0.0.1... connected * Connected to 127.0.0.1 (127.0.0.1) port 5984 (#0) > PUT /albums-backup HTTP/1.1 > User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3 > Host: 127.0.0.1:5984 > Accept: */* > < HTTP/1.1 201 Created < Server: CouchDB (Erlang/OTP) < Date: Sun, 05 Jul 2009 22:48:28 GMT < Content-Type: text/plain;charset=utf-8 < Content-Length: 12 < Cache-Control: must-revalidate < {"ok":true} * Connection #0 to host 127.0.0.1 left intact * Closing connection #0 What a mouthful. Lets step through this line by line to understand whats going on and find out whats important. Once youve seen this out- put a few times, youll be able to spot the important bits more easily. * About to connect() to 127.0.0.1 port 5984 (#0) This is curl telling us that it is going to establish a TCP connection to the CouchDB server we specified in our request URI. Not at all im- portant, except when debugging networking issues. * Trying 127.0.0.1... 
connected * Connected to 127.0.0.1 (127.0.0.1) port 5984 (#0) curl tells us it successfully connected to CouchDB. Again, not impor- tant if you arent trying to find problems with your network. The following lines are prefixed with > and < characters. The > means the line was sent to CouchDB verbatim (without the actual >). The < means the line was sent back to curl by CouchDB. > PUT /albums-backup HTTP/1.1 This initiates an HTTP request. Its method is PUT, the URI is /al- bums-backup, and the HTTP version is HTTP/1.1. There is also HTTP/1.0, which is simpler in some cases, but for all practical reasons you should be using HTTP/1.1. Next, we see a number of request headers. These are used to provide ad- ditional details about the request to CouchDB. > User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3 The User-Agent header tells CouchDB which piece of client software is doing the HTTP request. We dont learn anything new: its curl. This header is often useful in web development when there are known errors in client implementations that a server might want to prepare the re- sponse for. It also helps to determine which platform a user is on. This information can be used for technical and statistical reasons. For CouchDB, the User-Agent header is irrelevant. > Host: 127.0.0.1:5984 The Host header is required by HTTP 1.1. It tells the server the host- name that came with the request. > Accept: */* The Accept header tells CouchDB that curl accepts any media type. Well look into why this is useful a little later. > An empty line denotes that the request headers are now finished and the rest of the request contains data were sending to the server. In this case, were not sending any data, so the rest of the curl output is ded- icated to the HTTP response. < HTTP/1.1 201 Created The first line of CouchDBs HTTP response includes the HTTP version in- formation (again, to acknowledge that the requested version could be processed), an HTTP status code, and a status code message. Different requests trigger different response codes. Theres a whole range of them telling the client (curl in our case) what effect the request had on the server. Or, if an error occurred, what kind of error. RFC 2616 (the HTTP 1.1 specification) defines clear behavior for response codes. CouchDB fully follows the RFC. The 201 Created status code tells the client that the resource the re- quest was made against was successfully created. No surprise here, but if you remember that we got an error message when we tried to create this database twice, you now know that this response could include a different response code. Acting upon responses based on response codes is a common practice. For example, all response codes of 400 Bad Re- quest or larger tell you that some error occurred. If you want to shortcut your logic and immediately deal with the error, you could just check a >= 400 response code. < Server: CouchDB (Erlang/OTP) The Server header is good for diagnostics. It tells us which CouchDB version and which underlying Erlang version we are talking to. In gen- eral, you can ignore this header, but it is good to know its there if you need it. < Date: Sun, 05 Jul 2009 22:48:28 GMT The Date header tells you the time of the server. Since client and server time are not necessarily synchronized, this header is purely in- formational. You shouldnt build any critical application logic on top of this! 
< Content-Type: text/plain;charset=utf-8 The Content-Type header tells you which MIME type the HTTP response body is and its encoding. We already know CouchDB returns JSON strings. The appropriate Content-Type header is application/json. Why do we see text/plain? This is where pragmatism wins over purity. Sending an ap- plication/json Content-Type header will make a browser offer you the returned JSON for download instead of just displaying it. Since it is extremely useful to be able to test CouchDB from a browser, CouchDB sends a text/plain content type, so all browsers will display the JSON as text. NOTE: There are some extensions that make your browser JSON-aware, but they are not installed by default. For more information, look at the popular JSONView extension, available for both Firefox and Chrome. Do you remember the Accept request header and how it is set to */* to express interest in any MIME type? If you send Accept: application/json in your request, CouchDB knows that you can deal with a pure JSON re- sponse with the proper Content-Type header and will use it instead of text/plain. < Content-Length: 12 The Content-Length header simply tells us how many bytes the response body has. < Cache-Control: must-revalidate This Cache-Control header tells you, or any proxy server between CouchDB and you, not to cache this response. < This empty line tells us were done with the response headers and what follows now is the response body. {"ok":true} Weve seen this before. * Connection #0 to host 127.0.0.1 left intact * Closing connection #0 The last two lines are curl telling us that it kept the TCP connection it opened in the beginning open for a moment, but then closed it after it received the entire response. Throughout the documents, well show more requests with the -v option, but well omit some of the headers weve seen here and include only those that are important for the particular request. Creating databases is all fine, but how do we get rid of one? Easy just change the HTTP method: > curl -vX DELETE http://admin:password@127.0.0.1:5984/albums-backup This deletes a CouchDB database. The request will remove the file that the database contents are stored in. There is no Are you sure? safety net or any Empty the trash magic youve got to do to delete a database. Use this command with care. Your data will be deleted without a chance to bring it back easily if you dont have a backup copy. This section went knee-deep into HTTP and set the stage for discussing the rest of the core CouchDB API. Next stop: documents. Documents Documents are CouchDBs central data structure. The idea behind a docu- ment is, unsurprisingly, that of a real-world document a sheet of pa- per such as an invoice, a recipe, or a business card. We already learned that CouchDB uses the JSON format to store documents. Lets see how this storing works at the lowest level. Each document in CouchDB has an ID. This ID is unique per database. You are free to choose any string to be the ID, but for best results we recommend a UUID (or GUID), i.e., a Universally (or Globally) Unique IDentifier. UUIDs are random numbers that have such a low collision probability that everybody can make thousands of UUIDs a minute for millions of years without ever creating a duplicate. This is a great way to ensure two independent people cannot create two different docu- ments with the same ID. Why should you care what somebody else is do- ing? 
For one, that somebody else could be you at a later time or on a different computer; secondly, CouchDB replication lets you share docu- ments with others and using UUIDs ensures that it all works. But more on that later; lets make some documents: curl -X PUT http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af -d '{"title":"There is Nothing Left to Lose","artist":"Foo Fighters"}' CouchDB replies: {"ok":true,"id":"6e1295ed6c29495e54cc05947f18c8af","rev":"1-2902191555"} The curl command appears complex, but lets break it down. First, -X PUT tells curl to make a PUT request. It is followed by the URL that specifies your CouchDB IP address and port. The resource part of the URL /albums/6e1295ed6c29495e54cc05947f18c8af specifies the location of a document inside our albums database. The wild collection of numbers and characters is a UUID. This UUID is your documents ID. Finally, the -d flag tells curl to use the following string as the body for the PUT request. The string is a simple JSON structure including title and artist attributes with their respective values. NOTE: If you dont have a UUID handy, you can ask CouchDB to give you one (in fact, that is what we did just now without showing you). Simply send a GET /_uuids request: curl -X GET http://127.0.0.1:5984/_uuids CouchDB replies: {"uuids":["6e1295ed6c29495e54cc05947f18c8af"]} Voil , a UUID. If you need more than one, you can pass in the ?count=10 HTTP parameter to request 10 UUIDs, or really, any number you need. To double-check that CouchDB isnt lying about having saved your docu- ment (it usually doesnt), try to retrieve it by sending a GET request: curl -X GET http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af We hope you see a pattern here. Everything in CouchDB has an address, a URI, and you use the different HTTP methods to operate on these URIs. CouchDB replies: {"_id":"6e1295ed6c29495e54cc05947f18c8af","_rev":"1-2902191555","title":"There is Nothing Left to Lose","artist":"Foo Fighters"} This looks a lot like the document you asked CouchDB to save, which is good. But you should notice that CouchDB added two fields to your JSON structure. The first is _id, which holds the UUID we asked CouchDB to save our document under. We always know the ID of a document if it is included, which is very convenient. The second field is _rev. It stands for revision. Revisions If you want to change a document in CouchDB, you dont tell it to go and find a field in a specific document and insert a new value. Instead, you load the full document out of CouchDB, make your changes in the JSON structure (or object, when you are doing actual programming), and save the entire new revision (or version) of that document back into CouchDB. Each revision is identified by a new _rev value. If you want to update or delete a document, CouchDB expects you to in- clude the _rev field of the revision you wish to change. When CouchDB accepts the change, it will generate a new revision number. This mecha- nism ensures that, in case somebody else made a change without you knowing before you got to request the document update, CouchDB will not accept your update because you are likely to overwrite data you didnt know existed. Or simplified: whoever saves a change to a document first, wins. 
Lets see what happens if we dont provide a _rev field (which is equivalent to providing a outdated value): curl -X PUT http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af \ -d '{"title":"There is Nothing Left to Lose","artist":"Foo Fighters","year":"1997"}' CouchDB replies: {"error":"conflict","reason":"Document update conflict."} If you see this, add the latest revision number of your document to the JSON structure: curl -X PUT http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af \ -d '{"_rev":"1-2902191555","title":"There is Nothing Left to Lose","artist":"Foo Fighters","year":"1997"}' Now you see why it was handy that CouchDB returned that _rev when we made the initial request. CouchDB replies: {"ok":true,"id":"6e1295ed6c29495e54cc05947f18c8af","rev":"2-8aff9ee9d06671fa89c99d20a4b3ae"} CouchDB accepted your write and also generated a new revision number. The revision number is the MD5 hash of the transport representation of a document with an N- prefix denoting the number of times a document got updated. This is useful for replication. See Replication and con- flict model for more information. There are multiple reasons why CouchDB uses this revision system, which is also called Multi-Version Concurrency Control (MVCC). They all work hand-in-hand, and this is a good opportunity to explain some of them. One of the aspects of the HTTP protocol that CouchDB uses is that it is stateless. What does that mean? When talking to CouchDB you need to make requests. Making a request includes opening a network connection to CouchDB, exchanging bytes, and closing the connection. This is done every time you make a request. Other protocols allow you to open a con- nection, exchange bytes, keep the connection open, exchange more bytes later maybe depending on the bytes you exchanged at the beginning and eventually close the connection. Holding a connection open for later use requires the server to do extra work. One common pattern is that for the lifetime of a connection, the client has a consistent and sta- tic view of the data on the server. Managing huge amounts of parallel connections is a significant amount of work. HTTP connections are usu- ally short-lived, and making the same guarantees is a lot easier. As a result, CouchDB can handle many more concurrent connections. Another reason CouchDB uses MVCC is that this model is simpler concep- tually and, as a consequence, easier to program. CouchDB uses less code to make this work, and less code is always good because the ratio of defects per lines of code is static. The revision system also has positive effects on replication and stor- age mechanisms, but well explore these later in the documents. WARNING: The terms version and revision might sound familiar (if you are pro- gramming without version control, stop reading this guide right now and start learning one of the popular systems). Using new versions for document changes works a lot like version control, but theres an important difference: CouchDB does not guarantee that older versions are kept around. Dont use the ``_rev`` token in CouchDB as a revi- sion control system for your documents. Documents in Detail Now lets have a closer look at our document creation requests with the curl -v flag that was helpful when we explored the database API ear- lier. This is also a good opportunity to create more documents that we can use in later examples. Well add some more of our favorite music albums. Get a fresh UUID from the /_uuids resource. 
If you dont remember how that works, you can look it up a few pages back. curl -vX PUT http://admin:password@127.0.0.1:5984/albums/70b50bfa0a4b3aed1f8aff9e92dc16a0 \ -d '{"title":"Blackened Sky","artist":"Biffy Clyro","year":2002}' NOTE: By the way, if you happen to know more information about your fa- vorite albums, dont hesitate to add more properties. And dont worry about not knowing all the information for all the albums. CouchDBs schema-less documents can contain whatever you know. After all, you should relax and not worry about data. Now with the -v option, CouchDBs reply (with only the important bits shown) looks like this: > PUT /albums/70b50bfa0a4b3aed1f8aff9e92dc16a0 HTTP/1.1 > < HTTP/1.1 201 Created < Location: http://127.0.0.1:5984/albums/70b50bfa0a4b3aed1f8aff9e92dc16a0 < ETag: "1-e89c99d29d06671fa0a4b3ae8aff9e" < {"ok":true,"id":"70b50bfa0a4b3aed1f8aff9e92dc16a0","rev":"1-e89c99d29d06671fa0a4b3ae8aff9e"} Were getting back the 201 Created HTTP status code in the response headers, as we saw earlier when we created a database. The Location header gives us a full URL to our newly created document. And theres a new header. An ETag in HTTP-speak identifies a specific version of a resource. In this case, it identifies a specific version (the first one) of our new document. Sound familiar? Yes, conceptually, an ETag is the same as a CouchDB document revision number, and it shouldnt come as a surprise that CouchDB uses revision numbers for ETags. ETags are use- ful for caching infrastructures. Attachments CouchDB documents can have attachments just like an email message can have attachments. An attachment is identified by a name and includes its MIME type (or Content-Type) and the number of bytes the attachment contains. Attachments can be any data. It is easiest to think about at- tachments as files attached to a document. These files can be text, im- ages, Word documents, music, or movie files. Lets make one. Attachments get their own URL where you can upload data. Say we want to add the album artwork to the 6e1295ed6c29495e54cc05947f18c8af document (There is Nothing Left to Lose), and lets also say the artwork is in a file artwork.jpg in the current directory: curl -vX PUT http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af/artwork.jpg?rev=2-2739352689 \ --data-binary @artwork.jpg -H "Content-Type:image/jpg" NOTE: The --data-binary @ option tells curl to read a files contents into the HTTP request body. Were using the -H option to tell CouchDB that were uploading a JPEG file. CouchDB will keep this information around and will send the appropriate header when requesting this at- tachment; in case of an image like this, a browser will render the image instead of offering you the data for download. This will come in handy later. Note that you need to provide the current revision number of the document youre attaching the artwork to, just as if you would update the document. Because, after all, attaching some data is changing the document. 
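If you would rather stay in the terminal than open a browser, the attachment can be fetched back with a plain GET as well. A small sketch; the -o option simply writes the response body to a local file:

   curl -o artwork-copy.jpg \
        http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af/artwork.jpg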
You should now see your artwork image if you point your browser to - http://127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af/art- work.jpg If you request the document again, youll see a new member: curl http://admin:password@127.0.0.1:5984/albums/6e1295ed6c29495e54cc05947f18c8af CouchDB replies: { "_id": "6e1295ed6c29495e54cc05947f18c8af", "_rev": "3-131533518", "title": "There is Nothing Left to Lose", "artist": "Foo Fighters", "year": "1997", "_attachments": { "artwork.jpg": { "stub": true, "content_type": "image/jpg", "length": 52450 } } } _attachments is a list of keys and values where the values are JSON ob- jects containing the attachment metadata. stub=true tells us that this entry is just the metadata. If we use the ?attachments=true HTTP option when requesting this document, wed get a Base64 encoded string contain- ing the attachment data. Well have a look at more document request options later as we explore more features of CouchDB, such as replication, which is the next topic. Replication CouchDB replication is a mechanism to synchronize databases. Much like rsync synchronizes two directories locally or over a network, replica- tion synchronizes two databases locally or remotely. In a simple POST request, you tell CouchDB the source and the target of a replication and CouchDB will figure out which documents and new docu- ment revisions are on source that are not yet on target, and will pro- ceed to move the missing documents and revisions over. Well take an in-depth look at replication in the document Introduction to Replication; in this document, well just show you how to use it. First, well create a target database. Note that CouchDB wont automati- cally create a target database for you, and will return a replication failure if the target doesnt exist (likewise for the source, but that mistake isnt as easy to make): curl -X PUT http://admin:password@127.0.0.1:5984/albums-replica Now we can use the database albums-replica as a replication target: curl -X POST http://admin:password@127.0.0.1:5984/_replicate \ -d '{"source":"http://admin:password@127.0.0.1:5984/albums","target":"http://admin:password@127.0.0.1:5984/albums-replica"}' \ -H "Content-Type: application/json" NOTE: As of CouchDB 2.0.0, fully qualified URLs are required for both the replication source and target parameters. NOTE: CouchDB supports the option "create_target":true placed in the JSON POSTed to the _replicate URL. It implicitly creates the target data- base if it doesnt exist. CouchDB replies (this time we formatted the output so you can read it more easily): { "ok": true, "session_id": "30bb4ac013ca69369c0f32be78864d6e", "source_last_seq": "2-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYExlzgQLsBiaphqYpSZjKcRqRxwIkGRqA1H8Uk4wszJIskg0wdWUBAFHwJD4", "replication_id_version": 4, "history": [ { "session_id": "30bb4ac013ca69369c0f32be78864d6e", "start_time": "Sun, 05 Mar 2023 20:30:26 GMT", "end_time": "Sun, 05 Mar 2023 20:30:29 GMT", "start_last_seq": 0, "end_last_seq": "2-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYExlzgQLsBiaphqYpSZjKcRqRxwIkGRqA1H8Uk4wszJIskg0wdWUBAFHwJD4", "recorded_seq": "2-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYExlzgQLsBiaphqYpSZjKcRqRxwIkGRqA1H8Uk4wszJIskg0wdWUBAFHwJD4", "missing_checked": 2, "missing_found": 2, "docs_read": 2, "docs_written": 2, "doc_write_failures": 0, "bulk_get_docs": 2, "bulk_get_attempts": 2 } ] } CouchDB maintains a session history of replications. 
The response for a replication request contains the history entry for this replication session. It is also worth noting that the request for replication will stay open until replication closes. If you have a lot of documents, itll take a while until they are all replicated and you wont get back the replication response until all documents are replicated. It is im- portant to note that replication replicates the database only as it was at the point in time when replication was started. So, any additions, modifications, or deletions subsequent to the start of replication will not be replicated. Well punt on the details again the "ok": true at the beginning tells us all went well. If you now have a look at the albums-replica data- base, you should see all the documents that you created in the albums database. Neat, eh? What you just did is called local replication in CouchDB terms. You created a local copy of a database. This is useful for backups or to keep snapshots of a specific state of your data around for later. You might want to do this if you are developing your applications but want to be able to roll back to a stable version of your code and data. There are more types of replication useful in other situations. The source and target members of our replication request are actually links (like in HTML) and so far weve seen links relative to the server were working on (hence local). You can also specify a remote database as the target: curl -X POST http://admin:password@127.0.0.1:5984/_replicate \ -d '{"source":"http://admin:password@127.0.0.1:5984/albums","target":"http://user:password@example.org:5984/albums-replica"}' \ -H "Content-Type:application/json" Using a local source and a remote target database is called push repli- cation. Were pushing changes to a remote server. NOTE: Since we dont have a second CouchDB server around just yet, well just use the absolute address of our single server, but you should be able to infer from this that you can put any remote server in there. This is great for sharing local changes with remote servers or buddies next door. You can also use a remote source and a local target to do a pull repli- cation. This is great for getting the latest changes from a server that is used by others: curl -X POST http://admin:password@127.0.0.1:5984/_replicate \ -d '{"source":"http://user:password@example.org:5984/albums-replica","target":"http://admin:password@127.0.0.1:5984/albums"}' \ -H "Content-Type:application/json" Finally, you can run remote replication, which is mostly useful for management operations: curl -X POST http://admin:password@127.0.0.1:5984/_replicate \ -d '{"source":"http://user:password@example.org:5984/albums","target":"http://user:password@example.org:5984/albums-replica"}' \ -H "Content-Type: application/json" NOTE: CouchDB and REST CouchDB prides itself on having a RESTful API, but these replication requests dont look very RESTy to the trained eye. Whats up with that? While CouchDBs core database, document, and attachment API are RESTful, not all of CouchDBs API is. The replication API is one example. There are more, as well see later in the documents. Why are there RESTful and non-RESTful APIs mixed up here? Have the developers been too lazy to go REST all the way? Remember, REST is an architectural style that lends itself to certain architectures (such as the CouchDB document API). But it is not a one-size-fits-all. Triggering an event like replication does not make a whole lot of sense in the REST world. 
It is more like a tra- ditional remote procedure call. And there is nothing wrong with this. We very much believe in the use the right tool for the job philoso- phy, and REST does not fit every job. For support, we refer to Leonard Richardson and Sam Ruby who wrote RESTful Web Services (OR- eilly), as they share our view. Wrapping Up This is still not the full CouchDB API, but we discussed the essentials in great detail. Were going to fill in the blanks as we go. For now, we believe youre ready to start building CouchDB applications. SEE ALSO: Complete HTTP API Reference: • Server API Reference • Database API Reference • Document API Reference • Replication API REPLICATION Replication is an incremental one way process involving two databases (a source and a destination). The aim of replication is that at the end of the process, all active documents in the source database are also in the destination database and all documents that were deleted in the source database are also deleted in the destination database (if they even existed). The replication process only copies the last revision of a document, so all previous revisions that were only in the source database are not copied to the destination database. Introduction to Replication One of CouchDBs strengths is the ability to synchronize two copies of the same database. This enables users to distribute data across several nodes or data centers, but also to move data more closely to clients. Replication involves a source and a destination database, which can be on the same or on different CouchDB instances. The aim of replication is that at the end of the process, all active documents in the source database are also in the destination database and all documents that were deleted in the source database are also deleted in the destination database (if they even existed). Transient and Persistent Replication There are two different ways to set up a replication. The first one that was introduced into CouchDB leads to a replication that could be called transient. Transient means that there are no documents backing up the replication. So after a restart of the CouchDB server the repli- cation will disappear. Later, the _replicator database was introduced, which keeps documents containing your replication parameters. Such a replication can be called persistent. Transient replications were kept for backward compatibility. Both replications can have different replication states. Triggering, Stopping and Monitoring Replications A persistent replication is controlled through a document in the _replicator database, where each document describes one replication process (see Replication Settings). For setting up a transient replica- tion the api endpoint /_replicate can be used. A replication is trig- gered by sending a JSON object either to the _replicate endpoint or storing it as a document into the _replicator database. If a replication is currently running its status can be inspected through the active tasks API (see /_active_tasks, Replication Status and /_scheduler/jobs). For document based-replications, /_scheduler/docs can be used to get a complete state summary. This API is preferred as it will show the state of the replication document before it becomes a replication job. For transient replications there is no way to query their state when the job is finished. A replication can be stopped by deleting the document, or by updating it with its cancel property set to true. 
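The same idea applies to a transient replication started through the /_replicate endpoint: you re-POST the body you used to start it, with a cancel field set to true. A sketch, assuming a continuous replication between albums and albums-replica had been started with exactly this source and target:

   curl -X POST http://admin:password@127.0.0.1:5984/_replicate \
        -H "Content-Type: application/json" \
        -d '{"source": "http://admin:password@127.0.0.1:5984/albums",
             "target": "http://admin:password@127.0.0.1:5984/albums-replica",
             "continuous": true, "cancel": true}'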
Replication Procedure During replication, CouchDB will compare the source and the destination database to determine which documents differ between the source and the destination database. It does so by following the Changes Feeds on the source and comparing the documents to the destination. Changes are sub- mitted to the destination in batches where they can introduce con- flicts. Documents that already exist on the destination in the same re- vision are not transferred. As the deletion of documents is represented by a new revision, a document deleted on the source will also be deleted on the target. A replication task will finish once it reaches the end of the changes feed. If its continuous property is set to true, it will wait for new changes to appear until the task is canceled. Replication tasks also create checkpoint documents on the destination to ensure that a restarted task can continue from where it stopped, for example after it has crashed. When a replication task is initiated on the sending node, it is called push replication, if it is initiated by the receiving node, it is called pull replication. Master - Master replication One replication task will only transfer changes in one direction. To achieve master-master replication, it is possible to set up two repli- cation tasks in opposite direction. When a change is replicated from database A to B by the first task, the second task from B to A will discover that the new change on B already exists in A and will wait for further changes. Controlling which Documents to Replicate There are three options for controlling which documents are replicated, and which are skipped: 1. Defining documents as being local. 2. Using Selector Objects. 3. Using Filter Functions. Local documents are never replicated (see Local (non-replicating) Docu- ments). Selector Objects can be included in a replication document (see Replication Settings). A selector object contains a query expression that is used to test whether a document should be replicated. Filter Functions can be used in a replication (see Replication Set- tings). The replication task evaluates the filter function for each document in the changes feed. The document is only replicated if the filter returns true. NOTE: Using a selector provides performance benefits when compared with using a Filter Functions. You should use Selector Objects where pos- sible. NOTE: When using replication filters that depend on the documents content, deleted documents may pose a problem, since the document passed to the filter will not contain any of the documents content. This can be resolved by adding a _deleted:true field to the document instead of using the DELETE HTTP method, paired with the use of a validate document update handler to ensure the fields required for replica- tion filters are always present. Take note, though, that the deleted document will still contain all of its data (including attachments)! Migrating Data to Clients Replication can be especially useful for bringing data closer to clients. PouchDB implements the replication algorithm of CouchDB in JavaScript, making it possible to make data from a CouchDB database available in an offline browser application, and synchronize changes back to CouchDB. Replicator Database Changed in version 2.1.0: Scheduling replicator was introduced. Repli- cation states, by default are not written back to documents anymore. There are new replication job states and new API endpoints _sched- uler/jobs and _scheduler/docs. 
Changed in version 3.2.0: Fair share scheduling was introduced. Multi- ple _replicator databases get an equal chance (configurable) of running their jobs. Previously replication jobs were scheduled without any re- gard of their originating database. Changed in version 3.3.0: winning_revs_only: true replicator option to replicate the winning document revisions. The _replicator database works like any other in CouchDB, but documents added to it will trigger replications. Create (PUT or POST) a document to start replication. DELETE a replication document to cancel an ongo- ing replication. These documents have exactly the same content as the JSON objects we used to POST to _replicate (fields source, target, create_target, cre- ate_target_params, continuous, doc_ids, filter, query_params, use_checkpoints, checkpoint_interval). Replication documents can have a user defined _id (handy for finding a specific replication request later). Design Documents (and _local docu- ments) added to the replicator database are ignored. The default replicator database is _replicator. Additional replicator databases can be created. To be recognized as such by the system, their database names should end with /_replicator. Basics Lets say you POST the following document into _replicator: { "_id": "my_rep", "source": "http://user:password@myserver.com/foo", "target": { "url": "http://localhost:5984/bar", "auth": { "basic": { "username": "adm", "password": "pass" } } }, "create_target": true, "continuous": true } In the couch log youll see 2 entries like these: [notice] 2017-04-05T17:16:19.646716Z node1@127.0.0.1 <0.29432.0> -------- Replication `"a81a78e822837e66df423d54279c15fe+continuous+create_target"` is using: 4 worker processes a worker batch size of 500 20 HTTP connections a connection timeout of 30000 milliseconds 10 retries per request socket options are: [{keepalive,true},{nodelay,false}] [notice] 2017-04-05T17:16:19.646759Z node1@127.0.0.1 <0.29432.0> -------- Document `my_rep` triggered replication `a81a78e822837e66df423d54279c15fe+continuous+create_target` Replication state of this document can then be queried from http://adm:pass@localhost:5984/_scheduler/docs/_replicator/my_rep { "database": "_replicator", "doc_id": "my_rep", "error_count": 0, "id": "a81a78e822837e66df423d54279c15fe+continuous+create_target", "info": { "revisions_checked": 113, "missing_revisions_found": 113, "docs_read": 113, "docs_written": 113, "changes_pending": 0, "doc_write_failures": 0, "checkpointed_source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "through_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ" }, "last_updated": "2017-04-05T19:18:15Z", "node": "node1@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/foo/", "start_time": "2017-04-05T19:18:15Z", "state": "running", "target": "http://localhost:5984/bar/" } The state is running. That means replicator has scheduled this replica- tion job to run. Replication document contents stay the same. Previ- ously, before version 2.1, it was updated with the triggered state. 
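For reference, the state document shown above can be fetched from the command line with a plain GET against the same endpoint, using the admin credentials from the example:

   curl http://adm:pass@localhost:5984/_scheduler/docs/_replicator/my_rep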
The replication job will also appear in http://adm:pass@localhost:5984/_scheduler/jobs { "jobs": [ { "database": "_replicator", "doc_id": "my_rep", "history": [ { "timestamp": "2017-04-05T19:18:15Z", "type": "started" }, { "timestamp": "2017-04-05T19:18:15Z", "type": "added" } ], "id": "a81a78e822837e66df423d54279c15fe+continuous+create_target", "info": { "changes_pending": 0, "checkpointed_source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "doc_write_failures": 0, "docs_read": 113, "docs_written": 113, "missing_revisions_found": 113, "revisions_checked": 113, "source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "through_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ" }, "node": "node1@127.0.0.1", "pid": "<0.1174.0>", "source": "http://myserver.com/foo/", "start_time": "2017-04-05T19:18:15Z", "target": "http://localhost:5984/bar/", "user": null } ], "offset": 0, "total_rows": 1 } _scheduler/jobs shows more information, such as a detailed history of state changes. If a persistent replication has not yet started, has failed, or is completed, information about its state can only be found in _scheduler/docs. Keep in mind that some replication documents could be invalid and could not become a replication job. Others might be de- layed because they are fetching data from a slow source database. If there is an error, for example if the source database is missing, the replication job will crash and retry after a wait period. Each suc- cessive crash will result in a longer waiting period. For example, POST-ing this document { "_id": "my_rep_crashing", "source": "http://user:password@myserver.com/missing", "target": { "url": "http://localhost:5984/bar", "auth": { "basic": { "username": "adm", "password": "pass" } } }, "create_target": true, "continuous": true } when source database is missing, will result in periodic starts and crashes with an increasingly larger interval. The history list from _scheduler/jobs for this replication would look something like this: [ { "reason": "db_not_found: could not open http://adm:*****@localhost:5984/missing/", "timestamp": "2017-04-05T20:55:10Z", "type": "crashed" }, { "timestamp": "2017-04-05T20:55:10Z", "type": "started" }, { "reason": "db_not_found: could not open http://adm:*****@localhost:5984/missing/", "timestamp": "2017-04-05T20:47:10Z", "type": "crashed" }, { "timestamp": "2017-04-05T20:47:10Z", "type": "started" } ] _scheduler/docs shows a shorter summary: { "database": "_replicator", "doc_id": "my_rep_crashing", "error_count": 6, "id": "cb78391640ed34e9578e638d9bb00e44+create_target", "info": { "error": "db_not_found: could not open http://myserver.com/missing/" }, "last_updated": "2017-04-05T20:55:10Z", "node": "node1@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/missing/", "start_time": "2017-04-05T20:38:34Z", "state": "crashing", "target": "http://localhost:5984/bar/" } Repeated crashes are described as a crashing state. -ing suffix implies this is a temporary state. User at any moment could create the missing database and then replication job could return back to the normal. 
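As a sketch of that recovery path (assuming you administer the source server from the example), creating the missing database is enough; on its next retry the job leaves the crashing state and starts replicating:

   curl -X PUT http://user:password@myserver.com/missing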
Documents describing the same replication

Let's suppose two documents are added to the _replicator database in the following order:

   {
       "_id": "my_rep",
       "source": "http://user:password@myserver.com/foo",
       "target": "http://adm:pass@localhost:5984/bar",
       "create_target": true,
       "continuous": true
   }

and

   {
       "_id": "my_rep_dup",
       "source": "http://user:password@myserver.com/foo",
       "target": "http://adm:pass@localhost:5984/bar",
       "create_target": true,
       "continuous": true
   }

Both describe exactly the same replication (only their _ids differ). In this case document my_rep triggers the replication, while my_rep_dup will fail. Inspecting _scheduler/docs explains exactly why it failed:

   {
       "database": "_replicator",
       "doc_id": "my_rep_dup",
       "error_count": 1,
       "id": null,
       "info": {
           "error": "Replication `a81a78e822837e66df423d54279c15fe+continuous+create_target` specified by document `my_rep_dup` already started, triggered by document `my_rep` from db `_replicator`"
       },
       "last_updated": "2017-04-05T21:41:51Z",
       "source": "http://myserver.com/foo/",
       "start_time": "2017-04-05T21:41:51Z",
       "state": "failed",
       "target": "http://user:****@localhost:5984/bar"
   }

Notice that the state for this replication is failed. Unlike crashing, the failed state is terminal. As long as both documents are present, the replicator will not retry the my_rep_dup replication. Another reason for failure is a malformed document: for example, if the worker process count is specified as a string ("worker_processes": "a few") instead of an integer, the replication will fail.

Replication Scheduler

Once replication jobs are created they are managed by the scheduler. The scheduler is the replication component which periodically stops some jobs and starts others. This behavior makes it possible to have a larger number of jobs than the cluster could run simultaneously.

Replication jobs which keep failing will be penalized and forced to wait. The wait time increases exponentially with each consecutive failure.

When deciding which jobs to stop and which to start, the scheduler uses a round-robin algorithm to ensure fairness. Jobs which have been running the longest time will be stopped, and jobs which have been waiting the longest time will be started.

NOTE:
   Non-continuous (normal) replications are treated differently once they start running. See the Normal vs Continuous Replications section for more information.

The behavior of the scheduler can be configured via the max_jobs, interval and max_churn options. See the Replicator configuration section for additional information.

Replication states

During their life cycle, replication jobs pass through various states. This is a diagram of all the states and transitions between them:

   [Figure: Replication state diagram]

Blue and yellow shapes represent replication job states. Trapezoidal shapes represent external APIs; that's how users interact with the replicator. Writing documents to _replicator is the preferred way of creating replications, but posting to the _replicate HTTP endpoint is also supported.

Six-sided shapes are internal API boundaries. They are optional for this diagram and are only shown as additional information to help clarify how the replicator works. There are two processing stages: the first is where replication documents are parsed and become replication jobs, and the second is the scheduler itself. The scheduler runs replication jobs, periodically stopping and starting some. Jobs posted via the _replicate endpoint bypass the first component and go straight to the scheduler.
States descriptions

Before explaining the details of each state, it is worth noting the color and shape of each state in the diagram:

Blue vs yellow partitions states into healthy and unhealthy, respectively. Unhealthy states indicate something has gone wrong and might need the user's attention.

Rectangle vs oval separates terminal states from non-terminal ones. Terminal states are those which will not transition to other states any more. Informally, jobs in a terminal state will not be retried and don't consume memory or CPU resources.

   • Initializing: Indicates the replicator has noticed the change from the replication document. Jobs should transition quickly through this state. Being stuck here for a while could mean there is an internal error.

   • Failed: The replication document could not be processed and turned into a valid replication job for the scheduler. This state is terminal and requires user intervention to fix the problem. A typical reason for ending up in this state is a malformed document, for example specifying an integer for a parameter which accepts a boolean. Another reason for failure could be specifying a duplicate replication. A duplicate replication is a replication with identical parameters but a different document ID.

   • Error: The replication document update could not be turned into a replication job. Unlike the Failed state, this one is temporary, and the replicator will keep retrying periodically. There is an exponential backoff applied in case of consecutive failures. The main reason this state exists is to handle filtered replications with custom user functions. Filter function content is needed in order to calculate the replication ID, so a replication job cannot be created until the function code is retrieved. Because retrieval happens over the network, temporary failures have to be handled.

   • Running: The replication job is running normally. This means there might be a change feed open, and if changes are noticed, they would be processed and posted to the target. The job is still considered Running even if its workers are currently not streaming changes from source to target and are just waiting on the change feed. Continuous replications will most likely end up in this state.

   • Pending: The replication job is not running and is waiting its turn. This state is reached when the number of replication jobs added to the scheduler exceeds replicator.max_jobs. In that case the scheduler will periodically stop and start subsets of jobs, trying to give each one a fair chance at making progress.

   • Crashing: The replication job has been successfully added to the replication scheduler, but an error was encountered during the last run. The error could be a network failure, a missing source database, a permissions error, etc. Repeated consecutive crashes result in an exponential backoff. This state is considered temporary (non-terminal) and replication jobs will be periodically retried.

   • Completed: This is a terminal, successful state for non-continuous replications. Once in this state the replication is forgotten by the scheduler and it doesn't consume any more CPU or memory resources. Continuous replication jobs will never reach this state.

NOTE:
   The maximum backoff interval for the Error and Crashing states is calculated based on the replicator.max_history option. See the Replicator configuration section for additional information.

Normal vs Continuous Replications

Normal (non-continuous) replications, once started, will be allowed to run to completion.
That behavior is to preserve their semantics of replicating a snapshot of the source database to the target. For example, if new documents are added to the source after the replication is started, those updates should not show up on the target database. Stopping and restarting a normal replication would violate that constraint.

WARNING:
   When there is a mix of continuous and normal replications, once normal replications are scheduled to run, they might temporarily starve continuous replication jobs. However, normal replications will still be stopped and rescheduled if an operator reduces the value for the maximum number of replications. This is so that an operator who decides replications are overwhelming a node has the ability to recover. Any stopped replications will be resubmitted to the queue to be rescheduled.

Compatibility Mode

Previous versions of the CouchDB replicator wrote state updates back to replication documents. In cases where user code programmatically reads those states, a compatibility mode can be enabled via a configuration setting:

   [replicator]
   update_docs = true

In this mode the replicator will continue to write state updates to the documents.

To effectively disable the scheduling behavior, which periodically stops and starts jobs, set the max_jobs configuration setting to a large number. For example:

   [replicator]
   max_jobs = 9999999

See the Replicator configuration section for other replicator configuration options.

Canceling replications

To cancel a replication, simply DELETE the document which triggered the replication. To update a replication, for example to change the number of workers or the source, simply update the document with the new data. If there is extra application-specific data in the replication documents, that data is ignored by the replicator.

Server restart

When CouchDB is restarted, it checks its _replicator databases and restarts replications described by documents if they are not already in a completed or failed state. If they are, they are ignored.

Clustering

In a cluster, replication jobs are balanced evenly among all the nodes such that a replication job runs on only one node at a time.

Every time there is a cluster membership change, that is, when nodes are added or removed, as happens in a rolling reboot, the replicator application will notice the change, rescan all the documents and running replications, and re-evaluate their cluster placement in light of the new set of live nodes. This mechanism also provides replication fail-over in case a node fails. Replication jobs started from replication documents (but not those started from the _replicate HTTP endpoint) will automatically migrate to one of the live nodes.

Additional Replicator Databases

Imagine the replicator database (_replicator) has these two documents which represent pull replications from servers A and B:

   {
       "_id": "rep_from_A",
       "source": "http://user:password@aserver.com:5984/foo",
       "target": {
           "url": "http://localhost:5984/foo_a",
           "auth": {
               "basic": {
                   "username": "adm",
                   "password": "pass"
               }
           }
       },
       "continuous": true
   }

   {
       "_id": "rep_from_B",
       "source": "http://user:password@bserver.com:5984/foo",
       "target": {
           "url": "http://localhost:5984/foo_b",
           "auth": {
               "basic": {
                   "username": "adm",
                   "password": "pass"
               }
           }
       },
       "continuous": true
   }

Now, without stopping and restarting CouchDB, add another replicator database. For example another/_replicator:

   $ curl -X PUT http://adm:pass@localhost:5984/another%2F_replicator/
   {"ok":true}

NOTE:
   A / (%2F) character in a database name, when used in a URL, should be escaped.
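If you want to verify that the new replicator database exists before adding documents to it, a plain GET returns its database information object (this check is an extra step, not part of the original walkthrough):

   $ curl http://adm:pass@localhost:5984/another%2F_replicator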
Then add a replication document to the new replicator database: { "_id": "rep_from_X", "source": "http://user:password@xserver.com:5984/foo", "target": "http://adm:pass@localhost:5984/foo_x", "continuous": true } From now on, there are three replications active in the system: two replications from A and B, and a new one from X. Then remove the additional replicator database: $ curl -X DELETE http://adm:pass@localhost:5984/another%2F_replicator/ {"ok":true} After this operation, replication pulling from server X will be stopped and the replications in the _replicator database (pulling from servers A and B) will continue. Fair Share Job Scheduling When multiple _replicator databases are used, and the total number of jobs on any node is greater than max_jobs, replication jobs will be scheduled such that each of the _replicator databases by default get an equal chance of running their jobs. This is accomplished by assigning a number of shares to each _replica- tor database and then automatically adjusting the proportion of running jobs to match each databases proportion of shares. By default, each _replicator database is assigned 100 shares. It is possible to alter the share assignments for each individual _replicator database in the [replicator.shares] configuration section. The fair share behavior is perhaps easier described with a set of exam- ples. Each example assumes the default of max_jobs = 500, and two replicator databases: _replicator and another/_replicator. Example 1: If _replicator has 1000 jobs and another/_replicator has 10, the scheduler will run about 490 jobs from _replicator and 10 jobs from another/_replicator. Example 2: If _replicator has 200 jobs and another/_replicator also has 200 jobs, all 400 jobs will get to run as the sum of all the jobs is less than the max_jobs limit. Example 3: If both replicator databases have 1000 jobs each, the sched- uler will run about 250 jobs from each database on average. Example 4: If both replicator databases have 1000 jobs each, but _replicator was assigned 400 shares, then on average the scheduler would run about 400 jobs from _replicator and 100 jobs from _an- other/replicator. The proportions described in the examples are approximate and might os- cillate a bit, and also might take anywhere from tens of minutes to an hour to converge. Replicating the replicator database Imagine you have in server C a replicator database with the two follow- ing pull replication documents in it: { "_id": "rep_from_A", "source": "http://user:password@aserver.com:5984/foo", "target": "http://adm:pass@localhost:5984/foo_a", "continuous": true } { "_id": "rep_from_B", "source": "http://user:password@bserver.com:5984/foo", "target": "http://adm:pass@localhost:5984/foo_b", "continuous": true } Now you would like to have the same pull replications going on in server D, that is, you would like to have server D pull replicating from servers A and B. You have two options: • Explicitly add two documents to servers D replicator database • Replicate servers C replicator database into servers D replicator database Both alternatives accomplish exactly the same goal. Delegations Replication documents can have a custom user_ctx property. This prop- erty defines the user context under which a replication runs. For the old way of triggering a replication (POSTing to /_replicate/), this property is not needed. Thats because information about the authenti- cated user is readily available during the replication, which is not persistent in that case. 
Now, with the replicator database, the problem is that information about which user is starting a particular replication is only present when the replication document is written. The information in the replication document and the replication itself are persistent, however. This implementation detail implies that in the case of a non-admin user, a user_ctx property containing the user's name and a subset of their roles must be defined in the replication document. This is enforced by the document update validation function present in the default design document of the replicator database. The validation function also ensures that non-admin users are unable to set the value of the user context's name property to anything other than their own user name. The same principle applies for roles.

For admins, the user_ctx property is optional, and if it is missing it defaults to a user context with name null and an empty list of roles, which means design documents won't be written to local targets. If writing design documents to local targets is desired, the role _admin must be present in the user context's list of roles.

Also, for admins the user_ctx property can be used to trigger a replication on behalf of another user. This is the user context that will be passed to local target database document validation functions.

NOTE:
   The user_ctx property only has an effect for local endpoints.

Example delegated replication document:

   {
       "_id": "my_rep",
       "source": "http://user:password@bserver.com:5984/foo",
       "target": "http://adm:pass@localhost:5984/bar",
       "continuous": true,
       "user_ctx": {
           "name": "joe",
           "roles": ["erlanger", "researcher"]
       }
   }

As stated before, the user_ctx property is optional for admins, while being mandatory for regular (non-admin) users. When the roles property of user_ctx is missing, it defaults to the empty list [].

Selector Objects

Including a Selector Object in the replication document enables you to use a query expression to determine if a document should be included in the replication.

The selector specifies fields in the document, and provides an expression to evaluate with the field content or other data. If the expression resolves to true, the document is replicated.

The selector object must:

   • Be structured as valid JSON.

   • Contain a valid query expression.

The syntax for a selector is the same as the selector syntax used for _find.

Using a selector is significantly more efficient than using a JavaScript filter function, and is the recommended option if filtering on document attributes only.

Specifying Usernames and Passwords

There are multiple ways to specify usernames and passwords for replication endpoints:

   • In an {"auth": {"basic": ...}} object:

     Added in version 3.2.0.

        {
            "target": {
                "url": "http://someurl.com/mydb",
                "auth": {
                    "basic": {
                        "username": "$username",
                        "password": "$password"
                    }
                }
            },
            ...
        }

     This is the preferred format as it allows including characters like @, : and others in the username and password fields.

   • In the userinfo part of the endpoint URL. This allows for a more compact endpoint representation; however, it prevents using characters like @ and : in usernames or passwords:

        {
            "target": "http://adm:pass@localhost:5984/bar"
            ...
        }

     Specifying credentials in the userinfo part of the URL is deprecated as per RFC 3986. CouchDB still supports this way of specifying credentials and doesn't yet have a target release in which support will be removed.
   • In an "Authorization: Basic $b64encoded_username_and_password" header:

        {
            "target": {
                "url": "http://someurl.com/mydb",
                "headers": {
                    "Authorization": "Basic dXNlcjpwYXNz"
                }
            },
            ...
        }

     This method has the downside of going through the extra step of base64 encoding. In addition, it could give the impression that it encrypts or hides the credentials, and so could encourage inadvertent sharing and leaking of credentials.

When credentials are provided in multiple forms, they are selected in the following order:

   • the "auth": {"basic": {...}} object

   • URL userinfo

   • the "Authorization: Basic ..." header

First, the auth object is checked, and if credentials are defined there, they are used. If they are not, then URL userinfo is checked. If credentials are found there, those credentials are used; otherwise the basic auth header is used.

Replicate Winning Revisions Only

Use the winning_revs_only: true option to replicate winning document revisions only. These are the revisions that would be returned by the GET db/doc API endpoint by default, or appear in the _changes feed with the default parameters.

   POST http://couchdb:5984/_replicate HTTP/1.1
   Accept: application/json
   Content-Type: application/json

   {
       "winning_revs_only": true,
       "source": "http://source:5984/recipes",
       "target": "http://target:5984/recipes"
   }

Replication with this mode discards conflicting revisions, so it could be one way to remove conflicts through replication.

Replication IDs and checkpoint IDs generated by winning_revs_only: true replications will be different from those generated by default, so it is possible to first replicate the winning revisions, and then later backfill the rest of the revisions with a regular replication job.

The winning_revs_only: true option can be combined with filters or other options like continuous: true or create_target: true.

Replication and conflict model

Let's take the following example to illustrate replication and conflict handling.

   • Alice has a document containing Bob's business card;

   • She synchronizes it between her desktop PC and her laptop;

   • On the desktop PC, she updates Bob's E-mail address;

   • Without syncing again, she updates Bob's mobile number on the laptop;

   • Then she replicates the two to each other again.

So on the desktop the document has Bob's new E-mail address and his old mobile number, and on the laptop it has his old E-mail address and his new mobile number.

The question is, what happens to these conflicting updated documents?

CouchDB replication

CouchDB works with JSON documents inside databases. Replication of databases takes place over HTTP, and can be either a pull or a push, but is unidirectional. So the easiest way to perform a full sync is to do a push followed by a pull (or vice versa).

So, Alice creates v1 and syncs it. She updates to v2a on one side and v2b on the other, and then replicates. What happens?

The answer is simple: both versions exist on both sides!
DESKTOP LAPTOP +---------+ | /db/bob | INITIAL | v1 | CREATION +---------+ +---------+ +---------+ | /db/bob | -----------------> | /db/bob | PUSH | v1 | | v1 | +---------+ +---------+ +---------+ +---------+ INDEPENDENT | /db/bob | | /db/bob | LOCAL | v2a | | v2b | EDITS +---------+ +---------+ +---------+ +---------+ | /db/bob | -----------------> | /db/bob | PUSH | v2a | | v2a | +---------+ | v2b | +---------+ +---------+ +---------+ | /db/bob | <----------------- | /db/bob | PULL | v2a | | v2a | | v2b | | v2b | +---------+ +---------+ After all, this is not a file system, so theres no restriction that only one document can exist with the name /db/bob. These are just con- flicting revisions under the same name. Because the changes are always replicated, the data is safe. Both ma- chines have identical copies of both documents, so failure of a hard drive on either side wont lose any of the changes. Another thing to notice is that peers do not have to be configured or tracked. You can do regular replications to peers, or you can do one-off, ad-hoc pushes or pulls. After the replication has taken place, there is no record kept of which peer any particular document or revi- sion came from. So the question now is: what happens when you try to read /db/bob? By default, CouchDB picks one arbitrary revision as the winner, using a deterministic algorithm so that the same choice will be made on all peers. The same happens with views: the deterministically-chosen winner is the only revision fed into your map function. Lets say that the winner is v2a. On the desktop, if Alice reads the document shell see v2a, which is what she saved there. But on the lap- top, after replication, shell also see only v2a. It could look as if the changes she made there have been lost - but of course they have not, they have just been hidden away as a conflicting revision. But eventually shell need these changes merged into Bobs business card, otherwise they will effectively have been lost. Any sensible business-card application will, at minimum, have to present the conflicting versions to Alice and allow her to create a new version incorporating information from them all. Ideally it would merge the updates itself. Conflict avoidance When working on a single node, CouchDB will avoid creating conflicting revisions by returning a 409 Conflict error. This is because, when you PUT a new version of a document, you must give the _rev of the previous version. If that _rev has already been superseded, the update is re- jected with a 409 Conflict response. So imagine two users on the same node are fetching Bobs business card, updating it concurrently, and writing it back: USER1 -----------> GET /db/bob <----------- {"_rev":"1-aaa", ...} USER2 -----------> GET /db/bob <----------- {"_rev":"1-aaa", ...} USER1 -----------> PUT /db/bob?rev=1-aaa <----------- {"_rev":"2-bbb", ...} USER2 -----------> PUT /db/bob?rev=1-aaa <----------- 409 Conflict (not saved) User2s changes are rejected, so its up to the app to fetch /db/bob again, and either: 1. apply the same changes as were applied to the earlier revision, and submit a new PUT 2. redisplay the document so the user has to edit it again 3. just overwrite it with the document being saved before (which is not advisable, as user1s changes will be silently lost) So when working in this mode, your application still has to be able to handle these conflicts and have a suitable retry strategy, but these conflicts never end up inside the database itself. 
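A minimal sketch of the first retry option, continuing the exchange above (the 3-ccc revision returned at the end is made up, just like 1-aaa and 2-bbb):

   USER2 -----------> GET /db/bob
         <----------- {"_rev":"2-bbb", ...}   (now contains USER1's changes)

   USER2 re-applies their edit on top of this version, then:

   USER2 -----------> PUT /db/bob?rev=2-bbb
         <----------- {"_rev":"3-ccc", ...}   (saved, no conflict)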
Revision tree When you update a document in CouchDB, it keeps a list of the previous revisions. In the case where conflicting updates are introduced, this history branches into a tree, where the current conflicting revisions for this document form the tips (leaf nodes) of this tree: ,--> r2a r1 --> r2b `--> r2c Each branch can then extend its history - for example if you read revi- sion r2b and then PUT with ?rev=r2b then you will make a new revision along that particular branch. ,--> r2a -> r3a -> r4a r1 --> r2b -> r3b `--> r2c -> r3c Here, (r4a, r3b, r3c) are the set of conflicting revisions. The way you resolve a conflict is to delete the leaf nodes along the other branches. So when you combine (r4a+r3b+r3c) into a single merged docu- ment, you would replace r4a and delete r3b and r3c. ,--> r2a -> r3a -> r4a -> r5a r1 --> r2b -> r3b -> (r4b deleted) `--> r2c -> r3c -> (r4c deleted) Note that r4b and r4c still exist as leaf nodes in the history tree, but as deleted docs. You can retrieve them but they will be marked "_deleted":true. When you compact a database, the bodies of all the non-leaf documents are discarded. However, the list of historical _revs is retained, for the benefit of later conflict resolution in case you meet any old replicas of the database at some time in future. There is revision pruning to stop this getting arbitrarily large. Working with conflicting documents The basic GET /{db}/{docid} operation will not show you any information about conflicts. You see only the deterministically-chosen winner, and get no indication as to whether other conflicting revisions exist or not: { "_id":"test", "_rev":"2-b91bb807b4685080c6a651115ff558f5", "hello":"bar" } If you do GET /db/test?conflicts=true, and the document is in a con- flict state, then you will get the winner plus a _conflicts member con- taining an array of the revs of the other, conflicting revision(s). You can then fetch them individually using subsequent GET /db/test?rev=xxxx operations: { "_id":"test", "_rev":"2-b91bb807b4685080c6a651115ff558f5", "hello":"bar", "_conflicts":[ "2-65db2a11b5172bf928e3bcf59f728970", "2-5bc3c6319edf62d4c624277fdd0ae191" ] } If you do GET /db/test?open_revs=all then you will get all the leaf nodes of the revision tree. This will give you all the current con- flicts, but will also give you leaf nodes which have been deleted (i.e. parts of the conflict history which have since been resolved). You can remove these by filtering out documents with "_deleted":true: [ {"ok":{"_id":"test","_rev":"2-5bc3c6319edf62d4c624277fdd0ae191","hello":"foo"}}, {"ok":{"_id":"test","_rev":"2-65db2a11b5172bf928e3bcf59f728970","hello":"baz"}}, {"ok":{"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"bar"}} ] The "ok" tag is an artifact of open_revs, which also lets you list ex- plicit revisions as a JSON array, e.g. open_revs=[rev1,rev2,rev3]. In this form, it would be possible to request a revision which is now missing, because the database has been compacted. NOTE: The order of revisions returned by open_revs=all is NOT related to the deterministic winning algorithm. In the above example, the win- ning revision is 2-b91b and happens to be returned last, but in other cases it can be returned in a different position. Once you have retrieved all the conflicting revisions, your application can then choose to display them all to the user. Or it could attempt to merge them, write back the merged version, and delete the conflicting versions - that is, to resolve the conflict permanently. 
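For the example document above, that resolution step could look like the following single request. This is a sketch only: the merged "hello" value is invented, and the database is assumed to live on the same local server used elsewhere in this section:

   curl -X POST http://adm:pass@127.0.0.1:5984/db/_bulk_docs \
        -H "Content-Type: application/json" \
        -d '{"docs": [
              {"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"merged value"},
              {"_id":"test","_rev":"2-65db2a11b5172bf928e3bcf59f728970","_deleted":true},
              {"_id":"test","_rev":"2-5bc3c6319edf62d4c624277fdd0ae191","_deleted":true}
            ]}'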
As described above, you need to update one revision and delete all the conflicting revisions explicitly. This can be done using a single POST to _bulk_docs, setting "_deleted":true on those revisions you wish to delete. Multiple document API Finding conflicted documents with Mango Added in version 2.2.0. CouchDBs Mango system allows easy querying of documents with conflicts, returning the full body of each document as well. Heres how to use it to find all conflicts in a database: $ curl -X POST http://adm:pass@127.0.0.1:5984/dbname/_find \ -d '{"selector": {"_conflicts": { "$exists": true}}, "conflicts": true}' \ -Hcontent-type:application/json {"docs": [ {"_id":"doc","_rev":"1-3975759ccff3842adf690a5c10caee42","a":2,"_conflicts":["1-23202479633c2b380f79507a776743d5"]} ], "bookmark": "g1AAAABheJzLYWBgYMpgSmHgKy5JLCrJTq2MT8lPzkzJBYozA1kgKQ6YVA5QkBFMgKSVDHWNjI0MjEzMLc2MjZONkowtDNLMLU0NzBPNzc3MTYxTTLOysgCY2ReV"} The bookmark value can be used to navigate through additional pages of results if necessary. Mango by default only returns 25 results per re- quest. If you expect to run this query often, be sure to create a Mango sec- ondary index to speed the query: $ curl -X POST http://adm:pass@127.0.0.1:5984/dbname/_index \ -d '{"index":{"fields": ["_conflicts"]}}' \ -Hcontent-type:application/json Of course, the selector can be enhanced to filter documents on addi- tional keys in the document. Be sure to add those keys to your sec- ondary index as well, or a full database scan will be triggered. Finding conflicted documents using the _all_docs index You can fetch multiple documents at once using include_docs=true on a view. However, a conflicts=true request is ignored; the doc part of the value never includes a _conflicts member. Hence you would need to do another query to determine for each document whether it is in a con- flicting state: $ curl 'http://adm:pass@127.0.0.1:5984/conflict_test/_all_docs?include_docs=true&conflicts=true' { "total_rows":1, "offset":0, "rows":[ { "id":"test", "key":"test", "value":{"rev":"2-b91bb807b4685080c6a651115ff558f5"}, "doc":{ "_id":"test", "_rev":"2-b91bb807b4685080c6a651115ff558f5", "hello":"bar" } } ] } $ curl 'http://adm:pass@127.0.0.1:5984/conflict_test/test?conflicts=true' { "_id":"test", "_rev":"2-b91bb807b4685080c6a651115ff558f5", "hello":"bar", "_conflicts":[ "2-65db2a11b5172bf928e3bcf59f728970", "2-5bc3c6319edf62d4c624277fdd0ae191" ] } View map functions Views only get the winning revision of a document. However they do also get a _conflicts member if there are any conflicting revisions. This means you can write a view whose job is specifically to locate docu- ments with conflicts. Here is a simple map function which achieves this: function(doc) { if (doc._conflicts) { emit(null, [doc._rev].concat(doc._conflicts)); } } which gives the following output: { "total_rows":1, "offset":0, "rows":[ { "id":"test", "key":null, "value":[ "2-b91bb807b4685080c6a651115ff558f5", "2-65db2a11b5172bf928e3bcf59f728970", "2-5bc3c6319edf62d4c624277fdd0ae191" ] } ] } If you do this, you can have a separate sweep process which periodi- cally scans your database, looks for documents which have conflicts, fetches the conflicting revisions, and resolves them. Whilst this keeps the main application simple, the problem with this approach is that there will be a window between a conflict being intro- duced and it being resolved. 
From a user's viewpoint, it may appear that the document they just saved successfully suddenly loses their changes, only to have them resurrected some time later. This may or may not be acceptable.

Also, it's easy to forget to start the sweeper, or not to implement it properly, and this will introduce odd behaviour which will be hard to track down.

CouchDB's winning revision algorithm may mean that information drops out of a view until a conflict has been resolved. Consider Bob's business card again; suppose Alice has a view which emits mobile numbers, so that her telephony application can display the caller's name based on caller ID. If there are conflicting documents with Bob's old and new mobile numbers, and they happen to be resolved in favour of Bob's old number, then the view won't be able to recognise his new one. In this particular case, the application might have preferred to put information from both of the conflicting documents into the view, but this currently isn't possible.

Suggested algorithm to fetch a document with conflict resolution:

   1. Get the document via a GET docid?conflicts=true request.

   2. For each member in the _conflicts array, call GET docid?rev=xxx. If any errors occur at this stage, restart from step 1. (There could be a race where someone else has already resolved this conflict and deleted that rev.)

   3. Perform application-specific merging.

   4. Write to _bulk_docs with an update to the first rev and deletes of the other revs.

This could either be done on every read (in which case you could replace all calls to GET in your application with calls to a library which does the above), or as part of your sweeper code.

And here is an example of this in Ruby using the low-level RestClient:

   require "rubygems"
   require "rest_client"
   require "json"

   DB = "http://adm:pass@127.0.0.1:5984/db"

   # Write multiple documents
   def writem(docs, new_edits)
     JSON.parse(
       RestClient.post(
         "#{DB}/_bulk_docs",
         {:docs => docs, :new_edits => new_edits}.to_json,
         {content_type: :json, accept: :json}
       )
     )
   end

   # Write one document, return the rev
   def write1(doc, id = nil, rev = nil)
     doc["_id"] = id if id
     doc["_rev"] = rev if rev
     if rev
       writem([doc], false)
     else
       writem([doc], true).first["rev"]
     end
   end

   # Read a document, return *all* revs
   def read1(id)
     retries = 0
     loop do
       # FIXME: escape id
       res = [JSON.parse(RestClient.get("#{DB}/#{id}?conflicts=true"))]
       if revs = res.first.delete("_conflicts")
         begin
           revs.each do |rev|
             res << JSON.parse(RestClient.get("#{DB}/#{id}?rev=#{rev}"))
           end
         rescue
           retries += 1
           raise if retries >= 5
           next
         end
       end
       return res
     end
   end

   # Create DB
   RestClient.delete(DB) rescue nil
   RestClient.put(DB, {}.to_json)

   # Write a document
   rev1 = write1({"hello" => "xxx"}, "test")
   p(read1("test"))

   # Make three conflicting versions
   (1..3).each do |num|
     write1({"hello" => "foo"}, "test", rev1 + num.to_s)
     write1({"hello" => "bar"}, "test", rev1 + num.to_s)
     write1({"hello" => "baz"}, "test", rev1 + num.to_s)
   end

   res = read1("test")
   p(res)

   # Now let's replace these three with one
   res.first["hello"] = "foo+bar+baz"
   res.each_with_index do |r, i|
     unless i == 0
       r.replace({"_id" => r["_id"], "_rev" => r["_rev"], "_deleted" => true})
     end
   end
   writem(res, true)
   p(read1("test"))

An application written this way never has to deal with a PUT 409, and is automatically multi-master capable.

You can see that it's straightforward enough when you know what you're doing.
Its just that CouchDB doesnt currently provide a convenient HTTP API for fetch all conflicting revisions, nor PUT to supersede these N revisions, so you need to wrap these yourself. At the time of writing, there are no known client-side libraries which provide support for this. Merging and revision history Actually performing the merge is an application-specific function. It depends on the structure of your data. Sometimes it will be easy: e.g. if a document contains a list which is only ever appended to, then you can perform a union of the two list versions. Some merge strategies look at the changes made to an object, compared to its previous version. This is how Gits merge function works. For example, to merge Bobs business card versions v2a and v2b, you could look at the differences between v1 and v2b, and then apply these changes to v2a as well. With CouchDB, you can sometimes get hold of old revisions of a docu- ment. For example, if you fetch /db/bob?rev=v2b&revs_info=true youll get a list of the previous revision ids which ended up with revision v2b. Doing the same for v2a you can find their common ancestor revi- sion. However if the database has been compacted, the content of that document revision will have been lost. revs_info will still show that v1 was an ancestor, but report it as missing: BEFORE COMPACTION AFTER COMPACTION ,-> v2a v2a v1 `-> v2b v2b So if you want to work with diffs, the recommended way is to store those diffs within the new revision itself. That is: when you replace v1 with v2a, include an extra field or attachment in v2a which says which fields were changed from v1 to v2a. This unfortunately does mean additional book-keeping for your application. Comparison with other replicating data stores The same issues arise with other replicating systems, so it can be in- structive to look at these and see how they compare with CouchDB. Please feel free to add other examples. Unison Unison is a bi-directional file synchronisation tool. In this case, the business card would be a file, say bob.vcf. When you run unison, changes propagate both ways. If a file has changed on one side but not the other, the new replaces the old. Unison main- tains a local state file so that it knows whether a file has changed since the last successful replication. In our example it has changed on both sides. Only one file called bob.vcf can exist within the file system. Unison solves the problem by simply ducking out: the user can choose to replace the remote version with the local version, or vice versa (both of which would lose data), but the default action is to leave both sides unchanged. From Alices point of view, at least this is a simple solution. Whenever shes on the desktop shell see the version she last edited on the desk- top, and whenever shes on the laptop shell see the version she last edited there. But because no replication has actually taken place, the data is not protected. If her laptop hard drive dies, shell lose all her changes made on the laptop; ditto if her desktop hard drive dies. Its up to her to copy across one of the versions manually (under a dif- ferent filename), merge the two, and then finally push the merged ver- sion to the other side. Note also that the original file (version v1) has been lost at this point. So its not going to be known from inspection alone whether v2a or v2b has the most up-to-date E-mail address for Bob, or which version has the most up-to-date mobile number. Alice has to remember which one she entered last. 
Git Git is a well-known distributed source control system. Like Unison, Git deals with files. However, Git considers the state of a whole set of files as a single object, the tree. Whenever you save an update, you create a commit which points to both the updated tree and the previous commit(s), which in turn point to the previous tree(s). You therefore have a full history of all the states of the files. This history forms a branch, and a pointer is kept to the tip of the branch, from which you can work backwards to any previous state. The pointer is an SHA1 hash of the tip commit. If you are replicating with one or more peers, a separate branch is made for each of those peers. For example, you might have: main -- my local branch remotes/foo/main -- branch on peer 'foo' remotes/bar/main -- branch on peer 'bar' In the regular workflow, replication is a pull, importing changes from a remote peer into the local repository. A pull does two things: first fetch the state of the peer into the remote tracking branch for that peer; and then attempt to merge those changes into the local branch. Now lets consider the business card. Alice has created a Git repo con- taining bob.vcf, and cloned it across to the other machine. The branches look like this, where AAAAAAAA is the SHA1 of the commit: ---------- desktop ---------- ---------- laptop ---------- main: AAAAAAAA main: AAAAAAAA remotes/laptop/main: AAAAAAAA remotes/desktop/main: AAAAAAAA Now she makes a change on the desktop, and commits it into the desktop repo; then she makes a different change on the laptop, and commits it into the laptop repo: ---------- desktop ---------- ---------- laptop ---------- main: BBBBBBBB main: CCCCCCCC remotes/laptop/main: AAAAAAAA remotes/desktop/main: AAAAAAAA Now on the desktop she does git pull laptop. First, the remote objects are copied across into the local repo and the remote tracking branch is updated: ---------- desktop ---------- ---------- laptop ---------- main: BBBBBBBB main: CCCCCCCC remotes/laptop/main: CCCCCCCC remotes/desktop/main: AAAAAAAA NOTE: The repo still contains AAAAAAAA because commits BBBBBBBB and CCCCC- CCC point to it. Then Git will attempt to merge the changes in. Knowing that the parent commit to CCCCCCCC is AAAAAAAA, it takes a diff between AAAAAAAA and CCCCCCCC and tries to apply it to BBBBBBBB. If this is successful, then youll get a new version with a merge com- mit: ---------- desktop ---------- ---------- laptop ---------- main: DDDDDDDD main: CCCCCCCC remotes/laptop/main: CCCCCCCC remotes/desktop/main: AAAAAAAA Then Alice has to logon to the laptop and run git pull desktop. A simi- lar process occurs. The remote tracking branch is updated: ---------- desktop ---------- ---------- laptop ---------- main: DDDDDDDD main: CCCCCCCC remotes/laptop/main: CCCCCCCC remotes/desktop/main: DDDDDDDD Then a merge takes place. This is a special case: CCCCCCCC is one of the parent commits of DDDDDDDD, so the laptop can fast forward update from CCCCCCCC to DDDDDDDD directly without having to do any complex merging. This leaves the final state as: ---------- desktop ---------- ---------- laptop ---------- main: DDDDDDDD main: DDDDDDDD remotes/laptop/main: CCCCCCCC remotes/desktop/main: DDDDDDDD Now this is all and good, but you may wonder how this is relevant when thinking about CouchDB. First, note what happens in the case when the merge algorithm fails. The changes are still propagated from the remote repo into the local one, and are available in the remote tracking branch. 
So, unlike Uni- son, you know the data is protected. Its just that the local working copy may fail to update, or may diverge from the remote version. Its up to you to create and commit the combined version yourself, but you are guaranteed to have all the history you might need to do this. Note that while it is possible to build new merge algorithms into Git, the standard ones are focused on line-based changes to source code. They dont work well for XML or JSON if its presented without any line breaks. The other interesting consideration is multiple peers. In this case you have multiple remote tracking branches, some of which may match your local branch, some of which may be behind you, and some of which may be ahead of you (i.e. contain changes that you havent yet merged): main: AAAAAAAA remotes/foo/main: BBBBBBBB remotes/bar/main: CCCCCCCC remotes/baz/main: AAAAAAAA Note that each peer is explicitly tracked, and therefore has to be ex- plicitly created. If a peer becomes stale or is no longer needed, its up to you to remove it from your configuration and delete the remote tracking branch. This is different from CouchDB, which doesnt keep any peer state in the database. Another difference between CouchDB and Git is that it maintains all history back to time zero - Git compaction keeps diffs between all those versions in order to reduce size, but CouchDB discards them. If you are constantly updating a document, the size of a Git repo would grow forever. It is possible (with some effort) to use history rewrit- ing to make Git forget commits earlier than a particular one. What is the CouchDB replication protocol? Is it like Git? Author Jason Smith Date 2011-01-29 Source StackOverflow Key points If you know Git, then you know how Couch replication works. Replicating is very similar to pushing or pulling with distributed source managers like Git. CouchDB replication does not have its own protocol. A replicator simply connects to two DBs as a client, then reads from one and writes to the other. Push replication is reading the local data and updating the re- mote DB; pull replication is vice versa. • Fun fact 1: The replicator is actually an independent Erlang applica- tion, in its own process. It connects to both couches, then reads records from one and writes them to the other. • Fun fact 2: CouchDB has no way of knowing who is a normal client and who is a replicator (let alone whether the replication is push or pull). It all looks like client connections. Some of them read records. Some of them write records. Everything flows from the data model The replication algorithm is trivial, uninteresting. A trained monkey could design it. Its simple because the cleverness is the data model, which has these useful characteristics: 1. Every record in CouchDB is completely independent of all others. That sucks if you want to do a JOIN or a transaction, but its awe- some if you want to write a replicator. Just figure out how to replicate one record, and then repeat that for each record. 2. Like Git, records have a linked-list revision history. A records re- vision ID is the checksum of its own data. Subsequent revision IDs are checksums of: the new data, plus the revision ID of the previ- ous. 3. In addition to application data ({"name": "Jason", "awesome": true}), every record stores the evolutionary time line of all previ- ous revision IDs leading up to itself. • Exercise: Take a moment of quiet reflection. Consider any two dif- ferent records, A and B. 
If As revision ID appears in Bs time line, then B definitely evolved from A. Now consider Gits fast-forward merges. Do you hear that? That is the sound of your mind being blown. 4. Git isnt really a linear list. It has forks, when one parent has multiple children. CouchDB has that too. • Exercise: Compare two different records, A and B. As revision ID does not appear in Bs time line; however, one revision ID, C, is in both As and Bs time line. Thus A didnt evolve from B. B didnt evolve from A. But rather, A and B have a common ancestor C. In Git, that is a fork. In CouchDB, its a conflict. • In Git, if both children go on to develop their time lines inde- pendently, thats cool. Forks totally support that. • In CouchDB, if both children go on to develop their time lines in- dependently, that cool too. Conflicts totally support that. • Fun fact 3: CouchDB conflicts do not correspond to Git conflicts. A Couch conflict is a divergent revision history, what Git calls a fork. For this reason the CouchDB community pronounces conflict with a silent n: co-flicked. 5. Git also has merges, when one child has multiple parents. CouchDB sort of has that too. • In the data model, there is no merge. The client simply marks one time line as deleted and continues to work with the only extant time line. • In the application, it feels like a merge. Typically, the client merges the data from each time line in an application-specific way. Then it writes the new data to the time line. In Git, this is like copying and pasting the changes from branch A into branch B, then committing to branch B and deleting branch A. The data was merged, but there was no git merge. • These behaviors are different because, in Git, the time line it- self is important; but in CouchDB, the data is important and the time line is incidentalits just there to support replication. That is one reason why CouchDBs built-in revisioning is inappropriate for storing revision data like a wiki page. Final notes At least one sentence in this writeup (possibly this one) is complete BS. CouchDB Replication Protocol Version 3 The CouchDB Replication Protocol is a protocol for synchronising JSON documents between 2 peers over HTTP/1.1 by using the public CouchDB REST API and is based on the Apache CouchDB MVCC Data model. Preface Language The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119. Goals The primary goal of this specification is to describe the CouchDB Replication Protocol under the hood. The secondary goal is to provide enough detailed information about the protocol to make it easy to build tools on any language and platform that can synchronize data with CouchDB. Definitions JSON: JSON is a text format for the serialization of structured data. It is described in ECMA-262 and RFC 4627. URI: A URI is defined by RFC 3986. It can be a URL as defined in RFC 1738. ID: An identifier (could be a UUID) as described in RFC 4122. Revision: A MVCC token value of following pattern: N-sig where N is ALWAYS a positive integer and sig is the Document signature (custom). Dont mix it up with the revision in version control systems! Leaf Revision: The last Document Revision in a series of changes. Documents may have multiple Leaf Revisions (aka Conflict Revisions) due to concurrent updates. Document: A document is a JSON object with an ID and Revision defined in _id and _rev fields respectively. 
A Documents ID MUST be unique within the Database where it is stored. Database: A collection of Documents with a unique URI. Changes Feed: A stream of Document-changing events (create, update, delete) for the specified Database. Sequence ID: An ID provided by the Changes Feed. It MUST be incremental, but MAY NOT always be an integer. Source: Database from where the Documents are replicated. Target: Database where the Documents are replicated to. Replication: The one-way directed synchronization process of Source and Tar- get endpoints. Checkpoint: Intermediate Recorded Sequence ID used for Replication recovery. Replicator: A service or an application which initiates and runs Replica- tion. Filter Function: A special function of any programming language that is used to filter Documents during Replication (see Filter Functions) Filter Function Name: An ID of a Filter Function that may be used as a symbolic refer- ence (aka callback function) to apply the related Filter Func- tion to Replication. Filtered Replication: Replication of Documents from Source to Target using a Filter Function. Full Replication: Replication of all Documents from Source to Target. Push Replication: Replication process where Source is a local endpoint and Target is remote. Pull Replication: Replication process where Source is a remote endpoint and Target is local. Continuous Replication: Replication that never stops: after processing all events from the Changes Feed, the Replicator doesnt close the connection, but awaits new change events from the Source. The connection is kept alive by periodic heartbeats. Replication Log: A special Document that holds Replication history (recorded Checkpoints and a few more statistics) between Source and Tar- get. Replication ID: A unique value that unambiguously identifies the Replication Log. Replication Protocol Algorithm The CouchDB Replication Protocol is not magical, but an agreement on usage of the public CouchDB HTTP REST API to enable Documents to be replicated from Source to Target. The reference implementation, written in Erlang, is provided by the - couch_replicator module in Apache CouchDB. It is RECOMMENDED that one follow this algorithm specification, use the same HTTP endpoints, and run requests with the same parameters to pro- vide a completely compatible implementation. Custom Replicator imple- mentations MAY use different HTTP API endpoints and request parameters depending on their local specifics and they MAY implement only part of the Replication Protocol to run only Push or Pull Replication. However, while such solutions could also run the Replication process, they loose compatibility with the CouchDB Replicator. Verify Peers + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + ' Verify Peers: ' ' ' ' 404 Not Found +--------------------------------+ ' ' +----------------------- | Check Source Existence | ' ' | +--------------------------------+ ' ' | | HEAD /source | ' ' | +--------------------------------+ ' ' | | ' ' | | 200 OK ' ' | v ' ' | +--------------------------------+ ' ' | | Check Target Existence | ----+ ' ' | +--------------------------------+ | ' ' | | HEAD /target | | ' ' | +--------------------------------+ | ' ' | | | ' ' | | 404 Not Found | ' ' v v | ' ' +-------+ No +--------------------------------+ | ' ' | Abort | <----------------- | Create Target? 
| | ' ' +-------+ +--------------------------------+ | ' ' ^ | | ' ' | | Yes | ' ' | v | ' ' | Failure +--------------------------------+ | ' ' +----------------------- | Create Target | | ' ' +--------------------------------+ | ' ' | PUT /target | | ' ' +--------------------------------+ | ' ' | | ' ' | 201 Created 200 OK | ' ' | | ' + - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - | - + | | + - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - | - + ' Get Peers Information: | | ' ' +------------------------------------+ ' ' | ' ' v ' ' +--------------------------------+ ' ' | Get Source Information | ' ' +--------------------------------+ ' ' ' + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + The Replicator MUST ensure that both Source and Target exist by using HEAD /{db} requests. Check Source Existence Request: HEAD /source HTTP/1.1 Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 05 Oct 2013 08:50:39 GMT Server: CouchDB (Erlang/OTP) Check Target Existence Request: HEAD /target HTTP/1.1 Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 05 Oct 2013 08:51:11 GMT Server: CouchDB (Erlang/OTP) Create Target? In case of a non-existent Target, the Replicator MAY make a PUT /{db} request to create the Target: Request: PUT /target HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 201 Created Content-Length: 12 Content-Type: application/json Date: Sat, 05 Oct 2013 08:58:41 GMT Server: CouchDB (Erlang/OTP) { "ok": true } However, the Replicators PUT request MAY NOT succeeded due to insuffi- cient privileges (which are granted by the provided credential) and so receive a 401 Unauthorized or a 403 Forbidden error. 
Such errors SHOULD be expected and well handled: HTTP/1.1 500 Internal Server Error Cache-Control: must-revalidate Content-Length: 108 Content-Type: application/json Date: Fri, 09 May 2014 13:50:32 GMT Server: CouchDB (Erlang OTP) { "error": "unauthorized", "reason": "unauthorized to access or create database http://localhost:5984/target" } Abort In case of a non-existent Source or Target, Replication SHOULD be aborted with an HTTP error response: HTTP/1.1 500 Internal Server Error Cache-Control: must-revalidate Content-Length: 56 Content-Type: application/json Date: Sat, 05 Oct 2013 08:55:29 GMT Server: CouchDB (Erlang OTP) { "error": "db_not_found", "reason": "could not open source" } Get Peers Information + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+ ' Verify Peers: ' ' +------------------------+ ' ' | Check Target Existence | ' ' +------------------------+ ' ' | ' ' | 200 OK ' ' | ' + - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - -+ | + - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - -+ ' Get Peers Information: | ' ' v ' ' +------------------------+ ' ' | Get Source Information | ' ' +------------------------+ ' ' | GET /source | ' ' +------------------------+ ' ' | ' ' | 200 OK ' ' v ' ' +------------------------+ ' ' | Get Target Information | ' ' +------------------------+ ' ' | GET /target | ' ' +------------------------+ ' ' | ' ' | 200 OK ' ' | ' + - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - -+ | + - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - -+ ' Find Common Ancestry: | ' ' | ' ' v ' ' +-------------------------+ ' ' | Generate Replication ID | ' ' +-------------------------+ ' ' ' + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+ The Replicator retrieves basic information both from Source and Target using GET /{db} requests. The GET response MUST contain JSON objects with the following mandatory fields: • instance_start_time (string): Always "0". (Returned for legacy rea- sons.) • update_seq (number / string): The current database Sequence ID. Any other fields are optional. The information that the Replicator needs is the update_seq field: this value will be used to define a tem- porary (because Database data is subject to change) upper bound for changes feed listening and statistic calculating to show proper Repli- cation progress. 
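The Verify Peers and Get Peers Information steps map directly onto plain HTTP calls. The following non-normative sketch (JavaScript, assuming Node.js 18+ for the built-in fetch and a CouchDB listening on http://localhost:5984; the helper names and the createTarget flag are illustrative only, not part of the protocol) shows one way a client might drive them. The concrete request/response exchanges are given below.

    // Non-normative sketch of Verify Peers + Get Peers Information.
    // Assumes Node.js 18+ (built-in fetch) and CouchDB at localhost:5984.
    const BASE = 'http://localhost:5984';

    async function dbExists(db) {
      // HEAD /{db} answers 200 OK if the database exists, 404 Not Found otherwise
      const res = await fetch(`${BASE}/${db}`, { method: 'HEAD' });
      return res.status === 200;
    }

    async function verifyPeers(source, target, createTarget) {
      if (!(await dbExists(source))) {
        throw new Error('db_not_found: could not open source');     // Abort
      }
      if (!(await dbExists(target))) {
        if (!createTarget) {
          throw new Error('db_not_found: could not open target');   // Abort
        }
        // PUT /{db} creates the Target; may fail with 401/403 on missing privileges
        const res = await fetch(`${BASE}/${target}`, { method: 'PUT' });
        if (!res.ok) throw new Error('could not create target: ' + res.status);
      }
      // GET /{db} exposes update_seq, the temporary upper bound for the Changes Feed
      const info = await (await fetch(`${BASE}/${source}`)).json();
      return info.update_seq;
    }

    verifyPeers('source', 'target', true)
      .then(seq => console.log('source update_seq:', seq));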
Get Source Information Request: GET /source HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 256 Content-Type: application/json Date: Tue, 08 Oct 2013 07:53:08 GMT Server: CouchDB (Erlang OTP) { "committed_update_seq": 61772, "compact_running": false, "db_name": "source", "disk_format_version": 6, "doc_count": 41961, "doc_del_count": 3807, "instance_start_time": "0", "purge_seq": 0, "sizes": { "active": 70781613961, "disk": 79132913799, "external": 72345632950 }, "update_seq": 61772 } Get Target Information Request: GET /target/ HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Content-Length: 363 Content-Type: application/json Date: Tue, 08 Oct 2013 12:37:01 GMT Server: CouchDB (Erlang/OTP) { "compact_running": false, "db_name": "target", "disk_format_version": 5, "doc_count": 1832, "doc_del_count": 1, "instance_start_time": "0", "purge_seq": 0, "sizes": { "active": 50829452, "disk": 77001455, "external": 60326450 }, "update_seq": "1841-g1AAAADveJzLYWBgYMlgTmGQT0lKzi9KdUhJMtbLSs1LLUst0k" } Find Common Ancestry + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + ' Get Peers Information: ' ' ' ' +-------------------------------------------+ ' ' | Get Target Information | ' ' +-------------------------------------------+ ' ' | ' + - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - + | + - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - + ' Find Common Ancestry: v ' ' +-------------------------------------------+ ' ' | Generate Replication ID | ' ' +-------------------------------------------+ ' ' | ' ' | ' ' v ' ' +-------------------------------------------+ ' ' | Get Replication Log from Source | ' ' +-------------------------------------------+ ' ' | GET /source/_local/replication-id | ' ' +-------------------------------------------+ ' ' | ' ' | 200 OK ' ' | 404 Not Found ' ' v ' ' +-------------------------------------------+ ' ' | Get Replication Log from Target | ' ' +-------------------------------------------+ ' ' | GET /target/_local/replication-id | ' ' +-------------------------------------------+ ' ' | ' ' | 200 OK ' ' | 404 Not Found ' ' v ' ' +-------------------------------------------+ ' ' | Compare Replication Logs | ' ' +-------------------------------------------+ ' ' | ' ' | Use latest common sequence as start point ' ' | ' + - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - + | | + - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - + ' Locate Changed Documents: | ' ' | ' ' v ' ' +-------------------------------------------+ ' ' | Listen Source Changes Feed | ' ' +-------------------------------------------+ ' ' ' + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + Generate Replication ID Before Replication is started, the Replicator MUST generate a Replica- tion ID. This value is used to track Replication History, resume and continue previously interrupted Replication process. The Replication ID generation algorithm is implementation specific. Whatever algorithm is used it MUST uniquely identify the Replication process. CouchDBs Replicator, for example, uses the following factors in generating a Replication ID: • Persistent Peer UUID value. 
For CouchDB, the local Server UUID is used • Source and Target URI and if Source or Target are local or remote Databases • If Target needed to be created • If Replication is Continuous • Any custom headers • Filter function code if used • Changes Feed query parameters, if any NOTE: See couch_replicator_ids.erl for an example of a Replication ID gen- eration implementation. Retrieve Replication Logs from Source and Target Once the Replication ID has been generated, the Replicator SHOULD re- trieve the Replication Log from both Source and Target using GET /{db}/_local/{docid}: Request: GET /source/_local/b3e44b920ee2951cb2e123b63044427a HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 1019 Content-Type: application/json Date: Thu, 10 Oct 2013 06:18:56 GMT ETag: "0-8" Server: CouchDB (Erlang OTP) { "_id": "_local/b3e44b920ee2951cb2e123b63044427a", "_rev": "0-8", "history": [ { "doc_write_failures": 0, "docs_read": 2, "docs_written": 2, "end_last_seq": 5, "end_time": "Thu, 10 Oct 2013 05:56:38 GMT", "missing_checked": 2, "missing_found": 2, "recorded_seq": 5, "session_id": "d5a34cbbdafa70e0db5cb57d02a6b955", "start_last_seq": 3, "start_time": "Thu, 10 Oct 2013 05:56:38 GMT" }, { "doc_write_failures": 0, "docs_read": 1, "docs_written": 1, "end_last_seq": 3, "end_time": "Thu, 10 Oct 2013 05:56:12 GMT", "missing_checked": 1, "missing_found": 1, "recorded_seq": 3, "session_id": "11a79cdae1719c362e9857cd1ddff09d", "start_last_seq": 2, "start_time": "Thu, 10 Oct 2013 05:56:12 GMT" }, { "doc_write_failures": 0, "docs_read": 2, "docs_written": 2, "end_last_seq": 2, "end_time": "Thu, 10 Oct 2013 05:56:04 GMT", "missing_checked": 2, "missing_found": 2, "recorded_seq": 2, "session_id": "77cdf93cde05f15fcb710f320c37c155", "start_last_seq": 0, "start_time": "Thu, 10 Oct 2013 05:56:04 GMT" } ], "replication_id_version": 3, "session_id": "d5a34cbbdafa70e0db5cb57d02a6b955", "source_last_seq": 5 } The Replication Log SHOULD contain the following fields: • history (array of object): Replication history. Required • doc_write_failures (number): Number of failed writes • docs_read (number): Number of read documents • docs_written (number): Number of written documents • end_last_seq (number): Last processed Update Sequence ID • end_time (string): Replication completion timestamp in RFC 5322 format • missing_checked (number): Number of checked revisions on Source • missing_found (number): Number of missing revisions found on Target • recorded_seq (number): Recorded intermediate Checkpoint. Required • session_id (string): Unique session ID. Commonly, a random UUID value is used. Required • start_last_seq (number): Start update Sequence ID • start_time (string): Replication start timestamp in RFC 5322 format • replication_id_version (number): Replication protocol version. De- fines Replication ID calculation algorithm, HTTP API calls and the others routines. Required • session_id (string): Unique ID of the last session. Shortcut to the session_id field of the latest history object. Required • source_last_seq (number): Last processed Checkpoint. Shortcut to the recorded_seq field of the latest history object. 
Required This request MAY fail with a 404 Not Found response: Request: GET /source/_local/b6cef528f67aa1a8a014dd1144b10e09 HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 404 Object Not Found Cache-Control: must-revalidate Content-Length: 41 Content-Type: application/json Date: Tue, 08 Oct 2013 13:31:10 GMT Server: CouchDB (Erlang OTP) { "error": "not_found", "reason": "missing" } Thats OK. This means that there is no information about the current Replication so it must not have been run previously and as such the Replicator MUST run a Full Replication. Compare Replication Logs If the Replication Logs are successfully retrieved from both Source and Target then the Replicator MUST determine their common ancestry by fol- lowing the next algorithm: • Compare session_id values for the chronological last session - if they match both Source and Target have a common Replication history and it seems to be valid. Use source_last_seq value for the startup Checkpoint • In case of mismatch, iterate over the history collection to search for the latest (chronologically) common session_id for Source and Target. Use value of recorded_seq field as startup Checkpoint If Source and Target has no common ancestry, the Replicator MUST run Full Replication. Locate Changed Documents + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + ' Find Common Ancestry: ' ' ' ' +------------------------------+ ' ' | Compare Replication Logs | ' ' +------------------------------+ ' ' | ' ' | ' + - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - + | + - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - + ' Locate Changed Documents: | ' ' | ' ' | ' ' v ' ' +-------------------------------+ ' ' +------> | Listen to Changes Feed | -----+ ' ' | +-------------------------------+ | ' ' | | GET /source/_changes | | ' ' | | POST /source/_changes | | ' ' | +-------------------------------+ | ' ' | | | ' ' | | | ' ' | There are new changes | | No more changes ' ' | | | ' ' | v v ' ' | +-------------------------------+ +-----------------------+ ' ' | | Read Batch of Changes | | Replication Completed | ' ' | +-------------------------------+ +-----------------------+ ' ' | | ' ' | No | ' ' | v ' ' | +-------------------------------+ ' ' | | Compare Documents Revisions | ' ' | +-------------------------------+ ' ' | | POST /target/_revs_diff | ' ' | +-------------------------------+ ' ' | | ' ' | 200 OK | ' ' | v ' ' | +-------------------------------+ ' ' +------- | Any Differences Found? | ' ' +-------------------------------+ ' ' | ' ' Yes | ' ' | ' + - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - + | + - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - + ' Replicate Changes: | ' ' v ' ' +-------------------------------+ ' ' | Fetch Next Changed Document | ' ' +-------------------------------+ ' ' ' + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + Listen to Changes Feed When the start up Checkpoint has been defined, the Replicator SHOULD read the Sources Changes Feed by using a GET /{db}/_changes request. This request MUST be made with the following query parameters: • feed parameter defines the Changes Feed response style: for Continu- ous Replication the continuous value SHOULD be used, otherwise - nor- mal. • style=all_docs query parameter tells the Source that it MUST include all Revision leaves for each documents event in output. 
• For Continuous Replication the heartbeat parameter defines the heart- beat period in milliseconds. The RECOMMENDED value by default is 10000 (10 seconds). • If a startup Checkpoint was found during the Replication Logs compar- ison, the since query parameter MUST be passed with this value. In case of Full Replication it MAY be 0 (number zero) or be omitted. Additionally, the filter query parameter MAY be specified to enable a filter function on Source side. Other custom parameters MAY also be provided. Read Batch of Changes Reading the whole feed in a single shot may not be an optimal use of resources. It is RECOMMENDED to process the feed in small chunks. How- ever, there is no specific recommendation on chunk size since it is heavily dependent on available resources: large chunks requires more memory while they reduce I/O operations and vice versa. Note, that Changes Feed output format is different for a request with feed=normal and with feed=continuous query parameter. Normal Feed: Request: GET /source/_changes?feed=normal&style=all_docs&heartbeat=10000 HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Fri, 09 May 2014 16:20:41 GMT Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked {"results":[ {"seq":14,"id":"f957f41e","changes":[{"rev":"3-46a3"}],"deleted":true} {"seq":29,"id":"ddf339dd","changes":[{"rev":"10-304b"}]} {"seq":37,"id":"d3cc62f5","changes":[{"rev":"2-eec2"}],"deleted":true} {"seq":39,"id":"f13bd08b","changes":[{"rev":"1-b35d"}]} {"seq":41,"id":"e0a99867","changes":[{"rev":"2-c1c6"}]} {"seq":42,"id":"a75bdfc5","changes":[{"rev":"1-967a"}]} {"seq":43,"id":"a5f467a0","changes":[{"rev":"1-5575"}]} {"seq":45,"id":"470c3004","changes":[{"rev":"11-c292"}]} {"seq":46,"id":"b1cb8508","changes":[{"rev":"10-ABC"}]} {"seq":47,"id":"49ec0489","changes":[{"rev":"157-b01f"},{"rev":"123-6f7c"}]} {"seq":49,"id":"dad10379","changes":[{"rev":"1-9346"},{"rev":"6-5b8a"}]} {"seq":50,"id":"73464877","changes":[{"rev":"1-9f08"}]} {"seq":51,"id":"7ae19302","changes":[{"rev":"1-57bf"}]} {"seq":63,"id":"6a7a6c86","changes":[{"rev":"5-acf6"}],"deleted":true} {"seq":64,"id":"dfb9850a","changes":[{"rev":"1-102f"}]} {"seq":65,"id":"c532afa7","changes":[{"rev":"1-6491"}]} {"seq":66,"id":"af8a9508","changes":[{"rev":"1-3db2"}]} {"seq":67,"id":"caa3dded","changes":[{"rev":"1-6491"}]} {"seq":68,"id":"79f3b4e9","changes":[{"rev":"1-102f"}]} {"seq":69,"id":"1d89d16f","changes":[{"rev":"1-3db2"}]} {"seq":71,"id":"abae7348","changes":[{"rev":"2-7051"}]} {"seq":77,"id":"6c25534f","changes":[{"rev":"9-CDE"},{"rev":"3-00e7"},{"rev":"1-ABC"}]} {"seq":78,"id":"SpaghettiWithMeatballs","changes":[{"rev":"22-5f95"}]} ], "last_seq":78} Continuous Feed: Request: GET /source/_changes?feed=continuous&style=all_docs&heartbeat=10000 HTTP/1.1 Accept: application/json Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Fri, 09 May 2014 16:22:22 GMT Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked {"seq":14,"id":"f957f41e","changes":[{"rev":"3-46a3"}],"deleted":true} {"seq":29,"id":"ddf339dd","changes":[{"rev":"10-304b"}]} {"seq":37,"id":"d3cc62f5","changes":[{"rev":"2-eec2"}],"deleted":true} {"seq":39,"id":"f13bd08b","changes":[{"rev":"1-b35d"}]} {"seq":41,"id":"e0a99867","changes":[{"rev":"2-c1c6"}]} {"seq":42,"id":"a75bdfc5","changes":[{"rev":"1-967a"}]} {"seq":43,"id":"a5f467a0","changes":[{"rev":"1-5575"}]} 
{"seq":45,"id":"470c3004","changes":[{"rev":"11-c292"}]} {"seq":46,"id":"b1cb8508","changes":[{"rev":"10-ABC"}]} {"seq":47,"id":"49ec0489","changes":[{"rev":"157-b01f"},{"rev":"123-6f7c"}]} {"seq":49,"id":"dad10379","changes":[{"rev":"1-9346"},{"rev":"6-5b8a"}]} {"seq":50,"id":"73464877","changes":[{"rev":"1-9f08"}]} {"seq":51,"id":"7ae19302","changes":[{"rev":"1-57bf"}]} {"seq":63,"id":"6a7a6c86","changes":[{"rev":"5-acf6"}],"deleted":true} {"seq":64,"id":"dfb9850a","changes":[{"rev":"1-102f"}]} {"seq":65,"id":"c532afa7","changes":[{"rev":"1-6491"}]} {"seq":66,"id":"af8a9508","changes":[{"rev":"1-3db2"}]} {"seq":67,"id":"caa3dded","changes":[{"rev":"1-6491"}]} {"seq":68,"id":"79f3b4e9","changes":[{"rev":"1-102f"}]} {"seq":69,"id":"1d89d16f","changes":[{"rev":"1-3db2"}]} {"seq":71,"id":"abae7348","changes":[{"rev":"2-7051"}]} {"seq":75,"id":"SpaghettiWithMeatballs","changes":[{"rev":"21-5949"}]} {"seq":77,"id":"6c255","changes":[{"rev":"9-CDE"},{"rev":"3-00e7"},{"rev":"1-ABC"}]} {"seq":78,"id":"SpaghettiWithMeatballs","changes":[{"rev":"22-5f95"}]} For both Changes Feed formats record-per-line style is preserved to simplify iterative fetching and decoding JSON objects with less memory footprint. Calculate Revision Difference After reading the batch of changes from the Changes Feed, the Replica- tor forms a JSON mapping object for Document ID and related leaf Revi- sions and sends the result to Target via a POST /{db}/_revs_diff re- quest: Request: POST /target/_revs_diff HTTP/1.1 Accept: application/json Content-Length: 287 Content-Type: application/json Host: localhost:5984 User-Agent: CouchDB { "baz": [ "2-7051cbe5c8faecd085a3fa619e6e6337" ], "foo": [ "3-6a540f3d701ac518d3b9733d673c5484" ], "bar": [ "1-d4e501ab47de6b2000fc8a02f84a0c77", "1-967a00dff5e02add41819138abb3284d" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 88 Content-Type: application/json Date: Fri, 25 Oct 2013 14:44:41 GMT Server: CouchDB (Erlang/OTP) { "baz": { "missing": [ "2-7051cbe5c8faecd085a3fa619e6e6337" ] }, "bar": { "missing": [ "1-d4e501ab47de6b2000fc8a02f84a0c77" ] } } In the response the Replicator receives a Document ID Revisions map- ping, but only for Revisions that do not exist in Target and are RE- QUIRED to be transferred from Source. If all Revisions in the request match the current state of the Docu- ments then the response will contain an empty JSON object: Request POST /target/_revs_diff HTTP/1.1 Accept: application/json Content-Length: 160 Content-Type: application/json Host: localhost:5984 User-Agent: CouchDB { "foo": [ "3-6a540f3d701ac518d3b9733d673c5484" ], "bar": [ "1-967a00dff5e02add41819138abb3284d" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 2 Content-Type: application/json Date: Fri, 25 Oct 2013 14:45:00 GMT Server: CouchDB (Erlang/OTP) {} Replication Completed When there are no more changes left to process and no more Documents left to replicate, the Replicator finishes the Replication process. If Replication wasnt Continuous, the Replicator MAY return a response to client with statistics about the process. 
HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 414 Content-Type: application/json Date: Fri, 09 May 2014 15:14:19 GMT Server: CouchDB (Erlang OTP) { "history": [ { "doc_write_failures": 2, "docs_read": 2, "docs_written": 0, "end_last_seq": 2939, "end_time": "Fri, 09 May 2014 15:14:19 GMT", "missing_checked": 1835, "missing_found": 2, "recorded_seq": 2939, "session_id": "05918159f64842f1fe73e9e2157b2112", "start_last_seq": 0, "start_time": "Fri, 09 May 2014 15:14:18 GMT" } ], "ok": true, "replication_id_version": 3, "session_id": "05918159f64842f1fe73e9e2157b2112", "source_last_seq": 2939 } Replicate Changes + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + ' Locate Changed Documents: ' ' ' ' +-------------------------------------+ ' ' | Any Differences Found? | ' ' +-------------------------------------+ ' ' | ' ' | ' ' | ' + - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - + | + - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - + ' Replicate Changes: | ' ' v ' ' +-------------------------------------+ ' ' +---------> | Fetch Next Changed Document | <---------------------+ ' ' | +-------------------------------------+ | ' ' | | GET /source/docid | | ' ' | +-------------------------------------+ | ' ' | | | ' ' | | | ' ' | | 201 Created | ' ' | | 200 OK 401 Unauthorized | ' ' | | 403 Forbidden | ' ' | | | ' ' | v | ' ' | +-------------------------------------+ | ' ' | +------ | Document Has Changed Attachments? | | ' ' | | +-------------------------------------+ | ' ' | | | | ' ' | | | | ' ' | | | Yes | ' ' | | | | ' ' | | v | ' ' | | +------------------------+ Yes +---------------------------+ ' ' | | No | Are They Big Enough? | -------> | Update Document on Target | ' ' | | +------------------------+ +---------------------------+ ' ' | | | | PUT /target/docid | ' ' | | | +---------------------------+ ' ' | | | ' ' | | | No ' ' | | | ' ' | | v ' ' | | +-------------------------------------+ ' ' | +-----> | Put Document Into the Stack | ' ' | +-------------------------------------+ ' ' | | ' ' | | ' ' | v ' ' | No +-------------------------------------+ ' ' +---------- | Stack is Full? | ' ' | +-------------------------------------+ ' ' | | ' ' | | Yes ' ' | | ' ' | v ' ' | +-------------------------------------+ ' ' | | Upload Stack of Documents to Target | ' ' | +-------------------------------------+ ' ' | | POST /target/_bulk_docs | ' ' | +-------------------------------------+ ' ' | | ' ' | | 201 Created ' ' | v ' ' | +-------------------------------------+ ' ' | | Ensure in Commit | ' ' | +-------------------------------------+ ' ' | | POST /target/_ensure_full_commit | ' ' | +-------------------------------------+ ' ' | | ' ' | | 201 Created ' ' | v ' ' | +-------------------------------------+ ' ' | | Record Replication Checkpoint | ' ' | +-------------------------------------+ ' ' | | PUT /source/_local/replication-id | ' ' | | PUT /target/_local/replication-id | ' ' | +-------------------------------------+ ' ' | | ' ' | | 201 Created ' ' | v ' ' | No +-------------------------------------+ ' ' +---------- | All Documents from Batch Processed? 
| ' ' +-------------------------------------+ ' ' | ' ' Yes | ' ' | ' + - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - + | + - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - + ' Locate Changed Documents: | ' ' v ' ' +-------------------------------------+ ' ' | Listen to Changes Feed | ' ' +-------------------------------------+ ' ' ' + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + Fetch Changed Documents At this step the Replicator MUST fetch all Document Leaf Revisions from Source that are missed at Target. This operation is effective if Repli- cation WILL use previously calculated Revision differences since they define missing Documents and their Revisions. To fetch the Document the Replicator will make a GET /{db}/{docid} re- quest with the following query parameters: • revs=true: Instructs the Source to include the list of all known re- visions into the Document in the _revisions field. This information is needed to synchronize the Documents ancestors history between Source and Target • The open_revs query parameter contains a JSON array with a list of Leaf Revisions that are needed to be fetched. If the specified Revi- sion exists then the Document MUST be returned for this Revision. Otherwise, Source MUST return an object with the single field missing with the missed Revision as the value. In case the Document contains attachments, Source MUST return information only for those ones that had been changed (added or updated) since the specified Revision val- ues. If an attachment was deleted, the Document MUST NOT have stub information for it • latest=true: Ensures, that Source will return the latest Document Re- vision regardless of which one was specified in the open_revs query parameter. This parameter solves a race condition problem where the requested Document may be changed in between this step and handling related events on the Changes Feed In the response Source SHOULD return multipart/mixed or respond instead with application/json unless the Accept header specifies a different mime type. The multipart/mixed content type allows handling the re- sponse data as a stream, since there could be multiple documents (one per each Leaf Revision) plus several attachments. These attachments are mostly binary and JSON has no way to handle such data except as base64 encoded strings which are very ineffective for transfer and processing operations. With a multipart/mixed response the Replicator handles multiple Docu- ment Leaf Revisions and their attachments one by one as raw data with- out any additional encoding applied. There is also one agreement to make data processing more effective: the Document ALWAYS goes before its attachments, so the Replicator has no need to process all the data to map related Documents-Attachments and may handle it as stream with lesser memory footprint. 
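Put together, one Changes Feed entry plus the corresponding _revs_diff answer is enough to build this fetch. The sketch below is a non-normative illustration in JavaScript (Node.js 18+ fetch assumed; the fetchMissingRevs helper and the localhost URL are not part of the protocol); the concrete multipart exchange follows in the Request/Response example.

    // Non-normative sketch: fetch the missing Leaf Revisions reported by
    // POST /{db}/_revs_diff, e.g. { "baz": { "missing": ["2-7051..."] } }.
    const SOURCE = 'http://localhost:5984/source';

    async function fetchMissingRevs(docId, missingRevs) {
      const params = new URLSearchParams({
        revs: 'true',                           // include the _revisions ancestry
        latest: 'true',                         // avoid the race with the Changes Feed
        open_revs: JSON.stringify(missingRevs)  // JSON array of missing Leaf Revisions
      });
      return fetch(`${SOURCE}/${encodeURIComponent(docId)}?${params}`, {
        headers: { Accept: 'multipart/mixed' }  // stream document plus attachments
      });
    }

    fetchMissingRevs('baz', ['2-7051cbe5c8faecd085a3fa619e6e6337'])
      .then(res => console.log(res.status, res.headers.get('content-type')));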
Request: GET /source/SpaghettiWithMeatballs?revs=true&open_revs=[%225-00ecbbc%22,%221-917fa23%22,%223-6bcedf1%22]&latest=true HTTP/1.1 Accept: multipart/mixed Host: localhost:5984 User-Agent: CouchDB Response: HTTP/1.1 200 OK Content-Type: multipart/mixed; boundary="7b1596fc4940bc1be725ad67f11ec1c4" Date: Thu, 07 Nov 2013 15:10:16 GMT Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked --7b1596fc4940bc1be725ad67f11ec1c4 Content-Type: application/json { "_id": "SpaghettiWithMeatballs", "_rev": "1-917fa23", "_revisions": { "ids": [ "917fa23" ], "start": 1 }, "description": "An Italian-American delicious dish", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } --7b1596fc4940bc1be725ad67f11ec1c4 Content-Type: multipart/related; boundary="a81a77b0ca68389dda3243a43ca946f2" --a81a77b0ca68389dda3243a43ca946f2 Content-Type: application/json { "_attachments": { "recipe.txt": { "content_type": "text/plain", "digest": "md5-R5CrCb6fX10Y46AqtNn0oQ==", "follows": true, "length": 87, "revpos": 7 } }, "_id": "SpaghettiWithMeatballs", "_rev": "7-474f12e", "_revisions": { "ids": [ "474f12e", "5949cfc", "00ecbbc", "fc997b6", "3552c87", "404838b", "5defd9d", "dc1e4be" ], "start": 7 }, "description": "An Italian-American delicious dish", "ingredients": [ "spaghetti", "tomato sauce", "meatballs", "love" ], "name": "Spaghetti with meatballs" } --a81a77b0ca68389dda3243a43ca946f2 Content-Disposition: attachment; filename="recipe.txt" Content-Type: text/plain Content-Length: 87 1. Cook spaghetti 2. Cook meetballs 3. Mix them 4. Add tomato sauce 5. ... 6. PROFIT! --a81a77b0ca68389dda3243a43ca946f2-- --7b1596fc4940bc1be725ad67f11ec1c4 Content-Type: application/json; error="true" {"missing":"3-6bcedf1"} --7b1596fc4940bc1be725ad67f11ec1c4-- After receiving the response, the Replicator puts all the received data into a local stack for further bulk upload to utilize network bandwidth effectively. The local stack size could be limited by number of Docu- ments or bytes of handled JSON data. When the stack is full the Repli- cator uploads all the handled Document in bulk mode to the Target. While bulk operations are highly RECOMMENDED to be used, in certain cases the Replicator MAY upload Documents to Target one by one. NOTE: Alternative Replicator implementations MAY use alternative ways to retrieve Documents from Source. For instance, PouchDB doesnt use the Multipart API and fetches only the latest Document Revision with in- line attachments as a single JSON object. While this is still valid CouchDB HTTP API usage, such solutions MAY require a different API implementation for non-CouchDB Peers. Upload Batch of Changed Documents To upload multiple Documents in a single shot the Replicator sends a POST /{db}/_bulk_docs request to Target with payload containing a JSON object with the following mandatory fields: • docs (array of objects): List of Document objects to update on Tar- get. These Documents MUST contain the _revisions field that holds a list of the full Revision history to let Target create Leaf Revisions that correctly preserve ancestry • new_edits (boolean): Special flag that instructs Target to store Doc- uments with the specified Revision (field _rev) value as-is without generating a new revision. Always false The request also MAY contain X-Couch-Full-Commit that used to control CouchDB <3.0 behavior when delayed commits were enabled. Other Peers MAY ignore this header or use it to control similar local feature. 
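A minimal, non-normative sketch of this bulk upload (same assumptions as the previous sketches: Node.js fetch and a local Target; the uploadBatch name is illustrative). The full HTTP exchange is shown in the Request/Response example below.

    // Non-normative sketch of POST /{db}/_bulk_docs with new_edits: false.
    const TARGET = 'http://localhost:5984/target';

    async function uploadBatch(docs) {
      // Every doc carries _id, _rev and the full _revisions history.
      const res = await fetch(`${TARGET}/_bulk_docs`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        // new_edits: false stores the supplied _rev values as-is instead of
        // generating new revisions on the Target.
        body: JSON.stringify({ docs, new_edits: false })
      });
      const statuses = await res.json();     // one status object per document
      return statuses.filter(s => s.error);  // rejected documents (e.g. "forbidden")
    }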
Request: POST /target/_bulk_docs HTTP/1.1 Accept: application/json Content-Length: 826 Content-Type:application/json Host: localhost:5984 User-Agent: CouchDB X-Couch-Full-Commit: false { "docs": [ { "_id": "SpaghettiWithMeatballs", "_rev": "1-917fa2381192822767f010b95b45325b", "_revisions": { "ids": [ "917fa2381192822767f010b95b45325b" ], "start": 1 }, "description": "An Italian-American delicious dish", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" }, { "_id": "LambStew", "_rev": "1-34c318924a8f327223eed702ddfdc66d", "_revisions": { "ids": [ "34c318924a8f327223eed702ddfdc66d" ], "start": 1 }, "servings": 6, "subtitle": "Delicious with scone topping", "title": "Lamb Stew" }, { "_id": "FishStew", "_rev": "1-9c65296036141e575d32ba9c034dd3ee", "_revisions": { "ids": [ "9c65296036141e575d32ba9c034dd3ee" ], "start": 1 }, "servings": 4, "subtitle": "Delicious with fresh bread", "title": "Fish Stew" } ], "new_edits": false } In its response Target MUST return a JSON array with a list of Document update statuses. If the Document has been stored successfully, the list item MUST contain the field ok with true value. Otherwise it MUST con- tain error and reason fields with error type and a human-friendly rea- son description. Document updating failure isnt fatal as Target MAY reject the update for its own reasons. Its RECOMMENDED to use error type forbidden for rejections, but other error types can also be used (like invalid field name etc.). The Replicator SHOULD NOT retry uploading rejected docu- ments unless there are good reasons for doing so (e.g. there is special error type for that). Note that while a update may fail for one Document in the response, Target can still return a 201 Created response. Same will be true if all updates fail for all uploaded Documents. Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 246 Content-Type: application/json Date: Sun, 10 Nov 2013 19:02:26 GMT Server: CouchDB (Erlang/OTP) [ { "ok": true, "id": "SpaghettiWithMeatballs", "rev":" 1-917fa2381192822767f010b95b45325b" }, { "ok": true, "id": "FishStew", "rev": "1-9c65296036141e575d32ba9c034dd3ee" }, { "error": "forbidden", "id": "LambStew", "reason": "sorry", "rev": "1-34c318924a8f327223eed702ddfdc66d" } ] Upload Document with Attachments There is a special optimization case when then Replicator WILL NOT use bulk upload of changed Documents. This case is applied when Documents contain a lot of attached files or the files are too big to be effi- ciently encoded with Base64. For this case the Replicator issues a /{db}/{docid}?new_edits=false re- quest with multipart/related content type. Such a request allows one to easily stream the Document and all its attachments one by one without any serialization overhead. 
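The multipart/related body can be assembled by hand, as in the simplified, non-normative sketch below (a single text attachment only; a real replicator also carries digest and revpos in the attachment stub and must supply the document's _rev and _revisions, as the example request that follows shows).

    // Non-normative sketch: PUT one Document with a single text attachment
    // using multipart/related. The caller supplies _id, _rev and _revisions.
    const TARGET = 'http://localhost:5984/target';

    async function putDocWithAttachment(doc, name, mime, data) {
      const boundary = '864d690aeb91f25d469dec6851fb57f2';   // any unique token
      doc._attachments = {
        [name]: { content_type: mime, length: data.length, follows: true }
      };
      const body =
        `--${boundary}\r\nContent-Type: application/json\r\n\r\n` +
        JSON.stringify(doc) + '\r\n' +                        // document part goes first
        `--${boundary}\r\n` +
        `Content-Disposition: attachment; filename="${name}"\r\n` +
        `Content-Type: ${mime}\r\nContent-Length: ${data.length}\r\n\r\n` +
        data + '\r\n' +
        `--${boundary}--`;
      return fetch(`${TARGET}/${encodeURIComponent(doc._id)}?new_edits=false`, {
        method: 'PUT',
        headers: { 'Content-Type': `multipart/related; boundary="${boundary}"` },
        body
      });
    }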
Request: PUT /target/SpaghettiWithMeatballs?new_edits=false HTTP/1.1 Accept: application/json Content-Length: 1030 Content-Type: multipart/related; boundary="864d690aeb91f25d469dec6851fb57f2" Host: localhost:5984 User-Agent: CouchDB --2fa48cba80d0cdba7829931fe8acce9d Content-Type: application/json { "_attachments": { "recipe.txt": { "content_type": "text/plain", "digest": "md5-R5CrCb6fX10Y46AqtNn0oQ==", "follows": true, "length": 87, "revpos": 7 } }, "_id": "SpaghettiWithMeatballs", "_rev": "7-474f12eb068c717243487a9505f6123b", "_revisions": { "ids": [ "474f12eb068c717243487a9505f6123b", "5949cfcd437e3ee22d2d98a26d1a83bf", "00ecbbc54e2a171156ec345b77dfdf59", "fc997b62794a6268f2636a4a176efcd6", "3552c87351aadc1e4bea2461a1e8113a", "404838bc2862ce76c6ebed046f9eb542", "5defd9d813628cea6e98196eb0ee8594" ], "start": 7 }, "description": "An Italian-American delicious dish", "ingredients": [ "spaghetti", "tomato sauce", "meatballs", "love" ], "name": "Spaghetti with meatballs" } --2fa48cba80d0cdba7829931fe8acce9d Content-Disposition: attachment; filename="recipe.txt" Content-Type: text/plain Content-Length: 87 1. Cook spaghetti 2. Cook meetballs 3. Mix them 4. Add tomato sauce 5. ... 6. PROFIT! --2fa48cba80d0cdba7829931fe8acce9d-- Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 105 Content-Type: application/json Date: Fri, 08 Nov 2013 16:35:27 GMT Server: CouchDB (Erlang/OTP) { "ok": true, "id": "SpaghettiWithMeatballs", "rev": "7-474f12eb068c717243487a9505f6123b" } Unlike bulk updating via POST /{db}/_bulk_docs endpoint, the response MAY come with a different status code. For instance, in the case when the Document is rejected, Target SHOULD respond with a 403 Forbidden: Response: HTTP/1.1 403 Forbidden Cache-Control: must-revalidate Content-Length: 39 Content-Type: application/json Date: Fri, 08 Nov 2013 16:35:27 GMT Server: CouchDB (Erlang/OTP) { "error": "forbidden", "reason": "sorry" } Replicator SHOULD NOT retry requests in case of a 401 Unauthorized, 403 Forbidden, 409 Conflict or 412 Precondition Failed since repeating the request couldnt solve the issue with user credentials or uploaded data. Ensure In Commit Once a batch of changes has been successfully uploaded to Target, the Replicator issues a POST /{db}/_ensure_full_commit request to ensure that every transferred bit is laid down on disk or other persistent storage place. Target MUST return 201 Created response with a JSON ob- ject containing the following mandatory fields: • instance_start_time (string): Timestamp of when the database was opened, expressed in microseconds since the epoch • ok (boolean): Operation status. Constantly true Request: POST /target/_ensure_full_commit HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 53 Content-Type: application/json Date: Web, 06 Nov 2013 18:20:43 GMT Server: CouchDB (Erlang/OTP) { "instance_start_time": "0", "ok": true } Record Replication Checkpoint Since batches of changes were uploaded and committed successfully, the Replicator updates the Replication Log both on Source and Target recording the current Replication state. This operation is REQUIRED so that in the case of Replication failure the replication can resume from last point of success, not from the very beginning. 
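A non-normative sketch of the commit and checkpoint steps (same assumptions as before; in a real replicator each peer keeps its own _rev on the Replication Log, so the two PUTs would carry different _rev values, as the concrete examples below show).

    // Non-normative sketch: flush Target to disk, then record the Checkpoint
    // on both peers. replicationId and log are built by the caller.
    const SOURCE = 'http://localhost:5984/source';
    const TARGET = 'http://localhost:5984/target';

    async function recordCheckpoint(replicationId, log) {
      // Ensure every transferred bit is laid down on persistent storage.
      await fetch(`${TARGET}/_ensure_full_commit`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' }
      });
      // Write the Replication Log (_local document) to Source and Target.
      for (const base of [SOURCE, TARGET]) {
        await fetch(`${base}/_local/${replicationId}`, {
          method: 'PUT',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify(log)   // history, session_id, source_last_seq, ...
        });
      }
    }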
Replicator updates Replication Log on Source: Request: PUT /source/_local/afa899a9e59589c3d4ce5668e3218aef HTTP/1.1 Accept: application/json Content-Length: 591 Content-Type: application/json Host: localhost:5984 User-Agent: CouchDB { "_id": "_local/afa899a9e59589c3d4ce5668e3218aef", "_rev": "0-1", "_revisions": { "ids": [ "31f36e40158e717fbe9842e227b389df" ], "start": 1 }, "history": [ { "doc_write_failures": 0, "docs_read": 6, "docs_written": 6, "end_last_seq": 26, "end_time": "Thu, 07 Nov 2013 09:42:17 GMT", "missing_checked": 6, "missing_found": 6, "recorded_seq": 26, "session_id": "04bf15bf1d9fa8ac1abc67d0c3e04f07", "start_last_seq": 0, "start_time": "Thu, 07 Nov 2013 09:41:43 GMT" } ], "replication_id_version": 3, "session_id": "04bf15bf1d9fa8ac1abc67d0c3e04f07", "source_last_seq": 26 } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 75 Content-Type: application/json Date: Thu, 07 Nov 2013 09:42:17 GMT Server: CouchDB (Erlang/OTP) { "id": "_local/afa899a9e59589c3d4ce5668e3218aef", "ok": true, "rev": "0-2" } and on Target too: Request: PUT /target/_local/afa899a9e59589c3d4ce5668e3218aef HTTP/1.1 Accept: application/json Content-Length: 591 Content-Type: application/json Host: localhost:5984 User-Agent: CouchDB { "_id": "_local/afa899a9e59589c3d4ce5668e3218aef", "_rev": "1-31f36e40158e717fbe9842e227b389df", "_revisions": { "ids": [ "31f36e40158e717fbe9842e227b389df" ], "start": 1 }, "history": [ { "doc_write_failures": 0, "docs_read": 6, "docs_written": 6, "end_last_seq": 26, "end_time": "Thu, 07 Nov 2013 09:42:17 GMT", "missing_checked": 6, "missing_found": 6, "recorded_seq": 26, "session_id": "04bf15bf1d9fa8ac1abc67d0c3e04f07", "start_last_seq": 0, "start_time": "Thu, 07 Nov 2013 09:41:43 GMT" } ], "replication_id_version": 3, "session_id": "04bf15bf1d9fa8ac1abc67d0c3e04f07", "source_last_seq": 26 } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 106 Content-Type: application/json Date: Thu, 07 Nov 2013 09:42:17 GMT Server: CouchDB (Erlang/OTP) { "id": "_local/afa899a9e59589c3d4ce5668e3218aef", "ok": true, "rev": "2-9b5d1e36bed6ae08611466e30af1259a" } Continue Reading Changes Once a batch of changes had been processed and transferred to Target successfully, the Replicator can continue to listen to the Changes Feed for new changes. If there are no new changes to process the Replication is considered to be done. For Continuous Replication, the Replicator MUST continue to wait for new changes from Source. Protocol Robustness Since the CouchDB Replication Protocol works on top of HTTP, which is based on TCP/IP, the Replicator SHOULD expect to be working within an unstable environment with delays, losses and other bad surprises that might eventually occur. The Replicator SHOULD NOT count every HTTP re- quest failure as a fatal error. It SHOULD be smart enough to detect timeouts, repeat failed requests, be ready to process incomplete or malformed data and so on. Data must flow - thats the rule. 
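In practice this advice usually takes the form of a retry wrapper around every request the Replicator makes. The sketch below is only an illustration of the idea, not a prescribed algorithm; exponential backoff with a fixed retry budget is one common choice.

    // Non-normative sketch: retry transient failures with exponential backoff.
    async function withRetry(makeRequest, retries = 5) {
      for (let attempt = 0; ; attempt++) {
        try {
          const res = await makeRequest();
          // 5xx responses are treated as transient; anything else is returned.
          if (res.status < 500 || attempt === retries) return res;
        } catch (err) {
          // network error, timeout, dropped connection ...
          if (attempt === retries) throw err;
        }
        await new Promise(r => setTimeout(r, 1000 * 2 ** attempt));   // back off
      }
    }

    // e.g.: withRetry(() => fetch('http://localhost:5984/source/_changes?feed=normal'))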
Error Responses In case something goes wrong the Peer MUST respond with a JSON object with the following REQUIRED fields: • error (string): Error type for programs and developers • reason (string): Error description for humans Bad Request If a request contains malformed data (like invalid JSON) the Peer MUST respond with a HTTP 400 Bad Request and bad_request as error type: { "error": "bad_request", "reason": "invalid json" } Unauthorized If a Peer REQUIRES credentials be included with the request and the re- quest does not contain acceptable credentials then the Peer MUST re- spond with the HTTP 401 Unauthorized and unauthorized as error type: { "error": "unauthorized", "reason": "Name or password is incorrect" } Forbidden If a Peer receives valid user credentials, but the requester does not have sufficient permissions to perform the operation then the Peer MUST respond with a HTTP 403 Forbidden and forbidden as error type: { "error": "forbidden", "reason": "You may only update your own user document." } Resource Not Found If the requested resource, Database or Document wasnt found on a Peer, the Peer MUST respond with a HTTP 404 Not Found and not_found as error type: { "error": "not_found", "reason": "database \"target\" does not exists" } Method Not Allowed If an unsupported method was used then the Peer MUST respond with a HTTP 405 Method Not Allowed and method_not_allowed as error type: { "error": "method_not_allowed", "reason": "Only GET, PUT, DELETE allowed" } Resource Conflict A resource conflict error occurs when there are concurrent updates of the same resource by multiple clients. In this case the Peer MUST re- spond with a HTTP 409 Conflict and conflict as error type: { "error": "conflict", "reason": "document update conflict" } Precondition Failed The HTTP 412 Precondition Failed response may be sent in case of an at- tempt to create a Database (error type db_exists) that already exists or some attachment information is missing (error type missing_stub). There is no explicit error type restrictions, but it is RECOMMEND to use error types that are previously mentioned: { "error": "db_exists", "reason": "database \"target\" exists" } Server Error Raised in case an error is fatal and the Replicator cannot do anything to continue Replication. In this case the Replicator MUST return a HTTP 500 Internal Server Error response with an error description (no re- strictions on error type applied): { "error": "worker_died", "reason": "kaboom!" } Optimisations There are RECOMMENDED approaches to optimize the Replication process: • Keep the number of HTTP requests at a reasonable minimum • Try to work with a connection pool and make parallel/multiple re- quests whenever possible • Dont close sockets after each request: respect the keep-alive option • Use continuous sessions (cookies, etc.) 
to reduce authentication overhead • Try to use bulk requests for every operations with Documents • Find out optimal batch size for Changes feed processing • Preserve Replication Logs and resume Replication from the last Check- point whenever possible • Optimize filter functions: let them run as fast as possible • Get ready for surprises: networks are very unstable environments API Reference Common Methods • HEAD /{db} Check Database existence • GET /{db} Retrieve Database information • GET /{db}/_local/{docid} Read the last Checkpoint • PUT /{db}/_local/{docid} Save a new Checkpoint For Target • PUT /{db} Create Target if it not exists and the option was provided • POST /{db}/_revs_diff Locate Revisions that are not known to Target • POST /{db}/_bulk_docs Upload Revisions to Target • PUT /{db}/{docid} Upload a single Document with attachments to Tar- get • POST /{db}/_ensure_full_commit Ensure that all changes are stored on disk For Source • GET /{db}/_changes Fetch changes since the last pull of Source • POST /{db}/_changes Fetch changes for specified Document IDs since the last pull of Source • GET /{db}/{docid} Retrieve a single Document from Source with at- tachments Reference • Refuge RCouch wiki • CouchBase Lite IOS wiki DESIGN DOCUMENTS CouchDB supports special documents within databases known as design documents. These documents, mostly driven by JavaScript you write, are used to build indexes, validate document updates, format query results, and filter replications. Design Documents In this section well show how to write design documents, using the built-in JavaScript Query Server. But before we start to write our first document, lets take a look at the list of common objects that will be used during our code journey - well be using them extensively within each function: • Database information object • Request object • Response object • UserCtx object • Database Security object • Guide to JavaScript Query Server Creation and Structure Design documents contain functions such as view and update functions. These functions are executed when requested. Design documents are denoted by an id field with the format _de- sign/{name}. Their structure follows the example below. Example: { "_id": "_design/example", "views": { "view-number-one": { "map": "function (doc) {/* function code here - see below */}" }, "view-number-two": { "map": "function (doc) {/* function code here - see below */}", "reduce": "function (keys, values, rereduce) {/* function code here - see below */}" } }, "updates": { "updatefun1": "function(doc,req) {/* function code here - see below */}", "updatefun2": "function(doc,req) {/* function code here - see below */}" }, "filters": { "filterfunction1": "function(doc, req){ /* function code here - see below */ }" }, "validate_doc_update": "function(newDoc, oldDoc, userCtx, secObj) { /* function code here - see below */ }", "language": "javascript" } As you can see, a design document can include multiple functions of the same type. The example defines two views, both of which have a map function and one of which has a reduce function. It also defines two update functions and one filter function. The Validate Document Update function is a special case, as each design document cannot contain more than one of those. View Functions Views are the primary tool used for querying and reporting on CouchDB databases. 
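Since design documents are ordinary documents, both storing them and querying their views go through the same HTTP API. The following non-normative sketch assumes a database named recipes and reuses the _design/example id from above with a single, simplified view; none of these names is prescribed.

    // Non-normative sketch: store a design document and query one of its views.
    const DB = 'http://localhost:5984/recipes';

    const ddoc = {
      _id: '_design/example',
      language: 'javascript',
      views: {
        'view-number-one': {
          // simplified map function: index documents by title
          map: "function (doc) { if (doc.title) { emit(doc.title, 1); } }"
        }
      }
    };

    async function main() {
      // Design documents are created like any other document.
      await fetch(`${DB}/_design/example`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(ddoc)
      });
      // Querying the view builds (or refreshes) its index on demand.
      const res = await fetch(`${DB}/_design/example/_view/view-number-one`);
      console.log(await res.json());   // { total_rows: ..., offset: ..., rows: [...] }
    }

    main();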
Map Functions mapfun(doc) Arguments • doc The document that is being processed Map functions accept a single document as the argument and (optionally) emit() key/value pairs that are stored in a view. function (doc) { if (doc.type === 'post' && doc.tags && Array.isArray(doc.tags)) { doc.tags.forEach(function (tag) { emit(tag.toLowerCase(), 1); }); } } In this example a key/value pair is emitted for each value in the tags array of a document with a type of post. Note that emit() may be called many times for a single document, so the same document may be available by several different keys. Also keep in mind that each document is sealed to prevent the situation where one map function changes document state and another receives a modified version. For efficiency reasons, documents are passed to a group of map func- tions - each document is processed by a group of map functions from all views of the related design document. This means that if you trigger an index update for one view in the design document, all others will get updated too. Since version 1.1.0, map supports CommonJS modules and the require() function. Reduce and Rereduce Functions redfun(keys, values[, rereduce]) Arguments • keys Array of pairs of key-docid for related map func- tion results. Always null if rereduce is running (has true value). • values Array of map function result values. • rereduce Boolean flag to indicate a rereduce run. Returns Reduces values Reduce functions take two required arguments of keys and values lists - the result of the related map function - and an optional third value which indicates if rereduce mode is active or not. Rereduce is used for additional reduce values list, so when it is true there is no informa- tion about related keys (first argument is null). Note that if the result of a reduce function is longer than the initial values list then a Query Server error will be raised. However, this be- havior can be disabled by setting reduce_limit config option to false: [query_server_config] reduce_limit = false While disabling reduce_limit might be useful for debug proposes, remem- ber that the main task of reduce functions is to reduce the mapped re- sult, not to make it bigger. Generally, your reduce function should converge rapidly to a single value - which could be an array or similar object. Built-in Reduce Functions Additionally, CouchDB has a set of built-in reduce functions. These are implemented in Erlang and run inside CouchDB, so they are much faster than the equivalent JavaScript functions. _approx_count_distinct Added in version 2.2. Approximates the number of distinct keys in a view index using a vari- ant of the HyperLogLog algorithm. This algorithm enables an efficient, parallelizable computation of cardinality using fixed memory resources. CouchDB has configured the underlying data structure to have a relative error of ~2%. As this reducer ignores the emitted values entirely, an invocation with group=true will simply return a value of 1 for every distinct key in the view. In the case of array keys, querying the view with a group_level specified will return the number of distinct keys that share the common group prefix in each row. The algorithm is also cog- nizant of the startkey and endkey boundaries and will return the number of distinct keys within the specified key range. A final note regarding Unicode collation: this reduce function uses the binary representation of each key in the index directly as input to the HyperLogLog filter. 
As such, it will (incorrectly) consider keys that are not byte identical but that compare equal according to the Unicode collation rules to be distinct keys, and thus has the potential to overestimate the cardinality of the key space if a large number of such keys exist. _count Counts the number of values in the index with a given key. This could be implemented in JavaScript as: // could be replaced by _count function(keys, values, rereduce) { if (rereduce) { return sum(values); } else { return values.length; } } _stats Computes the following quantities for numeric values associated with each key: sum, min, max, count, and sumsqr. The behavior of the _stats function varies depending on the output of the map function. The sim- plest case is when the map phase emits a single numeric value for each key. In this case the _stats function is equivalent to the following JavaScript: // could be replaced by _stats function(keys, values, rereduce) { if (rereduce) { return { 'sum': values.reduce(function(a, b) { return a + b.sum }, 0), 'min': values.reduce(function(a, b) { return Math.min(a, b.min) }, Infinity), 'max': values.reduce(function(a, b) { return Math.max(a, b.max) }, -Infinity), 'count': values.reduce(function(a, b) { return a + b.count }, 0), 'sumsqr': values.reduce(function(a, b) { return a + b.sumsqr }, 0) } } else { return { 'sum': sum(values), 'min': Math.min.apply(null, values), 'max': Math.max.apply(null, values), 'count': values.length, 'sumsqr': (function() { var sumsqr = 0; values.forEach(function (value) { sumsqr += value * value; }); return sumsqr; })(), } } } The _stats function will also work with pre-aggregated values from a map phase. A map function that emits an object containing sum, min, max, count, and sumsqr keys and numeric values for each can use the _stats function to combine these results with the data from other docu- ments. The emitted object may contain other keys (these are ignored by the reducer), and it is also possible to mix raw numeric values and pre-aggregated objects in a single view and obtain the correct aggre- gated statistics. Finally, _stats can operate on key-value pairs where each value is an array comprised of numbers or pre-aggregated objects. In this case every value emitted from the map function must be an array, and the ar- rays must all be the same length, as _stats will compute the statisti- cal quantities above independently for each element in the array. Users who want to compute statistics on multiple values from a single docu- ment should either emit each value into the index separately, or com- pute the statistics for the set of values using the JavaScript example above and emit a pre-aggregated object. _sum In its simplest variation, _sum sums the numeric values associated with each key, as in the following JavaScript: // could be replaced by _sum function(keys, values) { return sum(values); } As with _stats, the _sum function offers a number of extended capabili- ties. The _sum function requires that map values be numbers, arrays of numbers, or objects. When presented with array output from a map func- tion, _sum will compute the sum for every element of the array. A bare numeric value will be treated as an array with a single element, and arrays with fewer elements will be treated as if they contained zeroes for every additional element in the longest emitted array. 
As an exam- ple, consider the following map output: {"total_rows":5, "offset":0, "rows": [ {"id":"id1", "key":"abc", "value": 2}, {"id":"id2", "key":"abc", "value": [3,5,7]}, {"id":"id2", "key":"def", "value": [0,0,0,42]}, {"id":"id2", "key":"ghi", "value": 1}, {"id":"id1", "key":"ghi", "value": 3} ]} The _sum for this output without any grouping would be: {"rows": [ {"key":null, "value": [9,5,7,42]} ]} while the grouped output would be {"rows": [ {"key":"abc", "value": [5,5,7]}, {"key":"def", "value": [0,0,0,42]}, {"key":"ghi", "value": 4 ]} This is in contrast to the behavior of the _stats function which re- quires that all emitted values be arrays of identical length if any ar- ray is emitted. It is also possible to have _sum recursively descend through an emitted object and compute the sums for every field in the object. Objects can- not be mixed with other data structures. Objects can be arbitrarily nested, provided that the values for all fields are themselves numbers, arrays of numbers, or objects. NOTE: Why dont reduce functions support CommonJS modules? While map functions have limited access to stored modules through require(), there is no such feature for reduce functions. The rea- son lies deep inside the way map and reduce functions are processed by the Query Server. Lets take a look at map functions first: 1. CouchDB sends all map functions in a processed design document to the Query Server. 2. the Query Server handles them one by one, compiles and puts them onto an internal stack. 3. after all map functions have been processed, CouchDB will send the remaining documents for indexing, one by one. 4. the Query Server receives the document object and applies it to every function from the stack. The emitted results are then joined into a single array and sent back to CouchDB. Now lets see how reduce functions are handled: 1. CouchDB sends as a single command the list of available reduce functions with the result list of key-value pairs that were pre- viously returned from the map functions. 2. the Query Server compiles the reduce functions and applies them to the key-value lists. The reduced result is sent back to CouchDB. As you may note, reduce functions are applied in a single shot to the map results while map functions are applied to documents one by one. This means that its possible for map functions to precompile CommonJS libraries and use them during the entire view processing, but for reduce functions they would be compiled again and again for each view result reduction, which would lead to performance degrada- tion. Show Functions WARNING: Show functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. showfun(doc, req) Arguments • doc The document that is being processed; may be omit- ted. • req Request object. Returns Response object Return type object or string Show functions are used to represent documents in various formats, com- monly as HTML pages with nice formatting. They can also be used to run server-side functions without requiring a pre-existing document. 
A basic show function could look like this:

   function(doc, req){
       if (doc) {
           return "Hello from " + doc._id + "!";
       } else {
           return "Hello, world!";
       }
   }

There is also a simpler way to return JSON-encoded data:

   function(doc, req){
       return {
           'json': {
               'id': doc['_id'],
               'rev': doc['_rev']
           }
       }
   }

and even files (this one is the CouchDB logo):

   function(doc, req){
       return {
           'headers': {
               'Content-Type' : 'image/png',
           },
           'base64': ''.concat(
               'iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAAsV',
               'BMVEUAAAD////////////////////////5ur3rEBn////////////////wDBL/',
               'AADuBAe9EB3IEBz/7+//X1/qBQn2AgP/f3/ilpzsDxfpChDtDhXeCA76AQH/v7',
               '/84eLyWV/uc3bJPEf/Dw/uw8bRWmP1h4zxSlD6YGHuQ0f6g4XyQkXvCA36MDH6',
               'wMH/z8/yAwX64ODeh47BHiv/Ly/20dLQLTj98PDXWmP/Pz//39/wGyJ7Iy9JAA',
               'AADHRSTlMAbw8vf08/bz+Pv19jK/W3AAAAg0lEQVR4Xp3LRQ4DQRBD0QqTm4Y5',
               'zMxw/4OleiJlHeUtv2X6RbNO1Uqj9g0RMCuQO0vBIg4vMFeOpCWIWmDOw82fZx',
               'vaND1c8OG4vrdOqD8YwgpDYDxRgkSm5rwu0nQVBJuMg++pLXZyr5jnc1BaH4GT',
               'LvEliY253nA3pVhQqdPt0f/erJkMGMB8xucAAAAASUVORK5CYII=')
       }
   }

But what if you need to represent data in different formats via a single function? The registerType() and provides() functions are your best friends here:

   function(doc, req){
       provides('json', function(){
           return {'json': doc}
       });
       provides('html', function(){
           return '<pre>' + toJSON(doc) + '</pre>'
       })
       provides('xml', function(){
           return {
               'headers': {'Content-Type': 'application/xml'},
               'body' : ''.concat(
                   '<?xml version="1.0" encoding="utf-8"?>\n',
                   '<doc>',
                   (function(){
                       // escape XML special characters; '&' must be handled first
                       escape = function(s){
                           return s.replace(/&/g, '&amp;')
                                   .replace(/</g, '&lt;')
                                   .replace(/>/g, '&gt;')
                                   .replace(/"/g, '&quot;');
                       };
                       var content = '';
                       for(var key in doc){
                           if(!doc.hasOwnProperty(key)) continue;
                           var value = escape(toJSON(doc[key]));
                           var key = escape(key);
                           content += ''.concat(
                               '<' + key + '>',
                               value,
                               '</' + key + '>'
                           )
                       }
                       return content;
                   })(),
                   '</doc>'
               )
           }
       })
       registerType('text-json', 'text/json')
       provides('text-json', function(){
           return toJSON(doc);
       })
   }

This function may return html, json, xml, or our custom text-json representation of the same document object, with the same processing rules. The xml provider in our function probably needs more care to handle nested objects and keys with invalid characters correctly, but you've got the idea!

SEE ALSO:
   CouchDB Guide:
      • Show Functions

List Functions
   WARNING:
      List functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0.

   listfun(head, req)

          Arguments
                 • head  View Head Information

                 • req   Request object.

          Returns
                 Last chunk.

          Return type
                 string

While Show Functions are used to customize document presentation, List Functions are used for the same purpose, but on View Functions results.

The following list function formats the view and represents it as a very simple HTML page:

   function(head, req){
       start({
           'headers': {
               'Content-Type': 'text/html'
           }
       });
       send('<html><body><table>');
       send('<tr><th>ID</th><th>Key</th><th>Value</th></tr>');
       while(row = getRow()){
           send(''.concat(
               '<tr>',
               '<td>' + toJSON(row.id) + '</td>',
               '<td>' + toJSON(row.key) + '</td>',
               '<td>' + toJSON(row.value) + '</td>',
               '</tr>'
           ));
       }
       send('</table></body></html>');
   }

Templates and styles could obviously be used to present data in a nicer fashion, but this is an excellent starting point. Note that you may also use registerType() and provides() in a similar way as for Show Functions! However, note that provides() expects the return value to be a string when used inside a list function, so you'll need to use start() to set any custom headers and stringify your JSON before returning it.
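As a hedged sketch of that last point, a list function can serve its rows as JSON through provides() by collecting the rows and returning a single string; the Content-Type header here is an explicit choice, as CouchDB would otherwise derive one from the registered type:

   function(head, req) {
       provides('json', function() {
           start({'headers': {'Content-Type': 'application/json'}});
           var rows = [], row;
           while (row = getRow()) {
               // keep only the fields we need from each view row
               rows.push({id: row.id, key: row.key, value: row.value});
           }
           // provides() inside a list function must return one string
           return toJSON({'rows': rows});
       });
   }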
SEE ALSO: CouchDB Guide: • Transforming Views with List Functions Update Functions updatefun(doc, req) Arguments • doc The document that is being processed. • req Request object Returns Two-element array: the first element is the (updated or new) document, which is committed to the database. If the first element is null no document will be committed to the database. If you are updating an existing document, it should already have an _id set, and if you are creat- ing a new document, make sure to set its _id to some- thing, either generated based on the input or the req.uuid provided. The second element is the response that will be sent back to the caller. Update handlers are functions that clients can request to invoke server-side logic that will create or update a document. This feature allows a range of use cases such as providing a server-side last modi- fied timestamp, updating individual fields in a document without first getting the latest revision, etc. When the request to an update handler includes a document ID in the URL, the server will provide the function with the most recent version of that document. You can provide any other values needed by the up- date handler function via the POST/PUT entity body or query string pa- rameters of the request. A basic example that demonstrates all use-cases of update handlers: function(doc, req){ if (!doc){ if ('id' in req && req['id']){ // create new document return [{'_id': req['id']}, 'New World'] } // change nothing in database return [null, 'Empty World'] } doc['world'] = 'hello'; doc['edited_by'] = req['userCtx']['name'] return [doc, 'Edited World!'] } Filter Functions filterfun(doc, req) Arguments • doc The document that is being processed • req Request object Returns Boolean value: true means that doc passes the filter rules, false means that it does not. Filter functions mostly act like Show Functions and List Functions: they format, or filter the changes feed. Classic Filters By default the changes feed emits all database documents changes. But if youre waiting for some special changes, processing all documents is inefficient. Filters are special design document functions that allow the changes feed to emit only specific documents that pass filter rules. Lets assume that our database is a mailbox and we need to handle only new mail events (documents with the status new). Our filter function would look like this: function(doc, req){ // we need only `mail` documents if (doc.type != 'mail'){ return false; } // we're interested only in `new` ones if (doc.status != 'new'){ return false; } return true; // passed! } Filter functions must return true if a document passed all the rules. 
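Filter functions are stored as strings in a design document under the filters field, just as views are stored under views. A hedged sketch of how the mailbox/new_mail filter referenced in the next request might be stored (the design document name mailbox is simply this example's choice):

   {
       "_id": "_design/mailbox",
       "filters": {
           "new_mail": "function(doc, req){ if (doc.type != 'mail'){ return false; } if (doc.status != 'new'){ return false; } return true; }"
       }
   }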
Now, if you apply this function to the changes feed it will emit only changes about new mails: GET /somedatabase/_changes?filter=mailbox/new_mail HTTP/1.1 {"results":[ {"seq":"1-g1AAAAF9eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBMZc4EC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HqQ_kQG3qgSQqnoCqvJYgCRDA5ACKpxPWOUCiMr9hFUegKi8T1jlA4hKkDuzAC2yZRo","id":"df8eca9da37dade42ee4d7aa3401f1dd","changes":[{"rev":"1-c2e0085a21d34fa1cecb6dc26a4ae657"}]}, {"seq":"9-g1AAAAIreJyVkEsKwjAURUMrqCOXoCuQ5MU0OrI70XyppcaRY92J7kR3ojupaSPUUgqWwAu85By4t0AITbJYo5k7aUNSAnyJ_SGFf4gEkvOyLPMsFtHRL8ZKaC1M0v3eq5ALP-X2a0G1xYKhgnONpmenjT04o_v5tOJ3LV5itTES_uP3FX9ppcAACaVsQAo38hNd_eVFt8ZklVljPqSPYLoH06PJhG0Cxq7-yhQcz-B4_fQCjFuqBjjewVF3E9cORoExSrpU_gHBTo5m","id":"df8eca9da37dade42ee4d7aa34024714","changes":[{"rev":"1-29d748a6e87b43db967fe338bcb08d74"}]}, ], "last_seq":"10-g1AAAAIreJyVkEsKwjAURR9tQR25BF2B5GMaHdmdaNIk1FLjyLHuRHeiO9Gd1LQRaimFlsALvOQcuLcAgGkWKpjbs9I4wYSvkDu4cA-BALkoyzLPQhGc3GKSCqWEjrvfexVy6abc_SxQWwzRVHCuYHaxSpuj1aqfTyp-3-IlSrdakmH8oeKvrRSIkJhSNiKFjdyEm7uc6N6YTKo3iI_pw5se3vRsMiETE23WgzJ5x8s73n-9EMYNTUc4Pt5RdxPVDkYJYxR3qfwLwW6OZw"} Note that the value of last_seq is 10-.., but we received only two records. Seems like any other changes were for documents that havent passed our filter. We probably need to filter the changes feed of our mailbox by more than a single status value. Were also interested in statuses like spam to update spam-filter heuristic rules, outgoing to let a mail daemon actu- ally send mails, and so on. Creating a lot of similar functions that actually do similar work isnt good idea - so we need a dynamic filter. You may have noticed that filter functions take a second argument named request. This allows the creation of dynamic filters based on query pa- rameters, user context and more. The dynamic version of our filter looks like this: function(doc, req){ // we need only `mail` documents if (doc.type != 'mail'){ return false; } // we're interested only in requested status if (doc.status != req.query.status){ return false; } return true; // passed! 
} and now we have passed the status query parameter in the request to let our filter match only the required documents: GET /somedatabase/_changes?filter=mailbox/by_status&status=new HTTP/1.1 {"results":[ {"seq":"1-g1AAAAF9eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBMZc4EC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HqQ_kQG3qgSQqnoCqvJYgCRDA5ACKpxPWOUCiMr9hFUegKi8T1jlA4hKkDuzAC2yZRo","id":"df8eca9da37dade42ee4d7aa3401f1dd","changes":[{"rev":"1-c2e0085a21d34fa1cecb6dc26a4ae657"}]}, {"seq":"9-g1AAAAIreJyVkEsKwjAURUMrqCOXoCuQ5MU0OrI70XyppcaRY92J7kR3ojupaSPUUgqWwAu85By4t0AITbJYo5k7aUNSAnyJ_SGFf4gEkvOyLPMsFtHRL8ZKaC1M0v3eq5ALP-X2a0G1xYKhgnONpmenjT04o_v5tOJ3LV5itTES_uP3FX9ppcAACaVsQAo38hNd_eVFt8ZklVljPqSPYLoH06PJhG0Cxq7-yhQcz-B4_fQCjFuqBjjewVF3E9cORoExSrpU_gHBTo5m","id":"df8eca9da37dade42ee4d7aa34024714","changes":[{"rev":"1-29d748a6e87b43db967fe338bcb08d74"}]}, ], "last_seq":"10-g1AAAAIreJyVkEsKwjAURR9tQR25BF2B5GMaHdmdaNIk1FLjyLHuRHeiO9Gd1LQRaimFlsALvOQcuLcAgGkWKpjbs9I4wYSvkDu4cA-BALkoyzLPQhGc3GKSCqWEjrvfexVy6abc_SxQWwzRVHCuYHaxSpuj1aqfTyp-3-IlSrdakmH8oeKvrRSIkJhSNiKFjdyEm7uc6N6YTKo3iI_pw5se3vRsMiETE23WgzJ5x8s73n-9EMYNTUc4Pt5RdxPVDkYJYxR3qfwLwW6OZw"} and we can easily change filter behavior with: GET /somedatabase/_changes?filter=mailbox/by_status&status=spam HTTP/1.1 {"results":[ {"seq":"6-g1AAAAIreJyVkM0JwjAYQD9bQT05gk4gaWIaPdlNNL_UUuPJs26im-gmuklMjVClFFoCXyDJe_BSAsA4jxVM7VHpJEswWyC_ktJfRBzEzDlX5DGPDv5gJLlSXKfN560KMfdTbL4W-FgM1oQzpmByskqbvdWqnc8qfvvHCyTXWuBu_K7iz38VCOOUENqjwg79hIvfvOhamQahROoVYn3-I5huwXSvm5BJsTbLTk3B8QiO58-_YMoMkT0cr-BwdRElmFKSNKniDcAcjmM","id":"8960e91220798fc9f9d29d24ed612e0d","changes":[{"rev":"3-cc6ff71af716ddc2ba114967025c0ee0"}]}, ], "last_seq":"10-g1AAAAIreJyVkEsKwjAURR9tQR25BF2B5GMaHdmdaNIk1FLjyLHuRHeiO9Gd1LQRaimFlsALvOQcuLcAgGkWKpjbs9I4wYSvkDu4cA-BALkoyzLPQhGc3GKSCqWEjrvfexVy6abc_SxQWwzRVHCuYHaxSpuj1aqfTyp-3-IlSrdakmH8oeKvrRSIkJhSNiKFjdyEm7uc6N6YTKo3iI_pw5se3vRsMiETE23WgzJ5x8s73n-9EMYNTUc4Pt5RdxPVDkYJYxR3qfwLwW6OZw"} Combining filters with a continuous feed allows creating powerful event-driven systems. View Filters View filters are the same as classic filters above, with one small dif- ference: they use the map instead of the filter function of a view, to filter the changes feed. Each time a key-value pair is emitted from the map function, a change is returned. This allows avoiding filter func- tions that mostly do the same work as views. To use them just pass filter=_view and view=designdoc/viewname as re- quest parameters to the changes feed: GET /somedatabase/_changes?filter=_view&view=dname/viewname HTTP/1.1 NOTE: Since view filters use map functions as filters, they cant show any dynamic behavior since request object is not available. SEE ALSO: CouchDB Guide: • Guide to filter change notification Validate Document Update Functions validatefun(newDoc, oldDoc, userCtx, secObj) Arguments • newDoc New version of document that will be stored. • oldDoc Previous version of document that is already stored. • userCtx User Context Object • secObj Security Object Throws forbidden error to gracefully prevent document storing. Throws unauthorized error to prevent storage and allow the user to re-auth. A design document may contain a function named validate_doc_update which can be used to prevent invalid or unauthorized document update requests from being stored. 
The function is passed the new document from the update request, the current document stored in the database, a User Context Object containing information about the user writing the document (if present), and a Security Object with lists of database se- curity roles. Validation functions typically examine the structure of the new docu- ment to ensure that required fields are present and to verify that the requesting user should be allowed to make changes to the document prop- erties. For example, an application may require that a user must be authenticated in order to create a new document or that specific docu- ment fields be present when a document is updated. The validation func- tion can abort the pending document write by throwing one of two error objects: // user is not authorized to make the change but may re-authenticate throw({ unauthorized: 'Error message here.' }); // change is not allowed throw({ forbidden: 'Error message here.' }); Document validation is optional, and each design document in the data- base may have at most one validation function. When a write request is received for a given database, the validation function in each design document in that database is called in an unspecified order. If any of the validation functions throw an error, the write will not succeed. Example: The _design/_auth ddoc from _users database uses a validation function to ensure that documents contain some required fields and are only modified by a user with the _admin role: function(newDoc, oldDoc, userCtx, secObj) { if (newDoc._deleted === true) { // allow deletes by admins and matching users // without checking the other fields if ((userCtx.roles.indexOf('_admin') !== -1) || (userCtx.name == oldDoc.name)) { return; } else { throw({forbidden: 'Only admins may delete other user docs.'}); } } if ((oldDoc && oldDoc.type !== 'user') || newDoc.type !== 'user') { throw({forbidden : 'doc.type must be user'}); } // we only allow user docs for now if (!newDoc.name) { throw({forbidden: 'doc.name is required'}); } if (!newDoc.roles) { throw({forbidden: 'doc.roles must exist'}); } if (!isArray(newDoc.roles)) { throw({forbidden: 'doc.roles must be an array'}); } if (newDoc._id !== ('org.couchdb.user:' + newDoc.name)) { throw({ forbidden: 'Doc ID must be of the form org.couchdb.user:name' }); } if (oldDoc) { // validate all updates if (oldDoc.name !== newDoc.name) { throw({forbidden: 'Usernames can not be changed.'}); } } if (newDoc.password_sha && !newDoc.salt) { throw({ forbidden: 'Users with password_sha must have a salt.' + 'See /_utils/script/couch.js for example code.' }); } var is_server_or_database_admin = function(userCtx, secObj) { // see if the user is a server admin if(userCtx.roles.indexOf('_admin') !== -1) { return true; // a server admin } // see if the user a database admin specified by name if(secObj && secObj.admins && secObj.admins.names) { if(secObj.admins.names.indexOf(userCtx.name) !== -1) { return true; // database admin } } // see if the user a database admin specified by role if(secObj && secObj.admins && secObj.admins.roles) { var db_roles = secObj.admins.roles; for(var idx = 0; idx < userCtx.roles.length; idx++) { var user_role = userCtx.roles[idx]; if(db_roles.indexOf(user_role) !== -1) { return true; // role matches! } } } return false; // default to no admin } if (!is_server_or_database_admin(userCtx, secObj)) { if (oldDoc) { // validate non-admin updates if (userCtx.name !== newDoc.name) { throw({ forbidden: 'You may only update your own user document.' 
}); } // validate role updates var oldRoles = oldDoc.roles.sort(); var newRoles = newDoc.roles.sort(); if (oldRoles.length !== newRoles.length) { throw({forbidden: 'Only _admin may edit roles'}); } for (var i = 0; i < oldRoles.length; i++) { if (oldRoles[i] !== newRoles[i]) { throw({forbidden: 'Only _admin may edit roles'}); } } } else if (newDoc.roles.length > 0) { throw({forbidden: 'Only _admin may set roles'}); } } // no system roles in users db for (var i = 0; i < newDoc.roles.length; i++) { if (newDoc.roles[i][0] === '_') { throw({ forbidden: 'No system roles (starting with underscore) in users db.' }); } } // no system names as names if (newDoc.name[0] === '_') { throw({forbidden: 'Username may not start with underscore.'}); } var badUserNameChars = [':']; for (var i = 0; i < badUserNameChars.length; i++) { if (newDoc.name.indexOf(badUserNameChars[i]) >= 0) { throw({forbidden: 'Character `' + badUserNameChars[i] + '` is not allowed in usernames.'}); } } } NOTE: The return statement is used only for function, it has no impact on the validation process. SEE ALSO: CouchDB Guide: • Validation Functions Guide to Views Views are the primary tool used for querying and reporting on CouchDB documents. There youll learn how they work and how to use them to build effective applications with CouchDB. Introduction to Views Views are useful for many purposes: • Filtering the documents in your database to find those relevant to a particular process. • Extracting data from your documents and presenting it in a specific order. • Building efficient indexes to find documents by any value or struc- ture that resides in them. • Use these indexes to represent relationships among documents. • Finally, with views you can make all sorts of calculations on the data in your documents. For example, if documents represent your com- panys financial transactions, a view can answer the question of what the spending was in the last week, month, or year. What Is a View? Lets go through the different use cases. First is extracting data that you might need for a special purpose in a specific order. For a front page, we want a list of blog post titles sorted by date. Well work with a set of example documents as we walk through how views work: { "_id":"biking", "_rev":"AE19EBC7654", "title":"Biking", "body":"My biggest hobby is mountainbiking. The other day...", "date":"2009/01/30 18:04:11" } { "_id":"bought-a-cat", "_rev":"4A3BBEE711", "title":"Bought a Cat", "body":"I went to the pet store earlier and brought home a little kitty...", "date":"2009/02/17 21:13:39" } { "_id":"hello-world", "_rev":"43FBA4E7AB", "title":"Hello World", "body":"Well hello and welcome to my new blog...", "date":"2009/01/15 15:52:20" } Three will do for the example. Note that the documents are sorted by _id, which is how they are stored in the database. Now we define a view. Bear with us without an explanation while we show you some code: function(doc) { if(doc.date && doc.title) { emit(doc.date, doc.title); } } This is a map function, and it is written in JavaScript. If you are not familiar with JavaScript but have used C or any other C-like language such as Java, PHP, or C#, this should look familiar. It is a simple function definition. You provide CouchDB with view functions as strings stored inside the views field of a design document. 
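For readability, here is what such a design document looks like when formatted; the names my_ddoc and my_filter are just this example's choices, and the curl command in the next paragraph creates exactly this document:

   {
       "_id": "_design/my_ddoc",
       "views": {
           "my_filter": {
               "map": "function(doc) { if(doc.date && doc.title) { emit(doc.date, doc.title); }}"
           }
       }
   }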
To create this view you can use this command: curl -X PUT http://admin:password@127.0.0.1:5984/db/_design/my_ddoc -d '{"views":{"my_filter":{"map": "function(doc) { if(doc.date && doc.title) { emit(doc.date, doc.title); }}"}}}' You dont run the JavaScript function yourself. Instead, when you query your view, CouchDB takes the source code and runs it for you on every document in the database your view was defined in. You query your view to retrieve the view result using the following command: curl -X GET http://admin:password@127.0.0.1:5984/db/_design/my_ddoc/_view/my_filter All map functions have a single parameter doc. This is a single docu- ment in the database. Our map function checks whether our document has a date and a title attribute luckily, all of our documents have them and then calls the built-in emit() function with these two attributes as arguments. The emit() function always takes two arguments: the first is key, and the second is value. The emit(key, value) function creates an entry in our view result. One more thing: the emit() function can be called mul- tiple times in the map function to create multiple entries in the view results from a single document, but we are not doing that yet. CouchDB takes whatever you pass into the emit() function and puts it into a list (see Table 1, View results below). Each row in that list includes the key and value. More importantly, the list is sorted by key (by doc.date in our case). The most important feature of a view result is that it is sorted by key. We will come back to that over and over again to do neat things. Stay tuned. Table 1. View results: +---------------------+--------------+ | Key | Value | +---------------------+--------------+ | 2009/01/15 15:52:20 | Hello World | +---------------------+--------------+ | 2009/01/30 18:04:11 | Biking | +---------------------+--------------+ | 2009/02/17 21:13:39 | Bought a Cat | +---------------------+--------------+ When you query your view, CouchDB takes the source code and runs it for you on every document in the database. If you have a lot of documents, that takes quite a bit of time and you might wonder if it is not horri- bly inefficient to do this. Yes, it would be, but CouchDB is designed to avoid any extra costs: it only runs through all documents once, when you first query your view. If a document is changed, the map function is only run once, to recompute the keys and values for that single doc- ument. The view result is stored in a B-tree, just like the structure that is responsible for holding your documents. View B-trees are stored in their own file, so that for high-performance CouchDB usage, you can keep views on their own disk. The B-tree provides very fast lookups of rows by key, as well as efficient streaming of rows in a key range. In our example, a single view can answer all questions that involve time: Give me all the blog posts from last week or last month or this year. Pretty neat. When we query our view, we get back a list of all documents sorted by date. Each row also includes the post title so we can construct links to posts. Table 1 is just a graphical representation of the view re- sult. 
The actual result is JSON-encoded and contains a little more metadata: { "total_rows": 3, "offset": 0, "rows": [ { "key": "2009/01/15 15:52:20", "id": "hello-world", "value": "Hello World" }, { "key": "2009/01/30 18:04:11", "id": "biking", "value": "Biking" }, { "key": "2009/02/17 21:13:39", "id": "bought-a-cat", "value": "Bought a Cat" } ] } Now, the actual result is not as nicely formatted and doesnt include any superfluous whitespace or newlines, but this is better for you (and us!) to read and understand. Where does that id member in the result rows come from? That wasnt there before. Thats because we omitted it earlier to avoid confusion. CouchDB automatically includes the document ID of the document that created the entry in the view result. Well use this as well when constructing links to the blog post pages. WARNING: Do not emit the entire document as the value of your emit(key, value) statement unless youre sure you know you want it. This stores an entire additional copy of your document in the views secondary index. Views with emit(key, doc) take longer to update, longer to write to disk, and consume significantly more disk space. The only advantage is that they are faster to query than using the ?in- clude_docs=true parameter when querying a view. Consider the trade-offs before emitting the entire document. Often it is sufficient to emit only a portion of the document, or just a single key / value pair, in your views. Efficient Lookups Lets move on to the second use case for views: building efficient in- dexes to find documents by any value or structure that resides in them. We already explained the efficient indexing, but we skipped a few de- tails. This is a good time to finish this discussion as we are looking at map functions that are a little more complex. First, back to the B-trees! We explained that the B-tree that backs the key-sorted view result is built only once, when you first query a view, and all subsequent queries will just read the B-tree instead of execut- ing the map function for all documents again. What happens, though, when you change a document, add a new one, or delete one? Easy: CouchDB is smart enough to find the rows in the view result that were created by a specific document. It marks them invalid so that they no longer show up in view results. If the document was deleted, were good the resulting B-tree reflects the state of the database. If a document got updated, the new document is run through the map function and the re- sulting new lines are inserted into the B-tree at the correct spots. New documents are handled in the same way. The B-tree is a very effi- cient data structure for our needs, and the crash-only design of CouchDB databases is carried over to the view indexes as well. To add one more point to the efficiency discussion: usually multiple documents are updated between view queries. The mechanism explained in the previous paragraph gets applied to all changes in the database since the last time the view was queried in a batch operation, which makes things even faster and is generally a better use of your re- sources. Find One On to more complex map functions. We said find documents by any value or structure that resides in them. We already explained how to extract a value by which to sort a list of views (our date field). The same mechanism is used for fast lookups. The URI to query to get a views re- sult is /database/_design/designdocname/_view/viewname. This gives you a list of all rows in the view. 
We have only three documents, so things are small, but with thousands of documents, this can get long. You can add view parameters to the URI to constrain the result set. Say we know the date of a blog post. To find a single document, we would use /blog/_design/docs/_view/by_date?key="2009/01/30 18:04:11" to get the Biking blog post. Remember that you can place whatever you like in the key parameter to the emit() function. Whatever you put in there, we can now use for fast, exact lookups.

Note that in the case where multiple rows have the same key (perhaps we design a view where the key is the name of the post's author), key queries can return more than one row.

Find Many
   We talked about getting all posts for last month. If it's February now, this is as easy as:

      /blog/_design/docs/_view/by_date?startkey="2010/01/01 00:00:00"&endkey="2010/02/00 00:00:00"

   The startkey and endkey parameters specify an inclusive range on which we can search.

   To make things a little nicer and to prepare for a future example, we are going to change the format of our date field. Instead of a string, we are going to use an array, where individual members are part of a timestamp in decreasing significance. This sounds fancy, but it is rather easy. Instead of:

      { "date": "2009/01/31 00:00:00" }

   we use:

      { "date": [2009, 1, 31, 0, 0, 0] }

   Our map function does not have to change for this, but our view result looks a little different. As before, the rows are sorted by key:

   Table 2. New view results:

      +---------------------------+--------------+
      | Key                       | Value        |
      +---------------------------+--------------+
      | [2009, 1, 15, 15, 52, 20] | Hello World  |
      +---------------------------+--------------+
      | [2009, 1, 30, 18, 4, 11]  | Biking       |
      +---------------------------+--------------+
      | [2009, 2, 17, 21, 13, 39] | Bought a Cat |
      +---------------------------+--------------+

   And our queries change to:

      /blog/_design/docs/_view/by_date?startkey=[2010, 1, 1, 0, 0, 0]&endkey=[2010, 2, 1, 0, 0, 0]

   For all you care, this is just a change in syntax, not meaning. But it shows you the power of views. Not only can you construct an index with scalar values like strings and integers, you can also use JSON structures as keys for your views. Say we tag our documents with a list of tags and want to see all tags, but we don't care for documents that have not been tagged.

      {
          ...
          tags: ["cool", "freak", "plankton"],
          ...
      }

      {
          ...
          tags: [],
          ...
      }

      function(doc) {
          if(doc.tags.length > 0) {
              for(var idx in doc.tags) {
                  emit(doc.tags[idx], null);
              }
          }
      }

   This shows a few new things. You can have conditions on structure (if(doc.tags.length > 0)) instead of just values. This is also an example of how a map function calls emit() multiple times per document. And finally, you can pass null instead of a value to the value parameter. The same is true for the key parameter. We'll see in a bit how that is useful.

Reversed Results
   To retrieve view results in reverse order, use the descending=true query parameter. If you are using a startkey parameter, you will find that CouchDB returns different rows or no rows at all. What's up with that?

   It's pretty easy to understand when you see how view query options work under the hood. A view is stored in a tree structure for fast lookups. Whenever you query a view, this is how CouchDB operates:

   1. Starts reading at the top, or at the position that startkey specifies, if present.

   2. Returns one row at a time until the end or until it hits endkey, if present.
If you specify descending=true, the reading direction is reversed, not the sort order of the rows in the view. In addition, the same two-step procedure is followed. Say you have a view result that looks like this: +-----+-------+ | Key | Value | +-----+-------+ | 0 | foo | +-----+-------+ | 1 | bar | +-----+-------+ | 2 | baz | +-----+-------+ Here are potential query options: ?startkey=1&descending=true. What will CouchDB do? See #1 above: it jumps to startkey, which is the row with the key 1, and starts reading backward until it hits the end of the view. So the particular result would be: +-----+-------+ | Key | Value | +-----+-------+ | 1 | bar | +-----+-------+ | 0 | foo | +-----+-------+ This is very likely not what you want. To get the rows with the indexes 1 and 2 in reverse order, you need to switch the startkey to endkey: endkey=1&descending=true: +-----+-------+ | Key | Value | +-----+-------+ | 2 | baz | +-----+-------+ | 1 | bar | +-----+-------+ Now that looks a lot better. CouchDB started reading at the bottom of the view and went backward until it hit endkey. The View to Get Comments for Posts We use an array key here to support the group_level reduce query para- meter. CouchDBs views are stored in the B-tree file structure. Because of the way B-trees are structured, we can cache the intermediate reduce results in the non-leaf nodes of the tree, so reduce queries can be computed along arbitrary key ranges in logarithmic time. See Figure 1, Comments map function. In the blog app, we use group_level reduce queries to compute the count of comments both on a per-post and total basis, achieved by querying the same view index with different methods. With some array keys, and assuming each key has the value 1: ["a","b","c"] ["a","b","e"] ["a","c","m"] ["b","a","c"] ["b","a","g"] the reduce view: function(keys, values, rereduce) { return sum(values) } or: _sum which is a built-in CouchDB reduce function (the others are _count and _stats). _sum here returns the total number of rows between the start and end key. So with startkey=["a","b"]&endkey=["b"] (which includes the first three of the above keys) the result would equal 3. The effect is to count rows. If youd like to count rows without depending on the row value, you can switch on the rereduce parameter: function(keys, values, rereduce) { if (rereduce) { return sum(values); } else { return values.length; } } NOTE: The JavaScript function above could be effectively replaced by the built-in _count. [image: Comments map function] [image] Figure 1. Comments map func- tion.UNINDENT This is the reduce view used by the example app to count comments, while utilizing the map to output the comments, which are more useful than just 1 over and over. It pays to spend some time playing around with map and reduce functions. Fauxton is OK for this, but it doesnt give full access to all the query parameters. Writing your own test code for views in your language of choice is a great way to explore the nuances and capabilities of CouchDBs incremental MapReduce sys- tem. Anyway, with a group_level query, youre basically running a series of reduce range queries: one for each group that shows up at the level you query. Lets reprint the key list from earlier, grouped at level 1: ["a"] 3 ["b"] 2 And at group_level=2: ["a","b"] 2 ["a","c"] 1 ["b","a"] 2 Using the parameter group=true makes it behave as though it were group_level=999, so in the case of our current example, it would give the number 1 for each key, as there are no exactly duplicated keys. 
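To make the blog example in this section concrete, here is a hedged sketch of what such a comments view might look like; the doc.type and doc.post field names follow the comment documents shown later in the joins examples and are assumptions here:

   // map: one row per comment, keyed by [post id, author]
   function(doc) {
       if (doc.type == 'comment') {
           emit([doc.post, doc.author], 1);
       }
   }
   // reduce: _count (or the equivalent JavaScript shown above)

Queried with ?group_level=1, this returns one row per post with its comment count; ?group_level=2 returns one row per [post, author] pair; and a plain reduce query with no grouping returns the total number of comments.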
Reduce/Rereduce We briefly talked about the rereduce parameter to the reduce function. Well explain whats up with it in this section. By now, you should have learned that your view result is stored in B-tree index structure for efficiency. The existence and use of the rereduce parameter is tightly coupled to how the B-tree index works. Consider the map result are: "afrikaans", 1 "afrikaans", 1 "chinese", 1 "chinese", 1 "chinese", 1 "chinese", 1 "french", 1 "italian", 1 "italian", 1 "spanish", 1 "vietnamese", 1 "vietnamese", 1 Example 1. Example view result (mmm, food) When we want to find out how many dishes there are per origin, we can reuse the simple reduce function shown earlier: function(keys, values, rereduce) { return sum(values); } Figure 2, The B-tree index shows a simplified version of what the B-tree index looks like. We abbreviated the key strings. [image: The B-tree index] [image] Figure 2. The B-tree index.UNINDENT The view result is what computer science grads call a pre-order walk through the tree. We look at each element in each node starting from the left. Whenever we see that there is a subnode to descend into, we descend and start reading the elements in that subnode. When we have walked through the entire tree, were done. You can see that CouchDB stores both keys and values inside each leaf node. In our case, it is simply always 1, but you might have a value where you count other results and then all rows have a different value. Whats important is that CouchDB runs all elements that are within a node into the reduce function (setting the rereduce parame- ter to false) and stores the result inside the parent node along with the edge to the subnode. In our case, each edge has a 3 representing the reduce value for the node it points to. NOTE: In reality, nodes have more than 1,600 elements in them. CouchDB computes the result for all the elements in multiple iterations over the elements in a single node, not all at once (which would be dis- astrous for memory consumption). Now lets see what happens when we run a query. We want to know how many chinese entries we have. The query option is simple: ?key="chinese". See Figure 3, The B-tree index reduce result. [image: The B-tree index reduce result] [image] Figure 3. The B-tree index reduce result.UNINDENT CouchDB detects that all values in the subnode include the chinese key. It concludes that it can take just the 3 values associated with that node to compute the final result. It then finds the node left to it and sees that its a node with keys outside the requested range (key= requests a range where the beginning and the end are the same value). It concludes that it has to use the chinese elements value and the other nodes value and run them through the reduce function with the rereduce parameter set to true. The reduce function effectively calculates 3 + 1 at query time and returns the desired result. The next example shows some pseudocode that shows the last invocation of the reduce function with actual values: function(null, [3, 1], true) { return sum([3, 1]); } Now, we said your reduce function must actually reduce your values. If you see the B-tree, it should become obvious what happens when you dont reduce your values. Consider the following map result and reduce func- tion. 
This time we want to get a list of all the unique labels in our view: "abc", "afrikaans" "cef", "afrikaans" "fhi", "chinese" "hkl", "chinese" "ino", "chinese" "lqr", "chinese" "mtu", "french" "owx", "italian" "qza", "italian" "tdx", "spanish" "xfg", "vietnamese" "zul", "vietnamese" We dont care for the key here and only list all the labels we have. Our reduce function removes duplicates: function(keys, values, rereduce) { var unique_labels = {}; values.forEach(function(label) { if(!unique_labels[label]) { unique_labels[label] = true; } }); return unique_labels; } This translates to Figure 4, An overflowing reduce index. We hope you get the picture. The way the B-tree storage works means that if you dont actually reduce your data in the reduce function, you end up having CouchDB copy huge amounts of data around that grow lin- early, if not faster, with the number of rows in your view. CouchDB will be able to compute the final result, but only for views with a few rows. Anything larger will experience a ridiculously slow view build time. To help with that, CouchDB since version 0.10.0 will throw an error if your reduce function does not reduce its input val- ues. [image: An overflowing reduce index] [image] Figure 4. An overflowing reduce index.UNINDENT One vs. Multiple Design Documents A common question is: when should I split multiple views into multiple design documents, or keep them together? Each view you create corresponds to one B-tree. All views in a single design document will live in the same set of index files on disk (one file per database shard; in 2.0+ by default, 8 files per node). The most practical consideration for separating views into separate documents is how often you change those views. Views that change often, and are in the same design document as other views, will invalidate those other views indexes when the design document is written, forcing them all to rebuild from scratch. Obviously you will want to avoid this in production! However, when you have multiple views with the same map function in the same design document, CouchDB will optimize and only calculate that map function once. This lets you have two views with different reduce func- tions (say, one with _sum and one with _stats) but build only a single copy of the mapped index. It also saves disk space and the time to write multiple copies to disk. Another benefit of having multiple views in the same design document is that the index files can keep a single index of backwards references from docids to rows. CouchDB needs these back refs to invalidate rows in a view when a document is deleted (otherwise, a delete would force a total rebuild!) One other consideration is that each separate design document will spawn another (set of) couchjs processes to generate the view, one per shard. Depending on the number of cores on your server(s), this may be efficient (using all of the idle cores you have) or inefficient (over- loading the CPU on your servers). The exact situation will depend on your deployment architecture. So, should you use one or multiple design documents? The choice is yours. Lessons Learned • If you dont use the key field in the map function, you are probably doing it wrong. • If you are trying to make a list of values unique in the reduce func- tions, you are probably doing it wrong. • If you dont reduce your values to a single scalar value or a small fixed-sized object or array with a fixed number of scalar values of small sizes, you are probably doing it wrong. 
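To tie the last lesson back to the overflowing unique-labels example above: the usual fix is to move the uniqueness work into the map key and let grouping do the rest. A minimal sketch, assuming a doc.label field:

   function(doc) {
       if (doc.label) {
           emit(doc.label, null);
       }
   }
   // reduce: _count

Queried with ?group=true, this returns one row per distinct label together with how often it occurs, and the reduce value stays a small scalar at every level of the B-tree.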
Wrapping Up Map functions are side effectfree functions that take a document as ar- gument and emit key/value pairs. CouchDB stores the emitted rows by constructing a sorted B-tree index, so row lookups by key, as well as streaming operations across a range of rows, can be accomplished in a small memory and processing footprint, while writes avoid seeks. Gener- ating a view takes O(N), where N is the total number of rows in the view. However, querying a view is very quick, as the B-tree remains shallow even when it contains many, many keys. Reduce functions operate on the sorted rows emitted by map view func- tions. CouchDBs reduce functionality takes advantage of one of the fundamental properties of B-tree indexes: for every leaf node (a sorted row), there is a chain of internal nodes reaching back to the root. Each leaf node in the B-tree carries a few rows (on the order of tens, depending on row size), and each internal node may link to a few leaf nodes or other internal nodes. The reduce function is run on every node in the tree in order to calcu- late the final reduce value. The end result is a reduce function that can be incrementally updated upon changes to the map function, while recalculating the reduction values for a minimum number of nodes. The initial reduction is calculated once per each node (inner and leaf) in the tree. When run on leaf nodes (which contain actual map rows), the reduce functions third parameter, rereduce, is false. The arguments in this case are the keys and values as output by the map function. The func- tion has a single returned reduction value, which is stored on the in- ner node that a working set of leaf nodes have in common, and is used as a cache in future reduce calculations. When the reduce function is run on inner nodes, the rereduce flag is true. This allows the function to account for the fact that it will be receiving its own prior output. When rereduce is true, the values passed to the function are intermediate reduction values as cached from previous calculations. When the tree is more than two levels deep, the rereduce phase is repeated, consuming chunks of the previous levels output until the final reduce value is calculated at the root node. A common mistake new CouchDB users make is attempting to construct com- plex aggregate values with a reduce function. Full reductions should result in a scalar value, like 5, and not, for instance, a JSON hash with a set of unique keys and the count of each. The problem with this approach is that youll end up with a very large final value. The number of unique keys can be nearly as large as the number of total keys, even for a large set. It is fine to combine a few scalar calculations into one reduce function; for instance, to find the total, average, and standard deviation of a set of numbers in a single function. If youre interested in pushing the edge of CouchDBs incremental reduce functionality, have a look at Googles paper on Sawzall, which gives ex- amples of some of the more exotic reductions that can be accomplished in a system with similar constraints. Views Collation Basics View functions specify a key and a value to be returned for each row. CouchDB collates the view rows by this key. In the following example, the LastName property serves as the key, thus the result will be sorted by LastName: function(doc) { if (doc.Type == "customer") { emit(doc.LastName, {FirstName: doc.FirstName, Address: doc.Address}); } } CouchDB allows arbitrary JSON structures to be used as keys. 
You can use JSON arrays as keys for fine-grained control over sorting and grouping.

Examples
   The following clever trick would return both customer and order documents. The key is composed of a customer _id and a sorting token. Because the key for order documents begins with the _id of a customer document, all the orders will be sorted by customer. Because the sorting token for customers is lower than the token for orders, the customer document will come before the associated orders. The values 0 and 1 for the sorting token are arbitrary.

      function(doc) {
          if (doc.Type == "customer") {
              emit([doc._id, 0], null);
          } else if (doc.Type == "order") {
              emit([doc.customer_id, 1], null);
          }
      }

   To list a specific customer with _id XYZ, and all of that customer's orders, limit the startkey and endkey ranges to cover only documents for that customer's _id:

      startkey=["XYZ"]&endkey=["XYZ", {}]

   It is not recommended to emit the document itself in the view. Instead, to include the bodies of the documents when requesting the view, request the view with ?include_docs=true.

Sorting by Dates
   It may be convenient to store date attributes in a human-readable format (i.e. as a string), but still sort by date. This can be done by converting the date to a number in the emit() function. For example, given a document with a created_at attribute of 'Wed Jul 23 16:29:21 +0100 2013', the following emit call would sort by date:

      emit(new Date(doc.created_at).getTime(), null);

   Alternatively, if you use a date format which sorts lexicographically, such as "2013/06/09 13:52:11 +0000", you can just

      emit(doc.created_at, null);

   and avoid the conversion. As a bonus, this date format is compatible with the JavaScript date parser, so you can use new Date(doc.created_at) in your client-side JavaScript to make date sorting easy in the browser.

String Ranges
   If you need start and end keys that encompass every string with a given prefix, it is better to use a high-value Unicode character than a 'ZZZZ' suffix. That is, rather than:

      startkey="abc"&endkey="abcZZZZZZZZZ"

   you should use:

      startkey="abc"&endkey="abc\ufff0"

Collation Specification
   This section is based on the view_collation function in view_collation.js:

      // special values sort before all other types
      null
      false
      true

      // then numbers
      1
      2
      3.0
      4

      // then text, case sensitive
      "a"
      "A"
      "aa"
      "b"
      "B"
      "ba"
      "bb"

      // then arrays, compared element by element until different.
      // Longer arrays sort after their prefixes
      ["a"]
      ["b"]
      ["b","c"]
      ["b","c", "a"]
      ["b","d"]
      ["b","d", "e"]

      // then objects, comparing each key value in the list until different.
      // larger objects sort after their subset objects.
      {a:1}
      {a:2}
      {b:1}
      {b:2}
      {b:2, a:1}

      // Member order does matter for collation.
      // CouchDB preserves member order
      // but doesn't require that clients will.
      // this test might fail if used with a js engine
      // that doesn't preserve order
      {b:2, c:2}

   Comparison of strings is done using ICU, which implements the Unicode Collation Algorithm, giving a dictionary sorting of keys. This can give surprising results if you were expecting ASCII ordering.
Note that: • All symbols sort before numbers and letters (even the high symbols like tilde, 0x7e) • Differing sequences of letters are compared without regard to case, so a < aa but also A < aa and a < AA • Identical sequences of letters are compared with regard to case, with lowercase before uppercase, so a < A You can demonstrate the collation sequence for 7-bit ASCII characters like this: require 'rubygems' require 'restclient' require 'json' DB="http://adm:pass@127.0.0.1:5984/collator" RestClient.delete DB rescue nil RestClient.put "#{DB}","" (32..126).each do |c| RestClient.put "#{DB}/#{c.to_s(16)}", {"x"=>c.chr}.to_json end RestClient.put "#{DB}/_design/test", <<EOS { "views":{ "one":{ "map":"function (doc) { emit(doc.x,null); }" } } } EOS puts RestClient.get("#{DB}/_design/test/_view/one") This shows the collation sequence to be: ` ^ _ - , ; : ! ? . ' " ( ) [ ] { } @ * / \ & # % + < = > | ~ $ 0 1 2 3 4 5 6 7 8 9 a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P q Q r R s S t T u U v V w W x X y Y z Z Key ranges Take special care when querying key ranges. For example: the query: startkey="Abc"&endkey="AbcZZZZ" will match ABC and abc1, but not abc. This is because UCA sorts as: abc < Abc < ABC < abc1 < AbcZZZZZ For most applications, to avoid problems you should lowercase the startkey: startkey="abc"&endkey="abcZZZZZZZZ" will match all keys starting with [aA][bB][cC] Complex keys The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with foo in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However it will not match ["foo",{"an":"ob- ject"}] _all_docs The _all_docs view is a special case because it uses ASCII collation for doc ids, not UCA: startkey="_design/"&endkey="_design/ZZZZZZZZ" will not find _design/abc because Z comes before a in the ASCII se- quence. A better solution is: startkey="_design/"&endkey="_design0" Raw collation To squeeze a little more performance out of views, you can specify "op- tions":{"collation":"raw"} within the view definition for native Er- lang collation, especially if you dont require UCA. This gives a dif- ferent collation sequence: 1 false null true {"a":"a"}, ["a"] "a" Beware that {} is no longer a suitable high key sentinel value. Use a string like "\ufff0" instead. Joins With Views Linked Documents If your map function emits an object value which has {'_id': XXX} and you query view with include_docs=true parameter, then CouchDB will fetch the document with id XXX rather than the document which was processed to emit the key/value pair. This means that if one document contains the ids of other documents, it can cause those documents to be fetched in the view too, adjacent to the same key if required. 
For example, if you have the following hierarchically-linked documents: [ { "_id": "11111" }, { "_id": "22222", "ancestors": ["11111"], "value": "hello" }, { "_id": "33333", "ancestors": ["22222","11111"], "value": "world" } ] You can emit the values with the ancestor documents adjacent to them in the view like this: function(doc) { if (doc.value) { emit([doc.value, 0], null); if (doc.ancestors) { for (var i in doc.ancestors) { emit([doc.value, Number(i)+1], {_id: doc.ancestors[i]}); } } } } The result you get is: { "total_rows": 5, "offset": 0, "rows": [ { "id": "22222", "key": [ "hello", 0 ], "value": null, "doc": { "_id": "22222", "_rev": "1-0eee81fecb5aa4f51e285c621271ff02", "ancestors": [ "11111" ], "value": "hello" } }, { "id": "22222", "key": [ "hello", 1 ], "value": { "_id": "11111" }, "doc": { "_id": "11111", "_rev": "1-967a00dff5e02add41819138abb3284d" } }, { "id": "33333", "key": [ "world", 0 ], "value": null, "doc": { "_id": "33333", "_rev": "1-11e42b44fdb3d3784602eca7c0332a43", "ancestors": [ "22222", "11111" ], "value": "world" } }, { "id": "33333", "key": [ "world", 1 ], "value": { "_id": "22222" }, "doc": { "_id": "22222", "_rev": "1-0eee81fecb5aa4f51e285c621271ff02", "ancestors": [ "11111" ], "value": "hello" } }, { "id": "33333", "key": [ "world", 2 ], "value": { "_id": "11111" }, "doc": { "_id": "11111", "_rev": "1-967a00dff5e02add41819138abb3284d" } } ] } which makes it very cheap to fetch a document plus all its ancestors in one query. Note that the "id" in the row is still that of the originating docu- ment. The only difference is that include_docs fetches a different doc. The current revision of the document is resolved at query time, not at the time the view is generated. This means that if a new revision of the linked document is added later, it will appear in view queries even though the view itself hasnt changed. To force a specific revision of a linked document to be used, emit a "_rev" property as well as "_id". Using View Collation Author Christopher Lenz Date 2007-10-05 Source http://www.cmlenz.net/archives/2007/10/couchdb-joins Just today, there was a discussion on IRC on how youd go about modeling a simple blogging system with post and comment entities, where any blog post might have N comments. If youd be using an SQL database, youd ob- viously have two tables with foreign keys and youd be using joins. (At least until you needed to add some denormalization). But what would the obvious approach in CouchDB look like? Approach #1: Comments Inlined A simple approach would be to have one document per blog post, and store the comments inside that document: { "_id": "myslug", "_rev": "123456", "author": "john", "title": "My blog post", "content": "Bla bla bla ", "comments": [ {"author": "jack", "content": ""}, {"author": "jane", "content": ""} ] } NOTE: Of course the model of an actual blogging system would be more ex- tensive, youd have tags, timestamps, etc, etc. This is just to demonstrate the basics. The obvious advantage of this approach is that the data that belongs together is stored in one place. Delete the post, and you automatically delete the corresponding comments, and so on. You may be thinking that putting the comments inside the blog post doc- ument would not allow us to query for the comments themselves, but youd be wrong. 
You could trivially write a CouchDB view that would return all comments across all blog posts, keyed by author: function(doc) { for (var i in doc.comments) { emit(doc.comments[i].author, doc.comments[i].content); } } Now you could list all comments by a particular user by invoking the view and passing it a ?key="username" query string parameter. However, this approach has a drawback that can be quite significant for many applications: To add a comment to a post, you need to: • Fetch the blog post document • Add the new comment to the JSON structure • Send the updated document to the server Now if you have multiple client processes adding comments at roughly the same time, some of them will get a HTTP 409 Conflict error on step 3 (thats optimistic concurrency in action). For some applications this makes sense, but in many other apps, youd want to append new related data regardless of whether other data has been added in the meantime. The only way to allow non-conflicting addition of related data is by putting that related data into separate documents. Approach #2: Comments Separate Using this approach youd have one document per blog post, and one docu- ment per comment. The comment documents would have a backlink to the post they belong to. The blog post document would look similar to the above, minus the com- ments property. Also, wed now have a type property on all our documents so that we can tell the difference between posts and comments: { "_id": "myslug", "_rev": "123456", "type": "post", "author": "john", "title": "My blog post", "content": "Bla bla bla " } The comments themselves are stored in separate documents, which also have a type property (this time with the value comment), and addition- ally feature a post property containing the ID of the post document they belong to: { "_id": "ABCDEF", "_rev": "123456", "type": "comment", "post": "myslug", "author": "jack", "content": "" } { "_id": "DEFABC", "_rev": "123456", "type": "comment", "post": "myslug", "author": "jane", "content": "" } To list all comments per blog post, youd add a simple view, keyed by blog post ID: function(doc) { if (doc.type == "comment") { emit(doc.post, {author: doc.author, content: doc.content}); } } And youd invoke that view passing it a ?key="post_id" query string pa- rameter. Viewing all comments by author is just as easy as before: function(doc) { if (doc.type == "comment") { emit(doc.author, {post: doc.post, content: doc.content}); } } So this is better in some ways, but it also has a disadvantage. Imag- ine you want to display a blog post with all the associated comments on the same web page. With our first approach, we needed just a single re- quest to the CouchDB server, namely a GET request to the document. With this second approach, we need two requests: a GET request to the post document, and a GET request to the view that returns all comments for the post. That is okay, but not quite satisfactory. Just imagine you wanted to add threaded comments: youd now need an additional fetch per comment. What wed probably want then would be a way to join the blog post and the various comments together to be able to retrieve them with a single HTTP request. This was when Damien Katz, the author of CouchDB, chimed in to the dis- cussion on IRC to show us the way. 
Optimization: Using the Power of View Collation Obvious to Damien, but not at all obvious to the rest of us: its fairly simple to make a view that includes both the content of the blog post document, and the content of all the comments associated with that post. The way you do that is by using complex keys. Until now weve been using simple string values for the view keys, but in fact they can be arbitrary JSON values, so lets make some use of that: function(doc) { if (doc.type == "post") { emit([doc._id, 0], null); } else if (doc.type == "comment") { emit([doc.post, 1], null); } } Okay, this may be confusing at first. Lets take a step back and look at what views in CouchDB are really about. CouchDB views are basically highly efficient on-disk dictionaries that map keys to values, where the key is automatically indexed and can be used to filter and/or sort the results you get back from your views. When you invoke a view, you can say that youre only interested in a subset of the view rows by specifying a ?key=foo query string parame- ter. Or you can specify ?startkey=foo and/or ?endkey=bar query string parameters to fetch rows over a range of keys. Finally, by adding ?in- clude_docs=true to the query, the result will include the full body of each emitted document. Its also important to note that keys are always used for collating (i.e. sorting) the rows. CouchDB has well defined (but as of yet un- documented) rules for comparing arbitrary JSON objects for collation. For example, the JSON value ["foo", 2] is sorted after (considered greater than) the values ["foo"] or ["foo", 1, "bar"], but before e.g. ["foo", 2, "bar"]. This feature enables a whole class of tricks that are rather non-obvious SEE ALSO: Views Collation With that in mind, lets return to the view function above. First note that, unlike the previous view functions weve used here, this view han- dles both post and comment documents, and both of them end up as rows in the same view. Also, the key in this view is not just a simple string, but an array. The first element in that array is always the ID of the post, regardless of whether were processing an actual post docu- ment, or a comment associated with a post. The second element is 0 for post documents, and 1 for comment documents. Lets assume we have two blog posts in our database. Without limiting the view results via key, startkey, or endkey, wed get back something like the following: { "total_rows": 5, "offset": 0, "rows": [{ "id": "myslug", "key": ["myslug", 0], "value": null }, { "id": "ABCDEF", "key": ["myslug", 1], "value": null }, { "id": "DEFABC", "key": ["myslug", 1], "value": null }, { "id": "other_slug", "key": ["other_slug", 0], "value": null }, { "id": "CDEFAB", "key": ["other_slug", 1], "value": null }, ] } NOTE: The ... placeholders here would contain the complete JSON encoding of the corresponding documents Now, to get a specific blog post and all associated comments, wed in- voke that view with the query string: ?startkey=["myslug"]&endkey=["myslug", 2]&include_docs=true Wed get back the first three rows, those that belong to the myslug post, but not the others, along with the full bodies of each document. Et voila, we now have the data we need to display a post with all asso- ciated comments, retrieved via a single GET request. You may be asking what the 0 and 1 parts of the keys are for. Theyre simply to ensure that the post document is always sorted before the as- sociated comment documents. 
So when you get back the results from this view for a specific post, youll know that the first row contains the data for the blog post itself, and the remaining rows contain the com- ment data. One remaining problem with this model is that comments are not ordered, but thats simply because we dont have date/time information associated with them. If we had, wed add the timestamp as third element of the key array, probably as ISO date/time strings. Now we would continue us- ing the query string ?startkey=["myslug"]&endkey=["myslug", 2]&in- clude_docs=true to fetch the blog post and all associated comments, only now theyd be in chronological order. View Cookbook for SQL Jockeys This is a collection of some common SQL queries and how to get the same result in CouchDB. The key to remember here is that CouchDB does not work like an SQL database at all, and that best practices from the SQL world do not translate well or at all to CouchDB. This documents cook- book assumes that you are familiar with the CouchDB basics such as cre- ating and updating databases and documents. Using Views How you would do this in SQL: CREATE TABLE or: ALTER TABLE How you can do this in CouchDB? Using views is a two-step process. First you define a view; then you query it. This is analogous to defining a table structure (with in- dexes) using CREATE TABLE or ALTER TABLE and querying it using an SQL query. Defining a View Defining a view is done by creating a special document in a CouchDB database. The only real specialness is the _id of the document, which starts with _design/ for example, _design/application. Other than that, it is just a regular CouchDB document. To make sure CouchDB un- derstands that you are defining a view, you need to prepare the con- tents of that design document in a special format. Here is an example: { "_id": "_design/application", "_rev": "1-C1687D17", "views": { "viewname": { "map": "function(doc) { ... }", "reduce": "function(keys, values) { ... }" } } } We are defining a view viewname. The definition of the view consists of two functions: the map function and the reduce function. Specifying a reduce function is optional. Well look at the nature of the functions later. Note that viewname can be whatever you like: users, by-name, or by-date are just some examples. A single design document can also include multiple view definitions, each identified by a unique name: { "_id": "_design/application", "_rev": "1-C1687D17", "views": { "viewname": { "map": "function(doc) { ... }", "reduce": "function(keys, values) { ... }" }, "anotherview": { "map": "function(doc) { ... }", "reduce": "function(keys, values) { ... }" } } } Querying a View The name of the design document and the name of the view are signifi- cant for querying the view. To query the view viewname, you perform an HTTP GET request to the following URI: /database/_design/application/_view/viewname database is the name of the database you created your design document in. Next up is the design document name, and then the view name pre- fixed with _view/. To query anotherview, replace viewname in that URI with anotherview. If you want to query a view in a different design document, adjust the design document name. MapReduce Functions MapReduce is a concept that solves problems by applying a two-step process, aptly named the map phase and the reduce phase. The map phase looks at all documents in CouchDB separately one after the other and creates a map result. The map result is an ordered list of key/value pairs. 
Both key and value can be specified by the user writing the map function. A map function may call the built-in emit(key, value) func- tion 0 to N times per document, creating a row in the map result per invocation. CouchDB is smart enough to run a map function only once for every docu- ment, even on subsequent queries on a view. Only changes to documents or new documents need to be processed anew. Map functions Map functions run in isolation for every document. They cant modify the document, and they cant talk to the outside worldthey cant have side effects. This is required so that CouchDB can guarantee correct re- sults without having to recalculate a complete result when only one document gets changed. The map result looks like this: {"total_rows":3,"offset":0,"rows":[ {"id":"fc2636bf50556346f1ce46b4bc01fe30","key":"Lena","value":5}, {"id":"1fb2449f9b9d4e466dbfa47ebe675063","key":"Lisa","value":4}, {"id":"8ede09f6f6aeb35d948485624b28f149","key":"Sarah","value":6} ]} It is a list of rows sorted by the value of key. The id is added auto- matically and refers back to the document that created this row. The value is the data youre looking for. For example purposes, its the girls age. The map function that produces this result is: function(doc) { if(doc.name && doc.age) { emit(doc.name, doc.age); } } It includes the if statement as a sanity check to ensure that were op- erating on the right fields and calls the emit function with the name and age as the key and value. Look Up by Key How you would do this in SQL: SELECT field FROM table WHERE value="searchterm" How you can do this in CouchDB? Use case: get a result (which can be a record or set of records) asso- ciated with a key (searchterm). To look something up quickly, regardless of the storage mechanism, an index is needed. An index is a data structure optimized for quick search and retrieval. CouchDBs map result is stored in such an index, which happens to be a B+ tree. To look up a value by searchterm, we need to put all values into the key of a view. All we need is a simple map function: function(doc) { if(doc.value) { emit(doc.value, null); } } This creates a list of documents that have a value field sorted by the data in the value field. To find all the records that match searchterm, we query the view and specify the search term as a query parameter: /database/_design/application/_view/viewname?key="searchterm" Consider the documents from the previous section, and say were indexing on the age field of the documents to find all the five-year-olds: function(doc) { if(doc.age && doc.name) { emit(doc.age, doc.name); } } Query: /ladies/_design/ladies/_view/age?key=5 Result: {"total_rows":3,"offset":1,"rows":[ {"id":"fc2636bf50556346f1ce46b4bc01fe30","key":5,"value":"Lena"} ]} Easy. Note that you have to emit a value. The view result includes the asso- ciated document ID in every row. We can use it to look up more data from the document itself. We can also use the ?include_docs=true para- meter to have CouchDB fetch the individual documents for us. Look Up by Prefix How you would do this in SQL: SELECT field FROM table WHERE value LIKE "searchterm%" How you can do this in CouchDB? Use case: find all documents that have a field value that starts with searchterm. For example, say you stored a MIME type (like text/html or image/jpg) for each document and now you want to find all documents that are images according to the MIME type. 
The solution is very similar to the previous example: all we need is a map function that is a little more clever than the first one. But first, an example document: { "_id": "Hugh Laurie", "_rev": "1-9fded7deef52ac373119d05435581edf", "mime-type": "image/jpg", "description": "some dude" } The clue lies in extracting the prefix that we want to search for from our document and putting it into our view index. We use a regular ex- pression to match our prefix: function(doc) { if(doc["mime-type"]) { // from the start (^) match everything that is not a slash ([^\/]+) until // we find a slash (\/). Slashes needs to be escaped with a backslash (\/) var prefix = doc["mime-type"].match(/^[^\/]+\//); if(prefix) { emit(prefix, null); } } } We can now query this view with our desired MIME type prefix and not only find all images, but also text, video, and all other formats: /files/_design/finder/_view/by-mime-type?key="image/" Aggregate Functions How you would do this in SQL: SELECT COUNT(field) FROM table How you can do this in CouchDB? Use case: calculate a derived value from your data. We havent explained reduce functions yet. Reduce functions are similar to aggregate functions in SQL. They compute a value over multiple docu- ments. To explain the mechanics of reduce functions, well create one that doesnt make a whole lot of sense. But this example is easy to under- stand. Well explore more useful reductions later. Reduce functions operate on the output of the map function (also called the map result or intermediate result). The reduce functions job, un- surprisingly, is to reduce the list that the map function produces. Heres what our summing reduce function looks like: function(keys, values) { var sum = 0; for(var idx in values) { sum = sum + values[idx]; } return sum; } Heres an alternate, more idiomatic JavaScript version: function(keys, values) { var sum = 0; values.forEach(function(element) { sum = sum + element; }); return sum; } NOTE: Dont miss effective built-in reduce functions like _sum and _count This reduce function takes two arguments: a list of keys and a list of values. For our summing purposes we can ignore the keys-list and con- sider only the value list. Were looping over the list and add each item to a running total that were returning at the end of the function. Youll see one difference between the map and the reduce function. The map function uses emit() to create its result, whereas the reduce func- tion returns a value. For example, from a list of integer values that specify the age, calcu- late the sum of all years of life for the news headline, 786 life years present at event. A little contrived, but very simple and thus good for demonstration purposes. Consider the documents and the map view we used earlier in this document. The reduce function to calculate the total age of all girls is: function(keys, values) { return sum(values); } Note that, instead of the two earlier versions, we use CouchDBs prede- fined sum() function. It does the same thing as the other two, but it is such a common piece of code that CouchDB has it included. The result for our reduce view now looks like this: {"rows":[ {"key":null,"value":15} ]} The total sum of all age fields in all our documents is 15. Just what we wanted. The key member of the result object is null, as we cant know anymore which documents took part in the creation of the reduced result. Well cover more advanced reduce cases later on. As a rule of thumb, the reduce function should reduce to a single scalar value. 
That is, an integer; a string; or a small, fixed-size list or object that includes an aggregated value (or values) from the values argument. It should never just return values or similar. CouchDB will give you a warning if you try to use reduce the wrong way: { "error":"reduce_overflow_error", "message":"Reduce output must shrink more rapidly: Current output: ..." } Get Unique Values How you would do this in SQL: SELECT DISTINCT field FROM table How you can do this in CouchDB? Getting unique values is not as easy as adding a keyword. But a reduce view and a special query parameter give us the same result. Lets say you want a list of tags that your users have tagged themselves with and no duplicates. First, lets look at the source documents. We punt on _id and _rev at- tributes here: { "name":"Chris", "tags":["mustache", "music", "couchdb"] } { "name":"Noah", "tags":["hypertext", "philosophy", "couchdb"] } { "name":"Jan", "tags":["drums", "bike", "couchdb"] } Next, we need a list of all tags. A map function will do the trick: function(doc) { if(doc.name && doc.tags) { doc.tags.forEach(function(tag) { emit(tag, null); }); } } The result will look like this: {"total_rows":9,"offset":0,"rows":[ {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"bike","value":null}, {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"couchdb","value":null}, {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"couchdb","value":null}, {"id":"da5ea89448a4506925823f4d985aabbd","key":"couchdb","value":null}, {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"drums","value":null}, {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"hypertext","value":null}, {"id":"da5ea89448a4506925823f4d985aabbd","key":"music","value":null}, {"id":"da5ea89448a4506925823f4d985aabbd","key":"mustache","value":null}, {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"philosophy","value":null} ]} As promised, these are all the tags, including duplicates. Since each document gets run through the map function in isolation, it cannot know if the same key has been emitted already. At this stage, we need to live with that. To achieve uniqueness, we need a reduce: function(keys, values) { return true; } This reduce doesnt do anything, but it allows us to specify a special query parameter when querying the view: /dudes/_design/dude-data/_view/tags?group=true CouchDB replies: {"rows":[ {"key":"bike","value":true}, {"key":"couchdb","value":true}, {"key":"drums","value":true}, {"key":"hypertext","value":true}, {"key":"music","value":true}, {"key":"mustache","value":true}, {"key":"philosophy","value":true} ]} In this case, we can ignore the value part because it is always true, but the result includes a list of all our tags and no duplicates! With a small change we can put the reduce to good use, too. Lets see how many of the non-unique tags are there for each tag. To calculate the tag frequency, we just use the summing up we already learned about. 
In the map function, we emit a 1 instead of null: function(doc) { if(doc.name && doc.tags) { doc.tags.forEach(function(tag) { emit(tag, 1); }); } } In the reduce function, we return the sum of all values: function(keys, values) { return sum(values); } Now, if we query the view with the ?group=true parameter, we get back the count for each tag: {"rows":[ {"key":"bike","value":1}, {"key":"couchdb","value":3}, {"key":"drums","value":1}, {"key":"hypertext","value":1}, {"key":"music","value":1}, {"key":"mustache","value":1}, {"key":"philosophy","value":1} ]} Enforcing Uniqueness How you would do this in SQL: UNIQUE KEY(column) How you can do this in CouchDB? Use case: your applications require that a certain value exists only once in a database. This is an easy one: within a CouchDB database, each document must have a unique _id field. If you require unique values in a database, just assign them to a documents _id field and CouchDB will enforce unique- ness for you. Theres one caveat, though: in the distributed case, when you are run- ning more than one CouchDB node that accepts write requests, uniqueness can be guaranteed only per node or outside of CouchDB. CouchDB will al- low two identical IDs to be written to two different nodes. On replica- tion, CouchDB will detect a conflict and flag the document accordingly. Pagination Recipe This recipe explains how to paginate over view results. Pagination is a user interface (UI) pattern that allows the display of a large number of rows (the result set) without loading all the rows into the UI at once. A fixed-size subset, the page, is displayed along with next and previous links or buttons that can move the viewport over the result set to an adjacent page. We assume youre familiar with creating and querying documents and views as well as the multiple view query options. Example Data To have some data to work with, well create a list of bands, one docu- ment per band: { "name":"Biffy Clyro" } { "name":"Foo Fighters" } { "name":"Tool" } { "name":"Nirvana" } { "name":"Helmet" } { "name":"Tenacious D" } { "name":"Future of the Left" } { "name":"A Perfect Circle" } { "name":"Silverchair" } { "name":"Queens of the Stone Age" } { "name":"Kerub" } A View We need a simple map function that gives us an alphabetical list of band names. This should be easy, but were adding extra smarts to filter out The and A in front of band names to put them into the right posi- tion: function(doc) { if(doc.name) { var name = doc.name.replace(/^(A|The) /, ""); emit(name, null); } } The views result is an alphabetical list of band names. Now say we want to display band names five at a time and have a link pointing to the next five names that make up one page, and a link for the previous five, if were not on the first page. We learned how to use the startkey, limit, and skip parameters in ear- lier documents. Well use these again here. 
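If you want to follow along, the map function above needs to live in a design document before it can be queried. Here is a minimal sketch using curl; it assumes the band documents were saved into a database named artists (the same database, design document, and view names are used in the queries that follow), and that the database already exists:

    # store the by-name view in the artists design document (assumes the artists database exists)
    curl -X PUT 'http://adm:pass@127.0.0.1:5984/artists/_design/artists' \
         -H 'Content-Type: application/json' \
         -d '{"views": {"by-name": {"map": "function(doc) { if(doc.name) { var name = doc.name.replace(/^(A|The) /, \"\"); emit(name, null); } }"}}}'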
First, lets have a look at the full result set: {"total_rows":11,"offset":0,"rows":[ {"id":"a0746072bba60a62b01209f467ca4fe2","key":"Biffy Clyro","value":null}, {"id":"b47d82284969f10cd1b6ea460ad62d00","key":"Foo Fighters","value":null}, {"id":"45ccde324611f86ad4932555dea7fce0","key":"Tenacious D","value":null}, {"id":"d7ab24bb3489a9010c7d1a2087a4a9e4","key":"Future of the Left","value":null}, {"id":"ad2f85ef87f5a9a65db5b3a75a03cd82","key":"Helmet","value":null}, {"id":"a2f31cfa68118a6ae9d35444fcb1a3cf","key":"Nirvana","value":null}, {"id":"67373171d0f626b811bdc34e92e77901","key":"Kerub","value":null}, {"id":"3e1b84630c384f6aef1a5c50a81e4a34","key":"Perfect Circle","value":null}, {"id":"84a371a7b8414237fad1b6aaf68cd16a","key":"Queens of the Stone Age","value":null}, {"id":"dcdaf08242a4be7da1a36e25f4f0b022","key":"Silverchair","value":null}, {"id":"fd590d4ad53771db47b0406054f02243","key":"Tool","value":null} ]} Setup The mechanics of paging are very simple: • Display first page • If there are more rows to show, show next link • Draw subsequent page • If this is not the first page, show a previous link • If there are more rows to show, show next link Or in a pseudo-JavaScript snippet: var result = new Result(); var page = result.getPage(); page.display(); if(result.hasPrev()) { page.display_link('prev'); } if(result.hasNext()) { page.display_link('next'); } Paging To get the first five rows from the view result, you use the ?limit=5 query parameter: curl -X GET 'http://adm:pass@127.0.0.1:5984/artists/_design/artists/_view/by-name?limit=5' The result: {"total_rows":11,"offset":0,"rows":[ {"id":"a0746072bba60a62b01209f467ca4fe2","key":"Biffy Clyro","value":null}, {"id":"b47d82284969f10cd1b6ea460ad62d00","key":"Foo Fighters","value":null}, {"id":"45ccde324611f86ad4932555dea7fce0","key":"Tenacious D","value":null}, {"id":"d7ab24bb3489a9010c7d1a2087a4a9e4","key":"Future of the Left","value":null}, {"id":"ad2f85ef87f5a9a65db5b3a75a03cd82","key":"Helmet","value":null} ]} By comparing the total_rows value to our limit value, we can determine if there are more pages to display. We also know by the offset member that we are on the first page. We can calculate the value for skip= to get the results for the next page: var rows_per_page = 5; var page = (offset / rows_per_page) + 1; // == 1 var skip = page * rows_per_page; // == 5 for the first page, 10 for the second ... So we query CouchDB with: curl -X GET 'http://adm:pass@127.0.0.1:5984/artists/_design/artists/_view/by-name?limit=5&skip=5' Note we have to use ' (single quotes) to escape the & character that is special to the shell we execute curl in. The result: {"total_rows":11,"offset":5,"rows":[ {"id":"a2f31cfa68118a6ae9d35444fcb1a3cf","key":"Nirvana","value":null}, {"id":"67373171d0f626b811bdc34e92e77901","key":"Kerub","value":null}, {"id":"3e1b84630c384f6aef1a5c50a81e4a34","key":"Perfect Circle","value":null}, {"id":"84a371a7b8414237fad1b6aaf68cd16a","key":"Queens of the Stone Age", "value":null}, {"id":"dcdaf08242a4be7da1a36e25f4f0b022","key":"Silverchair","value":null} ]} Implementing the hasPrev() and hasNext() method is pretty straightfor- ward: function hasPrev() { return page > 1; } function hasNext() { var last_page = Math.floor(total_rows / rows_per_page) + (total_rows % rows_per_page); return page != last_page; } Paging (Alternate Method) The method described above performed poorly with large skip values un- til CouchDB 1.2. Additionally, some use cases may call for the follow- ing alternate method even with newer versions of CouchDB. 
One such case is when duplicate results should be prevented. Using skip alone it is possible for new documents to be inserted during pagination which could change the offset of the start of the subsequent page. A correct solution is not much harder. Instead of slicing the result set into equally sized pages, we look at 10 rows at a time and use startkey to jump to the next 10 rows. We even use skip, but only with the value 1. Here is how it works: • Request rows_per_page + 1 rows from the view • Display rows_per_page rows, store + 1 row as next_startkey and next_startkey_docid • As page information, keep startkey and next_startkey • Use the next_* values to create the next link, and use the others to create the previous link The trick to finding the next page is pretty simple. Instead of re- questing 10 rows for a page, you request 11 rows, but display only 10 and use the values in the 11th row as the startkey for the next page. Populating the link to the previous page is as simple as carrying the current startkey over to the next page. If theres no previous startkey, we are on the first page. We stop displaying the link to the next page if we get rows_per_page or less rows back. This is called linked list pagination, as we go from page to page, or list item to list item, in- stead of jumping directly to a pre-computed page. There is one caveat, though. Can you spot it? CouchDB view keys do not have to be unique; you can have multiple index entries read. What if you have more index entries for a key than rows that should be on a page? startkey jumps to the first row, and youd be screwed if CouchDB didnt have an additional parameter for you to use. All view keys with the same value are internally sorted by docid, that is, the ID of the document that created that view row. You can use the startkey_docid and endkey_docid parameters to get subsets of these rows. For pagination, we still dont need endkey_docid, but startkey_do- cid is very handy. In addition to startkey and limit, you also use startkey_docid for pagination if, and only if, the extra row you fetch to find the next page has the same key as the current startkey. It is important to note that the *_docid parameters only work in addi- tion to the *key parameters and are only useful to further narrow down the result set of a view for a single key. They do not work on their own (the one exception being the built-in _all_docs view that already sorts by document ID). The advantage of this approach is that all the key operations can be performed on the super-fast B-tree index behind the view. Looking up a page doesnt include scanning through hundreds and thousands of rows un- necessarily. Jump to Page One drawback of the linked list style pagination is that you cant pre-compute the rows for a particular page from the page number and the rows per page. Jumping to a specific page doesnt really work. Our gut reaction, if that concern is raised, is, Not even Google is doing that! and we tend to get away with it. Google always pretends on the first page to find 10 more pages of results. Only if you click on the second page (something very few people actually do) might Google display a re- duced set of pages. If you page through the results, you get links for the previous and next 10 pages, but no more. Pre-computing the neces- sary startkey and startkey_docid for 20 pages is a feasible operation and a pragmatic optimization to know the rows for every page in a re- sult set that is potentially tens of thousands of rows long, or more. 
If you really do need to jump to a page over the full range of docu- ments (we have seen applications that require that), you can still maintain an integer value index as the view index and take a hybrid ap- proach at solving pagination. Search Search indexes enable you to query a database by using the Lucene Query Parser Syntax. A search index uses one, or multiple, fields from your documents. You can use a search index to run queries, find documents based on the content they contain, or work with groups, facets, or geo- graphical searches. WARNING: Search cannot function unless it has a functioning, cluster-con- nected Clouseau instance. See Search Plugin Installation for de- tails. To create a search index, you add a JavaScript function to a design document in the database. An index builds after processing one search request or after the server detects a document update. The index func- tion takes the following parameters: 1. Field name - The name of the field you want to use when you query the index. If you set this parameter to default, then this field is queried if no field is specified in the query syntax. 2. Data that you want to index, for example, doc.address.country. 3. (Optional) The third parameter includes the following fields: boost, facet, index, and store. These fields are described in more detail later. By default, a search index response returns 25 rows. The number of rows that is returned can be changed by using the limit parameter. Each re- sponse includes a bookmark field. You can include the value of the bookmark field in later queries to look through the responses. Example design document that defines a search index: { "_id": "_design/search_example", "indexes": { "animals": { "index": "function(doc){ ... }" } } } A search index will inherit the partitioning type from the options.par- titioned field of the design document that contains it. Index functions Attempting to index by using a data field that does not exist fails. To avoid this problem, use the appropriate guard clause. NOTE: Your indexing functions operate in a memory-constrained environment where the document itself forms a part of the memory that is used in that environment. Your codes stack and document must fit inside this memory. In other words, a document must be loaded in order to be in- dexed. Documents are limited to a maximum size of 64 MB. NOTE: Within a search index, do not index the same field name with more than one data type. If the same field name is indexed with different data types in the same search index function, you might get an error when querying the search index that says the field was indexed with- out position data. For example, do not include both of these lines in the same search index function, as they index the myfield field as two different data types: a string "this is a string" and a num- ber 123. index("myfield", "this is a string"); index("myfield", 123); The function that is contained in the index field is a JavaScript func- tion that is called for each document in the database. The function takes the document as a parameter, extracts some data from it, and then calls the function that is defined in the index field to index that data. The index function takes three parameters, where the third parameter is optional. 1. The first parameter is the name of the field you intend to use when querying the index, and which is specified in the Lucene syntax por- tion of subsequent queries. 
An example appears in the following query: query=color:red The Lucene field name color is the first parameter of the index function. The query parameter can be abbreviated to q, so another way of writ- ing the query is as follows: q=color:red If the special value "default" is used when you define the name, you do not have to specify a field name at query time. The effect is that the query can be simplified: query=red 2. The second parameter is the data to be indexed. Keep the following information in mind when you index your data: • This data must be only a string, number, or boolean. Other types will cause an error to be thrown by the index function call. • If an error is thrown when running your function, for this reason or others, the document will not be added to that search index. 3. The third, optional, parameter is a JavaScript object with the fol- lowing fields: Index function (optional parameter) • boost - A number that specifies the relevance in search results. Content that is indexed with a boost value greater than 1 is more relevant than content that is indexed without a boost value. Con- tent with a boost value less than one is not so relevant. Value is a positive floating point number. Default is 1 (no boosting). • facet - Creates a faceted index. See Faceting. Values are true or false. Default is false. • index - Whether the data is indexed, and if so, how. If set to false, the data cannot be used for searches, but can still be re- trieved from the index if store is set to true. See Analyzers. Values are true or false. Default is true • store - If true, the value is returned in the search result; oth- erwise, the value is not returned. Values are true or false. De- fault is false. NOTE: If you do not set the store parameter, the index data results for the document are not returned in response to a query. Example search index function: function(doc) { index("default", doc._id); if (doc.min_length) { index("min_length", doc.min_length, {"store": true}); } if (doc.diet) { index("diet", doc.diet, {"store": true}); } if (doc.latin_name) { index("latin_name", doc.latin_name, {"store": true}); } if (doc.class) { index("class", doc.class, {"store": true}); } } Index guard clauses The index function requires the name of the data field to index as the second parameter. However, if that data field does not exist for the document, an error occurs. The solution is to use an appropriate guard clause that checks if the field exists, and contains the expected type of data, before any attempt to create the corresponding index. Example of failing to check whether the index data field exists: if (doc.min_length) { index("min_length", doc.min_length, {"store": true}); } You might use the JavaScript typeof function to implement the guard clause test. If the field exists and has the expected type, the correct type name is returned, so the guard clause test succeeds and it is safe to use the index function. If the field does not exist, you would not get back the expected type of the field, therefore you would not at- tempt to index the field. 
JavaScript considers a result to be false if one of the following val- ues is tested: • undefined • null • The number +0 • The number -0 • NaN (not a number) • (the empty string) Using a guard clause to check whether the required data field exists, and holds a number, before an attempt to index: if (typeof(doc.min_length) === 'number') { index("min_length", doc.min_length, {"store": true}); } Use a generic guard clause test to ensure that the type of the candi- date data field is defined. Example of a generic guard clause: if (typeof(doc.min_length) !== 'undefined') { // The field exists, and does have a type, so we can proceed to index using it. ... } Analyzers Analyzers are settings that define how to recognize terms within text. Analyzers can be helpful if you need to index multiple languages. Heres the list of generic analyzers, and their descriptions, that are supported by search: • classic - The standard Lucene analyzer, circa release 3.1. • email - Like the standard analyzer, but tries harder to match an email address as a complete token. • keyword - Input is not tokenized at all. • simple - Divides text at non-letters. • standard - The default analyzer. It implements the Word Break rules from the Unicode Text Segmentation algorithm • whitespace - Divides text at white space boundaries. Example analyzer document: { "_id": "_design/analyzer_example", "indexes": { "INDEX_NAME": { "index": "function (doc) { ... }", "analyzer": "$ANALYZER_NAME" } } } Language-specific analyzers These analyzers omit common words in the specific language, and many also remove prefixes and suffixes. The name of the language is also the name of the analyzer. See package org.apache.lucene.analysis for more information. +------------+---------------------------+ | Language | Analyzer | +------------+---------------------------+ | arabic | org.apache.lucene.analy- | | | sis.ar.ArabicAnalyzer | +------------+---------------------------+ | armenian | org.apache.lucene.analy- | | | sis.hy.ArmenianAnalyzer | +------------+---------------------------+ | basque | org.apache.lucene.analy- | | | sis.eu.BasqueAnalyzer | +------------+---------------------------+ | bulgarian | org.apache.lucene.analy- | | | sis.bg.BulgarianAnalyzer | +------------+---------------------------+ | brazilian | org.apache.lucene.analy- | | | sis.br.BrazilianAnalyzer | +------------+---------------------------+ | catalan | org.apache.lucene.analy- | | | sis.ca.CatalanAnalyzer | +------------+---------------------------+ | cjk | org.apache.lucene.analy- | | | sis.cjk.CJKAnalyzer | +------------+---------------------------+ | chinese | org.apache.lucene.analy- | | | sis.cn.smart.SmartChine- | | | seAnalyzer | +------------+---------------------------+ | czech | org.apache.lucene.analy- | | | sis.cz.CzechAnalyzer | +------------+---------------------------+ | danish | org.apache.lucene.analy- | | | sis.da.DanishAnalyzer | +------------+---------------------------+ | dutch | org.apache.lucene.analy- | | | sis.nl.DutchAnalyzer | +------------+---------------------------+ | english | org.apache.lucene.analy- | | | sis.en.EnglishAnalyzer | +------------+---------------------------+ | finnish | org.apache.lucene.analy- | | | sis.fi.FinnishAnalyzer | +------------+---------------------------+ | french | org.apache.lucene.analy- | | | sis.fr.FrenchAnalyzer | +------------+---------------------------+ | german | org.apache.lucene.analy- | | | sis.de.GermanAnalyzer | +------------+---------------------------+ | greek | org.apache.lucene.analy- | | | 
sis.el.GreekAnalyzer | +------------+---------------------------+ | galician | org.apache.lucene.analy- | | | sis.gl.GalicianAnalyzer | +------------+---------------------------+ | hindi | org.apache.lucene.analy- | | | sis.hi.HindiAnalyzer | +------------+---------------------------+ | hungarian | org.apache.lucene.analy- | | | sis.hu.HungarianAnalyzer | +------------+---------------------------+ | indonesian | org.apache.lucene.analy- | | | sis.id.IndonesianAnalyzer | +------------+---------------------------+ | irish | org.apache.lucene.analy- | | | sis.ga.IrishAnalyzer | +------------+---------------------------+ | italian | org.apache.lucene.analy- | | | sis.it.ItalianAnalyzer | +------------+---------------------------+ | japanese | org.apache.lucene.analy- | | | sis.ja.JapaneseAnalyzer | +------------+---------------------------+ | japanese | org.apache.lucene.analy- | | | sis.ja.JapaneseTokenizer | +------------+---------------------------+ | latvian | org.apache.lucene.analy- | | | sis.lv.LatvianAnalyzer | +------------+---------------------------+ | norwegian | org.apache.lucene.analy- | | | sis.no.NorwegianAnalyzer | +------------+---------------------------+ | persian | org.apache.lucene.analy- | | | sis.fa.PersianAnalyzer | +------------+---------------------------+ | polish | org.apache.lucene.analy- | | | sis.pl.PolishAnalyzer | +------------+---------------------------+ | portuguese | org.apache.lucene.analy- | | | sis.pt.PortugueseAnalyzer | +------------+---------------------------+ | romanian | org.apache.lucene.analy- | | | sis.ro.RomanianAnalyzer | +------------+---------------------------+ | russian | org.apache.lucene.analy- | | | sis.ru.RussianAnalyzer | +------------+---------------------------+ | spanish | org.apache.lucene.analy- | | | sis.es.SpanishAnalyzer | +------------+---------------------------+ | swedish | org.apache.lucene.analy- | | | sis.sv.SwedishAnalyzer | +------------+---------------------------+ | thai | org.apache.lucene.analy- | | | sis.th.ThaiAnalyzer | +------------+---------------------------+ | turkish | org.apache.lucene.analy- | | | sis.tr.TurkishAnalyzer | +------------+---------------------------+ NOTE: The japanese analyzer, org.apache.lucene.analysis.ja.JapaneseTok- enizer, includes DEFAULT_MODE and defaultStopTags. NOTE: Language-specific analyzers are optimized for the specified lan- guage. You cannot combine a generic analyzer with a language-spe- cific analyzer. Instead, you might use a per field analyzer to se- lect different analyzers for different fields within the documents. Per-field analyzers The perfield analyzer configures multiple analyzers for different fields. Example of defining different analyzers for different fields: { "_id": "_design/analyzer_example", "indexes": { "INDEX_NAME": { "analyzer": { "name": "perfield", "default": "english", "fields": { "spanish": "spanish", "german": "german" } }, "index": "function (doc) { ... }" } } } Stop words Stop words are words that do not get indexed. You define them within a design document by turning the analyzer string into an object. NOTE: The keyword, simple, and whitespace analyzers do not support stop words. 
The default stop words for the standard analyzer are included below: "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with" Example of defining non-indexed (stop) words: { "_id": "_design/stop_words_example", "indexes": { "INDEX_NAME": { "analyzer": { "name": "portuguese", "stopwords": [ "foo", "bar", "baz" ] }, "index": "function (doc) { ... }" } } } Testing analyzer tokenization You can test the results of analyzer tokenization by posting sample data to the _search_analyze endpoint. Example of using HTTP to test the keyword analyzer: POST /_search_analyze HTTP/1.1 Content-Type: application/json {"analyzer":"keyword", "text":"ablanks@renovations.com"} Example of using the command line to test the keyword analyzer: curl 'https://$HOST:5984/_search_analyze' -H 'Content-Type: application/json' -d '{"analyzer":"keyword", "text":"ablanks@renovations.com"}' Result of testing the keyword analyzer: { "tokens": [ "ablanks@renovations.com" ] } Example of using HTTP to test the standard analyzer: POST /_search_analyze HTTP/1.1 Content-Type: application/json {"analyzer":"standard", "text":"ablanks@renovations.com"} Example of using the command line to test the standard analyzer: curl 'https://$HOST:5984/_search_analyze' -H 'Content-Type: application/json' -d '{"analyzer":"standard", "text":"ablanks@renovations.com"}' Result of testing the standard analyzer: { "tokens": [ "ablanks", "renovations.com" ] } Queries After you create a search index, you can query it. • Issue a partition query using: GET /$DATABASE/_partition/$PARTI- TION_KEY/_design/$DDOC/_search/$INDEX_NAME • Issue a global query using: GET /$DATABASE/_design/$DDOC/_search/$IN- DEX_NAME Specify your search by using the query parameter. Example of using HTTP to query a partitioned index: GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1 Content-Type: application/json Example of using HTTP to query a global index: GET /$DATABASE/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1 Content-Type: application/json Example of using the command line to query a partitioned index: curl https://$HOST:5984/$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/ _search/$INDEX_NAME?include_docs=true\&query="*:*"\&limit=1 \ Example of using the command line to query a global index: curl https://$HOST:5984/$DATABASE/_design/$DDOC/_search/$INDEX_NAME? include_docs=true\&query="*:*"\&limit=1 \ Query Parameters A full list of query parameters can be found in the API Reference. You must enable faceting before you can use the following parameters: • counts • drilldown • ranges NOTE: Do not combine the bookmark and stale options. These options con- strain the choice of shard replicas to use for the response. When used together, the options might cause problems when contact is at- tempted with replicas that are slow or not available. Relevance When more than one result might be returned, it is possible for them to be sorted. By default, the sorting order is determined by relevance. Relevance is measured according to Apache Lucene Scoring. As an exam- ple, if you search a simple database for the word example, two docu- ments might contain the word. 
If one document mentions the word example 10 times, but the second document mentions it only twice, then the first document is considered to be more relevant. If you do not provide a sort parameter, relevance is used by default. The highest scoring matches are returned first. If you provide a sort parameter, then matches are returned in that or- der, ignoring relevance. If you want to use a sort parameter, and also include ordering by rele- vance in your search results, use the special fields -<score> or <score> within the sort parameter. POSTing search queries Instead of using the GET HTTP method, you can also use POST. The main advantage of POST queries is that they can have a request body, so you can specify the request as a JSON object. Each parameter in the query string of a GET request corresponds to a field in the JSON object in the request body. Example of using HTTP to POST a search request: POST /db/_design/ddoc/_search/searchname HTTP/1.1 Content-Type: application/json Example of using the command line to POST a search request: curl 'https://$HOST:5984/db/_design/ddoc/_search/searchname' -X POST -H 'Content-Type: application/json' -d @search.json Example JSON document that contains a search request: { "q": "index:my query", "sort": "foo", "limit": 3 } Query syntax The CouchDB search query syntax is based on the Lucene syntax. Search queries take the form of name:value unless the name is omitted, in which case they use the default field, as demonstrated in the following examples: Example search query expressions: // Birds class:bird // Animals that begin with the letter "l" l* // Carnivorous birds class:bird AND diet:carnivore // Herbivores that start with letter "l" l* AND diet:herbivore // Medium-sized herbivores min_length:[1 TO 3] AND diet:herbivore // Herbivores that are 2m long or less diet:herbivore AND min_length:[-Infinity TO 2] // Mammals that are at least 1.5m long class:mammal AND min_length:[1.5 TO Infinity] // Find "Meles meles" latin_name:"Meles meles" // Mammals who are herbivore or carnivore diet:(herbivore OR omnivore) AND class:mammal // Return all results *:* Queries over multiple fields can be logically combined, and groups and fields can be further grouped. The available logical operators are case-sensitive and are AND, +, OR, NOT and -. Range queries can run over strings or numbers. If you want a fuzzy search, you can run a query with ~ to find terms like the search term. For instance, look~ finds the terms book and took. NOTE: If the lower and upper bounds of a range query are both strings that contain only numeric digits, the bounds are treated as numbers not as strings. For example, if you search by using the query mod_date:["20170101" TO "20171231"], the results include documents for which mod_date is between the numeric values 20170101 and 20171231, not between the strings 20170101 and 20171231. You can alter the importance of a search term by adding ^ and a posi- tive number. This alteration makes matches containing the term more or less relevant, proportional to the power of the boost value. The de- fault value is 1, which means no increase or decrease in the strength of the match. A decimal value of 0 - 1 reduces importance. making the match strength weaker. A value greater than one increases importance, making the match strength stronger. Wildcard searches are supported, for both single (?) and multiple (*) character searches. For example, dat? would match date and data, whereas dat* would match date, data, database, and dates. 
Wildcards must come after the search term. Use *:* to return all results. If the search query does not specify the "group_field" argument, the response contains a bookmark. If this bookmark is later provided as a URL parameter, the response skips the rows that were seen already, mak- ing it quick and easy to get the next set of results. NOTE: The response never includes a bookmark if the "group_field" parame- ter is included in the search query. See group_field parameter. NOTE: The group_field, group_limit, and group_sort options are only avail- able when making global queries. The following characters require escaping if you want to search on them: + - && || ! ( ) { } [ ] ^ " ~ * ? : \ / To escape one of these characters, use a preceding backslash character (\). The response to a search query contains an order field for each of the results. The order field is an array where the first element is the field or fields that are specified in the sort parameter. See the sort parameter. If no sort parameter is included in the query, then the or- der field contains the Lucene relevance score. If you use the sort by distance feature as described in geographical searches, then the first element is the distance from a point. The distance is measured by using either kilometers or miles. NOTE: The second element in the order array can be ignored. It is used for troubleshooting purposes only. Faceting CouchDB Search also supports faceted searching, enabling discovery of aggregate information about matches quickly and easily. You can match all documents by using the special ?q=*:* query syntax, and use the re- turned facets to refine your query. To indicate that a field must be indexed for faceted queries, set {"facet": true} in its options. Example of search query, specifying that faceted search is enabled: function(doc) { index("type", doc.type, {"facet": true}); index("price", doc.price, {"facet": true}); } To use facets, all the documents in the index must include all the fields that have faceting enabled. If your documents do not include all the fields, you receive a bad_request error with the following reason, The field_name does not exist. If each document does not contain all the fields for facets, create separate indexes for each field. If you do not create separate indexes for each field, you must include only documents that contain all the fields. Verify that the fields exist in each document by using a single if statement. Example if statement to verify that the required fields exist in each document: if (typeof doc.town == "string" && typeof doc.name == "string") { index("town", doc.town, {facet: true}); index("name", doc.name, {facet: true}); } Counts NOTE: The counts option is only available when making global queries. The counts facet syntax takes a list of fields, and returns the number of query results for each unique value of each named field. NOTE: The count operation works only if the indexed values are strings. The indexed values cannot be mixed types. For example, if 100 strings are indexed, and one number, then the index cannot be used for count operations. You can check the type by using the typeof operator, and convert it by using the parseInt, parseFloat, or .toString() functions. 
Example of a query using the counts facet syntax: ?q=*:*&counts=["type"] Example response after using of the counts facet syntax: { "total_rows":100000, "bookmark":"g...", "rows":[...], "counts":{ "type":{ "sofa": 10, "chair": 100, "lamp": 97 } } } Drilldown NOTE: The drilldown option is only available when making global queries. You can restrict results to documents with a dimension equal to the specified label. Restrict the results by adding drilldown=["dimen- sion","label"] to a search query. You can include multiple drilldown parameters to restrict results along multiple dimensions. GET /things/_design/inventory/_search/fruits?q=*:*&drilldown=["state","old"]&drilldown=["item","apple"]&include_docs=true HTTP/1.1 For better language interoperability, you can achieve the same by sup- plying a list of lists: GET /things/_design/inventory/_search/fruits?q=*:*&drilldown=[["state","old"],["item","apple"]]&include_docs=true HTTP/1.1 You can also supply a list of lists for drilldown in bodies of POST re- quests. Note that, multiple values for a single key in a drilldown means an OR relation between them and there is an AND relation between multiple keys. Using a drilldown parameter is similar to using key:value in the q pa- rameter, but the drilldown parameter returns values that the analyzer might skip. For example, if the analyzer did not index a stop word like "a", using drilldown returns it when you specify drilldown=["key","a"]. Ranges NOTE: The ranges option is only available when making global queries. The range facet syntax reuses the standard Lucene syntax for ranges to return counts of results that fit into each specified category. Inclu- sive range queries are denoted by brackets ([, ]). Exclusive range queries are denoted by curly brackets ({, }). NOTE: The range operation works only if the indexed values are numbers. The indexed values cannot be mixed types. For example, if 100 strings are indexed, and one number, then the index cannot be used for range operations. You can check the type by using the typeof op- erator, and convert it by using the parseInt, parseFloat, or .toString() functions. Example of a request that uses faceted search for matching ranges: ?q=*:*&ranges={"price":{"cheap":"[0 TO 100]","expensive":"{100 TO Infinity}"}} Example results after a ranges check on a faceted search: { "total_rows":100000, "bookmark":"g...", "rows":[...], "ranges": { "price": { "expensive": 278682, "cheap": 257023 } } } Geographical searches In addition to searching by the content of textual fields, you can also sort your results by their distance from a geographic coordinate using Lucenes built-in geospatial capabilities. To sort your results in this way, you must index two numeric fields, representing the longitude and latitude. NOTE: You can also sort your results by their distance from a geographic coordinate using Lucenes built-in geospatial capabilities. You can then query by using the special <distance...> sort field, which takes five parameters: • Longitude field name: The name of your longitude field (mylon in the example). • Latitude field name: The name of your latitude field (mylat in the example). • Longitude of origin: The longitude of the place you want to sort by distance from. • Latitude of origin: The latitude of the place you want to sort by distance from. • Units: The units to use: km for kilometers or mi for miles. The dis- tance is returned in the order field. 
You can combine sorting by distance with any other search query, such as range searches on the latitude and longitude, or queries that in- volve non-geographical information. That way, you can search in a bounding box, and narrow down the search with extra criteria. Example geographical data: { "name":"Aberdeen, Scotland", "lat":57.15, "lon":-2.15, "type":"city" } Example of a design document that contains a search index for the geo- graphic data: function(doc) { if (doc.type && doc.type == 'city') { index('city', doc.name, {'store': true}); index('lat', doc.lat, {'store': true}); index('lon', doc.lon, {'store': true}); } } An example of using HTTP for a query that sorts cities in the northern hemisphere by their distance to New York: GET /examples/_design/cities-designdoc/_search/cities?q=lat:[0+TO+90]&sort="<distance,lon,lat,-74.0059,40.7127,km>" HTTP/1.1 An example of using the command line for a query that sorts cities in the northern hemisphere by their distance to New York: curl 'https://$HOST:5984/examples/_design/cities-designdoc/_search/cities?q=lat:[0+TO+90]&sort="<distance,lon,lat,-74.0059,40.7127,km>"' Example (abbreviated) response, containing a list of northern hemi- sphere cities sorted by distance to New York: { "total_rows": 205, "bookmark": "g1A...XIU", "rows": [ { "id": "city180", "order": [ 8.530665755719783, 18 ], "fields": { "city": "New York, N.Y.", "lat": 40.78333333333333, "lon": -73.96666666666667 } }, { "id": "city177", "order": [ 13.756343205985946, 17 ], "fields": { "city": "Newark, N.J.", "lat": 40.733333333333334, "lon": -74.16666666666667 } }, { "id": "city178", "order": [ 113.53603438866077, 26 ], "fields": { "city": "New Haven, Conn.", "lat": 41.31666666666667, "lon": -72.91666666666667 } } ] } Highlighting search terms Sometimes it is useful to get the context in which a search term was mentioned so that you can display more emphasized results to a user. To get more emphasized results, add the highlight_fields parameter to the search query. Specify the field names for which you would like ex- cerpts, with the highlighted search term returned. By default, the search term is placed in <em> tags to highlight it, but the highlight can be overridden by using the highlights_pre_tag and highlights_post_tag parameters. The length of the fragments is 100 characters by default. A different length can be requested with the highlights_size parameter. The highlights_number parameter controls the number of fragments that are returned, and defaults to 1. In the response, a highlights field is added, with one subfield per field name. For each field, you receive an array of fragments with the search term highlighted. NOTE: For highlighting to work, store the field in the index by using the store: true option. Example of using HTTP to search with highlighting enabled: GET /movies/_design/searches/_search/movies?q=movie_name:Azazel&highlight_fields=["movie_name"]&highlight_pre_tag="**"&highlight_post_tag="**"&highlights_size=30&highlights_number=2 HTTP/1.1 Authorization: ... Example of using the command line to search with highlighting enabled: curl "https://$HOST:5984/movies/_design/searches/_search/movies?q=movie_name:Azazel&highlight_fields=\[\"movie_name\"\]&highlight_pre_tag=\"**\"&highlight_post_tag=\"**\"&highlights_size=30&highlights_number=2 Example of highlighted search results: { "highlights": { "movie_name": [ " on the Azazel Orient Express", " Azazel manuals, you" ] } } Nouveau WARNING: Nouveau is an experimental feature. 
Future releases might change how the endpoints work and might invalidate existing indexes. Nouveau indexes enable you to query a database by using the Lucene Query Parser Syntax. A nouveau index uses one, or multiple, fields from your documents. You can use a nouveau index to run queries to find documents based on the content they contain. WARNING: Nouveau cannot function unless it has a functioning Nouveau server. See Nouveau Server Installation for details. To create a nouveau index, you add a JavaScript function to a design document in the database. An index builds after processing one search request or after the server detects a document update. The index func- tion takes the following parameters: 1. Field type - The type of the field, can be string, text, double or stored. See Field Types for more information. 2. Field name - The name of the field you want to use when you query the index. If you set this parameter to default, then this field is queried if no field is specified in the query syntax. 3. Data that you want to index, for example, doc.address.country. 4. (Optional) The third parameter includes the following field: store. By default, a nouveau index response returns 25 rows. The number of hits that are returned can be changed by using the limit parameter. Each response includes a bookmark field. You can include the value of the bookmark field in subsequent queries to fetch results from deeper in the result set. Example design document that defines a nouveau index: { "_id": "_design/nouveau_example", "nouveau": { "animals": { "index": "function(doc){ ... }" } } } A nouveau index will inherit the partitioning type from the op- tions.partitioned field of the design document that contains it. Field Types Nouveau currently supports four field types, each of which has differ- ent semantics to the others. Text A text field is the most common field type, the field value is analyzed at index time to permit efficient querying by the indi- vidual words within it (and wildcards, and regex, etc). This field type is not appropriate for sorting, range queries and faceting. String A string field indexes the fields value as a single token with- out analysis (that is, no case-folding, no common suffixes are removed, etc). This field type is recommended for sorting and faceting. You can search on string fields but you must specify the keyword analyzer in the index definition for this field to ensure that your queries are not analyzed. Double A double field requires a number value and is appropriate for sorting, range queries and range faceting. Stored A stored field stores the field value into the index without analysis. The value is returned with search results but you can- not search, sort, range or facet over a stored field. WARNING: the type of any specific field is determined by the first index call. Attempts to index a different type into the same field will throw an exception and prevent the index from building. Index functions Attempting to index by using a data field that does not exist fails. To avoid this problem, use the appropriate guard clause. NOTE: Your indexing functions operate in a memory-constrained environment where the document itself forms a part of the memory that is used in that environment. Your codes stack and document must fit inside this memory. In other words, a document must be loaded in order to be in- dexed. Documents are limited to a maximum size of 64 MB. 
The function that is contained in the index field is a JavaScript function that is called for each document in the database. The function takes the document as a parameter, extracts some data from it, and then calls the built-in index function to index that data.

The index function takes four parameters, where the fourth parameter is optional.

1. The first parameter is the type of the field.

2. The second parameter is the name of the field you intend to use when querying the index, and which is specified in the Lucene syntax portion of subsequent queries. An example appears in the following query:

      q=color:red

   The Lucene field name color is the second parameter of the index function.

   If the special value "default" is used when you define the name, you do not have to specify a field name at query time. The effect is that the query can be simplified:

      q=red

3. The third parameter is the data to be indexed. Keep the following information in mind when you index your data:

   • This data must be only a string, number, or boolean. Other types will cause an error to be thrown by the index function call.

   • If an error is thrown when running your function, for this reason or others, the document will not be added to that search index.

4. The fourth, optional, parameter is a JavaScript object with the following fields:

   Index function (optional parameter)

   • store - If true, the value is returned in the search result; otherwise, the value is not returned. Values are true or false. Default is false.

NOTE:
   If you do not set the store parameter, the index data results for the document are not returned in response to a query.

Example search index function:

   function(doc) {
       if (typeof(doc.min_length) == 'number') {
           index("double", "min_length", doc.min_length, {"store": true});
       }
       if (typeof(doc.diet) == 'string') {
           index("string", "diet", doc.diet, {"store": true});
       }
       if (typeof(doc.latin_name) == 'string') {
           index("string", "latin_name", doc.latin_name, {"store": true});
       }
       if (typeof(doc.class) == 'string') {
           index("string", "class", doc.class, {"store": true});
       }
   }

Index guard clauses

Runtime errors in the index function cause the document not to be indexed at all. The most common runtime errors are described below.

Example of failing to check whether the indexed value exists:

   WARNING: example of bad code
   index("double", "min_length", doc.min_length, {"store": true});

For documents without a min_length value, this index call will pass undefined as the value. This will be rejected by nouveau's validation function and the document will not be indexed.

Example of failing to check whether the nested indexed value exists:

   WARNING: example of bad code
   if (doc.foo.bar) {
       index("string", "bar", doc.foo.bar, {"store": true});
   }

This bad example fails in a different way if doc.foo doesn't exist; the evaluation of doc.foo.bar throws an exception.

   if (doc.foo && typeof(doc.foo) == 'object' && typeof(doc.foo.bar) == 'string') {
       index("string", "bar", doc.foo.bar, {"store": true});
   }

This example correctly checks that doc.foo is an object and its bar entry is a string.
Example of checking that the index value exists but disallowing valid false values:

   WARNING: example of bad code
   if (doc.min_length) {
       index("double", "min_length", doc.min_length, {"store": true});
   }

This corrects the previous mistake, so documents without min_length are indexed (assuming there are other index calls for values that do exist), but it accidentally prevents the indexing of the min_length field if doc.min_length happens to be 0.

   if (typeof(doc.min_length) == 'number') {
       index("double", "min_length", doc.min_length, {"store": true});
   }

This good example ensures we index any document where min_length is a number.

Analyzers

Analyzers convert textual input into tokens which can be searched on. Analyzers typically have different rules for how they break up input into tokens: they might convert all text to lower case, they might omit whole words (typically words so common they are unlikely to be useful for searching), and they might omit parts of words (removing "ing" suffixes in English, for example).

We expose a large number of Lucene's analyzers, and we provide one of our own (simple_asciifolding):

• arabic
• armenian
• basque
• bulgarian
• catalan
• chinese
• cjk
• classic
• czech
• danish
• dutch
• email
• english
• finnish
• french
• galician
• german
• hindi
• hungarian
• indonesian
• irish
• italian
• japanese
• keyword
• latvian
• norwegian
• persian
• polish
• portugese
• romanian
• russian
• simple
• simple_asciifolding
• spanish
• standard
• swedish
• thai
• turkish
• whitespace

Example analyzer document:

   {
       "_id": "_design/analyzer_example",
       "nouveau": {
           "INDEX_NAME": {
               "index": "function (doc) { ... }",
               "default_analyzer": "$ANALYZER_NAME"
           }
       }
   }

Field analyzers

You may optionally specify a different analyzer for a specific field.

Example of defining different analyzers for different fields:

   {
       "_id": "_design/analyzer_example",
       "nouveau": {
           "INDEX_NAME": {
               "default_analyzer": "english",
               "field_analyzers": {
                   "spanish": "spanish",
                   "german": "german"
               },
               "index": "function (doc) { ... }"
           }
       }
   }

Testing analyzer tokenization

You can test the results of analyzer tokenization by posting sample data to the _nouveau_analyze endpoint.

Example of using HTTP to test the keyword analyzer:

   POST /_nouveau_analyze HTTP/1.1
   Content-Type: application/json
   {"analyzer":"keyword", "text":"ablanks@renovations.com"}

Example of using the command line to test the keyword analyzer:

   curl 'https://$HOST:5984/_nouveau_analyze' -H 'Content-Type: application/json' -d '{"analyzer":"keyword", "text":"ablanks@renovations.com"}'

Result of testing the keyword analyzer:

   {
       "tokens": [
           "ablanks@renovations.com"
       ]
   }

Example of using HTTP to test the standard analyzer:

   POST /_nouveau_analyze HTTP/1.1
   Content-Type: application/json
   {"analyzer":"standard", "text":"ablanks@renovations.com"}

Example of using the command line to test the standard analyzer:

   curl 'https://$HOST:5984/_nouveau_analyze' -H 'Content-Type: application/json' -d '{"analyzer":"standard", "text":"ablanks@renovations.com"}'

Result of testing the standard analyzer:

   {
       "tokens": [
           "ablanks",
           "renovations.com"
       ]
   }

Queries

After you create a search index, you can query it.

• Issue a partition query using: GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_nouveau/$INDEX_NAME

• Issue a global query using: GET /$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME

Specify your search by using the q parameter.
Example of using HTTP to query a partitioned index: GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_nouveau/$INDEX_NAME?include_docs=true&q=*:*&limit=1 HTTP/1.1 Content-Type: application/json Example of using HTTP to query a global index: GET /$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME?include_docs=true&q=*:*&limit=1 HTTP/1.1 Content-Type: application/json Example of using the command line to query a partitioned index: curl https://$HOST:5984/$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/ _nouveau/$INDEX_NAME?include_docs=true\&q=*:*\&limit=1 \ Example of using the command line to query a global index: curl https://$HOST:5984/$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME? include_docs=true\&q=*:*\&limit=1 \ Query Parameters A full list of query parameters can be found in the API Reference. NOTE: Do not combine the bookmark and update options. These options con- strain the choice of shard replicas to use for the response. When used together, the options might cause problems when contact is at- tempted with replicas that are slow or not available. Relevance When more than one result might be returned, it is possible for them to be sorted. By default, the sorting order is determined by relevance. Relevance is measured according to Apache Lucene Scoring. As an exam- ple, if you search a simple database for the word example, two docu- ments might contain the word. If one document mentions the word example 10 times, but the second document mentions it only twice, then the first document is considered to be more relevant. If you do not provide a sort parameter, relevance is used by default. The highest scoring matches are returned first. If you provide a sort parameter, then matches are returned in that or- der, ignoring relevance. If you want to use a sort parameter, and also include ordering by rele- vance in your search results, use the special fields -<score> or <score> within the sort parameter. POSTing search queries Instead of using the GET HTTP method, you can also use POST. The main advantage of POST queries is that they can have a request body, so you can specify the request as a JSON object. Each parameter in the query string of a GET request corresponds to a field in the JSON object in the request body. Example of using HTTP to POST a search request: POST /db/_design/ddoc/_nouveau/searchname HTTP/1.1 Content-Type: application/json Example of using the command line to POST a search request: curl 'https://$HOST:5984/db/_design/ddoc/_nouveau/searchname' -X POST -H 'Content-Type: application/json' -d @search.json Example JSON document that contains a search request: { "q": "index:my query", "sort": "foo", "limit": 3 } Query syntax The CouchDB search query syntax is based on the Lucene syntax. 
Search queries take the form of name:value unless the name is omitted, in which case they use the default field, as demonstrated in the following examples:

Example search query expressions:

   // Birds
   class:bird

   // Animals that begin with the letter "l"
   l*

   // Carnivorous birds
   class:bird AND diet:carnivore

   // Herbivores that start with letter "l"
   l* AND diet:herbivore

   // Medium-sized herbivores
   min_length:[1 TO 3] AND diet:herbivore

   // Herbivores that are 2m long or less
   diet:herbivore AND min_length:[* TO 2]

   // Mammals that are at least 1.5m long
   class:mammal AND min_length:[1.5 TO *]

   // Find "Meles meles"
   latin_name:"Meles meles"

   // Mammals that are herbivores or omnivores
   diet:(herbivore OR omnivore) AND class:mammal

   // Return all results
   *:*

Queries over multiple fields can be logically combined, and groups and fields can be further grouped. The available logical operators are case-sensitive and are AND, +, OR, NOT and -. Range queries can run over strings or numbers.

If you want a fuzzy search, you can run a query with ~ to find terms like the search term. For instance, look~ finds the terms book and took.

NOTE:
   If the lower and upper bounds of a range query are both strings that contain only numeric digits, the bounds are treated as numbers, not as strings. For example, if you search by using the query mod_date:["20170101" TO "20171231"], the results include documents for which mod_date is between the numeric values 20170101 and 20171231, not between the strings 20170101 and 20171231.

You can alter the importance of a search term by adding ^ and a positive number. This alteration makes matches containing the term more or less relevant, proportional to the power of the boost value. The default value is 1, which means no increase or decrease in the strength of the match. A decimal value of 0 - 1 reduces importance, making the match strength weaker. A value greater than 1 increases importance, making the match strength stronger.

Wildcard searches are supported, for both single (?) and multiple (*) character searches. For example, dat? would match date and data, whereas dat* would match date, data, database, and dates. Wildcards must come after the search term. Use *:* to return all results.

The following characters require escaping if you want to search on them:

   + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /

To escape one of these characters, use a preceding backslash character (\).

The response to a search query contains an order field for each of the results. The order field is an array where the first element is the field or fields that are specified in the sort parameter. See the sort parameter. If no sort parameter is included in the query, then the order field contains the Lucene relevance score.

Faceting

Nouveau Search also supports faceted searching, enabling discovery of aggregate information about matches quickly and easily. You can match all documents by using the special ?q=*:* query syntax, and use the returned facets to refine your query.

Example of a search index function that supports faceting:

   function(doc) {
       index("string", "type", doc.type);
       index("double", "price", doc.price);
   }

To use facets, all the documents in the index must include all the fields that have faceting enabled. If your documents do not include all the fields, you receive a bad_request error with the following reason: "The field_name does not exist." If each document does not contain all the fields for facets, create separate indexes for each field.
If you do not create separate indexes for each field, you must include only documents that contain all the fields. Verify that the fields exist in each document by using a single if statement. The top_n query parameter controls how many facets, per grouping, are returned, defaulting to 10, to a maximum of 1000. Example if statement to verify that the required fields exist in each document: if (typeof doc.town == "string" && typeof doc.name == "string") { index("string", "town", doc.town); index("string", "name", doc.name); } Counts NOTE: The counts option is only available when making global queries. The counts facet syntax takes a list of fields, and returns the number of query results for each unique value of each named field. NOTE: The count operation works only if the indexed values are strings. The indexed values cannot be mixed types. For example, if 100 strings are indexed, and one number, then the index cannot be used for count operations. You can check the type by using the typeof operator, and convert it by using the parseInt, parseFloat, or .toString() functions. Example of a query using the counts facet syntax: ?q=*:*&counts=["type"] Example response after using of the counts facet syntax: { "total_rows":100000, "bookmark":"g...", "rows":[...], "counts":{ "type":{ "sofa": 10, "chair": 100, "lamp": 97 } } } Ranges NOTE: The ranges option is only available when making global queries. The value of the range parameter is a JSON object where the fields names are double fields, and the values of the fields are arrays of JSON objects. The objects must have a label, min and max value (of type string, double, double respectively), and optional min_inclusive and max_inclusive properties (defaulting to true if not specified). Example of a request that uses faceted search for matching ranges: ?q=*:*&ranges={"price":[{"label":"cheap","min":0,"max":"100","max_inclusive":false},{"label":"expensive","min":100}]} Example results after a ranges check on a faceted search: { "total_rows":100000, "bookmark":"g...", "rows":[...], "ranges": { "price": { "expensive": 278682, "cheap": 257023 } } } Note: Previously, the functionality provided by CouchDBs design docu- ments, in combination with document attachments, was referred to as CouchApps. The general principle was that entire web applications could be hosted in CouchDB, without need for an additional application server. Use of CouchDB as a combined standalone database and application server is no longer recommended. There are significant limitations to a pure CouchDB web server application stack, including but not limited to: fully-fledged fine-grained security, robust templating and scaffolding, complete developer tooling, and most importantly, a thriving ecosystem of developers, modules and frameworks to choose from. The developers of CouchDB believe that web developers should pick the right tool for the right job. Use CouchDB as your database layer, in conjunction with any number of other server-side web application frame- works, such as the entire Node.JS ecosystem, Pythons Django and Flask, PHPs Drupal, Javas Apache Struts, and more. BEST PRACTICES In this chapter, we present some of the best ways to use Apache CouchDB. These usage patterns reflect many years of real-world use. We hope that these will jump-start your next project, or improve the per- formance of your current system. Document Design Considerations When designing your database, and your document structure, there are a number of best practices to take into consideration. 
Especially for people accustomed to relational databases, some of these techniques may be non-obvious. Dont rely on CouchDBs auto-UUID generation While CouchDB will generate a unique identifier for the _id field of any doc that you create, in most cases you are better off generating them yourself for a few reasons: • If for any reason you miss the 200 OK reply from CouchDB, and storing the document is attempted again, you would end up with the same docu- ment content stored under multiple _ids. This could easily happen with intermediary proxies and cache systems that may not inform de- velopers that the failed transaction is being retried. • _ids are the only unique enforced value within CouchDB so you might as well make use of this. CouchDB stores its documents in a B+ tree. Each additional or updated document is stored as a leaf node, and may require re-writing intermediary and parent nodes. You may be able to take advantage of sequencing your own ids more effectively than the automatically generated ids if you can arrange them to be sequential yourself. Alternatives to auto-incrementing sequences Because of replication, as well as the distributed nature of CouchDB, it is not practical to use auto-incrementing sequences with CouchDB. These are often used to ensure unique identifiers for each row in a database table. CouchDB generates unique ids on its own and you can specify your own as well, so you dont really need a sequence here. If you use a sequence for something else, you will be better off finding another way to express it in CouchDB in another way. Pre-aggregating your data If your intent for CouchDB is as a collect-and-report model, not a real-time view, you may not need to store a single document for every event youre recording. In this case, pre-aggregating your data may be a good idea. You probably dont need 1000 documents per second if all you are trying to do is to track summary statistics about those docu- ments. This reduces the computational pressure on CouchDBs MapReduce engine(s), as well as reduces its storage requirements. In this case, using an in-memory store to summarize your statistical information, then writing out to CouchDB every 10 seconds / 1 minute / whatever level of granularity you need would greatly reduce the number of documents youll put in your database. Later, you can then further decimate your data by walking the entire database and generating documents to be stored in a new database with a lower level of granularity (say, 1 document a day). You can then delete the older, more fine-grained database when youre done with it. Designing an application to work with replication Whilst CouchDB includes replication and a conflict-flagging mechanism, this is not the whole story for building an application which repli- cates in a way which users expect. Here we consider a simple example of a bookmarks application. The idea is that a user can replicate their own bookmarks, work with them on an- other machine, and then synchronise their changes later. Lets start with a very simple definition of bookmarks: an ordered, nestable mapping of name to URL. Internally the application might rep- resent it like this: [ {"name":"Weather", "url":"http://www.bbc.co.uk/weather"}, {"name":"News", "url":"http://news.bbc.co.uk/"}, {"name":"Tech", "bookmarks": [ {"name":"Register", "url":"http://www.theregister.co.uk/"}, {"name":"CouchDB", "url":"http://couchdb.apache.org/"} ]} ] It can then present the bookmarks menu and sub-menus by traversing this structure. 
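As a rough sketch (not part of the original example), a client application might walk this structure recursively to build the menu; the renderMenu name and the use of console.log are illustrative assumptions only:

   // Walk the nested bookmark structure and print an indented menu.
   function renderMenu(items, depth) {
       items.forEach(function(item) {
           var indent = new Array(depth + 1).join("  ");
           if (item.bookmarks) {
               // A folder: print its name, then recurse into its children.
               console.log(indent + item.name + "/");
               renderMenu(item.bookmarks, depth + 1);
           } else {
               // A leaf bookmark: print its name and URL.
               console.log(indent + item.name + " -> " + item.url);
           }
       });
   }

   // Assuming `bookmarks` holds the array shown above:
   renderMenu(bookmarks, 0);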
Now consider this scenario: the user has a set of bookmarks on her PC, and then replicates it to her laptop. On the laptop, she changes the News link to point to CNN, renames Register to The Register, and adds a new link to slashdot just after it. On the desktop, her husband deletes the Weather link, and adds a new link to CNET in the Tech folder. So after these changes, the laptop has: [ {"name":"Weather", "url":"http://www.bbc.co.uk/weather"}, {"name":"News", "url":"http://www.cnn.com/"}, {"name":"Tech", "bookmarks": [ {"name":"The Register", "url":"http://www.theregister.co.uk/"}, {"name":"Slashdot", "url":"http://www.slashdot.new/"}, {"name":"CouchDB", "url":"http://couchdb.apache.org/"} ]} ] and the PC has: [ {"name":"News", "url":"http://www.cnn.com/"}, {"name":"Tech", "bookmarks": [ {"name":"Register", "url":"http://www.theregister.co.uk/"}, {"name":"CouchDB", "url":"http://couchdb.apache.org/"}, {"name":"CNET", "url":"http://news.cnet.com/"} ]} ] Upon the next synchronisation, we want the expected merge to take place. That is: links which were changed, added or deleted on one side are also changed, added or deleted on the other side - with no human intervention required unless absolutely necessary. We will also assume that both sides are doing a CouchDB compact opera- tion periodically, and are disconnected for more than this time before they resynchronise. All of the approaches below which allow automated merging of changes rely on having some sort of history, back to the point where the repli- cas diverged. CouchDB does not provide a mechanism for this itself. It stores arbi- trary numbers of old _ids for one document (trunk now has a mechanism for pruning the _id history), for the purposes of replication. However it will not keep the documents themselves through a compaction cycle, except where there are conflicting versions of a document. Do not rely on the CouchDB revision history mechanism to help you build an application-level version history. Its sole purpose is to ensure eventually consistent replication between databases. It is up to you to maintain history explicitly in whatever form makes sense for your ap- plication, and to prune it to avoid excessive storage utilisation, whilst not pruning past the point where live replicas last diverged. Approach 1: Single JSON doc The above structure is already valid JSON, and so could be represented in CouchDB just by wrapping it in an object and storing as a single document: { "bookmarks": // ... same as above } This makes life very easy for the application, as the ordering and nesting is all taken care of. The trouble here is that on replication, only two sets of bookmarks will be visible: example B and example C. One will be chosen as the main revision, and the other will be stored as a conflicting revision. At this point, the semantics are very unsatisfactory from the users point of view. The best that can be offered is a choice saying Which of these two sets of bookmarks do you wish to keep: B or C? However nei- ther represents the desired outcome. There is also insufficient data to be able to correctly merge them, since the base revision A is lost. This is going to be highly unsatisfactory for the user, who will have to apply one set of changes again manually. Approach 2: Separate document per bookmark An alternative solution is to make each field (bookmark) a separate document in its own right. 
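For illustration, each bookmark might then be stored as a small document of its own; the _id and the type field shown here are illustrative conventions, not anything CouchDB requires:

   {
       "_id": "bookmark-theregister",
       "type": "bookmark",
       "name": "Register",
       "url": "http://www.theregister.co.uk/"
   }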
Adding or deleting a bookmark is then just a case of adding or deleting a document, which will never conflict (although if the same bookmark is added on both sides, then you will end up with two copies of it). Changing a bookmark will only conflict if both sides made changes to the same one, and then it is reasonable to ask the user to choose between them.

Since there will now be lots of small documents, you may either wish to keep a completely separate database for bookmarks, or else add an attribute to distinguish bookmarks from other kinds of document in the database. In the latter case, a view can be made to return only bookmark documents.

Whilst replication is now fixed, care is needed with the ordered and nestable properties of bookmarks.

For ordering, one suggestion is to give each item a floating-point index, and then when inserting an object between A and B, give it an index which is the average of A's and B's indices. Unfortunately, this will fail after a while when you run out of precision, and the user will be bemused to find that their most recent bookmarks no longer remember the exact position they were put in.

A better way is to keep a string representation of the index, which can grow as the tree is subdivided. This will not suffer the above problem, but it may result in the string becoming arbitrarily long over time. The strings could be renumbered, but the renumbering operation could introduce a lot of conflicts, especially if attempted by both sides independently.

For nestability, you can have a separate doc which represents a list of bookmarks, and each bookmark can have a "belongs to" field which identifies the list. It may be useful anyway to be able to have multiple top-level bookmark sets (Bob's bookmarks, Jill's bookmarks, etc). Some care is needed when deleting a list or sub-list, to ensure that all associated bookmarks are also deleted, otherwise they will become orphaned.

Building the entire bookmark set can be performed through the use of emitting a compound key that describes the path to the document, then using group levels to retrieve the position of the tree in the document. The following code excerpt describes a tree of files, where the path to the file is stored in the document under the "path" key; leading and trailing slashes are stripped so that the emitted key matches the form described below:

   // map function
   function(doc) {
       if (doc.type === "file") {
           var raw_path = doc.path;
           // Strip a trailing slash, if present.
           if (raw_path.substr(-1) === "/") {
               raw_path = raw_path.slice(0, -1);
           }
           // Strip a leading slash, so absolute paths do not emit an
           // empty first key element.
           if (raw_path.substr(0, 1) === "/") {
               raw_path = raw_path.slice(1);
           }
           emit(raw_path.split('/'), 1);
       }
   }

   // reduce
   _sum

This will emit rows into the view of the form ["opt", "couchdb", "etc", "local.ini"] for a doc.path of /opt/couchdb/etc/local.ini. You can then query a list of files in the /opt/couchdb/etc directory by specifying a startkey of ["opt", "couchdb", "etc"] and an endkey of ["opt", "couchdb", "etc", {}].

Approach 3: Immutable history / event sourcing

Another approach to consider is Event Sourcing or Command Logging, as implemented in many NoSQL databases and as used in many operational transformation systems.

In this model, instead of storing individual bookmarks, you store records of changes made - "Bookmark added", "Bookmark changed", "Bookmark moved", "Bookmark deleted". These are stored in an append-only fashion. Since records are never modified or deleted, only added to, there are never any replication conflicts.

These records can also be stored as an array in a single CouchDB document. Replication can cause a conflict, but in this case it is easy to resolve by simply combining elements from the two arrays.
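As a sketch of that merge (illustrative only; the changes array and the per-record id and timestamp fields are assumptions about how an application might shape these records, not something CouchDB defines):

   // Combine the change logs from two conflicting revisions into one
   // array, de-duplicating on a per-record id and re-sorting by timestamp.
   function mergeChangeLogs(docA, docB) {
       var seen = {};
       var merged = [];
       docA.changes.concat(docB.changes).forEach(function(record) {
           if (!seen[record.id]) {
               seen[record.id] = true;
               merged.push(record);
           }
       });
       merged.sort(function(a, b) { return a.timestamp - b.timestamp; });
       return merged;
   }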
In order to see the full set of bookmarks, you need to start with a baseline set (initially empty) and run all the change records since the baseline was created; and/or you need to maintain a most-recent version and update it with changes not yet seen. Care is needed after replication when merging together history from multiple sources. You may get different results depending on how you order them - consider taking all As changes before Bs, taking all Bs before As, or interleaving them (e.g. if each change has a timestamp). Also, over time the amount of storage used can grow arbitrarily large, even if the set of bookmarks itself is small. This can be controlled by moving the baseline version forwards and then keeping only the changes after that point. However, care is needed not to move the baseline version forward so far that there are active replicas out there which last synchronised before that time, as this may result in conflicts which cannot be resolved automatically. If there is any uncertainty, it is best to present the user with a prompt to assist with merging the content in the application itself. Approach 4: Keep historic versions explicitly If you are going to keep a command log history, then it may be simpler just to keep old revisions of the bookmarks list itself around. The in- tention is to subvert CouchDBs automatic behaviour of purging old revi- sions, by keeping these revisions as separate documents. You can keep a pointer to the most current revision, and each revision can point to its predecessor. On replication, merging can take place by diffing each of the previous versions (in effect synthesising the com- mand logs) back to a common ancestor. This is the sort of behaviour which revision control systems such as - Git implement as a matter of routine, although generally comparing text files line-by-line rather than comparing JSON objects field-by-field. Systems like Git will accumulate arbitrarily large amounts of history (although they will attempt to compress it by packing multiple revi- sions so that only their diffs are stored). With Git you can use his- tory rewriting to remove old history, but this may prohibit merging if history doesnt go back far enough in time. Adding client-side security with a translucent database Many applications do not require a thick layer of security at the server. It is possible to use a modest amount of encryption and one-way functions to obscure the sensitive columns or key-value pairs, a tech- nique often called a translucent database. (See a description.) The simplest solutions use a one-way function like SHA-256 at the client to scramble the name and password before storing the informa- tion. This solution gives the client control of the data in the data- base without requiring a thick layer on the database to test each transaction. Some advantages are: • Only the client or someone with the knowledge of the name and pass- word can compute the value of SHA256 and recover the data. • Some columns are still left in the clear, an advantage for computing aggregated statistics. • Computation of SHA256 is left to the client side computer which usu- ally has cycles to spare. • The system prevents server-side snooping by insiders and any attacker who might penetrate the OS or any of the tools running upon it. There are limitations: • There is no root password. If the person forgets their name and pass- word, their access is gone forever. This limits its use to databases that can continue by issuing a new user name and password. 
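As a minimal browser-side sketch of the basic scrambling step described above (it assumes the Web Crypto API is available; using the hex-encoded digest as a document _id is an illustrative choice, not a CouchDB requirement):

   // Derive an opaque key from name + password with SHA-256 in the browser.
   async function scrambledId(name, password) {
       var data = new TextEncoder().encode(name + ":" + password);
       var digest = await crypto.subtle.digest("SHA-256", data);
       // Hex-encode the digest so it can be used as a document _id.
       return Array.from(new Uint8Array(digest))
           .map(function(b) { return b.toString(16).padStart(2, "0"); })
           .join("");
   }

Only someone who knows both the name and the password can recompute the same identifier.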
There are many variations on this theme detailed in the book Translucent Databases, including:

• Adding a backdoor with public-key cryptography.

• Adding a second layer with steganography.

• Dealing with typographical errors.

• Mixing encryption with one-way functions.

Document submission using HTML Forms

It is possible to write to a CouchDB document directly from an HTML form by using a document update function. Here's how:

The HTML form

First, write an HTML form. Here's a simple Contact Us form excerpt:

   <form action="/dbname/_design/ddocname/_update/contactform" method="post">
       <div>
           <label for="name">Name:</label>
           <input type="text" id="name" name="name" />
       </div>
       <div>
           <label for="mail">Email:</label>
           <input type="text" id="mail" name="email" />
       </div>
       <div>
           <label for="msg">Message:</label>
           <textarea id="msg" name="message"></textarea>
       </div>
   </form>

Customize the /dbname/_design/ddocname/_update/contactform portion of the form action URL to reflect the exact path to your database, design document and update function (see below).

As CouchDB no longer recommends the use of CouchDB-hosted web applications, you may want to use a reverse proxy to expose CouchDB as a subdirectory of your web application. If so, add that prefix to the action destination in the form.

Another option is to alter CouchDB's CORS settings and use a cross-domain POST. Be sure you understand all security implications before doing this!

The update function

Then, write an update function. This is the server-side JavaScript function that will receive the POST-ed data.

The first argument to the function will be the document that is being processed (if it exists). Because we are using POST and not PUT, this should be empty in our scenario - but we should check to be sure. The POST-ed data will be passed as the second parameter to the function, along with any query parameters and the full request headers.

Here's a sample handler that extracts the form data, generates a document _id based on the email address and timestamp, and saves the document. It then returns a JSON success response back to the browser.

   function(doc, req) {
       if (doc) {
           return [doc, toJSON({"error": "request already filed"})]
       }
       if (!(req.form && req.form.email)) {
           return [null, toJSON({"error": "incomplete form"})]
       }
       var date = new Date()
       var newdoc = req.form
       newdoc._id = req.form.email + "_" + date.toISOString()
       return [newdoc, toJSON({"success":"ok"})]
   }

Place the above function in your design document under the updates key.

Note that this function does not attempt any sort of input validation or sanitization. That is best handled by a validate document update function instead. (A VDU will validate any document written to the database, not just those that use your update function.)

If the first element passed to return is a document, the HTTP response headers will include X-Couch-Id, the _id value for the newly created document, and X-Couch-Update-NewRev, the _rev value for the newly created document. This is handy if your client-side code wants to access or update the document in a future call.

Example output

Here's the worked sample above, using curl to simulate the form POST.

   $ curl -X PUT adm:pass@localhost:5984/testdb/_design/myddoc -d '{ "updates": { "contactform": "function(doc, req) { ...
}" } }' {"ok":true,"id":"_design/myddoc","rev":"1-2a2b0951fcaf7287817573b03bba02ed"} $ curl --data "name=Lin&email=lin@example.com&message=I Love CouchDB" http://adm:pass@localhost:5984/testdb/_design/myddoc/_update/contactform * Trying 127.0.0.1... * TCP_NODELAY set * Connected to localhost (127.0.0.1) port 5984 (#1) > POST /testdb/_design/myddoc/_update/contactform HTTP/1.1 > Host: localhost:5984 > User-Agent: curl/7.59.0 > Accept: */* > Content-Length: 53 > Content-Type: application/x-www-form-urlencoded > * upload completely sent off: 53 out of 53 bytes < HTTP/1.1 201 Created < Content-Length: 16 < Content-Type: text/html; charset=utf-8 < Date: Thu, 05 Apr 2018 19:56:42 GMT < Server: CouchDB/2.2.0-948a1311c (Erlang OTP/19) < X-Couch-Id: lin%40example.com_2018-04-05T19:51:22.278Z < X-Couch-Request-ID: 03a5f4fbe0 < X-Couch-Update-NewRev: 1-34483732407fcc6cfc5b60ace48b9da9 < X-CouchDB-Body-Time: 0 < * Connection #1 to host localhost left intact {"success":"ok"} $ curl http://adm:pass@localhost:5984/testdb/lin\@example.com_2018-04-05T19:51:22.278Z {"_id":"lin@example.com_2018-04-05T19:51:22.278Z","_rev":"1-34483732407fcc6cfc5b60ace48b9da9","name":"Lin","email":"lin@example.com","message":"I Love CouchDB"} Using an ISO Formatted Date for Document IDs The ISO 8601 date standard describes a useful scheme for representing a date string in a Year-Month-DayTHour:Minute:Second.microsecond format. For time-bound documents in a CouchDB database this can be a very handy way to create a unique identifier, since JavaScript can directly use it to create a Date object. Using this sample map function: function(doc) { var dt = new Date(doc._id); emit([dt.getDate(), doc.widget], 1); } simply use group_level to zoom in on whatever time you wish to use. curl -X GET "http://adm:pass@localhost:5984/transactions/_design/widget_count/_view/toss?group_level=1" {"rows":[ {"key":[20],"value":10}, {"key":[21],"value":20} ]} curl -X GET "http://adm:pass@localhost:5984/transactions/_design/widget_count/_view/toss?group_level=2" {"rows":[ {"key":[20,widget],"value":10}, {"key":[21,widget],"value":10}, {"key":[21,thing],"value":10} ]} Another method is using parseint() and datetime.substr() to cut out useful values for a return key: function (doc) { var datetime = doc._id; var year = parseInt(datetime.substr(0, 4)); var month = parseInt(datetime.substr(5, 2), 10); var day = parseInt(datetime.substr(8, 2), 10); var hour = parseInt(datetime.substr(11, 2), 10); var minute = parseInt(datetime.substr(14, 2), 10); emit([doc.widget, year, month, day, hour, minute], 1); } JavaScript development tips Working with Apache CouchDBs JavaScript environment is a lot different than working with traditional JavaScript development environments. Here are some tips and tricks that will ease the difficulty. • Check the JavaScript version being used by your CouchDB. As of ver- sion 3.2.0, this is reported in the output of GET /_node/_local/_ver- sions. Prior to version 3.2.0, you will need to see which JavaScript library is installed by your CouchDB binary distribu- tion, provided by your operating system, or linked by your compila- tion process. If the version is 1.8.5, this is an old version of JavaScript, only supporting the ECMA-262 5th edition (ES5) of the language. ES6/2015 and newer constructs cannot be used. Fortunately, there are many tools available for transpiling modern JavaScript into code compatible with older JS engines. 
The Babel Project website, for example, offers an in-browser text editor which transpiles JavaScript in real-time. Configuring CouchDB-compatibility is as easy as enabling the ENV PRESET option, and typing firefox 4.0 into the TARGETS field. • The log() function will log output to the CouchDB log file or stream. You can log strings, objects, and arrays directly, without first con- verting to JSON. Use this in conjunction with a local CouchDB in- stance for best results. • Be sure to guard all document accesses to avoid exceptions when fields or subfields are missing: if (doc && doc.myarray && doc.myarray.length)... JavaScript engine versions Until version 3.4 Apache CouchDB used only SpiderMonkey as its underly- ing JavaScript engine. With version 3.4, its possible to configure CouchDB to use QuickJS. Recent versions of CouchDB may use the node-local _versions API end- point to get the current engine type and version: % http http://adm:pass@localhost:5984/_node/_local/_versions | jq '.javascript_engine' { "version": "1.8.5", "name": "spidermonkey" } SpiderMonkey version compatibility Depending on the CouchDB version and whats available on supported oper- ating systems, the SpiderMonkey version may be any one of these: 1.8.5, 60, 68, 78, 86 or 91. Sometimes there are differences in supported fea- tures between versions. Usually later versions only add features, so views will work on version upgrades. However, there are a few excep- tions to this. These are a few known regression or discrepancies be- tween versions: 1. for each (var x in ...) Version 1.8.5 supports the for each (var x in ...) looping expression. Thats not a standard JavaScript syntax and is not supported in later versions: % js js> for each (var x in [1,2]) {print(x)} 1 2 % js91 js> for each (var x in [1,2]) {print(x)} typein:1:4 SyntaxError: missing ( after for: typein:1:4 for each (var x in [1,2]) {print(x)} typein:1:4 ....^ 2. E4X (ECMAScript for XML) This is not supported in versions greater than 1.8.5. This feature may be inadvertently triggered when inserting a . character between a vari- able and (. That would compile on 1.8.5 and throw a SyntaxError on other versions: % js js> var xml = <root><x></x></root> js> xml.(x) <root> <x/> </root> % js91 js> var xml = <root><x></x></root> typein:1:11 SyntaxError: expected expression, got '<': typein:1:11 var xml = <root><x></x></root> typein:1:11 ...........^ 3. toLocaleFormat(...) function. This Date function is not present in versions greater than 1.8.5: % js js> d = new Date("Dec 1, 2015 3:22:46 PM") (new Date(1449001366000)) js> d.toLocaleFormat("%Y-%m-%d") "2015-12-01" % js91 js> d = new Date("Dec 1, 2015 3:22:46 PM") (new Date(1449001366000)) js> d.toLocaleFormat("%Y-%m-%d") typein:2:3 TypeError: d.toLocaleFormat is not a function 4. toLocaleString(...) function. SpiderMonkey 1.8.5 ignored locale strings. Later versions started to return the correct format: % js js > (new Date("2019-01-15T19:32:52.915Z")).toLocaleString('en-US') "Tue Jan 15 14:32:52 2019" % js91 js > (new Date("2019-01-15T19:32:52.915Z")).toLocaleString('en-US') "01/15/2019, 02:32:52 PM" Spidermonkey 91 output also match QuickJS and v8. 5. Invalid expressions following function(){...} are not ignored any longer and will throw an error. 
Previously, in versions less than or equal to 1.8.5, it was possible to add any expression following the main function definition, and it was mostly ignored:

   $ http put $DB/db/_design/d4 views:='{"v1":{"map":"function(doc){emit(1,2);} if(x) a"}}'
   HTTP/1.1 201 Created
   {
       "id": "_design/d4",
       "ok": true,
       "rev": "1-08a7d8b139e52f5f3df5bc27e20eeff1"
   }

   % http $DB/db/_design/d4/_view/v1
   HTTP/1.1 200 OK
   {
       "offset": 0,
       "rows": [
           {
               "id": "doc1",
               "key": 1,
               "value": 2
           }
       ],
       "total_rows": 1
   }

With higher versions of SpiderMonkey, that would throw a compilation error:

   $ http put $DB/db/_design/d4 views:='{"v1":{"map":"function(doc){emit(1,2);} if(x) a"}}'
   HTTP/1.1 400 Bad Request
   {
       "error": "compilation_error",
       "reason": "Compilation of the map function in the 'v1' view failed: ..."
   }

6. Object key order

Object key order may change between versions, so any views which rely on that order may emit different results depending on the engine version:

   % js
   js> r={}; ["Xyz", "abc", 1].forEach(function(v) {r[v]=v;}); Object.keys(r)
   ["Xyz", "abc", "1"]

   % js91
   js> r={}; ["Xyz", "abc", 1].forEach(function(v) {r[v]=v;}); Object.keys(r)
   ["1", "Xyz", "abc"]

7. String match(undefined)

SpiderMonkey 1.8.5 returns null for match(undefined), while versions starting with at least 78 return [""].

   % js
   js> "abc".match(undefined)
   null

   % js91
   js> "abc".match(undefined)
   [""]

8. String substring(val, start, end)

SpiderMonkey 1.8.5 has a String.substring(val, start, end) function. That function is not present in at least SpiderMonkey 91 and higher:

   % js
   js> String.substring("abcd", 1, 2)
   "b"

   % js91
   js> String.substring("abcd", 1, 2)
   typein:1:8 TypeError: String.substring is not a function
   Stack:
     @typein:1:

Use String.prototype.substring(start, end) instead:

   % js91
   js> "abcd".substring(1, 2)
   "b"

9. toISOString() throws an error on invalid Date objects

SpiderMonkey version 1.8.5 does not throw an error when calling toISOString() on invalid Date objects, but SpiderMonkey versions at least 78+ do:

   % js
   js> (new Date(undefined)).toISOString()
   "Invalid Date"

   % js91
   js> (new Date(undefined)).toISOString()
   typein:1:23 RangeError: invalid date
   Stack:
     @typein:1:23

This can affect views emitting an invalid date object. Previously, the view might have emitted the Invalid Date string, while in later SpiderMonkey engines all the emit results from that document will be skipped, since view functions skip view results if an exception is thrown.

10. Invalid JavaScript before function definition

SpiderMonkey version 1.8.5 allowed the invalid term : function(...) syntax. So a view function like the following worked and produced successful view results. In later versions, at least as of 78+, that function will fail with a compilation error:

   "views": {
       "v1": {
           "map": "foo : function(doc){emit(doc._id, 1);}"
       }
   }

11. Constant values leak out of nested scopes

In SpiderMonkey 1.8.5, const values leak from nested expression scopes. Referencing them in SpiderMonkey 1.8.5 produces undefined, while the SpiderMonkey 91, QuickJS and V8 engines raise a ReferenceError.

   % js
   js> f = function(doc){if(doc.x === 'x') { const value='inside_if'}; print(value)};
   js> f({'x':'y'})
   undefined

   % js91
   js> f = function(doc){if(doc.x === 'x') {const value='inside_if';}; print(value)};
   js> f({'x':'y'})
   typein:1:23 TypeError: can't access property "x", doc is undefined

12. Zero-prefixed input with parseInt()

The parseInt() function in SpiderMonkey 1.8.5 treats a leading 0 as an octal (base 8) prefix. It then parses the following input as an octal number.
SpiderMonkey 91 and other modern JS engines assume base 10 as the default even when parsing numbers with leading zeros. This can be a stumbling block, especially when parsing months and days in a date string. One way to mitigate this discrepancy is to use an explicit base.

   % js
   js> parseInt("08")
   0
   js> parseInt("09")
   0
   js> parseInt("010")
   8
   js> parseInt("08", 10)
   8

   % js91
   js> parseInt("08")
   8
   js> parseInt("09")
   9
   js> parseInt("010")
   10
   js> parseInt("08", 10)
   8

13. Callable regular expressions

SpiderMonkey 1.8.5 allowed calling a regular expression as a function. The call worked the same as calling the .exec() method.

   % js
   js> /.*abc$/("abc")
   ["abc"]

   % js91
   js> /.*abc$/("abc")
   typein:1:9 TypeError: /.*abc$/ is not a function
   Stack:
     @typein:1:9
   js> /.*abc$/.exec("abc")
   ["abc"]

Using QuickJS

The QuickJS-based JavaScript engine is available as of CouchDB version 3.4. It has to be explicitly enabled by setting js_engine = quickjs in the [couchdb] configuration section and restarting the service.

Generally, the QuickJS engine is a bit faster, consumes less memory, and provides slightly better isolation between contexts by re-creating the whole JavaScript engine runtime on every reset command.

To try building individual views using QuickJS, even when the default engine is SpiderMonkey, use "javascript_quickjs" as the view language instead of "javascript". Just that view will be built using the QuickJS engine. However, when switching back to "javascript", the view will have to be rebuilt again.

QuickJS vs SpiderMonkey incompatibilities

The QuickJS engine is quite compatible with SpiderMonkey version 91. The same incompatibilities between 1.8.5 and 91 are also present between 1.8.5 and QuickJS. So, when switching from 1.8.5 to QuickJS, see the SpiderMonkey version compatibility section above.

These are a few incompatibilities between SpiderMonkey 91 and the QuickJS engine:

1. RegExp.$1, ..., RegExp.$9

This is a deprecated JavaScript feature that's not available in QuickJS. See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/n

2. Date.toString() doesn't include the timezone name, just the offset.

   % qjs
   > (new Date()).toString();
   "Thu Sep 05 2024 17:03:23 GMT-0400"

   % js91
   js> (new Date()).toString();
   "Thu Sep 05 2024 17:04:03 GMT-0400 (EDT)"

Scanning for QuickJS incompatibilities

CouchDB version 3.4 and higher include a background scanner which can be used to traverse all the databases and design documents, run them against both the SpiderMonkey and QuickJS engines, and report any discrepancies in the logs. This can be a useful check to run before deciding to switch to QuickJS as the default JavaScript engine. The scanner can be enabled with:

   [couch_scanner_plugins]
   couch_quickjs_scanner_plugin = true

And configured to run at a predetermined time or on a periodic schedule. For instance:

   [couch_quickjs_scanner_plugin]
   after = 2024-09-05T18:10:00
   repeat = 1_day

It will not start until after the specified time and then it will run about once every 24 hours. The logs will indicate when the scan starts and finishes:

   couch_quickjs_scanner_plugin s:1725559802-c615220453e6 starting
   ...
   couch_quickjs_scanner_plugin s:1725559802-c615220453e6 completed

During scanning, discrepancies are reported in the log. They may look like:

   couch_quickjs_scanner_plugin s:1725559802-c615220453e6 db:mydb/40000000-5fffffff ddoc:_design/mydesign view validation failed {map_doc,<<"doc1">>, $quickjs_res, $sm_res}

The s:...
field indicates which scan session it belongs to, which db and shard range it found the issue on, followed by the design document, and the document ID. Then, the {map_doc, ..., ...} tuple indicates which operation failed (mapping a document) where the 2nd element is the result from the QuickJS engine, and the 3rd is the result from the SpiderMonkey engine. Sometimes it maybe needed to ignore some databases or design documents. That can be done with a number of regular expression patterns in the [couch_quickjs_scanner_plugin.skip_dbs] config section: [couch_quickjs_scanner_plugin.skip_dbs] pattern1 = bar.* pattern2 = .*foo View recommendations Here are some tips and tricks for working with CouchDBs (JavaScript-based) views. Deploying a view change in a live environment It is possible to change the definition of a view, build the index, then make those changes go live without causing downtime for your ap- plication. The trick to making this work is that CouchDBs JavaScript view index files are based on the contents of the design document - not its name, _id or revision. This means that two design documents with identical view code will share the same on-disk view index files. Here is a worked example, assuming your /db/_design/ddoc needs to be updated. 1. Upload the old design doc to /db/_design/ddoc-old (or copy the docu- ment) if you want an easy way to rollback in case of problems. The ddoc-old document will reference the same view indexes already built for _design/ddoc. 2. Upload the updated design doc to /db/_design/ddoc-new. 3. Query a view in the new design document to trigger secondary index generation. You can track the indexing progress via the /_ac- tive_tasks endpoint, or through the Fauxton web interface. 4. When the index is done being built, re-upload the updated design document to /db/_design/ddoc (or copy the document). The ddoc docu- ment will now reference the same view indexes already built for _de- sign/ddoc-new. 5. Delete /db/_design/ddoc-new and/or /db/_design/ddoc-old at your dis- cretion. Dont forget to trigger Views cleanup to reclaim disk space after deleting ddoc-old. The COPY HTTP verb can be used to copy the design document with a sin- gle command: curl -X COPY <URL of source design document> -H "Destination: <ID of destination design document>" Reverse Proxies Reverse proxying with HAProxy CouchDB recommends the use of HAProxy as a load balancer and reverse proxy. The teams experience with using it in production has shown it to be superior for configuration and monitoring capabilities, as well as overall performance. CouchDBs sample haproxy configuration is present in the code repository and release tarball as rel/haproxy.cfg. It is included below. 
This ex- ample is for a 3 node CouchDB cluster: global maxconn 512 spread-checks 5 defaults mode http log global monitor-uri /_haproxy_health_check option log-health-checks option httplog balance roundrobin option forwardfor option redispatch retries 4 option http-server-close timeout client 150000 timeout server 3600000 timeout connect 500 stats enable stats uri /_haproxy_stats # stats auth admin:admin # Uncomment for basic auth frontend http-in # This requires HAProxy 1.5.x # bind *:$HAPROXY_PORT bind *:5984 default_backend couchdbs backend couchdbs option httpchk GET /_up http-check disable-on-404 server couchdb1 x.x.x.x:5984 check inter 5s server couchdb2 x.x.x.x:5984 check inter 5s server couchdb3 x.x.x.x:5984 check inter 5s Reverse proxying with nginx Basic Configuration Heres a basic excerpt from an nginx config file in <nginx config direc- tory>/sites-available/default. This will proxy all requests from http://domain.com/... to http://localhost:5984/... location / { proxy_pass http://localhost:5984; proxy_redirect off; proxy_buffering off; proxy_set_header Host $host; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; } Proxy buffering must be disabled, or continuous replication will not function correctly behind nginx. Reverse proxying CouchDB in a subdirectory with nginx It can be useful to provide CouchDB as a subdirectory of your overall domain, especially to avoid CORS concerns. Heres an excerpt of a basic nginx configuration that proxies the URL http://domain.com/couchdb to http://localhost:5984 so that requests appended to the subdirectory, such as http://domain.com/couchdb/db1/doc1 are proxied to http://local- host:5984/db1/doc1. location /couchdb { rewrite ^ $request_uri; rewrite ^/couchdb/(.*) /$1 break; proxy_pass http://localhost:5984$uri; proxy_redirect off; proxy_buffering off; proxy_set_header Host $host; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; } Session based replication is default functionality since CouchDB 2.3.0. To enable session based replication with reverse proxied CouchDB in a subdirectory. location /_session { proxy_pass http://localhost:5984/_session; proxy_redirect off; proxy_buffering off; proxy_set_header Host $host; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; } Authentication with nginx as a reverse proxy Heres a sample config setting with basic authentication enabled, plac- ing CouchDB in the /couchdb subdirectory: location /couchdb { auth_basic "Restricted"; auth_basic_user_file htpasswd; rewrite /couchdb/(.*) /$1 break; proxy_pass http://localhost:5984; proxy_redirect off; proxy_buffering off; proxy_set_header Host $host; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header Authorization ""; } This setup leans entirely on nginx performing authorization, and for- warding requests to CouchDB with no authentication (with CouchDB in Ad- min Party mode), which isnt sufficient in CouchDB 3.0 anymore as Admin Party has been removed. Youd need to at the very least hard-code user credentials into this version with headers. For a better solution, see Proxy Authentication. 
SSL with nginx

In order to enable SSL, just enable the nginx SSL module and add another proxy header:

   ssl on;
   ssl_certificate PATH_TO_YOUR_PUBLIC_KEY.pem;
   ssl_certificate_key PATH_TO_YOUR_PRIVATE_KEY.key;
   ssl_protocols TLSv1.2 TLSv1.3;
   ssl_session_cache shared:SSL:1m;

   location / {
       proxy_pass http://localhost:5984;
       proxy_redirect off;
       proxy_set_header Host $host;
       proxy_buffering off;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Ssl on;
   }

The X-Forwarded-Ssl header tells CouchDB that it should use the https scheme instead of the http scheme. Otherwise, all CouchDB-generated redirects will fail.

Reverse Proxying with Caddy 2

Caddy is https-by-default, and will automatically acquire, install, activate and, when necessary, renew a trusted SSL certificate for you - all in the background. Certificates are issued by the Let's Encrypt certificate authority.

Basic configuration

Here's a basic excerpt from a Caddyfile in /etc/caddy/Caddyfile. This will proxy all requests from http(s)://domain.com/... to http://localhost:5984/...

   domain.com {
       reverse_proxy localhost:5984
   }

Reverse proxying CouchDB in a subdirectory with Caddy 2

It can be useful to provide CouchDB as a subdirectory of your overall domain, especially to avoid CORS concerns. Here's an excerpt of a basic Caddy configuration that proxies the URL http(s)://domain.com/couchdb to http://localhost:5984 so that requests appended to the subdirectory, such as http(s)://domain.com/couchdb/db1/doc1, are proxied to http://localhost:5984/db1/doc1.

   domain.com {
       reverse_proxy /couchdb/* localhost:5984
   }

Reverse proxying + load balancing for CouchDB clusters

Here's a basic excerpt from a Caddyfile in /<path>/<to>/<site>/Caddyfile. This will proxy and evenly distribute all requests from http(s)://domain.com/... among 3 CouchDB cluster nodes at localhost:15984, localhost:25984 and localhost:35984. Caddy will check the status, i.e. health, of each node every 5 seconds; if a node goes down, Caddy will avoid proxying requests to that node until it comes back online.

   domain.com {
       reverse_proxy http://localhost:15984 http://localhost:25984 http://localhost:35984 {
           lb_policy round_robin
           lb_try_interval 500ms
           health_interval 5s
       }
   }

Authentication with Caddy 2 as a reverse proxy

Here's a sample config setting with basic authentication enabled, placing CouchDB in the /couchdb subdirectory:

   domain.com {
       basicauth /couchdb/* {
           couch_username couchdb_hashed_password_base64
       }
       reverse_proxy /couchdb/* localhost:5984
   }

This setup leans entirely on Caddy performing authorization, and forwarding requests to CouchDB with no authentication (with CouchDB in Admin Party mode), which isn't sufficient in CouchDB 3.0 anymore as Admin Party has been removed. You'd need to at the very least hard-code user credentials into this version with headers. For a better solution, see Proxy Authentication.

Reverse Proxying with Apache HTTP Server

WARNING:
   As of this writing, there is no way to fully disable the buffering between Apache HTTPD Server and CouchDB. This may present problems with continuous replication. The Apache CouchDB team strongly recommends the use of an alternative reverse proxy such as haproxy or nginx, as described earlier in this section.

Basic Configuration

Here's a basic excerpt for using a VirtualHost block config to use Apache as a reverse proxy for CouchDB.
You need at least to configure Apache with the --enable-proxy --enable-proxy-http options and use a version equal to or higher than Apache 2.2.7 in order to use the no- canon option in the ProxyPass directive. The ProxyPass directive adds the X-Forwarded-For header needed by CouchDB, and the ProxyPreserveHost directive ensures the original client Host header is preserved. <VirtualHost *:80> ServerAdmin webmaster@dummy-host.example.com DocumentRoot "/opt/websites/web/www/dummy" ServerName couchdb.localhost AllowEncodedSlashes On ProxyRequests Off KeepAlive Off <Proxy *> Order deny,allow Deny from all Allow from 127.0.0.1 </Proxy> ProxyPass / http://localhost:5984 nocanon ProxyPassReverse / http://localhost:5984 ProxyPreserveHost On ErrorLog "logs/couchdb.localhost-error_log" CustomLog "logs/couchdb.localhost-access_log" common </VirtualHost> INSTALLATION Installation on Unix-like systems WARNING: CouchDB 3.0+ will not run without an admin user being created first. Be sure to create an admin user before starting CouchDB! Installation using the Apache CouchDB convenience binary packages If you are running one of the following operating systems, the easiest way to install CouchDB is to use the convenience binary packages: • CentOS/RHEL 7 • CentOS/RHEL 8 • CentOS/RHEL 9 (with caveats: depends on EPEL repository) • Debian 10 (buster) • Debian 11 (bullseye) • Debian 12 (bookworm) • Ubuntu 18.04 (bionic) • Ubuntu 20.04 (focal) • Ubuntu 22.04 (jammy) These RedHat-style rpm packages and Debian-style deb packages will in- stall CouchDB at /opt/couchdb and ensure CouchDB is run at system startup by the appropriate init subsystem (SysV-style initd or sys- temd). The Debian-style deb packages also pre-configure CouchDB as a stand- alone or clustered node, prompt for the address to which it will bind, and a password for the admin user. Responses to these prompts may be pre-seeded using standard debconf tools. Further details are in the - README.Debian file. For distributions lacking a compatible SpiderMonkey library, Apache CouchDB also provides packages for the 1.8.5 version. Enabling the Apache CouchDB package repository Debian or Ubuntu: Run the following commands: sudo apt update && sudo apt install -y curl apt-transport-https gnupg curl https://couchdb.apache.org/repo/keys.asc | gpg --dearmor | sudo tee /usr/share/keyrings/couchdb-archive-keyring.gpg >/dev/null 2>&1 source /etc/os-release echo "deb [signed-by=/usr/share/keyrings/couchdb-archive-keyring.gpg] https://apache.jfrog.io/artifactory/couchdb-deb/ ${VERSION_CODENAME} main" \ | sudo tee /etc/apt/sources.list.d/couchdb.list >/dev/null RedHat(<9) or CentOS: Run the following commands: sudo yum install -y yum-utils sudo yum-config-manager --add-repo https://couchdb.apache.org/repo/couchdb.repo RedHat(>=9): Run the following commands: sudo yum install -y yum-utils sudo yum-config-manager --add-repo https://couchdb.apache.org/repo/couchdb.repo # Enable EPEL for the SpiderMonkey dependency sudo dnf config-manager --set-enabled crb sudo dnf install epel-release epel-next-release Installing the Apache CouchDB packages Debian or Ubuntu: Run the following commands: sudo apt update sudo apt install -y couchdb Debian/Ubuntu installs from binaries can be pre-configured for single node or clustered installations. For clusters, multiple nodes will still need to be joined together and configured consistently across all machines; follow the Cluster Setup walkthrough to complete the process. 
RedHat(<9)/CentOS: Run the command: sudo yum install -y couchdb RedHat(>=9): Run the following commands: sudo yum install -y mozjs78 sudo yum install -y couchdb Once installed, create an admin user by hand before starting CouchDB, if your installer didnt do this for you already. You can now start the service. Your installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation. Relax! CouchDB is installed and running. GPG keys used for signing the CouchDB repositories As of 2021.04.25, the repository signing key for both types of sup- ported packages is: pub rsa8192 2015-01-19 [SC] 390EF70BB1EA12B2773962950EE62FB37A00258D uid The Apache Software Foundation (Package repository signing key) <root@apache.org> As of 2021.04.25, the package signing key (only used for rpm packages) is: pub rsa4096 2017-07-28 [SC] [expires: 2022-07-27] 2EC788AE3F239FA13E82D215CDE711289384AE37 uid Joan Touzet (Apache Code Signing Key) <wohali@apache.org> As of 2021.11.13, the package signing key (only used for rpm packages) is: pub rsa4096 2019-09-05 [SC] [expires: 2039-01-02] 0BD7A98499C4AB41C910EE65FC04DFBC9657A78E uid Nicolae Vatamaniuc <vatamane@apache.org> uid default <vatamane@gmail.com> All are available from most popular GPG key servers. The rpm signing keys should be listed in the KEYS list as well. Installation from source The remainder of this document describes the steps required to install CouchDB directly from source code. This guide, as well as the INSTALL.Unix document in the official tar- ball release are the canonical sources of installation information. However, many systems have gotchas that you need to be aware of. In ad- dition, dependencies frequently change as distributions update their archives. Dependencies You should have the following installed: • Erlang OTP (25, 26, 27) • ICU • OpenSSL • Mozilla SpiderMonkey (1.8.5, 60, 68, 78, 91) • GNU Make • GNU Compiler Collection • help2man • Python (>=3.6) for docs and tests • Java (required for nouveau, minimum version 11, recommended version 19 or 20) help2man is only need if you plan on installing the CouchDB man pages. Documentation build can be disabled by adding the --disable-docs flag to the configure script. Debian-based Systems You can install the dependencies by running: sudo apt-get --no-install-recommends -y install \ build-essential pkg-config erlang \ libicu-dev libmozjs185-dev Be sure to update the version numbers to match your systems available packages. RedHat-based (Fedora, CentOS, RHEL) Systems You can install the dependencies by running: sudo yum install autoconf autoconf-archive automake \ erlang-asn1 erlang-erts erlang-eunit gcc-c++ \ erlang-os_mon erlang-xmerl erlang-erl_interface help2man \ libicu-devel libtool perl-Test-Harness Warning: To build a release for CouchDB the erlang-reltool package is required, yet on CentOS/RHEL this package depends on erlang-wx which pulls in wxGTK and several X11 libraries. If CouchDB is being built on a console only server it might be a good idea to install this in a sep- arate step to the rest of the dependencies, so that the package and all its dependencies can be removed using the yum history tool after the release is built. 
(reltool is needed only during release build but not for CouchDB functioning) The package can be installed by running: sudo yum install erlang-reltool Fedora 36 On Fedora 36, you may need these packages in addition to the ones listed above: • mozjs91-devel • erlang-rebar If the system contains dangling links to Erlang chunk files, the com- piler will abort. They can be deleted with the following command: find -L /usr/lib64/erlang/lib/ -type l -name chunks | xargs rm -f Fauxton is not built on the Node.js version (v16) shipped by the sys- tem. The installation of v12.22.12 can be done via: wget https://nodejs.org/download/release/v12.22.12/node-v12.22.12-linux-x64.tar.gz mkdir -p /usr/local/lib/nodejs tar -xvf node-v12.22.12-linux-x64.tar.gz -C /usr/local/lib/nodejs export PATH=/usr/local/lib/nodejs/node-v12.22.12-linux-x64/bin:$PATH Note that due to a problem with the Python package sphinx-build, it is not possible to compile the documentation on Fedora 36. You can skip compiling the documentation via: ./configure --disable-docs --spidermonkey-version 91 Mac OS X Follow Installation with Homebrew reference for Mac App installation. If you are installing from source, you will need to install the Command Line Tools: xcode-select --install You can then install the other dependencies by running: brew install autoconf autoconf-archive automake libtool \ erlang icu4c spidermonkey pkg-config You will need Homebrew installed to use the brew command. Some versions of Mac OS X ship a problematic OpenSSL library. If youre experiencing troubles with CouchDB crashing intermittently with a seg- mentation fault or a bus error, you will need to install your own ver- sion of OpenSSL. See the wiki, mentioned above, for more information. SEE ALSO: • Homebrew FreeBSD FreeBSD requires the use of GNU Make. Where make is specified in this documentation, substitute gmake. You can install this by running: pkg install gmake Installing Once you have satisfied the dependencies you should run: ./configure If you wish to customize the installation, pass --help to this script. If everything was successful you should see the following message: You have configured Apache CouchDB, time to relax. Relax. To build CouchDB you should run: make release Try gmake if make is giving you any problems. If include paths or other compiler options must be specified, they can be passed to rebar, which compiles CouchDB, with the ERL_CFLAGS envi- ronment variable. Likewise, options may be passed to the linker with the ERL_LDFLAGS environment variable: make release ERL_CFLAGS="-I/usr/local/include/js -I/usr/local/lib/erlang/usr/include" If everything was successful you should see the following message: ... done You can now copy the rel/couchdb directory anywhere on your system. Start CouchDB with ./bin/couchdb from within that directory. Relax. Note: a fully-fledged ./configure with the usual GNU Autotools options for package managers and a corresponding make install are in develop- ment, but not part of the 2.0.0 release. User Registration and Security For OS X, in the steps below, substitute /Users/couchdb for /home/couchdb. You should create a special couchdb user for CouchDB. On many Unix-like systems you can run: adduser --system \ --shell /bin/bash \ --group --gecos \ "CouchDB Administrator" couchdb On Mac OS X you can use the Workgroup Manager to create users up to version 10.9, and dscl or sysadminctl after version 10.9. Search Apples support site to find the documentation appropriate for your system. 
As of recent versions of OS X, this functionality is also included in Server.app, available through the App Store only as part of OS X Server. You must make sure that the user has a working POSIX shell and a writable home directory. You can test this by: • Trying to log in as the couchdb user • Running pwd and checking the present working directory As a recommendation, copy the rel/couchdb directory into /home/couchdb or /Users/couchdb. Ex: copy the built couchdb release to the new users home directory: cp -R /path/to/couchdb/rel/couchdb /home/couchdb Change the ownership of the CouchDB directories by running: chown -R couchdb:couchdb /home/couchdb Change the permission of the CouchDB directories by running: find /home/couchdb -type d -exec chmod 0770 {} \; Update the permissions for your ini files: chmod 0644 /home/couchdb/etc/* First Run NOTE: Be sure to create an admin user before trying to start CouchDB! You can start the CouchDB server by running: sudo -i -u couchdb /home/couchdb/bin/couchdb This uses the sudo command to run the couchdb command as the couchdb user. When CouchDB starts it should eventually display following messages: {database_does_not_exist,[{mem3_shards,load_shards_from_db,"_users" ... Dont be afraid, we will fix this in a moment. To check that everything has worked, point your web browser to: http://127.0.0.1:5984/_utils/index.html From here you should verify your installation by pointing your web browser to: http://localhost:5984/_utils/index.html#verifyinstall Your installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation. Running as a Daemon CouchDB no longer ships with any daemonization scripts. The CouchDB team recommends runit to run CouchDB persistently and reli- ably. According to official site: runit is a cross-platform Unix init scheme with service supervision, a replacement for sysvinit, and other init schemes. It runs on GNU/Linux, *BSD, MacOSX, Solaris, and can easily be adapted to other Unix operating systems. Configuration of runit is straightforward; if you have questions, con- tact the CouchDB user mailing list or IRC-channel #couchdb in FreeNode network. Lets consider configuring runit on Ubuntu 18.04. The following steps should be considered only as an example. Details will vary by operating system and distribution. Check your systems package management tools for specifics. Install runit: sudo apt-get install runit Create a directory where logs will be written: sudo mkdir /var/log/couchdb sudo chown couchdb:couchdb /var/log/couchdb Create directories that will contain runit configuration for CouchDB: sudo mkdir /etc/sv/couchdb sudo mkdir /etc/sv/couchdb/log Create /etc/sv/couchdb/log/run script: #!/bin/sh exec svlogd -tt /var/log/couchdb Basically it determines where and how exactly logs will be written. See man svlogd for more details. Create /etc/sv/couchdb/run: #!/bin/sh export HOME=/home/couchdb exec 2>&1 exec chpst -u couchdb /home/couchdb/bin/couchdb This script determines how exactly CouchDB will be launched. Feel free to add any additional arguments and environment variables here if nec- essary. Make scripts executable: sudo chmod u+x /etc/sv/couchdb/log/run sudo chmod u+x /etc/sv/couchdb/run Then run: sudo ln -s /etc/sv/couchdb/ /etc/service/couchdb In a few seconds runit will discover a new symlink and start CouchDB. 
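Once runit picks up the service, a quick sanity check (a minimal sketch, assuming the default bind address and port) is to request the server banner and confirm CouchDB is answering:

   curl http://127.0.0.1:5984/
   # should return a JSON greeting similar to:
   # {"couchdb":"Welcome","version":"3.4.3", ...}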
You can control the CouchDB service like this:

   sudo sv status couchdb
   sudo sv stop couchdb
   sudo sv start couchdb

Naturally, CouchDB will now start automatically shortly after the system starts.

You can also configure systemd, launchd or SysV-init daemons to launch CouchDB and keep it running using standard configuration files. Consult your system documentation for more information.

Installation on Windows

There are two ways to install CouchDB on Windows.

Installation from binaries

This is the simplest way to go.

WARNING:
   Windows 8, 8.1, and 10 require the .NET Framework v3.5 to be installed.

1. Get the latest Windows binaries from the CouchDB web site. Old releases are available at archive.

2. Follow the installation wizard steps. Be sure to install CouchDB to a path with no spaces, such as C:\CouchDB.

3. Your installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation.

4. Open up Fauxton

5. It's time to Relax!

NOTE:
   In some cases you might be asked to reboot Windows to complete the installation process, because of the different Microsoft Visual C++ runtimes used by CouchDB.

NOTE:
   Upgrading note

   It's recommended to uninstall the previous CouchDB version before upgrading, especially if the new one is built against a different Erlang release. The reason is simple: there may be leftover libraries with alternative or incompatible versions from the old Erlang release that may create conflicts, errors and weird crashes.

   In this case, make sure you back up your local.ini config and CouchDB database/index files.

Silent Install

The Windows installer supports silent installs. Here are some sample commands, supporting the new features of the 3.0 installer.

Install CouchDB without a service, but with an admin user:password of admin:hunter2:

   msiexec /i apache-couchdb-3.0.0.msi /quiet ADMINUSER=admin ADMINPASSWORD=hunter2 /norestart

The same as above, but also install and launch CouchDB as a service:

   msiexec /i apache-couchdb-3.0.0.msi /quiet INSTALLSERVICE=1 ADMINUSER=admin ADMINPASSWORD=hunter2 /norestart

Unattended uninstall of CouchDB from target directory D:\CouchDB:

   msiexec /x apache-couchdb-3.0.0.msi INSTALLSERVICE=1 APPLICATIONFOLDER="D:\CouchDB" ADMINUSER=admin ADMINPASSWORD=hunter2 /quiet /norestart

Unattended uninstall if the installer file is unavailable:

   msiexec /x {4CD776E0-FADF-4831-AF56-E80E39F34CFC} /quiet /norestart

Add /l* log.txt to any of the above to generate a useful logfile for debugging.

Installation from sources

SEE ALSO:
   Glazier: Automate building of CouchDB from source on Windows

Installation on macOS

Installation using the Apache CouchDB native application

The easiest way to run CouchDB on macOS is through the native macOS application. Just follow the instructions below:

1. Download Apache CouchDB for macOS. Old releases are available at archive.

2. Double click on the Zip file

3. Drag and drop the Apache CouchDB.app into the Applications folder

That's all, now CouchDB is installed on your Mac:

1. Run the Apache CouchDB application

2. Open up Fauxton, the CouchDB admin interface

3. Verify the install by clicking on Verify, then Verify Installation.

4. Your installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation.

5. Time to Relax!

Installation with Homebrew

CouchDB can be installed via Homebrew.
Fetch the newest version of Homebrew and all formulae and install CouchDB with the following commands:

   brew update
   brew install couchdb

Installation from source

Installation on macOS is possible from source. Download the source tarball, extract it, and follow the instructions in the INSTALL.Unix.md file.

Running as a Daemon

CouchDB itself no longer ships with any daemonization scripts. The CouchDB team recommends runit to run CouchDB persistently and reliably. Configuration of runit is straightforward; if you have questions, reach out to the CouchDB user mailing list.

Naturally, you can configure launchd or other init daemons to launch CouchDB and keep it running using standard configuration files. Consult your system documentation for more information.

Installation on FreeBSD

Installation

Use the pre-built binary packages to install CouchDB:

   pkg install couchdb3

Alternatively, it is possible to install CouchDB from the Ports Collection:

   cd /usr/ports/databases/couchdb3
   make install clean

NOTE:
   Be sure to create an admin user before starting CouchDB for the first time!

Service Configuration

The port is shipped with a script that integrates CouchDB with FreeBSD's rc.d service framework. The following options for /etc/rc.conf or /etc/rc.conf.local are supported (defaults shown):

   couchdb3_enable="NO"
   couchdb3_user="couchdb"
   couchdb3_erl_flags="-couch_ini /usr/local/libexec/couchdb3/etc/default.ini /usr/local/etc/couchdb3/local.ini"
   couchdb3_chdir="/var/db/couchdb3"

After enabling the couchdb3 service (by setting couchdb3_enable to "YES"), use the following command to start CouchDB:

   service couchdb3 start

This script responds to the arguments start, stop, status, rcvar etc. If the service is not yet enabled in rc.conf, use onestart to start it up ad-hoc.

The service will also use settings from the following config files:

• /usr/local/libexec/couchdb3/etc/default.ini

• /usr/local/etc/couchdb3/local.ini

The default.ini should be left read-only, and will be replaced on upgrades and re-installs without warning. Therefore administrators should use default.ini as a reference and only modify the local.ini file.

Post Install

The installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation.

Also note that the port will probably show some messages after the installation. Make note of these instructions; they can also be found in the ports tree for later reference.

Installation via Docker

Apache CouchDB provides convenience binary Docker images through Docker Hub at apache/couchdb. This is our upstream release; it is usually mirrored downstream at Docker's top-level couchdb as well.

At least these tags are always available on the image:

• latest - always the latest

• 3: always the latest 3.x version

• 2: always the latest 2.x version

• 1, 1.7, 1.7.2: CouchDB 1.7.2 (convenience only; no longer supported)

• 1-couchperuser, 1.7-couchperuser, 1.7.2-couchperuser: CouchDB 1.7.2 with couchperuser plugin (convenience only; no longer supported)

These images expose CouchDB on port 5984 of the container, run everything as user couchdb (uid 5984), and support use of a Docker volume for data at /opt/couchdb/data.

Your installation is not complete. Be sure to complete the Setup steps for a single node or clustered installation.

Further details on the Docker configuration are available in our couchdb-docker git repository.
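For example, based on the image behaviour described above, a single node can be started with a host port mapping, an admin user and a named data volume. This is an illustrative sketch rather than an official invocation; the container name, credentials and volume name are placeholders, while COUCHDB_USER and COUCHDB_PASSWORD are the environment variables the image uses to create the initial admin:

   docker run -d --name couchdb \
     -p 5984:5984 \
     -e COUCHDB_USER=admin \
     -e COUCHDB_PASSWORD=secret \
     -v couchdb_data:/opt/couchdb/data \
     apache/couchdb:3

The Setup steps for a single node or clustered installation still apply once the container is running.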
Installation via Snap Apache CouchDB provides convenience binary Snap builds through the Ubuntu snapcraft repository under the name couchdb snap. These are available in separate snap channels for each major/minor release stream, e.g., 2.x, 3.3, and a latest stream. Once youve completed installing snapd, you can install the CouchDB snap via: $ sudo snap install couchdb After installation, set up an admin password and a cookie using a snap hook. Then, restart the snap for changes to take effect: $ sudo snap set couchdb admin=[your-password] setcookie=[your-cookie] $ sudo snap restart couchdb CouchDB will be installed (read only) at /snap/couchdb/current/. Data files will be written to /var/snap/couchdb/common/data, and (writable) configuration files will be stored in /var/snap/couchdb/current/etc. NOTE: Your installation is not complete. Follow the Setup steps for a sin- gle node or clustered installation. Snaps use AppArmor and are closely tied to systemd. They enforce that only writable files are housed under /var/snap. Ensure that /var has sufficient space for your data requirements. To view logs, access them via journalctl snap.couchdb or using the snap logs command: $ sudo snap logs couchdb -f When installing from a specific channel, snaps are automatically re- freshed with new revisions. Revert to a previous installation with: $ sudo snap revert couchdb After this, updates will no longer be received. View installed snaps and alternative channels using the list and info commands: $ snap list $ snap info couchdb As easily as they are installed, snaps can be removed: $ sudo snap remove couchdb $ sudo snap remove couchdb --purge The first command stops the server, removes couchdb from the list, and the filesystem (keeping a backup for about 30 days if space permits). If you reinstall couchdb, it tries to restore the backup. The second command removes couchdb and purges any backups. When troubleshooting couchdb snap, check the logs first. Youll likely need to inspect /var/snap/couchdb/current/etc/local.ini to verify the data directory or modify admin settings, port, or address bindings. Also, anything related to Erlang runtime check /var/snap/couchdb/cur- rent/etc/vm.args to view the erlang name. The most common issue is couchdb not finding the database files. Ensure that local.ini includes the following stanza and points to your data files: [couchdb] ;max_document_size = 4294967296 ; bytes ;os_process_timeout = 5000 database_dir = /var/snap/couchdb/common/data view_index_dir = /var/snap/couchdb/common/data NOTE: Remember, you cannot modify the /snap/couchdb/ directory, even with sudo, as the filesystem is mounted read-only for security reasons. For additional details on the snap build process, refer to our - couchdb-pkg git repository. This includes instructions on setting up a cluster using the command line. Installation on Kubernetes Apache CouchDB provides a Helm chart to enable deployment to Kuber- netes. To install the chart with the release name my-release: helm repo add couchdb https://apache.github.io/couchdb-helm helm repo update helm install --name my-release couchdb/couchdb Further details on the configuration options are available in the Helm chart readme. Search Plugin Installation Added in version 3.0. CouchDB can build and query full-text search indexes using an external Java service that embeds Apache Lucene. Typically, this service is in- stalled on the same host as CouchDB and communicates with it over the loopback network. 
The search plugin is runtime-compatible with Java JDKs 6, 7 and 8. Building a release from source requires JDK 6. It will not work with any newer version of Java. Sorry about that. Installation of Binary Packages Binary packages that bundle all the necessary dependencies of the search plugin are available on GitHub. The files in each release should be unpacked into a directory on the Java classpath. If you do not have a classpath already set, or you wish to explicitly set the classpath location for Clouseau, then add the line: -classpath '/path/to/clouseau/*' to the server command below. If clouseau is installed in /opt/clouseau the line would be: -classpath '/opt/clouseau/*' The service expects to find a couple of configuration files convention- ally called clouseau.ini and log4j.properties with the following con- tent: clouseau.ini: [clouseau] ; the name of the Erlang node created by the service, leave this unchanged name=clouseau@127.0.0.1 ; set this to the same distributed Erlang cookie used by the CouchDB nodes cookie=brumbrum ; the path where you would like to store the search index files dir=/path/to/index/storage ; the number of search indexes that can be open simultaneously max_indexes_open=500 log4j.properties: log4j.rootLogger=debug, CONSOLE log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} %c [%p] %m%n Once these files are in place the service can be started with an invo- cation like the following: java -server \ -Xmx2G \ -Dsun.net.inetaddr.ttl=30 \ -Dsun.net.inetaddr.negative.ttl=30 \ -Dlog4j.configuration=file:/path/to/log4j.properties \ -XX:OnOutOfMemoryError="kill -9 %p" \ -XX:+UseConcMarkSweepGC \ -XX:+CMSParallelRemarkEnabled \ com.cloudant.clouseau.Main \ /path/to/clouseau.ini Chef The CouchDB cookbook can build the search plugin from source and in- stall it on a server alongside CouchDB. Kubernetes Users running CouchDB on Kubernetes via the Helm chart can add the search service to each CouchDB Pod by setting enableSearch: true in the chart values. Additional Details The Search User Guide provides detailed information on creating and querying full-text indexes using this plugin. The source code for the plugin and additional configuration documenta- tion is available on GitHub at - https://github.com/cloudant-labs/clouseau. Nouveau Server Installation Added in version 3.4.0. CouchDB can build and query full-text search indexes using an external Java service that embeds Apache Lucene. Typically, this service is in- stalled on the same host as CouchDB and communicates with it over the loopback network. Nouveau server is runtime-compatible with Java 11 or higher. Enable Nouveau You need to enable nouveau in CouchDB configuration; [nouveau] enable = true Installation of Binary Packages The Java side of nouveau is a set of jar files, one for nouveau itself and the rest for dependencies (like Lucene and Dropwizard). To start the nouveau server: java -jar /path/to/nouveau.jar server /path/to/nouveau.yaml Ensure that all the jar files from the release are in the same direc- tory as nouveau.jar We ship a basic nouveau.yaml configuration with useful defaults; see that file for details. nouveau.yaml: maxIndexesOpen: 100 commitIntervalSeconds: 30 idleSeconds: 60 rootDir: target/indexes As a DropWizard project you can also use the many configuration options that it supports. See configuration reference. 
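On the CouchDB side, besides the enable = true setting shown above, the url setting tells CouchDB where to reach the Nouveau service. A minimal local.ini excerpt might look like the following sketch, assuming Nouveau listens on the same host on port 5987 (the port used in the compose example below):

   [nouveau]
   enable = true
   url = http://127.0.0.1:5987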
By default Nouveau will attempt a clean shutdown if sent a TERM signal, committing any outstanding index updates, completing any in-progress segment merges, and finally closing all indexes. This is not essential and you may safely kill the JVM without letting it do this, though any uncommitted changes are necessarily lost. Once the JVM is started again this indexing work will be attempted again.

Docker

There is a version of the semi-official CouchDB Docker image available under the *-nouveau tags (e.g., 3.4-nouveau).

Compose

A minimal CouchDB/Nouveau cluster can be created with this compose file:

   services:
     couchdb:
       image: couchdb:3
       environment:
         COUCHDB_USER: admin
         COUCHDB_PASSWORD: admin
       volumes:
         - couchdb:/opt/couchdb/data
       ports:
         - 5984:5984
       configs:
         - source: nouveau.ini
           target: /opt/couchdb/etc/local.d/nouveau.ini
     nouveau:
       image: couchdb:3-nouveau

   volumes:
     couchdb:

   configs:
     nouveau.ini:
       content: |
         [couchdb]
         single_node=true

         [nouveau]
         enable = true
         url = http://nouveau:5987

NOTE:
   This is not production ready, but it is a quick way to get Nouveau running.

Upgrading from prior CouchDB releases

Important Notes

• Always back up your data/ and etc/ directories prior to upgrading CouchDB.

• We recommend that you overwrite your etc/default.ini file with the version provided by the new release. New defaults sometimes contain mandatory changes to enable default functionality. Always place your customizations in etc/local.ini or any etc/local.d/*.ini file.

Upgrading from CouchDB 2.x

If you are coming from a prior release of CouchDB 2.x, upgrading is simple.

Standalone (single) node upgrades

If you are running a standalone (single) CouchDB node:

1. Plan for downtime.

2. Backup everything.

3. Check for new recommended settings in the shipped etc/local.ini file, and merge any changes desired into your own local settings file(s).

4. Stop CouchDB.

5. Upgrade CouchDB in place.

6. Be sure to create an admin user if you do not have one. CouchDB 3.0+ requires an admin user to start (the admin party has ended).

7. Start CouchDB.

8. Relax! You're done.

Cluster upgrades

CouchDB 2.x and 3.x are explicitly designed to allow mixed clusters during the upgrade process. This allows you to perform a rolling restart across a cluster, upgrading one node at a time, for a zero downtime upgrade. The process is also entirely scriptable within your configuration management tool of choice. We're proud of this feature, and you should be, too!

If you are running a CouchDB cluster:

1. Backup everything.

2. Check for new recommended settings in the shipped etc/local.ini file, and merge any changes desired into your own local settings file(s), staging these changes to occur as you upgrade the node.

3. Stop CouchDB on a single node.

4. Upgrade that CouchDB install in place.

5. Start CouchDB.

6. Double-check that the node has re-joined the cluster through the /_membership endpoint. If your load balancer has health check functionality driven by the /_up endpoint, check whether it thinks the node is healthy as well. (Example requests are shown after the upgrade notes below.)

7. Repeat the last 4 steps on the remaining nodes in the cluster.

8. Relax! You're done.

Upgrading from CouchDB 1.x

To upgrade from CouchDB 1.x, first upgrade to a version of CouchDB 2.x. You will need to convert all databases to CouchDB 2.x format first; see the Upgrade Notes there for instructions. Then, upgrade to CouchDB 3.x.
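As a concrete illustration of the post-upgrade check in step 6 of the cluster upgrade procedure above (the hostname and credentials are placeholders), each upgraded node can be verified with two requests:

   # Is the node itself up? Expect a 200 response with {"status":"ok"}.
   curl http://admin:password@couchdb-node1.example.com:5984/_up
   # Has it re-joined the cluster? all_nodes should list every member.
   curl http://admin:password@couchdb-node1.example.com:5984/_membership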
Troubleshooting an Installation First Install If your CouchDB doesnt start after youve just installed, check the fol- lowing things: • On UNIX-like systems, this is usually this is a permissions issue. Ensure that youve followed the User Registration and Security chown/chmod commands. This problem is indicated by the presence of the keyword eacces somewhere in the error output from CouchDB itself. • Some Linux distributions split up Erlang into multiple packages. For your distribution, check that you really installed all the required Erlang modules. This varies from platform to platform, so youll just have to work it out for yourself. For example, on recent versions of Ubuntu/Debian, the erlang package includes all Erlang modules. • Confirm that Erlang itself starts up with crypto (SSL) support: ## what version of erlang are you running? Ensure it is supported erl -noshell -eval 'io:put_chars(erlang:system_info(otp_release)).' -s erlang halt ## are the erlang crypto (SSL) libraries working? erl -noshell -eval 'case application:load(crypto) of ok -> io:put_chars("yay_crypto\n") ; _ -> exit(no_crypto) end.' -s init stop • Next, identify where your Erlang CouchDB libraries are installed. This will typically be the lib/ subdirectory of the release that you have installed. • Use this to start up Erlang with the CouchDB libraries in its path: erl -env ERL_LIBS $ERL_LIBS:/path/to/couchdb/lib -couch_ini -s crypto • In that Erlang shell, lets check that the key libraries are running. The %% lines are comments, so you can skip them: %% test SSL support. If this fails, ensure you have the OTP erlang-crypto library installed crypto:md5_init(). %% test Snappy compression. If this fails, check your CouchDB configure script output or alternatively %% if your distro comes with erlang-snappy make sure you're using only the CouchDB supplied version snappy:compress("gogogogogogogogogogogogogogo"). %% test the CouchDB JSON encoder. CouchDB uses different encoders in each release, this one matches %% what is used in 2.0.x. jiffy:decode(jiffy:encode(<<"[1,2,3,4,5]">>)). %% this is how you quit the erlang shell. q(). • The output should resemble this, or an error will be thrown: Erlang/OTP 17 [erts-6.2] [source] [64-bit] [smp:2:2] [async-threads:10] [kernel-poll:false] Eshell V6.2 (abort with ^G) 1> crypto:md5_init(). <<1,35,69,103,137,171,205,239,254,220,186,152,118,84,50, 16,0,0,0,0,0,0,0,0,0,0,0,0,0,...>> 2> snappy:compress("gogogogogogogogogogogogogogo"). {ok,<<28,4,103,111,102,2,0>>} 3> jiffy:decode(jiffy:encode(<<"[1,2,3,4,5]">>)). <<"[1,2,3,4,5]">> 4> q(). • At this point the only remaining dependency is your systems Unicode support library, ICU, and the Spidermonkey Javascript VM from Mozilla. Make sure that your LD_LIBRARY_PATH or equivalent for non-Linux systems (DYLD_LIBRARY_PATH on macOS) makes these available to CouchDB. Linux example running as normal user: LD_LIBRARY_PATH=/usr/local/lib:/usr/local/spidermonkey/lib couchdb Linux example running as couchdb user: echo LD_LIBRARY_PATH=/usr/local/lib:/usr/local/spidermonkey/lib couchdb | sudo -u couchdb sh • If you receive an error message including the key word eaddrinuse, such as this: Failure to start Mochiweb: eaddrinuse edit your ``etc/default.ini`` or ``etc/local.ini`` file and change the ``[chttpd] port = 5984`` line to an available port. • If you receive an error including the string: OS Process Error {os_process_error,{exit_status,127}} then it is likely your SpiderMonkey JavaScript VM installation is not correct. 
Please recheck your build dependencies and try again. • If you receive an error including the string: OS Process Error {os_process_error,{exit_status,139}} this is caused by the fact that SELinux blocks access to certain areas of the file system. You must re-configure SELinux, or you can fully disable SELinux using the command: setenforce 0 • If you are still not able to get CouchDB to start at this point, keep reading. Quick Build Having problems getting CouchDB to run for the first time? Follow this simple procedure and report back to the user mailing list or IRC with the output of each step. Please put the output of these steps into a paste service (such as https://paste.ee/) rather than including the output of your entire run in IRC or the mailing list directly. 1. Note down the name and version of your operating system and your processor architecture. 2. Note down the installed versions of CouchDBs dependencies. 3. Follow the checkout instructions to get a fresh copy of CouchDBs trunk. 4. Configure from the couchdb directory: ./configure 5. Build the release: make release 6. Run the couchdb command and log the output: cd rel/couchdb bin/couchdb 7. Use your systems kernel trace tool and log the output of the above command. a. For example, linux systems should use strace: strace bin/couchdb 2> strace.out 8. Report back to the mailing list (or IRC) with the output of each step. Upgrading Are you upgrading from CouchDB 1.x? Install CouchDB into a fresh direc- tory. CouchDBs directory layout has changed and may be confused by li- braries present from previous releases. Runtime Errors Erlang stack trace contains system_limit, open_port, or emfile Modern Erlang has a default limit of 65536 ports (8196 on Windows), where each open file handle, tcp connection, and linked-in driver uses one port. OSes have different soft and hard limits on the number of open handles per process, often as low as 1024 or 4096 files. Youve probably exceeded this. There are two settings that need changing to increase this value. Con- sult your OS documentation for how to increase the limit for your process. Under Linux and systemd, this setting can be adjusted via sys- temctl edit couchdb and adding the lines: [Service] LimitNOFILE=65536 to the file in the editor. To increase this value higher than 65536, you must also add the Erlang +Q parameter to your etc/vm.args file by adding the line: +Q 102400 The old ERL_MAX_PORTS environment variable is ignored by the version of Erlang supplied with CouchDB. Lots of memory being used on startup Is your CouchDB using a lot of memory (several hundred MB) on startup? This one seems to especially affect Dreamhost installs. Its really an issue with the Erlang VM pre-allocating data structures when ulimit is very large or unlimited. A detailed discussion can be found on the er- lang-questions list, but the short answer is that you should decrease ulimit -n or lower the vm.args parameter +Q to something reasonable like 1024. Function raised exception (Cannot encode undefined value as JSON) If you see this in the CouchDB error logs, the JavaScript code you are using for either a map or reduce function is referencing an object mem- ber that is not defined in at least one document in your database. Con- sider this document: { "_id":"XYZ123", "_rev":"1BB2BB", "field":"value" } and this map function: function(doc) { emit(doc.name, doc.address); } This will fail on the above document, as it does not contain a name or address member. 
Instead, use guarding to make sure the function only accesses members when they exist in a document: function(doc) { if(doc.name && doc.address) { emit(doc.name, doc.address); } } While the above guard will work in most cases, its worth bearing JavaScripts understanding of false values in mind. Testing against a property with a value of 0 (zero), '' (empty String), false or null will return false. If this is undesired, a guard of the form if (doc.foo !== undefined) should do the trick. This error can also be caused if a reduce function does not return a value. For example, this reduce function will cause an error: function(key, values) { sum(values); } The function needs to return a value: function(key, values) { return sum(values); } Erlang stack trace contains bad_utf8_character_code CouchDB 1.1.1 and later contain stricter handling of UTF8 encoding. If you are replicating from older versions to newer versions, then this error may occur during replication. A number of work-arounds exist; the simplest is to do an in-place up- grade of the relevant CouchDB and then compact prior to replicating. Alternatively, if the number of documents impacted is small, use fil- tered replication to exclude only those documents. FIPS mode Operating systems can be configured to disallow the use of OpenSSL MD5 hash functions in order to prevent use of MD5 for cryptographic pur- poses. CouchDB makes use of MD5 hashes for verifying the integrity of data (and not for cryptography) and will not run without the ability to use MD5 hashes. The message below indicates that the operating system is running in FIPS mode, which, among other restrictions, does not allow the use of OpenSSLs MD5 functions: md5_dgst.c(82): OpenSSL internal error, assertion failed: Digest MD5 forbidden in FIPS mode! [os_mon] memory supervisor port (memsup): Erlang has closed [os_mon] cpu supervisor port (cpu_sup): Erlang has closed Aborted A workaround for this is provided with the --erlang-md5 compile flag. Use of the flag results in CouchDB substituting the OpenSSL MD5 func- tion calls with equivalent calls to Erlangs built-in library er- lang:md5. NOTE: there may be a performance penalty associated with this workaround. Because CouchDB does not make use of MD5 hashes for cryptographic pur- poses, this workaround does not defeat the purpose of FIPS mode, pro- vided that the system owner is aware of and consents to its use. Debugging startup If youve compiled from scratch and are having problems getting CouchDB to even start up, you may want to see more detail. Start by enabling logging at the debug level: [log] level = debug You can then pass the -init_debug +W i +v +V -emu_args flags in the ERL_FLAGS environment variable to turn on additional debugging informa- tion that CouchDB developers can use to help you. Then, reach out to the CouchDB development team using the links pro- vided on the CouchDB home page for assistance. macOS Known Issues undefined error, exit_status 134 Sometimes the Verify Installation fails with an undefined error. This could be due to a missing dependency with Mac. In the logs, you will find couchdb exit_status,134. Installing the missing nspr via brew install nspr resolves the issue. (see: https://github.com/apache/couchdb/issues/979) SETUP CouchDB 2.x can be deployed in either a single-node or a clustered con- figuration. This section covers the first-time setup steps required for each of these configurations. Single Node Setup Many users simply need a single-node CouchDB 2.x installation. 
Operationally, it is roughly equivalent to the CouchDB 1.x series. Note that a single-node setup obviously doesn't take advantage of the new scaling and fault-tolerance features in CouchDB 2.x.

After installation and initial startup, visit Fauxton at http://127.0.0.1:5984/_utils#setup. You will be asked to set up CouchDB as a single-node instance or set up a cluster. When you click Single-Node-Setup, you will be asked for an admin username and password. Choose them well and remember them.

You can also bind CouchDB to a public address, so it is accessible within your LAN or to the public, if you are doing this on a public VM. Or, you can keep the installation private by binding only to 127.0.0.1 (localhost). Binding to 0.0.0.0 will bind to all addresses. The wizard then configures your admin username and password and creates the three system databases _users, _replicator and _global_changes for you.

Another option is to set the configuration parameter [couchdb] single_node=true in your local.ini file. When doing this, CouchDB will create the system databases for you on restart.

Alternatively, if you don't want to use the Setup Wizard or set that value, and run 3.x as a single node with a server administrator already configured via config file, make sure to create the three system databases manually on startup:

   curl -X PUT http://adm:pass@127.0.0.1:5984/_users
   curl -X PUT http://adm:pass@127.0.0.1:5984/_replicator
   curl -X PUT http://adm:pass@127.0.0.1:5984/_global_changes

Note that the last of these is not necessary if you do not expect to be using the global changes feed. Feel free to delete this database if you have created it, it has grown in size, and you do not need the function (and do not wish to waste system resources on compacting it regularly).

Cluster Set Up

This section describes everything you need to know to prepare, install, and set up your first CouchDB 2.x/3.x cluster.

Ports and Firewalls

CouchDB uses the following ports:

   +-------------------------------+----------+-------------------------------------+----------------------------------+
   | Port Number                   | Protocol | Recommended binding                 | Usage                            |
   +-------------------------------+----------+-------------------------------------+----------------------------------+
   | 5984                          | tcp      | As desired, by default localhost    | Standard clustered port for all  |
   |                               |          |                                     | HTTP API requests                |
   +-------------------------------+----------+-------------------------------------+----------------------------------+
   | 4369                          | tcp      | localhost for single node installs. | Erlang port mapper daemon (epmd) |
   |                               |          | Private interface if clustered      |                                  |
   +-------------------------------+----------+-------------------------------------+----------------------------------+
   | Random above 1024 (see below) | tcp      | Private interface                   | Communication with other CouchDB |
   |                               |          |                                     | nodes in the cluster             |
   +-------------------------------+----------+-------------------------------------+----------------------------------+

CouchDB in clustered mode uses the port 5984, just as in a standalone configuration. Port 5986, previously used in CouchDB 2.x, has been removed in CouchDB 3.x. All endpoints previously accessible at that port are now available under the /_node/{node-name}/... hierarchy via the primary 5984 port.

CouchDB uses Erlang-native clustering functionality to achieve a clustered installation. Erlang uses TCP port 4369 (EPMD) to find other nodes, so all servers must be able to speak to each other on this port. In an Erlang cluster, all nodes are connected to all other nodes, in a mesh network configuration.
Every Erlang application running on that machine (such as CouchDB) then uses automatically assigned ports for communication with other nodes. Yes, this means random ports. This will obviously not work with a fire- wall, but it is possible to force an Erlang application to use a spe- cific port range. This documentation will use the range TCP 9100-9200, but this range is unnecessarily broad. If you only have a single Erlang application run- ning on a machine, the range can be limited to a single port: 9100-9100, since the ports erlang assigns are for inbound connections only. Three CouchDB nodes running on a single machine, as in a develop- ment cluster scenario, would need three ports in this range. WARNING: If you expose the distribution port to the Internet or any other un- trusted network, then the only thing protecting you is the Erlang - cookie. Configure and Test the Communication with Erlang Make CouchDB use correct IP|FQDN and the open ports In file etc/vm.args change the line -name couchdb@127.0.0.1 to -name couchdb@<reachable-ip-address|fully-qualified-domain-name> which de- fines the name of the node. Each node must have an identifier that al- lows remote systems to talk to it. The node name is of the form <name>@<reachable-ip-address|fully-qualified-domain-name>. The name portion can be couchdb on all nodes, unless you are running more than 1 CouchDB node on the same server with the same IP address or domain name. In that case, we recommend names of couchdb1, couchdb2, etc. The second portion of the node name must be an identifier by which other nodes can access this node either the nodes fully qualified do- main name (FQDN) or the nodes IP address. The FQDN is preferred so that you can renumber the nodes IP address without disruption to the clus- ter. (This is common in cloud-hosted environments.) WARNING: Changing the name later is somewhat cumbersome (i.e. moving shards), which is why you will want to set it once and not have to change it. Open etc/vm.args, on all nodes, and add -kernel inet_dist_listen_min 9100 and -kernel inet_dist_listen_max 9200 like below: -name ... -setcookie ... ... -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9200 Again, a small range is fine, down to a single port (set both to 9100) if you only ever run a single CouchDB node on each machine. Confirming connectivity between nodes For this test, you need 2 servers with working hostnames. Let us call them server1.test.com and server2.test.com. They reside at 192.168.0.1 and 192.168.0.2, respectively. On server1.test.com: erl -name bus@192.168.0.1 -setcookie 'brumbrum' -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9200 Then on server2.test.com: erl -name car@192.168.0.2 -setcookie 'brumbrum' -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9200 An explanation to the commands: • erl the Erlang shell. • -name bus@192.168.0.1 the name of the Erlang node and its IP address or FQDN. • -setcookie 'brumbrum' the password used when nodes connect to each other. • -kernel inet_dist_listen_min 9100 the lowest port in the range. • -kernel inet_dist_listen_max 9200 the highest port in the range. This gives us 2 Erlang shells. shell1 on server1, shell2 on server2. Time to connect them. Enter the following, being sure to end the line with a period (.): In shell1: net_kernel:connect_node('car@192.168.0.2'). This will connect to the node called car on the server called 192.168.0.2. If that returns true, then you have an Erlang cluster, and the fire- walls are open. 
This means that 2 CouchDB nodes on these two servers will be able to communicate with each other successfully. If you get false or nothing at all, then you have a problem with the firewall, DNS, or your settings. Try again. If youre concerned about firewall issues, or having trouble connecting all nodes of your cluster later on, repeat the above test between all pairs of servers to confirm connectivity and system configuration is correct. Preparing CouchDB nodes to be joined into a cluster Before you can add nodes to form a cluster, you must have them listen- ing on an IP address accessible from the other nodes in the cluster. You should also ensure that a few critical settings are identical across all nodes before joining them. The settings we recommend you set now, before joining the nodes into a cluster, are: 1. etc/vm.args settings as described in the previous two sections 2. At least one server administrator user (and password) 3. Bind the nodes clustered interface (port 5984) to a reachable IP ad- dress 4. A consistent UUID. The UUID is used in identifying the cluster when replicating. If this value is not consistent across all nodes in the cluster, replications may be forced to rewind the changes feed to zero, leading to excessive memory, CPU and network use. 5. A consistent httpd secret. The secret is used in calculating and evaluating cookie and proxy authentication, and should be set con- sistently to avoid unnecessary repeated session cookie requests. As of CouchDB 3.0, steps 4 and 5 above are automatically performed for you when using the setup API endpoints described below. If you use a configuration management tool, such as Chef, Ansible, Pup- pet, etc., then you can place these settings in a .ini file and dis- tribute them to all nodes ahead of time. Be sure to pre-encrypt the password (cutting and pasting from a test instance is easiest) if you use this route to avoid CouchDB rewriting the file. If you do not use configuration management, or are just experimenting with CouchDB for the first time, use these commands once per server to perform steps 2-4 above. Be sure to change the password to something secure, and again, use the same password on all nodes. You may have to run these commands locally on each node; if so, replace <server-IP|FQDN> below with 127.0.0.1. # First, get two UUIDs to use later on. Be sure to use the SAME UUIDs on all nodes. curl http://<server-IP|FQDN>:5984/_uuids?count=2 # CouchDB will respond with something like: # {"uuids":["60c9e8234dfba3e2fdab04bf92001142","60c9e8234dfba3e2fdab04bf92001cc2"]} # Copy the provided UUIDs into your clipboard or a text editor for later use. # Use the first UUID as the cluster UUID. # Use the second UUID as the cluster shared http secret. 
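# Optional convenience, an illustrative aside rather than part of the official steps:
# if the jq tool is installed, the two UUIDs can be captured into shell variables
# instead of being copied by hand, for example:
#
#   UUIDS=$(curl -s "http://<server-IP|FQDN>:5984/_uuids?count=2")
#   UUID1=$(echo "$UUIDS" | jq -r '.uuids[0]')   # cluster UUID
#   UUID2=$(echo "$UUIDS" | jq -r '.uuids[1]')   # shared http secret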
# Create the admin user and password: curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/admins/admin -d '"password"' # Now, bind the clustered interface to all IP addresses available on this machine curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/chttpd/bind_address -d '"0.0.0.0"' # If not using the setup wizard / API endpoint, the following 2 steps are required: # Set the UUID of the node to the first UUID you previously obtained: curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/couchdb/uuid -d '"FIRST-UUID-GOES-HERE"' # Finally, set the shared http secret for cookie creation to the second UUID: curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/chttpd_auth/secret -d '"SECOND-UUID-GOES-HERE"' The Cluster Setup Wizard CouchDB 2.x/3.x comes with a convenient Cluster Setup Wizard as part of the Fauxton web administration interface. For first-time cluster setup, and for experimentation, this is your best option. It is strongly recommended that the minimum number of nodes in a clus- ter is 3. For more explanation, see the Cluster Theory section of this documentation. After installation and initial start-up of all nodes in your cluster, ensuring all nodes are reachable, and the pre-configuration steps listed above, visit Fauxton at http://<server1>:5984/_utils#setup. You will be asked to set up CouchDB as a single-node instance or set up a cluster. When you click Setup Cluster you are asked for admin credentials again, and then to add nodes by IP address. To get more nodes, go through the same install procedure for each node, using the same machine to perform the setup process. Be sure to specify the total number of nodes you expect to add to the cluster before adding nodes. Now enter each nodes IP address or FQDN in the setup wizard, ensuring you also enter the previously set server admin username and password. Once you have added all nodes, click Setup and Fauxton will finish the cluster configuration for you. To check that all nodes have been joined correctly, visit http://<server-IP|FQDN>:5984/_membership on each node. The returned list should show all of the nodes in your cluster: { "all_nodes": [ "couchdb@server1.test.com", "couchdb@server2.test.com", "couchdb@server3.test.com" ], "cluster_nodes": [ "couchdb@server1.test.com", "couchdb@server2.test.com", "couchdb@server3.test.com" ] } The cluster_nodes section is the list of expected nodes; the all_nodes section is the list of actually connected nodes. Be sure the two lists match. Now your cluster is ready and available! You can send requests to any one of the nodes, and all three will respond as if you are working with a single CouchDB cluster. For a proper production setup, youd now set up an HTTP reverse proxy in front of the cluster, for load balancing and SSL termination. We recom- mend HAProxy, but others can be used. Sample configurations are avail- able in the Best Practices section. The Cluster Setup API If you would prefer to manually configure your CouchDB cluster, CouchDB exposes the _cluster_setup endpoint for that purpose. After installa- tion and initial setup/config, we can set up the cluster. On each node we need to run the following command to set up the node: curl -X POST -H "Content-Type: application/json" http://admin:password@127.0.0.1:5984/_cluster_setup -d '{"action": "enable_cluster", "bind_address":"0.0.0.0", "username": "admin", "password":"password", "node_count":"3"}' After that we can join all the nodes together. 
Choose one node as the setup coordination node to run all these commands on. This setup coordination node only manages the setup and requires all other nodes to be able to see it and vice versa. It has no special purpose beyond the setup process; CouchDB does not have the concept of a master node in a cluster.

Setup will not work with unavailable nodes. All nodes must be online and properly preconfigured before the cluster setup process can begin.

To join a node to the cluster, run these commands for each node you want to add:

   curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "enable_cluster", "bind_address":"0.0.0.0", "username": "admin", "password":"password", "port": 5984, "node_count": "3", "remote_node": "<remote-node-ip>", "remote_current_user": "<remote-node-username>", "remote_current_password": "<remote-node-password>" }'
   curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "add_node", "host":"<remote-node-ip>", "port": <remote-node-port>, "username": "admin", "password":"password"}'

This will join the two nodes together. Keep running the above commands for each node you want to add to the cluster. Once this is done, run the following command to complete the cluster setup and add the system databases:

   curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "finish_cluster"}'

Verify install:

   curl http://admin:password@<setup-coordination-node>:5984/_cluster_setup

Response:

   {"state":"cluster_finished"}

Verify all cluster nodes are connected:

   curl http://admin:password@<setup-coordination-node>:5984/_membership

Response:

   {
     "all_nodes": [
       "couchdb@couch1.test.com",
       "couchdb@couch2.test.com",
       "couchdb@couch3.test.com"
     ],
     "cluster_nodes": [
       "couchdb@couch1.test.com",
       "couchdb@couch2.test.com",
       "couchdb@couch3.test.com"
     ]
   }

If the cluster is enabled and the all_nodes and cluster_nodes lists don't match, use curl to add nodes with PUT /_node/_local/_nodes/couchdb@<reachable-ip-address|fully-qualified-domain-name> and remove nodes with DELETE /_node/_local/_nodes/couchdb@<reachable-ip-address|fully-qualified-domain-name>.

Your CouchDB cluster is now set up.

CONFIGURATION

Introduction To Configuring

Configuration files

By default, CouchDB reads configuration files from the following locations, in the following order:

1. etc/default.ini

2. etc/default.d/*.ini

3. etc/local.ini

4. etc/local.d/*.ini

Configuration files in the *.d/ directories are sorted by name; that means, for example, a file named etc/local.d/00-shared.ini is loaded before etc/local.d/10-server-specific.ini.

All paths are specified relative to the CouchDB installation directory: /opt/couchdb recommended on UNIX-like systems, C:\CouchDB recommended on Windows systems, and a combination of two directories on macOS: Applications/Apache CouchDB.app/Contents/Resources/couchdbx-core/etc for the default.ini and default.d directories, and one of /Users/<your-user>/Library/Application Support/CouchDB2/etc/couchdb or /Users/<your-user>/Library/Preferences/couchdb2-local.ini for the local.ini and local.d directories.

Settings in successive documents override the settings in earlier entries. For example, setting the chttpd/bind_address parameter in local.ini would override any setting in default.ini.
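To make that precedence concrete, here is a small illustrative example (the file contents are invented for illustration, although 127.0.0.1 is the shipped default bind address). If the two files contain:

   ; etc/default.ini (shipped defaults)
   [chttpd]
   bind_address = 127.0.0.1

   ; etc/local.ini (your local overrides)
   [chttpd]
   bind_address = 0.0.0.0

then CouchDB binds to 0.0.0.0, because local.ini is read later in the chain and its value wins.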
WARNING:
   The default.ini file may be overwritten during an upgrade or re-installation, so localised changes should be made to the local.ini file or to files within the local.d directory.

The configuration file chain may be changed by setting the ERL_FLAGS environment variable:

   export ERL_FLAGS="-couch_ini /path/to/my/default.ini /path/to/my/local.ini"

or by placing the -couch_ini .. flag directly in the etc/vm.args file. Passing -couch_ini .. as a command-line argument when launching couchdb is the same as setting the ERL_FLAGS environment variable.

WARNING:
   The environment variable/command-line flag overrides any -couch_ini option specified in the etc/vm.args file. And, BOTH of these options completely prevent CouchDB from searching the default locations. Use these options only when necessary, and be sure to track the contents of etc/default.ini, which may change in future releases.

If there is a need to use different vm.args or sys.config files, for example, in different locations to the ones provided by CouchDB, or you don't want to edit the original files, the default locations may be changed by setting the COUCHDB_ARGS_FILE or COUCHDB_SYSCONFIG_FILE environment variables:

   export COUCHDB_ARGS_FILE="/path/to/my/vm.args"
   export COUCHDB_SYSCONFIG_FILE="/path/to/my/sys.config"

Parameter names and values

All parameter names are case-sensitive. Every parameter takes a value of one of five types: boolean, integer, string, tuple and proplist. Boolean values can be written as true or false.

Parameters with a value type of tuple or proplist follow the Erlang requirements for style and naming.

Setting parameters via the configuration file

Changed in version 3.3: added ability to have = in parameter names.

Changed in version 3.3: removed the undocumented ability to have multi-line values.

The common way to set some parameters is to edit the local.ini file (location explained above). For example:

   ; This is a comment
   [section]
   param = value ; inline comments are allowed

Each configuration file line may contain a section definition, a parameter specification, an empty line (space and newline characters only) or a comment. You can add inline comments for sections or parameters.

A section defines a group of parameters that belong to some specific CouchDB subsystem. For instance, the httpd section holds not only HTTP server parameters, but also others that directly interact with it.

The parameter specification contains two parts divided by the equal sign (=): the parameter name on the left side and the parameter value on the right. Leading and trailing whitespace around = is optional, to improve configuration readability.

Since version 3.3 it's possible to use = in parameter names, but only when the parameter and value are separated by " = ", i.e. the equal sign is surrounded by at least one space on each side. This might be useful in the [jwt_keys] section, where base64-encoded keys may contain some = characters.

The semicolon (;) signals the start of a comment. Everything after this character is ignored by CouchDB.

After editing the configuration file, CouchDB should be restarted to apply any changes.

Setting parameters via the HTTP API

Alternatively, configuration parameters can be set via the HTTP API.
This API allows changing CouchDB configuration on-the-fly without re- quiring a server restart: curl -X PUT http://adm:pass@localhost:5984/_node/<name@host>/_config/uuids/algorithm -d '"random"' The old parameters value is returned in the response: "sequential" You should be careful changing configuration via the HTTP API since its possible to make CouchDB unreachable, for example, by changing the chttpd/bind_address: curl -X PUT http://adm:pass@localhost:5984/_node/<name@host>/_config/chttpd/bind_address -d '"10.10.0.128"' If you make a typo or the specified IP address is not available from your network, CouchDB will be unreachable. The only way to resolve this will be to remote into the server, correct the config file, and restart CouchDB. To protect yourself against such accidents you may set the chttpd/config_whitelist of permitted configuration parameters for up- dates via the HTTP API. Once this option is set, further changes to non-whitelisted parameters must take place via the configuration file, and in most cases, will also require a server restart before taking ef- fect. Configuring the local node While the HTTP API allows configuring all nodes in the cluster, as a convenience, you can use the literal string _local in place of the node name, to interact with the local nodes configuration. For example: curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/uuids/algorithm -d '"random"' Base Configuration Base CouchDB Options [couchdb] attachment_stream_buffer_size Higher values may result in better read performance due to fewer read operations and/or more OS page cache hits. However, they can also increase overall response time for writes when there are many attachment write requests in parallel. [couchdb] attachment_stream_buffer_size = 4096 database_dir Specifies location of CouchDB database files (*.couch named). This location should be writable and readable for the user the CouchDB service runs as (couchdb by de- fault). [couchdb] database_dir = /var/lib/couchdb default_security Changed in version 3.0: admin_only is now the default. Default security object for databases if not explicitly set. When set to everyone, anyone can performs reads and writes. When set to admin_only, only admins can read and write. When set to admin_local, sharded databases can be read and written by anyone but the shards can only be read and written by admins. [couchdb] default_security = admin_only enable_database_recovery Enable this to only soft-delete databases when DELETE /{db} DELETE requests are made. This will rename all shards of the database with a suffix of the form <db- name>.YMD.HMS.deleted.couchdb. You can then manually delete these files later, as desired. Default is false. [couchdb] enable_database_recovery = false file_compression Changed in version 1.2: Added Google Snappy compression algorithm. Method used to compress everything that is appended to database and view index files, except for attachments (see the attachments section). Available methods are: • none: no compression • snappy: use Google Snappy, a very fast compressor/de- compressor • deflate_N: use zlibs deflate; N is the compression level which ranges from 1 (fastest, lowest compression ratio) to 9 (slowest, highest compression ratio) [couchdb] file_compression = snappy maintenance_mode A CouchDB node may be put into two distinct maintenance modes by setting this configuration parameter. • true: The node will not respond to clustered requests from other nodes and the /_up endpoint will return a 404 response. 
• nolb: The /_up endpoint will return a 404 response. • false: The node responds normally, /_up returns a 200 response. It is expected that the administrator has configured a load balancer in front of the CouchDB nodes in the clus- ter. This load balancer should use the /_up endpoint to determine whether or not to send HTTP requests to any particular node. For HAProxy, the following config is ap- propriate: http-check disable-on-404 option httpchk GET /_up max_dbs_open This option places an upper bound on the number of data- bases that can be open at once. CouchDB reference counts database accesses internally and will close idle data- bases as needed. Sometimes it is necessary to keep more than the default open at once, such as in deployments where many databases will be replicating continuously. [couchdb] max_dbs_open = 100 max_document_size Changed in version 3.0.0. Limit maximum document body size. Size is calculated based on the serialized Erlang representation of the JSON document body, because that reflects more accurately the amount of storage consumed on disk. In particular, this limit does not include attachments. HTTP requests which create or update documents will fail with error code 413 if one or more documents is larger than this configuration value. In case of _update handlers, document size is checked af- ter the transformation and right before being inserted into the database. [couchdb] max_document_size = 8000000 ; bytes WARNING: Before version 2.1.0 this setting was implemented by simply checking http request body sizes. For individ- ual document updates via PUT that approximation was close enough, however that is not the case for _bulk_docs endpoint. After 2.1.0 a separate configura- tion parameter was defined: chttpd/max_http_request_size, which can be used to limit maximum http request sizes. After upgrade, it is advisable to review those settings and adjust them ac- cordingly. os_process_timeout If an external process, such as a query server or exter- nal process, runs for this amount of milliseconds without returning any results, it will be terminated. Keeping this value smaller ensures you get expedient errors, but you may want to tweak it for your specific needs. [couchdb] os_process_timeout = 5000 ; 5 sec single_node Added in version 3.0.0. When this configuration setting is set to true, automati- cally create the system databases on startup. Must be set false for a clustered CouchDB installation. uri_file This file contains the full URI that can be used to ac- cess this instance of CouchDB. It is used to help dis- cover the port CouchDB is running on (if it was set to 0 (e.g. automatically assigned any free one). This file should be writable and readable for the user that runs the CouchDB service (couchdb by default). [couchdb] uri_file = /var/run/couchdb/couchdb.uri users_db_security_editable Added in version 3.0.0. When this configuration setting is set to false, reject any attempts to modify the _users database security ob- ject. Modification of this object is deprecated in 3.x and will be completely disallowed in CouchDB 4.x. users_db_suffix Specifies the suffix (last component of a name) of the system database for storing CouchDB users. [couchdb] users_db_suffix = _users WARNING: If you change the database name, do not forget to re- move or clean up the old database, since it will no longer be protected by CouchDB. util_driver_dir Specifies location of binary drivers (icu, ejson, etc.). 
This location and its contents should be readable for the user that runs the CouchDB service. [couchdb] util_driver_dir = /usr/lib/couchdb/erlang/lib/couch-1.5.0/priv/lib uuid Added in version 1.3. Unique identifier for this CouchDB cluster. [couchdb] uuid = 0a959b9b8227188afc2ac26ccdf345a6 view_index_dir Specifies location of CouchDB view index files. This lo- cation should be writable and readable for the user that runs the CouchDB service (couchdb by default). [couchdb] view_index_dir = /var/lib/couchdb write_xxhash_checksums Added in version 3.4. The default value in version 3.4 is false. The legacy checksum algorithm will be used for writing couch_file blocks. During reads, both xxHash and the legacy checksum algorithm will be used to verify data integrity. In a fu- ture version of CouchDB the default value will become true. However, it would still be possible to safely down- grade to version 3.4, which would be able to verify both xxHash and legacy checksums. If CouchDB version downgrade is not a concern, enabling xxHash checksums can result in a measuralbe document read performance, especially for larger document sizes: [couchdb] write_xxhash_checksums = false js_engine Changed in version 3.4. Select the default Javascript engine. Available options are spidermonkey and quickjs. The default setting is spi- dermonkey: [couchdb] js_engine = spidermonkey Configuring Clustering Cluster Options [cluster] q Sets the default number of shards for newly created databases. The default value, 2, splits a database into 2 separate parti- tions. [cluster] q = 2 For systems with only a few, heavily accessed, large databases, or for servers with many CPU cores, consider increasing this value to 4 or 8. The value of q can also be overridden on a per-DB basis, at DB creation time. SEE ALSO: PUT /{db} n Sets the number of replicas of each document in a cluster. CouchDB will only place one replica per node in a cluster. When set up through the Cluster Setup Wizard, a standalone single node will have n = 1, a two node cluster will have n = 2, and any larger cluster will have n = 3. It is recommended not to set n greater than 3. [cluster] n = 3 placement WARNING: Use of this option will override the n option for replica cardinality. Use with care. Sets the cluster-wide replica placement policy when creating new databases. The value must be a comma-delimited list of strings of the format zone_name:#, where zone_name is a zone as speci- fied in the nodes database and # is an integer indicating the number of replicas to place on nodes with a matching zone_name. This parameter is not specified by default. [cluster] placement = metro-dc-a:2,metro-dc-b:1 SEE ALSO: Placing a database on specific nodes seedlist An optional, comma-delimited list of node names that this node should contact in order to join a cluster. If a seedlist is con- figured the _up endpoint will return a 404 until the node has successfully contacted at least one of the members of the seedlist and replicated an up-to-date copy of the _nodes, _dbs, and _users system databases. [cluster] seedlist = - couchdb@node1.example.com,couchdb@node2.example.com reconnect_interval_sec Added in version 3.3. Period in seconds specifying how often to attempt reconnecting to disconnected nodes. There is a 25% random jitter applied to this value. RPC Performance Tuning [rexi] CouchDB uses distributed Erlang to communicate between nodes in a cluster. The rexi library provides an optimized RPC mechanism over this communication channel. 
There are a few configuration knobs for this system, although in general the defaults work well. buffer_count The local RPC server will buffer messages if a remote node goes unavailable. This flag determines how many messages will be buffered before the local server starts dropping messages. De- fault value is 2000. stream_limit Added in version 3.0. This flag comes into play during streaming operations like views and change feeds. It controls how many messages a remote worker process can send to a coordinator without waiting for an ac- knowledgement from the coordinator process. If this value is too large the coordinator can become overwhelmed by messages from the worker processes and actually deliver lower overall through- put to the client. In CouchDB 2.x this value was hard-coded to 10. In the 3.x series it is configurable and defaults to 5. Databases with a high q value are especially sensitive to this setting. Database Per User Database Per User Options [couch_peruser] enable If set to true, couch_peruser ensures that a private per-user database exists for each document in _users. These databases are writable only by the corresponding user. Database names are in the following form: userdb-{UTF-8 hex encoded username}. [couch_peruser] enable = false NOTE: The _users database must exist before couch_peruser can be enabled. TIP: Under NodeJS, user names can be converted to and from data- base names thusly: function dbNameToUsername(prefixedHexName) { return Buffer.from(prefixedHexName.replace('userdb-', ''), 'hex').toString('utf8'); } function usernameToDbName(name) { return 'userdb-' + Buffer.from(name).toString('hex'); } delete_dbs If set to true and a user is deleted, the respective database gets deleted as well. [couch_peruser] delete_dbs = false Note: When using JWT authorization, the provided token must in- clude a custom _couchdb.roles=['_admin'] claim to for the pe- ruser database to be properly created and accessible for the user provided in the sub= claim. q If set, specify the sharding value for per-user databases. If unset, the cluster default value will be used. [couch_peruser] q = 1 Disk Monitor Configuration Apache CouchDB can react proactively when disk space gets low. Disk Monitor Options [disk_monitor] Added in version 3.4: background_view_indexing_threshold The percentage of used disk space on the view_index_dir above which CouchDB will no longer start background view indexing jobs. Defaults to 80. [disk_monitor] background_view_indexing_threshold = 80 interactive_database_writes_threshold The percentage of used disk space on the database_dir above which CouchDB will no longer allow interactive doc- ument updates (writes or deletes). Replicated updates and database deletions are still per- mitted. In a clustered write an error will be returned if enough nodes are above the interactive_database_writes_thresh- old. Defaults to 90. [disk_monitor] interactive_database_writes_threshold = 90 enable Enable disk monitoring subsystem. Defaults to false. [disk_monitor] enable = false interactive_view_indexing_threshold The percentage of used disk space on the view_index_dir above which CouchDB will no longer update stale view in- dexes when queried. View indexes that are already up to date can still be queried, and stale view indexes can be queried if either stale=ok or update=false are set. Attempts to query a stale index without either parameter will yield a 507 Insufficient Storage error. Defaults to 90. 
[disk_monitor] interactive_view_indexing_threshold = 90 Scanner Configure background scanning plugins. Added in version 3.4. Scanner Options [couch_scanner] interval_sec How often to check for configuration changes and start/stop plugins. The default is 5 seconds. [couch_scanner] interval_sec = 5 min_penalty_sec Minimum time to force a plugin to wait before running again after a crash. Defaults to 30 seconds. [couch_scanner] min_penalty_sec = 30 max_penalty_sec Maximum time to force a plugin to wait after repeated crashes. The default is 8 hours (in seconds). [couch_scanner] max_penalty_sec = 28800 heal_threshold_sec If a plugin runs successfully without crashing for this long, its repeated error count is reset. Defaults to 5 minutes (in seconds). [couch_scanner] heal_threshold_sec = 300 db_rate_limit Database processing rate limit. This is also the rate at which design documents are fetched. The rate is shared across all running plugins. [couch_scanner] db_rate_limit = 25 shard_rate_limit Limits the rate at which plugins may open db shard files on a node. The rate is shared across all running plugins. [couch_scanner] shard_rate_limit = 50 doc_rate_limit Limits the rate at which plugins open documents. The rate is shared across all running plugins. [couch_scanner] doc_rate_limit = 1000 [couch_scanner_plugins] {plugin} Which plugins are enabled. By default plugins are disabled. [couch_scanner_plugins] couch_scanner_plugin_ddoc_features = false couch_scanner_plugin_find = false couch_quickjs_scanner_plugin = false [{plugin}] These settings apply to all the plugins. Some plugins may also have individual settings in their [{plugin}] section. after Run the plugin on or after this time. The default is to run once after the node starts. Possible time formats are: unix seconds (ex. 1712338014) or date/time: YYYY-MM-DD, YYYY-MM-DDTHH, YYYY-MM-DDTHH:MM. Times are in UTC. [{plugin}] after = restart repeat Run the plugin periodically. By default it will run once after the node starts. Possible period formats are: {num}_{timeunit} (ex.: 1000_sec, 30_min, 8_hours, 24_hour, 2_days, 3_weeks, 1_month) or {weekday} (ex.: mon, monday, Thu, etc.) [{plugin}] repeat = restart [{plugin}.skip_dbs] {tag} Skip over databases if their names match any of these regexes. [{plugin}.skip_dbs] regex1 = a|b regex2 = bar(.*)baz [{plugin}.skip_ddocs] {tag} Skip over design documents if their DocIDs match any of these regexes. [{plugin}.skip_ddocs] regex1 = x|y|z regex2 = c(d|e)f [{plugin}.skip_docs] {tag} Skip over documents if their DocIDs match any of these regexes. [{plugin}.skip_docs] regex1 = k|l regex2 = mno$ [couch_scanner_plugin_find.regexes] {tag} Configure regular expressions to find. The format is {tag} = {regex} Reports will be emitted to the log as warnings mentioning only their tag. By default, no regular expressions are defined and the plugin will run but won't report anything. [couch_scanner_plugin_find.regexes] regex1 = s3cret(1|2|3) regex2 = n33dl3
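For instance, a minimal sketch of enabling the find plugin through local.ini (the tag name secrets and the pattern are made up; restart CouchDB afterwards, or make the equivalent changes through the configuration API described above):
    cat >> /opt/couchdb/etc/local.ini <<'EOF'
    [couch_scanner_plugins]
    couch_scanner_plugin_find = true

    [couch_scanner_plugin_find.regexes]
    secrets = s3cret
    EOF
Matches are then reported in the log as warnings that mention only the tag (secrets).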
[couch_scanner_plugin_ddoc_features] updates Report if design documents have update handlers. Enabled by default. [couch_scanner_plugin_ddoc_features] updates = true shows Report if design documents have shows. Enabled by default. [couch_scanner_plugin_ddoc_features] shows = true rewrites Report if design documents have rewrites. Enabled by default. [couch_scanner_plugin_ddoc_features] rewrites = true filters Report if design documents have Javascript filters. Disabled by default. [couch_scanner_plugin_ddoc_features] filters = false reduce Report if design documents have Javascript reduce functions. Disabled by default. [couch_scanner_plugin_ddoc_features] reduce = false validate_doc_update Report if design documents have validate document update functions. Disabled by default. [couch_scanner_plugin_ddoc_features] validate_doc_update = false run_on_first_node Run the plugin on the first node only or on all the nodes. The default is to run only on the first node of the cluster. If the value is false, each node of the cluster will process a consistent subset of the databases, so scanning will go faster but might consume more resources. [couch_scanner_plugin_ddoc_features] run_on_first_node = true ddoc_report Emit reports for each design doc or aggregate them per database. Emitting them per design doc will indicate the design document name; however, if there are too many design documents, that may generate a lot of logs. The default is to aggregate reports per database. [couch_scanner_plugin_ddoc_features] ddoc_report = false QuickJS Configure the QuickJS Javascript engine. QuickJS is a new Javascript engine installed alongside the default Spidermonkey engine. It is disabled by default, but may be enabled via configuration settings. The configuration toggle to enable and disable QuickJS by default is the js_engine setting in the couchdb section. To help evaluate design doc compatibility, without the penalty of resetting all the views on a cluster, there is a scanner plugin which will traverse databases and design docs, compile and then execute some of the view functions using both engines, and report incompatibilities. Added in version 3.4. QuickJS Options [quickjs] memory_limit_bytes Set the QuickJS memory limit in bytes. The default is undefined, and the built-in C default of 64MB is used. [quickjs] memory_limit_bytes = 67108864 [couch_quickjs_scanner_plugin] Enable the QuickJS scanner plugin in the couch_scanner_plugins section. max_ddocs Limit the number of design docs processed per database. [couch_quickjs_scanner_plugin] max_ddocs = 100 max_shards Limit the number of shards processed per database. [couch_quickjs_scanner_plugin] max_shards = 4 max_docs Limit the number of documents processed per database. This is the maximum number of documents sent to the design doc functions. [couch_quickjs_scanner_plugin] max_docs = 1000 max_step Limit the maximum step size when processing docs. Given the total number of documents in a shard N and max_docs M, the step is S = N / M; only every S-th document will be sampled and processed. [couch_quickjs_scanner_plugin] max_step = 1000 max_batch_items Maximum document batch size to gather before feeding the batch through each design doc on both the QuickJS and Spidermonkey engines and comparing the results. [couch_quickjs_scanner_plugin] max_batch_items = 100 max_batch_size Maximum memory usage for a document batch. [couch_quickjs_scanner_plugin] max_batch_size = 16777216 after A common scanner setting to configure when to execute the plugin after it is enabled. By default it is restart, so the plugin would start running after a node restart: [couch_quickjs_scanner_plugin] after = restart repeat A common scanner setting to configure how often to execute the plugin after it is enabled. By default it is restart, so the plugin would start running after a node restart: [couch_quickjs_scanner_plugin] repeat = restart
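As a sketch only (endpoint form and credentials follow the earlier examples; whether and when the default engine should be switched is a deployment decision), the compatibility scanner can be enabled and, once its reports look clean, the default engine changed through the configuration API:
    curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/couch_scanner_plugins/couch_quickjs_scanner_plugin -d '"true"'
    curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/couchdb/js_engine -d '"quickjs"'
Setting js_engine back to spidermonkey reverts the change.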
CouchDB HTTP Server HTTP Server Options [chttpd] NOTE: In CouchDB 2.x and 3.x, the chttpd section refers to the standard, clustered port. All use of CouchDB, aside from a few specific maintenance tasks as described in this documentation, should be performed over this port. bind_address Defines the IP address by which the clustered port is available: [chttpd] bind_address = 127.0.0.1 To let CouchDB listen on any available IP address, use 0.0.0.0: [chttpd] bind_address = 0.0.0.0 For IPv6 support, set ::1 to listen on the IPv6 loopback address: [chttpd] bind_address = ::1 or :: for any available IPv6 address: [chttpd] bind_address = :: port Defines the port number to listen on: [chttpd] port = 5984 To let CouchDB use any free port, set this option to 0: [chttpd] port = 0 prefer_minimal If a request has the header "Prefer": "return=minimal", CouchDB will only send the headers that are listed for the prefer_minimal configuration: [chttpd] prefer_minimal = Cache-Control, Content-Length, Content-Range, Content-Type, ETag, Server, Transfer-Encoding, Vary WARNING: Removing the Server header from the settings will mean that the CouchDB server header is replaced with the MochiWeb server header. authentication_handlers List of authentication handlers used by CouchDB. You may extend them via third-party plugins, or remove some of them if you do not want to let users use one of the provided methods: [chttpd] authentication_handlers = {chttpd_auth, cookie_authentication_handler}, {chttpd_auth, default_authentication_handler} • {chttpd_auth, cookie_authentication_handler}: used for Cookie auth; • {chttpd_auth, proxy_authentication_handler}: used for Proxy auth; • {chttpd_auth, jwt_authentication_handler}: used for JWT auth; • {chttpd_auth, default_authentication_handler}: used for Basic auth; • {couch_httpd_auth, null_authentication_handler}: disables auth, breaks CouchDB. buffer_response Changed in version 3.1.1. Set this to true to delay the start of a response until the end has been calculated. This increases memory usage, but simplifies client error handling as it eliminates the possibility that a response may be deliberately terminated midway through, due to a timeout. This config value may be changed at runtime, without impacting any in-flight responses. Even if this is set to false (the default), buffered responses can be enabled on a per-request basis for any delayed JSON response call by adding ?buffer_response=true to the request's parameters. allow_jsonp Changed in version 3.2: moved from [httpd] to [chttpd] section Setting this option to true enables JSONP support (it is false by default): [chttpd] allow_jsonp = false changes_timeout Changed in version 3.2: moved from [httpd] to [chttpd] section Specifies the default timeout value for the Changes Feed in milliseconds (60000 by default): [chttpd] changes_timeout = 60000 ; 60 seconds config_whitelist Changed in version 3.2: moved from [httpd] to [chttpd] section Sets the configuration modification whitelist. Only whitelisted values may be changed via the config API. To allow the admin to change this value over HTTP, remember to include {chttpd,config_whitelist} itself. Excluding it from the list would require editing this file to update the whitelist: [chttpd] config_whitelist = [{chttpd,config_whitelist}, {log,level}, {etc,etc}]
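For illustration (the whitelist value is an example only, and assumes admin credentials as in the earlier examples), a whitelist that permits runtime changes to itself and to the log level can be installed through the configuration API; parameters not on the list then have to be changed in the ini files:
    curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/chttpd/config_whitelist -d '"[{chttpd,config_whitelist}, {log,level}]"'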
enable_cors Added in version 1.3. Changed in version 3.2: moved from [httpd] to [chttpd] section Controls the CORS feature: [chttpd] enable_cors = false secure_rewrites Changed in version 3.2: moved from [httpd] to [chttpd] section This option allows isolating databases via subdomains: [chttpd] secure_rewrites = true x_forwarded_host Changed in version 3.2: moved from [httpd] to [chttpd] section The x_forwarded_host header (X-Forwarded-Host by default) is used to forward the original value of the Host header field when, for example, a reverse proxy rewrites the Host header field to some internal host name before forwarding the request to CouchDB: [chttpd] x_forwarded_host = X-Forwarded-Host This header takes priority over the Host header if it is present in the request. x_forwarded_proto Changed in version 3.2: moved from [httpd] to [chttpd] section The x_forwarded_proto header (X-Forwarded-Proto by default) is used for identifying the originating protocol of an HTTP request, since a reverse proxy may communicate with the CouchDB instance using HTTP even if the request to the reverse proxy is HTTPS: [chttpd] x_forwarded_proto = X-Forwarded-Proto x_forwarded_ssl Changed in version 3.2: moved from [httpd] to [chttpd] section The x_forwarded_ssl header (X-Forwarded-Ssl by default) tells CouchDB that it should use the https scheme instead of http. It is effectively a synonym for the X-Forwarded-Proto: https header, but is used by some reverse proxies: [chttpd] x_forwarded_ssl = X-Forwarded-Ssl enable_xframe_options Changed in version 3.2: moved from [httpd] to [chttpd] section Enables or disables the X-Frame-Options feature: [chttpd] enable_xframe_options = false max_http_request_size Changed in version 3.2: moved from [httpd] to [chttpd] section Limit the maximum size of the HTTP request body. This setting applies to all requests and does not discriminate between single- and multi-document operations. So setting it to 1MB would block a PUT of a document larger than 1MB, but it might also block a _bulk_docs update of 1000 1KB documents, or a multipart/related update of a small document followed by two 512KB attachments. This setting is intended as a protection against maliciously large HTTP requests rather than as a way of limiting maximum document sizes. [chttpd] max_http_request_size = 4294967296 ; 4 GB WARNING: Before version 2.1.0, couchdb/max_document_size was implemented effectively as max_http_request_size. That is, it checked HTTP request bodies instead of document sizes. After the upgrade, it is advisable to review the usage of these configuration settings. bulk_get_use_batches Added in version 3.3. Set to false to revert to the previous _bulk_get implementation that uses single doc fetches internally. Using batches should be faster; however, there may be bugs in the new implementation, so this option is exposed to allow reverting to the old behavior. [chttpd] bulk_get_use_batches = true admin_only_all_dbs Added in version 2.2: implemented for _all_dbs defaulting to false Changed in version 3.0: default switched to true, applies to _all_dbs Changed in version 3.3: applies for _all_dbs and _dbs_info When set to true, admin access is required to access _all_dbs and _dbs_info. [chttpd] admin_only_all_dbs = true server_header_versions Added in version 3.4. Set to false to remove the CouchDB and Erlang/OTP versions from the Server response header. [chttpd] server_header_versions = true disconnect_check_msec Added in version 3.4.
How often, in milliseconds, to check for client discon- nects while processing streaming requests such as _all_docs, _find, _changes and views. [chttpd] disconnect_check_msec = 30000 disconnect_check_jitter_msec Added in version 3.4. How much random jitter to apply to the discon- nect_check_msec period. This is to avoid stampede in case of a large number of concurrent clients. [chttpd] disconnect_check_jitter_msec = 15000 [httpd] Changed in version 3.2: These options were moved to [chttpd] section: allow_jsonp, changes_timeout, config_whitelist, en- able_cors, secure_rewrites, x_forwarded_host, x_forwarded_proto, x_forwarded_ssl, enable_xframe_options, max_http_request_size. server_options Server options for the MochiWeb component of CouchDB can be added to the configuration files: [httpd] server_options = [{backlog, 128}, {acceptor_pool_size, 16}] The options supported are a subset of full options sup- ported by the TCP/IP stack. A list of the supported op- tions are provided in the Erlang inet documentation. socket_options The socket options for the listening socket in CouchDB, as set at the beginning of ever request, can be specified as a list of tuples. For example: [httpd] socket_options = [{sndbuf, 262144}] The options supported are a subset of full options sup- ported by the TCP/IP stack. A list of the supported op- tions are provided in the Erlang inet documentation. HTTPS (TLS) Options [ssl] CouchDB supports TLS natively, without the use of a proxy server. HTTPS setup can be tricky, but the configuration in CouchDB was designed to be as easy as possible. All you need is two files; a certificate and a private key. If you have an official certifi- cate from a certificate authority, both should be in your pos- session already. If you just want to try this out and dont want to go through the hassle of obtaining an official certificate, you can create a self-signed certificate. Everything will work the same, but clients will get a warning about an insecure certificate. You will need the OpenSSL command line tool installed. It proba- bly already is. shell> mkdir /etc/couchdb/cert shell> cd /etc/couchdb/cert shell> openssl genrsa > privkey.pem shell> openssl req -new -x509 -key privkey.pem -out couchdb.pem -days 1095 shell> chmod 600 privkey.pem couchdb.pem shell> chown couchdb privkey.pem couchdb.pem Now, you need to edit CouchDBs configuration, by editing your local.ini file. Here is what you need to do. Under the [ssl] section, enable HTTPS and set up the newly gen- erated certificates: [ssl] enable = true cert_file = /etc/couchdb/cert/couchdb.pem key_file = /etc/couchdb/cert/privkey.pem For more information please read certificates HOWTO. Now start (or restart) CouchDB. You should be able to connect to it using HTTPS on port 6984: shell> curl https://127.0.0.1:6984/ curl: (60) SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed More details here: http://curl.haxx.se/docs/sslcerts.html curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option. If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL). 
If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option. Oh no! What happened?! Remember, clients will notify their users that your certificate is self signed. curl is the client in this case and it notifies you. Luckily you trust yourself (dont you?) and you can specify the -k option as the message reads: shell> curl -k https://127.0.0.1:6984/ {"couchdb":"Welcome","version":"1.5.0"} All done. For performance reasons, and for ease of setup, you may still wish to terminate HTTPS connections at your load balancer / re- verse proxy, then use unencrypted HTTP between it and your CouchDB cluster. This is a recommended approach. Additional detail may be available in the CouchDB wiki. cacert_file The path to a file containing PEM encoded CA certifi- cates. The CA certificates are used to build the server certificate chain, and for client authentication. Also the CAs are used in the list of acceptable client CAs passed to the client when a certificate is requested. May be omitted if there is no need to verify the client and if there are not any intermediate CAs for the server cer- tificate: [ssl] cacert_file = /etc/ssl/certs/ca-certificates.crt cert_file Path to a file containing the users certificate: [ssl] cert_file = /etc/couchdb/cert/couchdb.pem key_file Path to file containing users private PEM encoded key: [ssl] key_file = /etc/couchdb/cert/privkey.pem password String containing the users password. Only used if the private key file is password protected: [ssl] password = somepassword ssl_certificate_max_depth Maximum peer certificate depth (must be set even if cer- tificate validation is off): [ssl] ssl_certificate_max_depth = 1 verify_fun The verification fun (optional) if not specified, the de- fault verification fun will be used: [ssl] verify_fun = {Module, VerifyFun} verify_ssl_certificates Set to true to validate peer certificates: [ssl] verify_ssl_certificates = false fail_if_no_peer_cert Set to true to terminate the TLS handshake with a hand- shake_failure alert message if the client does not send a certificate. Only used if verify_ssl_certificates is true. If set to false it will only fail if the client sends an invalid certificate (an empty certificate is considered valid): [ssl] fail_if_no_peer_cert = false secure_renegotiate Set to true to reject renegotiation attempt that does not live up to RFC 5746: [ssl] secure_renegotiate = true ciphers Set to the cipher suites that should be supported which can be specified in erlang format {ecdhe_ecdsa,aes_128_cbc,sha256} or in OpenSSL format ECDHE-ECDSA-AES128-SHA256. [ssl] ciphers = ["ECDHE-ECDSA-AES128-SHA256", "ECDHE-ECDSA-AES128-SHA"] tls_versions Set to a list of permitted TLS protocol versions: [ssl] tls_versions = ['tlsv1.2'] signature_algs Set to a list of permitted TLS signature algorithms: [ssl] signature_algs = [{sha512,ecdsa}] ecc_curves Set to a list of permitted ECC curves: [ssl] ecc_curves = [x25519] Cross-Origin Resource Sharing [cors] Added in version 1.3: added CORS support, see JIRA COUCHDB-431 Changed in version 3.2: moved from [httpd] to [chttpd] section CORS, or Cross-Origin Resource Sharing, allows a resource such as a web page running JavaScript inside a browser, to make AJAX requests (XMLHttpRequests) to a different domain, without com- promising the security of either party. A typical use case is to have a static website hosted on a CDN make requests to another resource, such as a hosted CouchDB in- stance. 
This avoids needing an intermediary proxy, using JSONP or similar workarounds to retrieve and host content. While CouchDBs integrated HTTP server has support for document attachments makes this less of a constraint for pure CouchDB projects, there are many cases where separating the static con- tent from the database access is desirable, and CORS makes this very straightforward. By supporting CORS functionality, a CouchDB instance can accept direct connections to protected databases and instances, without the browser functionality being blocked due to same-origin con- straints. CORS is supported today on over 90% of recent browsers. CORS support is provided as experimental functionality in 1.3, and as such will need to be enabled specifically in CouchDBs configuration. While all origins are forbidden from making re- quests by default, support is available for simple requests, preflight requests and per-vhost configuration. This section requires chttpd/enable_cors option have true value: [chttpd] enable_cors = true credentials By default, neither authentication headers nor cookies are included in requests and responses. To do so requires both setting XmlHttpRequest.withCredentials = true on the request object in the browser and enabling credentials support in CouchDB. [cors] credentials = true CouchDB will respond to a credentials-enabled CORS re- quest with an additional header, Access-Control-Al- low-Credentials=true. origins List of origins separated by a comma, * means accept all. You cant set origins = * and credentials = true option at the same time: [cors] origins = * Access can be restricted by protocol, host and optionally by port. Origins must follow the scheme: - http://example.com:80: [cors] origins = http://localhost, https://localhost, http://couch.mydev.name:8080 Note that by default, no origins are accepted. You must define them explicitly. headers List of accepted headers separated by a comma: [cors] headers = X-Couch-Id, X-Couch-Rev methods List of accepted methods: [cors] methods = GET,POST max_age Sets the Access-Control-Max-Age header in seconds. Use it to avoid repeated OPTIONS requests. [cors] max_age = 3600 SEE ALSO: Original JIRA implementation ticket Standards and References: • IETF RFCs relating to methods: RFC 2618, RFC 2817, RFC 5789 • IETF RFC for Web Origins: RFC 6454 • W3C CORS standard Mozilla Developer Network Resources: • Same origin policy for URIs • HTTP Access Control • Server-side Access Control • JavaScript same origin policy Client-side CORS support and usage: • CORS browser support matrix • COS tutorial • XHR with CORS Per Virtual Host Configuration WARNING: Virtual Hosts are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. To set the options for a vhosts, you will need to create a section with the vhost name prefixed by cors:. Example case for the vhost exam- ple.com: [cors:example.com] credentials = false ; List of origins separated by a comma origins = * ; List of accepted headers separated by a comma headers = X-CouchDB-Header ; List of accepted methods methods = HEAD, GET A video from 2010 on vhost and rewrite configuration is available, but is not guaranteed to match current syntax or behaviour. Virtual Hosts WARNING: Virtual Hosts are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. [vhosts] CouchDB can map requests to different locations based on the Host header, even if they arrive on the same inbound IP address. 
This allows different virtual hosts on the same machine to map to different databases or design documents, etc. The most common use case is to map a virtual host to a Rewrite Handler, to pro- vide full control over the applications URIs. To add a virtual host, add a CNAME pointer to the DNS for your domain name. For development and testing, it is sufficient to add an entry in the hosts file, typically /etc/hosts` on Unix-like operating systems: # CouchDB vhost definitions, refer to local.ini for further details 127.0.0.1 couchdb.local Test that this is working: $ ping -n 2 couchdb.local PING couchdb.local (127.0.0.1) 56(84) bytes of data. 64 bytes from localhost (127.0.0.1): icmp_req=1 ttl=64 time=0.025 ms 64 bytes from localhost (127.0.0.1): icmp_req=2 ttl=64 time=0.051 ms Finally, add an entry to your configuration file in the [vhosts] section: [vhosts] couchdb.local:5984 = /example *.couchdb.local:5984 = /example If your CouchDB is listening on the default HTTP port (80), or is sitting behind a proxy, then you dont need to specify a port number in the vhost key. The first line will rewrite the request to display the content of the example database. This rule works only if the Host header is couchdb.local and wont work for CNAMEs. The second rule, on the other hand, matches all CNAMEs to example db, so that both www.couchdb.local and db.couchdb.local will work. Rewriting Hosts to a Path Like in the _rewrite handler you can match some variable and use them to create the target path. Some examples: [vhosts] *.couchdb.local = /* :dbname. = /:dbname :ddocname.:dbname.example.com = /:dbname/_design/:ddocname/_rewrite The first rule passes the wildcard as dbname. The second one does the same, but uses a variable name. And the third one allows you to use any URL with ddocname in any database with dbname. X-Frame-Options X-Frame-Options is a response header that controls whether a http re- sponse can be embedded in a <frame>, <iframe> or <object>. This is a security feature to help against clickjacking. [x_frame_options] ; Settings same-origin will return X-Frame-Op- tions: SAMEORIGIN. ; If same origin is set, it will ignore the hosts setting ; same_origin = true ; Settings hosts will ; return X-Frame-Options: ALLOW-FROM https://example.com/ ; List of hosts separated by a comma. * means accept all ; hosts = If xframe_options is enabled it will return X-Frame-Options: DENY by default. If same_origin is enabled it will return X-Frame-Options: SAMEORIGIN. A X-FRAME-OPTIONS: ALLOW-FROM url will be returned when same_origin is false, and the HOST header matches one of the urls in the hosts config. Otherwise a X-Frame-Options: DENY will be returned. Authentication and Authorization Server Administrators [admins] Changed in version 3.0.0: CouchDB requires an admin account to start. If an admin account has not been created, CouchDB will print an error message and terminate. CouchDB server administrators and passwords are not stored in the _users database, but in the last [admins] section that CouchDB finds when loading its ini files. See :config:intro for details on config file order and behaviour. 
This file (which could be something like /opt/couchdb/etc/local.ini or /opt/couchdb/etc/local.d/10-admins.ini when CouchDB is installed from packages) should be appropriately se- cured and readable only by system administrators: [admins] ;admin = mysecretpassword admin = -hashed-6d3c30241ba0aaa4e16c6ea99224f915687ed8cd,7f4a3e05e0cbc6f48a0035e3508eef90 architect = -pbkdf2-43ecbd256a70a3a2f7de40d2374b6c3002918834,921a12f74df0c1052b3e562a23cd227f,10000 Administrators can be added directly to the [admins] section, and when CouchDB is restarted, the passwords will be salted and encrypted. You may also use the HTTP interface to create administrator accounts; this way, you dont need to restart CouchDB, and theres no need to temporar- ily store or transmit passwords in plaintext. The HTTP /_node/{node-name}/_config/admins endpoint supports querying, deleting or creating new admin accounts: GET /_node/nonode@nohost/_config/admins HTTP/1.1 Accept: application/json Host: localhost:5984 HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 196 Content-Type: application/json Date: Fri, 30 Nov 2012 11:37:18 GMT Server: CouchDB (Erlang/OTP) { "admin": "-hashed-6d3c30241ba0aaa4e16c6ea99224f915687ed8cd,7f4a3e05e0cbc6f48a0035e3508eef90", "architect": "-pbkdf2-43ecbd256a70a3a2f7de40d2374b6c3002918834,921a12f74df0c1052b3e562a23cd227f,10000" } If you already have a salted, encrypted password string (for example, from an old ini file, or from a different CouchDB server), then you can store the raw encrypted string, without having CouchDB doubly encrypt it. PUT /_node/nonode@nohost/_config/admins/architect?raw=true HTTP/1.1 Accept: application/json Content-Type: application/json Content-Length: 89 Host: localhost:5984 "-pbkdf2-43ecbd256a70a3a2f7de40d2374b6c3002918834,921a12f74df0c1052b3e562a23cd227f,10000" HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 89 Content-Type: application/json Date: Fri, 30 Nov 2012 11:39:18 GMT Server: CouchDB (Erlang/OTP) "-pbkdf2-43ecbd256a70a3a2f7de40d2374b6c3002918834,921a12f74df0c1052b3e562a23cd227f,10000" Further details are available in security, including configuring the work factor for PBKDF2, and the algorithm itself at PBKDF2 (RFC-2898). Changed in version 1.4: PBKDF2 server-side hashed salted password sup- port added, now as a synchronous call for the _config/admins API. Authentication Configuration [chttpd] require_valid_user Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd] section When this option is set to true, no requests are allowed from anonymous users. Everyone must be authenticated. [chttpd] require_valid_user = false require_valid_user_except_for_up When this option is set to true, no requests are allowed from anonymous users, except for the /_up endpoint. Everyone else must be authenticated. [chttpd] require_valid_user_except_for_up = false [chttpd_auth] Changed in version 3.2: These options were moved to [chttpd_auth] section: authentication_redirect, timeout, auth_cache_size, allow_persistent_cookies, iterations, min_iter- ations, max_iterations, secret, users_db_public, x_auth_roles, x_auth_token, x_auth_username, cookie_domain, same_site. allow_persistent_cookies Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section When set to true, CouchDB will set the Max-Age and Ex- pires attributes on the cookie, which causes user agents (like browsers) to preserve the cookie over restarts. [chttpd_auth] allow_persistent_cookies = true cookie_domain Added in version 2.1.1. 
Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section Configures the domain attribute of the AuthSession cookie. By default the domain attribute is empty, result- ing in the cookie being set on CouchDBs domain. [chttpd_auth] cookie_domain = example.com same_site Added in version 3.0.0. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section When this option is set to a non-empty value, a SameSite attribute is added to the AuthSession cookie. Valid val- ues are none, lax or strict.: [chttpd_auth] same_site = strict auth_cache_size Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section Number of User Context Object to cache in memory, to re- duce disk lookups. [chttpd_auth] auth_cache_size = 50 authentication_redirect Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section Specifies the location for redirection on successful au- thentication if a text/html response is accepted by the client (via an Accept header). [chttpd_auth] authentication_redirect = /_utils/session.html hash_algorithms Added in version 3.3. NOTE: Until CouchDB version 3.3.1, Proxy Authentication used only the hash algorithm sha1 as validation of X-Auth-CouchDB-Token. Sets the HMAC hash algorithm used for cookie and proxy authentication. You can provide a comma-separated list of hash algorithms. New cookie sessions or session updates are calculated with the first hash algorithm. All values in the list can be used to decode the cookie session and the token X-Auth-CouchDB-Token for Proxy Authentication. [chttpd_auth] hash_algorithms = sha256, sha NOTE: You can select any hash algorithm the version of er- lang used in your CouchDB install supports. The common list of available hashes might be: sha, sha224, sha256, sha384, sha512 To retrieve a complete list of supported hash algo- rithms you can use our bin/remsh script and retrieve a full list of available hash algorithms with crypto:supports(hashs). or use the _node/{node-name}/_versions endpoint to retrieve the hashes. WARNING: We do not recommend using the following hash algo- rithms: md4, md5 iterations Added in version 1.3. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The number of iterations for password hashing by the PBKDF2 algorithm. A higher number provides better hash durability, but comes at a cost in performance for each request that requires authentication. When using hun- dreds of thousands of iterations, use session cookies, or the performance hit will be huge. (The internal hashing algorithm is SHA1, which affects the recommended number of iterations.) [chttpd_auth] iterations = 10000 min_iterations Added in version 1.6. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The minimum number of iterations allowed for passwords hashed by the PBKDF2 algorithm. Any user with fewer iter- ations is forbidden. [chttpd_auth] min_iterations = 100 max_iterations Added in version 1.6. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The maximum number of iterations allowed for passwords hashed by the PBKDF2 algorithm. Any user with greater it- erations is forbidden. [chttpd_auth] max_iterations = 100000 password_regexp Added in version 3.2. A list of Regular Expressions to check new/changed pass- words. When set, new user passwords must match all Reg- Exp in this list. A RegExp can be paired with a reason text: [{"RegExp", "reason text"}, ...]. 
If a RegExp doesnt match, its rea- son text will be appended to the default reason of Pass- word does not conform to requirements. [couch_httpd_auth] ; Password must be 10 chars long and have one or more uppercase and ; lowercase char and one or more numbers. password_regexp = [{".{10,}", "Min length is 10 chars."}, "[A-Z]+", "[a-z]+", "\\d+"] proxy_use_secret Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section When this option is set to true, the chttpd_auth/secret option is required for Proxy Authentication. [chttpd_auth] proxy_use_secret = false public_fields Added in version 1.4. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section A comma-separated list of field names in user documents (in couchdb/users_db_suffix) that can be read by any user. If unset or not specified, authenticated users can only retrieve their own document. [chttpd_auth] public_fields = first_name, last_name, contacts, url NOTE: Using the public_fields allowlist for user document properties requires setting the chttpd_auth/users_db_public option to true (the latter option has no other purpose): [chttpd_auth] users_db_public = true secret Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The secret token is used for Proxy Authentication and for Cookie Authentication. [chttpd_auth] secret = 92de07df7e7a3fe14808cef90a7cc0d91 timeout Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section Number of seconds since the last request before sessions will be expired. [chttpd_auth] timeout = 600 users_db_public Added in version 1.4. Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section Allow all users to view user documents. By default, only admins may browse all users documents, while users may browse only their own document. [chttpd_auth] users_db_public = false x_auth_roles Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The HTTP header name (X-Auth-CouchDB-Roles by default) that contains the list of a users roles, separated by a comma. Used for Proxy Authentication. [chttpd_auth] x_auth_roles = X-Auth-CouchDB-Roles x_auth_token Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The HTTP header name (X-Auth-CouchDB-Token by default) containing the token used to authenticate the authoriza- tion. This token is an HMAC-SHA1 created from the chttpd_auth/secret and chttpd_auth/x_auth_username. The secret key should be the same on the client and the CouchDB node. This token is optional if the value of the chttpd_auth/proxy_use_secret option is not true. Used for Proxy Authentication. [chttpd_auth] x_auth_token = X-Auth-CouchDB-Token x_auth_username Changed in version 3.2: moved from [couch_httpd_auth] to [chttpd_auth] section The HTTP header name (X-Auth-CouchDB-UserName by default) containing the username. Used for Proxy Authentication. [chttpd_auth] x_auth_username = X-Auth-CouchDB-UserName upgrade_hash_on_auth Added in version 3.4. Upgrade user auth docs during the next successful authen- tication using the current password hashing settings. [chttpd_auth] upgrade_hash_on_auth = false [jwt_auth] required_claims This parameter is a comma-separated list of additional mandatory JWT claims that must be present in any pre- sented JWT token. A 404 Not Found is sent if any are missing. [jwt_auth] required_claims = exp,iat roles_claim_name WARNING: roles_claim_name is deprecated in CouchDB 3.3, and will be removed later. 
Please migrate to roles_claim_path. If presented, it is used as the CouchDB users roles list as long as the JWT token is valid. The default value for roles_claim_name is _couchdb.roles. NOTE: Values for roles_claim_name can only be top-level at- tributes in the JWT token. If roles_claim_path is set, then roles_claim_name is ignored! Lets assume, we have the following configuration: [jwt_auth] roles_claim_name = my-couchdb.roles CouchDB will search for the attribute my-couchdb.roles in the JWT token. { "my-couchdb.roles": [ "role_1", "role_2" ] } roles_claim_path Added in version 3.3. This parameter was introduced to overcome disadvantages of roles_claim_name, because it is not possible with roles_claim_name to map nested role attributes in the JWT token. NOTE: If roles_claim_path is set, then roles_claim_name is ignored! Now it is possible the read a nested roles claim from JWT tokens into CouchDB. As always, there is some theory at the beginning to get things up and running. Dont get scared now, its really short and easy. Honestly! There are only two characters with a special meaning. These are • . for nesting json attributes and • \. to skip nesting Thats it. Really. Lets assume there is the following data-payload in the JWT token: { "resource_access": { "security.settings": { "account": { "roles": [ "manage-account", "view-profile" ] } } } } Now, lets define the config variable roles_claim_path for this example. It should look like this: roles_claim_path = resource_access.security\.settings.account.roles If an attribute has a . in the key like security.set- tings, you have to escape it in the config parameter with \.. If you use a . then it gets interpreted as a nested sub-key. Lets illustrate the behavior with a second exam- ple. There is the following config parameter for roles_claim_name (by the way it was the default value if you didnt configured it): roles_claim_name = _couchdb.roles NOTE: CouchDB doesnt set any default or implicit value for roles_claim_path. To migrate from roles_claim_name to roles_claim_path you need to change the parameter name and escape the . to prevent CouchDB to read this as a nested JSON key. roles_claim_path = _couchdb\.roles Lets assume your JWT token have the following data-pay- load for your couchdb roles claim: { "_couchdb.roles": [ "accounting-role", "view-role" ] } If you did everything right, the response from the _ses- sion endpoint should look something like this: GET /_session HTTP/1.1 Host: localhost:5984 Authorization: Bearer <JWT token> HTTP/1.1 200 OK Content-Type: application/json { "ok": true, "userCtx": { "name": "1234567890", "roles": [ "accounting-role", "view-role" ] }, "info": { "authentication_handlers": [ "jwt", "proxy", "cookie", "default" ], "authenticated": "jwt" } } Thats all, you are done with the migration from roles_claim_name to roles_claim_path Easy, isnt it? [chttpd_auth_lockout] mode When set to off, CouchDB will not track repeated authen- tication failures. When set to warn, CouchDB will log a warning when re- peated authentication failures occur for a specific user and client IP address. When set to enforce (the default), CouchDB will reject requests with a 403 status code if repeated authentica- tion failures occur for a specific user and client IP ad- dress. 
[chttpd_auth_lockout] mode = enforce threshold When threshold (default 5) number of failed authentica- tion requests happen within the same max_lifetime period, CouchDB will lock out further authentication attempts for the rest of the max_lifetime period if mode is set to en- force. [chttpd_auth_lockout] threshold = 5 max_objects The maximum number of username+IP pairs that CouchDB will track, to limit memory usage. Defaults to 10,000. Changes to this setting are only picked up at CouchDB start or restart time. [chttpd_auth_lockout] max_objects = 10000 max_lifetime The maximum duration of the lockout period, measured in milliseconds. Changes to this setting are only picked up at CouchDB start or restart time. [chttpd_auth_lockout] max_lifetime = 300000 Compaction Database Compaction Options [database_compaction] doc_buffer_size Specifies the copy buffers maximum size in bytes: [database_compaction] doc_buffer_size = 524288 checkpoint_after Triggers a checkpoint after the specified amount of bytes were successfully copied to the compacted database: [database_compaction] checkpoint_after = 5242880 View Compaction Options [view_compaction] keyvalue_buffer_size Specifies maximum copy buffer size in bytes used during compaction: [view_compaction] keyvalue_buffer_size = 2097152 Compaction Daemon CouchDB ships with an automated, event-driven daemon internally known as smoosh that continuously re-prioritizes the database and secondary index files on each node and automatically compacts the files that will recover the most free space according to the following parameters. [smoosh] db_channels A comma-delimited list of channels that are sent the names of database files when those files are updated. Each channel can choose whether to enqueue the database for compaction; once a channel has enqueued the database, no additional channel in the list will be given the op- portunity to do so. view_channels A comma-delimited list of channels that are sent the names of secondary index files when those files are up- dated. Each channel can choose whether to enqueue the in- dex for compaction; once a channel has enqueued the in- dex, no additional channel in the list will be given the opportunity to do so. staleness The number of minutes that the (expensive) priority cal- culation on an individual can be stale for before it is recalculated. Defaults to 5. cleanup_index_files If set to true, the compaction daemon will delete the files for indexes that are no longer associated with any design document. Defaults to false and probably shouldnt be changed unless the node is running low on disk space, and only after considering the ramifications. wait_secs The time a channel waits before starting compactions to allow time to observe the system and make a smarter deci- sion about what to compact first. Hardly ever changed from the default of 30 (seconds). [smoosh.{channel-name}] The following settings control the resource allocation for a given com- paction channel. capacity The maximum number of items the channel can hold (lowest pri- ority item is removed to make room for new items). Defaults to 9999. concurrency The maximum number of jobs that can run concurrently in this channel. Defaults to 1. from to The time period during which this channel is allowed to exe- cute compactions. The value for each of these parameters must obey the format HH:MM with HH in [0..23] and MM in [0..59]. 
Each channel listed in the top-level daemon configuration continuously builds its priority queue regardless of the period defined here. The default is to allow the channel to execute compactions all the time.

strict_window
    If set to true, any compaction that is still running after the end of the allowed period will be suspended, and then resumed during the next window. It defaults to false, in which case any running compactions will be allowed to finish, but no new ones will be started.

There are also several settings that collectively control whether a channel will enqueue a file for compaction and how it prioritizes files within its queue:

max_priority
    Each item must have a priority lower than this to be enqueued. Defaults to infinity.

max_size
    The item must be no larger than this many bytes in length to be enqueued. Defaults to infinity.

min_priority
    The item must have a priority at least this high to be enqueued. Defaults to 5.0 for ratio and 16 MB for slack.

min_changes
    The minimum number of changes since the last compaction before the item will be enqueued. Defaults to 0. Currently only works for databases.

min_size
    The item must be at least this many bytes in length to be enqueued. Defaults to 1 MB (1048576 bytes).

priority
    The method used to calculate priority. Can be ratio (calculated as sizes.file/sizes.active) or slack (calculated as sizes.file - sizes.active). Defaults to ratio.

Background Indexing
Secondary indexes in CouchDB are not updated during document write operations. In order to avoid high latencies when reading indexes following a large block of writes, CouchDB automatically kicks off background jobs to keep secondary indexes warm. The daemon responsible for this process is internally known as ken and can be configured using the following settings.

[ken]
batch_channels
    This setting controls the number of background view builds that can be running in parallel at any given time. The default is 20.

incremental_channels
    It is possible for all the slots in the normal build system to be occupied by long-running index rebuilds (e.g. if new design documents are posted to several databases simultaneously). In order to keep already-built indexes from falling behind when this occurs, CouchDB will allow a number of short background indexing jobs to run even when all slots are full. This setting controls how many additional short jobs are allowed to run concurrently with the main jobs. The default is 80.

max_incremental_updates
    CouchDB estimates whether an indexing job is incremental or not by looking at the difference in sequence numbers between the current index and the main database. If the difference is larger than the threshold defined here, the background job will only be allowed to run in the main queue. Defaults to 1000.

[ken.ignore]
Entries in this configuration section can be used to tell the background indexer to skip over specific database shard files. The key must be the exact name of the shard with the .couch suffix omitted, for example:

    [ken.ignore]
    shards/00000000-1fffffff/mydb.1567719095 = true

NOTE: If you would like to skip all views from a ddoc, you may add autoupdate: false to the ddoc. All views of that ddoc will then be skipped. More at PUT /{db}/_design/{ddoc}.

IO Queue
CouchDB has an internal subsystem that can prioritize IO associated with certain classes of operations.
This subsystem can be configured to limit the resources devoted to background operations like internal replication and compaction according to the settings described below.

[ioq]
concurrency
    Specifies the maximum number of concurrent in-flight IO requests that the queueing system will submit:

    [ioq]
    concurrency = 10

ratio
    The fraction of the time that a background IO request will be selected over an interactive IO request when both queues are non-empty:

    [ioq]
    ratio = 0.01

[ioq.bypass]
    System administrators can choose to submit specific classes of IO directly to the underlying file descriptor or OS process, bypassing the queues altogether. Installing a bypass can yield higher throughput and lower latency, but relinquishes some control over prioritization. The following classes are recognized:

    os_process
        Messages on their way to an external process (e.g., couchjs).

    read
        Disk IO fulfilling interactive read requests.

    write
        Disk IO required to update a database.

    view_update
        Disk IO required to update views and other secondary indexes.

    shard_sync
        Disk IO issued by the background replication processes that fix any inconsistencies between shard copies.

    compaction
        Disk IO issued by compaction jobs.

    reshard
        Disk IO issued by resharding jobs.

    Without any configuration CouchDB will enqueue all classes of IO. The default.ini configuration file that ships with CouchDB activates a bypass for each of the interactive IO classes, and only background IO goes into the queueing system:

    [ioq.bypass]
    os_process = true
    read = true
    write = true
    view_update = true
    shard_sync = false
    compaction = false
    reshard = false

Recommendations
The default configuration protects against excessive IO from background operations like compaction disrupting the latency of interactive operations, while maximizing the overall IO throughput devoted to those interactive requests. There are certain situations where this configuration could be sub-optimal:

    • An administrator may want to devote a larger portion of the overall IO bandwidth to compaction in order to stay ahead of the incoming write load. In this case it may be necessary to disable the bypass for write (to help with database compaction) and/or view_update (to help with view index compaction) and then increase the ratio to give compaction a higher priority.

    • A server with a large number of views that do not need to be completely up-to-date may benefit from removing the bypass on view_update in order to optimize the latency for regular document read and write operations, and build the views during quieter periods.

Logging
Logging options

[log]
    CouchDB logging configuration.

    writer
        Current writers include:

        • stderr: Logs are sent to stderr.
        • file: Logs are sent to the file set in the [log] file parameter.
        • syslog: Logs are sent to the syslog daemon.
        • journald: Logs are sent to stderr without timestamps and with log levels compatible with sd-daemon.

        You can also specify a full module name here if you implement your own writer:

        [log]
        writer = stderr

    file
        Specifies the location of the file for logging output. Only used by the file writer:

        [log]
        file = /var/log/couchdb/couch.log

        This path should be readable and writable for the user that runs the CouchDB service (couchdb by default).

    write_buffer
        Specifies the size of the file log write buffer in bytes, to enable delayed log writes. Only used by the file writer:

        [log]
        write_buffer = 0

    write_delay
        Specifies the wait in milliseconds before committing logs to disk, to enable delayed log writes.
Only used by the file writer: [log] write_delay = 0 level Changed in version 1.3: Added warning level. Logging level defines how verbose and detailed logging will be: [log] level = info Available levels: • debug: Detailed debug logging. • info: Informative logging. Includes HTTP requests head- lines, startup of an external processes etc. • notice • warning or warn: Warning messages are alerts about edge situations that may lead to errors. For instance, com- paction daemon alerts about low or insufficient disk space at this level. • error or err: Error level includes only things that go wrong, like crash reports and HTTP error responses (5xx codes). • critical or crit • alert • emergency or emerg • none: Disables logging any messages. include_sasl Includes SASL information in logs: [log] include_sasl = true syslog_host NOTE: Setting syslog_host is mandatory for syslog to work! Specifies the syslog host to send logs to. Only used by the syslog writer: [log] syslog_host = localhost syslog_port Specifies the syslog port to connect to when sending logs. Only used by the syslog writer: [log] syslog_port = 514 syslog_appid Specifies application name to the syslog writer: [log] syslog_appid = couchdb syslog_facility Specifies the syslog facility to use with the syslog writer: [log] syslog_facility = local2 NOTE: CouchDBs syslog only knows how to use UDP logging. Please en- sure that your syslog server has UDP logging enabled. For rsyslog you can enable the UDP module imudp in /etc/rsys- log.conf: # provides UDP syslog reception module(load="imudp") input(type="imudp" port="514") Replicator Replicator Database Configuration [replicator] max_jobs Added in version 2.1. Number of actively running replications. This value rep- resents the threshold to trigger the automatic replica- tion scheduler. The system will check every interval milliseconds how many replication jobs are running, and if there are more than max_jobs active jobs, the sched- uler will pause-and-restart up to max_churn jobs in the scheduler queue. Making this value too high could cause performance issues, while making it too low could mean replications jobs might not have enough time to make progress before getting unscheduled again. This parame- ter can be adjusted at runtime and will take effect dur- ing next rescheduling cycle: [replicator] max_jobs = 500 interval Added in version 2.1. Scheduling interval in milliseconds. During each reschedule cycle the scheduler might start or stop up to max_churn number of jobs: [replicator] interval = 60000 max_churn Added in version 2.1. Maximum number of replication jobs to start and stop dur- ing rescheduling. This parameter, along with interval, defines the rate of job replacement. During startup, however, a much larger number of jobs could be started (up to max_jobs) in a short period of time: [replicator] max_churn = 20 max_history Maximum number of events recorded for each job. This pa- rameter defines an upper bound on the consecutive failure count for a job, and in turn the maximum backoff factor used when determining the delay before the job is restarted. The longer the length of the crash count, the longer the possible length of the delay: [replicator] max_history = 20 update_docs Added in version 2.1. When set to true replicator will update replication docu- ment with error and triggered states. This approximates pre-2.1 replicator behavior: [replicator] update_docs = false worker_batch_size With lower batch sizes checkpoints are done more fre- quently. 
Lower batch sizes also reduce the total amount of RAM used:

    [replicator]
    worker_batch_size = 500

worker_processes
    More worker processes can give higher network throughput but can also imply more disk and network IO:

    [replicator]
    worker_processes = 4

http_connections
    Maximum number of HTTP connections per replication:

    [replicator]
    http_connections = 20

connection_timeout
    HTTP connection timeout per replication. This is divided by three (3) when the replicator makes changes feed requests. Even for very fast/reliable networks it might need to be increased if a remote database is too busy:

    [replicator]
    connection_timeout = 30000

retries_per_request
    Changed in version 2.1.1.
    If a request fails, the replicator will retry it up to N times. The default value for N is 5 (before version 2.1.1 it was 10). The requests are retried with a doubling exponential backoff starting at 0.25 seconds, so by default requests are retried at 0.25, 0.5, 1, 2 and 4 second intervals. When the number of retries is exhausted, the whole replication job is stopped and will be retried again later:

    [replicator]
    retries_per_request = 5

socket_options
    Some socket options that might boost performance in some scenarios:

    • {nodelay, boolean()}
    • {sndbuf, integer()}
    • {recbuf, integer()}
    • {priority, integer()}

    See the inet Erlang module's man page for the full list of options:

    [replicator]
    socket_options = [{keepalive, true}, {nodelay, false}]

valid_socket_options
    Added in version 3.3.
    Valid socket options. Options not in this list are ignored. Most of these options are low level and setting some of them may lead to unintended or unpredictable behavior. See the inet Erlang docs for the full list of options:

    [replicator]
    valid_socket_options = buffer,keepalive,nodelay,priority,recbuf,sndbuf

ibrowse_options
    Added in version 3.4: A non-default ibrowse setting is needed to support IPv6-only replication sources or targets:

    • {prefer_ipv6, boolean()}

    See the ibrowse site for the full list of options:

    [replicator]
    ibrowse_options = [{prefer_ipv6, true}]

valid_ibrowse_options
    Added in version 3.4.
    Valid ibrowse options. Options not in this list are ignored:

    [replicator]
    valid_ibrowse_options = prefer_ipv6

valid_endpoint_protocols
    Added in version 3.3.
    Valid replication endpoint protocols. Replication jobs with endpoint URLs not in this list will fail to run:

    [replicator]
    valid_endpoint_protocols = http,https

valid_endpoint_protocols_log
    Added in version 3.4.
    When enabled, CouchDB will log any replication that uses the insecure http protocol:

    [replicator]
    valid_endpoint_protocols_log = true

verify_ssl_certificates_log
    Added in version 3.4.
    When enabled, and if ssl_trusted_certificates_file is configured but verify_ssl_certificates is not, CouchDB will check the validity of the TLS certificates of all sources and targets (without causing the replication to fail) and log any issues:

    [replicator]
    verify_ssl_certificates_log = true

valid_proxy_protocols
    Added in version 3.3.
    Valid replication proxy protocols. Replication jobs with proxy URLs not in this list will fail to run:

    [replicator]
    valid_proxy_protocols = http,https,socks5

checkpoint_interval
    Added in version 1.6.
    Defines the replication checkpoint interval in milliseconds. The replicator records a checkpoint at the specified interval:

    [replicator]
    checkpoint_interval = 5000

    Lower intervals may be useful for frequently changing data, while higher values will lower bandwidth and make fewer requests for infrequently updated databases.
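NOTE: The [replicator] options above are ordinary configuration parameters, so besides editing local.ini they can be inspected and changed per node through the configuration HTTP API. The following is only an illustrative sketch; the node name, admin credentials and chosen values are placeholders, not recommendations, and some options only affect replication jobs started after the change:

    # Show the current [replicator] section on this node
    curl -s http://adm:pass@localhost:5984/_node/_local/_config/replicator

    # Use a larger connection pool and checkpoint less often;
    # configuration values are JSON strings, hence the quoted quotes
    curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/replicator/http_connections -d '"40"'
    curl -X PUT http://adm:pass@localhost:5984/_node/_local/_config/replicator/checkpoint_interval -d '"30000"'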
use_checkpoints Added in version 1.6. If use_checkpoints is set to true, CouchDB will make checkpoints during replication and at the completion of replication. CouchDB can efficiently resume replication from any of these checkpoints: [replicator] use_checkpoints = true NOTE: Checkpoints are stored in local documents on both the source and target databases (which requires write ac- cess). WARNING: Disabling checkpoints is not recommended as CouchDB will scan the Source databases changes feed from the beginning. use_bulk_get Added in version 3.3. If use_bulk_get is true, CouchDB will attempt to use the _bulk_get HTTP API endpoint to fetch documents from the source. Replicator should automatically fall back to in- dividual doc GETs on on error; however, in some cases it may be useful to prevent spending time attempting to call _bulk_get altogether. cert_file Path to a file containing the users certificate: [replicator] cert_file = /full/path/to/server_cert.pem key_file Path to file containing users private PEM encoded key: [replicator] key_file = /full/path/to/server_key.pem password String containing the users password. Only used if the private key file is password protected: [replicator] password = somepassword verify_ssl_certificates Set to true to validate peer certificates: [replicator] verify_ssl_certificates = false ssl_trusted_certificates_file File containing a list of peer trusted certificates (in the PEM format): [replicator] ssl_trusted_certificates_file = /etc/ssl/certs/ca-certificates.crt ssl_certificate_max_depth Maximum peer certificate depth (must be set even if cer- tificate validation is off): [replicator] ssl_certificate_max_depth = 3 auth_plugins Added in version 2.2. List of replicator client authentication plugins. Plugins will be tried in order and the first to initialize suc- cessfully will be used. By default there are two plugins available: couch_replicator_auth_session implementing session (cookie) authentication, and couch_replica- tor_auth_noop implementing basic authentication. For backwards compatibility, the no-op plugin should be used at the end of the plugin list: [replicator] auth_plugins = couch_replicator_auth_session,couch_replicator_auth_noop usage_coeff Added in version 3.2.0. Usage coefficient decays historic fair share usage every scheduling cycle. The value must be between 0.0 and 1.0. Lower values will ensure historic usage decays quicker and higher values means it will be remembered longer: [replicator] usage_coeff = 0.5 priority_coeff Added in version 3.2.0. Priority coefficient decays all the job priorities such that they slowly drift towards the front of the run queue. This coefficient defines a maximum time window over which this algorithm would operate. For example, if this value is too small (0.1), after a few cycles quite a few jobs would end up at priority 0, and would render this algorithm useless. The default value of 0.98 is picked such that if a job ran for one scheduler cycle, then didnt get to run for 7 hours, it would still have priority > 0. 7 hours was picked as it was close enough to 8 hours which is the default maximum error backoff in- terval: [replicator] priority_coeff = 0.98 Fair Share Replicator Share Allocation [replicator.shares] {replicator_db} Added in version 3.2.0. Fair share configuration section. Higher share values re- sults in a higher chance that jobs from that db get to run. The default value is 100, minimum is 1 and maximum is 1000. The configuration may be set even if the data- base does not exist. 
In this context the option {replicator_db} acts as a placeholder for your replicator database name. The de- fault replicator database is _replicator. Additional replicator databases can be created. To be recognized as such by the system, their database names should end with /_replicator. See the Replicator Database section for more info. [replicator.shares] _replicator = 50 foo/_replicator = 25 bar/_replicator = 25 Query Servers Query Servers Definition Changed in version 2.3: Changed configuration method for Query Servers and Native Query Servers. CouchDB delegates computation of design documents functions to external query servers. The external query server is a special OS process which communicates with CouchDB over standard input/output using a very sim- ple line-based protocol with JSON messages. An external query server may be defined with environment variables fol- lowing this pattern: COUCHDB_QUERY_SERVER_LANGUAGE="PATH ARGS" Where: • LANGUAGE: is a programming language which code this query server may execute. For instance, there are PYTHON, RUBY, CLOJURE and other query servers in the wild. This value in lowercase is also used for ddoc field language to determine which query server processes the functions. Note, that you may set up multiple query servers for the same pro- gramming language, but you have to name them differently (like PYTHONDEV etc.). • PATH: is a system path to the executable binary program that runs the query server. • ARGS: optionally, you may specify additional command line arguments for the executable PATH. The default query server is written in JavaScript, running via Mozilla SpiderMonkey. It requires no special environment settings to enable, but is the equivalent of these two variables: COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main.js" COUCHDB_QUERY_SERVER_COFFEESCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main-coffee.js" By default, couchjs limits the max runtime allocation to 64MiB. If you run into out of memory issue in your ddoc functions, you can adjust the memory limitation (here, increasing to 512 MiB): COUCHDB_QUERY_SERVER_JAVASCRIPT="/usr/bin/couchjs -S 536870912 /usr/share/server/main.js" For more info about the available options, please consult couchjs -h. NOTE: CouchDB versions 3.0.0 to 3.2.2 included a performance regression for custom reduce functions. CouchDB 3.3.0 and later come with an experimental fix to this issue that is included in a separate .js file. To enable the fix, you need to define a custom COUCHDB_QUERY_SERVER_JAVASCRIPT environment variable as outlined above. The path to couchjs needs to remain the same as you find it on your couchdb file, and the path to main.js needs to be set to /path/to/couchdb/share/server/main-ast-bypass.js. With a default installation on Linux systems, this is going to be COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main-ast-bypass.js" SEE ALSO: The Mango Query Server is a declarative language that requires no programming, allowing for easier indexing and finding of data in documents. The Native Erlang Query Server allows running ddocs written in Er- lang natively, bypassing stdio communication and JSON serializa- tion/deserialization round trip overhead. Query Servers Configuration [query_server_config] commit_freq Specifies the delay in seconds before view index changes are committed to disk. 
The default value is 5: [query_server_config] commit_freq = 5 os_process_limit Hard limit on the number of OS processes usable by Query Servers. The default value is 100: [query_server_config] os_process_limit = 100 Setting os_process_limit too low can result in starvation of Query Servers, and manifest in os_process_timeout er- rors, while setting it too high can potentially use too many system resources. Production settings are typically 10-20 times the default value. os_process_soft_limit Soft limit on the number of OS processes usable by Query Servers. The default value is 100: [query_server_config] os_process_soft_limit = 100 Idle OS processes are closed until the total reaches the soft limit. For example, if the hard limit is 200 and the soft limit is 100, the total number of OS processes will never ex- ceed 200, and CouchDB will close all idle OS processes until it reaches 100, at which point it will leave the rest intact, even if some are idle. reduce_limit Controls Reduce overflow error that raises when output of reduce functions is too big: [query_server_config] reduce_limit = true Normally, you dont have to disable (by setting false value) this option since main propose of reduce functions is to reduce the input. Native Erlang Query Server [native_query_servers] WARNING: Due to security restrictions, the Erlang query server is dis- abled by default. Unlike the JavaScript query server, the Erlang one does not run in a sandbox mode. This means that Erlang code has full access to your OS, file system and network, which may lead to security issues. While Erlang functions are faster than JavaScript ones, you need to be careful about running them, especially if they were written by someone else. CouchDB has a native Erlang query server, allowing you to write your map/reduce functions in Erlang. First, youll need to edit your local.ini to include a [na- tive_query_servers] section: [native_query_servers] enable_erlang_query_server = true To see these changes you will also need to restart the server. Lets try an example of map/reduce functions which count the to- tal documents at each number of revisions (there are x many doc- uments at version 1, and y documents at 2 etc). Add a few docu- ments to the database, then enter the following functions as a view: %% Map Function fun({Doc}) -> <<K,_/binary>> = proplists:get_value(<<"_rev">>, Doc, null), V = proplists:get_value(<<"_id">>, Doc, null), Emit(<<K>>, V) end. %% Reduce Function fun(Keys, Values, ReReduce) -> length(Values) end. If all has gone well, after running the view you should see a list of the total number of documents at each revision number. Additional examples are on the users@couchdb.apache.org mailing list. Search CouchDBs search subsystem can be configured via the dreyfus configura- tion section. [dreyfus] name The name and location of the Clouseau Java service re- quired to enable Search functionality. Defaults to clouseau@127.0.0.1. retry_limit CouchDB will try to reconnect to Clouseau using a bounded exponential backoff with the following number of itera- tions. Defaults to 5. limit The number of results returned from a global search query if no limit is specified. Defaults to 25. limit_partitions The number of results returned from a search on a parti- tion of a database if no limit is specified. Defaults to 2000. max_limit The maximum number of results that can be returned from a global search query (or any search query on a database without user-defined partitions). 
Attempts to set ?limit=N higher than this value will be rejected. De- faults to 200. max_limit_partitions The maximum number of results that can be returned when searching a partition of a database. Attempts to set ?limit=N higher than this value will be rejected. If this config setting is not defined, CouchDB will use the value of max_limit instead. If neither is defined, the default is 2000. Nouveau CouchDBs experimental search subsystem can be configured via the nou- veau configuration section. [nouveau] enable Set to true to enable Nouveau. If disabled, all nouveau endpoints return 404 Not Found. Defaults to false. url The URL to a running nouveau server. Defaults to http://127.0.0.1:8080. max_sessions Nouveau will configure ibrowse max_sessions to this value for the configured url. Defaults to 100. max_pipeline_size Nouveau will configure ibrowse max_pipeline_size to this value for the configured url. Defaults to 1000. Mango Mango is the Query Engine that services the _find, endpoint. [mango] index_all_disabled Set to true to disable the index all fields text index. This can lead to out of memory issues when there are doc- uments with nested array fields. Defaults to false. [mango] index_all_disabled = false default_limit Sets the default number of results that will be returned in a _find response. Individual requests can override this by setting limit directly in the query parameters. Defaults to 25. [mango] default_limit = 25 index_scan_warning_threshold This sets the ratio between documents scanned and results matched that will generate a warning in the _find re- sponse. For example, if a query requires reading 100 doc- uments to return 10 rows, a warning will be generated if this value is 10. Defaults to 10. Setting the value to 0 disables the warn- ing. [mango] index_scan_warning_threshold = 10 Miscellaneous Parameters Configuration of Attachment Storage [attachments] compression_level Defines zlib compression level for the attachments from 1 (lowest, fastest) to 9 (highest, slowest). A value of 0 disables compression: [attachments] compression_level = 8 compressible_types Since compression is ineffective for some types of files, it is possible to let CouchDB compress only some types of attachments, specified by their MIME type: [attachments] compressible_types = text/*, application/javascript, application/json, application/xml Statistic Calculation [stats] interval Interval between gathering statistics in seconds: [stats] interval = 10 UUIDs Configuration [uuids] algorithm Changed in version 1.3: Added utc_id algorithm. CouchDB provides various algorithms to generate the UUID values that are used for document _ids by default: [uuids] algorithm = sequential Available algorithms: • random: 128 bits of random awesome. All awesome, all the time: { "uuids": [ "5fcbbf2cb171b1d5c3bc6df3d4affb32", "9115e0942372a87a977f1caf30b2ac29", "3840b51b0b81b46cab99384d5cd106e3", "b848dbdeb422164babf2705ac18173e1", "b7a8566af7e0fc02404bb676b47c3bf7", "a006879afdcae324d70e925c420c860d", "5f7716ee487cc4083545d4ca02cd45d4", "35fdd1c8346c22ccc43cc45cd632e6d6", "97bbdb4a1c7166682dc026e1ac97a64c", "eb242b506a6ae330bda6969bb2677079" ] } • sequential: Monotonically increasing ids with random increments. The first 26 hex characters are random, the last 6 increment in random amounts until an over- flow occurs. On overflow, the random prefix is regener- ated and the process starts over. 
{ "uuids": [ "4e17c12963f4bee0e6ec90da54804894", "4e17c12963f4bee0e6ec90da5480512f", "4e17c12963f4bee0e6ec90da54805c25", "4e17c12963f4bee0e6ec90da54806ba1", "4e17c12963f4bee0e6ec90da548072b3", "4e17c12963f4bee0e6ec90da54807609", "4e17c12963f4bee0e6ec90da54807718", "4e17c12963f4bee0e6ec90da54807754", "4e17c12963f4bee0e6ec90da54807e5d", "4e17c12963f4bee0e6ec90da54808d28" ] } • utc_random: The time since Jan 1, 1970 UTC, in mi- croseconds. The first 14 characters are the time in hex. The last 18 are random. { "uuids": [ "04dd32b3af699659b6db9486a9c58c62", "04dd32b3af69bb1c2ac7ebfee0a50d88", "04dd32b3af69d8591b99a8e86a76e0fb", "04dd32b3af69f4a18a76efd89867f4f4", "04dd32b3af6a1f7925001274bbfde952", "04dd32b3af6a3fe8ea9b120ed906a57f", "04dd32b3af6a5b5c518809d3d4b76654", "04dd32b3af6a78f6ab32f1e928593c73", "04dd32b3af6a99916c665d6bbf857475", "04dd32b3af6ab558dd3f2c0afacb7d66" ] } • utc_id: The time since Jan 1, 1970 UTC, in microsec- onds, plus the utc_id_suffix string. The first 14 char- acters are the time in hex. The uuids/utc_id_suffix string value is appended to these. { "uuids": [ "04dd32bd5eabcc@mycouch", "04dd32bd5eabee@mycouch", "04dd32bd5eac05@mycouch", "04dd32bd5eac28@mycouch", "04dd32bd5eac43@mycouch", "04dd32bd5eac58@mycouch", "04dd32bd5eac6e@mycouch", "04dd32bd5eac84@mycouch", "04dd32bd5eac98@mycouch", "04dd32bd5eacad@mycouch" ] } NOTE: Impact of UUID choices: the choice of UUID has a sig- nificant impact on the layout of the B-tree, prior to compaction. For example, using a sequential UUID algorithm while uploading a large batch of documents will avoid the need to rewrite many intermediate B-tree nodes. A ran- dom UUID algorithm may require rewriting intermediate nodes on a regular basis, resulting in significantly decreased throughput and wasted disk space space due to the append-only B-tree design. It is generally recommended to set your own UUIDs, or use the sequential algorithm unless you have a spe- cific need and take into account the likely need for compaction to re-balance the B-tree and reclaim wasted space. utc_id_suffix Added in version 1.3. The utc_id_suffix value will be appended to UUIDs gener- ated by the utc_id algorithm. Replicating instances should have unique utc_id_suffix values to ensure unique- ness of utc_id ids. [uuid] utc_id_suffix = my-awesome-suffix max_count Added in version 1.5.1. No more than this number of UUIDs will be sent in a sin- gle request. If more UUIDs are requested, an HTTP error response will be thrown. [uuid] max_count = 1000 Vendor information [vendor] Added in version 1.3. CouchDB distributors have the option of customizing CouchDBs welcome message. This is returned when requesting GET /. [vendor] name = The Apache Software Foundation version = 1.5.0 Content-Security-Policy [csp] You can configure Content-Security-Policy header for Fauxton, attachments and show/list functions separately. See MDN Con- tent-Security-Policy for more details on CSP. utils_enable Enable the sending of the header Content-Security-Policy for /_utils. Defaults to true: [csp] utils_enable = true utils_header_value Specifies the exact header value to send. Defaults to: [csp] utils_header_value = default-src 'self'; img-src 'self'; font-src *; script-src 'self' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; attachments_enable Enable sending the Content-Security-Policy header for at- tachments: [csp] attachments_enable = true attachments_header_value Specifies the exact header value to send. 
Defaults to: [csp] attachments_header_value = sandbox showlist_enable Enable sending the Content-Security-Policy header for show and list functions: [csp] showlist_enable = true showlist_header_value Specifies the exact header value to send. Defaults to: [csp] showlist_header_value = sandbox The pre 3.2.0 behaviour is still honoured, but we recommend up- dating to the new format. Experimental support of CSP headers for /_utils (Fauxton). enable Enable the sending of the Header Content-Security-Policy: [csp] enable = true header_value You can change the default value for the Header which is sent: [csp] header_value = default-src 'self'; img-src *; font-src *; Configuration of Database Purge [purge] max_document_id_number Added in version 3.0. Sets the maximum number of documents allowed in a single purge request: [purge] max_document_id_number = 100 max_revisions_number Added in version 3.0. Sets the maximum number of accumulated revisions allowed in a single purge request: [purge] max_revisions_number = 1000 index_lag_warn_seconds Added in version 3.0. Sets the allowed duration when index is not updated for local purge checkpoint document. Default is 24 hours: [purge] index_lag_warn_seconds = 86400 Configuration of Prometheus Endpoint [prometheus] additional_port Added in version 3.2. Sets whether or not to create a separate, non-authenti- cated port (default is false): [prometheus] additional_port = true bind_address Added in version 3.2. The IP address to bind: [prometheus] bind_address = 127.0.0.1 port Added in version 3.2. The port on which clients can query prometheus endpoint data without authentication: [prometheus] port = 17986 Resharding Resharding Configuration [resharding] max_jobs Maximum number of resharding jobs per cluster node. This includes completed, failed, and running jobs. If the job appears in the _reshard/jobs HTTP API results it will be counted towards the limit. When more than max_jobs jobs have been created, subsequent requests will start to fail with the max_jobs_exceeded error: [reshard] max_jobs = 48 max_history Each resharding job maintains a timestamped event log. This setting limits the maximum size of that log: [reshard] max_history = 20 max_retries How many times to retry shard splitting steps if they fail. For example, if indexing or topping off fails, it will be retried up to this many times before the whole resharding job fails: [reshard] max_retries = 1 retry_interval_sec How long to wait between subsequent retries: [reshard] retry_interval_sec = 10 delete_source Indicates if the source shard should be deleted after re- sharding has finished. By default, it is true as that would recover the space utilized by the shard. When de- bugging or when extra safety is required, this can be switched to false: [reshard] delete_source = true update_shard_map_timeout_sec How many seconds to wait for the shard map update opera- tion to complete. If there is a large number of shard db changes waiting to finish replicating, it might be bene- ficial to increase this timeout: [reshard] update_shard_map_timeout_sec = 60 source_close_timeout_sec How many seconds to wait for the source shard to close. Close in this context means that client requests which keep the database open have all finished: [reshard] source_close_timeout_sec = 600 require_node_param Require users to specify a node parameter when creating resharding jobs. 
This can be used as a safety check to avoid inadvertently starting too many resharding jobs by accident: [reshard] require_node_param = false require_range_param Require users to specify a range parameter when creating resharding jobs. This can be used as a safety check to avoid inadvertently starting too many resharding jobs by accident: [reshard] require_range_param = false CLUSTER MANAGEMENT As of CouchDB 2.0.0, CouchDB can be run in two different modes of oper- ation: • Standalone: In this mode, CouchDBs clustering is unavailable. CouchDBs HTTP-based replication with other CouchDB installa- tions remains available. • Cluster: A cluster of CouchDB installations internally repli- cate with each other via optimized network connections. This is intended to be used with servers that are in the same data center. This allows for database sharding to improve perfor- mance. This section details the theory behind CouchDB clusters, and provides specific operational instructions on node, database and shard manage- ment. Theory Before we move on, we need some theory. As you see in etc/default.ini there is a section called [cluster] [cluster] q=2 n=3 • q - The number of shards. • n - The number of copies there is of every document. Replicas. When creating a database you can send your own values with request and thereby override the defaults in default.ini. The number of copies of a document with the same revision that have to be read before CouchDB returns with a 200 is equal to a half of total copies of the document plus one. It is the same for the number of nodes that need to save a document before a write is returned with 201. If there are less nodes than that number, then 202 is returned. Both read and write numbers can be specified with a request as r and w parameters accordingly. We will focus on the shards and replicas for now. A shard is a part of a database. It can be replicated multiple times. The more copies of a shard, the more you can scale out. If you have 4 replicas, that means that all 4 copies of this specific shard will live on at most 4 nodes. No node can have more than one copy of each shard replica. The default for CouchDB since 3.0.0 is q=2 and n=3, meaning each database (and secondary index) is split into 2 shards, with 3 replicas per shard, for a total of 6 shard replica files. For a CouchDB cluster only hosting a single database with these default values, a maximum of 6 nodes can be used to scale horizontally. Replicas add failure resistance, as some nodes can be offline without everything crashing down. • n=1 All nodes must be up. • n=2 Any 1 node can be down. • n=3 Any 2 nodes can be down. • etc Computers go down and sysadmins pull out network cables in a furious rage from time to time, so using n<2 is asking for downtime. Having too high a value of n adds servers and complexity without any real benefit. The sweet spot is at n=3. Say that we have a database with 3 replicas and 4 shards. That would give us a maximum of 12 nodes: 4*3=12. We can lose any 2 nodes and still read and write all documents. What happens if we lose more nodes? It depends on how lucky we are. As long as there is at least one copy of every shard online, we can read and write all documents. So, if we are very lucky then we can lose 8 nodes at maximum. Node Management Adding a node Go to http://server1:5984/_membership to see the name of the node and all the nodes it is connected to and knows about. 
curl -X GET "http://xxx.xxx.xxx.xxx:5984/_membership" --user admin-user

    {
        "all_nodes":[
            "node1@xxx.xxx.xxx.xxx"],
        "cluster_nodes":[
            "node1@xxx.xxx.xxx.xxx"]
    }

• all_nodes are all the nodes that this node knows about.
• cluster_nodes are the nodes that are connected to this node.

To add a node simply do:

curl -X PUT "http://xxx.xxx.xxx.xxx:5984/_node/_local/_nodes/node2@yyy.yyy.yyy.yyy" -d '{}'

Now look at http://server1:5984/_membership again.

    {
        "all_nodes":[
            "node1@xxx.xxx.xxx.xxx",
            "node2@yyy.yyy.yyy.yyy"
        ],
        "cluster_nodes":[
            "node1@xxx.xxx.xxx.xxx",
            "node2@yyy.yyy.yyy.yyy"
        ]
    }

And you have a two-node cluster :)

http://yyy.yyy.yyy.yyy:5984/_membership will show the same thing, so you only have to add a node once.

Removing a node
Before you remove a node, make sure that you have moved all shards away from that node.

To remove node2 from server yyy.yyy.yyy.yyy, you first need to know the revision of the document that signifies that node's existence:

curl "http://xxx.xxx.xxx.xxx:5984/_node/_local/_nodes/node2@yyy.yyy.yyy.yyy"
{"_id":"node2@yyy.yyy.yyy.yyy","_rev":"1-967a00dff5e02add41820138abb3284d"}

With that _rev, you can now proceed to delete the node document:

curl -X DELETE "http://xxx.xxx.xxx.xxx:5984/_node/_local/_nodes/node2@yyy.yyy.yyy.yyy?rev=1-967a00dff5e02add41820138abb3284d"

Database Management
Creating a database
This will create a database with 3 replicas and 8 shards.

curl -X PUT "http://xxx.xxx.xxx.xxx:5984/database-name?n=3&q=8" --user admin-user

The database is in data/shards. Look around on all the nodes and you will find all the parts.

If you do not specify n and q, the defaults from the [cluster] section will be used: 3 replicas and 2 shards.

Deleting a database

curl -X DELETE "http://xxx.xxx.xxx.xxx:5984/database-name" --user admin-user

Placing a database on specific nodes
In BigCouch, the predecessor to CouchDB 2.0's clustering functionality, there was the concept of zones. CouchDB 2.0 carries this forward with cluster placement rules.

WARNING: Use of the placement argument will override the standard logic for shard replica cardinality (specified by [cluster] n).

First, each node must be labeled with a zone attribute. This defines which zone each node is in. You do this by editing the node's document in the system _nodes database, which is accessed node-local via the GET /_node/_local/_nodes/{node-name} endpoint. Add a key-value pair of the form:

    "zone": "metro-dc-a"

Do this for all of the nodes in your cluster.

In your config file (local.ini or default.ini) on each node, define a consistent cluster-wide setting like:

    [cluster]
    placement = metro-dc-a:2,metro-dc-b:1

In this example, CouchDB will ensure that two replicas for a shard will be hosted on nodes with the zone attribute set to metro-dc-a and one replica will be hosted on a node with the zone attribute set to metro-dc-b.

Note that you can also use this system to ensure certain nodes in the cluster do not host any replicas for newly created databases, by giving them a zone attribute that does not appear in the [cluster] placement string.

Shard Management
Introduction
This document discusses how sharding works in CouchDB along with how to safely add, move, remove, and create placement rules for shards and shard replicas.

A shard is a horizontal partition of data in a database. Partitioning data into shards and distributing copies of each shard (called shard replicas or just replicas) to different nodes in a cluster gives the data greater durability against node loss.
CouchDB clusters automati- cally shard databases and distribute the subsets of documents that com- pose each shard among nodes. Modifying cluster membership and sharding behavior must be done manually. Shards and Replicas How many shards and replicas each database has can be set at the global level, or on a per-database basis. The relevant parameters are q and n. q is the number of database shards to maintain. n is the number of copies of each document to distribute. The default value for n is 3, and for q is 2. With q=2, the database is split into 2 shards. With n=3, the cluster distributes three replicas of each shard. Altogether, thats 6 shard replicas for a single database. In a 3-node cluster with q=8, each node would receive 8 shards. In a 4-node cluster, each node would receive 6 shards. We recommend in the general case that the number of nodes in your cluster should be a mul- tiple of n, so that shards are distributed evenly. CouchDB nodes have a etc/default.ini file with a section named cluster which looks like this: [cluster] q=2 n=3 These settings specify the default sharding parameters for newly cre- ated databases. These can be overridden in the etc/local.ini file by copying the text above, and replacing the values with your new de- faults. If [couch_peruser] q is set, that value is used for per-user databases. (By default, it is set to 1, on the assumption that per-user dbs will be quite small and there will be many of them.) The values can also be set on a per-database basis by specifying the q and n query parameters when the database is created. For example: $ curl -X PUT "$COUCH_URL:5984/database-name?q=4&n=2" This creates a database that is split into 4 shards and 2 replicas, yielding 8 shard replicas distributed throughout the cluster. Quorum Depending on the size of the cluster, the number of shards per data- base, and the number of shard replicas, not every node may have access to every shard, but every node knows where all the replicas of each shard can be found through CouchDBs internal shard map. Each request that comes in to a CouchDB cluster is handled by any one random coordinating node. This coordinating node proxies the request to the other nodes that have the relevant data, which may or may not in- clude itself. The coordinating node sends a response to the client once a quorum of database nodes have responded; 2, by default. The default required size of a quorum is equal to r=w=((n div 2) + 1) where r refers to the size of a read quorum, w refers to the size of a write quorum, n refers to the number of replicas of each shard, and div is integer division, rounding down. In a default cluster where n is 3, ((n div 2) + 1) would be 2. NOTE: Each node in a cluster can be a coordinating node for any one re- quest. There are no special roles for nodes inside the cluster. The size of the required quorum can be configured at request time by setting the r parameter for document reads, and the w parameter for document writes. The _view, _find, and _search endpoints read only one copy no matter what quorum is configured, effectively making a quorum of 1 for these requests. 
For example, here is a request that directs the coordinating node to send a response once at least two nodes have responded: $ curl "$COUCH_URL:5984/{db}/{doc}?r=2" Here is a similar example for writing a document: $ curl -X PUT "$COUCH_URL:5984/{db}/{doc}?w=2" -d '{...}' Setting r or w to be equal to n (the number of replicas) means you will only receive a response once all nodes with relevant shards have re- sponded or timed out, and as such this approach does not guarantee - ACIDic consistency. Setting r or w to 1 means you will receive a re- sponse after only one relevant node has responded. Examining database shards There are a few API endpoints that help you understand how a database is sharded. Lets start by making a new database on a cluster, and putting a couple of documents into it: $ curl -X PUT $COUCH_URL:5984/mydb {"ok":true} $ curl -X PUT $COUCH_URL:5984/mydb/joan -d '{"loves":"cats"}' {"ok":true,"id":"joan","rev":"1-cc240d66a894a7ee7ad3160e69f9051f"} $ curl -X PUT $COUCH_URL:5984/mydb/robert -d '{"loves":"dogs"}' {"ok":true,"id":"robert","rev":"1-4032b428c7574a85bc04f1f271be446e"} First, the top level /{db} endpoint will tell you what the sharding pa- rameters are for your database: $ curl -s $COUCH_URL:5984/db | jq . { "db_name": "mydb", ... "cluster": { "q": 8, "n": 3, "w": 2, "r": 2 }, ... } So we know this database was created with 8 shards (q=8), and each shard has 3 replicas (n=3) for a total of 24 shard replicas across the nodes in the cluster. Now, lets see how those shard replicas are placed on the cluster with the /{db}/_shards endpoint: $ curl -s $COUCH_URL:5984/mydb/_shards | jq . { "shards": { "00000000-1fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node4@127.0.0.1" ], "20000000-3fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "40000000-5fffffff": [ "node2@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ], "60000000-7fffffff": [ "node1@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ], "80000000-9fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node4@127.0.0.1" ], "a0000000-bfffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "c0000000-dfffffff": [ "node2@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ], "e0000000-ffffffff": [ "node1@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ] } } Now we see that there are actually 4 nodes in this cluster, and CouchDB has spread those 24 shard replicas evenly across all 4 nodes. We can also see exactly which shard contains a given document with the /{db}/_shards/{docid} endpoint: $ curl -s $COUCH_URL:5984/mydb/_shards/joan | jq . { "range": "e0000000-ffffffff", "nodes": [ "node1@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ] } $ curl -s $COUCH_URL:5984/mydb/_shards/robert | jq . { "range": "60000000-7fffffff", "nodes": [ "node1@127.0.0.1", "node3@127.0.0.1", "node4@127.0.0.1" ] } CouchDB shows us the specific shard into which each of the two sample documents is mapped. Moving a shard When moving shards or performing other shard manipulations on the clus- ter, it is advisable to stop all resharding jobs on the cluster. See Stopping Resharding Jobs for more details. This section describes how to manually place and replace shards. These activities are critical steps when you determine your cluster is too big or too small, and want to resize it successfully, or you have no- ticed from server metrics that database/shard layout is non-optimal and you have some hot spots that need resolving. Consider a three-node cluster with q=8 and n=3. 
Each database has 24 shards, distributed across the three nodes. If you add a fourth node to the cluster, CouchDB will not redistribute existing database shards to it. This leads to unbalanced load, as the new node will only host shards for databases created after it joined the cluster. To balance the distribution of shards from existing databases, they must be moved manually.

Moving shards between nodes in a cluster involves the following steps:

0. Ensure the target node has joined the cluster.
1. Copy the shard(s) and any secondary index shard(s) onto the target node.
2. Set the target node to maintenance mode.
3. Update cluster metadata to reflect the new target shard(s).
4. Monitor internal replication to ensure up-to-date shard(s).
5. Clear the target node's maintenance mode.
6. Update cluster metadata again to remove the source shard(s).
7. Remove the shard file(s) and secondary index file(s) from the source node.

Copying shard files
NOTE: Technically, copying database and secondary index shards is optional. If you proceed to the next step without performing this data copy, CouchDB will use internal replication to populate the newly added shard replicas. However, copying files is faster than internal replication, especially on a busy cluster, which is why we recommend performing this manual data copy first.

Shard files live in the data/shards directory of your CouchDB install, in one subdirectory per shard range; those subdirectories contain the shard files themselves. For instance, for a q=8 database called abc, here are its database shard files:

    data/shards/00000000-1fffffff/abc.1529362187.couch
    data/shards/20000000-3fffffff/abc.1529362187.couch
    data/shards/40000000-5fffffff/abc.1529362187.couch
    data/shards/60000000-7fffffff/abc.1529362187.couch
    data/shards/80000000-9fffffff/abc.1529362187.couch
    data/shards/a0000000-bfffffff/abc.1529362187.couch
    data/shards/c0000000-dfffffff/abc.1529362187.couch
    data/shards/e0000000-ffffffff/abc.1529362187.couch

Secondary indexes (including JavaScript views, Erlang views and Mango indexes) are also sharded, and their shards should be moved to save the new node the effort of rebuilding the views. View shards live in data/.shards. For example:

    data/.shards
    data/.shards/e0000000-ffffffff/_replicator.1518451591_design
    data/.shards/e0000000-ffffffff/_replicator.1518451591_design/mrview
    data/.shards/e0000000-ffffffff/_replicator.1518451591_design/mrview/3e823c2a4383ac0c18d4e574135a5b08.view
    data/.shards/c0000000-dfffffff
    data/.shards/c0000000-dfffffff/_replicator.1518451591_design
    data/.shards/c0000000-dfffffff/_replicator.1518451591_design/mrview
    data/.shards/c0000000-dfffffff/_replicator.1518451591_design/mrview/3e823c2a4383ac0c18d4e574135a5b08.view
    ...

Since they are files, you can use cp, rsync, scp or another file-copying command to copy them from one node to another. For example:

    # on one machine
    $ mkdir -p data/.shards/{range}
    $ mkdir -p data/shards/{range}

    # on the other
    $ scp {couch-dir}/data/.shards/{range}/{database}.{datecode}* \
        {node}:{couch-dir}/data/.shards/{range}/
    $ scp {couch-dir}/data/shards/{range}/{database}.{datecode}.couch \
        {node}:{couch-dir}/data/shards/{range}/

NOTE: Remember to move view files before database files! If a view index is ahead of its database, the database will rebuild it from scratch.

Set the target node to true maintenance mode
Before telling CouchDB about these new shards on the node, the node must be put into maintenance mode.
Maintenance mode instructs CouchDB to return a 404 Not Found response on the /_up endpoint, and ensures it does not participate in normal interactive clustered requests for its shards. A properly configured load balancer that uses GET /_up to check the health of nodes will detect this 404 and remove the node from cir- culation, preventing requests from being sent to that node. For exam- ple, to configure HAProxy to use the /_up endpoint, use: http-check disable-on-404 option httpchk GET /_up If you do not set maintenance mode, or the load balancer ignores this maintenance mode status, after the next step is performed the cluster may return incorrect responses when consulting the node in question. You dont want this! In the next steps, we will ensure that this shard is up-to-date before allowing it to participate in end-user requests. To enable maintenance mode: $ curl -X PUT -H "Content-type: application/json" \ $COUCH_URL:5984/_node/{node-name}/_config/couchdb/maintenance_mode \ -d "\"true\"" Then, verify that the node is in maintenance mode by performing a GET /_up on that nodes individual endpoint: $ curl -v $COUCH_URL/_up < HTTP/1.1 404 Object Not Found {"status":"maintenance_mode"} Finally, check that your load balancer has removed the node from the pool of available backend nodes. Updating cluster metadata to reflect the new target shard(s) Now we need to tell CouchDB that the target node (which must already be joined to the cluster) should be hosting shard replicas for a given database. To update the cluster metadata, use the special /_dbs database, which is an internal CouchDB database that maps databases to shards and nodes. This database is automatically replicated between nodes. It is accessible only through the special /_node/_local/_dbs endpoint. First, retrieve the databases current metadata: $ curl http://adm:pass@localhost:5984/_node/_local/_dbs/{name} { "_id": "{name}", "_rev": "1-e13fb7e79af3b3107ed62925058bfa3a", "shard_suffix": [46, 49, 53, 51, 48, 50, 51, 50, 53, 50, 54], "changelog": [ ["add", "00000000-1fffffff", "node1@xxx.xxx.xxx.xxx"], ["add", "00000000-1fffffff", "node2@xxx.xxx.xxx.xxx"], ["add", "00000000-1fffffff", "node3@xxx.xxx.xxx.xxx"], ], "by_node": { "node1@xxx.xxx.xxx.xxx": [ "00000000-1fffffff", ], }, "by_range": { "00000000-1fffffff": [ "node1@xxx.xxx.xxx.xxx", "node2@xxx.xxx.xxx.xxx", "node3@xxx.xxx.xxx.xxx" ], } } Here is a brief anatomy of that document: • _id: The name of the database. • _rev: The current revision of the metadata. • shard_suffix: A timestamp of the databases creation, marked as sec- onds after the Unix epoch mapped to the codepoints for ASCII numer- als. • changelog: History of the databases shards. • by_node: List of shards on each node. • by_range: On which nodes each shard is. To reflect the shard move in the metadata, there are three steps: 1. Add appropriate changelog entries. 2. Update the by_node entries. 3. Update the by_range entries. WARNING: Be very careful! Mistakes during this process can irreparably cor- rupt the cluster! As of this writing, this process must be done manually. To add a shard to a node, add entries like this to the database meta- datas changelog attribute: ["add", "{range}", "{node-name}"] The {range} is the specific shard range for the shard. The {node-name} should match the name and address of the node as displayed in GET /_membership on the cluster. NOTE: When removing a shard from a node, specify remove instead of add. 
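Because mistakes in this metadata can corrupt the cluster, it can be worth sanity-checking the document before writing it back. The following is only an illustrative sketch, assuming jq is available; the database name, admin credentials and temporary file names are placeholders. It flattens by_node and by_range into node/range pairs and prints any disagreement between the two sections:

    # Fetch the shard map document (or run this against your edited copy)
    $ curl -s http://adm:pass@localhost:5984/_node/_local/_dbs/{name} > shardmap.json

    # Expand by_node into "node range" pairs
    $ jq -r '.by_node | to_entries[] | .key as $n | .value[] | "\($n) \(.)"' shardmap.json | sort > pairs_from_by_node.txt

    # Expand by_range into the same "node range" pairs
    $ jq -r '.by_range | to_entries[] | .key as $r | .value[] | "\(.) \($r)"' shardmap.json | sort > pairs_from_by_range.txt

    # No output means the two sections agree
    $ diff pairs_from_by_node.txt pairs_from_by_range.txt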
Once you have figured out the new changelog entries, you will need to update the by_node and by_range to reflect who is storing what shards. The data in the changelog entries and these attributes must match. If they do not, the database may become corrupted. Continuing our example, here is an updated version of the metadata above that adds shards to an additional node called node4: { "_id": "{name}", "_rev": "1-e13fb7e79af3b3107ed62925058bfa3a", "shard_suffix": [46, 49, 53, 51, 48, 50, 51, 50, 53, 50, 54], "changelog": [ ["add", "00000000-1fffffff", "node1@xxx.xxx.xxx.xxx"], ["add", "00000000-1fffffff", "node2@xxx.xxx.xxx.xxx"], ["add", "00000000-1fffffff", "node3@xxx.xxx.xxx.xxx"], ... ["add", "00000000-1fffffff", "node4@xxx.xxx.xxx.xxx"] ], "by_node": { "node1@xxx.xxx.xxx.xxx": [ "00000000-1fffffff", ... ], ... "node4@xxx.xxx.xxx.xxx": [ "00000000-1fffffff" ] }, "by_range": { "00000000-1fffffff": [ "node1@xxx.xxx.xxx.xxx", "node2@xxx.xxx.xxx.xxx", "node3@xxx.xxx.xxx.xxx", "node4@xxx.xxx.xxx.xxx" ], ... } } Now you can PUT this new metadata: $ curl -X PUT http://adm:pass@localhost:5984/_node/_local/_dbs/{name} -d '{...}' Forcing synchronization of the shard(s) Added in version 2.4.0. Whether you pre-copied shards to your new node or not, you can force CouchDB to synchronize all replicas of all shards in a database with the /{db}/_sync_shards endpoint: $ curl -X POST $COUCH_URL:5984/{db}/_sync_shards {"ok":true} This starts the synchronization process. Note that this will put addi- tional load onto your cluster, which may affect performance. It is also possible to force synchronization on a per-shard basis by writing to a document that is stored within that shard. NOTE: Admins may want to bump their [mem3] sync_concurrency value to a larger figure for the duration of the shards sync. Monitor internal replication to ensure up-to-date shard(s) After you complete the previous step, CouchDB will have started syn- chronizing the shards. You can observe this happening by monitoring the /_node/{node-name}/_system endpoint, which includes the internal_repli- cation_jobs metric. Once this metric has returned to the baseline from before you started the shard sync, or is 0, the shard replica is ready to serve data and we can bring the node out of maintenance mode. Clear the target nodes maintenance mode You can now let the node start servicing data requests by putting "false" to the maintenance mode configuration endpoint, just as in step 2. Verify that the node is not in maintenance mode by performing a GET /_up on that nodes individual endpoint. Finally, check that your load balancer has returned the node to the pool of available backend nodes. Update cluster metadata again to remove the source shard Now, remove the source shard from the shard map the same way that you added the new target shard to the shard map in step 2. Be sure to add the ["remove", {range}, {source-shard}] entry to the end of the changelog as well as modifying both the by_node and by_range sections of the database metadata document. Remove the shard and secondary index files from the source node Finally, you can remove the source shard replica by deleting its file from the command line on the source host, along with any view shard replicas: $ rm {couch-dir}/data/shards/{range}/{db}.{datecode}.couch $ rm -r {couch-dir}/data/.shards/{range}/{db}.{datecode}* Congratulations! You have moved a database shard replica. 
By adding and removing database shard replicas in this way, you can change the cluster's shard layout, also known as a shard map.

Specifying database placement

You can configure CouchDB to put shard replicas on certain nodes at database creation time using placement rules.

WARNING:
Use of the placement option will override the n option, both in the .ini file as well as when specified in a URL.

First, each node must be labeled with a zone attribute. This defines which zone each node is in. You do this by editing the node's document in the special /_nodes database, which is accessed through the special node-local API endpoint at /_node/_local/_nodes/{node-name}. Add a key-value pair of the form:

    "zone": "{zone-name}"

Do this for all of the nodes in your cluster. For example:

    $ curl -X PUT http://adm:pass@localhost:5984/_node/_local/_nodes/{node-name} \
        -d '{
            "_id": "{node-name}",
            "_rev": "{rev}",
            "zone": "{zone-name}"
            }'

In the local config file (local.ini) of each node, define a consistent cluster-wide setting like:

    [cluster]
    placement = {zone-name-1}:2,{zone-name-2}:1

In this example, CouchDB will ensure that two replicas for a shard will be hosted on nodes with the zone attribute set to {zone-name-1} and one replica will be hosted on a node with the zone attribute set to {zone-name-2}.

This approach is flexible, since you can also specify zones on a per-database basis by specifying the placement setting as a query parameter when the database is created, using the same syntax as the ini file:

    curl -X PUT $COUCH_URL:5984/{db}?zone={zone}

The placement argument may also be specified. Note that this will override the logic that determines the number of created replicas!

Note that you can also use this system to ensure certain nodes in the cluster do not host any replicas for newly created databases, by giving them a zone attribute that does not appear in the [cluster] placement string.

Splitting Shards

The /_reshard endpoint is an HTTP API for shard manipulation. Currently it only supports shard splitting. To perform shard merging, refer to the manual process outlined in the Merging Shards section.

The main way to interact with /_reshard is to create resharding jobs, monitor those jobs, wait until they complete, remove them, post new jobs, and so on. What follows are a few steps one might take to use this API to split shards.

At first, it's a good idea to call GET /_reshard to see a summary of resharding on the cluster.

    $ curl -s $COUCH_URL:5984/_reshard | jq .
    {
        "state": "running",
        "state_reason": null,
        "completed": 3,
        "failed": 0,
        "running": 0,
        "stopped": 0,
        "total": 3
    }

Two important things to pay attention to are the total number of jobs and the state.

The state field indicates the state of resharding on the cluster. Normally it would be running; however, another user could have disabled resharding temporarily. Then the state would be stopped and, hopefully, there would be a reason or a comment in the value of the state_reason field. See Stopping Resharding Jobs for more details.

The total number of jobs is important to keep an eye on because there is a maximum number of resharding jobs per node, and creating new jobs after the limit has been reached will result in an error. Before starting new jobs it's a good idea to remove already completed jobs. See the reshard configuration section for the default value of the max_jobs parameter and how to adjust it if needed.
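For example, you can inspect the current limit and, if necessary, raise it through the node configuration API. A sketch, assuming the setting lives in the reshard section under the max_jobs key as referenced above; 96 is an arbitrary value:

    $ curl -s $COUCH_URL:5984/_node/_local/_config/reshard/max_jobs
    $ curl -X PUT $COUCH_URL:5984/_node/_local/_config/reshard/max_jobs -d '"96"'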
For example, to remove all the completed jobs run: $ for jobid in $(curl -s $COUCH_URL:5984/_reshard/jobs | jq -r '.jobs[] | select (.job_state=="completed") | .id'); do \ curl -s -XDELETE $COUCH_URL:5984/_reshard/jobs/$jobid \ done Then its a good idea to see what the db shard map looks like. $ curl -s $COUCH_URL:5984/db1/_shards | jq '.' { "shards": { "00000000-7fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "80000000-ffffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ] } } In this example well split all the copies of the 00000000-7fffffff range. The API allows a combination of parameters such as: splitting all the ranges on all the nodes, all the ranges on just one node, or one particular range on one particular node. These are specified via the db, node and range job parameters. To split all the copies of 00000000-7fffffff we issue a request like this: $ curl -s -H "Content-type: application/json" -XPOST $COUCH_URL:5984/_reshard/jobs \ -d '{"type": "split", "db":"db1", "range":"00000000-7fffffff"}' | jq '.' [ { "ok": true, "id": "001-ef512cfb502a1c6079fe17e9dfd5d6a2befcc694a146de468b1ba5339ba1d134", "node": "node1@127.0.0.1", "shard": "shards/00000000-7fffffff/db1.1554242778" }, { "ok": true, "id": "001-cec63704a7b33c6da8263211db9a5c74a1cb585d1b1a24eb946483e2075739ca", "node": "node2@127.0.0.1", "shard": "shards/00000000-7fffffff/db1.1554242778" }, { "ok": true, "id": "001-fc72090c006d9b059d4acd99e3be9bb73e986d60ca3edede3cb74cc01ccd1456", "node": "node3@127.0.0.1", "shard": "shards/00000000-7fffffff/db1.1554242778" } ] The request returned three jobs, one job for each of the three copies. To check progress of these jobs use GET /_reshard/jobs or GET /_re- shard/jobs/{jobid}. Eventually, these jobs should complete and the shard map should look like this: $ curl -s $COUCH_URL:5984/db1/_shards | jq '.' { "shards": { "00000000-3fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "40000000-7fffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "80000000-ffffffff": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ] } } Stopping Resharding Jobs Resharding at the cluster level could be stopped and then restarted. This can be helpful to allow external tools which manipulate the shard map to avoid interfering with resharding jobs. To stop all resharding jobs on a cluster issue a PUT to /_reshard/state endpoint with the "state": "stopped" key and value. You can also specify an optional note or reason for stopping. For example: $ curl -s -H "Content-type: application/json" \ -XPUT $COUCH_URL:5984/_reshard/state \ -d '{"state": "stopped", "reason":"Moving some shards"}' {"ok": true} This state will then be reflected in the global summary: $ curl -s $COUCH_URL:5984/_reshard | jq . { "state": "stopped", "state_reason": "Moving some shards", "completed": 74, "failed": 0, "running": 0, "stopped": 0, "total": 74 } To restart, issue a PUT request like above with running as the state. That should resume all the shard splitting jobs since their last check- point. See the API reference for more details: /_reshard. Merging Shards The q value for a database can be set when the database is created or it can be increased later by splitting some of the shards Splitting Shards. In order to decrease q and merge some shards together, the database must be regenerated. Here are the steps: 1. If there are running shard splitting jobs on the cluster, stop them via the HTTP API Stopping Resharding Jobs. 2. 
Create a temporary database with the desired shard settings, by specifying the q value as a query parameter during the PUT opera- tion. 3. Stop clients accessing the database. 4. Replicate the primary database to the temporary one. Multiple repli- cations may be required if the primary database is under active use. 5. Delete the primary database. Make sure nobody is using it! 6. Recreate the primary database with the desired shard settings. 7. Clients can now access the database again. 8. Replicate the temporary back to the primary. 9. Delete the temporary database. Once all steps have completed, the database can be used again. The cluster will create and distribute its shards according to placement rules automatically. Downtime can be avoided in production if the client application(s) can be instructed to use the new database instead of the old one, and a cut- over is performed during a very brief outage window. Clustered Purge The primary purpose of clustered purge is to clean databases that have multiple deleted tombstones or single documents that contain large num- bers of conflicts. But it can also be used to purge any document (deleted or non-deleted) with any number of revisions. Clustered purge is designed to maintain eventual consistency and pre- vent unnecessary invalidation of secondary indexes. For this, every database keeps track of a certain number of historical purges requested in the database, as well as its current purge_seq. Internal replica- tions and secondary indexes process databases purges and periodically update their corresponding purge checkpoint documents to report purge_seq processed by them. To ensure eventual consistency, the data- base will remove stored historical purge requests only after they have been processed by internal replication jobs and secondary indexes. Internal Structures To enable internal replication of purge information between nodes and secondary indexes, two internal purge trees were added to a database file to track historical purges. purge_tree: UUID -> {PurgeSeq, DocId, Revs} purge_seq_tree: PurgeSeq -> {UUID, DocId, Revs} Each interactive request to _purge API, creates an ordered set of pairs on increasing purge_seq and purge_request, where purge_request is a tu- ple that contains docid and list of revisions. For each purge_request uuid is generated. A purge request is added to internal purge trees: a tuple {UUID -> {PurgeSeq, DocId, Revs}} is added to purge_tree, a tuple is {PurgeSeq -> {UUID, DocId, Revs}} added to purge_seq_tree. Compaction of Purges During the compaction of the database the oldest purge requests are to be removed to store only purged_infos_limit number of purges in the database. But in order to keep the database consistent with indexes and other replicas, we can only remove purge requests that have already been processed by indexes and internal replications jobs. Thus, occa- sionally purge trees may store more than purged_infos_limit purges. If the number of stored purges in the database exceeds purged_infos_limit by a certain threshold, a warning is produced in logs signaling a prob- lem of synchronization of databases purges with indexes and other replicas. Local Purge Checkpoint Documents Indexes and internal replications of the database with purges create and periodically update local checkpoint purge documents: _lo- cal/purge-{type}-{hash}. These documents report the last purge_seq processed by them and the timestamp of the last processing. 
These docu- ments are only visible in _local_docs when you add a include_sys- tem=true parameter, so e.g. /test-db/_local_docs?include_system=true. An example of a local checkpoint purge document: { "_id": "_local/purge-mrview-86cacdfbaf6968d4ebbc324dd3723fe7", "type": "mrview", "purge_seq": 10, "updated_on": 1540541874, "ddoc_id": "_design/foo", "signature": "5d10247925f826ae3e00966ec24b7bf6" } The below image shows possible local checkpoint documents that a data- base may have. [image: Local Purge Checkpoint Documents] [image] Local Purge Check- point Documents.UNINDENT Internal Replication Purge requests are replayed across all nodes in an eventually consis- tent manner. Internal replication of purges consists of two steps: 1. Pull replication. Internal replication first starts by pulling purges from target and applying them on source to make sure we dont reintroduce to target sources docs/revs that have been already purged on target. In this step, we use purge checkpoint documents stored on target to keep track of the last targets purge_seq processed by the source. We find purge requests occurred after this purge_seq, and re- play them on source. This step is done by updating the targets check- point purge documents with the latest process purge_seq and timestamp. 2. Push replication. Then internal replication proceeds as usual with an extra step inserted to push sources purge requests to target. In this step, we use local internal replication checkpoint documents, that are updated both on target and source. Under normal conditions, an interactive purge request is already sent to every node containing a database shards replica, and applied on every replica. Internal replication of purges between nodes is just an extra step to ensure consistency between replicas, where all purge re- quests on one node are replayed on another node. In order not to replay the same purge request on a replica, each interactive purge request is tagged with a unique uuid. Internal replication filters out purge re- quests with UUIDs that already exist in the replicas purge_tree, and applies only purge requests with UUIDs that dont exist in the purge_tree. This is the reason why we needed to have two internal purge trees: 1) purge_tree: {UUID -> {PurgeSeq, DocId, Revs}} allows to quickly find purge requests with UUIDs that already exist in the replica; 2) purge_seq_tree: {PurgeSeq -> {UUID, DocId, Revs}} allows to iterate from a given purge_seq to collect all purge requests happened after this purge_seq. Indexes Each purge request will bump up update_seq of the database, so that each secondary index is also updated in order to apply the purge re- quests to maintain consistency within the main database. 
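For reference, the interactive purge requests discussed above are issued with a POST to the /{db}/_purge endpoint, whose body maps each document ID to the list of revision identifiers to purge. A minimal sketch, where the database name, document ID and revision are placeholders:

    $ curl -X POST -H "Content-Type: application/json" \
        http://adm:pass@localhost:5984/{db}/_purge \
        -d '{"{docid}": ["{rev}"]}'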
Config Settings These settings can be updated in the default.ini or local.ini: +---------------------+---------------------+---------+ | Field | Description | Default | +---------------------+---------------------+---------+ | max_docu- | Allowed maximum | 100 | | ment_id_number | number of documents | | | | in one purge re- | | | | quest | | +---------------------+---------------------+---------+ | max_revisions_num- | Allowed maximum | 1000 | | ber | number of accumu- | | | | lated revisions in | | | | one purge request | | +---------------------+---------------------+---------+ | al- | Beside purged_in- | 100 | | lowed_purge_seq_lag | fos_limit, allowed | | | | additional buffer | | | | to store purge re- | | | | quests | | +---------------------+---------------------+---------+ | index_lag_warn_sec- | Allowed durations | 86400 | | onds | when index is not | | | | updated for local | | | | purge checkpoint | | | | document | | +---------------------+---------------------+---------+ During a database compaction, we check all checkpoint purge docs. A client (an index or internal replication job) is allowed to have the last reported purge_seq to be smaller than the current database shards purge_seq by the value of (purged_infos_limit + allowed_purge_seq_lag). If the clients purge_seq is even smaller, and the client has not check- pointed within index_lag_warn_seconds, it prevents compaction of purge trees and we have to issue the following log warning for this client: Purge checkpoint '_local/purge-mrview-9152d15c12011288629bcffba7693fd4 not updated in 86400 seconds in <<"shards/00000000-1fffffff/testdb12.1491979089">> If this type of log warning occurs, check the client to see why the processing of purge requests is stalled in it. There is a mapping relationship between a design document of indexes and local checkpoint docs. If a design document of indexes is updated or deleted, the corresponding local checkpoint document should be also automatically deleted. But in an unexpected case, when a design doc was updated/deleted, but its checkpoint document still exists in a database, the following warning will be issued: "Invalid purge doc '<<"_design/bar">>' on database <<"shards/00000000-1fffffff/testdb12.1491979089">> with purge_seq '50'" If this type of log warning occurs, remove the local purge doc from a database. TLS Erlang Distribution The main purpose is specifically to allow using TLS for Erlang distrib- ution between nodes, with the ability to connect to some nodes using TCP as well. TLS distribution will enhance data security during data migration between nodes. This section describes how to enable TLS distribution for additional verification and security. Reference: Using TLS for Erlang Distribution Generate Certificate To distribute using TLS, appropriate certificates need to be provided. In the following example (couch_dist.conf), the cert.pem certificate must be trusted by a root certificate known to the server, and the erlserver.pem file contains the certificate and its private key. [{server, [{cacertfile, "</absolute_path/to/ca-cert.pem>"}, {certfile, "</absolute_path/to/erlserver.pem>"}, {secure_renegotiate, true}, {verify, verify_peer}, {fail_if_no_peer_cert, true}]}, {client, [{cacertfile, "</absolute_path/to/ca-cert.pem>"}, {keyfile, "</absolute_path/to/key.pem>"}, {certfile, "</absolute_path/to/cert.pem>"}, {secure_renegotiate, true}, {verify, verify_peer}]}]. You can use {verify, verify_peer} to enable verification, but it re- quires appropriate certificates to verify. 
This is an example of generating certificates. $ git clone https://github.com/rnewson/elixir-certs $ cd elixir-certs $ ./certs self-signed \ --out-cert ca-cert.pem --out-key ca-key.pem \ --template root-ca \ --subject "/CN=CouchDB Root CA" $./certs create-cert \ --issuer-cert ca-cert.pem --issuer-key ca-key.pem \ --out-cert cert.pem --out-key key.pem \ --template server \ --subject "/CN=<hostname>" $ cat key.pem cert.pem >erlserver.pem NOTE: • The above examples are not an endorsement of specific expira- tion limits, key sizes, or algorithms. • If option verify_peer is set, the server_name_indication op- tion should also be specified. • The option {fail_if_no_peer_cert, true} should only be used on the server side in OTP 26, for previous versions it can be specified both on the server side and client side. • When generating certificates, make sure Common Name (FQDN) should be different in CA certificate and certificate. Also, FQDN in the certificate should be the same as the hostname. Config Settings To enable TLS distribution, make sure to set custom parameters in vm.args. # Don't forget to override the paths to point to your cert and conf file! -proto_dist couch -couch_dist no_tls \"clouseau@127.0.0.1\" -ssl_dist_optfile </absolute_path/to/couch_dist.conf> NOTE: • The default value of no_tls is false. If the user does not set any no_tls flag, all nodes will use TCP. • To ensure search works, make sure to set no_tls option for the clouseau node. By default, this will be "clouseau@127.0.0.1". The no_tls flag can have these values: 1. Use TLS only, set to false (default value), such as: -couch_dist no_tls false 2. Use TCP only, set to true, such as: -couch_dist no_tls true 3. Specify some nodes to use TCP, others to use TLS, such as: # Specify node1 and node2 to use TCP, others use TLS -couch_dist no_tls '"node1@127.0.0.1"' -couch_dist no_tls \"node2@127.0.0.1\" # Any nodes end with "@127.0.0.1" will use TCP, others use TLS -couch_dist no_tls \"*@127.0.0.1\" NOTE: Asterisk(*): matches a sequence of zero or more occurrences of the regular expression. Question mark(?): matches zero or one occurrences of the reg- ular expression. Connect to Remsh Start Erlang using a remote shell connected to Node. • If the node uses TCP: $ ./remsh • If the node uses TLS: $ ./remsh -t </absolute_path/to/couch_dist.conf> Troubleshooting CouchDB 3 with WeatherReport Overview WeatherReport is an OTP application and set of tools that diagnoses common problems which could affect a CouchDB version 3 node or cluster (version 4 or later is not supported). It is accessed via the weather- report command line escript. Here is a basic example of using weatherreport followed immediately by the commands output: $ weatherreport --etc /path/to/etc [warning] Cluster member node3@127.0.0.1 is not connected to this node. Please check whether it is down. Usage For most cases, you can just run the weatherreport command as shown above. However, sometimes you might want to know some extra detail, or run only specific checks. For that, there are command-line options. Ex- ecute weatherreport --help to learn more about these options: $ weatherreport --help Usage: weatherreport [-c <path>] [-d <level>] [-e] [-h] [-l] [check_name ...] 
-c, --etc Path to the CouchDB configuration directory -d, --level Minimum message severity level (default: notice) -l, --list Describe available diagnostic tasks -e, --expert Perform more detailed diagnostics -h, --help Display help/usage check_name A specific check to run To get an idea of what checks will be run, use the list option: $ weatherreport --list Available diagnostic checks: custodian Shard safety/liveness checks disk Data directory permissions and atime internal_replication Check the number of pending internal replication jobs ioq Check the total number of active IOQ requests mem3_sync Check there is a registered mem3_sync process membership Cluster membership validity memory_use Measure memory usage message_queues Check for processes with large mailboxes node_stats Check useful erlang statistics for diagnostics nodes_connected Cluster node liveness process_calls Check for large numbers of processes with the same current/initial call process_memory Check for processes with high memory usage safe_to_rebuild Check whether the node can safely be taken out of service search Check the local search node is responsive tcp_queues Measure the length of tcp queues in the kernel If you want all the gory details about what WeatherReport is doing, you can run the checks at a more verbose logging level with the --level op- tion: $ weatherreport --etc /path/to/etc --level debug [debug] Not connected to the local cluster node, trying to connect. alive:false connect_failed:undefined [debug] Starting distributed Erlang. [debug] Connected to local cluster node 'node1@127.0.0.1'. [debug] Local RPC: mem3:nodes([]) [5000] [debug] Local RPC: os:getpid([]) [5000] [debug] Running shell command: ps -o pmem,rss -p 73905 [debug] Shell command output: %MEM RSS 0.3 25116 [debug] Local RPC: erlang:nodes([]) [5000] [debug] Local RPC: mem3:nodes([]) [5000] [warning] Cluster member node3@127.0.0.1 is not connected to this node. Please check whether it is down. [info] Process is using 0.3% of available RAM, totalling 25116 KB of real memory. Most times youll want to use the defaults, but any syslog severity name will do (from most to least verbose): debug, info, notice, warning, er- ror, critical, alert, emergency. Finally, if you want to run just a single diagnostic or a list of spe- cific ones, you can pass their name(s): $ weatherreport --etc /path/to/etc nodes_connected [warning] Cluster member node3@127.0.0.1 is not connected to this node. Please check whether it is down. MAINTENANCE Compaction The compaction operation is a way to reduce disk space usage by remov- ing unused and old data from database or view index files. This opera- tion is very similar to the vacuum (SQLite ex.) operation available for other database management systems. During compaction, CouchDB re-creates the database or view in a new file with the .compact extension. As this requires roughly twice the disk storage, CouchDB first checks for available disk space before pro- ceeding. When all actual data is successfully transferred to the newly compacted file, CouchDB transparently swaps the compacted file into service, and removes the old database or view file. Since CouchDB 2.1.1, automated compaction is enabled by default, and is described in the next section. It is still possible to trigger manual compaction if desired or necessary. This is described in the subsequent sections. 
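Whether compaction is triggered automatically or manually, while it is running the temporary .compact file can be seen alongside the original database file in the data directory. For example (the path and file names here are purely illustrative):

    $ ls /opt/couchdb/data/shards/00000000-1fffffff/
    mydb.1528885993.couch
    mydb.1528885993.couch.compact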
Automatic Compaction CouchDBs automatic compaction daemon, internally known as smoosh, will trigger compaction jobs for both databases and views based on config- urable thresholds for the sparseness of a file and the total amount of space that can be recovered. Channels Smoosh works using the concept of channels. A channel is essentially a queue of pending compactions. There are separate sets of active chan- nels for databases and views. Each channel is assigned a configuration which defines whether a compaction ends up in the channels queue and how compactions are prioritized within that queue. Smoosh takes each channel and works through the compactions queued in each in priority order. Each channel is processed concurrently, so the priority levels only matter within a given channel. Each channel has an assigned number of active compactions, which defines how many com- pactions happen for that channel in parallel. For example, a cluster with a lot of database churn but few views might require more active compactions in the database channel(s). Its important to remember that a channel is local to a CouchDB node; that is, each node maintains and processes an independent set of com- pactions. Channels are defined as either ratio channels or slack chan- nels, depending on the type of algorithm used for prioritization: • Ratio: uses the ratio of sizes.file / sizes.active as its driving calculation. The result X must be greater than some configurable value Y for a compaction to be added to the queue. Compactions are then prioritised for higher values of X. • Slack: uses the difference of sizes.file - sizes.active as its dri- ving calculation. The result X must be greater than some configurable value Y for a compaction to be added to the queue. Compactions are prioritised for higher values of X. In both cases, Y is set using the min_priority configuration variable. CouchDB ships with four channels pre-configured: one channel of each type for databases, and another one for views. Channel Configuration Channels are defined using [smoosh.{channel-name}] configuration blocks, and activated by naming the channel in the db_channels or view_channels configuration setting in the [smoosh] block. The default configuration is [smoosh] db_channels = upgrade_dbs,ratio_dbs,slack_dbs view_channels = upgrade_views,ratio_views,slack_views cleanup_channels = index_cleanup [smoosh.ratio_dbs] priority = ratio min_priority = 2.0 [smoosh.ratio_views] priority = ratio min_priority = 2.0 [smoosh.slack_dbs] priority = slack min_priority = 536870912 [smoosh.slack_views] priority = slack min_priority = 536870912 The upgrade and cleanup_channels are special system channels. The up- grade ones check whether the disk_format_version for the file matches the current version, and enqueue the file for compaction (which has the side effect of upgrading the file format) if thats not the case. In ad- dition to that, the upgrade_views will enqueue views for compaction af- ter the collation (libicu) library is upgraded. The index_cleanup chan- nel is used for scheduling jobs used to remove stale index files and purge _local checkpoint document after design documents are updated. Here are several additional properties that can be configured for each channel; these are documented in the configuration API Scheduling Windows Each compaction channel can be configured to run only during certain hours of the day. The channel-specific from, to, and strict_window con- figuration settings control this behavior. 
For example [smoosh.overnight_channel] from = 20:00 to = 06:00 strict_window = true where overnight_channel is the name of the channel you want to config- ure. Note: CouchDB determines time via the UTC (GMT) timezone, so these set- tings must be expressed as UTC (GMT). The strict_window setting will cause the compaction daemon to suspend all active compactions in this channel when exiting the window, and re- sume them when re-entering. If strict_window is left at its default of false, the active compactions will be allowed to complete but no new compactions will be started. NOTE: When a channel is created, a 60s timer is started to check if the channel should be processing any compactions based on the time win- dow defined in your config. The channel is set to pending and after 60s it checks if it should be running at all and is set to paused if not. At the end of the check another 60s timer is started to schedule another check. Eventually, when in the time window, it starts processing com- pactions. But since it will continue running a check every 60s run- ning compaction processes will be suspended when exiting the time window and resume them when re-entering the window. This means that for the first 60s after exiting the time window, or when a channel is created and you are outside the time window, com- pactions are run for up to 60s.This is different to the behavior of the old compaction daemon which would cancel the compactions out- right. Migration Guide Previous versions of CouchDB shipped with a simpler compaction daemon. The configuration system for the new daemon is not backwards-compatible with the old one, so users with customized compaction configurations will need to port them to the new setup. The old daemons compaction rules configuration looked like [compaction_daemon] min_file_size = 131072 check_interval = 3600 snooze_period_ms = 3000 [compactions] mydb = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {parallel_view_compaction, true}] _default = [{db_fragmentation, "50%"}, {view_fragmentation, "55%"}, {from, "20:00"}, {to, "06:00"}, {strict_window, true}] Many of the elements of this configuration can be ported over to the new system. Examining each in detail: • min_file_size is now configured on a per-channel basis using the min_size config setting. • db_fragmentation is equivalent to configuring a priority = ratio channel with min_priority set to 1.0 / (1 - db_fragmentation/100) and then listing that channel in the [smoosh] db_channels config setting. • view_fragmention is likewise equivalent to configuring a priority = ratio channel with min_priority set to 1.0 / (1 - view_fragmenta- tion/100) and then listing that channel in the [smoosh] view_channels config setting. • from / to / strict_window: each of these settings can be applied on a per-channel basis in the new daemon. The one behavior change is that the new daemon will suspend compactions upon exiting the allowed win- dow instead of canceling them outright, and resume them when re-en- tering. • parallel_view_compaction: each compaction channel has a concurrency setting that controls how many compactions will execute in parallel in that channel. The total parallelism is the sum of the concurrency settings of all active channels. This is a departure from the previ- ous behavior, in which the daemon would only focus on one database and/or its views (depending on the value of this flag) at a time. The check_interval and snooze_period_ms settings are obsolete in the event-driven design of the new daemon. 
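Putting those rules together, the old _default rule above (50% database fragmentation, 55% view fragmentation, a 20:00 to 06:00 strict window) could be ported to something like the following sketch. The channel names are arbitrary, the min_priority values come from the 1.0 / (1 - fragmentation/100) conversion described above, and note that setting db_channels and view_channels replaces the default channel list shown earlier, so include any default channels you still want:

    [smoosh]
    db_channels = migrated_ratio_dbs
    view_channels = migrated_ratio_views

    [smoosh.migrated_ratio_dbs]
    priority = ratio
    ; 1.0 / (1 - 50/100)
    min_priority = 2.0
    from = 20:00
    to = 06:00
    strict_window = true

    [smoosh.migrated_ratio_views]
    priority = ratio
    ; 1.0 / (1 - 55/100), rounded
    min_priority = 2.22
    from = 20:00
    to = 06:00
    strict_window = true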
The new daemon does not support setting database-specific thresholds as in the mydb setting above. Rather, channels can be configured to focus on specific classes of files: large databases, small view indexes, and so on. Most cases of named database compaction rules can be expressed using properties of those databases and/or their associated views.

Manual Database Compaction

Database compaction compresses the database file by removing unused file sections created during updates. Old document revisions are replaced with a small amount of metadata, called a tombstone, which is used for conflict resolution during replication. The number of stored revisions (and their tombstones) can be configured by using the _revs_limit URL endpoint.

Compaction can be manually triggered per database and runs as a background task. To start it for a specific database, send an HTTP POST request to the /{db}/_compact sub-resource of the target database:

    curl -H "Content-Type: application/json" -X POST http://adm:pass@localhost:5984/my_db/_compact

On success, HTTP status 202 Accepted is returned immediately:

    HTTP/1.1 202 Accepted
    Cache-Control: must-revalidate
    Content-Length: 12
    Content-Type: text/plain; charset=utf-8
    Date: Wed, 19 Jun 2013 09:43:52 GMT
    Server: CouchDB (Erlang/OTP)

    {"ok":true}

Although the request body is not used, you must still set the Content-Type header to application/json for the request. If you do not, the request fails with an HTTP status 415 Unsupported Media Type response:

    HTTP/1.1 415 Unsupported Media Type
    Cache-Control: must-revalidate
    Content-Length: 78
    Content-Type: application/json
    Date: Wed, 19 Jun 2013 09:43:44 GMT
    Server: CouchDB (Erlang/OTP)

    {"error":"bad_content_type","reason":"Content-Type must be application/json"}

Once the compaction has successfully started and is running, you can get information about it via the database information resource:

    curl http://adm:pass@localhost:5984/my_db

    HTTP/1.1 200 OK
    Cache-Control: must-revalidate
    Content-Length: 246
    Content-Type: application/json
    Date: Wed, 19 Jun 2013 16:51:20 GMT
    Server: CouchDB (Erlang/OTP)

    {
        "committed_update_seq": 76215,
        "compact_running": true,
        "db_name": "my_db",
        "disk_format_version": 6,
        "doc_count": 5091,
        "doc_del_count": 0,
        "instance_start_time": "0",
        "purge_seq": 0,
        "sizes": {
            "active": 3787996,
            "disk": 17703025,
            "external": 4763321
        },
        "update_seq": 76215
    }

Note that the compact_running field is true, indicating that compaction is actually running. To track the compaction progress you may query the _active_tasks resource:

    curl http://adm:pass@localhost:5984/_active_tasks

    HTTP/1.1 200 OK
    Cache-Control: must-revalidate
    Content-Length: 175
    Content-Type: application/json
    Date: Wed, 19 Jun 2013 16:27:23 GMT
    Server: CouchDB (Erlang/OTP)

    [
        {
            "changes_done": 44461,
            "database": "my_db",
            "pid": "<0.218.0>",
            "progress": 58,
            "started_on": 1371659228,
            "total_changes": 76215,
            "type": "database_compaction",
            "updated_on": 1371659241
        }
    ]

Manual View Compaction

Views also need compaction. Unlike databases, views are compacted by groups per design document. To start their compaction, send the HTTP POST /{db}/_compact/{ddoc} request:

Design Document:

    {
        "_id": "_design/ddoc-name",
        "views": {
            "view-name": {
                "map": "function(doc) { emit(doc.key, doc.value) }"
            }
        }
    }

    curl -H "Content-Type: application/json" -X POST http://adm:pass@localhost:5984/dbname/_compact/ddoc-name

    {"ok":true}

This compacts the view index from the current version of the specified design document.
The HTTP response code is 202 Accepted (like compaction for databases) and a compaction background task will be cre- ated. Views cleanup View indexes on disk are named after their MD5 hash of the view defini- tion. When you change a view, old indexes remain on disk. To clean up all outdated view indexes (files named after the MD5 representation of views, that does not exist anymore) you can trigger a view cleanup: curl -H "Content-Type: application/json" -X POST http://adm:pass@localhost:5984/dbname/_view_cleanup {"ok":true} Performance With up to tens of thousands of documents you will generally find CouchDB to perform well no matter how you write your code. Once you start getting into the millions of documents you need to be a lot more careful. Disk I/O File Size The smaller your file size, the less I/O operations there will be, the more of the file can be cached by CouchDB and the operating system, the quicker it is to replicate, backup etc. Consequently you should care- fully examine the data you are storing. For example it would be silly to use keys that are hundreds of characters long, but your program would be hard to maintain if you only used single character keys. Care- fully consider data that is duplicated by putting it in views. Disk and File System Performance Using faster disks, striped RAID arrays and modern file systems can all speed up your CouchDB deployment. However, there is one option that can increase the responsiveness of your CouchDB server when disk perfor- mance is a bottleneck. From the Erlang documentation for the file mod- ule: On operating systems with thread support, it is possible to let file operations be performed in threads of their own, allowing other Er- lang processes to continue executing in parallel with the file oper- ations. See the command line flag +A in erl(1). Setting this argument to a number greater than zero can keep your CouchDB installation responsive even during periods of heavy disk uti- lization. The easiest way to set this option is through the ERL_FLAGS environment variable. For example, to give Erlang four threads with which to perform I/O operations add the following to (prefix)/etc/de- faults/couchdb (or equivalent): export ERL_FLAGS="+A 4" System Resource Limits One of the problems that administrators run into as their deployments become large are resource limits imposed by the system and by the ap- plication configuration. Raising these limits can allow your deployment to grow beyond what the default configuration will support. CouchDB Configuration Options max_dbs_open In your configuration (local.ini or similar) familiarize yourself with the couchdb/max_dbs_open: [couchdb] max_dbs_open = 100 This option places an upper bound on the number of databases that can be open at one time. CouchDB reference counts database accesses inter- nally and will close idle databases when it must. Sometimes it is nec- essary to keep more than the default open at once, such as in deploy- ments where many databases will be continuously replicating. Erlang Even if youve increased the maximum connections CouchDB will allow, the Erlang runtime system will not allow more than 65536 connections by de- fault. Adding the following directive to (prefix)/etc/vm.args (or equivalent) will increase this limit (in this case to 102400): +Q 102400 Note that on Windows, Erlang will not actually increase the file de- scriptor limit past 8192 (i.e. the system headerdefined value of FD_SETSIZE). On macOS, the limit may be as low as 1024. 
See this tip for a possible workaround and this thread for a deeper explanation. Maximum open file descriptors (ulimit) In general, modern UNIX-like systems can handle very large numbers of file handles per process (e.g. 100000) without problem. Dont be afraid to increase this limit on your system. The method of increasing these limits varies, depending on your init system and particular OS release. The default value for many OSes is 1024 or 4096. On a system with many databases or many views, CouchDB can very rapidly hit this limit. For systemd-based Linuxes (such as CentOS/RHEL 7, Ubuntu 16.04+, Debian 8 or newer), assuming you are launching CouchDB from systemd, you must override the upper limit via editing the override file. The best prac- tice for this is via the systemctl edit couchdb command. Add these lines to the file in the editor: [Service] LimitNOFILE=65536 or whatever value you like. To increase this value higher than 65536, you must also add the Erlang +Q parameter to your etc/vm.args file by adding the line: +Q 102400 The old ERL_MAX_PORTS environment variable is ignored by the version of Erlang supplied with CouchDB. If your system is set up to use the Pluggable Authentication Modules (- PAM), and you are not launching CouchDB from systemd, increasing this limit is straightforward. For example, creating a file named /etc/secu- rity/limits.d/100-couchdb.conf with the following contents will ensure that CouchDB can open up to 65536 file descriptors at once: #<domain> <type> <item> <value> couchdb hard nofile 65536 couchdb soft nofile 65536 If you are using our Debian/Ubuntu sysvinit script (/etc/init.d/couchdb), you also need to raise the limits for the root user: #<domain> <type> <item> <value> root hard nofile 65536 root soft nofile 65536 You may also have to edit the /etc/pam.d/common-session and /etc/pam.d/common-session-noninteractive files to add the line: session required pam_limits.so if it is not already present. If your system does not use PAM, a ulimit command is usually available for use in a custom script to launch CouchDB with increased resource limits. Typical syntax would be something like ulimit -n 65536. Network There is latency overhead making and receiving each request/response. In general you should do your requests in batches. Most APIs have some mechanism to do batches, usually by supplying lists of documents or keys in the request body. Be careful what size you pick for the batches. The larger batch requires more time your client has to spend encoding the items into JSON and more time is spent decoding that num- ber of responses. Do some benchmarking with your own configuration and typical data to find the sweet spot. It is likely to be between one and ten thousand documents. If you have a fast I/O system then you can also use concurrency - have multiple requests/responses at the same time. This mitigates the la- tency involved in assembling JSON, doing the networking and decoding JSON. As of CouchDB 1.1.0, users often report lower write performance of doc- uments compared to older releases. The main reason is that this release ships with the more recent version of the HTTP server library MochiWeb, which by default sets the TCP socket option SO_NODELAY to false. 
This means that small data sent to the TCP socket, like the reply to a document write request (or the read of a very small document), will not be sent immediately to the network: TCP will buffer it for a while, hoping that it will be asked to send more data through the same socket, and then send all the data at once for increased performance. This TCP buffering behaviour can be disabled via httpd/socket_options:

    [httpd]
    socket_options = [{nodelay, true}]

SEE ALSO:
Bulk load and store API.

Connection limit

MochiWeb handles CouchDB requests. The default maximum number of connections is 2048. To change this limit, use the server_options configuration variable; max indicates the maximum number of connections.

    [chttpd]
    server_options = [{backlog, 128}, {acceptor_pool_size, 16}, {max, 4096}]

CouchDB DELETE operation

When you DELETE a document, the database will create a new revision which contains the _id and _rev fields as well as the _deleted flag. This revision will remain even after a database compaction so that the deletion can be replicated. Deleted documents, like non-deleted documents, can affect view build times, PUT and DELETE request times, and the size of the database, since they increase the size of the B+Tree. You can see the number of deleted documents in the database information. If your use case creates lots of deleted documents (for example, if you are storing short-term data like log entries or message queues), you might want to periodically switch to a new database and delete the old one (once the entries in it have all expired).

Document IDs

The database file size is derived from your document and view sizes, but also from a multiple of your _id sizes. Not only is the _id present in the document, but it and parts of it are duplicated in the binary tree structure CouchDB uses to navigate the file to find the document in the first place. As a real-world example, one user who switched from 16-byte ids to 4-byte ids saw a database shrink from 21 GB to 4 GB with 10 million documents (the raw JSON text went from 2.5 GB to 2 GB).

Inserting with sequential (or at least sorted) ids is faster than with random ids. Consequently you should consider generating ids yourself, allocating them sequentially and using an encoding scheme that consumes fewer bytes. For example, 8 bytes take 16 hex digits to represent, but those same 8 bytes can be encoded in only 11 digits/chars in base64url (no padding).

Views

Views Generation

Views with the JavaScript query server are extremely slow to generate when there are a non-trivial number of documents to process. The generation process won't even saturate a single CPU, let alone your I/O. The cause is the latency involved between the CouchDB server and the separate couchjs query server, dramatically indicating how important it is to take latency out of your implementation.

You can let view access be stale, but it isn't practical to determine when that will give you a quick response and when the view will have to be updated first, which will take a long time. (A 10 million document database took about 10 minutes to load into CouchDB but about 4 hours to do view generation.)

In a cluster, stale requests are serviced by a fixed set of shards in order to present users with consistent results between requests. This comes with an availability trade-off: the fixed set of shards might not be the most responsive / available within the cluster. If you don't need this kind of consistency (e.g.
your indexes are relatively sta- tic), you can tell CouchDB to use any available replica by specifying stable=false&update=false instead of stale=ok, or stable=false&up- date=lazy instead of stale=update_after. View information isnt replicated - it is rebuilt on each database so you cant do the view generation on a separate sever. Built-In Reduce Functions If youre using a very simple view function that only performs a sum or count reduction, you can call native Erlang implementations of them by simply writing _sum or _count in place of your function declaration. This will speed up things dramatically, as it cuts down on IO between CouchDB and the JavaScript query server. For example, as mentioned on the mailing list, the time for outputting an (already indexed and cached) view with about 78,000 items went down from 60 seconds to 4 seconds. Before: { "_id": "_design/foo", "views": { "bar": { "map": "function (doc) { emit(doc.author, 1); }", "reduce": "function (keys, values, rereduce) { return sum(values); }" } } } After: { "_id": "_design/foo", "views": { "bar": { "map": "function (doc) { emit(doc.author, 1); }", "reduce": "_sum" } } } SEE ALSO: Built-in Reduce Functions Backing up CouchDB CouchDB has three different types of files it can create during run- time: • Database files (including secondary indexes) • Configuration files (*.ini) • Log files (if configured to log to disk) Below are strategies for ensuring consistent backups of all of these files. Database Backups The simplest and easiest approach for CouchDB backup is to use CouchDB replication to another CouchDB installation. You can choose between normal (one-shot) or continuous replications depending on your need. However, you can also copy the actual .couch files from the CouchDB data directory (by default, data/) at any time, without problem. CouchDBs append-only storage format for both databases and secondary indexes ensures that this will work without issue. To ensure reliability of backups, it is recommended that you back up secondary indexes (stored under data/.shards) prior to backing up the main database files (stored under data/shards as well as the sys- tem-level databases at the parent data/ directory). This is because CouchDB will automatically handle views/secondary indexes that are slightly out of date by updating them on the next read access, but views or secondary indexes that are newer than their associated data- bases will trigger a full rebuild of the index. This can be a very costly and time-consuming operation, and can impact your ability to re- cover quickly in a disaster situation. On supported operating systems/storage environments, you can also make use of storage snapshots. These have the advantage of being near-in- stantaneous when working with block storage systems such as ZFS or LVM or Amazon EBS. When using snapshots at the block-storage level, be sure to quiesce the file system with an OS-level utility such as Linuxs - fsfreeze if necessary. If unsure, consult your operating systems or cloud providers documentation for more detail. Configuration Backups CouchDBs configuration system stores data in .ini files under the con- figuration directory (by default, etc/). If changes are made to the configuration at runtime, the very last file in the configuration chain will be updated with the changes. Simple back up the entire etc/ directory to ensure a consistent config- uration after restoring from backup. 
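For example, a minimal file-level backup taken in the order recommended above (secondary indexes first, then database shards and system-level databases, then configuration) might look like the following; the source and destination paths are illustrative and depend on your installation:

    $ rsync -a /opt/couchdb/data/.shards/ /backup/couchdb/data/.shards/
    $ rsync -a /opt/couchdb/data/shards/ /backup/couchdb/data/shards/
    $ rsync -a /opt/couchdb/data/*.couch /backup/couchdb/data/
    $ tar czf /backup/couchdb/etc.tar.gz /opt/couchdb/etc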
If no changes to the configuration are made at runtime through the HTTP API, and all configuration files are managed by a configuration manage- ment system (such as Ansible or Chef), there is no need to backup the configuration directory. Log Backups If configured to log to a file, you may want to back up the log files written by CouchDB. Any backup solution for these files works. Under UNIX-like systems, if using log rotation software, a copy-then-truncate approach is necessary. This will truncate the origi- nal log file to zero size in place after creating a copy. CouchDB does not recognize any signal to be told to close its log file and create a new one. Because of this, and because of differences in how file han- dles function, there is no straightforward log rotation solution under Microsoft Windows other than periodic restarts of the CouchDB process. FAUXTON Fauxton Setup Fauxton is included with CouchDB 2.0, so make sure CouchDB is running, then go to: http://127.0.0.1:5984/_utils/ You can also upgrade to the latest version of Fauxton by using npm: $ npm install -g fauxton $ fauxton (Recent versions of node.js and npm are required.) Fauxton Visual Guide You can find the Visual Guide here: http://couchdb.apache.org/fauxton-visual-guide Development Server Recent versions of node.js and npm are required. Using the dev server is the easiest way to use Fauxton, specially when developing for it: $ git clone https://github.com/apache/couchdb-fauxton.git $ npm install && npm run dev Understanding Fauxton Code layout Each bit of functionality is its own separate module or addon. All core modules are stored under app/module and any addons that are optional are under app/addons. We use backbone.js and Backbone.layoutmanager quite heavily, so best to get an idea how they work. Its best at this point to read through a couple of the modules and addons to get an idea of how they work. Two good starting points are app/addon/config and app/modules/data- bases. Each module must have a base.js file, this is read and compile when Fauxton is deployed. The resource.js file is usually for your Backbone.Models and Back- bone.Collections, view.js for your Backbone.Views. The routes.js is used to register a url path for your view along with what layout, data, breadcrumbs and api point is required for the view. ToDo items Checkout JIRA or GitHub Issues for a list of items to do. EXPERIMENTAL FEATURES This is a list of experimental features in CouchDB. They are included in a release because the development team is requesting feedback from the larger developer community. As such, please play around with these features and send us feedback, thanks! Use at your own risk! Do not rely on these features for critical appli- cations. Content-Security-Policy (CSP) Header Support for /_utils (Fauxton) This will just work with Fauxton. You can enable it in your config: you can enable the feature in general and change the default header that is sent for everything in /_utils. [csp] enable = true Then restart CouchDB. Nouveau Server (new Apache Lucene integration) Enable nouveau in config and run the Java service. [nouveau] enable = true Have fun! API REFERENCE The components of the API URL path help determine the part of the CouchDB server that is being accessed. The result is the structure of the URL request both identifies and effectively describes the area of the database you are accessing. As with all URLs, the individual components are separated by a forward slash. 
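For example, a hypothetical request for a document named FishStew in a database named recipes identifies first the database and then the document within it:

    GET /recipes/FishStew HTTP/1.1
    Host: couchdb:5984
    Accept: application/json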
As a general rule, URL components and JSON fields starting with the _ (underscore) character represent a special component or entity within the server or returned object. For example, the URL fragment /_all_dbs gets a list of all of the databases in a CouchDB instance. This reference is structured according to the URL structure, as below. API Basics The CouchDB API is the primary method of interfacing to a CouchDB in- stance. Requests are made using HTTP and requests are used to request information from the database, store new data, and perform views and formatting of the information stored within the documents. Requests to the API can be categorised by the different areas of the CouchDB system that you are accessing, and the HTTP method used to send the request. Different methods imply different operations, for example retrieval of information from the database is typically handled by the GET operation, while updates are handled by either a POST or PUT re- quest. There are some differences between the information that must be supplied for the different methods. For a guide to the basic HTTP meth- ods and request structure, see Request Format and Responses. For nearly all operations, the submitted data, and the returned data structure, is defined within a JavaScript Object Notation (JSON) ob- ject. Basic information on the content and data types for JSON are pro- vided in JSON Basics. Errors when accessing the CouchDB API are reported using standard HTTP Status Codes. A guide to the generic codes returned by CouchDB are pro- vided in HTTP Status Codes. When accessing specific areas of the CouchDB API, specific information and examples on the HTTP methods and request, JSON structures, and er- ror codes are provided. Request Format and Responses CouchDB supports the following HTTP request methods: • GET Request the specified item. As with normal HTTP requests, the format of the URL defines what is returned. With CouchDB this can include static items, database documents, and configuration and statistical information. In most cases the information is returned in the form of a JSON document. • HEAD The HEAD method is used to get the HTTP header of a GET request with- out the body of the response. • POST Upload data. Within CouchDB POST is used to set values, including up- loading documents, setting document values, and starting certain ad- ministration commands. • PUT Used to put a specified resource. In CouchDB PUT is used to create new objects, including databases, documents, views and design docu- ments. • DELETE Deletes the specified resource, including documents, views, and de- sign documents. • COPY A special method that can be used to copy documents and objects. If you use an unsupported HTTP request type with an URL that does not support the specified type then a 405 - Method Not Allowed will be re- turned, listing the supported HTTP methods. For example: { "error":"method_not_allowed", "reason":"Only GET,HEAD allowed" } HTTP Headers Because CouchDB uses HTTP for all communication, you need to ensure that the correct HTTP headers are supplied (and processed on retrieval) so that you get the right format and encoding. Different environments and clients will be more or less strict on the effect of these HTTP headers (especially when not present). Where possible you should be as specific as possible. Request Headers • Accept Specifies the list of accepted data types to be returned by the server (i.e. that are accepted/understandable by the client). 
The format should be a list of one or more MIME types, separated by colons. For the majority of requests the definition should be for JSON data (application/json). For attachments you can either specify the MIME type explicitly, or use */* to specify that all file types are sup- ported. If the Accept header is not supplied, then the */* MIME type is assumed (i.e. client accepts all formats). The use of Accept in queries for CouchDB is not required, but is highly recommended as it helps to ensure that the data returned can be processed by the client. If you specify a data type using the Accept header, CouchDB will honor the specified type in the Content-type header field returned. For example, if you explicitly request application/json in the Accept of a request, the returned HTTP headers will use the value in the re- turned Content-type field. For example, when sending a request without an explicit Accept header, or when specifying */*: GET /recipes HTTP/1.1 Host: couchdb:5984 Accept: */* The returned headers are: HTTP/1.1 200 OK Server: CouchDB (Erlang/OTP) Date: Thu, 13 Jan 2011 13:39:34 GMT Content-Type: text/plain;charset=utf-8 Content-Length: 227 Cache-Control: must-revalidate NOTE: The returned content type is text/plain even though the informa- tion returned by the request is in JSON format. Explicitly specifying the Accept header: GET /recipes HTTP/1.1 Host: couchdb:5984 Accept: application/json The headers returned include the application/json content type: HTTP/1.1 200 OK Server: CouchDB (Erlang/OTP) Date: Thu, 13 Jan 2013 13:40:11 GMT Content-Type: application/json Content-Length: 227 Cache-Control: must-revalidate • Content-type Specifies the content type of the information being supplied within the request. The specification uses MIME type specifications. For the majority of requests this will be JSON (application/json). For some settings the MIME type will be plain text. When uploading attachments it should be the corresponding MIME type for the attachment or binary (application/octet-stream). The use of the Content-type on a request is highly recommended. • X-Couch-Request-ID (Optional) CouchDB will add a X-Couch-Request-ID header to every re- sponse in order to help users correlate any problem with the CouchDB log. If this header is present on the request (as long as the header value is no longer than 36 characters from the set 0-9a-zA-z-_) this value will be used internally as the request nonce, which appears in logs, and will also be returned as the X-Couch-Request-ID response header. Response Headers Response headers are returned by the server when sending back content and include a number of different header fields, many of which are standard HTTP response header and have no significance to CouchDB oper- ation. The list of response headers important to CouchDB are listed be- low. • Cache-control The cache control HTTP response header provides a suggestion for client caching mechanisms on how to treat the returned information. CouchDB typically returns the must-revalidate, which indicates that the information should be revalidated if possible. This is used to ensure that the dynamic nature of the content is correctly updated. • Content-length The length (in bytes) of the returned content. • Content-type Specifies the MIME type of the returned data. For most request, the returned MIME type is text/plain. All text is encoded in Unicode (UTF-8), and this is explicitly stated in the returned Content-type, as text/plain;charset=utf-8. 
• Etag
The Etag HTTP header field is used to show the revision for a document, or a view.
ETags have been assigned to a map/reduce group (the collection of views in a single design document). Any change to any of the indexes for those views would generate a new ETag for all view URLs in a single design doc, even if that specific view's results had not changed.
Each _view URL has its own ETag which only gets updated when changes are made to the database that affect that index. If the index for that specific view does not change, that view keeps the original ETag header (therefore sending back 304 - Not Modified more often).
• Transfer-Encoding
If the response uses an encoding, then it is specified in this header field.
Transfer-Encoding: chunked means that the response is sent in parts, a method known as chunked transfer encoding. This is used when CouchDB does not know beforehand the size of the data it will send (for example, the changes feed).
• X-CouchDB-Body-Time
Time spent receiving the request body in milliseconds. Available when body content is included in the request.
• X-Couch-Request-ID
Unique identifier for the request.

JSON Basics
The majority of requests and responses to CouchDB use the JavaScript Object Notation (JSON) for formatting the content and structure of the data and responses.
JSON is used because it is the simplest and easiest solution for working with data within a web browser, as JSON structures can be evaluated and used as JavaScript objects within the web browser environment. JSON also integrates with the server-side JavaScript used within CouchDB.
JSON supports the same basic types as JavaScript. These are:
• Array - a list of values enclosed in square brackets. For example:
["one", "two", "three"]
• Boolean - a true or false value. You can use these values directly. For example:
{ "value": true}
• Number - an integer or floating-point number.
• Object - a set of key/value pairs (i.e. an associative array, or hash). The key must be a string, but the value can be any of the supported JSON values. For example:
{
   "servings" : 4,
   "subtitle" : "Easy to make in advance, and then cook when ready",
   "cooktime" : 60,
   "title" : "Chicken Coriander"
}
In CouchDB, the JSON object is used to represent a variety of structures, including the main CouchDB document.
• String - this should be enclosed by double-quotes and supports Unicode characters and backslash escaping. For example:
"A String"
Parsing JSON into a JavaScript object is supported through the JSON.parse() function in JavaScript, or through various libraries that will perform the parsing of the content into a JavaScript object for you. Libraries for parsing and generating JSON are available in many languages, including Perl, Python, Ruby, Erlang and others.
WARNING: Care should be taken to ensure that your JSON structures are valid; invalid structures will cause CouchDB to return an HTTP status code of 500 (server error).

Number Handling
Developers and users new to how computers handle numbers are often surprised when a number stored in JSON does not come back as exactly the same number when compared character by character.
Any numbers defined in JSON that contain a decimal point or exponent will be passed through the Erlang VM's idea of the double data type. Any numbers that are used in views will pass through the view server's idea of a number (the common JavaScript case means even integers pass through a double due to JavaScript's definition of a number).
Consider this document that we write to CouchDB: { "_id":"30b3b38cdbd9e3a587de9b8122000cff", "number": 1.1 } Now lets read that document back from CouchDB: { "_id":"30b3b38cdbd9e3a587de9b8122000cff", "_rev":"1-f065cee7c3fd93aa50f6c97acde93030", "number":1.1000000000000000888 } What happens is CouchDB is changing the textual representation of the result of decoding what it was given into some numerical format. In most cases this is an IEEE 754 double precision floating point number which is exactly what almost all other languages use as well. What Erlang does a bit differently than other languages is that it does not attempt to pretty print the resulting output to use the shortest number of characters. For instance, this is why we have this relation- ship: ejson:encode(ejson:decode(<<"1.1">>)). <<"1.1000000000000000888">> What can be confusing here is that internally those two formats decode into the same IEEE-754 representation. And more importantly, it will decode into a fairly close representation when passed through all major parsers that we know about. While weve only been discussing cases where the textual representation changes, another important case is when an input value contains more precision than can actually represented in a double. (You could argue that this case is actually losing data if you dont accept that numbers are stored in doubles). Heres a log for a couple of the more common JSON libraries that happen to be on the authors machine: Ejson (CouchDBs current parser) at CouchDB sha 168a663b: $ ./utils/run -i Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:4] [hipe] [kernel-poll:true] Eshell V5.8.5 (abort with ^G) 1> ejson:encode(ejson:decode(<<"1.01234567890123456789012345678901234567890">>)). <<"1.0123456789012346135">> 2> F = ejson:encode(ejson:decode(<<"1.01234567890123456789012345678901234567890">>)). <<"1.0123456789012346135">> 3> ejson:encode(ejson:decode(F)). <<"1.0123456789012346135">> Node: $ node -v v0.6.15 $ node JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) '1.0123456789012346' var f = JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) undefined JSON.stringify(JSON.parse(f)) '1.0123456789012346' Python: $ python Python 2.7.2 (default, Jun 20 2012, 16:23:33) [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin Type "help", "copyright", "credits" or "license" for more information. import json json.dumps(json.loads("1.01234567890123456789012345678901234567890")) '1.0123456789012346' f = json.dumps(json.loads("1.01234567890123456789012345678901234567890")) json.dumps(json.loads(f)) '1.0123456789012346' Ruby: $ irb --version irb 0.9.5(05/04/13) require 'JSON' => true JSON.dump(JSON.load("[1.01234567890123456789012345678901234567890]")) => "[1.01234567890123]" f = JSON.dump(JSON.load("[1.01234567890123456789012345678901234567890]")) => "[1.01234567890123]" JSON.dump(JSON.load(f)) => "[1.01234567890123]" NOTE: A small aside on Ruby, it requires a top level object or array, so I just wrapped the value. Should be obvious it doesnt affect the re- sult of parsing the number though. 
Spidermonkey:

$ js -h 2>&1 | head -n 1
JavaScript-C 1.8.5 2011-03-31
$ js
js> JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890"))
"1.0123456789012346"
js> var f = JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890"))
js> JSON.stringify(JSON.parse(f))
"1.0123456789012346"

As you can see they all pretty much behave the same, except that Ruby actually does appear to lose some precision compared with the other libraries.
The astute observer will notice that ejson (the CouchDB JSON library) reported an extra three digits. While it's tempting to think that this is due to some internal difference, it's just a more specific case of the 1.1 input as described above.
The important point to realize here is that a double can only hold a finite number of values. What we're doing here is generating a string that when passed through the standard floating point parsing algorithms (i.e., strtod) will result in the same bit pattern in memory as we started with. Or, put slightly differently, the bytes in a JSON serialized number are chosen such that they refer to a single specific value that a double can represent.
The important point to understand is that we're mapping from one infinite set onto a finite set. An easy way to see this is by reflecting on this:

1.0 == 1.00 == 1.000 == 1.(infinite zeros)

Obviously a computer can't hold infinite bytes so we have to decimate our infinitely sized set to a finite set that can be represented concisely.
The game that other JSON libraries are playing is merely: "How few characters do I have to use to select this specific value for a double?"
And that game has lots and lots of subtle details that are difficult to duplicate in C without a significant amount of effort (it took Python over a year to get it sorted with their fancy build systems that automatically run on a number of different architectures).
Hopefully we've shown that CouchDB is not doing anything funky by changing input. It's behaving the same as any other common JSON library does, it's just not pretty printing its output.
On the other hand, if you actually are in a position where an IEEE-754 double is not a satisfactory data type for your numbers, then the answer, as has been stated, is to not pass your numbers through this representation. In JSON this is accomplished by encoding them as a string or by using integer types (although integer types can still bite you if you use a platform that has a different integer representation than normal, i.e., JavaScript).
Further information can be found easily, including the Floating Point Guide, and David Goldberg's Reference.
Also, if anyone is really interested in changing this behavior, we're all ears for contributions to jiffy (which is theoretically going to replace ejson when we get around to updating the build system). The places we've looked for inspiration are TCL and Python. If you know a decent implementation of this float printing algorithm give us a holler.

HTTP Status Codes
With the interface to CouchDB working through HTTP, error codes and statuses are reported using a combination of the HTTP status code number, and corresponding data in the body of the response.
A list of the error codes returned by CouchDB, and generic descriptions of the related errors, is provided below. The meaning of different status codes for specific request types is provided in the corresponding API call reference.
• 200 - OK
Request completed successfully.
• 201 - Created
Document created successfully.
• 202 - Accepted
Request has been accepted, but the corresponding operation may not have completed. This is used for background operations, such as database compaction.
• 304 - Not Modified
The additional content requested has not been modified. This is used with the ETag system to identify the version of information returned.
• 400 - Bad Request
Bad request structure. The error can indicate an error with the request URL, path or headers. Differences in the supplied MD5 hash and content also trigger this error, as this may indicate message corruption.
• 401 - Unauthorized
The item requested was not available using the supplied authorization, or authorization was not supplied.
• 403 - Forbidden
The requested item or operation is forbidden. This might be because:
• Your user name or roles do not match the security object of the database
• The request requires administrator privileges but you don't have them
• You've made too many requests with invalid credentials and have been temporarily locked out.
• 404 - Not Found
The requested content could not be found. The content will include further information, as a JSON object, if available. The structure will contain two keys, error and reason. For example:
{"error":"not_found","reason":"no_db_file"}
• 405 - Method Not Allowed
A request was made using an invalid HTTP request type for the URL requested. For example, you have requested a PUT when a POST is required. Errors of this type can also be triggered by invalid URL strings.
• 406 - Not Acceptable
The requested content type is not supported by the server.
• 409 - Conflict
Request resulted in an update conflict.
• 412 - Precondition Failed
The request headers from the client and the capabilities of the server do not match.
• 413 - Request Entity Too Large
A document exceeds the configured couchdb/max_document_size value or the entire request exceeds the chttpd/max_http_request_size value.
• 415 - Unsupported Media Type
The content type of the information being requested or submitted is not among the content types supported by the server.
• 416 - Requested Range Not Satisfiable
The range specified in the request header cannot be satisfied by the server.
• 417 - Expectation Failed
When sending documents in bulk, the bulk load operation failed.
• 500 - Internal Server Error
The request was invalid, either because the supplied JSON was invalid, or invalid information was supplied as part of the request.
• 503 - Service Unavailable
The request can't be serviced at this time, either because the cluster is overloaded, maintenance is underway, or some other reason. The request may be retried without changes, perhaps in a couple of minutes.

Server
The CouchDB server interface provides the basic interface to a CouchDB server for obtaining CouchDB information and getting and setting configuration information.

/
GET /
Accessing the root of a CouchDB instance returns meta information about the instance. The response is a JSON structure containing information about the server, including a welcome message, the version of the server, and a list of features. The features elements may change depending on which configuration options are enabled (for example, quickjs if it's set as the default JavaScript engine), or which additional components are installed and configured (for example the nouveau text indexing application).
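As a quick check that a node is reachable, the root endpoint can also be fetched from a script. The following is a minimal sketch using Python's standard library; the host, port, and absence of authentication are assumptions about a default local installation.

    import json
    from urllib.request import Request, urlopen

    # Ask the server root for its meta information; Accept selects a JSON response.
    req = Request("http://localhost:5984/", headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        info = json.load(resp)

    print(info["version"])    # e.g. "3.4.2"
    print(info["features"])   # list of enabled feature names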
Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET / HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Length: 247 Content-Type: application/json Date: Mon, 21 Oct 2024 21:53:51 GMT Server: CouchDB/3.4.2 (Erlang OTP/25) { "couchdb": "Welcome", "features": [ "access-ready", "partitioned", "pluggable-storage-engines", "reshard", "scheduler" ], "git_sha": "6e5ad2a5c", "uuid": "9ddf59457dbb8772316cf06fc5e5a2e4", "vendor": { "name": "The Apache Software Foundation" }, "version": "3.4.2" } /_active_tasks Changed in version 2.1.0: Because of how the scheduling replicator works, continuous replication jobs could be periodically stopped and then started later. When they are not running they will not appear in the _active_tasks endpoint Changed in version 3.3: Added bulk_get_attempts and bulk_get_docs fields for replication jobs. GET /_active_tasks List of running tasks, including the task type, name, status and process ID. The result is a JSON array of the currently running tasks, with each task being described with a single object. De- pending on operation type set of response object fields might be different. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • changes_done (number) Processed changes • database (string) Source database • pid (string) Process ID • progress (number) Current percentage progress • started_on (number) Task start time as unix timestamp • status (string) Task status message • task (string) Task name • total_changes (number) Total changes to process • type (string) Operation Type • updated_on (number) Unix timestamp of last operation update Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_active_tasks HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 1690 Content-Type: application/json Date: Sat, 10 Aug 2013 06:37:31 GMT Server: CouchDB (Erlang/OTP) [ { "changes_done": 64438, "database": "mailbox", "pid": "<0.12986.1>", "progress": 84, "started_on": 1376116576, "total_changes": 76215, "type": "database_compaction", "updated_on": 1376116619 }, { "changes_done": 14443, "database": "mailbox", "design_document": "c9753817b3ba7c674d92361f24f59b9f", "pid": "<0.10461.3>", "progress": 18, "started_on": 1376116621, "total_changes": 76215, "type": "indexer", "updated_on": 1376116650 }, { "changes_done": 5454, "database": "mailbox", "design_document": "_design/meta", "pid": "<0.6838.4>", "progress": 7, "started_on": 1376116632, "total_changes": 76215, "type": "indexer", "updated_on": 1376116651 }, { "checkpointed_source_seq": 68585, "continuous": false, "doc_id": null, "doc_write_failures": 0, "bulk_get_attempts": 4524, "bulk_get_docs": 4524, "docs_read": 4524, "docs_written": 4524, "missing_revisions_found": 4524, "pid": "<0.1538.5>", "progress": 44, "replication_id": "9bc1727d74d49d9e157e260bb8bbd1d5", "revisions_checked": 4524, "source": "mailbox", "source_seq": 154419, "started_on": 1376116644, "target": "http://mailsrv:5984/mailbox", "type": "replication", "updated_on": 1376116651 } ] /_all_dbs GET /_all_dbs Returns a list of all the databases in the CouchDB instance. 
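From a script, the database list can be retrieved in a single call. A minimal sketch using Python's standard library follows; the host, port, and admin credentials are placeholders, and whether credentials are required at all depends on your server's configuration.

    import json
    from base64 import b64encode
    from urllib.request import Request, urlopen

    # Placeholder admin credentials; _all_dbs may be restricted to administrators.
    auth = "Basic " + b64encode(b"admin:password").decode()

    req = Request("http://localhost:5984/_all_dbs",
                  headers={"Accept": "application/json", "Authorization": auth})
    with urlopen(req) as resp:
        dbs = json.load(resp)   # a JSON array of database names

    print(dbs)                  # e.g. ["_users", "contacts", "docs"]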
Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • descending (boolean) Return the databases in descending order by key. Default is false. • endkey (json) Stop returning databases when the specified key is reached. • end_key (json) Alias for endkey param • limit (number) Limit the number of the returned databases to the specified number. • skip (number) Skip this number of databases before starting to return the results. Default is 0. • startkey (json) Return databases starting with the specified key. • start_key (json) Alias for startkey. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_all_dbs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 52 Content-Type: application/json Date: Sat, 10 Aug 2013 06:57:48 GMT Server: CouchDB (Erlang/OTP) [ "_users", "contacts", "docs", "invoices", "locations" ] /_dbs_info Added in version 3.2. GET /_dbs_info Returns a list of all the databases information in the CouchDB instance. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • descending (boolean) Return databases information in descend- ing order by key. Default is false. • endkey (json) Stop returning databases information when the specified key is reached. • end_key (json) Alias for endkey param • limit (number) Limit the number of the returned databases in- formation to the specified number. • skip (number) Skip this number of databases before starting to return the results. Default is 0. • startkey (json) Return databases information starting with the specified key. • start_key (json) Alias for startkey. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_dbs_info HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Thu, 18 Nov 2021 14:37:35 GMT Server: CouchDB (Erlang OTP/23) [ { "key": "animals", "info": { "db_name": "animals", "update_seq": "52232", "sizes": { "file": 1178613587, "external": 1713103872, "active": 1162451555 }, "purge_seq": 0, "doc_del_count": 0, "doc_count": 52224, "disk_format_version": 6, "compact_running": false, "cluster": { "q": 8, "n": 3, "w": 2, "r": 2 }, "instance_start_time": "0" } } ] Added in version 2.2. POST /_dbs_info Returns information of a list of the specified databases in the CouchDB instance. This enables you to request information about multiple databases in a single request, in place of multiple GET /{db} requests. 
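For example, a monitoring script might fetch the document counts and file sizes of several databases in one round trip. This is a minimal sketch using Python's standard library; the database names and credentials are placeholders.

    import json
    from base64 import b64encode
    from urllib.request import Request, urlopen

    auth = "Basic " + b64encode(b"admin:password").decode()   # placeholder credentials
    body = json.dumps({"keys": ["animals", "plants"]}).encode("utf-8")

    req = Request("http://localhost:5984/_dbs_info", data=body, method="POST",
                  headers={"Accept": "application/json",
                           "Content-Type": "application/json",
                           "Authorization": auth})
    with urlopen(req) as resp:
        for entry in json.load(resp):
            info = entry["info"]
            print(entry["key"], info["doc_count"], info["sizes"]["file"])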
Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • keys (array) Array of database names to be requested Status Codes • 200 OK Request completed successfully • 400 Bad Request Missing keys or exceeded keys in request Request: POST /_dbs_info HTTP/1.1 Accept: application/json Host: localhost:5984 Content-Type: application/json { "keys": [ "animals", "plants" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 20 Dec 2017 06:57:48 GMT Server: CouchDB (Erlang/OTP) [ { "key": "animals", "info": { "db_name": "animals", "update_seq": "52232", "sizes": { "file": 1178613587, "external": 1713103872, "active": 1162451555 }, "purge_seq": 0, "doc_del_count": 0, "doc_count": 52224, "disk_format_version": 6, "compact_running": false, "cluster": { "q": 8, "n": 3, "w": 2, "r": 2 }, "instance_start_time": "0" } }, { "key": "plants", "info": { "db_name": "plants", "update_seq": "303", "sizes": { "file": 3872387, "external": 2339, "active": 67475 }, "purge_seq": 0, "doc_del_count": 0, "doc_count": 11, "disk_format_version": 6, "compact_running": false, "cluster": { "q": 8, "n": 3, "w": 2, "r": 2 }, "instance_start_time": "0" } } ] NOTE: The supported number of the specified databases in the list can be limited by modifying the max_db_number_for_dbs_info_req entry in configuration file. The default limit is 100. Increasing the limit, while possible, creates load on the server so it is advisable to have more requests with 100 dbs, rather than a few requests with 1000s of dbs at a time. /_cluster_setup Added in version 2.0. GET /_cluster_setup Returns the status of the node or cluster, per the cluster setup wizard. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • ensure_dbs_exist (array) List of system databases to ensure exist on the node/cluster. Defaults to ["_users","_replica- tor"]. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • state (string) Current state of the node and/or cluster (see below) Status Codes • 200 OK Request completed successfully The state returned indicates the current node or cluster state, and is one of the following: • cluster_disabled: The current node is completely unconfigured. • single_node_disabled: The current node is configured as a sin- gle (standalone) node ([cluster] n=1), but either does not have a server-level admin user defined, or does not have the standard system databases created. If the ensure_dbs_exist query parameter is specified, the list of databases provided overrides the default list of standard system databases. • single_node_enabled: The current node is configured as a sin- gle (standalone) node, has a server-level admin user defined, and has the ensure_dbs_exist list (explicit or default) of databases created. • cluster_enabled: The current node has [cluster] n > 1, is not bound to 127.0.0.1 and has a server-level admin user defined. However, the full set of standard system databases have not been created yet. If the ensure_dbs_exist query parameter is specified, the list of databases provided overrides the de- fault list of standard system databases. • cluster_finished: The current node has [cluster] n > 1, is not bound to 127.0.0.1, has a server-level admin user defined and has the ensure_dbs_exist list (explicit or default) of data- bases created. 
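In practice, a provisioning script can poll this endpoint until the node or cluster reaches one of the states listed above. A minimal sketch using Python's standard library follows; the address and admin credentials are placeholders.

    import json
    import time
    from base64 import b64encode
    from urllib.request import Request, urlopen

    AUTH = "Basic " + b64encode(b"admin:password").decode()   # placeholder credentials

    def cluster_state(base="http://localhost:5984"):
        req = Request(base + "/_cluster_setup",
                      headers={"Accept": "application/json", "Authorization": AUTH})
        with urlopen(req) as resp:
            return json.load(resp)["state"]

    # Wait until the setup wizard (or explicit setup calls) has finished.
    while cluster_state() not in ("cluster_finished", "single_node_enabled"):
        time.sleep(2)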
Request: GET /_cluster_setup HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK X-CouchDB-Body-Time: 0 X-Couch-Request-ID: 5c058bdd37 Server: CouchDB/2.1.0-7f17678 (Erlang OTP/17) Date: Sun, 30 Jul 2017 06:33:18 GMT Content-Type: application/json Content-Length: 29 Cache-Control: must-revalidate {"state":"cluster_enabled"} POST /_cluster_setup Configure a node as a single (standalone) node, as part of a cluster, or finalise a cluster. Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • action (string) .INDENT 2.0 • enable_single_node: Configure the current node as a single, standalone CouchDB server. • enable_cluster: Configure the local or remote node as one node, preparing it to be joined to a new CouchDB cluster. • add_node: Add the specified remote node to this clusters list of nodes, joining it to the cluster. • finish_cluster: Finalise the cluster by creating the standard system databases. • bind_address (string) The IP address to which to bind the current node. The special value 0.0.0.0 may be specified to bind to all in- terfaces on the host. (enable_cluster and enable_single_node only) • username (string) The username of the server-level administrator to create. (enable_cluster and enable_single_node only), or the remote servers administrator username (add_node) • password (string) The password for the server-level administrator to create. (enable_cluster and enable_single_node only), or the remote servers administrator username (add_node) • port (number) The TCP port to which to bind this node (enable_clus- ter and enable_single_node only) or the TCP port to which to bind a remote node (add_node only). • node_count (number) The total number of nodes to be joined into the cluster, including this one. Used to determine the value of the clus- ters n, up to a maximum of 3. (enable_cluster only) • remote_node (string) The IP address of the remote node to setup as part of this clusters list of nodes. (enable_cluster only) • remote_current_user (string) The username of the server-level admin- istrator authorized on the remote node. (enable_cluster only) • remote_current_password (string) The password of the server-level administrator authorized on the remote node. (enable_cluster only) • host (string) The remote node IP of the node to add to the cluster. (add_node only) • ensure_dbs_exist (array) List of system databases to ensure exist on the node/cluster. Defaults to ["_users","_replicator"]. No example request/response included here. For a worked example, please see The Cluster Setup API. /_db_updates Added in version 1.4. GET /_db_updates Returns a list of all database events in the CouchDB instance. The existence of the _global_changes database is required to use this endpoint. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • feed (string) .INDENT 2.0 • normal: Returns all historical DB changes, then closes the connection. Default. • longpoll: Closes the connection after the first event. • continuous: Send a line of JSON per event. Keeps the socket open until timeout. • eventsource: Like, continuous, but sends the events in - EventSource format. • timeout (number) Number of milliseconds until CouchDB closes the connection. Default is 60000. • heartbeat (number) Period in milliseconds after which an empty line is sent in the results. Only applicable for longpoll, continuous, and eventsource feeds. 
Overrides any timeout to keep the feed alive in- definitely. Default is 60000. May be true to use default value. • since (string) Return only updates since the specified sequence ID. If the sequence ID is specified but does not exist, all changes are returned. May be the string now to begin showing only new updates. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • Transfer-Encoding chunked Response JSON Object • results (array) An array of database events. For longpoll and continuous modes, the entire response is the contents of the results array. • last_seq (string) The last sequence ID reported. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired The results field of database updates: JSON Parameters • db_name (string) Database name. • type (string) A database event is one of created, up- dated, deleted. • seq (json) Update sequence of the event. Request: GET /_db_updates HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 18 Mar 2017 19:01:35 GMT Etag: "C1KU98Y6H0LGM7EQQYL6VSL07" Server: CouchDB/2.0.0 (Erlang OTP/17) Transfer-Encoding: chunked X-Couch-Request-ID: ad87efc7ff X-CouchDB-Body-Time: 0 { "results":[ {"db_name":"mailbox","type":"created","seq":"1-g1AAAAFReJzLYWBg4MhgTmHgzcvPy09JdcjLz8gvLskBCjMlMiTJ____PyuDOZExFyjAnmJhkWaeaIquGIf2JAUgmWQPMiGRAZcaB5CaePxqEkBq6vGqyWMBkgwNQAqobD4h"}, {"db_name":"mailbox","type":"deleted","seq":"2-g1AAAAFReJzLYWBg4MhgTmHgzcvPy09JdcjLz8gvLskBCjMlMiTJ____PyuDOZEpFyjAnmJhkWaeaIquGIf2JAUgmWQPMiGRAZcaB5CaePxqEkBq6vGqyWMBkgwNQAqobD4hdQsg6vYTUncAou4-IXUPIOpA7ssCAIFHa60"} ], "last_seq": "2-g1AAAAFReJzLYWBg4MhgTmHgzcvPy09JdcjLz8gvLskBCjMlMiTJ____PyuDOZEpFyjAnmJhkWaeaIquGIf2JAUgmWQPMiGRAZcaB5CaePxqEkBq6vGqyWMBkgwNQAqobD4hdQsg6vYTUncAou4-IXUPIOpA7ssCAIFHa60" } /_membership Added in version 2.0. GET /_membership Displays the nodes that are part of the cluster as clus- ter_nodes. The field all_nodes displays all nodes this node knows about, including the ones that are part of the cluster. The endpoint is useful when setting up a cluster, see Node Man- agement Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_membership HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 11 Jul 2015 07:02:41 GMT Server: CouchDB (Erlang/OTP) Content-Length: 142 { "all_nodes": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ], "cluster_nodes": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ] } /_replicate Changed in version 3.3: Added bulk_get_attempts and bulk_get_docs fields to the replication history response object. POST /_replicate Request, configure, or stop, a replication operation. Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • cancel (boolean) Cancels the replication • continuous (boolean) Configure the replication to be continu- ous • create_target (boolean) Creates the target database. Re- quired administrators privileges on target server. • create_target_params (object) An object that contains parame- ters to be used when creating the target database. 
Can include the standard q and n parameters. • winning_revs_only (boolean) Replicate winning revisions only. • doc_ids (array) Array of document IDs to be synchronized. doc_ids, filter, and selector are mutually exclusive. • filter (string) The name of a filter function. doc_ids, fil- ter, and selector are mutually exclusive. • selector (json) A selector to filter documents for synchro- nization. Has the same behavior as the selector objects in replication documents. doc_ids, filter, and selector are mu- tually exclusive. • source_proxy (string) Address of a proxy server through which replication from the source should occur (protocol can be http or socks5) • target_proxy (string) Address of a proxy server through which replication to the target should occur (protocol can be http or socks5) • source (string/object) Fully qualified source database URL or an object which contains the full URL of the source database with additional parameters like headers. Eg: - http://example.com/source_db_name or {url:url in here, head- ers: {header1:value1, }} . For backwards compatibility, CouchDB 3.x will auto-convert bare database names by prepend- ing the address and port CouchDB is listening on, to form a complete URL. This behaviour is deprecated in 3.x and will be removed in CouchDB 4.0. • target (string/object) Fully qualified target database URL or an object which contains the full URL of the target database with additional parameters like headers. Eg: - http://example.com/target_db_name or {url:url in here, head- ers: {header1:value1, }} . For backwards compatibility, CouchDB 3.x will auto-convert bare database names by prepend- ing the address and port CouchDB is listening on, to form a complete URL. This behaviour is deprecated in 3.x and will be removed in CouchDB 4.0. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • history (array) Replication history (see below) • ok (boolean) Replication status • replication_id_version (number) Replication protocol version • session_id (string) Unique session ID • source_last_seq (number) Last sequence number read from source database Status Codes • 200 OK Replication request successfully completed • 202 Accepted Continuous replication request has been accepted • 400 Bad Request Invalid JSON data • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Either the source or target DB is not found or attempt to cancel unknown replication task • 500 Internal Server Error JSON specification was invalid The specification of the replication request is controlled through the JSON content of the request. The JSON should be an object with the fields defining the source, target and other options. The Replication history is an array of objects with following structure: JSON Parameters • doc_write_failures (number) Number of document write failures • docs_read (number) Number of documents read • docs_written (number) Number of documents written to target • bulk_get_attempts (number) The total count of at- tempted doc revisions fetched with _bulk_get. • bulk_get_docs (number) The total count of successful docs fetched with _bulk_get. 
• end_last_seq (number) Last sequence number in changes stream • end_time (string) Date/Time replication operation com- pleted in RFC 2822 format • missing_checked (number) Number of missing documents checked • missing_found (number) Number of missing documents found • recorded_seq (number) Last recorded sequence number • session_id (string) Session ID for this replication operation • start_last_seq (number) First sequence number in changes stream • start_time (string) Date/Time replication operation started in RFC 2822 format NOTE: As of CouchDB 2.0.0, fully qualified URLs are required for both the replication source and target parameters. Request POST /_replicate HTTP/1.1 Accept: application/json Content-Length: 80 Content-Type: application/json Host: localhost:5984 { "source": "http://adm:pass@127.0.0.1:5984/db_a", "target": "http://adm:pass@127.0.0.1:5984/db_b" } Response HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 692 Content-Type: application/json Date: Sun, 11 Aug 2013 20:38:50 GMT Server: CouchDB (Erlang/OTP) { "history": [ { "doc_write_failures": 0, "docs_read": 10, "bulk_get_attempts": 10, "bulk_get_docs": 10, "docs_written": 10, "end_last_seq": 28, "end_time": "Sun, 11 Aug 2013 20:38:50 GMT", "missing_checked": 10, "missing_found": 10, "recorded_seq": 28, "session_id": "142a35854a08e205c47174d91b1f9628", "start_last_seq": 1, "start_time": "Sun, 11 Aug 2013 20:38:50 GMT" }, { "doc_write_failures": 0, "docs_read": 1, "bulk_get_attempts": 1, "bulk_get_docs": 1, "docs_written": 1, "end_last_seq": 1, "end_time": "Sat, 10 Aug 2013 15:41:54 GMT", "missing_checked": 1, "missing_found": 1, "recorded_seq": 1, "session_id": "6314f35c51de3ac408af79d6ee0c1a09", "start_last_seq": 0, "start_time": "Sat, 10 Aug 2013 15:41:54 GMT" } ], "ok": true, "replication_id_version": 3, "session_id": "142a35854a08e205c47174d91b1f9628", "source_last_seq": 28 } Replication Operation The aim of the replication is that at the end of the process, all ac- tive documents on the source database are also in the destination data- base and all documents that were deleted in the source databases are also deleted (if they exist) on the destination database. Replication can be described as either push or pull replication: • Pull replication is where the source is the remote CouchDB instance, and the target is the local database. Pull replication is the most useful solution to use if your source database has a permanent IP address, and your destination (local) database may have a dynamically assigned IP address (for example, through DHCP). This is particularly important if you are replicating to a mobile or other device from a central server. • Push replication is where the source is a local database, and target is a remote database. Specifying the Source and Target Database You must use the URL specification of the CouchDB database if you want to perform replication in either of the following two situations: • Replication with a remote database (i.e. 
another instance of CouchDB on the same host, or a different host)
• Replication with a database that requires authentication
For example, to request replication between a database local to the CouchDB instance to which you send the request, and a remote database, you might use the following request:

POST http://couchdb:5984/_replicate HTTP/1.1
Content-Type: application/json
Accept: application/json

{
   "source" : "recipes",
   "target" : "http://couchdb-remote:5984/recipes"
}

In all cases, the requested databases in the source and target specification must exist. If they do not, an error will be returned within the JSON object:

{
   "error" : "db_not_found",
   "reason" : "could not open http://couchdb-remote:5984/ol1ka/"
}

You can create the target database (providing your user credentials allow it) by adding the create_target field to the request object:

POST http://couchdb:5984/_replicate HTTP/1.1
Content-Type: application/json
Accept: application/json

{
   "create_target" : true,
   "source" : "recipes",
   "target" : "http://couchdb-remote:5984/recipes"
}

The create_target field is not destructive. If the database already exists, the replication proceeds as normal.

Single Replication
You can request replication of a database so that the two databases can be synchronized. By default, the replication process occurs one time and synchronizes the two databases together. For example, you can request a single synchronization between two databases by supplying the source and target fields within the request JSON content.

POST http://couchdb:5984/_replicate HTTP/1.1
Accept: application/json
Content-Type: application/json

{
   "source" : "recipes",
   "target" : "recipes-snapshot"
}

In the above example, the databases recipes and recipes-snapshot will be synchronized. These databases are local to the CouchDB instance where the request was made. The response will be a JSON structure containing the success (or failure) of the synchronization process, and statistics about the process:

{
   "ok" : true,
   "history" : [
      {
         "docs_read" : 1000,
         "bulk_get_attempts": 1000,
         "bulk_get_docs": 1000,
         "session_id" : "52c2370f5027043d286daca4de247db0",
         "recorded_seq" : 1000,
         "end_last_seq" : 1000,
         "doc_write_failures" : 0,
         "start_time" : "Thu, 28 Oct 2010 10:24:13 GMT",
         "start_last_seq" : 0,
         "end_time" : "Thu, 28 Oct 2010 10:24:14 GMT",
         "missing_checked" : 0,
         "docs_written" : 1000,
         "missing_found" : 1000
      }
   ],
   "session_id" : "52c2370f5027043d286daca4de247db0",
   "source_last_seq" : 1000
}

Continuous Replication
Synchronization of a database with the previously noted methods happens only once, at the time the replicate request is made. To have the target database permanently replicated from the source, you must set the continuous field of the JSON object within the request to true.
With continuous replication, changes in the source database are replicated to the target database in perpetuity until you specifically request that replication ceases.

POST http://couchdb:5984/_replicate HTTP/1.1
Accept: application/json
Content-Type: application/json

{
   "continuous" : true,
   "source" : "recipes",
   "target" : "http://couchdb-remote:5984/recipes"
}

Changes will be replicated between the two databases as long as a network connection is available between the two instances.
NOTE: To keep two databases synchronized with each other, you need to set up replication in both directions; that is, you must replicate from source to target, and separately from target to source.
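Following the note above, a bidirectional setup is simply two continuous replication requests, one in each direction. The sketch below uses Python's standard library; the server addresses, database names, and admin credentials are placeholders.

    import json
    from base64 import b64encode
    from urllib.request import Request, urlopen

    SERVER = "http://127.0.0.1:5984"                      # node that runs the replications
    AUTH   = "Basic " + b64encode(b"adm:pass").decode()   # placeholder admin credentials

    def replicate(source, target):
        body = json.dumps({"source": source, "target": target,
                           "continuous": True}).encode("utf-8")
        req = Request(SERVER + "/_replicate", data=body, method="POST",
                      headers={"Accept": "application/json",
                               "Content-Type": "application/json",
                               "Authorization": AUTH})
        with urlopen(req) as resp:
            return json.load(resp)   # continuous requests are answered with 202 Accepted

    # Replicate in each direction to keep the two databases synchronized.
    replicate("http://adm:pass@127.0.0.1:5984/recipes",
              "http://adm:pass@couchdb-remote:5984/recipes")
    replicate("http://adm:pass@couchdb-remote:5984/recipes",
              "http://adm:pass@127.0.0.1:5984/recipes")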
Canceling Continuous Replication
You can cancel continuous replication by adding the cancel field to the JSON request object and setting the value to true. Note that the structure of the request must be identical to the original for the cancellation request to be honoured. For example, if you requested continuous replication, the cancellation request must also contain the continuous field.
For example, the replication request:

POST http://couchdb:5984/_replicate HTTP/1.1
Content-Type: application/json
Accept: application/json

{
   "source" : "recipes",
   "target" : "http://couchdb-remote:5984/recipes",
   "create_target" : true,
   "continuous" : true
}

must be canceled using the request:

POST http://couchdb:5984/_replicate HTTP/1.1
Accept: application/json
Content-Type: application/json

{
   "cancel" : true,
   "continuous" : true,
   "create_target" : true,
   "source" : "recipes",
   "target" : "http://couchdb-remote:5984/recipes"
}

Requesting cancellation of a replication that does not exist results in a 404 error.

/_scheduler/jobs
GET /_scheduler/jobs
List of replication jobs. Includes replications created via the /_replicate endpoint as well as those created from replication documents. Does not include replications which have completed or have failed to start because replication documents were malformed. Each job description will include source and target information, replication id, a history of recent events, and a few other things.

Request Headers
• Accept
  • application/json
Response Headers
• Content-Type
  • application/json
Query Parameters
• limit (number) How many results to return
• skip (number) How many results to skip, starting at the beginning, ordered by replication ID
Response JSON Object
• offset (number) How many results were skipped
• total_rows (number) Total number of replication jobs
• id (string) Replication ID.
• database (string) Replication document database • doc_id (string) Replication document ID • history (list) Timestamped history of events as a list of ob- jects • pid (string) Replication process ID • node (string) Cluster node where the job is running • source (string) Replication source • target (string) Replication target • start_time (string) Timestamp of when the replication was started Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_scheduler/jobs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 1690 Content-Type: application/json Date: Sat, 29 Apr 2017 05:05:16 GMT Server: CouchDB (Erlang/OTP) { "jobs": [ { "database": "_replicator", "doc_id": "cdyno-0000001-0000003", "history": [ { "timestamp": "2017-04-29T05:01:37Z", "type": "started" }, { "timestamp": "2017-04-29T05:01:37Z", "type": "added" } ], "id": "8f5b1bd0be6f9166ccfd36fc8be8fc22+continuous", "info": { "changes_pending": 0, "checkpointed_source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "doc_write_failures": 0, "docs_read": 113, "docs_written": 113, "bulk_get_attempts": 113, "bulk_get_docs": 113, "missing_revisions_found": 113, "revisions_checked": 113, "source_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ", "through_seq": "113-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE01ygQLsZsYGqcamiZjKcRqRxwIkGRqA1H-oSbZgk1KMLCzTDE0wdWUBAF6HJIQ" }, "node": "node1@127.0.0.1", "pid": "<0.1850.0>", "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "target": "http://adm:*****@localhost:15984/cdyno-0000003/", "user": null }, { "database": "_replicator", "doc_id": "cdyno-0000001-0000002", "history": [ { "timestamp": "2017-04-29T05:01:37Z", "type": "started" }, { "timestamp": "2017-04-29T05:01:37Z", "type": "added" } ], "id": "e327d79214831ca4c11550b4a453c9ba+continuous", "info": { "changes_pending": null, "checkpointed_source_seq": 0, "doc_write_failures": 0, "docs_read": 12, "docs_written": 12, "bulk_get_attempts": 12, "bulk_get_docs": 12, "missing_revisions_found": 12, "revisions_checked": 12, "source_seq": "12-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE1lzgQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSexgk4yMkhITjS0wdWUBADfEJBg", "through_seq": "12-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE1lzgQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSexgk4yMkhITjS0wdWUBADfEJBg" }, "node": "node2@127.0.0.1", "pid": "<0.1757.0>", "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "target": "http://adm:*****@localhost:15984/cdyno-0000002/", "user": null } ], "offset": 0, "total_rows": 2 } /_scheduler/docs Changed in version 2.1.0: Use this endpoint to monitor the state of document-based replications. Previously needed to poll both documents and _active_tasks to get a complete state summary Changed in version 3.0.0: In error states the info field switched from being a string to being an object Changed in version 3.3: Added bulk_get_attempts and bulk_get_docs the info object. GET /_scheduler/docs List of replication document states. Includes information about all the documents, even in completed and failed states. For each document it returns the document ID, the database, the replica- tion ID, source and target, and other information. 
Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Query Parameters • limit (number) How many results to return • skip (number) How many result to skip starting at the begin- ning, if ordered by document ID Response JSON Object • offset (number) How many results were skipped • total_rows (number) Total number of replication documents. • id (string) Replication ID, or null if state is completed or failed • state (string) One of following states (see Replication states for descriptions): initializing, running, completed, pending, crashing, error, failed • database (string) Database where replication document came from • doc_id (string) Replication document ID • node (string) Cluster node where the job is running • source (string) Replication source • target (string) Replication target • start_time (string) Timestamp of when the replication was started • last_updated (string) Timestamp of last state update • info (object) Will contain additional information about the state. For errors, this will be an object with an "error" field and string value. For success states, see below. • error_count (number) Consecutive errors count. Indicates how many times in a row this replication has crashed. Replication will be retried with an exponential backoff based on this num- ber. As soon as the replication succeeds this count is reset to 0. To can be used to get an idea why a particular replica- tion is not making progress. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired The info field of a scheduler doc: JSON Parameters • revisions_checked (number) The count of revisions which have been checked since this replication began. • missing_revisions_found (number) The count of revi- sions which were found on the source, but missing from the target. • docs_read (number) The count of docs which have been read from the source. • docs_written (number) The count of docs which have been written to the target. • bulk_get_attempts (number) The total count of at- tempted doc revisions fetched with _bulk_get. • bulk_get_docs (number) The total count of successful docs fetched with _bulk_get. • changes_pending (number) The count of changes not yet replicated. • doc_write_failures (number) The count of docs which failed to be written to the target. • checkpointed_source_seq (object) The source sequence id which was last successfully replicated. 
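For example, a script that has created replication documents can watch their states here instead of polling _active_tasks. A minimal sketch using Python's standard library follows; the address and admin credentials are placeholders, and the states tested are those listed above.

    import json
    from base64 import b64encode
    from urllib.request import Request, urlopen

    AUTH = "Basic " + b64encode(b"adm:pass").decode()   # placeholder admin credentials

    def replication_states(base="http://localhost:5984"):
        req = Request(base + "/_scheduler/docs",
                      headers={"Accept": "application/json", "Authorization": AUTH})
        with urlopen(req) as resp:
            body = json.load(resp)
        # Map each replication document ID to its current scheduler state.
        return {doc["doc_id"]: doc["state"] for doc in body["docs"]}

    for doc_id, state in replication_states().items():
        if state in ("crashing", "error", "failed"):
            print("replication", doc_id, "needs attention:", state)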
Request: GET /_scheduler/docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json Date: Sat, 29 Apr 2017 05:10:08 GMT Server: Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "docs": [ { "database": "_replicator", "doc_id": "cdyno-0000001-0000002", "error_count": 0, "id": "e327d79214831ca4c11550b4a453c9ba+continuous", "info": { "changes_pending": 15, "checkpointed_source_seq": "60-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYEyVygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSSpgk4yMkhITjS0wdWUBAENCJEg", "doc_write_failures": 0, "docs_read": 67, "bulk_get_attempts": 67, "bulk_get_docs": 67, "docs_written": 67, "missing_revisions_found": 67, "revisions_checked": 67, "source_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8", "through_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8" }, "last_updated": "2017-04-29T05:01:37Z", "node": "node2@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "state": "running", "target": "http://adm:*****@localhost:15984/cdyno-0000002/" }, { "database": "_replicator", "doc_id": "cdyno-0000001-0000003", "error_count": 0, "id": "8f5b1bd0be6f9166ccfd36fc8be8fc22+continuous", "info": { "changes_pending": null, "checkpointed_source_seq": 0, "doc_write_failures": 0, "bulk_get_attempts": 12, "bulk_get_docs": 12, "docs_read": 12, "docs_written": 12, "missing_revisions_found": 12, "revisions_checked": 12, "source_seq": "12-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE1lzgQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSexgk4yMkhITjS0wdWUBADfEJBg", "through_seq": "12-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE1lzgQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSexgk4yMkhITjS0wdWUBADfEJBg" }, "last_updated": "2017-04-29T05:01:37Z", "node": "node1@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "state": "running", "target": "http://adm:*****@localhost:15984/cdyno-0000003/" } ], "offset": 0, "total_rows": 2 } GET /_scheduler/docs/{replicator_db} Get information about replication documents from a replicator database. The default replicator database is _replicator but other replicator databases can exist if their name ends with the suffix /_replicator. NOTE: As a convenience slashes (/) in replicator db names do not have to be escaped. So /_scheduler/docs/other/_replicator is valid and equivalent to /_scheduler/docs/other%2f_replicator Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Query Parameters • limit (number) How many results to return • skip (number) How many result to skip starting at the begin- ning, if ordered by document ID Response JSON Object • offset (number) How many results were skipped • total_rows (number) Total number of replication documents. 
• id (string) Replication ID, or null if state is completed or failed • state (string) One of following states (see Replication states for descriptions): initializing, running, completed, pending, crashing, error, failed • database (string) Database where replication document came from • doc_id (string) Replication document ID • node (string) Cluster node where the job is running • source (string) Replication source • target (string) Replication target • start_time (string) Timestamp of when the replication was started • last_update (string) Timestamp of last state update • info (object) Will contain additional information about the state. For errors, this will be an object with an "error" field and string value. For success states, see below. • error_count (number) Consecutive errors count. Indicates how many times in a row this replication has crashed. Replication will be retried with an exponential backoff based on this num- ber. As soon as the replication succeeds this count is reset to 0. To can be used to get an idea why a particular replica- tion is not making progress. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired The info field of a scheduler doc: JSON Parameters • revisions_checked (number) The count of revisions which have been checked since this replication began. • missing_revisions_found (number) The count of revi- sions which were found on the source, but missing from the target. • docs_read (number) The count of docs which have been read from the source. • docs_written (number) The count of docs which have been written to the target. • bulk_get_attempts (number) The total count of at- tempted doc revisions fetched with _bulk_get. • bulk_get_docs (number) The total count of successful docs fetched with _bulk_get. • changes_pending (number) The count of changes not yet replicated. • doc_write_failures (number) The count of docs which failed to be written to the target. • checkpointed_source_seq (object) The source sequence id which was last successfully replicated. 
Request: GET /_scheduler/docs/other/_replicator HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json Date: Sat, 29 Apr 2017 05:10:08 GMT Server: Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "docs": [ { "database": "other/_replicator", "doc_id": "cdyno-0000001-0000002", "error_count": 0, "id": "e327d79214831ca4c11550b4a453c9ba+continuous", "info": { "changes_pending": 0, "checkpointed_source_seq": "60-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYEyVygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSSpgk4yMkhITjS0wdWUBAENCJEg", "doc_write_failures": 0, "docs_read": 67, "bulk_get_attempts": 67, "bulk_get_docs": 67, "docs_written": 67, "missing_revisions_found": 67, "revisions_checked": 67, "source_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8", "through_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8" }, "last_updated": "2017-04-29T05:01:37Z", "node": "node2@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "state": "running", "target": "http://adm:*****@localhost:15984/cdyno-0000002/" } ], "offset": 0, "total_rows": 1 } GET /_scheduler/docs/{replicator_db}/{docid} NOTE: As a convenience slashes (/) in replicator db names do not have to be escaped. So /_scheduler/docs/other/_replicator is valid and equivalent to /_scheduler/docs/other%2f_replicator Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • id (string) Replication ID, or null if state is completed or failed • state (string) One of following states (see Replication states for descriptions): initializing, running, completed, pending, crashing, error, failed • database (string) Database where replication document came from • doc_id (string) Replication document ID • node (string) Cluster node where the job is running • source (string) Replication source • target (string) Replication target • start_time (string) Timestamp of when the replication was started • last_update (string) Timestamp of last state update • info (object) Will contain additional information about the state. For errors, this will be an object with an "error" field and string value. For success states, see below. • error_count (number) Consecutive errors count. Indicates how many times in a row this replication has crashed. Replication will be retried with an exponential backoff based on this num- ber. As soon as the replication succeeds this count is reset to 0. To can be used to get an idea why a particular replica- tion is not making progress. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired The info field of a scheduler doc: JSON Parameters • revisions_checked (number) The count of revisions which have been checked since this replication began. • missing_revisions_found (number) The count of revi- sions which were found on the source, but missing from the target. • docs_read (number) The count of docs which have been read from the source. • docs_written (number) The count of docs which have been written to the target. • bulk_get_attempts (number) The total count of at- tempted doc revisions fetched with _bulk_get. 
• bulk_get_docs (number) The total count of successful docs fetched with _bulk_get. • changes_pending (number) The count of changes not yet replicated. • doc_write_failures (number) The count of docs which failed to be written to the target. • checkpointed_source_seq (object) .INDENT 2.0 The source sequence id which was last successfully replicated. Request: GET /_scheduler/docs/other/_replicator/cdyno-0000001-0000002 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json Date: Sat, 29 Apr 2017 05:10:08 GMT Server: Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "database": "other/_replicator", "doc_id": "cdyno-0000001-0000002", "error_count": 0, "id": "e327d79214831ca4c11550b4a453c9ba+continuous", "info": { "changes_pending": 0, "checkpointed_source_seq": "60-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYEyVygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSSpgk4yMkhITjS0wdWUBAENCJEg", "doc_write_failures": 0, "docs_read": 67, "bulk_get_attempts": 67, "bulk_get_docs": 67, "docs_written": 67, "missing_revisions_found": 67, "revisions_checked": 67, "source_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8", "through_seq": "67-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE2VygQLsBsZm5pZJJpjKcRqRxwIkGRqA1H-oSepgk4yMkhITjS0wdWUBAEVKJE8" }, "last_updated": "2017-04-29T05:01:37Z", "node": "node2@127.0.0.1", "source_proxy": null, "target_proxy": null, "source": "http://myserver.com/foo", "start_time": "2017-04-29T05:01:37Z", "state": "running", "target": "http://adm:*****@localhost:15984/cdyno-0000002/" } /_node/{node-name} GET /_node/{node-name} The /_node/{node-name} endpoint can be used to confirm the Er- lang node name of the server that processes the request. This is most useful when accessing /_node/_local to retrieve this infor- mation. Repeatedly retrieving this information for a CouchDB endpoint can be useful to determine if a CouchDB cluster is cor- rectly proxied through a reverse load balancer. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_node/_local HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 27 Content-Type: application/json Date: Tue, 28 Jan 2020 19:25:51 GMT Server: CouchDB (Erlang OTP) X-Couch-Request-ID: 5b8db6c677 X-CouchDB-Body-Time: 0 {"name":"node1@127.0.0.1"} /_node/{node-name}/_stats GET /_node/{node-name}/_stats The _stats resource returns a JSON object containing the statis- tics for the running server. The object is structured with top-level sections collating the statistics for a range of en- tries, with each individual statistic being easily identified, and the content of each statistic is self-describing. Statistics are sampled internally on a configurable interval. When monitoring the _stats endpoint, you need to use a polling frequency of at least twice this to observe accurate results. For example, if the interval is 10 seconds, poll _stats at least every 5 seconds. The literal string _local serves as an alias for the local node name, so for all stats URLs, {node-name} may be replaced with _local, to interact with the local nodes statistics. 
Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_node/_local/_stats/couchdb/request_time HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 187 Content-Type: application/json Date: Sat, 10 Aug 2013 11:41:11 GMT Server: CouchDB (Erlang/OTP) { "value": { "min": 0, "max": 0, "arithmetic_mean": 0, "geometric_mean": 0, "harmonic_mean": 0, "median": 0, "variance": 0, "standard_deviation": 0, "skewness": 0, "kurtosis": 0, "percentile": [ [ 50, 0 ], [ 75, 0 ], [ 90, 0 ], [ 95, 0 ], [ 99, 0 ], [ 999, 0 ] ], "histogram": [ [ 0, 0 ] ], "n": 0 }, "type": "histogram", "desc": "length of a request inside CouchDB without MochiWeb" } The fields provide the current, minimum and maximum, and a collection of sta- tistical means and quantities. The quantity in each case is not defined, but the descriptions below provide sufficient detail to determine units. Statistics are reported by group. The statistics are divided into the follow- ing top-level sections: • couch_log: Logging subsystem • couch_replicator: Replication scheduler and subsystem • couchdb: Primary CouchDB database operations • fabric: Cluster-related operations • global_changes: Global changes feed • mem3: Node membership-related statistics • pread: CouchDB file-related exceptions • rexi: Cluster internal RPC-related statistics The type of the statistic is included in the type field, and is one of the following: • counter: Monotonically increasing counter, resets on restart • histogram: Binned set of values with meaningful subdivisions. Scoped to the current collection interval. • gauge: Single numerical value that can go up and down You can also access individual statistics by quoting the statistics sections and statistic ID as part of the URL path. For example, to get the request_time statistics within the couchdb section for the target node, you can use: GET /_node/_local/_stats/couchdb/request_time HTTP/1.1 This returns an entire statistics object, as with the full request, but containing only the requested individual statistic. /_node/{node-name}/_prometheus GET /_node/{node-name}/_prometheus The _prometheus resource returns a text/plain response that con- solidates our /_node/{node-name}/_stats, and /_node/{node-name}/_system endpoints. The format is determined by Prometheus. The format version is 2.0. 
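Going back to the individual-statistic URLs described above, a short sketch that reads the request_time histogram for the local node and pulls out its percentile pairs (admin credentials are an assumption; the response shape matches the example shown earlier):

import requests

BASE = "http://localhost:5984"
AUTH = ("admin", "password")   # admin credentials (assumption)

stat = requests.get(BASE + "/_node/_local/_stats/couchdb/request_time",
                    auth=AUTH).json()

if stat["type"] == "histogram":
    value = stat["value"]
    # "percentile" is a list of [quantile, value] pairs, e.g. [99, 0].
    percentiles = {q: v for q, v in value["percentile"]}
    print(stat["desc"])
    print("samples:", value["n"],
          "mean:", value["arithmetic_mean"],
          "p99:", percentiles.get(99))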
Request: GET /_node/_local/_prometheus HTTP/1.1 Accept: text/plain Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 187 Content-Type: text/plain; version=2.0 Date: Sat, 10 May 2020 11:41:11 GMT Server: CouchDB (Erlang/OTP) # TYPE couchdb_couch_log_requests_total counter couchdb_couch_log_requests_total{level="alert"} 0 couchdb_couch_log_requests_total{level="critical"} 0 couchdb_couch_log_requests_total{level="debug"} 0 couchdb_couch_log_requests_total{level="emergency"} 0 couchdb_couch_log_requests_total{level="error"} 0 couchdb_couch_log_requests_total{level="info"} 8 couchdb_couch_log_requests_total{level="notice"} 51 couchdb_couch_log_requests_total{level="warning"} 0 # TYPE couchdb_couch_replicator_changes_manager_deaths_total counter couchdb_couch_replicator_changes_manager_deaths_total 0 # TYPE couchdb_couch_replicator_changes_queue_deaths_total counter couchdb_couch_replicator_changes_queue_deaths_total 0 # TYPE couchdb_couch_replicator_changes_read_failures_total counter couchdb_couch_replicator_changes_read_failures_total 0 # TYPE couchdb_couch_replicator_changes_reader_deaths_total counter couchdb_couch_replicator_changes_reader_deaths_total 0 # TYPE couchdb_couch_replicator_checkpoints_failure_total counter couchdb_couch_replicator_checkpoints_failure_total 0 # TYPE couchdb_couch_replicator_checkpoints_total counter couchdb_couch_replicator_checkpoints_total 0 # TYPE couchdb_couch_replicator_connection_acquires_total counter couchdb_couch_replicator_connection_acquires_total 0 # TYPE couchdb_couch_replicator_connection_closes_total counter couchdb_couch_replicator_connection_closes_total 0 # TYPE couchdb_couch_replicator_connection_creates_total counter couchdb_couch_replicator_connection_creates_total 0 # TYPE couchdb_couch_replicator_connection_owner_crashes_total counter couchdb_couch_replicator_connection_owner_crashes_total 0 # TYPE couchdb_couch_replicator_connection_releases_total counter couchdb_couch_replicator_connection_releases_total 0 # TYPE couchdb_couch_replicator_connection_worker_crashes_total counter couchdb_couch_replicator_connection_worker_crashes_total 0 # TYPE couchdb_couch_replicator_db_scans_total counter couchdb_couch_replicator_db_scans_total 1 # TYPE couchdb_couch_replicator_docs_completed_state_updates_total counter couchdb_couch_replicator_docs_completed_state_updates_total 0 # TYPE couchdb_couch_replicator_docs_db_changes_total counter couchdb_couch_replicator_docs_db_changes_total 0 # TYPE couchdb_couch_replicator_docs_dbs_deleted_total counter couchdb_couch_replicator_docs_dbs_deleted_total 0 # TYPE couchdb_couch_replicator_docs_dbs_found_total counter couchdb_couch_replicator_docs_dbs_found_total 2 # TYPE couchdb_couch_replicator_docs_failed_state_updates_total counter couchdb_couch_replicator_docs_failed_state_updates_total 0 # TYPE couchdb_couch_replicator_failed_starts_total counter couchdb_couch_replicator_failed_starts_total 0 # TYPE couchdb_couch_replicator_jobs_adds_total counter couchdb_couch_replicator_jobs_adds_total 0 # TYPE couchdb_couch_replicator_jobs_crashed gauge couchdb_couch_replicator_jobs_crashed 0 # TYPE couchdb_couch_replicator_jobs_crashes_total counter couchdb_couch_replicator_jobs_crashes_total 0 # TYPE couchdb_couch_replicator_jobs_duplicate_adds_total counter couchdb_couch_replicator_jobs_duplicate_adds_total 0 # TYPE couchdb_couch_replicator_jobs_pending gauge couchdb_couch_replicator_jobs_pending 0 # TYPE couchdb_couch_replicator_jobs_removes_total counter 
couchdb_couch_replicator_jobs_removes_total 0 # TYPE couchdb_couch_replicator_jobs_running gauge couchdb_couch_replicator_jobs_running 0 # TYPE couchdb_couch_replicator_jobs_starts_total counter couchdb_couch_replicator_jobs_starts_total 0 # TYPE couchdb_couch_replicator_jobs_stops_total counter couchdb_couch_replicator_jobs_stops_total 0 # TYPE couchdb_couch_replicator_jobs_total gauge couchdb_couch_replicator_jobs_total 0 # TYPE couchdb_couch_replicator_requests_total counter couchdb_couch_replicator_requests_total 0 # TYPE couchdb_couch_replicator_responses_failure_total counter couchdb_couch_replicator_responses_failure_total 0 # TYPE couchdb_couch_replicator_responses_total counter couchdb_couch_replicator_responses_total 0 # TYPE couchdb_couch_replicator_stream_responses_failure_total counter couchdb_couch_replicator_stream_responses_failure_total 0 # TYPE couchdb_couch_replicator_stream_responses_total counter couchdb_couch_replicator_stream_responses_total 0 # TYPE couchdb_couch_replicator_worker_deaths_total counter couchdb_couch_replicator_worker_deaths_total 0 # TYPE couchdb_couch_replicator_workers_started_total counter couchdb_couch_replicator_workers_started_total 0 # TYPE couchdb_auth_cache_requests_total counter couchdb_auth_cache_requests_total 0 # TYPE couchdb_auth_cache_misses_total counter couchdb_auth_cache_misses_total 0 # TYPE couchdb_collect_results_time_seconds summary couchdb_collect_results_time_seconds{quantile="0.5"} 0.0 couchdb_collect_results_time_seconds{quantile="0.75"} 0.0 couchdb_collect_results_time_seconds{quantile="0.9"} 0.0 couchdb_collect_results_time_seconds{quantile="0.95"} 0.0 couchdb_collect_results_time_seconds{quantile="0.99"} 0.0 couchdb_collect_results_time_seconds{quantile="0.999"} 0.0 couchdb_collect_results_time_seconds_sum 0.0 couchdb_collect_results_time_seconds_count 0 # TYPE couchdb_couch_server_lru_skip_total counter couchdb_couch_server_lru_skip_total 0 # TYPE couchdb_database_purges_total counter couchdb_database_purges_total 0 # TYPE couchdb_database_reads_total counter couchdb_database_reads_total 0 # TYPE couchdb_database_writes_total counter couchdb_database_writes_total 0 # TYPE couchdb_db_open_time_seconds summary couchdb_db_open_time_seconds{quantile="0.5"} 0.0 couchdb_db_open_time_seconds{quantile="0.75"} 0.0 couchdb_db_open_time_seconds{quantile="0.9"} 0.0 couchdb_db_open_time_seconds{quantile="0.95"} 0.0 couchdb_db_open_time_seconds{quantile="0.99"} 0.0 couchdb_db_open_time_seconds{quantile="0.999"} 0.0 couchdb_db_open_time_seconds_sum 0.0 couchdb_db_open_time_seconds_count 0 # TYPE couchdb_dbinfo_seconds summary couchdb_dbinfo_seconds{quantile="0.5"} 0.0 couchdb_dbinfo_seconds{quantile="0.75"} 0.0 couchdb_dbinfo_seconds{quantile="0.9"} 0.0 couchdb_dbinfo_seconds{quantile="0.95"} 0.0 couchdb_dbinfo_seconds{quantile="0.99"} 0.0 couchdb_dbinfo_seconds{quantile="0.999"} 0.0 couchdb_dbinfo_seconds_sum 0.0 couchdb_dbinfo_seconds_count 0 # TYPE couchdb_document_inserts_total counter couchdb_document_inserts_total 0 # TYPE couchdb_document_purges_failure_total counter couchdb_document_purges_failure_total 0 # TYPE couchdb_document_purges_success_total counter couchdb_document_purges_success_total 0 # TYPE couchdb_document_purges_total_total counter couchdb_document_purges_total_total 0 # TYPE couchdb_document_writes_total counter couchdb_document_writes_total 0 # TYPE couchdb_httpd_aborted_requests_total counter couchdb_httpd_aborted_requests_total 0 # TYPE couchdb_httpd_all_docs_timeouts_total counter 
couchdb_httpd_all_docs_timeouts_total 0 # TYPE couchdb_httpd_bulk_docs_seconds summary couchdb_httpd_bulk_docs_seconds{quantile="0.5"} 0.0 couchdb_httpd_bulk_docs_seconds{quantile="0.75"} 0.0 couchdb_httpd_bulk_docs_seconds{quantile="0.9"} 0.0 couchdb_httpd_bulk_docs_seconds{quantile="0.95"} 0.0 couchdb_httpd_bulk_docs_seconds{quantile="0.99"} 0.0 couchdb_httpd_bulk_docs_seconds{quantile="0.999"} 0.0 couchdb_httpd_bulk_docs_seconds_sum 0.0 couchdb_httpd_bulk_docs_seconds_count 0 ...remaining couchdb metrics from _stats and _system If an additional port config option is specified, then a client can call this API using that port which does not require authentication. This option is false (OFF) by default. When the option is true (ON), the default ports for a 3 node cluster are 17986, 27986, 37986. See Configuration of Prometheus Endpoint for details. GET /_node/_local/_prometheus HTTP/1.1 Accept: text/plain Host: localhost:17986 /_node/{node-name}/_smoosh/status Added in version 3.4. GET /_node/{node-name}/_smoosh/status This prints the state of each channel, how many jobs they are currently running and how many jobs are enqueued (as well as the lowest and highest priority of those enqueued items). The idea is to provide, at a glance, sufficient insight into smoosh that an operator can assess whether smoosh is adequately targeting the reclaimable space in the cluster. In general, a healthy status output will have items in the ra- tio_dbs and ratio_views channels. Owing to the default settings, the slack_dbs and slack_views will almost certainly have items in them. Historically, weve not found that the slack channels, on their own, are particularly adept at keeping things well com- pacted. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privi- leges required Request: GET /_node/_local/_smoosh/status HTTP/1.1 Host: 127.0.0.1:5984 Accept: */* Response: HTTP/1.1 200 OK Content-Type: application/json { "channels": { "slack_dbs": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 }, "ratio_dbs": { "starting": 0, "waiting": { "size": 56, "min": 1.125, "max": 11.0625 }, "active": 0 }, "ratio_views": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 }, "upgrade_dbs": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 }, "slack_views": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 }, "upgrade_views": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 }, "index_cleanup": { "starting": 0, "waiting": { "size": 0, "min": 0, "max": 0 }, "active": 0 } } } /_node/{node-name}/_system GET /_node/{node-name}/_system The _system resource returns a JSON object containing various system-level statistics for the running server. The object is structured with top-level sections collating the statistics for a range of entries, with each individual statistic being easily identified, and the content of each statistic is self-describ- ing. The literal string _local serves as an alias for the local node name, so for all stats URLs, {node-name} may be replaced with _local, to interact with the local nodes statistics. 
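Because the _prometheus endpoint returns plain Prometheus exposition text, it can be scraped and filtered with nothing more than line matching. A sketch (admin credentials are an assumption; if the dedicated Prometheus port is enabled, the same path can instead be fetched on that port without authentication):

import requests

resp = requests.get("http://localhost:5984/_node/_local/_prometheus",
                    auth=("admin", "password"),
                    headers={"Accept": "text/plain"})

# Keep only the replication-scheduler job metrics from the output above.
for line in resp.text.splitlines():
    if line.startswith("couchdb_couch_replicator_jobs_"):
        print(line)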
Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /_node/_local/_system HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 187 Content-Type: application/json Date: Sat, 10 Aug 2013 11:41:11 GMT Server: CouchDB (Erlang/OTP) { "uptime": 259, "memory": {} } These statistics are generally intended for CouchDB developers only. /_node/{node-name}/_restart POST /_node/{node-name}/_restart This API is to facilitate integration testing only it is not meant to be used in production Status Codes • 200 OK Request completed successfully /_node/{node-name}/_versions GET /_node/{node-name}/_versions The _versions resource returns a JSON object containing various system-level information for the running server. Optionally, if a clouseau search node is detected, its version will also be displayed. The literal string _local serves as an alias for the local node name, so for all stats URLs, {node-name} may be replaced with _local, to interact with the local nodes informations. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json Status Codes • 200 OK Request completed successfully Request: GET /_node/_local/_versions HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 368 Content-Type: application/json Date: Sat, 03 Sep 2022 08:12:12 GMT Server: CouchDB/3.2.2-ea382cf (Erlang OTP/25) { "javascript_engine": { "version": "91", "name": "spidermonkey" }, "erlang": { "version": "25.0.4", "supported_hashes": [ "sha", "sha224", "sha256", ] }, "clouseau": { "version": "2.24.0" }, "collation_driver": { "name": "libicu", "library_version": "70.1", "collator_version": "153.112", "collation_algorithm_version": "14" } } /_search_analyze WARNING: Search endpoints require a running search plugin connected to each cluster node. See Search Plugin Installation for details. Added in version 3.0. POST /_search_analyze Tests the results of Lucene analyzer tokenization on sample text. Parameters • analyzer Type of analyzer • text Analyzer token you want to test Status Codes • 200 OK Request completed successfully • 400 Bad Request Request body is wrong (malformed or missing one of the mandatory fields) • 500 Internal Server Error A server error (or other kind of error) occurred Request: POST /_search_analyze HTTP/1.1 Host: localhost:5984 Content-Type: application/json {"analyzer":"english", "text":"running"} Response: { "tokens": [ "run" ] } /_nouveau_analyze WARNING: Nouveau is an experimental feature. Future releases might change how the endpoints work and might invalidate existing indexes. WARNING: Nouveau endpoints require a running nouveau server. See Nouveau Server Installation for details. Added in version 3.4.0. POST /_nouveau_analyze Tests the results of Lucene analyzer tokenization on sample text. 
Parameters • analyzer Name of analyzer • text Analyzer token you want to test Status Codes • 200 OK Request completed successfully • 400 Bad Request Request body is wrong (malformed or missing one of the mandatory fields) • 500 Internal Server Error A server error (or other kind of error) occurred Request: POST /_nouveau_analyze HTTP/1.1 Host: localhost:5984 Content-Type: application/json {"analyzer":"english", "text":"running"} Response: { "tokens": [ "run" ] } /_utils GET /_utils Accesses the built-in Fauxton administration interface for CouchDB. Response Headers • Location New URI location Status Codes • 301 Moved Permanently Redirects to GET /_utils/ GET /_utils/ Response Headers • Content-Type text/html • Last-Modified Static files modification timestamp Status Codes • 200 OK Request completed successfully /_up Added in version 2.0. GET /_up Confirms that the server is up, running, and ready to respond to requests. If maintenance_mode is true or nolb, the endpoint will return a 404 response. Response Headers • Content-Type application/json Status Codes • 200 OK Request completed successfully • 404 Not Found The server is unavailable for requests at this time. Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 16 Content-Type: application/json Date: Sat, 17 Mar 2018 04:46:26 GMT Server: CouchDB/2.2.0-f999071ec (Erlang OTP/19) X-Couch-Request-ID: c57a3b2787 X-CouchDB-Body-Time: 0 {"status":"ok"} /_uuids Changed in version 2.0.0. GET /_uuids Requests one or more Universally Unique Identifiers (UUIDs) from the CouchDB instance. The response is a JSON object providing a list of UUIDs. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • count (number) Number of UUIDs to return. Default is 1. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Response hash Status Codes • 200 OK Request completed successfully • 400 Bad Request Requested more UUIDs than is allowed to re- trieve Request: GET /_uuids?count=10 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Length: 362 Content-Type: application/json Date: Sat, 10 Aug 2013 11:46:25 GMT ETag: "DGRWWQFLUDWN5MRKSLKQ425XV" Expires: Fri, 01 Jan 1990 00:00:00 GMT Pragma: no-cache Server: CouchDB (Erlang/OTP) { "uuids": [ "75480ca477454894678e22eec6002413", "75480ca477454894678e22eec600250b", "75480ca477454894678e22eec6002c41", "75480ca477454894678e22eec6003b90", "75480ca477454894678e22eec6003fca", "75480ca477454894678e22eec6004bef", "75480ca477454894678e22eec600528f", "75480ca477454894678e22eec6005e0b", "75480ca477454894678e22eec6006158", "75480ca477454894678e22eec6006161" ] } The UUID type is determined by the UUID algorithm setting in the CouchDB con- figuration. The UUID type may be changed at any time through the Configuration API. For example, the UUID type could be changed to random by sending this HTTP re- quest: PUT http://couchdb:5984/_node/nonode@nohost/_config/uuids/algorithm HTTP/1.1 Content-Type: application/json Accept: */* "random" You can verify the change by obtaining a list of UUIDs: { "uuids" : [ "031aad7b469956cf2826fcb2a9260492", "6ec875e15e6b385120938df18ee8e496", "cff9e881516483911aa2f0e98949092d", "b89d37509d39dd712546f9510d4a9271", "2e0dbf7f6c4ad716f21938a016e4e59f" ] } /favicon.ico GET /favicon.ico Binary content for the favicon.ico site icon. 
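For the /_up endpoint described above, a small health-check sketch of the kind a load balancer probe might use (the base URL and the two-second timeout are assumptions):

import requests

def node_is_up(base="http://localhost:5984"):
    # Return True only when /_up answers 200 with {"status": "ok"}.
    # A 404 means the node is in maintenance mode (or nolb) and should
    # be taken out of rotation.
    try:
        resp = requests.get(base + "/_up", timeout=2)
    except requests.RequestException:
        return False
    return resp.status_code == 200 and resp.json().get("status") == "ok"

print(node_is_up())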
Response Headers • Content-Type image/x-icon Status Codes • 200 OK Request completed successfully • 404 Not Found The requested content could not be found /_reshard Added in version 2.4. GET /_reshard Returns a count of completed, failed, running, stopped, and to- tal jobs along with the state of resharding on the cluster. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • state (string) stopped or running • state_reason (string) null or string describing additional information or reason associated with the state • completed (number) Count of completed resharding jobs • failed (number) Count of failed resharding jobs • running (number) Count of running resharding jobs • stopped (number) Count of stopped resharding jobs • total (number) Total count of resharding jobs Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_reshard HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json { "completed": 21, "failed": 0, "running": 3, "state": "running", "state_reason": null, "stopped": 0, "total": 24 } GET /_reshard/state Returns the resharding state and optional information about the state. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • state (string) stopped or running • state_reason (string) Additional information or reason associated with the state Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_reshard/state HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json { "reason": null, "state": "running" } PUT /_reshard/state Change the resharding state on the cluster. The states are stopped or running. This starts and stops global resharding on all the nodes of the cluster. If there are any running jobs, they will be stopped when the state changes to stopped. When the state changes back to running those job will continue running. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • state (string) stopped or running • state_reason (string) Optional string describing additional information or reason associated with the state Response JSON Object • ok (boolean) true Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request. Could be a bad or missing state name. • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: PUT /_reshard/state HTTP/1.1 Accept: application/json Host: localhost:5984 { "state": "stopped", "reason": "Rebalancing in progress" } Response: HTTP/1.1 200 OK Content-Type: application/json { "ok": true } GET /_reshard/jobs NOTE: The shape of the response and the total_rows and offset field in particular are meant to be consistent with the _sched- uler/jobs endpoint. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • jobs (list) Array of json objects, one for each resharding job. For the fields of each job see the /_reshard/job/{jobid} endpoint. • offset (number) Offset in the list of jobs object. Currently hard-coded at 0. • total_rows (number) Total number of resharding jobs on the cluster. 
Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_reshard/jobs HTTP/1.1 Accept: application/json Response: HTTP/1.1 200 OK Content-Type: application/json { "jobs": [ { "history": [ { "detail": null, "timestamp": "2019-03-28T15:28:02Z", "type": "new" }, { "detail": "initial_copy", "timestamp": "2019-03-28T15:28:02Z", "type": "running" } ], "id": "001-171d1211418996ff47bd610b1d1257fc4ca2628868def4a05e63e8f8fe50694a", "job_state": "completed", "node": "node1@127.0.0.1", "source": "shards/00000000-1fffffff/d1.1553786862", "split_state": "completed", "start_time": "2019-03-28T15:28:02Z", "state_info": {}, "target": [ "shards/00000000-0fffffff/d1.1553786862", "shards/10000000-1fffffff/d1.1553786862" ], "type": "split", "update_time": "2019-03-28T15:28:08Z" } ], "offset": 0, "total_rows": 24 } GET /_reshard/jobs/{jobid} Get information about the resharding job identified by jobid. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • id (string) Job ID. • type (string) Currently only split is implemented. • job_state (string) The running state of the job. Could be one of new, running, stopped, completed or failed. • split_state (string) State detail specific to shard split- ting. It indicates how far has shard splitting progressed, and can be one of new, initial_copy, topoff1, build_indices, topoff2, copy_local_docs, update_shardmap, wait_source_close, topoff3, source_delete or completed. • state_info (object) Optional additional info associated with the current state. • source (string) For split jobs this will be the source shard. • target (list) For split jobs this will be a list of two or more target shards. • history (list) List of json objects recording a jobs state transition history. Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_reshard/jobs/001-171d1211418996ff47bd610b1d1257fc4ca2628868def4a05e63e8f8fe50694a HTTP/1.1 Accept: application/json Response: HTTP/1.1 200 OK Content-Type: application/json { "id": "001-171d1211418996ff47bd610b1d1257fc4ca2628868def4a05e63e8f8fe50694a", "job_state": "completed", "node": "node1@127.0.0.1", "source": "shards/00000000-1fffffff/d1.1553786862", "split_state": "completed", "start_time": "2019-03-28T15:28:02Z", "state_info": {}, "target": [ "shards/00000000-0fffffff/d1.1553786862", "shards/10000000-1fffffff/d1.1553786862" ], "type": "split", "update_time": "2019-03-28T15:28:08Z", "history": [ { "detail": null, "timestamp": "2019-03-28T15:28:02Z", "type": "new" }, { "detail": "initial_copy", "timestamp": "2019-03-28T15:28:02Z", "type": "running" } ] } POST /_reshard/jobs Depending on what fields are specified in the request, one or more resharding jobs will be created. The response is a json ar- ray of results. Each result object represents a single reshard- ing job for a particular node and range. Some of the responses could be successful and some could fail. Successful results will have the "ok": true key and and value, and failed jobs will have the "error": "{error_message}" key and value. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • type (string) Type of job. Currently only "split" is ac- cepted. • db (string) Database to split. This is mutually exclusive with the "shard field. 
• node (string) Split shards on a particular node. This is an optional parameter. The value should be one of the nodes re- turned from the _membership endpoint. • range (string) Split shards copies in the given range. The range format is hhhhhhhh-hhhhhhhh where h is a hexadecimal digit. This format is used since this is how the ranges are represented in the file system. This is parameter is optional and is mutually exclusive with the "shard" field. • shard (string) Split a particular shard. The shard should be specified as "shards/{range}/{db}.{suffix}". Where range has the hhhhhhhh-hhhhhhhh format, db is the database name, and suffix is the shard (timestamp) creation suffix. • error (string) Error message if a job could be not be cre- ated. • node Cluster node where the job was created and is running. Response JSON Object • ok (boolean) true if job created successfully. Status Codes • 201 Created One or more jobs were successfully created • 400 Bad Request Invalid request. Parameter validation might have failed. • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Db, node, range or shard was not found Request: POST /_reshard/jobs HTTP/1.1 Accept: application/json Content-Type: application/json { "db": "db3", "range": "80000000-ffffffff", "type": "split" } Response: HTTP/1.1 201 Created Content-Type: application/json [ { "id": "001-30d7848a6feeb826d5e3ea5bb7773d672af226fd34fd84a8fb1ca736285df557", "node": "node1@127.0.0.1", "ok": true, "shard": "shards/80000000-ffffffff/db3.1554148353" }, { "id": "001-c2d734360b4cb3ff8b3feaccb2d787bf81ce2e773489eddd985ddd01d9de8e01", "node": "node2@127.0.0.1", "ok": true, "shard": "shards/80000000-ffffffff/db3.1554148353" } ] DELETE /_reshard/jobs/{jobid} If the job is running, stop the job and then remove it. Response JSON Object • ok (boolean) true if the job was removed successfully. Status Codes • 200 OK The job was removed successfully • 401 Unauthorized CouchDB Server Administrator privi- leges required • 404 Not Found The job was not found Request: DELETE /_reshard/jobs/001-171d1211418996ff47bd610b1d1257fc4ca2628868def4a05e63e8f8fe50694a HTTP/1.1 Response: HTTP/1.1 200 OK Content-Type: application/json { "ok": true } GET /_reshard/jobs/{jobid}/state Returns the running state of a resharding job identified by jo- bid. Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • state (string) One of new, running, stopped, completed or failed. • state_reason (string) Additional information associated with the state Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found The job was not found Request: GET /_reshard/jobs/001-b3da04f969bbd682faaab5a6c373705cbcca23f732c386bb1a608cfbcfe9faff/state HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json { "reason": null, "state": "running" } PUT /_reshard/jobs/{jobid}/state Change the state of a particular resharding job identified by jobid. The state can be changed from stopped to running or from running to stopped. If an individual job is stopped via this API it will stay stopped even after the global resharding state is toggled from stopped to running. If the job is already completed its state will stay completed. 
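Putting the resharding endpoints together, the following sketch creates split jobs for one database and then polls each job's state until it finishes (database name and credentials are assumptions; server administrator privileges are required):

import time
import requests

BASE = "http://localhost:5984"
AUTH = ("admin", "password")   # server admin credentials (assumption)

# Ask the cluster to split every shard copy of "db3";
# one job is created per node and range.
jobs = requests.post(BASE + "/_reshard/jobs", auth=AUTH,
                     json={"type": "split", "db": "db3"}).json()

for job in jobs:
    if not job.get("ok"):
        print("job not created:", job.get("error"))
        continue
    # Poll the per-job state endpoint until the job settles.
    while True:
        state = requests.get(BASE + "/_reshard/jobs/%s/state" % job["id"],
                             auth=AUTH).json()["state"]
        if state in ("completed", "failed", "stopped"):
            print(job["id"], state)
            break
        time.sleep(5)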
Request Headers • Accept .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • state (string) stopped or running • state_reason (string) Optional string describing additional information or reason associated with the state Response JSON Object • ok (boolean) true Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request. Could be a bad state name, for example. • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found The job was not found Request: PUT /_reshard/state/001-b3da04f969bbd682faaab5a6c373705cbcca23f732c386bb1a608cfbcfe9faff/state HTTP/1.1 Accept: application/json Host: localhost:5984 { "state": "stopped", "reason": "Rebalancing in progress" } Response: HTTP/1.1 200 OK Content-Type: application/json { "ok": true } Authentication Interfaces for obtaining session and authorization data. NOTE: We also strongly recommend you set up SSL to improve all authentica- tion methods security. Basic Authentication Changed in version 3.4: In order to aid transition to stronger password hashing without causing a performance penalty, CouchDB will send a Set-Cookie header when a request authenticates successfully with Basic authentication. All browsers and many http libraries will automatically send this cookie on subsequent requests. The cost of verifying the cookie is significantly less than PBKDF2 with a high iteration count, for example. Basic authentication (RFC 2617) is a quick and simple way to authenti- cate with CouchDB. The main drawback is the need to send user creden- tials with each request which may be insecure and could hurt operation performance (since CouchDB must compute the password hash with every request): Request: GET / HTTP/1.1 Accept: application/json Authorization: Basic cm9vdDpyZWxheA== Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 177 Content-Type: application/json Date: Mon, 03 Dec 2012 00:44:47 GMT Server: CouchDB (Erlang/OTP) { "couchdb":"Welcome", "uuid":"0a959b9b8227188afc2ac26ccdf345a6", "version":"1.3.0", "vendor": { "version":"1.3.0", "name":"The Apache Software Foundation" } } Cookie Authentication For cookie authentication (RFC 2109) CouchDB generates a token that the client can use for the next few requests to CouchDB. Tokens are valid until a timeout. When CouchDB sees a valid token in a subsequent re- quest, it will authenticate the user by this token without requesting the password again. By default, cookies are valid for 10 minutes, but its adjustable via timeout. Also its possible to make cookies persistent. To obtain the first token and thus authenticate a user for the first time, the username and password must be sent to the _session API. /_session POST /_session Initiates new session for specified user credentials by provid- ing Cookie value. Request Headers • Content-Type .INDENT 2.0 • application/x-www-form-urlencoded • application/json Query Parameters • next (string) Enforces redirect after successful login to the specified location. This location is relative from server root. Optional. 
Form Parameters • name User name • password Password Response Headers • Set-Cookie Authorization token Response JSON Object • ok (boolean) Operation status • name (string) Username • roles (array) List of user roles Status Codes • 200 OK Successfully authenticated • 302 Found Redirect after successful authentication • 401 Unauthorized Username or password wasnt recognized Request: POST /_session HTTP/1.1 Accept: application/json Content-Length: 24 Content-Type: application/x-www-form-urlencoded Host: localhost:5984 name=root&password=relax Its also possible to send data as JSON: POST /_session HTTP/1.1 Accept: application/json Content-Length: 37 Content-Type: application/json Host: localhost:5984 { "name": "root", "password": "relax" } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 43 Content-Type: application/json Date: Mon, 03 Dec 2012 01:23:14 GMT Server: CouchDB (Erlang/OTP) Set-Cookie: AuthSession=cm9vdDo1MEJCRkYwMjq0LO0ylOIwShrgt8y-UkhI-c6BGw; Version=1; Path=/; HttpOnly {"ok":true,"name":"root","roles":["_admin"]} If next query parameter was provided the response will trigger redirection to the specified location in case of successful au- thentication: Request: POST /_session?next=/blog/_design/sofa/_rewrite/recent-posts HTTP/1.1 Accept: application/json Content-Type: application/x-www-form-urlencoded Host: localhost:5984 name=root&password=relax Response: HTTP/1.1 302 Moved Temporarily Cache-Control: must-revalidate Content-Length: 43 Content-Type: application/json Date: Mon, 03 Dec 2012 01:32:46 GMT Location: http://localhost:5984/blog/_design/sofa/_rewrite/recent-posts Server: CouchDB (Erlang/OTP) Set-Cookie: AuthSession=cm9vdDo1MEJDMDEzRTp7Vu5GKCkTxTVxwXbpXsBARQWnhQ; Version=1; Path=/; HttpOnly {"ok":true,"name":null,"roles":["_admin"]} GET /_session Returns information about the authenticated user, including a User Context Object, the authentication method and database that were used, and a list of configured authentication handlers on the server. Query Parameters • basic (boolean) Accept Basic Auth by requesting this resource. Optional. Response JSON Object • ok (boolean) Operation status • userCtx (object) User context for the current user • info (object) Server authentication configuration Status Codes • 200 OK Successfully authenticated. • 401 Unauthorized Username or password wasnt recog- nized. Request: GET /_session HTTP/1.1 Host: localhost:5984 Accept: application/json Cookie: AuthSession=cm9vdDo1MEJDMDQxRDpqb-Ta9QfP9hpdPjHLxNTKg_Hf9w Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 175 Content-Type: application/json Date: Fri, 09 Aug 2013 20:27:45 GMT Server: CouchDB (Erlang/OTP) Set-Cookie: AuthSession=cm9vdDo1MjA1NTBDMTqmX2qKt1KDR--GUC80DQ6-Ew_XIw; Version=1; Path=/; HttpOnly { "info": { "authenticated": "cookie", "authentication_db": "_users", "authentication_handlers": [ "cookie", "default" ] }, "ok": true, "userCtx": { "name": "root", "roles": [ "_admin" ] } } DELETE /_session Closes users session by instructing the browser to clear the cookie. This does not invalidate the session from the servers perspective, as there is no way to do this because CouchDB cook- ies are stateless. This means calling this endpoint is purely optional from a client perspective, and it does not protect against theft of a session cookie. Status Codes • 200 OK Successfully close session. 
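Since the AuthSession cookie remains valid until the timeout, a client only needs to POST to /_session once per timeout window and can reuse the cookie afterwards. A sketch with a persistent client session (the credentials are the placeholder values from the examples above):

import requests

BASE = "http://localhost:5984"
session = requests.Session()

# Obtain the AuthSession cookie once; the session object stores it and
# sends it automatically on subsequent requests.
resp = session.post(BASE + "/_session",
                    json={"name": "root", "password": "relax"})
resp.raise_for_status()
print("roles:", resp.json()["roles"])

# Subsequent requests are authenticated by the cookie, not the password.
print(session.get(BASE + "/_session").json()["userCtx"])

# Optionally ask the client to drop the cookie when done.
session.delete(BASE + "/_session")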
Request: DELETE /_session HTTP/1.1 Accept: application/json Cookie: AuthSession=cm9vdDo1MjA1NEVGMDo1QXNQkqC_0Qmgrk8Fw61_AzDeXw Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Fri, 09 Aug 2013 20:30:12 GMT Server: CouchDB (Erlang/OTP) Set-Cookie: AuthSession=; Version=1; Path=/; HttpOnly { "ok": true } Proxy Authentication NOTE: To use this authentication method make sure that the {chttpd_auth, proxy_authentication_handler} value is added to the list of the ac- tive chttpd/authentication_handlers: [chttpd] authentication_handlers = {chttpd_auth, cookie_authentication_handler}, {chttpd_auth, proxy_authentication_handler}, {chttpd_auth, default_authentication_handler} Proxy authentication is very useful in case your application already uses some external authentication service and you dont want to dupli- cate users and their roles in CouchDB. This authentication method allows creation of a User Context Object for remotely authenticated user. By default, the client just needs to pass specific headers to CouchDB with related requests: • X-Auth-CouchDB-UserName: username • X-Auth-CouchDB-Roles: comma-separated (,) list of user roles • X-Auth-CouchDB-Token: authentication token. When proxy_use_secret is set (which is strongly recommended!), this header provides an HMAC of the username to authenticate and the secret token to prevent requests from untrusted sources. (Use one of the configured hash algorithms in chttpd_auth/hash_algorithms and sign the username with the secret) Creating the token (example with openssl): echo -n "foo" | openssl dgst -sha256 -hmac "the_secret" # (stdin)= 3f0786e96b20b0102b77f1a49c041be6977cfb3bf78c41a12adc121cd9b4e68a Request: GET /_session HTTP/1.1 Host: localhost:5984 Accept: application/json Content-Type: application/json; charset=utf-8 X-Auth-CouchDB-Roles: users,blogger X-Auth-CouchDB-UserName: foo X-Auth-CouchDB-Token: 3f0786e96b20b0102b77f1a49c041be6977cfb3bf78c41a12adc121cd9b4e68a Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 190 Content-Type: application/json Date: Fri, 14 Jun 2013 10:16:03 GMT Server: CouchDB (Erlang/OTP) { "info": { "authenticated": "proxy", "authentication_db": "_users", "authentication_handlers": [ "cookie", "proxy", "default" ] }, "ok": true, "userCtx": { "name": "foo", "roles": [ "users", "blogger" ] } } Note that you dont need to request a session to be authenticated by this method if all required HTTP headers are provided. JWT Authentication NOTE: To use this authentication method, make sure that the {chttpd_auth, jwt_authentication_handler} value is added to the list of the active chttpd/authentication_handlers: [chttpd] authentication_handlers = {chttpd_auth, cookie_authentication_handler}, {chttpd_auth, jwt_authentication_handler}, {chttpd_auth, default_authentication_handler} JWT authentication enables CouchDB to use externally-generated JWT to- kens instead of defining users or roles in the _users database. The JWT authentication handler requires that all JWT tokens are signed by a key that CouchDB has been configured to trust (there is no support for JWTs NONE algorithm). Additionally, CouchDB can be configured to reject JWT tokens that are missing a configurable set of claims (e.g, a CouchDB administrator could insist on the exp claim). Only claims listed in required checks are validated. Additional claims will be ignored. 
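The proxy authentication token shown above with openssl can equally be computed in other languages. A Python sketch of the same HMAC calculation and the headers a trusted proxy would set (the secret, username and roles are the placeholder values from the example; sha256 must be one of the algorithms configured in chttpd_auth/hash_algorithms):

import hashlib
import hmac
import requests

secret = "the_secret"   # [chttpd_auth] secret (assumption)
username = "foo"

# HMAC-SHA256 of the username, hex encoded - same as the openssl example.
token = hmac.new(secret.encode(), username.encode(),
                 hashlib.sha256).hexdigest()

resp = requests.get("http://localhost:5984/_session", headers={
    "X-Auth-CouchDB-UserName": username,
    "X-Auth-CouchDB-Roles": "users,blogger",
    "X-Auth-CouchDB-Token": token,
})
print(resp.json()["userCtx"])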
Two sections of config exist to configure JWT authentication; The required_claims config setting is a comma-separated list of addi- tional mandatory JWT claims that must be present in any presented JWT token. A 400 Bad Request is sent if any are missing. The alg claim is mandatory as it used to lookup the correct key for verifying the signature. The sub claim is mandatory and is used as the CouchDB users name if the JWT token is valid. You can set the user roles claim name through the config setting roles_claim_name. If you dont set an explicit value, then _couchdb.roles will be set as the default claim name. If presented, it is used as the CouchDB users roles list as long as the JWT token is valid. NOTE: Before CouchDB v3.3.2 it was only possible to define roles as a JSON array of strings. Now you can also use a comma-seperated list to de- fine the user roles in your JWT token. The following declarations are equal: JSON array of strings: { "_couchdb.roles": ["accounting-role", "view-role"] } JSON comma-seperated strings: { "_couchdb.roles": "accounting-role, view-role" } WARNING: roles_claim_name is deprecated in CouchDB 3.3, and will be removed later. Please use roles_claim_path. ; [jwt_keys] ; Configure at least one key here if using the JWT auth handler. ; If your JWT tokens do not include a "kid" attribute, use "_default" ; as the config key, otherwise use the kid as the config key. ; Examples ; hmac:_default = aGVsbG8= ; hmac:foo = aGVsbG8= ; The config values can represent symmetric and asymmetric keys. ; For symmetric keys, the value is base64 encoded; ; hmac:_default = aGVsbG8= # base64-encoded form of "hello" ; For asymmetric keys, the value is the PEM encoding of the public ; key with newlines replaced with the escape sequence \n. ; rsa:foo = -----BEGIN PUBLIC KEY-----\nMIIBIjAN...IDAQAB\n-----END PUBLIC KEY-----\n ; ec:bar = -----BEGIN PUBLIC KEY-----\nMHYwEAYHK...AzztRs\n-----END PUBLIC KEY-----\n The jwt_keys section lists all the keys that this CouchDB server trusts. You should ensure that all nodes of your cluster have the same list. Since version 3.3 its possible to use = in parameter names, but only when the parameter and value are separated = , i.e. the equal sign is surrounded by at least one space on each side. This might be useful in the [jwt_keys] section where base64 encoded keys may contain the = character. JWT tokens that do not include a kid claim will be validated against the {alg}:_default key. It is mandatory to specify the algorithm associated with every key for security reasons (notably presenting a HMAC-signed token using an RSA or EC public key that the server trusts: - https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-li- braries/). Request: GET /_session HTTP/1.1 Host: localhost:5984 Accept: application/json Content-Type: application/json; charset=utf-8 Authorization: Bearer <JWT token> Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 188 Content-Type: application/json Date: Sun, 19 Apr 2020 08:29:15 GMT Server: CouchDB (Erlang/OTP) { "info": { "authenticated": "jwt", "authentication_db": "_users", "authentication_handlers": [ "cookie", "proxy", "default" ] }, "ok": true, "userCtx": { "name": "foo", "roles": [ "users", "blogger" ] } } Note that you dont need to request session to be authenticated by this method if the required HTTP header is provided. Configuration The CouchDB Server Configuration API provide an interface to query and update the various configuration values within a running CouchDB in- stance. 
Accessing the local nodes configuration The literal string _local serves as an alias for the local node name, so for all configuration URLs, {node-name} may be replaced with _local, to interact with the local nodes configuration. /_node/{node-name}/_config GET /_node/{node-name}/_config Returns the entire CouchDB server configuration as a JSON struc- ture. The structure is organized by different configuration sec- tions, with individual values. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request GET /_node/nonode@nohost/_config HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 4148 Content-Type: application/json Date: Sat, 10 Aug 2013 12:01:42 GMT Server: CouchDB (Erlang/OTP) { "attachments": { "compressible_types": "text/*, application/javascript, application/json, application/xml", "compression_level": "8" }, "couchdb": { "users_db_suffix": "_users", "database_dir": "/var/lib/couchdb", "max_attachment_chunk_size": "4294967296", "max_dbs_open": "100", "os_process_timeout": "5000", "uri_file": "/var/lib/couchdb/couch.uri", "util_driver_dir": "/usr/lib64/couchdb/erlang/lib/couch-1.5.0/priv/lib", "view_index_dir": "/var/lib/couchdb" }, "chttpd": { "allow_jsonp": "false", "backlog": "512", "bind_address": "0.0.0.0", "port": "5984", "require_valid_user": "false", "socket_options": "[{sndbuf, 262144}, {nodelay, true}]", "server_options": "[{recbuf, undefined}]", "secure_rewrites": "true" }, "httpd": { "authentication_handlers": "{couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}", "bind_address": "192.168.0.2", "max_connections": "2048", "port": "5984", }, "log": { "writer": "file", "file": "/var/log/couchdb/couch.log", "include_sasl": "true", "level": "info" }, "query_server_config": { "reduce_limit": "true" }, "replicator": { "max_http_pipeline_size": "10", "max_http_sessions": "10" }, "stats": { "interval": "10" }, "uuids": { "algorithm": "utc_random" } } /_node/{node-name}/_config/{section} GET /_node/{node-name}/_config/{section} Gets the configuration structure for a single section. Parameters • section Configuration section name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_node/nonode@nohost/_config/httpd HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 444 Content-Type: application/json Date: Sat, 10 Aug 2013 12:10:40 GMT Server: CouchDB (Erlang/OTP) { "authentication_handlers": "{couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}", "bind_address": "127.0.0.1", "default_handler": "{couch_httpd_db, handle_request}", "port": "5984" } /_node/{node-name}/_config/{section}/{key} GET /_node/{node-name}/_config/{section}/{key} Gets a single configuration value from within a specific config- uration section. 
Parameters • section Configuration section name • key Configuration option name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: GET /_node/nonode@nohost/_config/log/level HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 8 Content-Type: application/json Date: Sat, 10 Aug 2013 12:12:59 GMT Server: CouchDB (Erlang/OTP) "debug" NOTE: The returned value will be the JSON of the value, which may be a string or numeric value, or an array or object. Some client environments may not parse simple strings or numeric values as valid JSON. PUT /_node/{node-name}/_config/{section}/{key} Updates a configuration value. The new value should be supplied in the request body in the corresponding JSON format. If you are setting a string value, you must supply a valid JSON string. In response CouchDB sends old value for target section key. Parameters • section Configuration section name • key Configuration option name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid JSON request body • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 500 Internal Server Error Error setting configuration Request: PUT /_node/nonode@nohost/_config/log/level HTTP/1.1 Accept: application/json Content-Length: 7 Content-Type: application/json Host: localhost:5984 "info" Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 8 Content-Type: application/json Date: Sat, 10 Aug 2013 12:12:59 GMT Server: CouchDB (Erlang/OTP) "debug" DELETE /_node/{node-name}/_config/{section}/{key} Deletes a configuration value. The returned JSON will be the value of the configuration parameter before it was deleted. Parameters • section Configuration section name • key Configuration option name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Specified configuration option not found Request: DELETE /_node/nonode@nohost/_config/log/level HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 7 Content-Type: application/json Date: Sat, 10 Aug 2013 12:29:03 GMT Server: CouchDB (Erlang/OTP) "info" /_node/{node-name}/_config/_reload Added in version 3.0. POST /_node/{node-name}/_config/_reload Reloads the configuration from disk. This has a side effect of flushing any in-memory configuration changes that have not been committed to disk. Request: POST /_node/nonode@nohost/_config/_reload HTTP/1.1 Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Tues, 21 Jan 2020 11:09:35 Server: CouchDB/3.0.0 (Erlang OTP) {"ok":true} Databases The Database endpoint provides an interface to an entire database with in CouchDB. These are database-level, rather than document-level re- quests. 
For all these requests, the database name within the URL path should be the database name that you wish to perform the operation on. For exam- ple, to obtain the meta information for the database recipes, you would use the HTTP request: GET /recipes For clarity, the form below is used in the URL paths: GET /{db} Where {db} is the name of any database. /{db} HEAD /{db} Returns the HTTP Headers containing a minimal amount of informa- tion about the specified database. Since the response body is empty, using the HEAD method is a lightweight way to check if the database exists already or not. Parameters • db Database name Status Codes • 200 OK Database exists • 404 Not Found Requested database not found Request: HEAD /test HTTP/1.1 Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Mon, 12 Aug 2013 01:27:41 GMT Server: CouchDB (Erlang/OTP) GET /{db} Gets information about the specified database. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • cluster.n (number) Replicas. The number of copies of every document. • cluster.q (number) Shards. The number of range partitions. • cluster.r (number) Read quorum. The number of consistent copies of a document that need to be read before a successful reply. • cluster.w (number) Write quorum. The number of copies of a document that need to be written before a successful reply. • compact_running (boolean) Set to true if the database com- paction routine is operating on this database. • db_name (string) The name of the database. • disk_format_version (number) The version of the physical for- mat used for the data when it is stored on disk. • doc_count (number) A count of the documents in the specified database. • doc_del_count (number) Number of deleted documents • instance_start_time (string) Always "0". (Returned for legacy reasons.) • purge_seq (string) An opaque string that describes the purge state of the database. Do not rely on this string for counting the number of purge operations. • sizes.active (number) The size of live data inside the data- base, in bytes. • sizes.external (number) The uncompressed size of database contents in bytes. • sizes.file (number) The size of the database file on disk in bytes. Views indexes are not included in the calculation. • update_seq (string) An opaque string that describes the state of the database. Do not rely on this string for counting the number of updates. • props.partitioned (boolean) (optional) If present and true, this indicates that the database is partitioned. Status Codes • 200 OK Request completed successfully • 404 Not Found Requested database not found Request: GET /receipts HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 258 Content-Type: application/json Date: Mon, 12 Aug 2013 01:38:57 GMT Server: CouchDB (Erlang/OTP) { "cluster": { "n": 3, "q": 8, "r": 2, "w": 2 }, "compact_running": false, "db_name": "receipts", "disk_format_version": 6, "doc_count": 6146, "doc_del_count": 64637, "instance_start_time": "0", "props": {}, "purge_seq": 0, "sizes": { "active": 65031503, "external": 66982448, "file": 137433211 }, "update_seq": "292786-g1AAAAF..." } PUT /{db} Creates a new database. 
The database name {db} must be composed by following next rules: • Name must begin with a lowercase letter (a-z) • Lowercase characters (a-z) • Digits (0-9) • Any of the characters _, $, (, ), +, -, and /. If youre familiar with Regular Expressions, the rules above could be written as ^[a-z][a-z0-9_$()+/-]*$. Parameters • db Database name Query Parameters • q (integer) Shards, aka the number of range parti- tions. Default is 2, unless overridden in the cluster config. • n (integer) Replicas. The number of copies of the database in the cluster. The default is 3, unless over- ridden in the cluster config . • partitioned (boolean) Whether to create a partitioned database. Default is false. Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • Location Database URI location Response JSON Object • ok (boolean) Operation status. Available in case of success • error (string) Error type. Available if response code is 4xx • reason (string) Error description. Available if response code is 4xx Status Codes • 201 Created Database created successfully (quorum is met) • 202 Accepted Accepted (at least by one node) • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 412 Precondition Failed Database already exists Request: PUT /db HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 08:01:45 GMT Location: http://localhost:5984/db Server: CouchDB (Erlang/OTP) { "ok": true } If we repeat the same request to CouchDB, it will response with 412 since the database already exists: Request: PUT /db HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 412 Precondition Failed Cache-Control: must-revalidate Content-Length: 95 Content-Type: application/json Date: Mon, 12 Aug 2013 08:01:16 GMT Server: CouchDB (Erlang/OTP) { "error": "file_exists", "reason": "The database could not be created, the file already exists." } If an invalid database name is supplied, CouchDB returns re- sponse with 400: Request: PUT /_db HTTP/1.1 Accept: application/json Host: localhost:5984 Request: HTTP/1.1 400 Bad Request Cache-Control: must-revalidate Content-Length: 194 Content-Type: application/json Date: Mon, 12 Aug 2013 08:02:10 GMT Server: CouchDB (Erlang/OTP) { "error": "illegal_database_name", "reason": "Name: '_db'. Only lowercase characters (a-z), digits (0-9), and any of the characters _, $, (, ), +, -, and / are allowed. Must begin with a letter." } DELETE /{db} Deletes the specified database, and all the documents and at- tachments contained within it. NOTE: To avoid deleting a database, CouchDB will respond with the HTTP status code 400 when the request URL includes a ?rev= parameter. This suggests that one wants to delete a document but forgot to add the document id to the URL. 
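A sketch of creating a database with explicit shard settings and handling the "already exists" case shown above (the database name, q/n values and credentials are assumptions; a 202 simply means the write quorum was not yet met when the response was sent):

import requests

BASE = "http://localhost:5984"
AUTH = ("admin", "password")   # admin credentials (assumption)

resp = requests.put(BASE + "/recipes", auth=AUTH,
                    params={"q": 8, "n": 3})

if resp.status_code in (201, 202):
    print("created:", resp.json()["ok"])
elif resp.status_code == 412:
    print("already exists:", resp.json()["reason"])
else:
    # 400 for an illegal name, 401 without admin privileges, etc.
    resp.raise_for_status()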
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 200 OK Database removed successfully (quorum is met and data- base is deleted by at least one node) • 202 Accepted Accepted (deleted by at least one of the nodes, quorum is not met yet) • 400 Bad Request Invalid database name or forgotten document id by accident • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Database doesnt exist or invalid database name Request: DELETE /db HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 08:54:00 GMT Server: CouchDB (Erlang/OTP) { "ok": true } POST /{db} Creates a new document in the specified database, using the sup- plied JSON document structure. If the JSON structure includes the _id field, then the document will be created with the specified document ID. If the _id field is not specified, a new unique ID will be gen- erated, following whatever UUID algorithm is configured for that server. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Query Parameters • batch (string) Stores document in batch mode Possible values: ok. Optional Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • Location Documents URI Response JSON Object • id (string) Document ID • ok (boolean) Operation status • rev (string) Revision info Status Codes • 201 Created Document created and stored on disk • 202 Accepted Document data accepted, but not yet stored on disk • 400 Bad Request Invalid database name • 401 Unauthorized Write privileges required • 404 Not Found Database doesnt exist • 409 Conflict A Conflicting Document with same ID already ex- ists Request: POST /db HTTP/1.1 Accept: application/json Content-Length: 81 Content-Type: application/json { "servings": 4, "subtitle": "Delicious with fresh bread", "title": "Fish Stew" } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 95 Content-Type: application/json Date: Tue, 13 Aug 2013 15:19:25 GMT Location: http://localhost:5984/db/ab39fe0993049b84cfa81acd6ebad09d Server: CouchDB (Erlang/OTP) { "id": "ab39fe0993049b84cfa81acd6ebad09d", "ok": true, "rev": "1-9c65296036141e575d32ba9c034dd3ee" } Specifying the Document ID The document ID can be specified by including the _id field in the JSON of the submitted record. The following request will create the same document with the ID FishStew. Request: POST /db HTTP/1.1 Accept: application/json Content-Length: 98 Content-Type: application/json { "_id": "FishStew", "servings": 4, "subtitle": "Delicious with fresh bread", "title": "Fish Stew" } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 71 Content-Type: application/json Date: Tue, 13 Aug 2013 15:19:25 GMT ETag: "1-9c65296036141e575d32ba9c034dd3ee" Location: http://localhost:5984/db/FishStew Server: CouchDB (Erlang/OTP) { "id": "FishStew", "ok": true, "rev": "1-9c65296036141e575d32ba9c034dd3ee" } Batch Mode Writes You can write documents to the database at a higher rate by using the batch option. This collects document writes together in memory (on a per-user basis) before they are committed to disk. 
This increases the risk of the documents not being stored in the event of a failure, since the documents are not written to disk immediately. Batch mode is not suitable for critical data, but may be ideal for ap- plications such as log data, when the risk of some data loss due to a crash is acceptable. To use batch mode, append the batch=ok query argument to the URL of a POST /{db}, PUT /{db}/{docid}, or DELETE /{db}/{docid} request. The CouchDB server will respond with an HTTP 202 Accepted response code im- mediately. NOTE: Creating or updating documents with batch mode doesnt guarantee that all documents will be successfully stored on disk. For example, in- dividual documents may not be saved due to conflicts, rejection by validation function or by other reasons, even if overall the batch was successfully submitted. Request: POST /db?batch=ok HTTP/1.1 Accept: application/json Content-Length: 98 Content-Type: application/json { "_id": "FishStew", "servings": 4, "subtitle": "Delicious with fresh bread", "title": "Fish Stew" } Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 28 Content-Type: application/json Date: Tue, 13 Aug 2013 15:19:25 GMT Location: http://localhost:5984/db/FishStew Server: CouchDB (Erlang/OTP) { "id": "FishStew", "ok": true } /{db}/_all_docs GET /{db}/_all_docs Executes the built-in _all_docs view, returning all of the docu- ments in the database. With the exception of the URL parameters (described below), this endpoint works identically to any other view. Refer to the view endpoint documentation for a complete description of the available query parameters and the format of the returned data. Parameters • db Database name Request Headers • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json Status Codes • 200 OK Request completed successfully • 404 Not Found Requested database not found Request: GET /db/_all_docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 10 Aug 2013 16:22:56 GMT ETag: "1W2DJUZFZSZD9K78UFA3GZWB4" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "16e458537602f5ef2a710089dffd9453", "key": "16e458537602f5ef2a710089dffd9453", "value": { "rev": "1-967a00dff5e02add41819138abb3284d" } }, { "id": "a4c51cdfa2069f3e905c431114001aff", "key": "a4c51cdfa2069f3e905c431114001aff", "value": { "rev": "1-967a00dff5e02add41819138abb3284d" } }, { "id": "a4c51cdfa2069f3e905c4311140034aa", "key": "a4c51cdfa2069f3e905c4311140034aa", "value": { "rev": "5-6182c9c954200ab5e3c6bd5e76a1549f" } }, { "id": "a4c51cdfa2069f3e905c431114003597", "key": "a4c51cdfa2069f3e905c431114003597", "value": { "rev": "2-7051cbe5c8faecd085a3fa619e6e6337" } }, { "id": "f4ca7773ddea715afebc4b4b15d4f0b3", "key": "f4ca7773ddea715afebc4b4b15d4f0b3", "value": { "rev": "2-7051cbe5c8faecd085a3fa619e6e6337" } } ], "total_rows": 5 } POST /{db}/_all_docs POST _all_docs functionality supports identical parameters and behavior as specified in the GET /{db}/_all_docs API but allows for the query string parameters to be supplied as keys in a JSON object in the body of the POST request. 
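As an informal complement to the raw HTTP exchange shown below, the same keyed lookup might be issued from a script roughly as follows. This is a sketch only; the local server address, the credentials and the use of the Python requests library are assumptions, not part of the API.

   import requests

   resp = requests.post(
       "http://localhost:5984/db/_all_docs",
       json={"keys": ["Zingylemontart", "Yogurtraita"]},
       auth=("admin", "password"),           # placeholder credentials
   )
   resp.raise_for_status()
   for row in resp.json()["rows"]:
       # Rows for keys that do not exist carry an "error" field instead
       # of a "value".
       if "error" in row:
           print(row["key"], "->", row["error"])
       else:
           print(row["id"], "->", row["value"]["rev"])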
Request: POST /db/_all_docs HTTP/1.1 Accept: application/json Content-Length: 70 Content-Type: application/json Host: localhost:5984 { "keys" : [ "Zingylemontart", "Yogurtraita" ] } Response: { "total_rows" : 2666, "rows" : [ { "value" : { "rev" : "1-a3544d296de19e6f5b932ea77d886942" }, "id" : "Zingylemontart", "key" : "Zingylemontart" }, { "value" : { "rev" : "1-91635098bfe7d40197a1b98d7ee085fc" }, "id" : "Yogurtraita", "key" : "Yogurtraita" } ], "offset" : 0 } /{db}/_design_docs Added in version 2.2. GET /{db}/_design_docs Returns a JSON structure of all of the design documents in a given database. The information is returned as a JSON structure containing meta information about the return structure, includ- ing a list of all design documents and basic contents, consist- ing the ID, revision and key. The key is the design documents _id. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • conflicts (boolean) Includes conflicts information in re- sponse. Ignored if include_docs isnt true. Default is false. • descending (boolean) Return the design documents in descend- ing by key order. Default is false. • endkey (string) Stop returning records when the specified key is reached. Optional. • end_key (string) Alias for endkey param. • endkey_docid (string) Stop returning records when the speci- fied design document ID is reached. Optional. • end_key_doc_id (string) Alias for endkey_docid param. • include_docs (boolean) Include the full content of the design documents in the return. Default is false. • inclusive_end (boolean) Specifies whether the specified end key should be included in the result. Default is true. • key (string) Return only design documents that match the specified key. Optional. • keys (string) Return only design documents that match the specified keys. Optional. • limit (number) Limit the number of the returned design docu- ments to the specified number. Optional. • skip (number) Skip this number of records before starting to return the results. Default is 0. • startkey (string) Return records starting with the specified key. Optional. • start_key (string) Alias for startkey param. • startkey_docid (string) Return records starting with the specified design document ID. Optional. • start_key_doc_id (string) Alias for startkey_docid param. • update_seq (boolean) Response includes an update_seq value indicating which sequence id of the underlying database the view reflects. Default is false. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Response signature Response JSON Object • offset (number) Offset where the design document list started • rows (array) Array of view row objects. By default the infor- mation returned contains only the design document ID and revi- sion. • total_rows (number) Number of design documents in the data- base. Note that this is not the number of rows returned in the actual query. 
• update_seq (number) Current update sequence for the database Status Codes • 200 OK Request completed successfully • 404 Not Found Requested database not found Request: GET /db/_design_docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 23 Dec 2017 16:22:56 GMT ETag: "1W2DJUZFZSZD9K78UFA3GZWB4" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "_design/ddoc01", "key": "_design/ddoc01", "value": { "rev": "1-7407569d54af5bc94c266e70cbf8a180" } }, { "id": "_design/ddoc02", "key": "_design/ddoc02", "value": { "rev": "1-d942f0ce01647aa0f46518b213b5628e" } }, { "id": "_design/ddoc03", "key": "_design/ddoc03", "value": { "rev": "1-721fead6e6c8d811a225d5a62d08dfd0" } }, { "id": "_design/ddoc04", "key": "_design/ddoc04", "value": { "rev": "1-32c76b46ca61351c75a84fbcbceece2f" } }, { "id": "_design/ddoc05", "key": "_design/ddoc05", "value": { "rev": "1-af856babf9cf746b48ae999645f9541e" } } ], "total_rows": 5 } POST /{db}/_design_docs POST _design_docs functionality supports identical parameters and behavior as specified in the GET /{db}/_design_docs API but allows for the query string parameters to be supplied as keys in a JSON object in the body of the POST request. Request: POST /db/_design_docs HTTP/1.1 Accept: application/json Content-Length: 70 Content-Type: application/json Host: localhost:5984 { "keys" : [ "_design/ddoc02", "_design/ddoc05" ] } The returned JSON is the all documents structure, but with only the selected keys in the output: { "total_rows" : 5, "rows" : [ { "value" : { "rev" : "1-d942f0ce01647aa0f46518b213b5628e" }, "id" : "_design/ddoc02", "key" : "_design/ddoc02" }, { "value" : { "rev" : "1-af856babf9cf746b48ae999645f9541e" }, "id" : "_design/ddoc05", "key" : "_design/ddoc05" } ], "offset" : 0 } Sending multiple queries to a database /{db}/_all_docs/queries Added in version 2.2. POST /{db}/_all_docs/queries Executes multiple specified built-in view queries of all docu- ments in this database. This enables you to request multiple queries in a single request, in place of multiple POST /{db}/_all_docs requests. Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json • Accept .INDENT 2.0 • application/json Request JSON Object • queries An array of query objects with fields for the parame- ters of each individual view query to be executed. The field names and their meaning are the same as the query parameters of a regular _all_docs request. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Response signature • Transfer-Encoding chunked Response JSON Object • results (array) An array of result objects - one for each query. Each result object contains the same fields as the re- sponse to a regular _all_docs request. 
Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database is missing • 500 Internal Server Error Query execution error Request: POST /db/_all_docs/queries HTTP/1.1 Content-Type: application/json Accept: application/json Host: localhost:5984 { "queries": [ { "keys": [ "meatballs", "spaghetti" ] }, { "limit": 3, "skip": 2 } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 20 Dec 2017 11:17:07 GMT ETag: "1H8RGBCK3ABY6ACDM7ZSC30QK" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "results" : [ { "rows": [ { "id": "meatballs", "key": "meatballs", "value": 1 }, { "id": "spaghetti", "key": "spaghetti", "value": 1 } ], "total_rows": 3 }, { "offset" : 2, "rows" : [ { "id" : "Adukiandorangecasserole-microwave", "key" : "Aduki and orange casserole - microwave", "value" : [ null, "Aduki and orange casserole - microwave" ] }, { "id" : "Aioli-garlicmayonnaise", "key" : "Aioli - garlic mayonnaise", "value" : [ null, "Aioli - garlic mayonnaise" ] }, { "id" : "Alabamapeanutchicken", "key" : "Alabama peanut chicken", "value" : [ null, "Alabama peanut chicken" ] } ], "total_rows" : 2667 } ] } NOTE: The multiple queries are also supported in /{db}/_local_docs/queries and /{db}/_design_docs/queries (similar to /{db}/_all_docs/queries). /{db}/_design_docs/queries POST /{db}/_design_docs/queries Querying with specified keys will return design documents only. You can also combine keys with other query parameters, such as limit and skip. Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json • Accept .INDENT 2.0 • application/json Request JSON Object • queries An array of query objects with fields for the parame- ters of each individual view query to be executed. The field names and their meaning are the same as the query parameters of a regular _design_docs request. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • Transfer-Encoding chunked Response JSON Object • results (array) An array of result objects - one for each query. Each result object contains the same fields as the re- sponse to a regular _design_docs request. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database is missing • 500 Internal Server Error Query execution error Request: POST /db/_design_docs/queries HTTP/1.1 Content-Type: application/json Accept: application/json Host: localhost:5984 { "queries": [ { "keys": [ "_design/recipe", "_design/not-exist", "spaghetti" ] } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Thu, 20 Jul 2023 20:06:44 GMT Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "results": [ { "total_rows": 1, "offset": null, "rows": [ { "id": "_design/recipe", "key": "_design/recipe", "value": { "rev": "1-ad0e29fe6b473658514742a7c2317766" } }, { "key": "_design/not-exist", "error": "not_found" } ] } ] } NOTE: /{db}/_design_docs/queries with keys will only return design docu- ments, or "error": "not_found" if the design document doesnt exist. If key is not a design document id, it will not be included in the response. /{db}/_bulk_get POST /{db}/_bulk_get This method can be called to query several documents in bulk. 
It is well suited for fetching a specific revision of documents, as replicators do for example, or for getting revision history. Refer to the document endpoint documentation for a complete de- scription of the available query parameters. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • multipart/related • multipart/mixed • Content-Type application/json Request JSON Object • docs (array) List of document objects, with id, and option- ally rev and atts_since Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • results (object) an array of results for each requested docu- ment/rev pair. id key lists the requested document ID, docs contains a single-item array of objects, each of which has ei- ther an error key and value describing the error, or ok key and associated value of the requested document, with the addi- tional _revisions property that lists the parent revisions if revs=true. Status Codes • 200 OK Request completed successfully • 400 Bad Request The request provided invalid JSON data or in- valid query parameter • 401 Unauthorized Read permission required • 404 Not Found Invalid database name • 415 Unsupported Media Type Bad Content-Type value Request: POST /db/_bulk_get HTTP/1.1 Accept: application/json Content-Type:application/json Host: localhost:5984 { "docs": [ { "id": "foo" "rev": "4-753875d51501a6b1883a9d62b4d33f91", }, { "id": "foo" "rev": "1-4a7e4ae49c4366eaed8edeaea8f784ad", }, { "id": "bar" } { "id": "baz" } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Mon, 19 Mar 2018 15:27:34 GMT Server: CouchDB (Erlang/OTP) { "results": [ { "id": "foo", "docs": [ { "ok": { "_id": "foo", "_rev": "4-753875d51501a6b1883a9d62b4d33f91", "value": "this is foo", "_revisions": { "start": 4, "ids": [ "753875d51501a6b1883a9d62b4d33f91", "efc54218773c6acd910e2e97fea2a608", "2ee767305024673cfb3f5af037cd2729", "4a7e4ae49c4366eaed8edeaea8f784ad" ] } } } ] }, { "id": "foo", "docs": [ { "ok": { "_id": "foo", "_rev": "1-4a7e4ae49c4366eaed8edeaea8f784ad", "value": "this is the first revision of foo", "_revisions": { "start": 1, "ids": [ "4a7e4ae49c4366eaed8edeaea8f784ad" ] } } } ] }, { "id": "bar", "docs": [ { "ok": { "_id": "bar", "_rev": "2-9b71d36dfdd9b4815388eb91cc8fb61d", "baz": true, "_revisions": { "start": 2, "ids": [ "9b71d36dfdd9b4815388eb91cc8fb61d", "309651b95df56d52658650fb64257b97" ] } } } ] }, { "id": "baz", "docs": [ { "error": { "id": "baz", "rev": "undefined", "error": "not_found", "reason": "missing" } } ] } ] } Example response with a conflicted document: Request: POST /db/_bulk_get HTTP/1.1 Accept: application/json Content-Type:application/json Host: localhost:5984 { "docs": [ { "id": "a" } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Mon, 19 Mar 2018 15:27:34 GMT Server: CouchDB (Erlang/OTP) { "results": [ { "id": "a", "docs": [ { "ok": { "_id": "a", "_rev": "1-23202479633c2b380f79507a776743d5", "a": 1 } }, { "ok": { "_id": "a", "_rev": "1-967a00dff5e02add41819138abb3284d" } } ] } ] } /{db}/_bulk_docs POST /{db}/_bulk_docs The bulk document API allows you to create and update multiple documents at the same time within a single request. The basic operation is similar to creating or updating a single document, except that you batch the document structure and information. When creating new documents the document ID (_id) is optional. 
For updating existing documents, you must provide the document ID, revision information (_rev), and new document values. In case of batch deleting documents all fields as document ID, revision information and deletion status (_deleted) are re- quired. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • docs (array) List of documents objects • new_edits (boolean) If false, prevents the database from as- signing them new revision IDs. Default is true. Optional Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Array of Objects • id (string) Document ID • rev (string) New document revision token. Available if docu- ment has saved without errors. Optional • error (string) Error type. Optional • reason (string) Error reason. Optional Status Codes • 201 Created Document(s) have been created or updated • 400 Bad Request The request provided invalid JSON data • 404 Not Found Requested database not found Request: POST /db/_bulk_docs HTTP/1.1 Accept: application/json Content-Length: 109 Content-Type:application/json Host: localhost:5984 { "docs": [ { "_id": "FishStew" }, { "_id": "LambStew", "_rev": "2-0786321986194c92dd3b57dfbfc741ce", "_deleted": true } ] } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 144 Content-Type: application/json Date: Mon, 12 Aug 2013 00:15:05 GMT Server: CouchDB (Erlang/OTP) [ { "ok": true, "id": "FishStew", "rev":" 1-967a00dff5e02add41819138abb3284d" }, { "ok": true, "id": "LambStew", "rev": "3-f9c62b2169d0999103e9f41949090807" } ] Inserting Documents in Bulk Each time a document is stored or updated in CouchDB, the internal B-tree is updated. Bulk insertion provides efficiency gains in both storage space, and time, by consolidating many of the updates to inter- mediate B-tree nodes. It is not intended as a way to perform ACID-like transactions in CouchDB, the only transaction boundary within CouchDB is a single up- date to a single database. The constraints are detailed in Bulk Docu- ments Transaction Semantics. To insert documents in bulk into a database you need to supply a JSON structure with the array of documents that you want to add to the data- base. You can either include a document ID, or allow the document ID to be automatically generated. For example, the following update inserts three new documents, two with the supplied document IDs, and one which will have a document ID gener- ated: POST /source/_bulk_docs HTTP/1.1 Accept: application/json Content-Length: 323 Content-Type: application/json Host: localhost:5984 { "docs": [ { "_id": "FishStew", "servings": 4, "subtitle": "Delicious with freshly baked bread", "title": "FishStew" }, { "_id": "LambStew", "servings": 6, "subtitle": "Serve with a whole meal scone topping", "title": "LambStew" }, { "servings": 8, "subtitle": "Hand-made dumplings make a great accompaniment", "title": "BeefStew" } ] } The return type from a bulk insertion will be 201 Created, with the content of the returned structure indicating specific success or other- wise messages on a per-document basis. 
The return structure from the example above contains a list of the doc- uments created, here with the combination and their revision IDs: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 215 Content-Type: application/json Date: Sat, 26 Oct 2013 00:10:39 GMT Server: CouchDB (Erlang OTP) [ { "id": "FishStew", "ok": true, "rev": "1-6a466d5dfda05e613ba97bd737829d67" }, { "id": "LambStew", "ok": true, "rev": "1-648f1b989d52b8e43f05aa877092cc7c" }, { "id": "00a271787f89c0ef2e10e88a0c0003f0", "ok": true, "rev": "1-e4602845fc4c99674f50b1d5a804fdfa" } ] For details of the semantic content and structure of the returned JSON see Bulk Documents Transaction Semantics. Conflicts and validation er- rors when updating documents in bulk must be handled separately; see Bulk Document Validation and Conflict Errors. Updating Documents in Bulk The bulk document update procedure is similar to the insertion proce- dure, except that you must specify the document ID and current revision for every document in the bulk update JSON string. For example, you could send the following request: POST /recipes/_bulk_docs HTTP/1.1 Accept: application/json Content-Length: 464 Content-Type: application/json Host: localhost:5984 { "docs": [ { "_id": "FishStew", "_rev": "1-6a466d5dfda05e613ba97bd737829d67", "servings": 4, "subtitle": "Delicious with freshly baked bread", "title": "FishStew" }, { "_id": "LambStew", "_rev": "1-648f1b989d52b8e43f05aa877092cc7c", "servings": 6, "subtitle": "Serve with a whole meal scone topping", "title": "LambStew" }, { "_id": "BeefStew", "_rev": "1-e4602845fc4c99674f50b1d5a804fdfa", "servings": 8, "subtitle": "Hand-made dumplings make a great accompaniment", "title": "BeefStew" } ] } The return structure is the JSON of the updated documents, with the new revision and ID information: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 215 Content-Type: application/json Date: Sat, 26 Oct 2013 00:10:39 GMT Server: CouchDB (Erlang OTP) [ { "id": "FishStew", "ok": true, "rev": "2-2bff94179917f1dec7cd7f0209066fb8" }, { "id": "LambStew", "ok": true, "rev": "2-6a7aae7ac481aa98a2042718d09843c4" }, { "id": "BeefStew", "ok": true, "rev": "2-9801936a42f06a16f16c30027980d96f" } ] You can optionally delete documents during a bulk update by adding the _deleted field with a value of true to each document ID/revision combi- nation within the submitted JSON structure. The return type from a bulk insertion will be 201 Created, with the content of the returned structure indicating specific success or other- wise messages on a per-document basis. The content and structure of the returned JSON will depend on the transaction semantics being used for the bulk update; see Bulk Docu- ments Transaction Semantics for more information. Conflicts and valida- tion errors when updating documents in bulk must be handled separately; see Bulk Document Validation and Conflict Errors. Bulk Documents Transaction Semantics Bulk document operations are non-atomic. This means that CouchDB does not guarantee that any individual document included in the bulk update (or insert) will be saved when you send the request. The response will contain the list of documents successfully inserted or updated during the process. In the event of a crash, some of the documents may have been successfully saved, while others lost. The response structure will indicate whether the document was updated by supplying the new _rev parameter indicating a new document revision was created. 
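When driving _bulk_docs from code, it is worth checking each element of the returned array rather than only the HTTP status, since the request as a whole can succeed while individual documents fail (the failure case is described next). A rough sketch, assuming a local CouchDB, placeholder credentials and the Python requests library:

   import requests

   docs = [
       {"_id": "FishStew", "_rev": "1-6a466d5dfda05e613ba97bd737829d67",
        "servings": 4, "title": "FishStew"},
       {"_id": "LambStew", "_rev": "1-648f1b989d52b8e43f05aa877092cc7c",
        "servings": 6, "title": "LambStew"},
   ]
   resp = requests.post("http://localhost:5984/recipes/_bulk_docs",
                        json={"docs": docs}, auth=("admin", "password"))
   resp.raise_for_status()
   for row in resp.json():
       if row.get("ok"):
           print("saved", row["id"], "as", row["rev"])
       else:
           # e.g. {"id": "...", "error": "conflict",
           #       "reason": "Document update conflict."}
           print("failed", row["id"], row.get("error"), row.get("reason"))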
If the update failed, you will get an error of type con- flict. For example: [ { "id" : "FishStew", "error" : "conflict", "reason" : "Document update conflict." }, { "id" : "LambStew", "error" : "conflict", "reason" : "Document update conflict." }, { "id" : "BeefStew", "error" : "conflict", "reason" : "Document update conflict." } ] In this case no new revision has been created and you will need to sub- mit the document update, with the correct revision tag, to update the document. Replication of documents is independent of the type of insert or up- date. The documents and revisions created during a bulk insert or up- date are replicated in the same way as any other document. Bulk Document Validation and Conflict Errors The JSON returned by the _bulk_docs operation consists of an array of JSON structures, one for each document in the original submission. The returned JSON structure should be examined to ensure that all of the documents submitted in the original request were successfully added to the database. When a document (or document revision) is not correctly committed to the database because of an error, you should check the error field to determine error type and course of action. Errors will be one of the following type: • conflict The document as submitted is in conflict. The new revision will not have been created and you will need to re-submit the document to the database. Conflict resolution of documents added using the bulk docs interface is identical to the resolution procedures used when resolving con- flict errors during replication. • forbidden Entries with this error type indicate that the validation routine ap- plied to the document during submission has returned an error. For example, if your validation routine includes the following: throw({forbidden: 'invalid recipe ingredient'}); The error response returned will be: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 80 Content-Type: application/json Date: Sat, 26 Oct 2013 00:05:17 GMT Server: CouchDB (Erlang OTP) [ { "id": "LambStew", "error": "forbidden", "reason": "invalid recipe ingredient" } ] /{db}/_find POST /{db}/_find Find documents using a declarative JSON querying syntax. Queries will use custom indexes, specified using the _index end- point, if available. Otherwise, when allowed, they use the built-in _all_docs index, which can be arbitrarily slow. Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • selector (object) JSON object describing criteria used to se- lect documents. More information provided in the section on selector syntax. Required • limit (number) Maximum number of results returned. Default is 25. Optional • skip (number) Skip the first n results, where n is the value specified. Optional • sort (array) JSON array following sort syntax. Optional • fields (array) JSON array specifying which fields of each ob- ject should be returned. If it is omitted, the entire object is returned. More information provided in the section on filtering fields. Optional • use_index (string|array) Request a query to use a specific index. Specified either as "<design_document>" or ["<de- sign_document>", "<index_name>"]. It is not guaranteed that the index will be actually used because if the index is not valid for the selector, fallback to a valid index is at- tempted. Therefore that is more like a hint. When fallback oc- curs, the details are given in the warning field of the re- sponse. 
Optional • allow_fallback (boolean) Tell if it is allowed to fall back to another valid index. This can happen on running a query with an index specified by use_index which is not deemed us- able, or when only the built-in _all_docs index would be picked in lack of indexes available to support the query. Disabling this fallback logic causes the endpoint immediately return an error in such cases. Default is true. Optional • conflicts (boolean) Include conflicted documents if true. Intended use is to easily find conflicted documents, without an index or view. Default is false. Optional • r (number) Read quorum needed for the result. This defaults to 1, in which case the document found in the index is re- turned. If set to a higher value, each document is read from at least that many replicas before it is returned in the re- sults. This is likely to take more time than using only the document stored locally with the index. Optional, default: 1 • bookmark (string) A string that enables you to specify which page of results you require. Used for paging through result sets. Every query returns an opaque string under the bookmark key that can then be passed back in a query to get the next page of results. If any part of the selector query changes be- tween requests, the results are undefined. Optional, default: null • update (boolean) Whether to update the index prior to return- ing the result. Default is true. Optional • stable (boolean) Whether or not the view results should be returned from a stable set of shards. Optional • stale (string) Combination of update=false and stable=true options. Possible options: "ok", false (default). Optional Note that this parameter is deprecated. Use stable and update instead. See Views Generation for more details. • execution_stats (boolean) Include execution statistics in the query response. Optional, default: false Response Headers • Content-Type .INDENT 2.0 • application/json • Transfer-Encoding chunked Response JSON Object • docs (object) Array of documents matching the search. In each matching document, the fields specified in the fields part of the request body are listed, along with their values. • warning (string) Execution warnings • execution_stats (object) Execution statistics • bookmark (string) An opaque string used for paging. See the bookmark field in the request (above) for usage details. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Requested database not found • 500 Internal Server Error Query execution error The limit and skip values are exactly as you would expect. While skip exists, it is not intended to be used for paging. The reason is that the bookmark fea- ture is more efficient. 
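Putting the limit and bookmark parameters together, a result set can be walked page by page roughly as sketched below; a raw HTTP example of a single _find request follows. The database name, the credentials and the use of the Python requests library are assumptions, not part of the API.

   import requests

   URL = "http://localhost:5984/movies/_find"  # placeholder database
   AUTH = ("admin", "password")                # placeholder credentials
   query = {
       "selector": {"year": {"$gt": 2010}},
       "fields": ["_id", "title"],
       "limit": 50,
   }

   while True:
       body = requests.post(URL, json=query, auth=AUTH).json()
       for doc in body["docs"]:
           print(doc["_id"], doc.get("title"))
       # Fewer rows than "limit" means the end of the result set.
       if len(body["docs"]) < query["limit"]:
           break
       # Keep the selector unchanged and hand the bookmark back to get
       # the next page.
       query["bookmark"] = body["bookmark"]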
Request: Example request body for finding documents using an index: POST /movies/_find HTTP/1.1 Accept: application/json Content-Type: application/json Content-Length: 168 Host: localhost:5984 { "selector": { "year": {"$gt": 2010} }, "fields": ["_id", "_rev", "year", "title"], "sort": [{"year": "asc"}], "limit": 2, "skip": 0, "execution_stats": true } Response: Example response when finding documents using an index: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Thu, 01 Sep 2016 15:41:53 GMT Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked { "docs": [ { "_id": "176694", "_rev": "1-54f8e950cc338d2385d9b0cda2fd918e", "year": 2011, "title": "The Tragedy of Man" }, { "_id": "780504", "_rev": "1-5f14bab1a1e9ac3ebdf85905f47fb084", "year": 2011, "title": "Drive" } ], "execution_stats": { "total_keys_examined": 200, "total_docs_examined": 200, "total_quorum_docs_examined": 0, "results_returned": 2, "execution_time_ms": 5.52 } } Selector Syntax Selectors are expressed as a JSON object describing documents of inter- est. Within this structure, you can apply conditional logic using spe- cially named fields. Whilst selectors have some similarities with MongoDB query documents, these arise from a similarity of purpose and do not necessarily extend to commonality of function or result. Selector Basics Elementary selector syntax requires you to specify one or more fields, and the corresponding values required for those fields. This selector matches all documents whose "director" field has the value "Lars von Trier". { "director": "Lars von Trier" } A simple selector, inspecting specific fields: "selector": { "title": "Live And Let Die" }, "fields": [ "title", "cast" ] You can create more complex selector expressions by combining opera- tors. For best performance, it is best to combine combination or array logical operators, such as $regex, with an operator that defines a con- tiguous range of keys such as $eq, $gt, $gte, $lt, $lte, and $begin- sWith (but not $ne). For more information about creating complex selec- tor expressions, see creating selector expressions. Selector with 2 fields This selector matches any document with a name field containing "Paul", and that also has a location field with the value "Boston". { "name": "Paul", "location": "Boston" } Subfields A more complex selector enables you to specify the values for field of nested objects, or subfields. For example, you might use a standard JSON structure for specifying a field and subfield. Example of a field and subfield selector, using a standard JSON struc- ture: { "imdb": { "rating": 8 } } An abbreviated equivalent uses a dot notation to combine the field and subfield names into a single name. { "imdb.rating": 8 } Operators Operators are identified by the use of a dollar sign ($) prefix in the name field. There are two core types of operators in the selector syntax: • Combination operators • Condition operators In general, combination operators are applied at the topmost level of selection. They are used to combine conditions, or to create combina- tions of conditions, into one selector. Every explicit operator has the form: { "$operator": argument } A selector without an explicit operator is considered to have an im- plicit operator. The exact implicit operator is determined by the structure of the selector expression. 
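For instance, the two selectors below are interchangeable: the first relies on the implicit equality operator, the second spells the same condition out with $eq. The short sketch simply posts both to _find; the database name, the credentials and the Python requests library are assumptions.

   import requests

   AUTH = ("admin", "password")
   URL = "http://localhost:5984/movies/_find"

   implicit = {"selector": {"director": "Lars von Trier"},
               "fields": ["_id", "title"]}
   explicit = {"selector": {"director": {"$eq": "Lars von Trier"}},
               "fields": ["_id", "title"]}

   for query in (implicit, explicit):
       docs = requests.post(URL, json=query, auth=AUTH).json()["docs"]
       print(len(docs), "documents matched")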
Implicit Operators There are two implicit operators: • Equality • And In a selector, any field containing a JSON value, but that has no oper- ators in it, is considered to be an equality condition. The implicit equality test applies also for fields and subfields. Any JSON object that is not the argument to a condition operator is an implicit $and operator on each field. In the below example, we use an operator to match any document, where the "year" field has a value greater than 2010: { "year": { "$gt": 2010 } } In this next example, there must be a field "director" in a matching document, and the field must have a value exactly equal to "Lars von Trier". { "director": "Lars von Trier" } You can also make the equality operator explicit. { "director": { "$eq": "Lars von Trier" } } In the next example using subfields, the required field "imdb" in a matching document must also have a subfield "rating" and the subfield must have a value equal to 8. Example of implicit operator applied to a subfield test: { "imdb": { "rating": 8 } } Again, you can make the equality operator explicit. { "imdb": { "rating": { "$eq": 8 } } } An example of the $eq operator used with full text indexing: { "selector": { "year": { "$eq": 2001 } }, "sort": [ "title:string" ], "fields": [ "title" ] } An example of the $eq operator used with database indexed on the field "year": { "selector": { "year": { "$eq": 2001 } }, "sort": [ "year" ], "fields": [ "year" ] } In this example, the field "director" must be present and contain the value "Lars von Trier" and the field "year" must exist and have the value 2003. { "director": "Lars von Trier", "year": 2003 } You can make both the $and operator and the equality operator explicit. Example of using explicit $and and $eq operators: { "$and": [ { "director": { "$eq": "Lars von Trier" } }, { "year": { "$eq": 2003 } } ] } Explicit Operators All operators, apart from Equality and And, must be stated explicitly. Combination Operators Combination operators are used to combine selectors. In addition to the common boolean operators found in most programming languages, there are three combination operators ($all, $elemMatch, and $allMatch) that help you work with JSON arrays and one that works with JSON maps ($keyMap- Match). A combination operator takes a single argument. The argument is either another selector, or an array of selectors. The list of combination operators: +--------------+----------+---------------------+ | Operator | Argument | Purpose | +--------------+----------+---------------------+ | $and | Array | Matches if all the | | | | selectors in the | | | | array match. | +--------------+----------+---------------------+ | $or | Array | Matches if any of | | | | the selectors in | | | | the array match. | | | | All selectors must | | | | use the same index. | +--------------+----------+---------------------+ | $not | Selector | Matches if the | | | | given selector does | | | | not match. | +--------------+----------+---------------------+ | $nor | Array | Matches if none of | | | | the selectors in | | | | the array match. | +--------------+----------+---------------------+ | $all | Array | Matches an array | | | | value if it con- | | | | tains all the ele- | | | | ments of the argu- | | | | ment array. | +--------------+----------+---------------------+ | $elemMatch | Selector | Matches and returns | | | | all documents that | | | | contain an array | | | | field with at least | | | | one element that | | | | matches all the | | | | specified query | | | | criteria. 
| +--------------+----------+---------------------+ | $allMatch | Selector | Matches and returns | | | | all documents that | | | | contain an array | | | | field with all its | | | | elements matching | | | | all the specified | | | | query criteria. | +--------------+----------+---------------------+ | $keyMapMatch | Selector | Matches and returns | | | | all documents that | | | | contain a map that | | | | contains at least | | | | one key that | | | | matches all the | | | | specified query | | | | criteria. | +--------------+----------+---------------------+

   The $and operator

   $and operator used with two fields:

      {
          "selector": {
              "$and": [
                  { "title": "Total Recall" },
                  { "year": { "$in": [1984, 1991] } }
              ]
          },
          "fields": [ "year", "title", "cast" ]
      }

   The $and operator matches if all the selectors in the array match.
   Below is an example using the primary index (_all_docs):

      {
          "$and": [
              { "_id": { "$gt": null } },
              { "year": { "$in": [2014, 2015] } }
          ]
      }

   The $or operator

   The $or operator matches if any of the selectors in the array match.
   Below is an example used with an index on the field "year":

      {
          "year": 1977,
          "$or": [
              { "director": "George Lucas" },
              { "director": "Steven Spielberg" }
          ]
      }

   The $not operator

   The $not operator matches if the given selector does not match. Below
   is an example used with an index on the field "year":

      {
          "year": { "$gte": 1900, "$lte": 1903 },
          "$not": { "year": 1901 }
      }

   The $nor operator

   The $nor operator matches if none of the selectors in the array match.
   Below is an example used with an index on the field "year":

      {
          "year": { "$gte": 1900, "$lte": 1910 },
          "$nor": [
              { "year": 1901 },
              { "year": 1905 },
              { "year": 1907 }
          ]
      }

   The $all operator

   The $all operator matches an array value if it contains all the
   elements of the argument array. Below is an example used with the
   primary index (_all_docs):

      {
          "_id": { "$gt": null },
          "genre": { "$all": ["Comedy","Short"] }
      }

   The $elemMatch operator

   The $elemMatch operator matches and returns all documents that contain
   an array field with at least one element matching the supplied query
   criteria. Below is an example used with the primary index (_all_docs):

      {
          "_id": { "$gt": null },
          "genre": { "$elemMatch": { "$eq": "Horror" } }
      }

   The $allMatch operator

   The $allMatch operator matches and returns all documents that contain
   an array field with all its elements matching the supplied query
   criteria. Below is an example used with the primary index (_all_docs):

      {
          "_id": { "$gt": null },
          "genre": { "$allMatch": { "$eq": "Horror" } }
      }

   The $keyMapMatch operator

   The $keyMapMatch operator matches and returns all documents that
   contain a map that contains at least one key that matches all the
   specified query criteria. Below is an example used with the primary
   index (_all_docs):

      {
          "_id": { "$gt": null },
          "cameras": { "$keyMapMatch": { "$eq": "secondary" } }
      }

   Condition Operators

   Condition operators are specific to a field, and are used to evaluate
   the value stored in that field. For instance, the basic $eq operator
   matches when the specified field contains a value that is equal to the
   supplied argument.

   NOTE:
      For a condition operator to function correctly, the field must
      exist in the document for the selector to match. As an example, $ne
      means the specified field must exist, and is not equal to the value
      of the argument.

   The basic equality and inequality operators common to most programming
   languages are supported. Strict type matching is used. In addition,
   some meta condition operators are available.
Some condi- tion operators accept any valid JSON content as the argument. Other condition operators require the argument to be in a specific JSON for- mat. +---------------+-------------+------------------+------------------+ | Operator type | Operator | Argument | Purpose | +---------------+-------------+------------------+------------------+ | (In)equality | $lt | Any JSON | The field is | | | | | less than the | | | | | argument. | +---------------+-------------+------------------+------------------+ | | $lte | Any JSON | The field is | | | | | less than or | | | | | equal to the ar- | | | | | gument. | +---------------+-------------+------------------+------------------+ | | $eq | Any JSON | The field is | | | | | equal to the ar- | | | | | gument. | +---------------+-------------+------------------+------------------+ | | $ne | Any JSON | The field is not | | | | | equal to the ar- | | | | | gument. | +---------------+-------------+------------------+------------------+ | | $gte | Any JSON | The field is | | | | | greater than or | | | | | equal to the ar- | | | | | gument. | +---------------+-------------+------------------+------------------+ | | $gt | Any JSON | The field is | | | | | greater than the | | | | | argument. | +---------------+-------------+------------------+------------------+ | Object | $exists | Boolean | Check whether | | | | | the field exists | | | | | or not, regard- | | | | | less of its | | | | | value. | +---------------+-------------+------------------+------------------+ | | $type | String | Check the docu- | | | | | ment fields | | | | | type. Valid | | | | | values are | | | | | "null", | | | | | "boolean", "num- | | | | | ber", "string", | | | | | "array", and | | | | | "object". | +---------------+-------------+------------------+------------------+ | Array | $in | Array of JSON | The document | | | | values | field must exist | | | | | in the list pro- | | | | | vided. | +---------------+-------------+------------------+------------------+ | | $nin | Array of JSON | The document | | | | values | field not must | | | | | exist in the | | | | | list provided. | +---------------+-------------+------------------+------------------+ | | $size | Integer | Special condi- | | | | | tion to match | | | | | the length of an | | | | | array field in a | | | | | document. | | | | | Non-array fields | | | | | cannot match | | | | | this condition. | +---------------+-------------+------------------+------------------+ | Miscellaneous | $mod | [Divisor, Re- | Divisor is a | | | | mainder] | non-zero inte- | | | | | ger, Remainder | | | | | is any integer. | | | | | Non-integer val- | | | | | ues result in a | | | | | 404. Matches | | | | | documents where | | | | | field % Divisor | | | | | == Remainder is | | | | | true, and only | | | | | when the docu- | | | | | ment field is an | | | | | integer. | +---------------+-------------+------------------+------------------+ | | $regex | String | A regular ex- | | | | | pression pattern | | | | | to match against | | | | | the document | | | | | field. Only | | | | | matches when the | | | | | field is a | | | | | string value and | | | | | matches the sup- | | | | | plied regular | | | | | expression. The | | | | | matching algo- | | | | | rithms are based | | | | | on the Perl Com- | | | | | patible Regular | | | | | Expression | | | | | (PCRE) library. | | | | | For more infor- | | | | | mation about | | | | | what is imple- | | | | | mented, see the | | | | | Erlang Regular | | | | | Expression. 
| +---------------+-------------+------------------+------------------+ | | $beginsWith | String | Matches where | | | | | the document | | | | | field begins | | | | | with the speci- | | | | | fied prefix | | | | | (case-sensi- | | | | | tive). If the | | | | | document field | | | | | contains a | | | | | non-string | | | | | value, the docu- | | | | | ment is not | | | | | matched. | +---------------+-------------+------------------+------------------+ WARNING: Regular expressions do not work with indexes, so they should not be used to filter large data sets. They can, however, be used to re- strict a partial index. Creating Selector Expressions We have seen examples of combining selector expressions, such as using explicit $and and $eq operators. In general, whenever you have an operator that takes an argument, that argument can itself be another operator with arguments of its own. This enables us to build up more complex selector expressions. However, only operators that define a contiguous range of values such as $eq, $gt, $gte, $lt, $lte, and $beginsWith (but not $ne) can be used as the basis of a query that can make efficient use of a json index. You should include at least one of these in a selector, or consider us- ing a text index if greater flexibility is required. For example, if you try to perform a query that attempts to match all documents that have a field called afieldname containing a value that begins with the letter A, this will trigger a warning because no index could be used and the database performs a full scan of the primary in- dex: Request POST /movies/_find HTTP/1.1 Accept: application/json Content-Type: application/json Content-Length: 112 Host: localhost:5984 { "selector": { "afieldname": {"$regex": "^A"} } } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Thu, 01 Sep 2016 17:25:51 GMT Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked { "warning":"no matching index found, create an index to optimize query time", "docs":[ ] } WARNING: It is always recommended that you create an appropriate index when deploying in production. Most selector expressions work exactly as you would expect for the given operator. But it is not always the case: for example, comparison of strings is done with ICU and can can give surprising results if you were expecting ASCII ordering. See Views Collation for more details. Sort Syntax The sort field contains a list of field name and direction pairs, ex- pressed as a basic array. The first field name and direction pair is the topmost level of sort. The second pair, if provided, is the next level of sort. The field can be any field, using dotted notation if desired for sub-document fields. The direction value is "asc" for ascending, and "desc" for descending. If you omit the direction value, the default "asc" is used. Example, sorting by 2 fields: [{"fieldName1": "desc"}, {"fieldName2": "desc"}] Example, sorting by 2 fields, assuming default direction for both : ["fieldNameA", "fieldNameB"] A typical requirement is to search for some content using a selector, then to sort the results according to the specified field, in the re- quired direction. To use sorting, ensure that: • At least one of the sort fields is included in the selector. • There is an index already defined, with all the sort fields in the same order. • Each object in the sort array has a single key. 
If an object in the sort array does not have a single key, the result- ing sort order is implementation specific and might change. Find does not support multiple fields with different sort orders, so the directions must be either all ascending or all descending. For field names in text search sorts, it is sometimes necessary for a field type to be specified, for example: { "<fieldname>:string": "asc" } If possible, an attempt is made to discover the field type based on the selector. In ambiguous cases the field type must be provided explic- itly. The sorting order is undefined when fields contain different data types. This is an important difference between text and view indexes. Sorting behavior for fields with different data types might change in future versions. A simple query, using sorting: { "selector": {"Actor_name": "Robert De Niro"}, "sort": [{"Actor_name": "asc"}, {"Movie_runtime": "asc"}] } Filtering Fields It is possible to specify exactly which fields are returned for a docu- ment when selecting from a database. The two advantages are: • Your results are limited to only those parts of the document that are required for your application. • A reduction in the size of the response. The fields returned are specified as an array. Only the specified filter fields are included, in the response. There is no automatic inclusion of the _id or other metadata fields when a field list is included. Example of selective retrieval of fields from matching documents: { "selector": { "Actor_name": "Robert De Niro" }, "fields": ["Actor_name", "Movie_year", "_id", "_rev"] } Pagination Mango queries support pagination via the bookmark field. Every _find response contains a bookmark - a token that CouchDB uses to determine where to resume from when subsequent queries are made. To get the next set of query results, add the bookmark that was received in the previ- ous response to your next request. Remember to keep the selector the same, otherwise you will receive unexpected results. To paginate back- wards, you can use a previous bookmark to return the previous set of results. Note that the presence of a bookmark does not guarantee that there are more results. You can to test whether you have reached the end of the result set by comparing the number of results returned with the page size requested - if results returned < limit, there are no more. Execution Statistics Find can return basic execution statistics for a specific request. Com- bined with the _explain endpoint, this should provide some insight as to whether indexes are being used effectively. The execution statistics currently include: +----------------------------+----------------------------+ | Field | Description | +----------------------------+----------------------------+ | total_keys_examined | Number of index keys exam- | | | ined. | +----------------------------+----------------------------+ | total_docs_examined | Number of documents | | | fetched from the database | | | / index, equivalent to us- | | | ing include_docs=true in a | | | view. These may then be | | | filtered in-memory to fur- | | | ther narrow down the re- | | | sult set based on the se- | | | lector. | +----------------------------+----------------------------+ | total_quorum_docs_examined | Number of documents | | | fetched from the database | | | using an out-of-band docu- | | | ment fetch. This is only | | | non-zero when read quorum | | | > 1 is specified in the | | | query parameters. 
| +----------------------------+----------------------------+ | results_returned | Number of results returned | | | from the query. Ideally | | | this should not be signif- | | | icantly lower than the to- | | | tal documents / keys exam- | | | ined. | +----------------------------+----------------------------+ | execution_time_ms | Total execution time in | | | milliseconds as measured | | | by the database. | +----------------------------+----------------------------+ /{db}/_index Mango is a declarative JSON querying language for CouchDB databases. Mango wraps several index types, starting with the Primary Index out-of-the-box. Mango indexes, with index type json, are built using MapReduce Views. POST /{db}/_index Create a new index on a database Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json Query Parameters • index (object) JSON object describing the index to create. • ddoc (string) Name of the design document in which the index will be created. By default, each index will be created in its own design document. Indexes can be grouped into design docu- ments for efficiency. However, a change to one index in a de- sign document will invalidate all other indexes in the same document (similar to views). Optional • name (string) Name of the index. If no name is provided, a name will be generated automatically. Optional • type (string) Can be "json", "text" (for clouseau) or "nou- veau". Defaults to "json". Optional • partitioned (boolean) Determines whether a JSON index is par- titioned or global. The default value of partitioned is the partitioned property of the database. To create a global index on a partitioned database, specify false for the "partitioned" field. If you specify true for the "partitioned" field on an unpartitioned database, an error occurs. Response Headers • Content-Type .INDENT 2.0 • application/json • Transfer-Encoding chunked Response JSON Object • result (string) Flag to show whether the index was created or one already exists. Can be "created" or "exists". • id (string) Id of the design document the index was created in. • name (string) Name of the index created. Status Codes • 200 OK Index created successfully or already exists • 400 Bad Request Invalid request • 401 Unauthorized Admin permission required • 404 Not Found Database not found • 500 Internal Server Error Execution error The index parameter is a JSON object with the following fields: • fields (array): Array of field names following the sort syn- tax. Nested fields are also allowed, e.g. person.name. • partial_filter_selector (object): A selector to apply to docu- ments at indexing time, creating a partial index. 
Optional Example of creating a new index for a field called foo: Request: POST /db/_index HTTP/1.1 Content-Type: application/json Content-Length: 116 Host: localhost:5984 { "index": { "fields": ["foo"] }, "name" : "foo-index", "type" : "json" } The returned JSON confirms the index has been created: Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 96 Content-Type: application/json Date: Thu, 01 Sep 2016 18:17:48 GMT Server: CouchDB (Erlang OTP/18) { "result":"created", "id":"_design/a5f4711fc9448864a13c81dc71e660b524d7410c", "name":"foo-index" } Example index creation using all available query parameters: Request: POST /db/_index HTTP/1.1 Content-Type: application/json Content-Length: 396 Host: localhost:5984 { "index": { "partial_filter_selector": { "year": { "$gt": 2010 }, "limit": 10, "skip": 0 }, "fields": [ "_id", "_rev", "year", "title" ] }, "ddoc": "example-ddoc", "name": "example-index", "type": "json", "partitioned": false } By default, a JSON index will include all documents that have the in- dexed fields present, including those which have null values. Partial Indexes Partial indexes allow documents to be filtered at indexing time, poten- tially offering significant performance improvements for query selec- tors that do not map cleanly to a range query on an index. Lets look at an example query: { "selector": { "status": { "$ne": "archived" }, "type": "user" } } Without a partial index, this requires a full index scan to find all the documents of "type":"user" that do not have a status of "archived". This is because a normal index can only be used to match contiguous rows, and the "$ne" operator cannot guarantee that. To improve response times, we can create an index which excludes docu- ments where "status": { "$ne": "archived" } at index time using the "partial_filter_selector" field: POST /db/_index HTTP/1.1 Content-Type: application/json Content-Length: 144 Host: localhost:5984 { "index": { "partial_filter_selector": { "status": { "$ne": "archived" } }, "fields": ["type"] }, "ddoc" : "type-not-archived", "type" : "json" } Partial indexes are not currently used by the query planner unless specified by a "use_index" field, so we need to modify the original query: { "selector": { "status": { "$ne": "archived" }, "type": "user" }, "use_index": "type-not-archived" } Technically, we do not need to include the filter on the "status" field in the query selector - the partial index ensures this is always true - but including it makes the intent of the selector clearer and will make it easier to take advantage of future improvements to query planning (e.g. automatic selection of partial indexes). NOTE: An index with fields is only used, when the selector includes all of the fields indexed. For instance, if an index contains ["a". "b"] but the selector only requires field ["a"] to exist in the matching documents, the index would not be valid for the query. All indexes, however, can be treated as if they include the special fields _id and _rev. They never need to be specified in the query selector. GET /{db}/_index When you make a GET request to /{db}/_index, you get a list of all indexes in the database. In addition to the information available through this API, indexes are also stored in design documents as views. Design documents are regular documents that have an ID starting with _design/. Design documents can be re- trieved and modified in the same way as any other document, al- though this is not necessary when using Mango. Parameters • db Database name. 
Response Headers • Content-Type .INDENT 2.0 • application/json • Transfer-Encoding chunked Response JSON Object • total_rows (number) Number of indexes. • indexes (array) Array of index definitions (see below). Status Codes • 200 OK Success • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 500 Internal Server Error Execution error Index definitions are JSON objects with the following fields: • ddoc (string): ID of the design document the index belongs to. This ID can be used to retrieve the design document containing the index, by making a GET request to /{db}/ddoc, where ddoc is the value of this field. • name (string): Name of the index. • partitioned (boolean): Partitioned (true) or global (false) index. • type (string): Type of the index. Currently "json" is the only supported type. • def (object): Definition of the index, containing the indexed fields and the sort order: ascending or descending. Request: GET /db/_index HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 238 Content-Type: application/json Date: Thu, 01 Sep 2016 18:17:48 GMT Server: CouchDB (Erlang OTP/18) { "total_rows": 2, "indexes": [ { "ddoc": null, "name": "_all_docs", "type": "special", "def": { "fields": [ { "_id": "asc" } ] } }, { "ddoc": "_design/a5f4711fc9448864a13c81dc71e660b524d7410c", "name": "foo-index", "partitioned": false, "type": "json", "def": { "fields": [ { "foo": "asc" } ] } } ] } DELETE /{db}/_index/{design_doc}/json/{name} Parameters • db Database name. • design_doc Design document name. The _design/ prefix is not required. • name Index name. Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • ok (string) true if successful. Status Codes • 200 OK Success • 400 Bad Request Invalid request • 401 Unauthorized Writer permission required • 404 Not Found Index not found • 500 Internal Server Error Execution error Request: DELETE /db/_index/_design/a5f4711fc9448864a13c81dc71e660b524d7410c/json/foo-index HTTP/1.1 Accept: */* Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Thu, 01 Sep 2016 19:21:40 GMT Server: CouchDB (Erlang OTP/18) { "ok": true } POST /{db}/_index/_bulk_delete Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json Request JSON Object • docids (array) List of names for indexes to be deleted. • w (number) Write quorum for each of the deletions. Default is 2. Optional Response Headers • Content-Type .INDENT 2.0 • application/json Response JSON Object • success (array) An array of objects that represent successful deletions per index. The id key contains the name of the in- dex, and ok reports if the operation has completed • fail (array) An array of object that describe failed dele- tions per index. 
The id key names the corresponding index, and error describes the reason for the failure Status Codes • 200 OK Success • 400 Bad Request Invalid request • 404 Not Found Requested database not found • 500 Internal Server Error Execution error Request: POST /db/_index/_bulk_delete HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 { "docids": [ "_design/example-ddoc", "foo-index", "nonexistent-index" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 94 Content-Type: application/json Date: Thu, 01 Sep 2016 19:26:59 GMT Server: CouchDB (Erlang OTP/18) { "success": [ { "id": "_design/example-ddoc", "ok": true }, { "id": "foo-index", "ok": true } ], "fail": [ { "id": "nonexistent-index", "error": "not_found" } ] } /{db}/_explain POST /{db}/_explain Shows which index is being used by the query. Parameters are the same as _find. Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json Response Headers • Content-Type .INDENT 2.0 • application/json • Transfer-Encoding chunked Response JSON Object • covering (boolean) Tell if the query could be answered only by relying on the data stored in the index. When true, no documents are fetched, which results in a faster response. • dbname (string) Name of database. • index (object) Index used to fulfill the query. • selector (object) Query selector used. • opts (object) Query options used. • mrargs (object) Arguments passed to the underlying view. • limit (number) Limit parameter used. • skip (number) Skip parameter used. • fields (array) Fields to be returned by the query. The [] value here means all the fields, since there is no projection happening in that case. • partitioned (boolean) The database is partitioned or not. • index_candidates (array) The list of all indexes that were found but not selected for serving the query. See the section on index selection below for the details. • selector_hints (object) Extra information on the selector to provide insights about its usability. 
Status Codes
   • 200 OK  Request completed successfully
   • 400 Bad Request  Invalid request
   • 401 Unauthorized  Read permission required
   • 500 Internal Server Error  Execution error

Request:

   POST /movies/_explain HTTP/1.1
   Accept: application/json
   Content-Type: application/json
   Content-Length: 168
   Host: localhost:5984

   {
       "selector": {
           "year": {"$gt": 2010}
       },
       "fields": ["_id", "_rev", "year", "title"],
       "sort": [{"year": "asc"}],
       "limit": 2,
       "skip": 0
   }

Response:

   HTTP/1.1 200 OK
   Cache-Control: must-revalidate
   Content-Type: application/json
   Date: Thu, 01 Sep 2016 15:41:53 GMT
   Server: CouchDB (Erlang OTP)
   Transfer-Encoding: chunked

   {
       "dbname": "movies",
       "index": {
           "ddoc": "_design/0d61d9177426b1e2aa8d0fe732ec6e506f5d443c",
           "name": "0d61d9177426b1e2aa8d0fe732ec6e506f5d443c",
           "type": "json",
           "partitioned": false,
           "def": {
               "fields": [
                   {"year": "asc"}
               ]
           }
       },
       "partitioned": false,
       "selector": {
           "year": {"$gt": 2010}
       },
       "opts": {
           "use_index": [],
           "bookmark": "nil",
           "limit": 2,
           "skip": 0,
           "sort": {},
           "fields": ["_id", "_rev", "year", "title"],
           "partition": "",
           "r": 1,
           "conflicts": false,
           "stale": false,
           "update": true,
           "stable": false,
           "execution_stats": false,
           "allow_fallback": true
       },
       "limit": 2,
       "skip": 0,
       "fields": ["_id", "_rev", "year", "title"],
       "mrargs": {
           "include_docs": true,
           "view_type": "map",
           "reduce": false,
           "partition": null,
           "start_key": [2010],
           "end_key": ["<MAX>"],
           "direction": "fwd",
           "stable": false,
           "update": true,
           "conflicts": "undefined"
       },
       "covering": false,
       "index_candidates": [
           {
               "index": {
                   "ddoc": null,
                   "name": "_all_docs",
                   "type": "special",
                   "def": {
                       "fields": [
                           {"_id": "asc"}
                       ]
                   }
               },
               "analysis": {
                   "usable": true,
                   "reasons": [
                       {"name": "unfavored_type"}
                   ],
                   "ranking": 1,
                   "covering": null
               }
           }
       ],
       "selector_hints": [
           {
               "type": "json",
               "indexable_fields": ["year"],
               "unindexable_fields": []
           }
       ]
   }

Index selection
   _find chooses which index to use for responding to a query, unless you specify an index at query time. In this section, a brief overview of the index selection process is presented.

   NOTE:
      It is good practice to specify indexes explicitly in your queries. This prevents existing queries being affected by new indexes that might get added in a production environment.

   NOTE:
      Both the _explain and _find endpoints rely on the same index selection logic. But _explain is a bit more elaborate, therefore it can be used for simulation and exploration. In the output, details for discarding indexes are placed in the analysis field of the JSON objects under index_candidates. Under analysis the exact reason is listed in the reasons field. Each reason has a specific code, which is explained in the relevant subsections below.

   The index selection happens in multiple rounds.

   [image: Steps of index selection]

   First, all the indexes for the database are collected. The result always includes the special entity called all docs, which is the primary index on the _id field. This is reserved as a catch-all answer when no other suitable indexes could be found, but its use is discouraged for performance reasons.

   In the next round, partial indexes are eliminated unless specified in the use_index field of the query object.

   After that, indexes are filtered according to whether a global or partitioned query was issued. Indexes that do not match the query scope are assigned a scope_mismatch reason code.

   The remaining indexes are filtered by a series of usability checks. Each usability check is supplied with its own reason code.
That is, field_mismatch covers the cases when the fields in the index do not match those of the selector. The code sort_order_mismatch means that the requested sorting does not align with the index. These checks depend on the type of index.

• "special": Usable if no sort is specified in the query or sort is specified on _id only.

• "json": The selector must not request a free-form text search via the $text operator; the needs_text_search reason code is returned otherwise. All the fields in the index must be referenced by the selector or sort in the query. Any sort specified in the query must match the order of the fields in the index.

• "text": The index must contain fields that are referenced by the query "selector" or "sort". "text" indexes do not work with empty selectors, and they return an empty_selector reason code in response to that.

After the usable indexes have been gathered, the user-specified index is verified next. If this is a valid, usable index, then every other usable index is excluded with the excluded_by_user code. Otherwise, it is ignored and the process continues with the rest of the usable indexes.

There is a natural order of preference among the various index types: "json", "text", and then "special". The usable indexes are grouped by their types in this order and the search is narrowed down to the elements of the first group. That is, even if there is a "text" index present that could match the selector, it might be discarded if a "json" index with the suitable fields could be identified. Indexes dropped in this round are all tagged with the unfavored_type reason code. There can be only a single "text" and "special" index per database, hence the selection ends in this phase for those cases.

For "json" indexes, an additional round is run to find the ideal index. The query planner looks at the selector section and finds the index with the closest match to the operators and fields used in the query. This is described by the less_overlap reason code. If there are two or more "json"-type indexes that match, the index with the least number of fields is preferred. This is marked by the too_many_fields reason code. If there are still two or more candidate indexes, the index with the first alphabetical name is chosen. This is reflected by the alphabetically_comes_after reason code.

+-----------------------------+---------------+-----------------------------------------------+
| Reason Code                 | Index Type    | Description                                   |
+-----------------------------+---------------+-----------------------------------------------+
| alphabetically_comes_after  | json          | There is another suitable index whose name    |
|                             |               | comes before that of this index.              |
+-----------------------------+---------------+-----------------------------------------------+
| empty_selector              | text          | "text" indexes do not support queries with    |
|                             |               | empty selectors.                              |
+-----------------------------+---------------+-----------------------------------------------+
| excluded_by_user            | any           | use_index was used to manually specify the    |
|                             |               | index.                                        |
+-----------------------------+---------------+-----------------------------------------------+
| field_mismatch              | any           | Fields in "selector" of the query do not      |
|                             |               | match the fields available in the index.      |
+-----------------------------+---------------+-----------------------------------------------+
| is_partial                  | json, text    | Partial indexes can be selected only          |
|                             |               | manually.                                     |
+-----------------------------+---------------+-----------------------------------------------+
| less_overlap                | json          | There is a better match of fields available   |
|                             |               | within the indexes for the query.             |
+-----------------------------+---------------+-----------------------------------------------+
| needs_text_search           | json          | The use of the $text operator requires a      |
|                             |               | "text" index.                                 |
+-----------------------------+---------------+-----------------------------------------------+
| scope_mismatch              | json          | The scope of the query and the index is not   |
|                             |               | the same.                                     |
+-----------------------------+---------------+-----------------------------------------------+
| sort_order_mismatch         | json, special | Fields in "sort" of the query do not match    |
|                             |               | the fields available in the index.            |
+-----------------------------+---------------+-----------------------------------------------+
| too_many_fields             | json          | The index has more fields than the chosen     |
|                             |               | one.                                          |
+-----------------------------+---------------+-----------------------------------------------+
| unfavored_type              | any           | The type of the index is not preferred.       |
+-----------------------------+---------------+-----------------------------------------------+

In the _explain output, some additional information on the candidate indexes can also be found as part of the analysis object.

• The ranking (number) attribute defines a loose ordering on the items of the list. This is a positive integer that grows the farther down the queue the index is. The selected index always has rank 0; everything else comes after it. The rank reflects the final position of the given index candidate in the tournament described above.

• The usable (Boolean) attribute tells if the index is usable at all. This can be used to partition the index candidates by their usability in relation to the selector.

• The covering (Boolean) attribute tells if the index is a covering index or not. This property is calculated for "json" indexes only and it is null in every other case.

/{db}/_shards
   Added in version 2.0.

   GET /{db}/_shards
      The response will contain a list of database shards. Each shard will have its internal database range, and the nodes on which replicas of those shards are stored.
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • shards (object) Mapping of shard ranges to individual shard replicas on each node in the cluster Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid database name • 401 Unauthorized Read privilege required • 415 Unsupported Media Type Bad Content-Type value • 500 Internal Server Error Internal server error or timeout Request: GET /db/_shards HTTP/1.1 Accept: */* Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 621 Content-Type: application/json Date: Fri, 18 Jan 2019 19:55:14 GMT Server: CouchDB/2.4.0 (Erlang OTP/19) { "shards": { "00000000-1fffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "20000000-3fffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "40000000-5fffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "60000000-7fffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "80000000-9fffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "a0000000-bfffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "c0000000-dfffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ], "e0000000-ffffffff": [ "couchdb@node1.example.com", "couchdb@node2.example.com", "couchdb@node3.example.com" ] } } /{db}/_shards/{docid} GET /{db}/_shards/{docid} Returns information about the specific shard into which a given document has been stored, along with information about the nodes on which that shard has a replica. Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • range (string) The shard range in which the document is stored • nodes (array) List of nodes serving a replica of the shard Status Codes • 200 OK Request completed successfully • 401 Unauthorized Read privilege required • 404 Not Found Database or document not found • 500 Internal Server Error Internal server error or timeout Request: GET /db/_shards/docid HTTP/1.1 Accept: */* Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 94 Content-Type: application/json Date: Fri, 18 Jan 2019 20:26:33 GMT Server: CouchDB/2.3.0-9d4cb03c2 (Erlang OTP/19) { "range": "e0000000-ffffffff", "nodes": [ "node1@127.0.0.1", "node2@127.0.0.1", "node3@127.0.0.1" ] } /{db}/_sync_shards Added in version 2.3.1. POST /{db}/_sync_shards For the given database, force-starts internal shard synchroniza- tion for all replicas of all database shards. This is typically only used when performing cluster maintenance, such as moving a shard. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status. Available in case of success • error (string) Error type. Available if response code is 4xx • reason (string) Error description. 
Available if response code is 4xx Status Codes • 202 Accepted Request accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Database not found • 500 Internal Server Error Internal server error or timeout Request: POST /db/_sync_shards HTTP/1.1 Host: localhost:5984 Accept: */* Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Fri, 18 Jan 2019 20:19:23 GMT Server: CouchDB/2.3.0-9d4cb03c2 (Erlang OTP/19) X-Couch-Request-ID: 14f0b8d252 X-CouchDB-Body-Time: 0 { "ok": true } NOTE: Admins may want to bump their [mem3] sync_concurrency value to a larger figure for the duration of the shards sync. /{db}/_changes GET /{db}/_changes Returns a sorted list of changes made to documents in the data- base, in time order of application. Only the most recent change for a given document is guaranteed to be provided, for example if a document has had fields added, and then deleted, an API client checking for changes will not necessarily receive the in- termediate state of added documents. This can be used to listen for update and modifications to the database for post processing or synchronization, and for practi- cal purposes, a continuously connected _changes feed is a rea- sonable approach for generating a real-time log for most appli- cations. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/event-stream • text/plain • Last-Event-ID ID of the last events received by the server on a previous connection. Overrides since query parameter. Query Parameters • doc_ids (array) List of document IDs to filter the changes feed as valid JSON array. Used with _doc_ids filter. Since - length of URL is limited, it is better to use POST /{db}/_changes instead. • conflicts (boolean) Includes conflicts information in re- sponse. Ignored if include_docs isnt true. Default is false. • descending (boolean) Return the change results in descending sequence order (most recent change first). Default is false. • feed (string) .INDENT 2.0 • normal Specifies Normal Polling Mode. All past changes are re- turned immediately. Default. • longpoll Specifies Long Polling Mode. Waits until at least one change has occurred, sends the change, then closes the connec- tion. Most commonly used in conjunction with since=now, to wait for the next change. • continuous Sets Continuous Mode. Sends a line of JSON per event. Keeps the socket open until timeout. • eventsource Sets Event Source Mode. Works the same as Continu- ous Mode, but sends the events in EventSource format. • filter (string) .INDENT 2.0 • design_doc/filter_name Reference to a filter function from a design document that will filter whole stream emitting only filtered events. See the section Change Notifications in the book CouchDB The Defini- tive Guide for more information. • _doc_ids doc_ids filter • _view view filter • _design design filter • heartbeat (number) Period in milliseconds after which an empty line is sent in the results. Only applicable for longpoll, continuous, and eventsource feeds. Overrides any timeout to keep the feed alive in- definitely. Default is 60000. May be true to use default value. • include_docs (boolean) Include the associated document with each re- sult. If there are conflicts, only the winning revision is returned. Default is false. When used with all_docs style and a filter, return the document body even if does not pass the filtering criteria. 
In other words, filtering applies only to the revision list in the "changes" field, not to the returned document body in the "doc" field.

• attachments (boolean)  Include the Base64-encoded content of attachments in the documents that are included if include_docs is true. Ignored if include_docs isn't true. Default is false.

• att_encoding_info (boolean)  Include encoding information in attachment stubs if include_docs is true and the particular attachment is compressed. Ignored if include_docs isn't true. Default is false.

• last-event-id (number)  Alias of the Last-Event-ID header.

• limit (number)  Limit the number of result rows to the specified value (note that using 0 here has the same effect as 1).

• since  Start the results from the change immediately after the given update sequence. Can be a valid update sequence or the value now. Default is 0.

• style (string)  Specifies how many revisions are returned in the changes array. The default, main_only, will only return the current winning revision; all_docs will return all leaf revisions (including conflicts and deleted former conflicts). When using a filter with all_docs style, if none of the revisions match the filter, the changes row is skipped. If at least one revision matches, the changes row is returned with all matching revisions. If all_docs style is used with include_docs=true and at least one revision matches the filter, the winning doc body is returned, even if it does not pass the filtering criteria.

• timeout (number)  Maximum period in milliseconds to wait for a change before the response is sent, even if there are no results. Only applicable for longpoll or continuous feeds. The default value is specified by the chttpd/changes_timeout configuration option. Note that 60000 is also the default maximum timeout used to prevent undetected dead connections.

• view (string)  Allows view functions to be used as filters. A document counts as passed for the view filter if the map function emits at least one record for it. See _view for more info.

• seq_interval (number)  When fetching changes in a batch, setting the seq_interval parameter tells CouchDB to only calculate the update seq with every Nth result returned. By setting seq_interval=<batch size>, where <batch size> is the number of results requested per batch, load can be reduced on the source CouchDB database; computing the seq value across many shards (especially in highly-sharded databases) is expensive in a heavily loaded CouchDB cluster. (See the sketch below.)

Response Headers
   • Cache-Control  no-cache if the changes feed is eventsource
   • Content-Type
      • application/json
      • text/event-stream
      • text/plain; charset=utf-8
   • ETag  Response hash if the changes feed is normal
   • Transfer-Encoding  chunked

Response JSON Object
   • last_seq (json)  Last change update sequence
   • pending (number)  Count of remaining items in the feed
   • results (array)  Changes made to a database

Status Codes
   • 200 OK  Request completed successfully
   • 400 Bad Request  Bad request

The results field of database changes:

JSON Parameters
   • changes (array)  List of document leaves, each with a single field rev.
   • id (string)  Document ID.
   • seq (json)  Update sequence.
   • deleted (bool)  true if the document is deleted.
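As a rough illustration of the seq_interval optimisation noted in the parameter list above, the following sketch fetches the feed in batches while computing the update sequence only once per batch. The database name db, the credentials, and the batch size of 1000 are assumptions, not part of the reference.

   # Page through the feed 1000 rows at a time; intermediate rows carry no seq,
   # which reduces load on highly sharded databases.
   curl -s 'http://adm:pass@localhost:5984/db/_changes?limit=1000&seq_interval=1000'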
Request: GET /db/_changes?style=all_docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Mon, 12 Aug 2013 00:54:58 GMT ETag: "6ASLEKEMSRABT0O5XY9UPO9Z" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "last_seq": "5-g1AAAAIreJyVkEsKwjAURZ-toI5cgq5A0sQ0OrI70XyppcaRY92J7kR3ojupaSPUUgotgRd4yTlwbw4A0zRUMLdnpaMkwmyF3Ily9xBwEIuiKLI05KOTW0wkV4rruP29UyGWbordzwKVxWBNOGMKZhertDlarbr5pOT3DV4gudUC9-MPJX9tpEAYx4TQASns2E24ucuJ7rXJSL1BbEgf3vTwpmedCZkYa7Pulck7Xt7x_usFU2aIHOD4eEfVTVA5KMGUkqhNZV-8_o5i", "pending": 0, "results": [ { "changes": [ { "rev": "2-7051cbe5c8faecd085a3fa619e6e6337" } ], "id": "6478c2ae800dfc387396d14e1fc39626", "seq": "3-g1AAAAG3eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MSGXAqSVIAkkn2IFUZzIkMuUAee5pRqnGiuXkKA2dpXkpqWmZeagpu_Q4g_fGEbEkAqaqH2sIItsXAyMjM2NgUUwdOU_JYgCRDA5ACGjQfn30QlQsgKvcjfGaQZmaUmmZClM8gZhyAmHGfsG0PICrBPmQC22ZqbGRqamyIqSsLAAArcXo" }, { "changes": [ { "rev": "3-7379b9e515b161226c6559d90c4dc49f" } ], "deleted": true, "id": "5bbc9ca465f1b0fcd62362168a7c8831", "seq": "4-g1AAAAHXeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBMZc4EC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HqQ_kQG3qgSQqnoUtxoYGZkZG5uS4NY8FiDJ0ACkgAbNx2cfROUCiMr9CJ8ZpJkZpaaZEOUziBkHIGbcJ2zbA4hKsA-ZwLaZGhuZmhobYurKAgCz33kh" }, { "changes": [ { "rev": "6-460637e73a6288cb24d532bf91f32969" }, { "rev": "5-eeaa298781f60b7bcae0c91bdedd1b87" } ], "id": "729eb57437745e506b333068fff665ae", "seq": "5-g1AAAAIReJyVkE0OgjAQRkcwUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloRid3MMkEUoJHbXbOxVy6arc_SxQWQzRVHCuYHaxSpuj1aqbj0t-3-AlSrZakn78oeSvjRSIkIhSNiCFHbsKN3c50b02mURvEB-yD296eNOzzoRMRLRZ98rkHS_veGcC_nR-fGe1gaCaxihhjOI2lX0BhniHaA" } ] } Changed in version 0.11.0: added include_docs parameter Changed in version 1.2.0: added view parameter and special value _view for filter one Changed in version 1.3.0: since parameter could take now value to start listen changes since current seq number. Changed in version 1.3.0: eventsource feed type added. Changed in version 1.4.0: Support Last-Event-ID header. Changed in version 1.6.0: added attachments and att_encoding_info parameters Changed in version 2.0.0: update sequences can be any valid json object, added seq_interval NOTE: If the specified replicas of the shards in any given since value are unavailable, alternative replicas are selected, and the last known checkpoint between them is used. If this happens, you might see changes again that you have previously seen. Therefore, an applica- tion making use of the _changes feed should be idempotent, that is, able to receive the same data multiple times, safely. NOTE: Cloudant Sync and PouchDB already optimize the replication process by setting seq_interval parameter to the number of results expected per batch. This parameter increases throughput by reducing latency between sequential requests in bulk document transfers. This has re- sulted in up to a 20% replication performance improvement in highly-sharded databases. WARNING: Using the attachments parameter to include attachments in the changes feed is not recommended for large attachment sizes. Also note that the Base64-encoding that is used leads to a 33% overhead (i.e. one third) in transfer size for attachments. WARNING: The results returned by _changes are partially ordered. In other words, the order is not guaranteed to be preserved for multiple calls. 
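The note above recommends that consumers of the _changes feed be idempotent and able to resume from a stored sequence. The following polling loop is a minimal sketch of that pattern; it is not part of the official documentation and assumes a database named db, the admin credentials used in the other examples, and that the jq utility is installed.

   #!/bin/sh
   # Poll the changes feed, resuming from the last sequence we have already processed.
   SINCE=$(cat last_seq 2>/dev/null || echo 0)
   while true; do
       RESP=$(curl -s "http://adm:pass@localhost:5984/db/_changes?since=${SINCE}&limit=100")
       echo "$RESP" | jq -c '.results[]'          # process each change (must be idempotent)
       SINCE=$(echo "$RESP" | jq -r '.last_seq')  # remember where we stopped
       echo "$SINCE" > last_seq
       sleep 5
   done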
POST /{db}/_changes Requests the database changes feed in the same way as GET /{db}/_changes does, but is widely used with ?filter=_doc_ids or ?filter=_selector query parameters and allows one to pass a larger list of document IDs or the body of the selector to fil- ter. Parameters • db Database name Query Parameters • filter (string) .INDENT 2.0 • _doc_ids doc_ids filter • _selector selector filter Request: POST /recipes/_changes?filter=_doc_ids HTTP/1.1 Accept: application/json Content-Length: 40 Content-Type: application/json Host: localhost:5984 { "doc_ids": [ "SpaghettiWithMeatballs" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 28 Sep 2013 07:23:09 GMT ETag: "ARIHFWL3I7PIS0SPVTFU6TLR2" Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked { "last_seq": "5-g1AAAAIreJyVkEsKwjAURZ-toI5cgq5A0sQ0OrI70XyppcaRY92J7kR3ojupaSPUUgotgRd4yTlwbw4A0zRUMLdnpaMkwmyF3Ily9xBwEIuiKLI05KOTW0wkV4rruP29UyGWbordzwKVxWBNOGMKZhertDlarbr5pOT3DV4gudUC9-MPJX9tpEAYx4TQASns2E24ucuJ7rXJSL1BbEgf3vTwpmedCZkYa7Pulck7Xt7x_usFU2aIHOD4eEfVTVA5KMGUkqhNZV8_o5i", "pending": 0, "results": [ { "changes": [ { "rev": "13-bcb9d6388b60fd1e960d9ec4e8e3f29e" } ], "id": "SpaghettiWithMeatballs", "seq": "5-g1AAAAIReJyVkE0OgjAQRkcwUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloRid3MMkEUoJHbXbOxVy6arc_SxQWQzRVHCuYHaxSpuj1aqbj0t-3-AlSrZakn78oeSvjRSIkIhSNiCFHbsKN3c50b02mURvEB-yD296eNOzzoRMRLRZ98rkHS_veGcC_nR-fGe1gaCaxihhjOI2lX0BhniHaA" } ] } Request: POST /db/_changes?filter=_selector HTTP/1.1 Accept: application/json Accept-Encoding: gzip, deflate Content-Length: 25 Content-Type: application/json Host: 127.0.0.1:5984 { "selector": { "data": 1 } } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Fri, 05 Jan 2024 18:08:46 GMT ETag: "9UTJJV90GMV3XQKBM9RNAS0IK" Server: CouchDB/3.3.3-42c2484 (Erlang OTP/24) Transfer-Encoding: chunked { "last_seq": "4-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE5lzgQLshqkGSWmGyZjKcRqRxwIkGRqA1H-oSYxgk0ySLSxSEi0wdWUBAGlCJKQ", "pending": 0, "results": [ { "changes": [ { "rev": "3-fc9d7a5cf38c9f062aa246cb072eae68" } ], "id": "d1", "seq": "4-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE5lzgQLshqkGSWmGyZjKcRqRxwIkGRqA1H-oSYxgk0ySLSxSEi0wdWUBAGlCJKQ" } ] } Changes Feeds Polling By default all changes are immediately returned within the JSON body: GET /somedatabase/_changes HTTP/1.1 {"results":[ {"seq":"1-g1AAAAF9eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P__7MSGXAqSVIAkkn2IFUZzIkMuUAee5pRqnGiuXkKA2dpXkpqWmZeagpu_Q4g_fGEbEkAqaqH2sIItsXAyMjM2NgUUwdOU_JYgCRDA5ACGjQfn30QlQsgKvcTVnkAovI-YZUPICpBvs0CAN1eY_c","id":"fresh","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]}, {"seq":"3-g1AAAAG3eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MSGXAqSVIAkkn2IFUZzIkMuUAee5pRqnGiuXkKA2dpXkpqWmZeagpu_Q4g_fGEbEkAqaqH2sIItsXAyMjM2NgUUwdOU_JYgCRDA5ACGjQfn30QlQsgKvcjfGaQZmaUmmZClM8gZhyAmHGfsG0PICrBPmQC22ZqbGRqamyIqSsLAAArcXo","id":"updated","changes":[{"rev":"2-7051cbe5c8faecd085a3fa619e6e6337CFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloRid3MMkEUoJHbXbOxVy6arc_SxQWQzRVHCuYHaxSpuj1aqbj0t-3-AlSrZakn78oeSvjRSIkIhSNiCFHbsKN3c50b02mURvEB-yD296eNOzzoRMRLRZ98rkHS_veGcC_nR-fGe1gaCaxihhjOI2lX0BhniHaA","id":"deleted","changes":[{"rev":"2-eec205a9d413992850a6e32678485900"}],"deleted":true} ], 
"last_seq":"5-g1AAAAIreJyVkEsKwjAURZ-toI5cgq5A0sQ0OrI70XyppcaRY92J7kR3ojupaSPUUgotgRd4yTlwbw4A0zRUMLdnpaMkwmyF3Ily9xBwEIuiKLI05KOTW0wkV4rruP29UyGWbordzwKVxWBNOGMKZhertDlarbr5pOT3DV4gudUC9-MPJX9tpEAYx4TQASns2E24ucuJ7rXJSL1BbEgf3vTwpmedCZkYa7Pulck7Xt7x_usFU2aIHOD4eEfVTVA5KMGUkqhNZV-8_o5i", "pending": 0} results is the list of changes in sequential order. New and changed documents only differ in the value of the rev; deleted documents in- clude the "deleted": true attribute. (In the style=all_docs mode, deleted applies only to the current/winning revision. The other revi- sions listed might be deleted even if there is no deleted property; you have to GET them individually to make sure.) last_seq is the update sequence of the last update returned (Equivalent to the last item in the results). Sending a since param in the query string skips all changes up to and including the given update sequence: GET /somedatabase/_changes?since=4-g1AAAAHXeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBMZc4EC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HqQ_kQG3qgSQqnoUtxoYGZkZG5uS4NY8FiDJ0ACkgAbNx2cfROUCiMr9CJ8ZpJkZpaaZEOUziBkHIGbcJ2zbA4hKsA-ZwLaZGhuZmhobYurKAgCz33kh HTTP/1.1 The return structure for normal and longpoll modes is a JSON array of changes objects, and the last update sequence. In the return format for continuous mode, the server sends a CRLF (car- riage-return, linefeed) delimited line for each change. Each line con- tains the JSON object described above. You can also request the full contents of each document change (instead of just the change notification) by using the include_docs parameter. { "last_seq": "5-g1AAAAIreJyVkEsKwjAURZ-toI5cgq5A0sQ0OrI70XyppcaRY92J7kR3ojupaSPUUgotgRd4yTlwbw4A0zRUMLdnpaMkwmyF3Ily9xBwEIuiKLI05KOTW0wkV4rruP29UyGWbordzwKVxWBNOGMKZhertDlarbr5pOT3DV4gudUC9-MPJX9tpEAYx4TQASns2E24ucuJ7rXJSL1BbEgf3vTwpmedCZkYa7Pulck7Xt7x_usFU2aIHOD4eEfVTVA5KMGUkqhNZV-8_o5i", "pending": 0, "results": [ { "changes": [ { "rev": "2-eec205a9d413992850a6e32678485900" } ], "deleted": true, "id": "deleted", "seq": "5-g1AAAAIReJyVkE0OgjAQRkcwUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloRid3MMkEUoJHbXbOxVy6arc_SxQWQzRVHCuYHaxSpuj1aqbj0t-3-AlSrZakn78oeSvjRSIkIhSNiCFHbsKN3c50b02mURvEByD296eNOzzoRMRLRZ98rkHS_veGcC_nR-fGe1gaCaxihhjOI2lX0BhniHaA", } ] } Long Polling The longpoll feed, probably most applicable for a browser, is a more efficient form of polling that waits for a change to occur before the response is sent. longpoll avoids the need to frequently poll CouchDB to discover nothing has changed! The request to the server will remain open until a change is made on the database and is subsequently transferred, and then the connection will close. This is low load for both server and client. The response is basically the same JSON as is sent for the normal feed. Because the wait for a change can be significant you can set a timeout before the connection is automatically closed (the timeout argument). You can also set a heartbeat interval (using the heartbeat query argu- ment), which sends a newline to keep the connection active. Keep in mind that heartbeat means Send a linefeed every x ms if no change arrives, and hold the connection indefinitely while timeout means Hold this connection open for x ms, and if no change arrives in that time, close the socket. heartbeat overrides timeout. 
Continuous Continually polling the CouchDB server is not ideal - setting up new HTTP connections just to tell the client that nothing happened puts un- necessary strain on CouchDB. A continuous feed stays open and connected to the database until ex- plicitly closed and changes are sent to the client as they happen, i.e. in near real-time. As with the longpoll feed type you can set both the timeout and heart- beat intervals to ensure that the connection is kept open for new changes and updates. Keep in mind that heartbeat means Send a linefeed every x ms if no change arrives, and hold the connection indefinitely while timeout means Hold this connection open for x ms, and if no change arrives in that time, close the socket. heartbeat overrides timeout. The continuous feeds response is a little different than the other feed types to simplify the job of the client - each line of the response is either empty or a JSON object representing a single change, as found in the normal feeds results. If limit has been specified the feed will end with a { last_seq } ob- ject. GET /somedatabase/_changes?feed=continuous HTTP/1.1 {"seq":"1-g1AAAAF9eJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MSGXAqSVIAkkn2IFUZzIkMuUAee5pRqnGiuXkKA2dpXkpqWmZeagpu_Q4g_fGEbEkAqaqH2sIItsXAyMjM2NgUUwdOU_JYgCRDA5ACGjQfn30QlQsgKvcTVnkAovI-YZUPICpBvs0CAN1eY_c","id":"fresh","changes":[{"rev":"5-g1AAAAHxeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D666H6GcH6DYyMzIyNTUnwRR4LkGRoAFJAg-YjwiMtOdXCwJyU8ICYtABi0n6EnwzSzIxS00yI8hPEjAMQM-5nJTIQUPkAovI_UGUWAA0SgOI","id":"updated","changes":[{"rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}]} {"seq":"3-g1AAAAHReJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D660H6ExlwqspjAZIMDUAKqHA-yCZGiEuTUy0MzEnxL8SkBRCT9iPcbJBmZpSaZkKUmyFmHICYcZ-wux9AVIJ8mAUABgp6XQ","id":"deleted","changes":[{"rev":"2-eec205a9d413992850a6e32678485900"}],"deleted":true} ... tum tee tum ... {"seq":"6-g1AAAAIreJyVkEsKwjAURWMrqCOXoCuQ9MU0OrI70XyppcaRY92J7kR3ojupaVNopRQsgRd4yTlwb44QmqahQnN7VjpKImAr7E6Uu4eAI7EoiiJLQx6c3GIiuVJcx93vvQqxdFPsaguqLAY04YwpNLtYpc3RatXPJyW__-EFllst4D_-UPLXmh9VPAaICaEDUtixm-jmLie6N30YqTeYDenDmx7e9GwyYRODNuu_MnnHyzverV6AMkPkAMfHO1rdUAKUkqhLZV-_0o5j","id":"updated","changes":[{"rev":"3-825cb35de44c433bfb2df415563a19de"}]} Obviously, tum tee tum does not appear in the actual response, but represents a long pause before the change with seq 6 occurred. Event Source The eventsource feed provides push notifications that can be consumed in the form of DOM events in the browser. Refer to the W3C eventsource specification for further details. CouchDB also honours the Last-Event-ID parameter. 
GET /somedatabase/_changes?feed=eventsource HTTP/1.1

   // define the event handling function
   if (window.EventSource) {
       var source = new EventSource("/somedatabase/_changes?feed=eventsource");
       source.onerror = function(e) {
           alert('EventSource failed.');
       };

       var results = [];
       var sourceListener = function(e) {
           var data = JSON.parse(e.data);
           results.push(data);
       };

       // start listening for events
       source.addEventListener('message', sourceListener, false);

       // stop listening for events
       source.removeEventListener('message', sourceListener, false);
   }

If you set a heartbeat interval (using the heartbeat query argument), CouchDB will send a heartbeat event that you can subscribe to with:

   source.addEventListener('heartbeat', function () {}, false);

This can be monitored by the client application to restart the EventSource connection if needed (i.e. if the TCP connection gets stuck in a half-open state).

NOTE:
   EventSource connections are subject to cross-origin resource sharing restrictions. You might need to configure CORS support to get the EventSource to work in your application.

Filtering
   You can filter the contents of the changes feed in a number of ways. The most basic way is to specify one or more document IDs to the query. This causes the returned structure to only contain changes for the specified IDs. Note that the value of this query argument should be a JSON-formatted array.

   You can also filter the _changes feed by defining a filter function within a design document. The specification for the filter is the same as for replication filters. You pass the name of the filter function to the filter parameter, specifying the design document name and the filter name. For example:

      GET /db/_changes?filter=design_doc/filtername HTTP/1.1

   Additionally, a couple of built-in filters are available and described below.

   _doc_ids
      This filter accepts only changes for documents whose ID is specified in the doc_ids query parameter or in the doc_ids array of the request body. See POST /{db}/_changes for an example.

   _selector
      Added in version 2.0.

      This filter accepts only changes for documents which match a specified selector, defined using the same selector syntax used for _find.

      This is significantly more efficient than using a JavaScript filter function and is the recommended option if filtering on document attributes only.

      Note that, unlike JavaScript filters, selectors do not have access to the request object.
Request: POST /recipes/_changes?filter=_selector HTTP/1.1 Content-Type: application/json Host: localhost:5984 { "selector": { "_id": { "$regex": "^_design/" } } } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Tue, 06 Sep 2016 20:03:23 GMT Etag: "1H8RGBCK3ABY6ACDM7ZSC30QK" Server: CouchDB (Erlang OTP/18) Transfer-Encoding: chunked { "last_seq": "11-g1AAAAIreJyVkEEKwjAQRUOrqCuPoCeQZGIaXdmbaNIk1FLjyrXeRG-iN9Gb1LQRaimFlsAEJnkP_s8RQtM0VGhuz0qTmABfYXdI7h4CgeSiKIosDUVwcotJIpQSOmp_71TIpZty97OgymJAU8G5QrOLVdocrVbdfFzy-wYvcbLVEvrxh5K_NlJggIhSNiCFHbmJbu5yonttMoneYD6kD296eNOzzoRNBNqse2Xyjpd3vP96AcYNTQY4Pt5RdTOuHIwCY5S0qewLwY6OaA", "pending": 0, "results": [ { "changes": [ { "rev": "10-304cae84fd862832ea9814f02920d4b2" } ], "id": "_design/ingredients", "seq": "8-g1AAAAHxeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D666H6GcH6DYyMzIyNTUnwRR4LkGRoAFJAg-ZnJTIQULkAonI_ws0GaWZGqWkmRLkZYsYBiBn3Cdv2AKIS7ENWsG2mxkampsaGmLqyAOYpgEo" }, { "changes": [ { "rev": "123-6f7c1b7c97a9e4f0d22bdf130e8fd817" } ], "deleted": true, "id": "_design/cookbook", "seq": "9-g1AAAAHxeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D661F8YWBkZGZsbEqCL_JYgCRDA5ACGjQ_K5GBgMoFEJX7EW42SDMzSk0zIcrNEDMOQMy4T9i2BxCVYB-ygm0zNTYyNTU2xNSVBQDnK4BL" }, { "changes": [ { "rev": "6-5b8a52c22580e922e792047cff3618f3" } ], "deleted": true, "id": "_design/meta", "seq": "11-g1AAAAIReJyVkE0OgjAQRiegUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloQhO7mGSCKWEjtrtnQq5dFXufhaoLIZoKjhXMLtYpc3RatXNxyW_b_ASJVstST_-UPLXRgpESEQpG5DCjlyFm7uc6F6bTKI3iA_Zhzc9vOlZZ0ImItqse2Xyjpd3vDMBfzo_vrPawLiaxihhjOI2lX0BirqHbg" } ] } Missing selector If the selector object is missing from the request body, the error mes- sage is similar to the following example: { "error": "bad request", "reason": "Selector must be specified in POST payload" } Not a valid JSON object If the selector object is not a well-formed JSON object, the error mes- sage is similar to the following example: { "error": "bad request", "reason": "Selector error: expected a JSON object" } Not a valid selector If the selector object does not contain a valid selection expression, the error message is similar to the following example: { "error": "bad request", "reason": "Selector error: expected a JSON object" } _design The _design filter accepts only changes for any design document within the requested database. 
Request: GET /recipes/_changes?filter=_design HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Tue, 06 Sep 2016 12:55:12 GMT ETag: "ARIHFWL3I7PIS0SPVTFU6TLR2" Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked { "last_seq": "11-g1AAAAIreJyVkEEKwjAQRUOrqCuPoCeQZGIaXdmbaNIk1FLjyrXeRG-iN9Gb1LQRaimFlsAEJnkP_s8RQtM0VGhuz0qTmABfYXdI7h4CgeSiKIosDUVwcotJIpQSOmp_71TIpZty97OgymJAU8G5QrOLVdocrVbdfFzy-wYvcbLVEvrxh5K_NlJggIhSNiCFHbmJbu5yonttMoneYD6kD296eNOzzoRNBNqse2Xyjpd3vP96AcYNTQY4Pt5RdTOuHIwCY5S0qewLwY6OaA", "pending": 0, "results": [ { "changes": [ { "rev": "10-304cae84fd862832ea9814f02920d4b2" } ], "id": "_design/ingredients", "seq": "8-g1AAAAHxeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D666H6GcH6DYyMzIyNTUnwRR4LkGRoAFJAg-ZnJTIQULkAonI_ws0GaWZGqWkmRLkZYsYBiBn3Cdv2AKIS7ENWsG2mxkampsaGmLqyAOYpgEo" }, { "changes": [ { "rev": "123-6f7c1b7c97a9e4f0d22bdf130e8fd817" } ], "deleted": true, "id": "_design/cookbook", "seq": "9-g1AAAAHxeJzLYWBg4MhgTmHgz8tPSTV0MDQy1zMAQsMcoARTIkOS_P___7MymBOZcoEC7MmJKSmJqWaYynEakaQAJJPsoaYwgE1JM0o1TjQ3T2HgLM1LSU3LzEtNwa3fAaQ_HkV_kkGyZWqSEXH6E0D661F8YWBkZGZsbEqCL_JYgCRDA5ACGjQ_K5GBgMoFEJX7EW42SDMzSk0zIcrNEDMOQMy4T9i2BxCVYB-ygm0zNTYyNTU2xNSVBQDnK4BL" }, { "changes": [ { "rev": "6-5b8a52c22580e922e792047cff3618f3" } ], "deleted": true, "id": "_design/meta", "seq": "11-g1AAAAIReJyVkE0OgjAQRiegUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloQhO7mGSCKWEjtrtnQq5dFXufhaoLIZoKjhXMLtYpc3RatXNxyW_b_ASJVstST_-UPLXRgpESEQpG5DCjlyFm7uc6F6bTKI3iA_Zhzc9vOlZZ0ImItqse2Xyjpd3vDMBfzo_vrPawLiaxihhjOI2lX0BirqHbg" } ] } _view Added in version 1.2. The special filter _view allows to use existing map function as the filter. If the map function emits anything for the processed document it counts as accepted and the changes event emits to the feed. For most use-practice cases filter functions are very similar to map ones, so this feature helps to reduce amount of duplicated code. WARNING: While map functions doesnt process the design documents, using _view filter forces them to do this. You need to be sure, that they are ready to handle documents with alien structure without panic. NOTE: Using _view filter doesnt query the view index files, so you cannot use common view query parameters to additionally filter the changes feed by index key. Also, CouchDB doesnt returns the result instantly as it does for views - it really uses the specified map function as filter. Moreover, you cannot make such filters dynamic e.g. process the re- quest query parameters or handle the User Context Object - the map function only operates with the document. 
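To make the behaviour above concrete, here is a hedged sketch of installing a map function that can then be used as a changes filter. The design document name ingredients and view name by_recipe are chosen to match the request that follows; the map function body (keying on a doc.ingredients field) is an illustrative assumption.

   # Install a design document whose map function emits only documents that have ingredients
   curl -s -X PUT http://adm:pass@localhost:5984/recipes/_design/ingredients \
        -H 'Content-Type: application/json' \
        -d '{"views": {"by_recipe": {"map": "function (doc) { if (doc.ingredients) { emit(doc._id, null); } }"}}}'

The request shown next then selects changes through this view with filter=_view.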
Request: GET /recipes/_changes?filter=_view&view=ingredients/by_recipe HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Tue, 06 Sep 2016 12:57:56 GMT ETag: "ARIHFWL3I7PIS0SPVTFU6TLR2" Server: CouchDB (Erlang OTP) Transfer-Encoding: chunked { "last_seq": "11-g1AAAAIreJyVkEEKwjAQRUOrqCuPoCeQZGIaXdmbaNIk1FLjyrXeRG-iN9Gb1LQRaimFlsAEJnkP_s8RQtM0VGhuz0qTmABfYXdI7h4CgeSiKIosDUVwcotJIpQSOmp_71TIpZty97OgymJAU8G5QrOLVdocrVbdfFzy-wYvcbLVEvrxh5K_NlJggIhSNiCFHbmJbu5yonttMoneYD6kD296eNOzzoRNBNqse2Xyjpd3vP96AcYNTQY4Pt5RdTOuHIwCY5S0qewLwY6OaA", "results": [ { "changes": [ { "rev": "13-bcb9d6388b60fd1e960d9ec4e8e3f29e" } ], "id": "SpaghettiWithMeatballs", "seq": "11-g1AAAAIReJyVkE0OgjAQRiegUVceQU9g-mOpruQm2tI2SLCuXOtN9CZ6E70JFmpCCCFCmkyTdt6bfJMDwDQNFcztWWkcY8JXyB2cu49AgFwURZGloQhO7mGSCKWEjtrtnQq5dFXufhaoLIZoKjhXMLtYpc3RatXNxyW_b_ASJVstST_-UPLXRgpESEQpG5DCjlyFm7uc6F6bTKI3iA_Zhzc9vOlZZ0ImItqse2Xyjpd3vDMBfzo_vrPawLiaxihhjOI2lX0BirqHbg" } ] } /{db}/_compact POST /{db}/_compact Request compaction of the specified database. Compaction com- presses the disk database file by performing the following oper- ations: • Writes a new, optimised, version of the database file, remov- ing any unused sections from the new version during write. Be- cause a new file is temporarily created for this purpose, you may require up to twice the current storage space of the spec- ified database in order for the compaction routine to com- plete. • Removes the bodies of any non-leaf revisions of documents from the database. • Removes old revision history beyond the limit specified by the _revs_limit database parameter. Compaction can only be requested on an individual database; you cannot compact all the databases for a CouchDB instance. The compaction process runs as a background process. You can determine if the compaction process is operating on a database by obtaining the database meta information, the com- pact_running value of the returned database structure will be set to true. See GET /{db}. You can also obtain a list of running processes to determine whether compaction is currently running. See /_active_tasks. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 202 Accepted Compaction request has been accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 415 Unsupported Media Type Bad Content-Type value Request: POST /db/_compact HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 09:27:43 GMT Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_compact/{ddoc} POST /{db}/_compact/{ddoc} Compacts the view indexes associated with the specified design document. It may be that compacting a large view can return more storage than compacting the actual db. Thus, you can use this in place of the full database compaction if you know a spe- cific set of view indexes have been affected by a recent data- base change. See Manual View Compaction for details. 
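The compact_running flag described under /{db}/_compact above can be polled to see when compaction has finished. A small sketch follows, assuming a database named db, the admin credentials used elsewhere in this document, and the jq utility.

   # Trigger compaction, then wait until the database info no longer reports it as running
   curl -s -X POST http://adm:pass@localhost:5984/db/_compact -H 'Content-Type: application/json'
   while [ "$(curl -s http://adm:pass@localhost:5984/db | jq -r '.compact_running')" = "true" ]; do
       sleep 5
   done
   echo "compaction finished"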
Parameters • db Database name • ddoc Design document name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 202 Accepted Compaction request has been accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 404 Not Found Design document not found • 415 Unsupported Media Type Bad Content-Type value Request: POST /db/_compact/ddoc HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 09:36:44 GMT Server: CouchDB (Erlang/OTP) { "ok": true } NOTE: View indexes are stored in a separate .couch file based on a hash of the design documents relevant functions, in a sub di- rectory of where the main .couch database files are located. /{db}/_ensure_full_commit POST /{db}/_ensure_full_commit Changed in version 3.0.0: Deprecated; endpoint is a no-op. Before 3.0 this was used to commit recent changes to the data- base in case the delayed_commits=true option was set. That op- tion is always false now, so commits are never delayed. However, this endpoint is kept for compatibility with older replicators. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • instance_start_time (string) Always "0". (Returned for legacy reasons.) • ok (boolean) Operation status Status Codes • 201 Created Commit completed successfully • 400 Bad Request Invalid database name • 415 Unsupported Media Type Bad Content-Type value Request: POST /db/_ensure_full_commit HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 53 Content-Type: application/json Date: Mon, 12 Aug 2013 10:22:19 GMT Server: CouchDB (Erlang/OTP) { "instance_start_time": "0", "ok": true } /{db}/_view_cleanup POST /{db}/_view_cleanup Removes view index files that are no longer required by CouchDB as a result of changed views within design documents. As the view filename is based on a hash of the view functions, over time old views will remain, consuming storage. This call cleans up the cached view output on disk for a given view. 
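A typical maintenance sequence after changing view definitions, sketched with assumed names (database db, design document ddoc), is to compact the design document's indexes and then remove index files that are no longer referenced by any design document:

   # Compact the remaining indexes for this design document
   curl -s -X POST http://adm:pass@localhost:5984/db/_compact/ddoc -H 'Content-Type: application/json'
   # Delete index files left over from old view definitions
   curl -s -X POST http://adm:pass@localhost:5984/db/_view_cleanup -H 'Content-Type: application/json'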
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 202 Accepted Compaction request has been accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired • 415 Unsupported Media Type Bad Content-Type value Request: POST /db/_view_cleanup HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 09:27:43 GMT Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_search_cleanup POST /{db}/_search_cleanup Requests deletion of unreachable search (Clouseau) indexes of the specified database. The signatures for all current design documents is retrieved and any index found on disk with a signa- ture that is not in that list is deleted. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 202 Accepted Cleanup request has been accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: POST /db/_search_cleanup HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_nouveau_cleanup POST /{db}/_nouveau_cleanup Requests deletion of unreachable search (Nouveau) indexes of the specified database. The signatures for all current design docu- ments is retrieved and any index found on disk with a signature that is not in that list is deleted. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 202 Accepted Cleanup request has been accepted • 400 Bad Request Invalid database name • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: POST /db/_nouveau_cleanup HTTP/1.1 Accept: application/json Content-Type: application/json Host: localhost:5984 Response: HTTP/1.1 202 Accepted Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_security GET /{db}/_security Returns the current security object from the specified database. The security object consists of two compulsory elements, admins and members, which are used to specify the list of users and/or roles that have admin and members rights to the database respec- tively: • members: they can read all types of documents from the DB, and they can write (and edit) documents to the DB except for de- sign documents. • admins: they have all the privileges of members plus the priv- ileges: write (and edit) design documents, add/remove database admins and members and set the database revisions limit. They can not create a database nor delete a database. 
Both members and admins objects contain two array-typed fields: • names: List of CouchDB user names • roles: List of users roles Any additional fields in the security object are optional. The entire security object is made available to validation and other internal functions so that the database can control and limit functionality. If both the names and roles fields of either the admins or mem- bers properties are empty arrays, or are not existent, it means the database has no admins or members. Having no admins, only server admins (with the reserved _admin role) are able to update design documents and make other admin level changes. Having no members or roles, any user can write regular documents (any non-design document) and read documents from the database. Since CouchDB 3.x newly created databases have by default the _admin role to prevent unintentional access. If there are any member names or roles defined for a database, then only authenticated users having a matching name or role are allowed to read documents from the database (or do a GET /{db} call). NOTE: If the security object for a database has never been set, then the value returned will be empty. Also note, that security objects are not regular versioned documents (that is, they are not under MVCC rules). This is a design choice to speed up authorization checks (avoids tra- versing a databases documents B-Tree). Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • admins (object) Object with two fields as names and roles. See description above for more info. • members (object) Object with two fields as names and roles. See description above for more info. Status Codes • 200 OK Request completed successfully Request: GET /db/_security HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 109 Content-Type: application/json Date: Mon, 12 Aug 2013 19:05:29 GMT Server: CouchDB (Erlang/OTP) { "admins": { "names": [ "superuser" ], "roles": [ "admins" ] }, "members": { "names": [ "user1", "user2" ], "roles": [ "developers" ] } } PUT /{db}/_security Sets the security object for the given database. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • admins (object) Object with two fields as names and roles. See description above for more info. • members (object) Object with two fields as names and roles. See description above for more info. 
Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 200 OK Request completed successfully • 401 Unauthorized CouchDB Server Administrator privileges re- quired Request: shell> curl http://adm:pass@localhost:5984/pineapple/_security -X PUT -H 'content-type: application/json' -H 'accept: application/json' -d '{"admins":{"names":["superuser"],"roles":["admins"]},"members":{"names": ["user1","user2"],"roles": ["developers"]}}' PUT /db/_security HTTP/1.1 Accept: application/json Content-Length: 121 Content-Type: application/json Host: localhost:5984 { "admins": { "names": [ "superuser" ], "roles": [ "admins" ] }, "members": { "names": [ "user1", "user2" ], "roles": [ "developers" ] } } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Tue, 13 Aug 2013 11:26:28 GMT Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_purge POST /{db}/_purge A database purge permanently removes the references to documents in the database. Normal deletion of a document within CouchDB does not remove the document from the database, instead, the document is marked as _deleted=true (and a new revision is cre- ated). This is to ensure that deleted documents can be repli- cated to other databases as having been deleted. This also means that you can check the status of a document and identify that the document has been deleted by its absence. The purge request must include the document IDs, and for each document ID, one or more revisions that must be purged. Docu- ments can be previously deleted, but it is not necessary. Revi- sions must be leaf revisions. The response will contain a list of the document IDs and revi- sions successfully purged. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • object Mapping of document ID to list of revisions to purge Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • purge_seq (string) Purge sequence string • purged (object) Mapping of document ID to list of purged re- visions Status Codes • 201 Created Request completed successfully • 202 Accepted Request was accepted, and was completed success- fully on at least one replica, but quorum was not reached. • 400 Bad Request Invalid database name or JSON payload • 415 Unsupported Media Type Bad Content-Type value • 500 Internal Server Error Internal server error or timeout Request: POST /db/_purge HTTP/1.1 Accept: application/json Content-Length: 76 Content-Type: application/json Host: localhost:5984 { "c6114c65e295552ab1019e2b046b10e": [ "3-b06fcd1c1c9e0ec7c480ee8aa467bf3b", "3-c50a32451890a3f1c3e423334cc92745" ] } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 107 Content-Type: application/json Date: Fri, 02 Jun 2017 18:55:54 GMT Server: CouchDB/2.0.0-2ccd4bf (Erlang OTP/18) { "purge_seq": null, "purged": { "c6114c65e295552ab1019e2b046b10e": [ "3-c50a32451890a3f1c3e423334cc92745" ] } } [image: Document Revision Tree 1] [image] Document Revision Tree 1.UNINDENT For example, given the above purge tree and issuing the above purge request, the whole document will be purged, as it contains only a single branch with a leaf revision 3-c50a32451890a3f1c3e423334cc92745 that will be purged. 
As a result of this purge operation, a document with _id c6114c65e295552ab1019e2b046b10e will be completely removed from the database's document B-tree and sequence B-tree. It will not be available through the _all_docs or _changes endpoints, as though this document never existed. Also as a result of the purge operation, the database's purge_seq and update_seq will be increased.

Notice how revision 3-b06fcd1c1c9e0ec7c480ee8aa467bf3b was ignored. Revisions that have already been purged and non-leaf revisions are ignored in a purge request.

If a document has two conflicting revisions with the following revision history:

[image: Document Revision Tree 2]

the above purge request will purge only one branch, leaving the document's revision tree with only a single branch:

[image: Document Revision Tree 3]

As a result of this purge operation, a new updated version of the document will be available in _all_docs and _changes, creating a new record in _changes. The database's purge_seq and update_seq will be increased.

Internal Replication
Purges are automatically replicated between replicas of the same database. Each database has an internal purge tree that stores a certain number of the most recent purges. This allows internal synchronization between replicas of the same database.

External Replication
Purge operations are not replicated to other, external databases. External replication works by identifying a source's document revisions that are missing on the target, and copying these revisions from source to target. A purge operation completely removes revisions from a document's purge tree, making external replication of purges impossible.

NOTE:
A purge only takes effect on the database it is issued against. If you need a purge to be effective across multiple databases, you must run the purge separately on each of them.

Updating Indexes
The number of purges on a database is tracked using a purge sequence. This is used by the view indexer to optimize the updating of views that contain the purged documents.

Each internal database indexer, including the view indexer, keeps its own purge sequence. The purge sequence stored in the index can be smaller than the database's purge sequence, by up to the number of purge requests allowed to be stored in the purge trees of the database. Multiple purge requests can be processed by the indexer without incurring a rebuild of the index. The index will be updated according to these purge requests.

The index of documents is based on the winner of the revision tree. Depending on which revision is specified in the purge request, the index update observes the following behavior:

• If the winner of the revision tree is not specified in the purge request, there is no change to the index record of this document.
• If the winner of the revision tree is specified in the purge request, and there is still a revision left after purging, the index record of the document will be built according to the new winner of the revision tree.
• If all revisions of the document are specified in the purge request, the index record of the document will be deleted. The document will no longer be found in searches.

/{db}/_purged_infos
GET /{db}/_purged_infos
Get a list of purged document IDs and revisions stored in the database.
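For a quick look from the command line, a minimal curl sketch (the adm:pass credentials and the db database name are placeholders, matching the other shell examples on this page):

shell> curl http://adm:pass@localhost:5984/db/_purged_infos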
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid database name Request: GET /db/_purged_infos HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 75 Content-Type: application/json Date: Thu, 24 Aug 2023 20:56:06 GMT Server: CouchDB (Erlang/OTP) { "purged_infos": [ { "id": "doc_id", "revs": [ "1-85cfcb946ba8fea03ba81ec38a7a9998", "2-c6548393a891f2cec9c7755832ff9d6f" ] } ] } /{db}/_purged_infos_limit GET /{db}/_purged_infos_limit Gets the current purged_infos_limit (purged documents limit) setting, the maximum number of historical purges (purged docu- ment Ids with their revisions) that can be stored in the data- base. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /db/_purged_infos_limit HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 5 Content-Type: application/json Date: Wed, 14 Jun 2017 14:43:42 GMT Server: CouchDB (Erlang/OTP) 1000 PUT /{db}/_purged_infos_limit Sets the maximum number of purges (requested purged Ids with their revisions) that will be tracked in the database, even af- ter compaction has occurred. You can set the purged documents limit on a database with a scalar integer of the limit that you want to set as the request body. The default value of historical stored purges is 1000. This means up to 1000 purges can be synchronized between replicas of the same databases in case of one of the replicas was down when purges occurred. This request sets the soft limit for stored purges. During the compaction CouchDB will try to keep only _purged_infos_limit of purges in the database, but occasionally the number of stored purges can exceed this value. If a database has not completed purge synchronization with active indexes or active internal replications, it may temporarily store a higher number of his- torical purges. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid JSON data Request: PUT /db/_purged_infos_limit HTTP/1.1 Accept: application/json Content-Length: 4 Content-Type: application/json Host: localhost:5984 1500 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Wed, 14 Jun 2017 14:45:34 GMT Server: CouchDB (Erlang/OTP) { "ok": true } /{db}/_missing_revs POST /{db}/_missing_revs With given a list of document revisions, returns the document revisions that do not exist in the database. 
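A minimal curl sketch of such a lookup, assuming the adm:pass placeholder credentials and reusing the document ID and revisions from the example below:

shell> curl -X POST http://adm:pass@localhost:5984/db/_missing_revs \
    -H "Content-Type: application/json" \
    -d '{"c6114c65e295552ab1019e2b046b10e": ["3-b06fcd1c1c9e0ec7c480ee8aa467bf3b", "3-0e871ef78849b0c206091f1a7af6ec41"]}'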
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • object Mapping of document ID to list of revisions to lookup Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • missing_revs (object) Mapping of document ID to list of missed revisions Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid database name or JSON payload Request: POST /db/_missing_revs HTTP/1.1 Accept: application/json Content-Length: 76 Content-Type: application/json Host: localhost:5984 { "c6114c65e295552ab1019e2b046b10e": [ "3-b06fcd1c1c9e0ec7c480ee8aa467bf3b", "3-0e871ef78849b0c206091f1a7af6ec41" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 64 Content-Type: application/json Date: Mon, 12 Aug 2013 10:53:24 GMT Server: CouchDB (Erlang/OTP) { "missing_revs":{ "c6114c65e295552ab1019e2b046b10e": [ "3-b06fcd1c1c9e0ec7c480ee8aa467bf3b" ] } } /{db}/_revs_diff POST /{db}/_revs_diff Given a set of document/revision IDs, returns the subset of those that do not correspond to revisions stored in the data- base. Its primary use is by the replicator, as an important optimiza- tion: after receiving a set of new revision IDs from the source database, the replicator sends this set to the destination data- bases _revs_diff to find out which of them already exist there. It can then avoid fetching and sending already-known document bodies. Both the request and response bodies are JSON objects whose keys are document IDs; but the values are structured differently: • In the request, a value is an array of revision IDs for that document. • In the response, a value is an object with a missing: key, whose value is a list of revision IDs for that document (the ones that are not stored in the database) and optionally a possible_ancestors key, whose value is an array of revision IDs that are known that might be ancestors of the missing re- visions. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Request JSON Object • object Mapping of document ID to list of revisions to lookup Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • missing (array) List of missed revisions for specified docu- ment • possible_ancestors (array) List of revisions that may be an- cestors for specified document and its current revision in re- quested database Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid database name or JSON payload Request: POST /db/_revs_diff HTTP/1.1 Accept: application/json Content-Length: 113 Content-Type: application/json Host: localhost:5984 { "190f721ca3411be7aa9477db5f948bbb": [ "3-bb72a7682290f94a985f7afac8b27137", "4-10265e5a26d807a3cfa459cf1a82ef2e", "5-067a00dff5e02add41819138abb3284d" ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 88 Content-Type: application/json Date: Mon, 12 Aug 2013 16:56:02 GMT Server: CouchDB (Erlang/OTP) { "190f721ca3411be7aa9477db5f948bbb": { "missing": [ "3-bb72a7682290f94a985f7afac8b27137", "5-067a00dff5e02add41819138abb3284d" ], "possible_ancestors": [ "4-10265e5a26d807a3cfa459cf1a82ef2e" ] } } /{db}/_revs_limit GET /{db}/_revs_limit Gets the current revs_limit (revision limit) setting. 
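As a sketch, reading and setting the limit from the command line might look like this (adm:pass and the db database name are placeholders; the GET returns the limit as a bare integer):

shell> curl http://adm:pass@localhost:5984/db/_revs_limit
shell> curl -X PUT http://adm:pass@localhost:5984/db/_revs_limit \
    -H "Content-Type: application/json" -d '1000'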
Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Status Codes • 200 OK Request completed successfully Request: GET /db/_revs_limit HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 5 Content-Type: application/json Date: Mon, 12 Aug 2013 17:27:30 GMT Server: CouchDB (Erlang/OTP) 1000 PUT /{db}/_revs_limit Sets the maximum number of document revisions that will be tracked by CouchDB, even after compaction has occurred. You can set the revision limit on a database with a scalar integer of the limit that you want to set as the request body. Parameters • db Database name Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type application/json Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • ok (boolean) Operation status Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid JSON data Request: PUT /db/_revs_limit HTTP/1.1 Accept: application/json Content-Length: 5 Content-Type: application/json Host: localhost:5984 1000 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 12 Content-Type: application/json Date: Mon, 12 Aug 2013 17:47:52 GMT Server: CouchDB (Erlang/OTP) { "ok": true } Documents Details on how to create, read, update and delete documents within a database. /{db}/{docid} HEAD /{db}/{docid} Returns the HTTP Headers containing a minimal amount of informa- tion about the specified document. The method supports the same query arguments as the GET /{db}/{docid} method, but only the header information (including document size, and the revision as an ETag), is returned. The ETag header shows the current revision for the requested document, and the Content-Length specifies the length of the data, if the document were requested in full. Adding any of the query arguments (see GET /{db}/{docid}), then the resulting HTTP Headers will correspond to what would be re- turned. Parameters • db Database name • docid Document ID Request Headers • If-None-Match Double quoted documents revision token Response Headers • Content-Length Document size • ETag Double quoted documents revision token Status Codes • 200 OK Document exists • 304 Not Modified Document wasnt modified since speci- fied revision • 401 Unauthorized Read privilege required • 404 Not Found Document not found Request: HEAD /db/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 660 Content-Type: application/json Date: Tue, 13 Aug 2013 21:35:37 GMT ETag: "12-151bb8678d45aaa949ec3698ef1c7e78" Server: CouchDB (Erlang/OTP) GET /{db}/{docid} Returns document by the specified docid from the specified db. Unless you request a specific revision, the latest revision of the document will always be returned. Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • multipart/related • multipart/mixed • text/plain • If-None-Match Double quoted documents revision token Query Parameters • attachments (boolean) Includes attachments bodies in re- sponse. Default is false • att_encoding_info (boolean) Includes encoding information in attachment stubs if the particular attachment is compressed. Default is false. 
• atts_since (array) Includes attachments only since specified revisions. Doesnt includes attachments for specified revi- sions. Optional • conflicts (boolean) Includes information about conflicts in document. Default is false • deleted_conflicts (boolean) Includes information about deleted conflicted revisions. Default is false • latest (boolean) Forces retrieving latest leaf revision, no matter what rev was requested. Default is false • local_seq (boolean) Includes last update sequence for the document. Default is false • meta (boolean) Acts same as specifying all conflicts, deleted_conflicts and revs_info query parameters. Default is false • open_revs (array) Retrieves documents of specified leaf revi- sions. Additionally, it accepts value as all to return all leaf revisions. Optional • rev (string) Retrieves document of specified revision. Op- tional • revs (boolean) Includes list of all known document revisions. Default is false • revs_info (boolean) Includes detailed information for all known document revisions. Default is false Response Headers • Content-Type .INDENT 2.0 • application/json • multipart/related • multipart/mixed • text/plain; charset=utf-8 • ETag Double quoted documents revision token. Not available when re- trieving conflicts-related information • Transfer-Encoding chunked. Available if requested with query parame- ter open_revs Response JSON Object • _id (string) Document ID • _rev (string) Revision MVCC token • _deleted (boolean) Deletion flag. Available if document was removed • _attachments (object) Attachments stubs. Available if docu- ment has any attachments • _conflicts (array) List of conflicted revisions. Available if requested with conflicts=true query parameter • _deleted_conflicts (array) List of deleted conflicted revi- sions. Available if requested with deleted_conflicts=true query parameter • _local_seq (string) Documents update sequence in current database. Available if requested with local_seq=true query parameter • _revs_info (array) List of objects with information about lo- cal revisions and their status. Available if requested with open_revs query parameter • _revisions (object) List of local revision tokens without. Available if requested with revs=true query parameter Status Codes • 200 OK Request completed successfully • 304 Not Modified Document wasnt modified since specified re- vision • 400 Bad Request The format of the request or revision was in- valid • 401 Unauthorized Read privilege required • 404 Not Found Document not found Request: GET /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 660 Content-Type: application/json Date: Tue, 13 Aug 2013 21:35:37 GMT ETag: "1-917fa2381192822767f010b95b45325b" Server: CouchDB (Erlang/OTP) { "_id": "SpaghettiWithMeatballs", "_rev": "1-917fa2381192822767f010b95b45325b", "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } PUT /{db}/{docid} The PUT method creates a new named document, or creates a new revision of the existing document. Unlike the POST /{db}, you must specify the document ID in the request URL. When updating an existing document, the current document revi- sion must be included in the document (i.e. the request body), as the rev query parameter, or in the If-Match request header. 
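For illustration, a curl sketch of the two out-of-band alternatives (rev query parameter and If-Match header), using the document and revision token from the examples below. The minimal body shown would replace the stored document, since PUT stores the request body as the complete new revision:

shell> curl -X PUT 'http://adm:pass@localhost:5984/recipes/SpaghettiWithMeatballs?rev=1-917fa2381192822767f010b95b45325b' \
    -H "Content-Type: application/json" -d '{"name": "Spaghetti with meatballs"}'
shell> curl -X PUT http://adm:pass@localhost:5984/recipes/SpaghettiWithMeatballs \
    -H "If-Match: 1-917fa2381192822767f010b95b45325b" \
    -H "Content-Type: application/json" -d '{"name": "Spaghetti with meatballs"}'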
Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Content-Type .INDENT 2.0 • application/json • multipart/related • If-Match Documents revision. Alternative to rev query parameter or document key. Optional Query Parameters • rev (string) Documents revision if updating an existing docu- ment. Alternative to If-Match header or document key. Op- tional • batch (string) Stores document in batch mode. Possible val- ues: ok. Optional • new_edits (boolean) Prevents insertion of a conflicting docu- ment. Possible values: true (default) and false. If false, a well-formed _rev must be included in the document. new_ed- its=false is used by the replicator to insert documents into the target database even if that leads to the creation of con- flicts. Optional, The ``false`` value is intended for use only by the replicator. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • multipart/related • ETag Quoted documents new revision • Location Document URI Response JSON Object • id (string) Document ID • ok (boolean) Operation status • rev (string) Revision MVCC token Status Codes • 201 Created Document created and stored on disk • 202 Accepted Document data accepted, but not yet stored on disk • 400 Bad Request Invalid request body or parameters • 401 Unauthorized Write privileges required • 404 Not Found Specified database or document ID doesnt exists • 409 Conflict Document with the specified ID already exists or specified revision is not latest for target document Request: PUT /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Content-Length: 196 Content-Type: application/json Host: localhost:5984 { "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 85 Content-Type: application/json Date: Wed, 14 Aug 2013 20:31:39 GMT ETag: "1-917fa2381192822767f010b95b45325b" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs", "ok": true, "rev": "1-917fa2381192822767f010b95b45325b" } DELETE /{db}/{docid} Marks the specified document as deleted by adding a field _deleted with the value true. Documents with this field will not be returned within requests anymore, but stay in the database. You must supply the current (latest) revision, either by using the rev parameter or by using the If-Match header to specify the revision. NOTE: CouchDB doesnt completely delete the specified document. In- stead, it leaves a tombstone with very basic information about the document. The tombstone is required so that the delete action can be replicated across databases. SEE ALSO: Retrieving Deleted Documents Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • text/plain • If-Match Documents revision. Alternative to rev query parame- ter Query Parameters • rev (string) Actual documents revision • batch (string) Stores document in batch mode Possible values: ok. 
Optional Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Double quoted documents new revision Response JSON Object • id (string) Document ID • ok (boolean) Operation status • rev (string) Revision MVCC token Status Codes • 200 OK Document successfully removed • 202 Accepted Request was accepted, but changes are not yet stored on disk • 400 Bad Request Invalid request body or parameters • 401 Unauthorized Write privileges required • 404 Not Found Specified database or document ID doesnt exists • 409 Conflict Specified revision is not the latest for target document Request: DELETE /recipes/FishStew?rev=1-9c65296036141e575d32ba9c034dd3ee HTTP/1.1 Accept: application/json Host: localhost:5984 Alternatively, instead of rev query parameter you may use - If-Match header: DELETE /recipes/FishStew HTTP/1.1 Accept: application/json If-Match: 1-9c65296036141e575d32ba9c034dd3ee Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 71 Content-Type: application/json Date: Wed, 14 Aug 2013 12:23:13 GMT ETag: "2-056f5f44046ecafc08a2bc2b9c229e20" Server: CouchDB (Erlang/OTP) { "id": "FishStew", "ok": true, "rev": "2-056f5f44046ecafc08a2bc2b9c229e20" } COPY /{db}/{docid} The COPY (which is non-standard HTTP) copies an existing docu- ment to a new or existing document. Copying a document is only possible within the same database. The source document is specified on the request line, with the - Destination header of the request specifying the target docu- ment. Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • text/plain • Destination Destination document. Must contain the target document ID, and optionally the target document revision, if copying to an existing document. See Copying to an Existing Document. • If-Match Source documents revision. Alternative to rev query parameter Query Parameters • rev (string) Revision to copy from. Optional • batch (string) Stores document in batch mode Possible values: ok. Optional Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Double quoted documents new revision • Location Document URI Response JSON Object • id (string) Document document ID • ok (boolean) Operation status • rev (string) Revision MVCC token Status Codes • 201 Created Document successfully created • 202 Accepted Request was accepted, but changes are not yet stored on disk • 400 Bad Request Invalid request body or parameters • 401 Unauthorized Read or write privileges required • 404 Not Found Specified database, document ID or revision doesnt exists • 409 Conflict Document with the specified ID already exists or specified revision is not latest for target document Request: COPY /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Destination: SpaghettiWithMeatballs_Italian Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 93 Content-Type: application/json Date: Wed, 14 Aug 2013 14:21:00 GMT ETag: "1-e86fdf912560c2321a5fcefc6264e6d9" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs_Italian Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs_Italian", "ok": true, "rev": "1-e86fdf912560c2321a5fcefc6264e6d9" } Attachments If the document includes attachments, then the returned structure will contain a summary of the attachments associated with the document, but not the attachment data itself. 
The JSON for the returned document will include the _attachments field, with one or more attachment definitions. The _attachments object keys are attachments names while values are in- formation objects with next structure: • content_type (string): Attachment MIME type • data (string): Base64-encoded content. Available if attachment con- tent is requested by using the following query parameters: • attachments=true when querying a document • attachments=true&include_docs=true when querying a changes feed or a view • atts_since. • digest (string): Content hash digest. It starts with prefix which announce hash type (md5-) and continues with Base64-encoded hash di- gest • encoded_length (number): Compressed attachment size in bytes. Avail- able if content_type is in list of compressible types when the at- tachment was added and the following query parameters are specified: • att_encoding_info=true when querying a document • att_encoding_info=true&include_docs=true when querying a changes feed or a view • encoding (string): Compression codec. Available if content_type is in list of compressible types when the attachment was added and the fol- lowing query parameters are specified: • att_encoding_info=true when querying a document • att_encoding_info=true&include_docs=true when querying a changes feed or a view • length (number): Real attachment size in bytes. Not available if at- tachment content requested • revpos (number): Revision number when attachment was added • stub (boolean): Has true value if object contains stub info and no content. Otherwise omitted in response Basic Attachments Info Request: GET /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 660 Content-Type: application/json Date: Tue, 13 Aug 2013 21:35:37 GMT ETag: "5-fd96acb3256302bf0dd2f32713161f2a" Server: CouchDB (Erlang/OTP) { "_attachments": { "grandma_recipe.txt": { "content_type": "text/plain", "digest": "md5-Ids41vtv725jyrN7iUvMcQ==", "length": 1872, "revpos": 4, "stub": true }, "my_recipe.txt": { "content_type": "text/plain", "digest": "md5-198BPPNiT5fqlLxoYYbjBA==", "length": 85, "revpos": 5, "stub": true }, "photo.jpg": { "content_type": "image/jpeg", "digest": "md5-7Pv4HW2822WY1r/3WDbPug==", "length": 165504, "revpos": 2, "stub": true } }, "_id": "SpaghettiWithMeatballs", "_rev": "5-fd96acb3256302bf0dd2f32713161f2a", "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } Retrieving Attachments Content Its possible to retrieve document with all attached files content by using attachments=true query parameter: Request: GET /db/pixel?attachments=true HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 553 Content-Type: application/json Date: Wed, 14 Aug 2013 11:32:40 GMT ETag: "4-f1bcae4bf7bbb92310079e632abfe3f4" Server: CouchDB (Erlang/OTP) { "_attachments": { "pixel.gif": { "content_type": "image/gif", "data": "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7", "digest": "md5-2JdGiI2i2VELZKnwMers1Q==", "revpos": 2 }, "pixel.png": { "content_type": "image/png", "data": 
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAAXNSR0IArs4c6QAAAANQTFRFAAAAp3o92gAAAAF0Uk5TAEDm2GYAAAABYktHRACIBR1IAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAB3RJTUUH3QgOCx8VHgmcNwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=", "digest": "md5-Dgf5zxgGuchWrve73evvGQ==", "revpos": 3 } }, "_id": "pixel", "_rev": "4-f1bcae4bf7bbb92310079e632abfe3f4" } Or retrieve attached files content since specific revision using atts_since query parameter: Request: GET /recipes/SpaghettiWithMeatballs?atts_since=[%224-874985bc28906155ba0e2e0538f67b05%22] HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 760 Content-Type: application/json Date: Tue, 13 Aug 2013 21:35:37 GMT ETag: "5-fd96acb3256302bf0dd2f32713161f2a" Server: CouchDB (Erlang/OTP) { "_attachments": { "grandma_recipe.txt": { "content_type": "text/plain", "digest": "md5-Ids41vtv725jyrN7iUvMcQ==", "length": 1872, "revpos": 4, "stub": true }, "my_recipe.txt": { "content_type": "text/plain", "data": "MS4gQ29vayBzcGFnaGV0dGkKMi4gQ29vayBtZWV0YmFsbHMKMy4gTWl4IHRoZW0KNC4gQWRkIHRvbWF0byBzYXVjZQo1LiAuLi4KNi4gUFJPRklUIQ==", "digest": "md5-198BPPNiT5fqlLxoYYbjBA==", "revpos": 5 }, "photo.jpg": { "content_type": "image/jpeg", "digest": "md5-7Pv4HW2822WY1r/3WDbPug==", "length": 165504, "revpos": 2, "stub": true } }, "_id": "SpaghettiWithMeatballs", "_rev": "5-fd96acb3256302bf0dd2f32713161f2a", "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } Efficient Multiple Attachments Retrieving As noted above, retrieving document with attachments=true returns a large JSON object with all attachments included. When your document and files are smaller its ok, but if you have attached something bigger like media files (audio/video), parsing such response might be very ex- pensive. To solve this problem, CouchDB allows to get documents in multipart/re- lated format: Request: GET /recipes/secret?attachments=true HTTP/1.1 Accept: multipart/related Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Length: 538 Content-Type: multipart/related; boundary="e89b3e29388aef23453450d10e5aaed0" Date: Sat, 28 Sep 2013 08:08:22 GMT ETag: "2-c1c6c44c4bc3c9344b037c8690468605" Server: CouchDB (Erlang OTP) --e89b3e29388aef23453450d10e5aaed0 Content-Type: application/json {"_id":"secret","_rev":"2-c1c6c44c4bc3c9344b037c8690468605","_attachments":{"recipe.txt":{"content_type":"text/plain","revpos":2,"digest":"md5-HV9aXJdEnu0xnMQYTKgOFA==","length":86,"follows":true}}} --e89b3e29388aef23453450d10e5aaed0 Content-Disposition: attachment; filename="recipe.txt" Content-Type: text/plain Content-Length: 86 1. Take R 2. Take E 3. Mix with L 4. Add some A 5. Serve with X --e89b3e29388aef23453450d10e5aaed0-- In this response the document contains only attachments stub informa- tion and quite short while all attachments goes as separate entities which reduces memory footprint and processing overhead (youd noticed, that attachment content goes as raw data, not in base64 encoding, right?). Retrieving Attachments Encoding Info By using att_encoding_info=true query parameter you may retrieve infor- mation about compressed attachments size and used codec. 
Request:

GET /recipes/SpaghettiWithMeatballs?att_encoding_info=true HTTP/1.1
Accept: application/json
Host: localhost:5984

Response:

HTTP/1.1 200 OK
Cache-Control: must-revalidate
Content-Length: 736
Content-Type: application/json
Date: Tue, 13 Aug 2013 21:35:37 GMT
ETag: "5-fd96acb3256302bf0dd2f32713161f2a"
Server: CouchDB (Erlang/OTP)

{
    "_attachments": {
        "grandma_recipe.txt": {
            "content_type": "text/plain",
            "digest": "md5-Ids41vtv725jyrN7iUvMcQ==",
            "encoded_length": 693,
            "encoding": "gzip",
            "length": 1872,
            "revpos": 4,
            "stub": true
        },
        "my_recipe.txt": {
            "content_type": "text/plain",
            "digest": "md5-198BPPNiT5fqlLxoYYbjBA==",
            "encoded_length": 100,
            "encoding": "gzip",
            "length": 85,
            "revpos": 5,
            "stub": true
        },
        "photo.jpg": {
            "content_type": "image/jpeg",
            "digest": "md5-7Pv4HW2822WY1r/3WDbPug==",
            "length": 165504,
            "revpos": 2,
            "stub": true
        }
    },
    "_id": "SpaghettiWithMeatballs",
    "_rev": "5-fd96acb3256302bf0dd2f32713161f2a",
    "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.",
    "ingredients": [
        "spaghetti",
        "tomato sauce",
        "meatballs"
    ],
    "name": "Spaghetti with meatballs"
}

Creating Multiple Attachments
To create a document with multiple attachments in a single request, simply inline the base64-encoded attachment data in the document body:

{
    "_id": "multiple_attachments",
    "_attachments": {
        "foo.txt": {
            "content_type": "text/plain",
            "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
        },
        "bar.txt": {
            "content_type": "text/plain",
            "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
        }
    }
}

Alternatively, you can upload a document with attachments more efficiently in multipart/related format. This avoids having to Base64-encode the attachments, saving CPU and bandwidth. To do this, set the Content-Type header of the PUT /{db}/{docid} request to multipart/related.

The first MIME body is the document itself, which should have its own Content-Type of application/json. It should also include an _attachments metadata object in which each attachment object has a key follows with value true. The subsequent MIME bodies are the attachments.
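As a sketch, the inline base64 variant shown above can be sent with a single curl call (the temp database name is a placeholder, and only one attachment is kept for brevity); the multipart request that follows shows the more efficient alternative:

shell> curl -X PUT http://adm:pass@localhost:5984/temp/multiple_attachments \
    -H "Content-Type: application/json" \
    -d '{"_attachments": {"foo.txt": {"content_type": "text/plain", "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="}}}'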
Request: PUT /temp/somedoc HTTP/1.1 Accept: application/json Content-Length: 372 Content-Type: multipart/related;boundary="abc123" Host: localhost:5984 User-Agent: HTTPie/0.6.0 --abc123 Content-Type: application/json { "body": "This is a body.", "_attachments": { "foo.txt": { "follows": true, "content_type": "text/plain", "length": 21 }, "bar.txt": { "follows": true, "content_type": "text/plain", "length": 20 } } } --abc123 this is 21 chars long --abc123 this is 20 chars lon --abc123-- Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 72 Content-Type: application/json Date: Sat, 28 Sep 2013 09:13:24 GMT ETag: "1-5575e26acdeb1df561bb5b70b26ba151" Location: http://localhost:5984/temp/somedoc Server: CouchDB (Erlang OTP) { "id": "somedoc", "ok": true, "rev": "1-5575e26acdeb1df561bb5b70b26ba151" } Getting a List of Revisions You can obtain a list of the revisions for a given document by adding the revs=true parameter to the request URL: Request: GET /recipes/SpaghettiWithMeatballs?revs=true HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 584 Content-Type: application/json Date: Wed, 14 Aug 2013 11:38:26 GMT ETag: "5-fd96acb3256302bf0dd2f32713161f2a" Server: CouchDB (Erlang/OTP) { "_id": "SpaghettiWithMeatballs", "_rev": "8-6f5ad8db0f34af24a6e0984cd1a6cfb9", "_revisions": { "ids": [ "6f5ad8db0f34af24a6e0984cd1a6cfb9", "77fba3a059497f51ec99b9b478b569d2", "136813b440a00a24834f5cb1ddf5b1f1", "fd96acb3256302bf0dd2f32713161f2a", "874985bc28906155ba0e2e0538f67b05", "0de77a37463bf391d14283e626831f2e", "d795d1b924777732fdea76538c558b62", "917fa2381192822767f010b95b45325b" ], "start": 8 }, "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } The returned JSON structure includes the original document, including a _revisions structure that includes the revision information in next form: • ids (array): Array of valid revision IDs, in reverse order (latest first) • start (number): Prefix number for the latest revision Obtaining an Extended Revision History You can get additional information about the revisions for a given doc- ument by supplying the revs_info argument to the query: Request: GET /recipes/SpaghettiWithMeatballs?revs_info=true HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 802 Content-Type: application/json Date: Wed, 14 Aug 2013 11:40:55 GMT Server: CouchDB (Erlang/OTP) { "_id": "SpaghettiWithMeatballs", "_rev": "8-6f5ad8db0f34af24a6e0984cd1a6cfb9", "_revs_info": [ { "rev": "8-6f5ad8db0f34af24a6e0984cd1a6cfb9", "status": "available" }, { "rev": "7-77fba3a059497f51ec99b9b478b569d2", "status": "deleted" }, { "rev": "6-136813b440a00a24834f5cb1ddf5b1f1", "status": "available" }, { "rev": "5-fd96acb3256302bf0dd2f32713161f2a", "status": "missing" }, { "rev": "4-874985bc28906155ba0e2e0538f67b05", "status": "missing" }, { "rev": "3-0de77a37463bf391d14283e626831f2e", "status": "missing" }, { "rev": "2-d795d1b924777732fdea76538c558b62", "status": "missing" }, { "rev": "1-917fa2381192822767f010b95b45325b", "status": "missing" } ], "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } The returned document contains 
_revs_info field with extended revision information, including the availability and status of each revision. This array field contains objects with following structure: • rev (string): Full revision string • status (string): Status of the revision. Maybe one of: • available: Revision is available for retrieving with rev query pa- rameter • missing: Revision is not available • deleted: Revision belongs to deleted document Obtaining a Specific Revision To get a specific revision, use the rev argument to the request, and specify the full revision number. The specified revision of the docu- ment will be returned, including a _rev field specifying the revision that was requested. Request: GET /recipes/SpaghettiWithMeatballs?rev=6-136813b440a00a24834f5cb1ddf5b1f1 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 271 Content-Type: application/json Date: Wed, 14 Aug 2013 11:40:55 GMT Server: CouchDB (Erlang/OTP) { "_id": "SpaghettiWithMeatballs", "_rev": "6-136813b440a00a24834f5cb1ddf5b1f1", "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs" } Retrieving Deleted Documents CouchDB doesnt actually delete documents via DELETE /{db}/{docid}. In- stead, it leaves tombstone with very basic information about the docu- ment. If you just GET /{db}/{docid} CouchDB returns 404 Not Found re- sponse: Request: GET /recipes/FishStew HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 404 Object Not Found Cache-Control: must-revalidate Content-Length: 41 Content-Type: application/json Date: Wed, 14 Aug 2013 12:23:27 GMT Server: CouchDB (Erlang/OTP) { "error": "not_found", "reason": "deleted" } However, you may retrieve documents tombstone by using rev query para- meter with GET /{db}/{docid} request: Request: GET /recipes/FishStew?rev=2-056f5f44046ecafc08a2bc2b9c229e20 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 79 Content-Type: application/json Date: Wed, 14 Aug 2013 12:30:22 GMT ETag: "2-056f5f44046ecafc08a2bc2b9c229e20" Server: CouchDB (Erlang/OTP) { "_deleted": true, "_id": "FishStew", "_rev": "2-056f5f44046ecafc08a2bc2b9c229e20" } Updating an Existing Document To update an existing document you must specify the current revision number within the _rev parameter. 
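As a command-line sketch of the same update (the body is abbreviated here; the full request is shown next, and the revision token is the one from the earlier examples):

shell> curl -X PUT http://adm:pass@localhost:5984/recipes/SpaghettiWithMeatballs \
    -H "Content-Type: application/json" \
    -d '{"_rev": "1-917fa2381192822767f010b95b45325b", "name": "Spaghetti with meatballs", "serving": "hot"}'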
Request: PUT /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Content-Length: 258 Content-Type: application/json Host: localhost:5984 { "_rev": "1-917fa2381192822767f010b95b45325b", "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs", "serving": "hot" } Alternatively, you can supply the current revision number in the If-Match HTTP header of the request: PUT /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Content-Length: 258 Content-Type: application/json If-Match: 1-917fa2381192822767f010b95b45325b Host: localhost:5984 { "description": "An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs.", "ingredients": [ "spaghetti", "tomato sauce", "meatballs" ], "name": "Spaghetti with meatballs", "serving": "hot" } Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 85 Content-Type: application/json Date: Wed, 14 Aug 2013 20:33:56 GMT ETag: "2-790895a73b63fb91dd863388398483dd" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs", "ok": true, "rev": "2-790895a73b63fb91dd863388398483dd" } Copying from a Specific Revision To copy from a specific version, use the rev argument to the query string or If-Match: Request: COPY /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Destination: SpaghettiWithMeatballs_Original If-Match: 1-917fa2381192822767f010b95b45325b Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 93 Content-Type: application/json Date: Wed, 14 Aug 2013 14:21:00 GMT ETag: "1-917fa2381192822767f010b95b45325b" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs_Original Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs_Original", "ok": true, "rev": "1-917fa2381192822767f010b95b45325b" } Copying to an Existing Document To copy to an existing document, you must specify the current revision string for the target document by appending the rev parameter to the - Destination header string. Request: COPY /recipes/SpaghettiWithMeatballs?rev=8-6f5ad8db0f34af24a6e0984cd1a6cfb9 HTTP/1.1 Accept: application/json Destination: SpaghettiWithMeatballs_Original?rev=1-917fa2381192822767f010b95b45325b Host: localhost:5984 Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 93 Content-Type: application/json Date: Wed, 14 Aug 2013 14:21:00 GMT ETag: "2-62e778c9ec09214dd685a981dcc24074"" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs_Original Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs_Original", "ok": true, "rev": "2-62e778c9ec09214dd685a981dcc24074" } /{db}/{docid}/{attname} HEAD /{db}/{docid}/{attname} Returns the HTTP headers containing a minimal amount of informa- tion about the specified attachment. The method supports the same query arguments as the GET /{db}/{docid}/{attname} method, but only the header information (including attachment size, en- coding and the MD5 hash as an ETag), is returned. Parameters • db Database name • docid Document ID • attname Attachment name Request Headers • If-Match Documents revision. Alternative to rev query parameter • If-None-Match Attachments base64 encoded MD5 binary digest. Optional Query Parameters • rev (string) Documents revision. Optional Response Headers • Accept-Ranges Range request aware. 
Used for attach- ments with application/octet-stream content type • Content-Encoding Used compression codec. Available if attachments content_type is in list of compressible types • Content-Length Attachment size. If compression codec was used, this value is about compressed size, not ac- tual • ETag Double quoted base64 encoded MD5 binary digest Status Codes • 200 OK Attachment exists • 401 Unauthorized Read privilege required • 404 Not Found Specified database, document or attach- ment was not found Request: HEAD /recipes/SpaghettiWithMeatballs/recipe.txt HTTP/1.1 Host: localhost:5984 Response: HTTP/1.1 200 OK Accept-Ranges: none Cache-Control: must-revalidate Content-Encoding: gzip Content-Length: 100 Content-Type: text/plain Date: Thu, 15 Aug 2013 12:42:42 GMT ETag: "vVa/YgiE1+Gh0WfoFJAcSg==" Server: CouchDB (Erlang/OTP) GET /{db}/{docid}/{attname} Returns the file attachment associated with the document. The raw data of the associated attachment is returned (just as if you were accessing a static file. The returned Content-Type will be the same as the content type set when the document attachment was submitted into the database. Parameters • db Database name • docid Document ID • attname Attachment name Request Headers • If-Match Documents revision. Alternative to rev query parameter • If-None-Match Attachments base64 encoded MD5 binary digest. Optional Query Parameters • rev (string) Documents revision. Optional Response Headers • Accept-Ranges Range request aware. Used for attach- ments with application/octet-stream • Content-Encoding Used compression codec. Available if attachments content_type is in list of compressible types • Content-Length Attachment size. If compression codec is used, this value is about compressed size, not ac- tual • ETag Double quoted base64 encoded MD5 binary digest Response Stored content Status Codes • 200 OK Attachment exists • 401 Unauthorized Read privilege required • 404 Not Found Specified database, document or attach- ment was not found PUT /{db}/{docid}/{attname} Uploads the supplied content as an attachment to the specified document. The attachment name provided must be a URL encoded string. You must supply the Content-Type header, and for an ex- isting document you must also supply either the rev query argu- ment or the If-Match HTTP header. If the revision is omitted, a new, otherwise empty document will be created with the provided attachment, or a conflict will occur. If case when uploading an attachment using an existing attach- ment name, CouchDB will update the corresponding stored content of the database. Since you must supply the revision information to add an attachment to the document, this serves as validation to update the existing attachment. NOTE: Uploading an attachment updates the corresponding document revision. Revisions are tracked for the parent document, not individual attachments. Parameters • db Database name • docid Document ID • attname Attachment name Request Headers • Content-Type Attachment MIME type. Default: applica- tion/octet-stream Optional • If-Match Document revision. Alternative to rev query parameter Query Parameters • rev (string) Document revision. 
Optional Response JSON Object • id (string) Document ID • ok (boolean) Operation status • rev (string) Revision MVCC token Status Codes • 201 Created Attachment created and stored on disk • 202 Accepted Request was accepted, but changes are not yet stored on disk • 400 Bad Request Invalid request body or parameters • 401 Unauthorized Write privileges required • 404 Not Found Specified database, document or attach- ment was not found • 409 Conflict Documents revision wasnt specified or its not the latest Request: PUT /recipes/SpaghettiWithMeatballs/recipe.txt HTTP/1.1 Accept: application/json Content-Length: 86 Content-Type: text/plain Host: localhost:5984 If-Match: 1-917fa2381192822767f010b95b45325b 1. Cook spaghetti 2. Cook meatballs 3. Mix them 4. Add tomato sauce 5. ... 6. PROFIT! Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 85 Content-Type: application/json Date: Thu, 15 Aug 2013 12:38:04 GMT ETag: "2-ce91aed0129be8f9b0f650a2edcfd0a4" Location: http://localhost:5984/recipes/SpaghettiWithMeatballs/recipe.txt Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs", "ok": true, "rev": "2-ce91aed0129be8f9b0f650a2edcfd0a4" } DELETE /{db}/{docid}/{attname} Deletes the attachment with filename {attname} of the specified doc. You must supply the rev query parameter or If-Match with the current revision to delete the attachment. NOTE: Deleting an attachment updates the corresponding document re- vision. Revisions are tracked for the parent document, not individual attachments. Parameters • db Database name • docid Document ID Request Headers • Accept .INDENT 2.0 • application/json • text/plain • If-Match Document revision. Alternative to rev query parame- ter Query Parameters • rev (string) Document revision. Required • batch (string) Store changes in batch mode Possible values: ok. Optional Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Double quoted documents new revision Response JSON Object • id (string) Document ID • ok (boolean) Operation status • rev (string) Revision MVCC token Status Codes • 200 OK Attachment successfully removed • 202 Accepted Request was accepted, but changes are not yet stored on disk • 400 Bad Request Invalid request body or parameters • 401 Unauthorized Write privileges required • 404 Not Found Specified database, document or attachment was not found • 409 Conflict Documents revision wasnt specified or its not the latest Request: DELETE /recipes/SpaghettiWithMeatballs?rev=6-440b2dd39c20413045748b42c6aba6e2 HTTP/1.1 Accept: application/json Host: localhost:5984 Alternatively, instead of rev query parameter you may use - If-Match header: DELETE /recipes/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json If-Match: 6-440b2dd39c20413045748b42c6aba6e2 Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 85 Content-Type: application/json Date: Wed, 14 Aug 2013 12:23:13 GMT ETag: "7-05185cf5fcdf4b6da360af939431d466" Server: CouchDB (Erlang/OTP) { "id": "SpaghettiWithMeatballs", "ok": true, "rev": "7-05185cf5fcdf4b6da360af939431d466" } HTTP Range Requests HTTP allows you to specify byte ranges for requests. This allows the implementation of resumable downloads and skippable audio and video streams alike. This is available for all attachments inside CouchDB. This is just a real quick run through how this looks under the hood. 
Usually, you will have larger binary files to serve from CouchDB, like MP3s and videos, but to make things a little more obvious, I use a text file here (note that I use the application/octet-stream Content-Type instead of text/plain).

shell> cat file.txt
My hovercraft is full of eels!

Now let's store this text file as an attachment in CouchDB. First, we create a database:

shell> curl -X PUT http://adm:pass@127.0.0.1:5984/test
{"ok":true}

Then we create a new document and the file attachment in one go:

shell> curl -X PUT http://adm:pass@127.0.0.1:5984/test/doc/file.txt \
    -H "Content-Type: application/octet-stream" -d@file.txt
{"ok":true,"id":"doc","rev":"1-287a28fa680ae0c7fb4729bf0c6e0cf2"}

Now we can request the whole file easily:

shell> curl -X GET http://adm:pass@127.0.0.1:5984/test/doc/file.txt
My hovercraft is full of eels!

But say we only want the first 13 bytes:

shell> curl -X GET http://adm:pass@127.0.0.1:5984/test/doc/file.txt \
    -H "Range: bytes=0-12"
My hovercraft

HTTP supports many ways to specify single and even multiple byte ranges. Read all about it in RFC 2616 Section 14.27.

NOTE:
Databases that have been created with CouchDB 1.0.2 or earlier will support range requests in 3.4, but they are using a less-optimal algorithm. If you plan to make heavy use of this feature, make sure to compact your database with CouchDB 3.4 to take advantage of a better algorithm to find byte ranges.

Design Documents
In CouchDB, design documents provide the main interface for building a CouchDB application. A design document defines one or more views that are used to extract information from the database. Design documents are created within your CouchDB instance in the same way as you create database documents, but their content and definition are different. Design documents are named using an ID defined in the design document URL path, and this URL can then be used to access the database contents.

Views and lists operate together to provide automated (and formatted) output from your database.

/{db}/_design/{ddoc}
HEAD /{db}/_design/{ddoc}
Returns the HTTP Headers containing a minimal amount of information about the specified design document.

SEE ALSO:
HEAD /{db}/{docid}

GET /{db}/_design/{ddoc}
Returns the contents of the design document specified by the design document name and the database given in the URL. Unless you request a specific revision, the latest revision of the document will always be returned.

SEE ALSO:
GET /{db}/{docid}

PUT /{db}/_design/{ddoc}
The PUT method creates a new named design document, or creates a new revision of the existing design document.

Design documents have an agreed-upon structure and set of fields. Currently it is the following:

• language (string): Defines the Query Server used to process the design document's functions
• options (object): View's default options
• filters (object): Filter functions definition
• lists (object): List functions definition. Deprecated.
• rewrites (array or string): Rewrite rules definition. Deprecated.
• shows (object): Show functions definition. Deprecated.
• updates (object): Update functions definition
• validate_doc_update (string): Validate document update function source
• views (object): View functions definition.
• autoupdate (boolean): Indicates whether to automatically build indexes defined in this design document. Default is true.

Note that for the filters, lists, shows and updates fields, the objects are a mapping of function name to string function source code.
For views mapping is the same except that values are objects with map and reduce (optional) keys which also contains functions source code. SEE ALSO: PUT /{db}/{docid} DELETE /{db}/_design/{ddoc} Deletes the specified document from the database. You must sup- ply the current (latest) revision, either by using the rev para- meter to specify the revision. SEE ALSO: DELETE /{db}/{docid} COPY /{db}/_design/{ddoc} The COPY (which is non-standard HTTP) copies an existing design document to a new or existing one. Given that view indexes on disk are named after their MD5 hash of the view definition, and that a COPY operation wont actually change that definition, the copied views wont have to be recon- structed. Both views will be served from the same index on disk. SEE ALSO: COPY /{db}/{docid} /{db}/_design/{ddoc}/{attname} HEAD /{db}/_design/{ddoc}/{attname} Returns the HTTP headers containing a minimal amount of informa- tion about the specified attachment. SEE ALSO: HEAD /{db}/{docid}/{attname} GET /{db}/_design/{ddoc}/{attname} Returns the file attachment associated with the design document. The raw data of the associated attachment is returned (just as if you were accessing a static file. SEE ALSO: GET /{db}/{docid}/{attname} PUT /{db}/_design/{ddoc}/{attname} Uploads the supplied content as an attachment to the specified design document. The attachment name provided must be a URL en- coded string. SEE ALSO: PUT /{db}/{docid}/{attname} DELETE /{db}/_design/{ddoc}/{attname} Deletes the attachment of the specified design document. SEE ALSO: DELETE /{db}/{docid}/{attname} /{db}/_design/{ddoc}/_info GET /{db}/_design/{ddoc}/_info Obtains information about the specified design document, includ- ing the index, index size and current status of the design docu- ment and associated index information. 
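The _info and _view examples below assume a design document already exists. Purely as an illustrative sketch (the map function is an assumption, not taken from this page), a minimal ingredients design document with a by_name view could be created like this:

shell> curl -X PUT http://adm:pass@localhost:5984/recipes/_design/ingredients \
    -H "Content-Type: application/json" \
    -d '{"views": {"by_name": {"map": "function (doc) { (doc.ingredients || []).forEach(function (i) { emit(i, 1); }); }"}}}'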
Parameters • db Database name • ddoc Design document name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • name (string) Design document name • view_index (object) View Index Information Status Codes • 200 OK Request completed successfully Request: GET /recipes/_design/recipe/_info HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 263 Content-Type: application/json Date: Sat, 17 Aug 2013 12:54:17 GMT Server: CouchDB (Erlang/OTP) { "name": "recipe", "view_index": { "compact_running": false, "language": "python", "purge_seq": 0, "signature": "a59a1bb13fdf8a8a584bc477919c97ac", "sizes": { "active": 926691, "disk": 1982704, "external": 1535701 }, "update_seq": 12397, "updater_running": false, "waiting_clients": 0, "waiting_commit": false } } View Index Information The response from GET /{db}/_design/{ddoc}/_info contains view_index (object) field with the next structure: • compact_running (boolean): Indicates whether a compaction routine is currently running on the view • sizes.active (number): The size of live data inside the view, in bytes • sizes.external (number): The uncompressed size of view contents in bytes • sizes.file (number): Size in bytes of the view as stored on disk • language (string): Language for the defined views • purge_seq (number): The purge sequence that has been processed • signature (string): MD5 signature of the views for the design docu- ment • update_seq (number / string): The update sequence of the correspond- ing database that has been indexed • updater_running (boolean): Indicates if the view is currently being updated • waiting_clients (number): Number of clients waiting on views from this design document • waiting_commit (boolean): Indicates if there are outstanding commits to the underlying database that need to processed /{db}/_design/{ddoc}/_view/{view} GET /{db}/_design/{ddoc}/_view/{view} Executes the specified view function from the specified design document. Parameters • db Database name • ddoc Design document name • view View function name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • conflicts (boolean) Include conflicts information in re- sponse. Ignored if include_docs isnt true. Default is false. • descending (boolean) Return the documents in descending order by key. Default is false. • endkey (json) Stop returning records when the specified key is reached. • end_key (json) Alias for endkey param • endkey_docid (string) Stop returning records when the speci- fied document ID is reached. Ignored if endkey is not set. • end_key_doc_id (string) Alias for endkey_docid. • group (boolean) Group the results using the reduce function to a group or single row. Implies reduce is true and the maxi- mum group_level. Default is false. • group_level (number) Specify the group level to be used. Im- plies group is true. • include_docs (boolean) Include the associated document with each row. Default is false. • attachments (boolean) Include the Base64-encoded content of attachments in the documents that are included if include_docs is true. Ignored if include_docs isnt true. Default is false. • att_encoding_info (boolean) Include encoding information in attachment stubs if include_docs is true and the particular attachment is compressed. Ignored if include_docs isnt true. Default is false. 
• inclusive_end (boolean) Specifies whether the specified end key should be included in the result. Default is true. • key (json) Return only documents that match the specified key. • keys (json-array) Return only documents where the key matches one of the keys specified in the array. • limit (number) Limit the number of the returned documents to the specified number. • reduce (boolean) Use the reduction function. Default is true when a reduce function is defined. • skip (number) Skip this number of records before starting to return the results. Default is 0. • sorted (boolean) Sort returned rows (see Sorting Returned Rows). Setting this to false offers a performance boost. The total_rows and offset fields are not available when this is set to false. Default is true. • stable (boolean) Whether or not the view results should be returned from a stable set of shards. Default is false. • stale (string) Allow the results from a stale view to be used. Supported values: ok and update_after. ok is equiva- lent to stable=true&update=false. update_after is equivalent to stable=true&update=lazy. The default behavior is equiva- lent to stable=false&update=true. Note that this parameter is deprecated. Use stable and update instead. See Views Genera- tion for more details. • startkey (json) Return records starting with the specified key. • start_key (json) Alias for startkey. • startkey_docid (string) Return records starting with the specified document ID. Ignored if startkey is not set. • start_key_doc_id (string) Alias for startkey_docid param • update (string) Whether or not the view in question should be updated prior to responding to the user. Supported values: true, false, lazy. Default is true. • update_seq (boolean) Whether to include in the response an update_seq value indicating the sequence id of the database the view reflects. Default is false. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • ETag Response signature • Transfer-Encoding chunked Response JSON Object • offset (number) Offset where the document list started. • rows (array) Array of view row objects. By default the infor- mation returned contains only the document ID and revision. • total_rows (number) Number of documents in the database/view. • update_seq (object) Current update sequence for the database. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database, design document or view is missed Request: GET /recipes/_design/ingredients/_view/by_name HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 09:12:06 GMT ETag: "2FOLSBSW4O6WB798XU4AQYA9B" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "SpaghettiWithMeatballs", "key": "meatballs", "value": 1 }, { "id": "SpaghettiWithMeatballs", "key": "spaghetti", "value": 1 }, { "id": "SpaghettiWithMeatballs", "key": "tomato sauce", "value": 1 } ], "total_rows": 3 } Changed in version 1.6.0: added attachments and att_encoding_info parameters Changed in version 2.0.0: added sorted parameter Changed in version 2.1.0: added stable and update parameters Changed in version 3.3.1: treat single-element keys as key WARNING: Using the attachments parameter to include attachments in view re- sults is not recommended for large attachment sizes. 
Also note that the Base64-encoding that is used leads to a 33% overhead (i.e. one third) in transfer size for attachments.

POST /{db}/_design/{ddoc}/_view/{view}
Executes the specified view function from the specified design document. POST view functionality supports identical parameters and behavior as specified in the GET /{db}/_design/{ddoc}/_view/{view} API but allows for the query string parameters to be supplied as keys in a JSON object in the body of the POST request.

Request:

POST /recipes/_design/ingredients/_view/by_name HTTP/1.1
Accept: application/json
Content-Length: 37
Host: localhost:5984

{ "keys": [ "meatballs", "spaghetti" ] }

Response:

HTTP/1.1 200 OK
Cache-Control: must-revalidate
Content-Type: application/json
Date: Wed, 21 Aug 2013 09:14:13 GMT
ETag: "6R5NM8E872JIJF796VF7WI3FZ"
Server: CouchDB (Erlang/OTP)
Transfer-Encoding: chunked

{ "offset": 0, "rows": [ { "id": "SpaghettiWithMeatballs", "key": "meatballs", "value": 1 }, { "id": "SpaghettiWithMeatballs", "key": "spaghetti", "value": 1 } ], "total_rows": 3 }

View Options

There are two view indexing options that can be defined in a design document as boolean properties of an options object. Unlike the other querying options, these aren't URL parameters because they take effect when the view index is generated, not when it's accessed:

• local_seq (boolean): Makes documents' local sequence numbers available to map functions (as a _local_seq document property)
• include_design (boolean): Allows map functions to be called on design documents as well as regular documents

Querying Views and Indexes

The definition of a view within a design document also creates an index based on the key information defined within each view. The production and use of the index significantly increases the speed of access and searching or selecting documents from the view.

However, the index is not updated when new documents are added or modified in the database. Instead, the index is generated or updated, either when the view is first accessed, or when the view is accessed after a document has been updated. In each case, the index is updated before the view query is executed against the database.

View indexes are updated incrementally in the following situations:

• A new document has been added to the database.
• A document has been deleted from the database.
• A document in the database has been updated.

View indexes are rebuilt entirely when the view definition changes. To achieve this, a fingerprint of the view definition is created when the design document is updated. If the fingerprint changes, then the view indexes are entirely rebuilt. This ensures that changes to the view definitions are reflected in the view indexes.

NOTE:
View index rebuilds occur when one view from the same view group (i.e. all the views defined within a single design document) has been determined as needing a rebuild. For example, if you have a design document with three views, and you update the database, all three view indexes within the design document will be updated.

Because the view is updated when it has been queried, it can result in a delay in returned information when the view is accessed, especially if there are a large number of documents in the database and the view index does not exist. There are a number of ways to mitigate, but not completely eliminate, these issues. These include:

• Create the view definition (and associated design documents) on your database before allowing insertion or updates to the documents.
If this is allowed while the view is being accessed, the index can be updated incrementally.
• Manually force a view request from the database. You can do this either before users are allowed to use the view, or you can access the view manually after documents are added or updated.
• Use the changes feed to monitor for changes to the database and then access the view to force the corresponding view index to be updated.

None of these can completely eliminate the need for the indexes to be rebuilt or updated when the view is accessed, but they may lessen the impact of index updates on the end-user experience.

Another alternative is to allow users to access a stale version of the view index, rather than forcing the index to be updated and displaying the updated results. Using a stale view may not return the latest information, but will return the results of the view query using an existing version of the index.

For example, to access the existing stale view by_recipe in the recipes design document:

http://localhost:5984/recipes/_design/recipes/_view/by_recipe?stale=ok

Accessing a stale view:

• Does not trigger a rebuild of the view indexes, even if there have been changes since the last access.
• Returns the current version of the view index, if a current version exists.
• Returns an empty result set if the given view index does not exist.

As an alternative, you can supply the update_after value for the stale parameter. This causes the view to be returned as a stale view, but triggers the update process after the view information has been returned to the client.

In addition to using stale views, you can also make use of the update_seq query argument. Using this query argument generates the view information including the update sequence of the database from which the view was generated. The returned value can then be compared with the current update sequence exposed in the database information (returned by GET /{db}).

Sorting Returned Rows

Each element within the returned array is sorted using native UTF-8 sorting according to the contents of the key portion of the emitted content.
The basic order of output is as follows: • null • false • true • Numbers • Text (case sensitive, lowercase first) • Arrays (according to the values of each element, in order) • Objects (according to the values of keys, in key order) Request: GET /db/_design/test/_view/sorting HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 10:09:25 GMT ETag: "8LA1LZPQ37B6R9U8BK9BGQH27" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "dummy-doc", "key": null, "value": null }, { "id": "dummy-doc", "key": false, "value": null }, { "id": "dummy-doc", "key": true, "value": null }, { "id": "dummy-doc", "key": 0, "value": null }, { "id": "dummy-doc", "key": 1, "value": null }, { "id": "dummy-doc", "key": 10, "value": null }, { "id": "dummy-doc", "key": 42, "value": null }, { "id": "dummy-doc", "key": "10", "value": null }, { "id": "dummy-doc", "key": "hello", "value": null }, { "id": "dummy-doc", "key": "Hello", "value": null }, { "id": "dummy-doc", "key": "\u043f\u0440\u0438\u0432\u0435\u0442", "value": null }, { "id": "dummy-doc", "key": [], "value": null }, { "id": "dummy-doc", "key": [ 1, 2, 3 ], "value": null }, { "id": "dummy-doc", "key": [ 2, 3 ], "value": null }, { "id": "dummy-doc", "key": [ 3 ], "value": null }, { "id": "dummy-doc", "key": {}, "value": null }, { "id": "dummy-doc", "key": { "foo": "bar" }, "value": null } ], "total_rows": 17 } You can reverse the order of the returned view information by using the descending query value set to true: Request: GET /db/_design/test/_view/sorting?descending=true HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 10:09:25 GMT ETag: "Z4N468R15JBT98OM0AMNSR8U" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "dummy-doc", "key": { "foo": "bar" }, "value": null }, { "id": "dummy-doc", "key": {}, "value": null }, { "id": "dummy-doc", "key": [ 3 ], "value": null }, { "id": "dummy-doc", "key": [ 2, 3 ], "value": null }, { "id": "dummy-doc", "key": [ 1, 2, 3 ], "value": null }, { "id": "dummy-doc", "key": [], "value": null }, { "id": "dummy-doc", "key": "\u043f\u0440\u0438\u0432\u0435\u0442", "value": null }, { "id": "dummy-doc", "key": "Hello", "value": null }, { "id": "dummy-doc", "key": "hello", "value": null }, { "id": "dummy-doc", "key": "10", "value": null }, { "id": "dummy-doc", "key": 42, "value": null }, { "id": "dummy-doc", "key": 10, "value": null }, { "id": "dummy-doc", "key": 1, "value": null }, { "id": "dummy-doc", "key": 0, "value": null }, { "id": "dummy-doc", "key": true, "value": null }, { "id": "dummy-doc", "key": false, "value": null }, { "id": "dummy-doc", "key": null, "value": null } ], "total_rows": 17 } Sorting order and startkey/endkey The sorting direction is applied before the filtering applied using the startkey and endkey query arguments. 
For example, the following query will operate correctly when listing all the matching entries between carrots and egg:

GET http://couchdb:5984/recipes/_design/recipes/_view/by_ingredient?startkey="carrots"&endkey="egg" HTTP/1.1
Accept: application/json

If the order of output is reversed with the descending query argument, the view request will get a 400 Bad Request response:

GET /recipes/_design/recipes/_view/by_ingredient?descending=true&startkey="carrots"&endkey="egg" HTTP/1.1
Accept: application/json
Host: localhost:5984

{
    "error": "query_parse_error",
    "reason": "No rows can match your key range, reverse your start_key and end_key or set descending=false",
    "ref": 3986383855
}

The result will be an error because the entries in the view are reversed before the key filter is applied, so the endkey of egg will be seen before the startkey of carrots.

Instead, you should reverse the values supplied to the startkey and endkey parameters to match the descending sorting applied to the keys. Changing the previous example to:

GET /recipes/_design/recipes/_view/by_ingredient?descending=true&startkey="egg"&endkey="carrots" HTTP/1.1
Accept: application/json
Host: localhost:5984

Using key, keys, start_key and end_key

key: Behaves like setting start_key=$key&end_key=$key.

keys: behaves differently for single-element and multi-element arrays. A single-element keys array is treated as if it were a key.

$ curl -X POST http://adm:pass@127.0.0.1:5984/db/_bulk_docs \
    -H 'Content-Type: application/json' \
    -d '{"docs":[{"_id":"a","key":"a","value":1},{"_id":"b","key":"b","value":2},{"_id":"c","key":"c","value":3}]}'
$ curl -X POST http://adm:pass@127.0.0.1:5984/db \
    -H 'Content-Type: application/json' \
    -d '{"_id":"_design/ddoc","views":{"reduce":{"map":"function(doc) { emit(doc.key, doc.value) }","reduce":"_sum"}}}'
$ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?key="a"'
{"rows":[{"key":null,"value":1}]}
$ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?keys="[\"a\"]"'
{"rows":[{"key":null,"value":1}]}
$ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?keys=\["a","b"\]'
{"error":"query_parse_error","reason":"Multi-key fetches for reduce views must use `group=true`"}
$ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?keys=\["a","c"\]&group=true'
{"rows":[{"key":"a","value":1},{"key":"c","value":3}]}

keys is incompatible with key, start_key and end_key, but it's possible to use key with start_key and end_key. Different orders of query parameters may result in different responses. Precedence follows the order in which the query parameters are specified; usually, the last argument wins.
# start_key=a and end_key=b $ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?key="a"&endkey="b"' {"rows":[{"key":null,"value":3}]} # start_key=a and end_key=a $ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?endkey="b"&key="a"' {"rows":[{"key":null,"value":1}]} # start_key=a and end_key=a $ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?endkey="b"&keys=\["a"\]' {"rows":[{"key":null,"value":1}]} $ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?endkey="b"&keys=\["a","b"\]' {"error":"query_parse_error","reason":"Multi-key fetches for reduce views must use `group=true`"} $ curl http://adm:pass@127.0.0.1:5984/db/_design/ddoc/_view/reduce'?endkey="b"&keys=\["a","b"\]&group=true' {"error":"query_parse_error","reason":"`keys` is incompatible with `key`, `start_key` and `end_key`"} Raw collation By default CouchDB uses an ICU driver for sorting view results. Its possible use binary collation instead for faster view builds where Uni- code collation is not important. To use raw collation add "options":{"collation":"raw"} within the view object of the design document. After that, views will be regenerated and new order applied for the appropriate view. SEE ALSO: Views Collation Using Limits and Skipping Rows By default, views return all results. Thats ok when the number of re- sults is small, but this may lead to problems when there are billions results, since the client may have to read them all and consume all available memory. But its possible to reduce output result rows by specifying limit query parameter. For example, retrieving the list of recipes using the by_ti- tle view and limited to 5 returns only 5 records, while there are total 2667 records in view: Request: GET /recipes/_design/recipes/_view/by_title?limit=5 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 09:14:13 GMT ETag: "9Q6Q2GZKPH8D5F8L7PB6DBSS9" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset" : 0, "rows" : [ { "id" : "3-tiersalmonspinachandavocadoterrine", "key" : "3-tier salmon, spinach and avocado terrine", "value" : [ null, "3-tier salmon, spinach and avocado terrine" ] }, { "id" : "Aberffrawcake", "key" : "Aberffraw cake", "value" : [ null, "Aberffraw cake" ] }, { "id" : "Adukiandorangecasserole-microwave", "key" : "Aduki and orange casserole - microwave", "value" : [ null, "Aduki and orange casserole - microwave" ] }, { "id" : "Aioli-garlicmayonnaise", "key" : "Aioli - garlic mayonnaise", "value" : [ null, "Aioli - garlic mayonnaise" ] }, { "id" : "Alabamapeanutchicken", "key" : "Alabama peanut chicken", "value" : [ null, "Alabama peanut chicken" ] } ], "total_rows" : 2667 } To omit some records you may use skip query parameter: Request: GET /recipes/_design/recipes/_view/by_title?limit=3&skip=2 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 09:14:13 GMT ETag: "H3G7YZSNIVRRHO5FXPE16NJHN" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset" : 2, "rows" : [ { "id" : "Adukiandorangecasserole-microwave", "key" : "Aduki and orange casserole - microwave", "value" : [ null, "Aduki and orange casserole - microwave" ] }, { "id" : "Aioli-garlicmayonnaise", "key" : "Aioli - garlic mayonnaise", "value" : [ null, "Aioli - garlic mayonnaise" ] }, { "id" : "Alabamapeanutchicken", "key" : "Alabama 
peanut chicken", "value" : [ null, "Alabama peanut chicken" ] } ], "total_rows" : 2667 } WARNING: Using limit and skip parameters is not recommended for results pagi- nation. Read pagination recipe why its so and how to make it better. Sending multiple queries to a view Added in version 2.2. POST /{db}/_design/{ddoc}/_view/{view}/queries Executes multiple specified view queries against the view func- tion from the specified design document. Parameters • db Database name • ddoc Design document name • view View function name Request Headers • Content-Type .INDENT 2.0 • application/json • Accept .INDENT 2.0 • application/json Request JSON Object • queries An array of query objects with fields for the parame- ters of each individual view query to be executed. The field names and their meaning are the same as the query parameters of a regular view request. Response Headers • Content-Type .INDENT 2.0 • application/json • ETag Response signature • Transfer-Encoding chunked Response JSON Object • results (array) An array of result objects - one for each query. Each result object contains the same fields as the re- sponse to a regular view request. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database, design document or view is missing • 500 Internal Server Error View function execution error Request: POST /recipes/_design/recipes/_view/by_title/queries HTTP/1.1 Content-Type: application/json Accept: application/json Host: localhost:5984 { "queries": [ { "keys": [ "meatballs", "spaghetti" ] }, { "limit": 3, "skip": 2 } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 20 Dec 2016 11:17:07 GMT ETag: "1H8RGBCK3ABY6ACDM7ZSC30QK" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "results" : [ { "offset": 0, "rows": [ { "id": "SpaghettiWithMeatballs", "key": "meatballs", "value": 1 }, { "id": "SpaghettiWithMeatballs", "key": "spaghetti", "value": 1 }, { "id": "SpaghettiWithMeatballs", "key": "tomato sauce", "value": 1 } ], "total_rows": 3 }, { "offset" : 2, "rows" : [ { "id" : "Adukiandorangecasserole-microwave", "key" : "Aduki and orange casserole - microwave", "value" : [ null, "Aduki and orange casserole - microwave" ] }, { "id" : "Aioli-garlicmayonnaise", "key" : "Aioli - garlic mayonnaise", "value" : [ null, "Aioli - garlic mayonnaise" ] }, { "id" : "Alabamapeanutchicken", "key" : "Alabama peanut chicken", "value" : [ null, "Alabama peanut chicken" ] } ], "total_rows" : 2667 } ] } /{db}/_design/{ddoc}/_search/{index} WARNING: Search endpoints require a running search plugin connected to each cluster node. See Search Plugin Installation for details. Added in version 3.0. GET /{db}/_design/{ddoc}/_search/{index} Executes a search request against the named index in the speci- fied design document. Parameters • db Database name • ddoc Design document name • index Search index name Request Headers • Accept .INDENT 2.0 • application/json • text/plain Query Parameters • bookmark (string) A bookmark received from a previous search. This parameter enables paging through the results. If there are no more results after the bookmark, you get a response with an empty rows array and the same bookmark, confirming the end of the result list. • counts (json) An array of names of string fields for which counts are requested. The response contains counts for each unique value of this field name among the documents that match the search query. 
Faceting must be enabled for this parameter to function.
• drilldown (json) This field can be used several times. Each use defines a pair with a field name and a value. The search matches only documents containing the value that was provided in the named field. It differs from using "fieldname:value" in the q parameter only in that the values are not analyzed. Faceting must be enabled for this parameter to function.
• group_field (string) Field by which to group search matches.
• group_limit (number) Maximum group count. This field can be used only if group_field is specified.
• group_sort (json) This field defines the order of the groups in a search that uses group_field. The default sort order is relevance.
• highlight_fields (json) Specifies which fields to highlight. If specified, the result object contains a highlights field with an entry for each specified field.
• highlight_pre_tag (string) A string that is inserted before the highlighted word in the highlights output.
• highlight_post_tag (string) A string that is inserted after the highlighted word in the highlights output.
• highlight_number (number) Number of fragments that are returned in highlights. If the search term occurs less often than the number of fragments that are specified, longer fragments are returned.
• highlight_size (number) Number of characters in each fragment for highlights.
• include_docs (boolean) Include the full content of the documents in the response.
• include_fields (json) A JSON array of field names to include in search results. Any fields that are included must be indexed with the store:true option.
• limit (number) Limit the number of the returned documents to the specified number. For a grouped search, this parameter limits the number of documents per group.
• q (string) Alias for query.
• query (string) Required. The Lucene query string.
• ranges (json) This field defines ranges for faceted, numeric search fields. The value is a JSON object where the field names are faceted numeric search fields, and the values of the fields are JSON objects. The field names of the JSON objects are names for ranges. The values are strings that describe the range, for example [0 TO 10].
• sort (json) Specifies the sort order of the results. In a grouped search (when group_field is used), this parameter specifies the sort order within a group. The default sort order is relevance. A JSON string of the form "fieldname<type>" or "-fieldname<type>" for descending order, where fieldname is the name of a string or number field, and type is either a number, a string, or a JSON array of strings. The type part is optional, and defaults to number. Some examples are "foo", "-foo", "bar<string>", "-foo<number>" and ["-foo<number>", "bar<string>"]. String fields that are used for sorting must not be analyzed fields. Fields that are used for sorting must be indexed by the same indexer that is used for the search query.
• stale (string) Set to ok to allow the use of an out-of-date index.

Response Headers
• Content-Type
  • application/json
  • text/plain; charset=utf-8
• ETag Response signature
• Transfer-Encoding chunked

Response JSON Object
• rows (array) Array of view row objects. By default the information returned contains only the document ID and revision.
• total_rows (number) Number of documents in the database/view.
• bookmark (string) Opaque identifier to enable pagination.
Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database, design document or view is missed NOTE: You must enable faceting before you can use the counts, drilldown, and ranges parameters. NOTE: Faceting and grouping are not supported on partitioned searches, so the following query parameters should not be used on those requests: counts, drilldown, ranges, and group_field, group_limit, group_sort``. NOTE: Do not combine the bookmark and stale options. These options con- strain the choice of shard replicas to use for the response. When used together, the options might cause problems when contact is at- tempted with replicas that are slow or not available. SEE ALSO: For more information about how search works, see the Search User Guide. /{db}/_design/{ddoc}/_search_info/{index} WARNING: Search endpoints require a running search plugin connected to each cluster node. See Search Plugin Installation for details. Added in version 3.0. GET /{db}/_design/{ddoc}/_search_info/{index} Parameters • db Database name • ddoc Design document name • index Search index name Status Codes • 200 OK Request completed successfully • 400 Bad Request Request body is wrong (malformed or missing one of the mandatory fields) • 500 Internal Server Error A server error (or other kind of error) occurred Request: GET /recipes/_design/cookbook/_search_info/ingredients HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json { "name": "_design/cookbook/ingredients", "search_index": { "pending_seq": 7125496, "doc_del_count": 129180, "doc_count": 1066173, "disk_size": 728305827, "committed_seq": 7125496 } } /{db}/_design/{ddoc}/_nouveau/{index} WARNING: Nouveau is an experimental feature. Future releases might change how the endpoints work and might invalidate existing indexes. WARNING: Nouveau endpoints require a running nouveau server. See Nouveau Server Installation for details. Added in version 3.4.0. GET /{db}/_design/{ddoc}/_nouveau/{index} Executes a nouveau request against the named index in the speci- fied design document. Parameters • db Database name • ddoc Design document name • index Nouveau index name Request Headers • Accept .INDENT 2.0 • application/json Query Parameters • bookmark (string) A bookmark received from a previous search. This parameter enables paging through the results. If there are no more results after the bookmark, you get a response with an empty rows array and the same bookmark, confirming the end of the result list. • counts (json) An array of names of string fields for which counts are requested. The response contains counts for each unique value of this field name among the documents that match the search query. • include_docs (boolean) Include the full content of the docu- ments in the response. • locale (string) The (Java) locale used to parse numbers in range queries. Defaults to the JDK default locale if not specified. Some examples are de , us, gb. • limit (number) Limit the number of the returned documents to the specified number. • q (string) Required. The Lucene query string. • ranges (json) This field defines ranges for numeric search fields. The value is a JSON object where the fields names are numeric search fields, and the values of the fields are arrays of JSON objects. 
The objects must have a label, min and max value (of type string, number, number respectively), and op- tional min_inclusive and max_inclusive properties (defaulting to true if not specified). Example: {"bar":[{"la- bel":"cheap","min":0,"max":100}]} • sort (json) Specifies the sort order of the results. The de- fault sort order is relevance. A JSON string of the form "fieldname" or "-fieldname" for descending order, where field- name is the name of a string or double field. You can use a single string to sort by one field or an array of strings to sort by several fields in the same order as the array. Some examples are "relevance", "bar", "-foo" and ["-foo", "bar"]. • top_n (number) Limit the number of facets returned by group, defaulting to 10 with a maximum of 1000. • update (boolean) Set to false to allow the use of an out-of-date index. Response Headers • Content-Type .INDENT 2.0 • application/json • Transfer-Encoding chunked Response JSON Object • hits (array) Array of search hits. By default the information returned contains only the document ID and revision. • total_hits (number) Number of matches for the query. • total_hits_relation (string) EQUAL_TO if total_hits is exact. GREATER_THAN_OR_EQUAL_TO if not. • bookmark (string) Opaque identifier to enable pagination. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database, design document or view is missed NOTE: Faceting is not supported on partitioned searches, so the following query parameters should not be used on those requests: counts and ranges. SEE ALSO: For more information about how nouveau works, see the Nouveau User Guide. /{db}/_design/{ddoc}/_nouveau_info/{index} WARNING: Nouveau is an experimental feature. Future releases might change how the endpoints work and might invalidate existing indexes. WARNING: Nouveau endpoints require a running nouveau server. See Nouveau Server Installation for details. Added in version 3.4.0. GET /{db}/_design/{ddoc}/_nouveau_info/{index} Parameters • db Database name • ddoc Design document name • index Search index name Status Codes • 200 OK Request completed successfully • 400 Bad Request Request body is wrong (malformed or missing one of the mandatory fields) • 500 Internal Server Error A server error (or other kind of error) occurred Request: GET /recipes/_design/cookbook/_nouveau_info/ingredients HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: application/json { "name": "_design/cookbook/ingredients", "search_index": { "num_docs": 1000, "update_seq": 5000, "disk_size": 1048576 } } /{db}/_design/{ddoc}/_show/{func} WARNING: Show functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. GET /{db}/_design/{ddoc}/_show/{func} POST /{db}/_design/{ddoc}/_show/{func} Applies show function for null document. The request and response parameters are depended upon function implementation. Parameters • db Database name • ddoc Design document name • func Show function name Response Headers • ETag Response signature Query Parameters • format (string) Format of the returned response. 
Used by provides() function Status Codes • 200 OK Request completed successfully • 500 Internal Server Error Query server error Function: function(doc, req) { if (!doc) { return {body: "no doc"} } else { return {body: doc.description} } } Request: GET /recipes/_design/recipe/_show/description HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Length: 6 Content-Type: text/html; charset=utf-8 Date: Wed, 21 Aug 2013 12:34:07 GMT Etag: "7Z2TO7FPEMZ0F4GH0RJCRIOAU" Server: CouchDB (Erlang/OTP) Vary: Accept no doc /{db}/_design/{ddoc}/_show/{func}/{docid} WARNING: Show functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. GET /{db}/_design/{ddoc}/_show/{func}/{docid} POST /{db}/_design/{ddoc}/_show/{func}/{docid} Applies show function for the specified document. The request and response parameters are depended upon function implementation. Parameters • db Database name • ddoc Design document name • func Show function name • docid Document ID Response Headers • ETag Response signature Query Parameters • format (string) Format of the returned response. Used by provides() function Status Codes • 200 OK Request completed successfully • 500 Internal Server Error Query server error Function: function(doc, req) { if (!doc) { return {body: "no doc"} } else { return {body: doc.description} } } Request: GET /recipes/_design/recipe/_show/description/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Length: 88 Content-Type: text/html; charset=utf-8 Date: Wed, 21 Aug 2013 12:38:08 GMT Etag: "8IEBO8103EI98HDZL5Z4I1T0C" Server: CouchDB (Erlang/OTP) Vary: Accept An Italian-American dish that usually consists of spaghetti, tomato sauce and meatballs. /{db}/_design/{ddoc}/_list/{func}/{view} WARNING: List functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. GET /{db}/_design/{ddoc}/_list/{func}/{view} POST /{db}/_design/{ddoc}/_list/{func}/{view} Applies list function for the view function from the same design document. The request and response parameters are depended upon function implementation. Parameters • db Database name • ddoc Design document name • func List function name • view View function name Response Headers • ETag Response signature • Transfer-Encoding chunked Query Parameters • format (string) Format of the returned response. Used by provides() function Status Codes • 200 OK Request completed successfully • 500 Internal Server Error Query server error Function: function(head, req) { var row = getRow(); if (!row){ return 'no ingredients' } send(row.key); while(row=getRow()){ send(', ' + row.key); } } Request: GET /recipes/_design/recipe/_list/ingredients/by_name HTTP/1.1 Accept: text/plain Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: text/plain; charset=utf-8 Date: Wed, 21 Aug 2013 12:49:15 GMT Etag: "D52L2M1TKQYDD1Y8MEYJR8C84" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked Vary: Accept meatballs, spaghetti, tomato sauce /{db}/_design/{ddoc}/_list/{func}/{other-ddoc}/{view} WARNING: List functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. GET /{db}/_design/{ddoc}/_list/{func}/{other-ddoc}/{view} POST /{db}/_design/{ddoc}/_list/{func}/{other-ddoc}/{view} Applies list function for the view function from the other de- sign document. The request and response parameters are depended upon function implementation. 
Parameters • db Database name • ddoc Design document name • func List function name • other-ddoc Other design document name that holds view function • view View function name Response Headers • ETag Response signature • Transfer-Encoding chunked Query Parameters • format (string) Format of the returned response. Used by provides() function Status Codes • 200 OK Request completed successfully • 500 Internal Server Error Query server error Function: function(head, req) { var row = getRow(); if (!row){ return 'no ingredients' } send(row.key); while(row=getRow()){ send(', ' + row.key); } } Request: GET /recipes/_design/ingredient/_list/ingredients/recipe/by_ingredient?key="spaghetti" HTTP/1.1 Accept: text/plain Host: localhost:5984 Response: HTTP/1.1 200 OK Content-Type: text/plain; charset=utf-8 Date: Wed, 21 Aug 2013 12:49:15 GMT Etag: "5L0975X493R0FB5Z3043POZHD" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked Vary: Accept spaghetti /{db}/_design/{ddoc}/_update/{func} POST /{db}/_design/{ddoc}/_update/{func} Executes update function on server side for null document. Parameters • db Database name • ddoc Design document name • func Update function name Response Headers • X-Couch-Id Created/updated documents ID • X-Couch-Update-NewRev Created/updated documents revi- sion Status Codes • 200 OK No document was created or updated • 201 Created Document was created or updated • 500 Internal Server Error Query server error Function: function(doc, req) { if (!doc){ return [null, {'code': 400, 'json': {'error': 'missed', 'reason': 'no document to update'}}] } else { doc.ingredients.push(req.body); return [doc, {'json': {'status': 'ok'}}]; } } Request: POST /recipes/_design/recipe/_update/ingredients HTTP/1.1 Accept: application/json Content-Length: 10 Content-Type: application/json Host: localhost:5984 "something" Response: HTTP/1.1 404 Object Not Found Cache-Control: must-revalidate Content-Length: 52 Content-Type: application/json Date: Wed, 21 Aug 2013 14:00:58 GMT Server: CouchDB (Erlang/OTP) { "error": "missed", "reason": "no document to update" } /{db}/_design/{ddoc}/_update/{func}/{docid} PUT /{db}/_design/{ddoc}/_update/{func}/{docid} Executes update function on server side for the specified docu- ment. Parameters • db Database name • ddoc Design document name • func Update function name • docid Document ID Response Headers • X-Couch-Id Created/updated documents ID • X-Couch-Update-NewRev Created/updated documents revi- sion Status Codes • 200 OK No document was created or updated • 201 Created Document was created or updated • 500 Internal Server Error Query server error Function: function(doc, req) { if (!doc){ return [null, {'code': 400, 'json': {'error': 'missed', 'reason': 'no document to update'}}] } else { doc.ingredients.push(req.body); return [doc, {'json': {'status': 'ok'}}]; } } Request: PUT /recipes/_design/recipe/_update/ingredients/SpaghettiWithMeatballs HTTP/1.1 Accept: application/json Content-Length: 5 Content-Type: application/json Host: localhost:5984 "love" Response: HTTP/1.1 201 Created Cache-Control: must-revalidate Content-Length: 16 Content-Type: application/json Date: Wed, 21 Aug 2013 14:11:34 GMT Server: CouchDB (Erlang/OTP) X-Couch-Id: SpaghettiWithMeatballs X-Couch-Update-NewRev: 12-a5e099df5720988dae90c8b664496baf { "status": "ok" } /{db}/_design/{ddoc}/_rewrite/{path} WARNING: Rewrites are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. 
ANY /{db}/_design/{ddoc}/_rewrite/{path}
Rewrites the specified path by rules defined in the specified design document. The rewrite rules are defined by the rewrites field of the design document. The rewrites field can either be a string containing a rewrite function or an array of rule definitions.

Using a stringified function for rewrites

Added in version 2.0: When the rewrites field is a stringified function, the query server is used to pre-process and route requests.

The function takes a Request2 object.

The return value of the function will cause the server to rewrite the request to a new location or immediately return a response.

To rewrite the request, return an object containing the following properties:

• path (string): Rewritten path.
• query (array): Rewritten query. If omitted, the original query keys are used.
• headers (object): Rewritten headers. If omitted, the original request headers are used.
• method (string): HTTP method of rewritten request ("GET", "POST", etc). If omitted, the original request method is used.
• body (string): Body for "POST"/"PUT" requests. If omitted, the original request body is used.

To immediately respond to the request, return an object containing the following properties:

• code (number): Returned HTTP status code (200, 404, etc).
• body (string): Body of the response to user.

Example A. Restricting access.

function(req2) {
    var path = req2.path.slice(4),
        isWrite = /^(put|post|delete)$/i.test(req2.method),
        isFinance = req2.userCtx.roles.indexOf("finance") > -1;
    if (path[0] == "finance" && isWrite && !isFinance) {
        // Deny writes to DB "finance" for users
        // having no "finance" role
        return {
            code: 403,
            body: JSON.stringify({
                error: "forbidden",
                reason: "You are not allowed to modify docs in this DB"
            })
        };
    }
    // Pass through all other requests
    return { path: "../../../" + path.join("/") };
}

Example B. Different replies for JSON and HTML requests.

function(req2) {
    var path = req2.path.slice(4),
        h = req2.headers,
        wantsJson = (h.Accept || "").indexOf("application/json") > -1,
        reply = {};
    if (!wantsJson) {
        // Here we should prepare reply object
        // for plain HTML pages
    } else {
        // Pass through JSON requests
        reply.path = "../../../" + path.join("/");
    }
    return reply;
}

Using an array of rules for rewrites

When the rewrites field is an array of rule objects, the server will rewrite the request based on the first matching rule in the array.

Each rule in the array is an object with the following fields:

• method (string): HTTP request method to bind the request method to the rule. If omitted, uses "*", which matches all methods.
• from (string): The pattern used to compare against the URL and define dynamic variables.
• to (string): The path to rewrite the URL to. It can contain variables depending on binding variables discovered during pattern matching and query args (URL args and from the query member).
• query (object): Query args passed to the rewritten URL. They may contain dynamic variables.

The to and from paths may contain string patterns with leading : or * characters to define dynamic variables in the match.

The first rule in the rewrites array that matches the incoming request is used to define the rewrite. To match the incoming request, the rule's method must match the request's HTTP method and the rule's from must match the request's path using the following pattern matching logic.

• The from pattern and URL are first split on / to get a list of tokens.
For example, if the from field is /somepath/:var/* and the URL is /somepath/a/b/c, the tokens are somepath, :var, and * for the from pattern and somepath, a, b, and c for the URL.
• Each token starting with : in the pattern will match the corresponding token in the URL and define a new dynamic variable whose name is the remaining string after the : and whose value is the token from the URL. In this example, the :var token will match a and set var = a.
• The star token * in the pattern will match any number of tokens in the URL and must be the last token in the pattern. It will define a dynamic variable with the remaining tokens. In this example, the * token will match the b and c tokens and set * = b/c.
• The remaining tokens must match exactly for the pattern to be considered a match. In this example, somepath in the pattern matches somepath in the URL and all tokens in the URL have matched, causing this rule to be a match.

Once a rule is found, the request URL is rewritten using the to and query fields. Dynamic variables are substituted into the : and * variables in these fields to produce the final URL.

If no rule matches, a 404 Not Found response is returned.

Examples:

+------------------------------------------------+----------+------------------+--------+
| Rule                                           | URL      | Rewrite to       | Tokens |
+------------------------------------------------+----------+------------------+--------+
| {from: /a, to: /some}                          | /a       | /some            |        |
+------------------------------------------------+----------+------------------+--------+
| {from: /a/*, to: /some/*}                      | /a/b/c   | /some/b/c        |        |
+------------------------------------------------+----------+------------------+--------+
| {from: /a/b, to: /some}                        | /a/b?k=v | /some?k=v        | k=v    |
+------------------------------------------------+----------+------------------+--------+
| {from: /a/b, to: /some/:var}                   | /a/b     | /some/b?var=b    | var=b  |
+------------------------------------------------+----------+------------------+--------+
| {from: /a/:foo/, to: /some/:foo/}              | /a/b/c   | /some/b/c?foo=b  | foo=b  |
+------------------------------------------------+----------+------------------+--------+
| {from: /a/:foo, to: /some, query: { k: :foo }} | /a/b     | /some/?k=b&foo=b | foo=b  |
+------------------------------------------------+----------+------------------+--------+
| {from: /a, to: /some/:foo}                     | /a?foo=b | /some/?b&foo=b   | foo=b  |
+------------------------------------------------+----------+------------------+--------+

Request method, header, query parameters, request payload and response body are dependent on the endpoint to which the URL will be rewritten.

Parameters
• db Database name
• ddoc Design document name
• path URL path to rewrite

Partitioned Databases

Partitioned databases allow for data colocation in a cluster, which provides significant performance improvements for queries constrained to a single partition. See the guide for getting started with partitioned databases.

/{db}/_partition/{partition_id}

GET /{db}/_partition/{partition_id}
This endpoint returns information describing the provided partition. It includes document and deleted document counts along with external and active data sizes.
Status Codes • 200 OK Request completed successfully Request: GET /db/_partition/sensor-260 HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Length: 119 Content-Type: application/json Date: Thu, 24 Jan 2019 17:19:59 GMT Server: CouchDB/2.3.0-a1e11cea9 (Erlang OTP/21) { "db_name": "my_new_db", "doc_count": 1, "doc_del_count": 0, "partition": "sensor-260", "sizes": { "active": 244, "external": 347 } } /{db}/_partition/{partition_id}/_all_docs GET /{db}/_partition/{partition_id}/_all_docs Parameters • db Database name • partition_id Partition name This endpoint is a convenience endpoint for automatically set- ting bounds on the provided partition range. Similar results can be had by using the global /db/_all_docs endpoint with appropri- ately configured values for start_key and end_key. Refer to the view endpoint documentation for a complete descrip- tion of the available query parameters and the format of the re- turned data. Request: GET /db/_partition/sensor-260/_all_docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 10 Aug 2013 16:22:56 GMT ETag: "1W2DJUZFZSZD9K78UFA3GZWB4" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "value": { "rev": "1-05ed6f7abf84250e213fcb847387f6f5" } } ], "total_rows": 1 } /{db}/_partition/{partition_id}/_design/{ddoc}/_view/{view} GET /{db}/_partition/{partition_id}/_design/{ddoc}/_view/{view} Parameters • db Database name • partition_id Partition name • ddoc Design document id • view View name This endpoint is responsible for executing a partitioned query. The returned view result will only contain rows with the speci- fied partition name. Refer to the view endpoint documentation for a complete descrip- tion of the available query parameters and the format of the re- turned data. GET /db/_partition/sensor-260/_design/sensor-readings/_view/by_sensor HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Wed, 21 Aug 2013 09:12:06 GMT ETag: "2FOLSBSW4O6WB798XU4AQYA9B" Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": 0, "rows": [ { "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key": [ "sensor-260", "0" ], "value": null }, { "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key": [ "sensor-260", "1" ], "value": null }, { "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key": [ "sensor-260", "2" ], "value": null }, { "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key": [ "sensor-260", "3" ], "value": null } ], "total_rows": 4 } /{db}/_partition/{partition_id}/_find POST /{db}/_partition/{partition_id}/_find Parameters • db Database name • id (partition) Name of the partition to query This endpoint is responsible for finding a partition query by its ID. The returned view result will only contain rows with the specified partition id. Refer to the find endpoint documentation for a complete descrip- tion of the available parameters and the format of the returned data. 
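For illustration, here is a minimal sketch of such a partitioned _find request, written in the same curl style as the reduce-view examples earlier in this section. The database name db, the partition sensor-260, the adm:pass credentials and the temperature field are assumptions for this sketch only; the request body uses the same selector syntax as the global /{db}/_find endpoint:

$ curl -X POST http://adm:pass@127.0.0.1:5984/db/_partition/sensor-260/_find \
    -H 'Content-Type: application/json' \
    -d '{"selector": {"temperature": {"$gt": 20}}, "fields": ["_id", "temperature"], "limit": 10}'

Because all documents in a partition are colocated, a query scoped this way only has to consult the data for that single partition, which is where the performance benefit of partitioned databases comes from.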
/{db}/_partition/{partition_id}/_explain

POST /{db}/_partition/{partition_id}/_explain
Parameters
• db Database name
• partition_id Name of the partition to query

This endpoint shows which index is being used by the query.

Refer to the explain endpoint documentation for a complete description of the available parameters and the format of the returned data.

Local (non-replicating) Documents

The local (non-replicating) document interface allows you to create local documents that are not replicated to other databases. These documents can be used to hold configuration or other information that is required specifically on the local CouchDB instance.

Local documents have the following limitations:

• Local documents are not replicated to other databases.
• Local documents are not output by views, or the /{db}/_all_docs view.

From CouchDB 2.0, local documents can be listed by using the /{db}/_local_docs endpoint.

Local documents can be used when you want to store configuration or other information for the current (local) instance of a given database. A list of the available methods and URL paths is provided below:

+-----------+---------------------------+----------------------------------------------+
| Method    | Path                      | Description                                  |
+-----------+---------------------------+----------------------------------------------+
| GET, POST | /{db}/_local_docs         | Returns a list of all the non-replicated     |
|           |                           | documents in the database                    |
+-----------+---------------------------+----------------------------------------------+
| POST      | /{db}/_local_docs/queries | Returns a list of specified non-replicated   |
|           |                           | documents in the database                    |
+-----------+---------------------------+----------------------------------------------+
| GET       | /{db}/_local/{docid}      | Returns the latest revision of the           |
|           |                           | non-replicated document                      |
+-----------+---------------------------+----------------------------------------------+
| PUT       | /{db}/_local/{docid}      | Inserts a new version of the non-replicated  |
|           |                           | document                                     |
+-----------+---------------------------+----------------------------------------------+
| DELETE    | /{db}/_local/{docid}      | Deletes the non-replicated document          |
+-----------+---------------------------+----------------------------------------------+
| COPY      | /{db}/_local/{docid}      | Copies the non-replicated document           |
+-----------+---------------------------+----------------------------------------------+

/{db}/_local_docs

GET /{db}/_local_docs
Returns a JSON structure of all of the local documents in a given database. The information is returned as a JSON structure containing meta information about the return structure, including a list of all local documents and basic contents, consisting of the ID, revision and key. The key is derived from the local document's _id.

Parameters
• db Database name

Request Headers
• Accept
  • application/json
  • text/plain

Query Parameters
• conflicts (boolean) Includes conflicts information in response. Ignored if include_docs isn't true. Default is false.
• descending (boolean) Return the local documents in descending order by key. Default is false.
• endkey (string) Stop returning records when the specified key is reached. Optional.
• end_key (string) Alias for endkey param.
• endkey_docid (string) Stop returning records when the specified local document ID is reached. Optional.
• end_key_doc_id (string) Alias for endkey_docid param.
• include_docs (boolean) Include the full content of the local documents in the return. Default is false.
• inclusive_end (boolean) Specifies whether the specified end key should be included in the result. Default is true.
• key (string) Return only local documents that match the spec- ified key. Optional. • keys (string) Return only local documents that match the specified keys. Optional. • limit (number) Limit the number of the returned local docu- ments to the specified number. Optional. • skip (number) Skip this number of records before starting to return the results. Default is 0. • startkey (string) Return records starting with the specified key. Optional. • start_key (string) Alias for startkey param. • startkey_docid (string) Return records starting with the specified local document ID. Optional. • start_key_doc_id (string) Alias for startkey_docid param. • update_seq (boolean) Response includes an update_seq value indicating which sequence id of the underlying database the view reflects. Default is false. Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 Response JSON Object • offset (number) Offset where the local document list started • rows (array) Array of view row objects. By default the infor- mation returned contains only the local document ID and revi- sion. • total_rows (number) Number of local documents in the data- base. Note that this is not the number of rows returned in the actual query. • update_seq (number) Current update sequence for the database Status Codes • 200 OK Request completed successfully Request: GET /db/_local_docs HTTP/1.1 Accept: application/json Host: localhost:5984 Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Sat, 23 Dec 2017 16:22:56 GMT Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "offset": null, "rows": [ { "id": "_local/localdoc01", "key": "_local/localdoc01", "value": { "rev": "0-1" } }, { "id": "_local/localdoc02", "key": "_local/localdoc02", "value": { "rev": "0-1" } }, { "id": "_local/localdoc03", "key": "_local/localdoc03", "value": { "rev": "0-1" } }, { "id": "_local/localdoc04", "key": "_local/localdoc04", "value": { "rev": "0-1" } }, { "id": "_local/localdoc05", "key": "_local/localdoc05", "value": { "rev": "0-1" } } ], "total_rows": null } POST /{db}/_local_docs POST _local_docs functionality supports identical parameters and behavior as specified in the GET /{db}/_local_docs API but al- lows for the query string parameters to be supplied as keys in a JSON object in the body of the POST request. Request: POST /db/_local_docs HTTP/1.1 Accept: application/json Content-Length: 70 Content-Type: application/json Host: localhost:5984 { "keys" : [ "_local/localdoc02", "_local/localdoc05" ] } The returned JSON is the all documents structure, but with only the selected keys in the output: { "total_rows" : null, "rows" : [ { "value" : { "rev" : "0-1" }, "id" : "_local/localdoc02", "key" : "_local/localdoc02" }, { "value" : { "rev" : "0-1" }, "id" : "_local/localdoc05", "key" : "_local/localdoc05" } ], "offset" : null } /{db}/_local_docs/queries POST /{db}/_local_docs/queries Querying with specified keys will return local documents only. You can also combine keys with other query parameters, such as limit and skip. Parameters • db Database name Request Headers • Content-Type .INDENT 2.0 • application/json • Accept .INDENT 2.0 • application/json Request JSON Object • queries An array of query objects with fields for the parame- ters of each individual view query to be executed. The field names and their meaning are the same as the query parameters of a regular _local_docs request. 
Response Headers • Content-Type .INDENT 2.0 • application/json • text/plain; charset=utf-8 • Transfer-Encoding chunked Response JSON Object • results (array) An array of result objects - one for each query. Each result object contains the same fields as the re- sponse to a regular _local_docs request. Status Codes • 200 OK Request completed successfully • 400 Bad Request Invalid request • 401 Unauthorized Read permission required • 404 Not Found Specified database is missing • 500 Internal Server Error Query execution error Request: POST /db/_local_docs/queries HTTP/1.1 Content-Type: application/json Accept: application/json Host: localhost:5984 { "queries": [ { "keys": [ "_local/localdoc05", "_local/not-exist", "_design/recipe", "spaghetti" ] } ] } Response: HTTP/1.1 200 OK Cache-Control: must-revalidate Content-Type: application/json Date: Thu, 20 Jul 2023 21:45:37 GMT Server: CouchDB (Erlang/OTP) Transfer-Encoding: chunked { "results": [ { "total_rows": null, "offset": null, "rows": [ { "id": "_local/localdoc05", "key": "_local/localdoc05", "value": { "rev": "0-1" } }, { "key": "_local/not-exist", "error": "not_found" } ] }, { "total_rows": null, "offset": null, "rows": [ { "id": "_local/localdoc04", "key": "_local/localdoc04", "value": { "rev": "0-1" } } ] } ] } NOTE: Similar to _design_docs/queries, /{db}/_local_docs/queries will only return local documents. The difference is total_rows and offset are always null. /{db}/_local/{docid} GET /{db}/_local/{docid} Gets the specified local document. The semantics are identical to accessing a standard document in the specified database, ex- cept that the document is not replicated. See GET /{db}/{docid}. PUT /{db}/_local/{docid} Stores the specified local document. The semantics are identical to storing a standard document in the specified database, except that the document is not replicated. See PUT /{db}/{docid}. DELETE /{db}/_local/{docid} Deletes the specified local document. The semantics are identi- cal to deleting a standard document in the specified database, except that the document is not replicated. See DELETE /{db}/{docid}. COPY /{db}/_local/{docid} Copies the specified local document. The semantics are identical to copying a standard document in the specified database, except that the document is not replicated. See COPY /{db}/{docid}. JSON STRUCTURE REFERENCE The following appendix provides a quick reference to all the JSON structures that you can supply to CouchDB, or get in return to re- quests. 
All Database Documents +-----------------------+----------------------------+ | Field | Description | +-----------------------+----------------------------+ | total_rows | Number of documents in the | | | database/view | +-----------------------+----------------------------+ | offset | Offset where the document | | | list started | +-----------------------+----------------------------+ | update_seq (optional) | Current update sequence | | | for the database | +-----------------------+----------------------------+ | rows [array] | Array of document object | +-----------------------+----------------------------+ Bulk Document Response +--------------+----------------------------+ | Field | Description | +--------------+----------------------------+ | docs [array] | Bulk Docs Returned Docu- | | | ments | +--------------+----------------------------+ | id | Document ID | +--------------+----------------------------+ | error | Error type | +--------------+----------------------------+ | reason | Error string with extended | | | reason | +--------------+----------------------------+ Bulk Documents +---------------------+----------------------------+ | Field | Description | +---------------------+----------------------------+ | docs [array] | Bulk Documents Document | +---------------------+----------------------------+ | _id (optional) | Document ID | +---------------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +---------------------+----------------------------+ | _deleted (optional) | Whether the document | | | should be deleted | +---------------------+----------------------------+ Changes information for a database +-----------------+----------------------------+ | Field | Description | +-----------------+----------------------------+ | last_seq | Last update sequence | +-----------------+----------------------------+ | pending | Count of remaining items | | | in the feed | +-----------------+----------------------------+ | results [array] | Changes made to a database | +-----------------+----------------------------+ | seq | Update sequence | +-----------------+----------------------------+ | id | Document ID | +-----------------+----------------------------+ | changes [array] | List of changes, | | | field-by-field, for this | | | document | +-----------------+----------------------------+ CouchDB Document +-----------------+----------------------------+ | Field | Description | +-----------------+----------------------------+ | _id (optional) | Document ID | +-----------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +-----------------+----------------------------+ CouchDB Error Status +--------+----------------------------+ | Field | Description | +--------+----------------------------+ | id | Document ID | +--------+----------------------------+ | error | Error type | +--------+----------------------------+ | reason | Error string with extended | | | reason | +--------+----------------------------+ CouchDB database information object +----------------------+----------------------------+ | Field | Description | +----------------------+----------------------------+ | db_name | The name of the database. | +----------------------+----------------------------+ | committed_update_seq | The number of committed | | | updates. | +----------------------+----------------------------+ | doc_count | The number of documents in | | | the database. 
| +----------------------+----------------------------+ | doc_del_count | The number of deleted doc- | | | uments. | +----------------------+----------------------------+ | compact_running | Set to true if the data- | | | base compaction routine is | | | operating on this data- | | | base. | +----------------------+----------------------------+ | disk_format_version | The version of the physi- | | | cal format used for the | | | data when it is stored on | | | hard disk. | +----------------------+----------------------------+ | disk_size | Size in bytes of the data | | | as stored on disk. View | | | indexes are not included | | | in the calculation. | +----------------------+----------------------------+ | instance_start_time | Timestamp indicating when | | | the database was opened, | | | expressed in microseconds | | | since the epoch. | +----------------------+----------------------------+ | purge_seq | The number of purge opera- | | | tions on the database. | +----------------------+----------------------------+ | update_seq | Current update sequence | | | for the database. | +----------------------+----------------------------+ Design Document +-------------------+--------------------------+ | Field | Description | +-------------------+--------------------------+ | _id | Design Document ID | +-------------------+--------------------------+ | _rev | Design Document Revision | +-------------------+--------------------------+ | views | View | +-------------------+--------------------------+ | viewname | View Definition | +-------------------+--------------------------+ | map | Map Function for View | +-------------------+--------------------------+ | reduce (optional) | Reduce Function for View | +-------------------+--------------------------+ Design Document Information +-----------------+----------------------------+ | Field | Description | +-----------------+----------------------------+ | name | Name/ID of Design Document | +-----------------+----------------------------+ | view_index | View Index | +-----------------+----------------------------+ | compact_running | Indicates whether a com- | | | paction routine is cur- | | | rently running on the view | +-----------------+----------------------------+ | disk_size | Size in bytes of the view | | | as stored on disk | +-----------------+----------------------------+ | language | Language for the defined | | | views | +-----------------+----------------------------+ | purge_seq | The purge sequence that | | | has been processed | +-----------------+----------------------------+ | signature | MD5 signature of the views | | | for the design document | +-----------------+----------------------------+ | update_seq | The update sequence of the | | | corresponding database | | | that has been indexed | +-----------------+----------------------------+ | updater_running | Indicates if the view is | | | currently being updated | +-----------------+----------------------------+ | waiting_clients | Number of clients waiting | | | on views from this design | | | document | +-----------------+----------------------------+ | waiting_commit | Indicates if there are | | | outstanding commits to the | | | underlying database that | | | need to processed | +-----------------+----------------------------+ Document with Attachments +-------------------------+----------------------------+ | Field | Description | +-------------------------+----------------------------+ | _id (optional) | Document ID | 
+-------------------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +-------------------------+----------------------------+ | _attachments (optional) | Document Attachment | +-------------------------+----------------------------+ | filename | Attachment information | +-------------------------+----------------------------+ | content_type | MIME Content type string | +-------------------------+----------------------------+ | data | File attachment content, | | | Base64 encoded | +-------------------------+----------------------------+ List of Active Tasks +---------------+---------------------+ | Field | Description | +---------------+---------------------+ | tasks [array] | Active Tasks | +---------------+---------------------+ | pid | Process ID | +---------------+---------------------+ | status | Task status message | +---------------+---------------------+ | task | Task name | +---------------+---------------------+ | type | Operation Type | +---------------+---------------------+ Replication Settings +----------------------------+----------------------------+ | Field | Description | +----------------------------+----------------------------+ | source | Source database name or | | | URL. | +----------------------------+----------------------------+ | target | Target database name or | | | URL. | +----------------------------+----------------------------+ | cancel (optional) | Cancels the replication. | +----------------------------+----------------------------+ | checkpoint_interval (op- | Specifies the checkpoint | | tional) | interval in ms. | +----------------------------+----------------------------+ | continuous (optional) | Configure the replication | | | to be continuous. | +----------------------------+----------------------------+ | create_target (optional) | Creates the target data- | | | base. | +----------------------------+----------------------------+ | doc_ids (optional) | Array of document IDs to | | | be synchronized. | +----------------------------+----------------------------+ | filter (optional) | name of the filter func- | | | tion in the form of | | | ddoc/myfilter. | +----------------------------+----------------------------+ | source_proxy (optional) | Address of a proxy server | | | through which replication | | | from the source should oc- | | | cur. | +----------------------------+----------------------------+ | target_proxy (optional) | Address of a proxy server | | | through which replication | | | to the target should oc- | | | cur. | +----------------------------+----------------------------+ | query_params (optional) | Query parameter that are | | | passed to the filter func- | | | tion; the value should be | | | a document containing pa- | | | rameters as members. | +----------------------------+----------------------------+ | selector (optional) | Select the documents in- | | | cluded in the replication. | | | This option provides per- | | | formance benefits compared | | | with using the filter op- | | | tion. | +----------------------------+----------------------------+ | since_seq (optional) | Sequence from which the | | | replication should start. | +----------------------------+----------------------------+ | use_checkpoints (optional) | Whether to use replication | | | checkpoints or not. | +----------------------------+----------------------------+ | winning_revs_only (op- | Replicate only the winning | | tional) | revisions. 
| +----------------------------+----------------------------+ | use_bulk_get (optional) | Try to use _bulk_get to | | | fetch revisions. | +----------------------------+----------------------------+ Replication Status +--------------------+----------------------------+ | Field | Description | +--------------------+----------------------------+ | ok | Replication status | +--------------------+----------------------------+ | session_id | Unique session ID | +--------------------+----------------------------+ | source_last_seq | Last sequence number read | | | from the source database | +--------------------+----------------------------+ | history [array] | Replication History | +--------------------+----------------------------+ | session_id | Session ID for this repli- | | | cation operation | +--------------------+----------------------------+ | recorded_seq | Last recorded sequence | | | number | +--------------------+----------------------------+ | docs_read | Number of documents read | +--------------------+----------------------------+ | docs_written | Number of documents writ- | | | ten to target | +--------------------+----------------------------+ | doc_write_failures | Number of document write | | | failures | +--------------------+----------------------------+ | start_time | Date/Time replication op- | | | eration started | +--------------------+----------------------------+ | start_last_seq | First sequence number in | | | changes stream | +--------------------+----------------------------+ | end_time | Date/Time replication op- | | | eration completed | +--------------------+----------------------------+ | end_last_seq | Last sequence number in | | | changes stream | +--------------------+----------------------------+ | missing_checked | Number of missing docu- | | | ments checked | +--------------------+----------------------------+ | missing_found | Number of missing docu- | | | ments found | +--------------------+----------------------------+ | bulk_get_attempts | Number of attempted | | | _bulk_get fetches | +--------------------+----------------------------+ | bulk_get_docs | Number of documents read | | | with _bulk_get | +--------------------+----------------------------+ Request object +----------------+----------------------------+ | Field | Description | +----------------+----------------------------+ | body | Request body data as | | | string. If the request | | | method is GET this field | | | contains the value "unde- | | | fined". If the method is | | | DELETE or HEAD the value | | | is "" (empty string). | +----------------+----------------------------+ | cookie | Cookies object. | +----------------+----------------------------+ | form | Form data object. Con- | | | tains the decoded body as | | | key-value pairs if the | | | Content-Type header was | | | application/x-www-form-ur- | | | lencoded. | +----------------+----------------------------+ | headers | Request headers object. | +----------------+----------------------------+ | id | Requested document id | | | string if it was specified | | | or null otherwise. | +----------------+----------------------------+ | info | Database information | +----------------+----------------------------+ | method | Request method as string | | | or array. String value is | | | a method as one of: HEAD, | | | GET, POST, PUT, DELETE, | | | OPTIONS, and TRACE. Other- | | | wise it will be repre- | | | sented as an array of char | | | codes. 
| +----------------+----------------------------+ | path | List of requested path | | | sections. | +----------------+----------------------------+ | peer | Request source IP address. | +----------------+----------------------------+ | query | URL query parameters ob- | | | ject. Note that multiple | | | keys are not supported and | | | the last key value sup- | | | presses others. | +----------------+----------------------------+ | requested_path | List of actual requested | | | path section. | +----------------+----------------------------+ | raw_path | Raw requested path string. | +----------------+----------------------------+ | secObj | Security Object. | +----------------+----------------------------+ | userCtx | User Context Object. | +----------------+----------------------------+ | uuid | Generated UUID by a speci- | | | fied algorithm in the con- | | | fig file. | +----------------+----------------------------+ { "body": "undefined", "cookie": { "AuthSession": "cm9vdDo1MDZBRjQzRjrfcuikzPRfAn-EA37FmjyfM8G8Lw", "m": "3234" }, "form": {}, "headers": { "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3", "Accept-Encoding": "gzip,deflate,sdch", "Accept-Language": "en-US,en;q=0.8", "Connection": "keep-alive", "Cookie": "m=3234:t|3247:t|6493:t|6967:t|34e2:|18c3:t|2c69:t|5acb:t|ca3:t|c01:t|5e55:t|77cb:t|2a03:t|1d98:t|47ba:t|64b8:t|4a01:t; AuthSession=cm9vdDo1MDZBRjQzRjrfcuikzPRfAn-EA37FmjyfM8G8Lw", "Host": "127.0.0.1:5984", "User-Agent": "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.75 Safari/535.7" }, "id": "foo", "info": { "committed_update_seq": 2701412, "compact_running": false, "db_name": "mailbox", "disk_format_version": 6, "doc_count": 2262757, "doc_del_count": 560, "instance_start_time": "1347601025628957", "purge_seq": 0, "sizes": { "active": 7580843252, "disk": 14325313673, "external": 7803423459 }, "update_seq": 2701412 }, "method": "GET", "path": [ "mailbox", "_design", "request", "_show", "dump", "foo" ], "peer": "127.0.0.1", "query": {}, "raw_path": "/mailbox/_design/request/_show/dump/foo", "requested_path": [ "mailbox", "_design", "request", "_show", "dump", "foo" ], "secObj": { "admins": { "names": [ "Bob" ], "roles": [] }, "members": { "names": [ "Mike", "Alice" ], "roles": [] } }, "userCtx": { "db": "mailbox", "name": "Mike", "roles": [ "user" ] }, "uuid": "3184f9d1ea934e1f81a24c71bde5c168" } Request2 object +----------------+----------------------------+ | Field | Description | +----------------+----------------------------+ | body | Request body data as | | | string. If the request | | | method is GET this field | | | contains the value "unde- | | | fined". If the method is | | | DELETE or HEAD the value | | | is "" (empty string). | +----------------+----------------------------+ | cookie | Cookies object. | +----------------+----------------------------+ | headers | Request headers object. | +----------------+----------------------------+ | method | Request method as string | | | or array. String value is | | | a method as one of: HEAD, | | | GET, POST, PUT, DELETE, | | | OPTIONS, and TRACE. Other- | | | wise it will be repre- | | | sented as an array of char | | | codes. | +----------------+----------------------------+ | path | List of requested path | | | sections. | +----------------+----------------------------+ | peer | Request source IP address. | +----------------+----------------------------+ | query | URL query parameters ob- | | | ject. 
Note that multiple | | | keys are not supported and | | | the last key value sup- | | | presses others. | +----------------+----------------------------+ | requested_path | List of actual requested | | | path section. | +----------------+----------------------------+ | raw_path | Raw requested path string. | +----------------+----------------------------+ | secObj | Security Object. | +----------------+----------------------------+ | userCtx | User Context Object. | +----------------+----------------------------+ Response object +---------+----------------------------+ | Field | Description | +---------+----------------------------+ | code | HTTP status code number. | +---------+----------------------------+ | json | JSON encodable object. | | | Implicitly sets Con- | | | tent-Type header as appli- | | | cation/json. | +---------+----------------------------+ | body | Raw response text string. | | | Implicitly sets Con- | | | tent-Type header as | | | text/html; charset=utf-8. | +---------+----------------------------+ | base64 | Base64 encoded string. | | | Implicitly sets Con- | | | tent-Type header as appli- | | | cation/binary. | +---------+----------------------------+ | headers | Response headers object. | | | Content-Type header from | | | this object overrides any | | | implicitly assigned one. | +---------+----------------------------+ | stop | boolean signal to stop it- | | | eration over view result | | | rows (for list functions | | | only) | +---------+----------------------------+ WARNING: The body, base64 and json object keys are overlapping each other where the last one wins. Since most realizations of key-value ob- jects do not preserve the key order or if they are mixed, confusing situations can occur. Try to use only one of them. NOTE: Any custom property makes CouchDB raise an internal exception. Fur- thermore, the Response object could be a simple string value which would be implicitly wrapped into a {"body": ...} object. 
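For illustration, here is a minimal sketch of a show function that builds such a response object. The greeting text and the fallback value are invented for the example; the returned object uses only the fields described in the table above.
function(doc, req) {
    // code, headers and body are ordinary Response object fields; the
    // explicit Content-Type here overrides the implicitly assigned one.
    return {
        code: 200,
        headers: {"Content-Type": "text/plain; charset=utf-8"},
        body: "Hello, " + ((doc && doc._id) || "stranger") + "!"
    };
}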
Returned CouchDB Document with Detailed Revision Info +--------------------+----------------------------+ | Field | Description | +--------------------+----------------------------+ | _id (optional) | Document ID | +--------------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +--------------------+----------------------------+ | _revs_info [array] | CouchDB document extended | | | revision info | +--------------------+----------------------------+ | rev | Full revision string | +--------------------+----------------------------+ | status | Status of the revision | +--------------------+----------------------------+ Returned CouchDB Document with Revision Info +-----------------+----------------------------+ | Field | Description | +-----------------+----------------------------+ | _id (optional) | Document ID | +-----------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +-----------------+----------------------------+ | _revisions | CouchDB document revisions | +-----------------+----------------------------+ | ids [array] | Array of valid revision | | | IDs, in reverse order | | | (latest first) | +-----------------+----------------------------+ | start | Prefix number for the lat- | | | est revision | +-----------------+----------------------------+ Returned Document with Attachments +-------------------------+----------------------------+ | Field | Description | +-------------------------+----------------------------+ | _id (optional) | Document ID | +-------------------------+----------------------------+ | _rev (optional) | Revision ID (when updating | | | an existing document) | +-------------------------+----------------------------+ | _attachments (optional) | Document attachment | +-------------------------+----------------------------+ | filename | Attachment | +-------------------------+----------------------------+ | stub | Indicates whether the at- | | | tachment is a stub | +-------------------------+----------------------------+ | content_type | MIME Content type string | +-------------------------+----------------------------+ | length | Length (bytes) of the at- | | | tachment data | +-------------------------+----------------------------+ | revpos | Revision where this at- | | | tachment exists | +-------------------------+----------------------------+ Security Object +---------------+----------------------------+ | Field | Description | +---------------+----------------------------+ | admins | Roles/Users with admin | | | privileges | +---------------+----------------------------+ | roles [array] | List of roles with parent | | | privilege | +---------------+----------------------------+ | names [array] | List of users with parent | | | privilege | +---------------+----------------------------+ | members | Roles/Users with non-admin | | | privileges | +---------------+----------------------------+ | roles [array] | List of roles with parent | | | privilege | +---------------+----------------------------+ | names [array] | List of users with parent | | | privilege | +---------------+----------------------------+ { "admins": { "names": [ "Bob" ], "roles": [] }, "members": { "names": [ "Mike", "Alice" ], "roles": [] } } User Context Object +-------+----------------------------+ | Field | Description | +-------+----------------------------+ | db | Database name in the con- | | | text of the provided oper- | | | ation. 
| +-------+----------------------------+ | name | User name. | +-------+----------------------------+ | roles | List of user roles. | +-------+----------------------------+ { "db": "mailbox", "name": null, "roles": [ "_admin" ] } View Head Information +------------+----------------------------+ | Field | Description | +------------+----------------------------+ | total_rows | Number of documents in the | | | view | +------------+----------------------------+ | offset | Offset where the document | | | list started | +------------+----------------------------+ { "total_rows": 42, "offset": 3 }
QUERY SERVER The Query Server is an external process that communicates with CouchDB over a JSON protocol on a stdio interface and processes all design function calls, such as JavaScript views. The default query server is written in JavaScript, running via Mozilla SpiderMonkey. You can use other languages by setting a Query server key in the language property of a design document or the Content-Type header of a temporary view. Design documents that do not specify a language property are assumed to be of type javascript.
Query Server Protocol A Query Server is an external process that communicates with CouchDB via a simple, custom JSON protocol over stdin/stdout. It is used to process all design function calls: views, shows, lists, filters, updates and validate_doc_update. CouchDB communicates with the Query Server process through stdin/stdout with JSON messages that are terminated by a newline character. Messages that are sent to the Query Server are always array-typed and follow the pattern [<command>, <*arguments>]\n. NOTE: In the documentation examples, we omit the trailing \n for greater readability. Also, examples contain formatted JSON values while real data is transferred in compact mode without formatting spaces.
reset Command reset Arguments Query server state (optional) Returns true This resets the state of the Query Server and makes it forget all previous input. If applicable, this is the point to run garbage collection. CouchDB sends: ["reset"] The Query Server answers: true To set up new Query Server state, the second argument is used with object data. CouchDB sends: ["reset", {"reduce_limit": true, "timeout": 5000}] The Query Server answers: true
add_lib Command add_lib Arguments CommonJS library object by views/lib path Returns true Adds a CommonJS library to the Query Server state for further use in map functions. CouchDB sends: [ "add_lib", { "utils": "exports.MAGIC = 42;" } ] The Query Server answers: true NOTE: This library shouldn't have any side effects nor track its own state, or you'll have a lot of happy debugging time if something goes wrong. Remember that a complete index rebuild is a heavy operation and this is the only way to fix mistakes with shared state.
add_fun Command add_fun Arguments Map function source code. Returns true When creating or updating a view, this is how the Query Server is sent the view function for evaluation. The Query Server should parse, compile, and evaluate the function it receives to make it callable later. If this fails, the Query Server returns an error. CouchDB may store multiple functions before sending any documents.
CouchDB sends: [ "add_fun", "function(doc) { if(doc.score > 50) emit(null, {'player_name': doc.name}); }" ] The Query Server answers: true map_doc Command map_doc Arguments Document object Returns Array of key-value pairs per applied function When the view function is stored in the Query Server, CouchDB starts sending all the documents in the database, one at a time. The Query Server calls the previously stored functions one after another with a document and stores its result. When all functions have been called, the result is returned as a JSON string. CouchDB sends: [ "map_doc", { "_id": "8877AFF9789988EE", "_rev": "3-235256484", "name": "John Smith", "score": 60 } ] If the function above is the only function stored, the Query Server an- swers: [ [ [null, {"player_name": "John Smith"}] ] ] That is, an array with the result for every function for the given doc- ument. If a document is to be excluded from the view, the array should be empty. CouchDB sends: [ "map_doc", { "_id": "9590AEB4585637FE", "_rev": "1-674684684", "name": "Jane Parker", "score": 43 } ] The Query Server answers: [[]] reduce Command reduce Arguments • Reduce function source • Array of map function results where each item represented in format [[key, id-of-doc], value] Returns Array with pair values: true and another array with reduced re- sult If the view has a reduce function defined, CouchDB will enter into the reduce phase. The Query Server will receive a list of reduce functions and some map results on which it can apply them. CouchDB sends: [ "reduce", [ "function(k, v) { return sum(v); }" ], [ [[1, "699b524273605d5d3e9d4fd0ff2cb272"], 10], [[2, "c081d0f69c13d2ce2050d684c7ba2843"], 20], [[null, "foobar"], 3] ] ] The Query Server answers: [ true, [33] ] Note that even though the view server receives the map results in the form [[key, id-of-doc], value], the function may receive them in a dif- ferent form. For example, the JavaScript Query Server applies functions on the list of keys and the list of values. rereduce Command rereduce Arguments • Reduce function source • List of values When building a view, CouchDB will apply the reduce step directly to the output of the map step and the rereduce step to the output of a previous reduce step. CouchDB will send a list of reduce functions and a list of values, with no keys or document ids to the rereduce step. CouchDB sends: [ "rereduce", [ "function(k, v, r) { return sum(v); }" ], [ 33, 55, 66 ] ] The Query Server answers: [ true, [154] ] ddoc Command ddoc Arguments Array of objects. • First phase (ddoc initialization): • "new" • Design document _id • Design document object • Second phase (design function execution): • Design document _id • Function path as an array of object keys • Array of function arguments Returns • First phase (ddoc initialization): true • Second phase (design function execution): custom object de- pending on executed function This command acts in two phases: ddoc registration and design function execution. In the first phase CouchDB sends a full design document content to the Query Server to let it cache it by _id value for further function exe- cution. 
To do this, CouchDB sends: [ "ddoc", "new", "_design/temp", { "_id": "_design/temp", "_rev": "8-d7379de23a751dc2a19e5638a7bbc5cc", "language": "javascript", "shows": { "request": "function(doc,req){ return {json: req}; }", "hello": "function(doc,req){ return {body: 'Hello, ' + (doc || {})._id + '!'}; }" } } ] The Query Server answers: true After this, the design document will be ready to serve subcommands in the second phase. NOTE: Each ddoc subcommand is the root design document key, so they are not actually subcommands, but first elements of the JSON path that may be handled and processed. The pattern for subcommand execution is common: ["ddoc", <design_doc_id>, [<subcommand>, <funcname>], [<argument1>, <argument2>, ...]] shows WARNING: Show functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. Command ddoc SubCommand shows Arguments • Document object or null if document id isnt specified in re- quest • Request object Returns Array with two elements: • "resp" • Response object Executes show function. Couchdb sends: [ "ddoc", "_design/temp", [ "shows", "doc" ], [ null, { "info": { "db_name": "test", "doc_count": 8, "doc_del_count": 0, "update_seq": 105, "purge_seq": 0, "compact_running": false, "sizes": { "active": 1535048, "disk": 15818856, "external": 15515850 }, "instance_start_time": "1359952188595857", "disk_format_version": 6, "committed_update_seq": 105 }, "id": null, "uuid": "169cb4cc82427cc7322cb4463d0021bb", "method": "GET", "requested_path": [ "api", "_design", "temp", "_show", "request" ], "path": [ "api", "_design", "temp", "_show", "request" ], "raw_path": "/api/_design/temp/_show/request", "query": {}, "headers": { "Accept": "*/*", "Host": "localhost:5984", "User-Agent": "curl/7.26.0" }, "body": "undefined", "peer": "127.0.0.1", "form": {}, "cookie": {}, "userCtx": { "db": "api", "name": null, "roles": [ "_admin" ] }, "secObj": {} } ] ] The Query Server sends: [ "resp", { "body": "Hello, undefined!" } ] lists WARNING: List functions are deprecated in CouchDB 3.0, and will be removed in CouchDB 4.0. Command ddoc SubCommand lists Arguments • View Head Information: • Request object Returns Array. See below for details. Executes list function. The communication protocol for list functions is a bit complex so lets use an example to illustrate. Assume we have view a function that emits id-rev pairs: function(doc) { emit(doc._id, doc._rev); } And wed like to emulate _all_docs JSON response with list function. Our first version of the list functions looks like this: function(head, req){ start({'headers': {'Content-Type': 'application/json'}}); var resp = head; var rows = []; while(row=getRow()){ rows.push(row); } resp.rows = rows; return toJSON(resp); } The whole communication session during list function execution could be divided on three parts: 1. Initialization The first returned object from the list function is an array with the following structure: ["start", <chunks>, <headers>] Where <chunks> is an array of text chunks that will be sent to the client and <headers> is an object with response HTTP headers. This message is sent from the Query Server to CouchDB on the start() call which initializes the HTTP response to the client: [ "start", [], { "headers": { "Content-Type": "application/json" } } ] After this, the list function may start to process view rows. 2. View Processing Since view results can be extremely large, it is not wise to pass all its rows in a single command. 
Instead, CouchDB can send view rows one by one to the Query Server allowing view processing and output generation to be processed as a stream. CouchDB sends a special array that carries view row data: [ "list_row", { "id": "0cb42c267fe32d4b56b3500bc503e030", "key": "0cb42c267fe32d4b56b3500bc503e030", "value": "1-967a00dff5e02add41819138abb3284d" } ] If the Query Server has something to return on this, it returns an array with a "chunks" item in the head and an array of data in the tail. For this example it has nothing to return, so the response will be: [ "chunks", [] ] When there are no more view rows to process, CouchDB sends a list_end message to signify there is no more data to send: ["list_end"] 3. Finalization The last stage of the communication process is the returning list tail: the last data chunk. After this, processing of the list func- tion will be complete and the client will receive a complete re- sponse. For our example the last message is: [ "end", [ "{\"total_rows\":2,\"offset\":0,\"rows\":[{\"id\":\"0cb42c267fe32d4b56b3500bc503e030\",\"key\":\"0cb42c267fe32d4b56b3500bc503e030\",\"value\":\"1-967a00dff5e02add41819138abb3284d\"},{\"id\":\"431926a69504bde41851eb3c18a27b1f\",\"key\":\"431926a69504bde41851eb3c18a27b1f\",\"value\":\"1-967a00dff5e02add41819138abb3284d\"}]}" ] ] In this example, we have returned our result in a single message from the Query Server. This is okay for small numbers of rows, but for large data sets, perhaps with millions of documents or millions of view rows, this would not be acceptable. Lets fix our list function and see the changes in communication: function(head, req){ start({'headers': {'Content-Type': 'application/json'}}); send('{'); send('"total_rows":' + toJSON(head.total_rows) + ','); send('"offset":' + toJSON(head.offset) + ','); send('"rows":['); if (row=getRow()){ send(toJSON(row)); } while(row=getRow()){ send(',' + toJSON(row)); } send(']'); return '}'; } Wait, what? - youd like to ask. 
Yes, wed build JSON response manually by string chunks, but lets take a look on logs: [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Output :: ["start",["{","\"total_rows\":2,","\"offset\":0,","\"rows\":["],{"headers":{"Content-Type":"application/json"}}] [Wed, 24 Jul 2013 05:45:30 GMT] [info] [<0.18963.1>] 127.0.0.1 - - GET /blog/_design/post/_list/index/all_docs 200 [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Input :: ["list_row",{"id":"0cb42c267fe32d4b56b3500bc503e030","key":"0cb42c267fe32d4b56b3500bc503e030","value":"1-967a00dff5e02add41819138abb3284d"}] [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Output :: ["chunks",["{\"id\":\"0cb42c267fe32d4b56b3500bc503e030\",\"key\":\"0cb42c267fe32d4b56b3500bc503e030\",\"value\":\"1-967a00dff5e02add41819138abb3284d\"}"]] [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Input :: ["list_row",{"id":"431926a69504bde41851eb3c18a27b1f","key":"431926a69504bde41851eb3c18a27b1f","value":"1-967a00dff5e02add41819138abb3284d"}] [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Output :: ["chunks",[",{\"id\":\"431926a69504bde41851eb3c18a27b1f\",\"key\":\"431926a69504bde41851eb3c18a27b1f\",\"value\":\"1-967a00dff5e02add41819138abb3284d\"}"]] [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Input :: ["list_end"] [Wed, 24 Jul 2013 05:45:30 GMT] [debug] [<0.19191.1>] OS Process #Port<0.4444> Output :: ["end",["]","}"]] Note, that now the Query Server sends response by lightweight chunks and if our communication process was extremely slow, the client will see how response data appears on their screen. Chunk by chunk, without waiting for the complete result, like they have for our previous list function. updates Command ddoc SubCommand updates Arguments • Document object or null if document id wasnt specified in re- quest • Request object Returns Array with there elements: • "up" • Document object or null if nothing should be stored • Response object Executes update function. CouchDB sends: [ "ddoc", "_design/id", [ "updates", "nothing" ], [ null, { "info": { "db_name": "test", "doc_count": 5, "doc_del_count": 0, "update_seq": 16, "purge_seq": 0, "compact_running": false, "sizes": { "active": 7979745, "disk": 8056936, "external": 8024930 }, "instance_start_time": "1374612186131612", "disk_format_version": 6, "committed_update_seq": 16 }, "id": null, "uuid": "7b695cb34a03df0316c15ab529002e69", "method": "POST", "requested_path": [ "test", "_design", "1139", "_update", "nothing" ], "path": [ "test", "_design", "1139", "_update", "nothing" ], "raw_path": "/test/_design/1139/_update/nothing", "query": {}, "headers": { "Accept": "*/*", "Accept-Encoding": "identity, gzip, deflate, compress", "Content-Length": "0", "Host": "localhost:5984" }, "body": "", "peer": "127.0.0.1", "form": {}, "cookie": {}, "userCtx": { "db": "test", "name": null, "roles": [ "_admin" ] }, "secObj": {} } ] ] The Query Server answers: [ "up", null, {"body": "document id wasn't provided"} ] or in case of successful update: [ "up", { "_id": "7b695cb34a03df0316c15ab529002e69", "hello": "world!" }, {"body": "document was updated"} ] filters Command ddoc SubCommand filters Arguments • Array of document objects • Request object Returns Array of two elements: • true • Array of booleans in the same order of input documents. Executes filter function. 
CouchDB sends: [ "ddoc", "_design/test", [ "filters", "random" ], [ [ { "_id": "431926a69504bde41851eb3c18a27b1f", "_rev": "1-967a00dff5e02add41819138abb3284d", "_revisions": { "start": 1, "ids": [ "967a00dff5e02add41819138abb3284d" ] } }, { "_id": "0cb42c267fe32d4b56b3500bc503e030", "_rev": "1-967a00dff5e02add41819138abb3284d", "_revisions": { "start": 1, "ids": [ "967a00dff5e02add41819138abb3284d" ] } } ], { "info": { "db_name": "test", "doc_count": 5, "doc_del_count": 0, "update_seq": 19, "purge_seq": 0, "compact_running": false, "sizes": { "active": 7979745, "disk": 8056936, "external": 8024930 }, "instance_start_time": "1374612186131612", "disk_format_version": 6, "committed_update_seq": 19 }, "id": null, "uuid": "7b695cb34a03df0316c15ab529023a81", "method": "GET", "requested_path": [ "test", "_changes?filter=test", "random" ], "path": [ "test", "_changes" ], "raw_path": "/test/_changes?filter=test/random", "query": { "filter": "test/random" }, "headers": { "Accept": "application/json", "Accept-Encoding": "identity, gzip, deflate, compress", "Content-Length": "0", "Content-Type": "application/json; charset=utf-8", "Host": "localhost:5984" }, "body": "", "peer": "127.0.0.1", "form": {}, "cookie": {}, "userCtx": { "db": "test", "name": null, "roles": [ "_admin" ] }, "secObj": {} } ] ] The Query Server answers: [ true, [ true, false ] ] views Command ddoc SubCommand views Arguments Array of document objects Returns Array of two elements: • true • Array of booleans in the same order of input documents. Added in version 1.2. Executes view function in place of the filter. Acts in the same way as filters command. validate_doc_update Command ddoc SubCommand validate_doc_update Arguments • Document object that will be stored • Document object that will be replaced • User Context Object • Security Object Returns 1 Executes validation function. CouchDB send: [ "ddoc", "_design/id", ["validate_doc_update"], [ { "_id": "docid", "_rev": "2-e0165f450f6c89dc6b071c075dde3c4d", "score": 10 }, { "_id": "docid", "_rev": "1-9f798c6ad72a406afdbf470b9eea8375", "score": 4 }, { "name": "Mike", "roles": ["player"] }, { "admins": {}, "members": [] } ] ] The Query Server answers: 1 NOTE: While the only valid response for this command is true, to prevent the document from being saved, the Query Server needs to raise an error: forbidden or unauthorized; these errors will be turned into correct HTTP 403 and HTTP 401 responses respectively. rewrites Command ddoc SubCommand rewrites Arguments • Request2 object Returns 1 Executes rewrite function. 
CouchDB sends: [ "ddoc", "_design/id", ["rewrites"], [ { "method": "POST", "requested_path": [ "test", "_design", "1139", "_update", "nothing" ], "path": [ "test", "_design", "1139", "_update", "nothing" ], "raw_path": "/test/_design/1139/_update/nothing", "query": {}, "headers": { "Accept": "*/*", "Accept-Encoding": "identity, gzip, deflate, compress", "Content-Length": "0", "Host": "localhost:5984" }, "body": "", "peer": "127.0.0.1", "cookie": {}, "userCtx": { "db": "test", "name": null, "roles": [ "_admin" ] }, "secObj": {} } ] ] The Query Server answers: [ "ok", { "path": "some/path", "query": {"key1": "value1", "key2": "value2"}, "method": "METHOD", "headers": {"Header1": "value1", "Header2": "value2"}, "body": "" } ] or in case of direct response: [ "ok", { "headers": {"Content-Type": "text/plain"}, "body": "Welcome!", "code": 200 } ] or for immediate redirect: [ "ok", { "headers": {"Location": "http://example.com/path/"}, "code": 302 } ]
Returning errors When something goes wrong, the Query Server can inform CouchDB by sending a special message in response to the received command. Error messages prevent further command execution and return an error description to CouchDB. Errors are logically divided into two groups: • Common errors. These errors only break the current Query Server command and return the error info to the CouchDB instance without terminating the Query Server process. • Fatal errors. Fatal errors signal a condition that cannot be recovered. For instance, if a design function is unable to import a third-party module, it is better to treat such an error as fatal and terminate the whole process.
error To raise an error, the Query Server should respond with: ["error", "error_name", "reason why"] The "error_name" helps to classify problems by their type, e.g. "value_error" to indicate improper data, "not_found" to indicate a missing resource and "type_error" to indicate an improper data type. The "reason why" explains in human-readable terms what went wrong, and possibly how to resolve it. For example, calling Update Functions against a non-existent document could produce the error message: ["error", "not_found", "Update function requires existent document"]
forbidden The forbidden error is widely used by Validate Document Update Functions to stop further function processing and prevent storage of the new document revision. Since this is not actually an error, but an assertion against user actions, CouchDB doesn't log it at error level, but returns an HTTP 403 Forbidden response with an error information object. To raise this error, the Query Server should respond with: {"forbidden": "reason why"}
unauthorized The unauthorized error mostly acts like the forbidden one, but with the meaning of please authorize first. This small difference helps end users to understand what they can do to solve the problem. Similar to forbidden, CouchDB doesn't log it at error level, but returns an HTTP 401 Unauthorized response with an error information object. To raise this error, the Query Server should respond with: {"unauthorized": "reason why"} A JavaScript sketch showing how a validate function triggers both of these responses follows the Logging subsection below.
Logging At any time, the Query Server may send some information that will be saved in CouchDB's log file. This is done by sending a special log object with a single argument, on a separate line: ["log", "some message"] CouchDB does not respond, but writes the received message to the log file: [Sun, 13 Feb 2009 23:31:30 GMT] [info] [<0.72.0>] Query Server Log Message: some message These messages are only logged at info level.
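To make the forbidden and unauthorized responses above more concrete, here is a minimal validate_doc_update sketch; the specific checks are invented for the example, but throwing objects of this shape is what causes the JavaScript query server to reply with the corresponding error messages:
function(newDoc, oldDoc, userCtx, secObj) {
    // Unauthenticated users are asked to log in first (HTTP 401).
    if (!userCtx.name) {
        throw({unauthorized: "Please authenticate first."});
    }
    // Only admins may delete documents in this database (HTTP 403).
    if (newDoc._deleted && userCtx.roles.indexOf("_admin") === -1) {
        throw({forbidden: "Only admins may delete documents."});
    }
}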
JavaScript NOTE: While every design function has access to all JavaScript objects, the table below describes appropriate usage cases. For example, you may use emit() in Map Functions, but getRow() is not permitted dur- ing Map Functions. +----------------+----------------------------+ | JS Function | Reasonable to use in de- | | | sign doc functions | +----------------+----------------------------+ | emit() | Map Functions | +----------------+----------------------------+ | getRow() | List Functions | +----------------+----------------------------+ | JSON | any | +----------------+----------------------------+ | isArray() | any | +----------------+----------------------------+ | log() | any | +----------------+----------------------------+ | provides() | Show Functions, List Func- | | | tions | +----------------+----------------------------+ | registerType() | Show Functions, List Func- | | | tions | +----------------+----------------------------+ | require() | any, except Reduce and | | | Rereduce Functions | +----------------+----------------------------+ | send() | List Functions | +----------------+----------------------------+ | start() | List Functions | +----------------+----------------------------+ | sum() | any | +----------------+----------------------------+ | toJSON() | any | +----------------+----------------------------+ Design functions context Each design function executes in a special context of predefined ob- jects, modules and functions: emit(key, value) Emits a key-value pair for further processing by CouchDB after the map function is done. Arguments • key The view key • value The keys associated value function(doc){ emit(doc._id, doc._rev); } getRow() Extracts the next row from a related view result. Returns View result row Return type object function(head, req){ send('['); row = getRow(); if (row){ send(toJSON(row)); while(row = getRow()){ send(','); send(toJSON(row)); } } return ']'; } JSON JSON object. isArray(obj) A helper function to check if the provided value is an Array. Arguments • obj Any JavaScript value Returns true if obj is Array-typed, false otherwise Return type boolean log(message) Log a message to the CouchDB log (at the INFO level). Arguments • message Message to be logged function(doc){ log('Procesing doc ' + doc['_id']); emit(doc['_id'], null); } After the map function has run, the following line can be found in CouchDB logs (e.g. at /var/log/couchdb/couch.log): [Sat, 03 Nov 2012 17:38:02 GMT] [info] [<0.7543.0>] OS Process #Port<0.3289> Log :: Processing doc 8d300b86622d67953d102165dbe99467 provides(key, func) Registers callable handler for specified MIME key. Arguments • key MIME key previously defined by registerType() • func MIME type handler registerType(key, *mimes) Registers list of MIME types by associated key. Arguments • key MIME types • mimes MIME types enumeration Predefined mappings (key-array): • all: */* • text: text/plain; charset=utf-8, txt • html: text/html; charset=utf-8 • xhtml: application/xhtml+xml, xhtml • xml: application/xml, text/xml, application/x-xml • js: text/javascript, application/javascript, applica- tion/x-javascript • css: text/css • ics: text/calendar • csv: text/csv • rss: application/rss+xml • atom: application/atom+xml • yaml: application/x-yaml, text/yaml • multipart_form: multipart/form-data • url_encoded_form: application/x-www-form-urlencoded • json: application/json, text/x-json require(path) Loads CommonJS module by a specified path. The path should not start with a slash. 
Arguments • path A CommonJS module path started from the design document root Returns Exported statements
send(chunk) Sends a single string chunk in response. Arguments • chunk Text chunk function(head, req){ send('Hello,'); send(' '); send('Couch'); return ; }
start(init_resp) Initiates a chunked response. As an option, a custom response object may be sent at this point. For list functions only! NOTE: list functions may set the HTTP response code and headers by calling this function. This function must be called before send(), getRow() or a return statement; otherwise, the query server will implicitly call this function with the empty object ({}). function(head, req){ start({ "code": 302, "headers": { "Location": "http://couchdb.apache.org" } }); return "Relax!"; }
sum(arr) Sums the items of arr. Arguments • arr Array of numbers Return type number
toJSON(obj) Encodes obj to a JSON string. This is an alias for the JSON.stringify method. Arguments • obj JSON-encodable object Returns JSON string
CommonJS Modules Support for CommonJS Modules (introduced in CouchDB 0.11.0) allows you to create modular design functions without the need for duplication of functionality. Here's a CommonJS module that checks user permissions: function user_context(userctx, secobj) { var is_admin = function() { return userctx['roles'].indexOf('_admin') != -1; } return {'is_admin': is_admin} } exports['user'] = user_context Each module has access to additional global variables: • module (object): Contains information about the stored module, including: • id (string): The module id; a JSON path in ddoc context • current (code): Compiled module code object • parent (object): Parent frame • exports (object): Export statements • exports (object): Shortcut to the module.exports object The CommonJS module can be added to a design document, like so: { "views": { "lib": { "security": "function user_context(userctx, secobj) { ... }" } }, "validate_doc_update": "function(newdoc, olddoc, userctx, secobj) { user = require('views/lib/security').user_context(userctx, secobj); return user.is_admin(); }", "_id": "_design/test" } Module paths are relative to the design document's views object, but modules can only be loaded from the object referenced via lib. The lib structure can still be used for view functions as well, by simply storing view functions at e.g. views.lib.map, views.lib.reduce, etc.
Erlang NOTE: The Erlang query server is disabled by default. Read the configuration guide about the reasons why and how to enable it.
Emit(Id, Value) Emits key-value pairs to the view indexer process. fun({Doc}) -> <<K,_/binary>> = proplists:get_value(<<"_rev">>, Doc, null), V = proplists:get_value(<<"_id">>, Doc, null), Emit(<<K>>, V) end.
FoldRows(Fun, Acc) Helper to iterate over all rows in a list function. Arguments • Fun Function object. • Acc The value previously returned by Fun. fun(Head, {Req}) -> Fun = fun({Row}, Acc) -> Id = couch_util:get_value(<<"id">>, Row), Send(list_to_binary(io_lib:format("Previous doc id: ~p~n", [Acc]))), Send(list_to_binary(io_lib:format("Current doc id: ~p~n", [Id]))), {ok, Id} end, FoldRows(Fun, nil), "" end.
GetRow() Retrieves the next row from a related view result. %% FoldRows background implementation. %% https://git-wip-us.apache.org/repos/asf?p=couchdb.git;a=blob;f=src/couchdb/couch_native_process.erl;hb=HEAD#l368 %% foldrows(GetRow, ProcRow, Acc) -> case GetRow() of nil -> {ok, Acc}; Row -> case (catch ProcRow(Row, Acc)) of {ok, Acc2} -> foldrows(GetRow, ProcRow, Acc2); {stop, Acc2} -> {ok, Acc2} end end.
Log(Msg) Arguments • Msg Log a message at the INFO level. fun({Doc}) -> <<K,_/binary>> = proplists:get_value(<<"_rev">>, Doc, null), V = proplists:get_value(<<"_id">>, Doc, null), Log(lists:flatten(io_lib:format("Hello from ~s doc!", [V]))), Emit(<<K>>, V) end. After the map function has run, the following line can be found in CouchDB logs (e.g. at /var/log/couchdb/couch.log): [Sun, 04 Nov 2012 11:33:58 GMT] [info] [<0.9144.2>] Hello from 8d300b86622d67953d102165dbe99467 doc! Send(Chunk) Sends a single string Chunk in response. fun(Head, {Req}) -> Send("Hello,"), Send(" "), Send("Couch"), "!" end. The function above produces the following response: Hello, Couch! Start(Headers) Arguments • Headers Proplist of response object. Initialize List Functions response. At this point, response code and headers may be defined. For example, this function redirects to the CouchDB web site: fun(Head, {Req}) -> Start({[{<<"code">>, 302}, {<<"headers">>, {[ {<<"Location">>, <<"http://couchdb.apache.org">>}] }} ]}), "Relax!" end. PARTITIONED DATABASES A partitioned database forms documents into logical partitions by using a partition key. All documents are assigned to a partition, and many documents are typically given the same partition key. The benefit of partitioned databases is that secondary indices can be significantly more efficient when locating matching documents since their entries are contained within their partition. This means a given secondary index read will only scan a single partition range instead of having to read from a copy of every shard. As a means to introducing partitioned databases, well consider a moti- vating use case to describe the benefits of this feature. For this ex- ample, well consider a database that stores readings from a large net- work of soil moisture sensors. NOTE: Before reading this document you should be familiar with the theory of sharding in CouchDB. Traditionally, a document in this database may have something like the following structure: { "_id": "sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "_rev":"1-14e8f3262b42498dbd5c672c9d461ff0", "sensor_id": "sensor-260", "location": [41.6171031, -93.7705674], "field_name": "Bob's Corn Field #5", "readings": [ ["2019-01-21T00:00:00", 0.15], ["2019-01-21T06:00:00", 0.14], ["2019-01-21T12:00:00", 0.16], ["2019-01-21T18:00:00", 0.11] ] } NOTE: While this example uses IoT sensors, the main thing to consider is that there is a logical grouping of documents. Similar use cases might be documents grouped by user or scientific data grouped by ex- periment. So weve got a bunch of sensors, all grouped by the field they monitor along with their readouts for a given day (or other appropriate time period). Along with our documents, we might expect to have two secondary indexes for querying our database that might look something like: function(doc) { if(doc._id.indexOf("sensor-reading-") != 0) { return; } for(var r in doc.readings) { emit([doc.sensor_id, r[0]], r[1]) } } and: function(doc) { if(doc._id.indexOf("sensor-reading-") != 0) { return; } emit(doc.field_name, doc.sensor_id) } With these two indexes defined, we can easily find all readings for a given sensor, or list all sensors in a given field. Unfortunately, in CouchDB, when we read from either of these indexes, it requires finding a copy of every shard and asking for any documents related to the particular sensor or field. 
This means that as our database scales up the number of shards, every index request must perform more work, which is unnecessary since we are only interested in a small number of documents. Fortunately for you, dear reader, partitioned databases were created to solve this precise problem.
What is a partition? In the previous section, we introduced a hypothetical database that contains sensor readings from an IoT field monitoring service. In this particular use case, it's quite logical to group all documents by their sensor_id field. In this case, we would call the sensor_id the partition key. A good partition has two basic properties. First, it should have a high cardinality. That is, a large partitioned database should have many more partitions than documents in any single partition. A database that has a single partition would be an anti-pattern for this feature. Secondly, the amount of data per partition should be small. The general recommendation is to limit individual partitions to less than ten gigabytes (10 GB) of data, which, for the example of sensor documents, equates to roughly 60,000 years of data. NOTE: The max_partition_size configuration option dictates the partition size limit. The default value for this option is 10 GiB, but it can be changed as needed. Setting the value for this option to 0 disables the partition limit.
Why use partitions? The primary benefit of using partitioned databases is the performance of partitioned queries. Large databases with lots of documents often have a similar pattern where there are groups of related documents that are queried together. By using partitions, we can execute queries against these individual groups of documents more efficiently by placing the entire group within a specific shard on disk. Thus, the view engine only has to consult one copy of the given shard range when executing a query instead of executing the query across all q shards in the database. This means that you do not have to wait for all q shards to respond, which is both more efficient and faster.
Partitions By Example To create a partitioned database, we simply need to pass a query string parameter: shell> curl -X PUT 'http://adm:pass@127.0.0.1:5984/my_new_db?partitioned=true' {"ok":true} To see that our database is partitioned, we can look at the database information: shell> curl http://adm:pass@127.0.0.1:5984/my_new_db { "cluster": { "n": 3, "q": 8, "r": 2, "w": 2 }, "compact_running": false, "db_name": "my_new_db", "disk_format_version": 7, "doc_count": 0, "doc_del_count": 0, "instance_start_time": "0", "props": { "partitioned": true }, "purge_seq": "0-g1AAAAFDeJzLYWBg4M...", "sizes": { "active": 0, "external": 0, "file": 66784 }, "update_seq": "0-g1AAAAFDeJzLYWBg4M..." } You'll now see that the "props" member contains "partitioned": true.
NOTE: Every document in a partitioned database (except _design and _local documents) must have an id in the format partition:docid. More specifically, the partition for a given document is everything before the first colon. The document id is everything after the first colon, which may include more colons. NOTE: System databases (such as _users) are not allowed to be partitioned. This is because system databases already have their own incompatible requirements on document ids. Now that we've created a partitioned database, it's time to add some documents.
Using our earlier example, we could do this as such:

    shell> cat doc.json
    {
        "_id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
        "sensor_id": "sensor-260",
        "location": [41.6171031, -93.7705674],
        "field_name": "Bob's Corn Field #5",
        "readings": [
            ["2019-01-21T00:00:00", 0.15],
            ["2019-01-21T06:00:00", 0.14],
            ["2019-01-21T12:00:00", 0.16],
            ["2019-01-21T18:00:00", 0.11]
        ]
    }
    shell> curl -X POST -H "Content-Type: application/json" \
        http://adm:pass@127.0.0.1:5984/my_new_db -d @doc.json
    {
        "ok": true,
        "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
        "rev": "1-05ed6f7abf84250e213fcb847387f6f5"
    }

The only change required to the first example document is that we are now including the partition name in the document id by prepending it to the old id, separated by a colon.

NOTE: The partition name in the document id is not magical. Internally, the database is simply using only the partition for hashing the document to a given shard, instead of the entire document id.

Working with documents in a partitioned database is no different than working with a non-partitioned database. All APIs are available, and existing client code will all work seamlessly.

Now that we have created a document, we can get some info about the partition containing the document:

    shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260
    {
        "db_name": "my_new_db",
        "doc_count": 1,
        "doc_del_count": 0,
        "partition": "sensor-260",
        "sizes": {
            "active": 244,
            "external": 347
        }
    }

And we can also list all documents in a partition:

    shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260/_all_docs
    {"total_rows": 1, "offset": 0, "rows":[
        {
            "id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
            "key":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
            "value": {"rev": "1-05ed6f7abf84250e213fcb847387f6f5"}
        }
    ]}

Note that we can use all of the normal bells and whistles available to _all_docs requests. Accessing _all_docs through the /dbname/_partition/name/_all_docs endpoint is mostly a convenience so that requests are guaranteed to be scoped to a given partition. Users are free to use the normal /dbname/_all_docs to read documents from multiple partitions. Both query styles have the same performance.

Next, we'll create a design document containing our index for getting all readings from a given sensor. The map function is similar to our earlier example, except we've accounted for the change in the document id.

    function(doc) {
        if (doc._id.indexOf(":sensor-reading-") < 0) {
            return;
        }
        for (var r in doc.readings) {
            emit([doc.sensor_id, r[0]], r[1])
        }
    }

After uploading our design document, we can try out a partitioned query:

    shell> cat ddoc.json
    {
        "_id": "_design/sensor-readings",
        "views": {
            "by_sensor": {
                "map": "function(doc) { ... }"
            }
        }
    }
    shell> curl -X POST -H "Content-Type: application/json" http://adm:pass@127.0.0.1:5984/my_new_db -d @ddoc.json
    {
        "ok": true,
        "id": "_design/sensor-readings",
        "rev": "1-13859808da293bd72fde3b31be97372a"
    }
    shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260/_design/sensor-readings/_view/by_sensor
    {"total_rows":4,"offset":0,"rows":[
    {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","0"],"value":null},
    {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","1"],"value":null},
    {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2"],"value":null},
    {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","3"],"value":null}
    ]}

Hooray! Our first partitioned query. For experienced users, that may not be the most exciting development, given that the only things that have changed are a slight tweak to the document id and accessing views with a slightly different path. However, for anyone who likes performance improvements, it's actually a big deal. Because the view results are all located within the provided partition, our partitioned queries now perform nearly as fast as document lookups!

The last thing we'll look at is how to query data across multiple partitions. For that, we'll implement the example "sensors by field" query from our initial example. The map function uses the same update to account for the new document id format, but is otherwise identical to the previous version:

    function(doc) {
        if (doc._id.indexOf(":sensor-reading-") < 0) {
            return;
        }
        emit(doc.field_name, doc.sensor_id)
    }

Next, we'll create a new design doc with this function. Be sure to notice that the "options" member contains "partitioned": false.

    shell> cat ddoc2.json
    {
        "_id": "_design/all_sensors",
        "options": {
            "partitioned": false
        },
        "views": {
            "by_field": {
                "map": "function(doc) { ... }"
            }
        }
    }
    shell> curl -X POST -H "Content-Type: application/json" http://adm:pass@127.0.0.1:5984/my_new_db -d @ddoc2.json
    {
        "ok": true,
        "id": "_design/all_sensors",
        "rev": "1-4a8188d80fab277fccf57bdd7154dec1"
    }

NOTE: Design documents in a partitioned database default to being partitioned. Design documents that contain views for queries across multiple partitions must contain the "partitioned": false member in the "options" object.

NOTE: Design documents are either partitioned or global. They cannot contain a mix of partitioned and global indexes.

And to see all sensors in a field, we would use a request like:

    shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_design/all_sensors/_view/by_field
    {"total_rows":1,"offset":0,"rows":[
    {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":"Bob's Corn Field #5","value":"sensor-260"}
    ]}

Notice that we're not using the /dbname/_partition/... path for global queries. This is because global queries, by definition, do not cover individual partitions. Other than having the "partitioned": false parameter in the design document, global design documents and queries are identical in behavior to design documents on non-partitioned databases.

WARNING: To be clear, this means that global queries perform identically to queries on non-partitioned databases. Only partitioned queries on a partitioned database benefit from the performance improvements.
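Partition scoping is not limited to views and _all_docs. As a minimal sketch (not part of the original walkthrough; the selector and the file name query.json are our own), a Mango _find request can also be issued against the same /dbname/_partition/name/... path used above, so the selector is evaluated against a single partition only:

    shell> cat query.json
    {
        "selector": {
            "field_name": "Bob's Corn Field #5"
        },
        "fields": ["_id", "readings"]
    }
    shell> curl -X POST -H "Content-Type: application/json" \
        http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260/_find -d @query.json

As with the partitioned view above, such a request only has to consult the shard range holding the sensor-260 partition, whereas the same body sent to /my_new_db/_find would behave like any other global query.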
RELEASE NOTES 3.4.x Branch • Version 3.4.3 • Version 3.4.2 • Version 3.4.1 • Version 3.4.0 Version 3.4.3 Highlights • #5347: Fix attachment size calculation. This could lead to shards not being scheduled for compaction correctly. Performance • #5437: Fix atts_since functionality for document GET requests. Avoids re-replicating attachment bodies on doc updates. Features • #5439: Nouveau: upgrade dropwizard to 4.0.12. • #5424: Scanner: reduce log noise, fix QuickJS plugin mocks, grace- fully handle broken search indexes. • #5421: Nouveau: upgrade Lucene to 9.12.1. • #5414: Remove unused multi_workers option from couch_work_queue. • #5385: Clean up fabric_doc_update by introducing an #acc record. • #5372: Upgrade to Elixir 1.17. • #5351: Clouseau: show version in /_version endpoint. • #5338: Scanner: add Nouveau and Clouseau design doc validation. • #5335: Nouveau: support reading older Lucene 9x indexes. • #5327, #5329, #5419: Allow switching JavaScript engines at runtime. • #5326, #5328: Allow clients to specify HTTP request ID, including UUIDs. • #5321, #5366, #5413: Add support for SpiderMonkey versions 102, 115 and 128. • #5317: Add quickjs to the list of welcome features. • #5471: Add nouveau.connection_closed_errors metric. Bumped when nou- veau retries closed connections. Bugfixes • #5447: Fix arithmetic mean in _prometheus. • #5440: Fix _purged_infos when exceeding purged_infos_limit. • #5431: Restore the ability to return Error objects from map(). • #5417: Clouseau: add a version check to connected() function to reli- ably detect if a Clouseau node is ready to be used. • #5416: Ensure we always map the documents in order in couch_mrview_updater. While views still built correctly, this behav- iour simplifies debugging. • #5373: Fix checksumming in couch_file, consolidate similar functions and bring test coverage from 66% to 90%. • #5367: Scanner: be more resilient in the face of non-deterministic functions. • #5345: Scanner: be more resilient in the face of incomplete sample data. • #5344: Scanner: allow empty doc fields. • #5341: Improve Mango test reliability. • #5337: Prevent a broken mem3 app from permanently failing replica- tion. • #5334: Fix QuickJS scanner function_clause error. • #5332: Skip deleted documents in the scanner. • #5331: Skip validation for design docs in the scanner. • #5330: Prevent inserting illegal design docs via Mango. • #5463, #5453: Fix Nouveau bookmark badarith error. • #5469: Retry closed nouveau connections. Docs • #5433: Mango: document Nouveau index type. • #5433: Nouveau: document Mango index type. • #5428: Fix wrong link in example in CONTRIBUTING.md. • #5400: Clarify RHEL9 installation caveats. • #5380, #5404: Fix various typos. • #5338: Clouseau: document version in /_version endpoint. • #5340, #5412: Nouveau: document search cleanup API. • #5316, #5325, #5426, #5442, #5445: Document various JavaScript engine incompatibilities, including SpiderMonkey 1.8.5 vs. newer SpiderMon- key and SpiderMonkey vs. QuickJS. • #5320, #5374: Improve auto-lockout feature documentation. • #5323: Nouveau: improve install instructions. • #5434: Document use of Nouveau docker image Tests • #5397: Fix negative-steps error in Elixir tests. Builds • #5360: Use brew --prefix to find ICU paths on macOS. Version 3.4.2 Highlights • #5262: Enable supportsConcurrency in TopFieldCollectorManagerSet. This fixes an issue which prevented creating larger indexes in Nou- veau. • #5299: Use LTO and static linking for QuickJS on Windows. 
Performance
• #5268: Improve performance of couch_event_server under load.

Features
• #5272: Upgrade Nouveau Lucene to 9.12.0.
• #5286: Add ?top_n=X Nouveau parameter for facets.
• #5290: Send a 404 code for a missing Nouveau index.
• #5292: Add signature to _nouveau_info response.
• #5293: Make Nouveau Gradle script choosable.
• #5294: Return time spent waiting to update Nouveau index before query starts.

Bugfixes
• #5274: Use normal Lucene syntax for unbounded ranges in Nouveau.
• #5270: Do not generate conflicts from the replicator application.
• #5285: Fix emitting multiple indexes per field per doc returning the last indexed value with {"store": true}.
• #5289: Fix stored field in search results.
• #5298: Fix unused variable compiler warning in Nouveau.

Docs
• #5260: Correct default q value in POST /{db} section.
• #5281: Use {var} format for parameters instead of $var for scanner docs.
• #5280: Sync suggested fabric timeout settings with the sources.
• #5287: Document String.prototype.match(undefined) SpiderMonkey 1.8.5 vs SpiderMonkey 78+ incompatibility.

Version 3.4.1

Highlights
• #5255: Set upgrade_hash_on_auth to false to disable automatic password hashing upgrades.

Bugfixes
• #5254: Handle the case when the QuickJS scanner has no valid views.

Tests
• #5253: Increase timeout for couch_work_queue test.

Docs
• #5256: Explain holding off 3.4.0 binaries and the reason for making a 3.4.1 release.

Version 3.4.0

Warning

CouchDB version 3.4.0 includes a feature to automatically upgrade password hashes to a newer algorithm, and a configuration option that enables this feature by default. As a consequence, if you are upgrading to CouchDB version 3.4.0 from an earlier version and then have to roll back to the earlier version, some of your _users documents might already have been automatically upgraded to the new algorithm. Your older version of CouchDB does not understand the resulting password hash and cannot authenticate the user any more until the earlier password hash is restored manually by an administrator.

As a result, the CouchDB team has decided to issue a 3.4.1 release setting the configuration option to disable this new auto-upgrade feature. The issue was found after the formal 3.4.0 release process had concluded, so the source release is available normally, but the CouchDB team has not made 3.4.0 convenience binaries available. The team recommends upgrading to 3.4.1 instead when it is available.

The CouchDB team also recommends enabling the feature by setting the upgrade_hash_on_auth configuration option to true as soon as you are safely running on 3.4.1 and have no more need to roll back the version.

Breaking Changes
• #5046: JWT: require valid exp claim by default

  Users of JWT rightly expect tokens to be considered invalid once they expire. It is a surprise to some that this requires a change to the default configuration. In the interest of security, we will now require a valid exp claim in tokens. Administrators can disable the check by changing required_claims back to the empty string.

  We recommend adding nbf as a required claim if you know your tokens will include it.

• #5203: Continuous change feeds with descending=true&limit=N

  Changes requests with feed=continuous&descending=true&limit=N, when N is greater than the number of db updates, will no longer wait on db changes and then repeatedly re-send the first few update sequences. The request will return immediately after all the existing update sequences are streamed back to the client.
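Both of the settings discussed above can be changed at runtime through the configuration API. The commands below are a minimal sketch, not part of the official release notes: they assume a node reachable under the _local alias, placeholder adm:pass admin credentials, and a comma-separated list format for required_claims; they touch the [chttpd_auth] upgrade_hash_on_auth and [jwt_auth] required_claims keys mentioned in the warning and in #5046.

    # Hypothetical sketch: re-enable password hash upgrades once rollback is no longer a concern.
    shell> curl -X PUT http://adm:pass@127.0.0.1:5984/_node/_local/_config/chttpd_auth/upgrade_hash_on_auth -d '"true"'

    # Hypothetical sketch: require both exp and nbf claims in JWT tokens.
    shell> curl -X PUT http://adm:pass@127.0.0.1:5984/_node/_local/_config/jwt_auth/required_claims -d '"exp,nbf"'

Each PUT returns the previous value of the key, which can be kept for rollback.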
Highlights
• #4291: Introducing Nouveau (beta), a modern, from-the-ground-up implementation of Lucene-based full-text search for CouchDB. Please test this thoroughly and report back any issues you might find.
  • Setup instructions
  • Usage
  • Report a bug
• #4627: Add QuickJS as a JavaScript engine option. Advantages over SpiderMonkey:
  • Significantly smaller and easier to integrate codebase. We're using ~6 C files vs 700+ SM91 C++ files.
  • Built with Apache CouchDB, as opposed to having to maintain a separate SpiderMonkey package for OSes that don't support it (*cough*RedHat9*cough*).
  • Better sandboxing support.
  • Preliminary test results show multiple performance improvements.
    • 4x faster than SpiderMonkey 1.8.5.
    • 5x faster than SpiderMonkey 91.
    • 6x reduced memory usage per couchjs process (5MB vs 30MB).
  • Allows compiling JavaScript bytecode ahead of time.
  • QuickJS can be built alongside SpiderMonkey and toggled on/off at runtime:

        ./configure --dev --js-engine=quickjs

    This makes it the default engine. But SpiderMonkey can still be set in the config option:

        [couchdb] js_engine = spidermonkey | quickjs

  • CouchDB also now includes a scanner plugin that, when enabled, can scan all design docs in all your databases in the background and report incompatibilities between SpiderMonkey and QuickJS. This allows you to safely migrate to QuickJS.
• #4570, #4578, #4576: Adopt xxHash in favour of md5 for couch_file checksums and ETag calculation. 30% performance increase for large (128K) docs. No difference for smaller docs.
• #4814: Introduce PBKDF2-SHA256 for password hashing. The existing PBKDF2-SHA1 variant is now deprecated. Increases the default iteration count to 600000. Also introduce a password hash in-memory cache with a low iteration number, to keep interactive requests fast for a fixed time. Entries in the password hash cache are time-limited, unused entries are automatically deleted, and there is a capacity bound. Existing hashed user doc entries will be automatically upgraded during the next successful authentication. To disable auto-upgrading, set the [chttpd_auth] upgrade_hash_on_auth config setting to false.
• #4512: Mango: add keys-only covering indexes. Improves query response times for certain queries up to 10x at p(95).
• #4681: Introduce optional countermeasures as we run out of disk space.
• #4847: Require auth for _replicate endpoint. This continues the 3.x closed-by-default design goal.
• #5032: Temporarily block access by client IP for repeated authentication failures. Can be disabled in config.
• Many small performance improvements, see the Performance section.

Features and Enhancements
• #5212: Allow configuring TLS signature_algs and eccs curves for the clustered port.
• #5136: Print log dir on dev/run startup.
• #5150: Ensure rexi_buffer metric includes the internal buffered messages.
• #5145: Add aggregate rexi_server and rexi_buffer message queue metrics.
• #5093, #5178: Ensure replication jobs migrate after any shard map changes.
• #5079: Move to Erlang 25 minimum.
• #5069: Update Fauxton to v1.3.1.
• #5067: Support Erlang/OTP 27.
• #5053: Use the built-in crypto:pbkdf2_hmac function.
• #5036: Remove replication_job_supervisor.
• #5035: Modernise couch_replicator_supervisor.
• #5019: Remove unused build files.
• #5017: Remove unused boot_dev_cluster.sh.
• #5014: Add Couch Scanner module.
• #5013: Improve dist diagnostics.
• #4990: Add dbname to mango exec stats. • #4987: Replace khash with maps in ddoc_cache_lru. • #4984: Fabric: switch to maps for view rows. • #4979: Git ignore top level clouseau directory. • #4977: Replace khash with maps in couch_event_server. • #4976: Add metrics for fast vs slow password hashing. • #4965: Handle multiple response copies for _purged_infos API. • #4878: Add an option to scrub some sensitive headers from external json. • #4834: Wait for newly set admin creds to be hashed in setup. • #4821: Do not fail compactions if the last step is delayed by ioq. • #4810: Mango: add $beginsWith operator. • #4769: Improve replicator error handling. • #4766: Add new HTTP endpoint /_node/_local/_smoosh/status. • #4736: Stop client process and clean up if client disconnects. • #4703: Add _purged_infos endpoint. • #4685: Add "CouchDB-Replicator/..." user agent to replicator /_ses- sion requests. • #4680: Shard splitting: allow resumption of failed jobs and make timeout configurable. • #4677: Crash replication jobs on unexpected 4xx errors. • #4670: Allow setting of additional ibrowse options like prefer_ipv6. • #4662: Mango: extend _explain with candidate indexes and selector hints. • #4625: Add optional logging of security issues when replicating. • #4623: Better upgrade handling of instance_start_time in replicator. • #4613: Add option to suppress version info via HTTP header. • #4601: Add simple fabric benchmark. • #4581: Support Erlang/OTP 26. • #4575: Add {verify, verify_peer} for TLS validation. • #4569: Mango: add keys_examined for execution_stats. • #4558: Make Erlang/OTP 24 the minimum supported Erlang version. • #4513: Make timeouts for _view and _search configurable. • #4483: Add RFC5424 compliant report logging. • #4475: Add type and descriptions to prometheus output. • #4443: Automatically enable FIPS mode at runtime. • #4438: Upgrade hash algorithm for proxy auth. • #4432: Hide shard-sync and purge documents from _local_docs. • #4431: Allow definition of JWT roles claim as comma-separated list. • #4404: Respond with 503 immediately if search not available. • #4347: Remove failed couch_plugins experiment. • #5046: JWT: require valid exp claim by default. • #5065: Update Fauxton UI to version v1.3.1. Performance • #5172: Remove unique_integer bottleneck from couch_lru. • #5168: Update couch_lru to use maps. • #5104: Update xxhash from upstream tag v0.8.2. • #5037: Optimise fabric:all_dbs(). • #4911: Optimise and clean up couch_multidb_changes. • #4852: Optimise _active_tasks. • #4786, #4789: Add extra timing stats for couch_js engine commands. • #4679: Fix multipart parse attachment longer than expected error. • #4672: Remove folsom and reimplement required functionality with new Erlang/OTP primitives resulting in up to 19x faster histogram opera- tions. • #4617: Use a faster sets implementation available since OTP 24. • #4608: Add metrics for fsync calls and query engine operations. • #4604: 6x speedup for common mem3:dbname/1 function. • #4603: Update vm.args settings, increased Erlang distribution buffer size to 32MB. • #4598: Speed up internal replicator. • #4507, #4525: Add more prometheus metrics. • #4505: Treat JavaScript internal errors as fatal. • #4494: Treat single-element keys as key. • #4473: Avoid re-compiling filter view functions. • #4401: Enforce doc ids _changes filter optimisation limit and raise it from 100 to 1000. • #4394: Mango: push fields selection down to data nodes. 
Bugfixes • #5223, #5228, #5226: Fix handling IPv6 addresses for _session end- points in replicator. • #5191, #5193: Fix error loop with system freeze when removing a node from a cluster. • #5188: Fix units for replicator cluster_start_period config setting. • #5185: Use an explicit message for replicator doc processor delayed init. Fixes a rare case when the replicator will never start scanning and monitoring _replicator dbs for changes. • #5184: Remove compatibility couch_rand module. • #5179: Do not leak fabric_rpc workers if coordinator is killed. • #5205: Cleanly abort responses when path doesnt start with slash. • #5204, #5203, #5200, #5201: Fix continuous changes feeds with a limit greater than total. • #5169: Make sure we never get an inconsistent couch_lru cache. • #5167: Remove unused close_lru gen_server call. • #5160: Ensure we run fabric worker cleanup in more cases. • #5158: Fix PowerShell PSScriptAnalyzer warnings. • #5153, #5156: Upgrade recon and fix Erlang/OTP 27 compiler warnings. • #5154: Replace 0/1 to false/true for config keys. • #5152: Improve worker cleanup on early coordinator exit to reduce the occurrence of spurious exit:timeout errors in the log. • #5151: Use atom for config key with_spidermonkey. • #5147: Add passively closed client monitoring to search. • #5144: Cleanup deprecated and unused functionality in rexi. • #5143: Remove unused external functions and local external calls. • #5130, #5132, #5138, #5163, #5170: Implement persistent node names. • #5131: Remove unused couch_db_header module. • #5084, #5126: Simplify and fix hyper. Remove external hyper depen- dency. • #5117, #5118: Validate target doc id for COPY method. • #5111, #5114: Make sure config reload finds new .ini files in .d di- rectories. • #5110: Remove last remnant of snap install in ./configure. That hap- pens in couchdb-pkg now. • #5089, #5103: Fix _scheduler/docs/... path 500 errors. • #5101: Fix replicator scheduler job stopping crash. • #5100: Simplify couchdb.cmd.in and remove app version. • #5097: Remove couch_io_logger module. • #5066: Handle multiple Set-Cookie headers in replicator session plu- gin. • #5060: Cleanup a few clauses in fabric_view_changes. • #5030: Always commit if we upgrade 2.x view files. Fixes misleading wrong signature error. • #5025: Fix seedlist to not return duplicate json keys. • #5008: Fix case clause error in replicator _scheduler/docs response. • #5000: Remove repetitive word in source commends (5000!). • #4962: Make multidb changes shard map aware. • #4958: Mango: use rolling execution statistics. • #4921: Make sure to reply to couch_index_server clients. • #4910: couch_passwords:verify should always return false for bad in- puts. • #4908: Mango: communicate rows read for global stats collection. • #4906: Flush chttpd_db monitor refs on demonitor. • #4904: Git ignore all .hypothesis directories. • #4887: Look up search node name in config for weatherreport. • #4837: Fix update bug in ets_lru. • #4811: Prevent delayed opener error from crashing index servers. • #4794: Fix incorrect raising of database_does_not_exist error. • #4784: Fix parsing of node name from ERL_FLAGS in remsh. • #4782, #4891: Mango: prevent occasional duplication of paginated text results. • #4761: Fix badrecord error when replicator is logging HTTP usage. • #4759: TLS: use HTTP rules for hostname verification. • #4758: Remove sensitive headers from the mochiweb request in pdict. • #4751: Mango: correct behaviour of fields on _explain. 
• #4722: Fix badmatch error when purge requests time out. • #4716: Fix pending count for reverse changes feed. • #4709: Mango: improve handling of invalid fields. • #4704, #4707: Fix empty facet search results. • #4682: _design_doc/queries with keys should only return design docs. • #4669: Allow for more than two replicator socket options. • #4666: Improve error handling in config API. • #4659: Mango: remove duplicates from indexable_fields/1 results. • #4658: Fix undefined range in mem3_rep purge replication logic. • #4653: Fix ability to use ; inside of config values. • #4629: Fix prometheus to survive mem3_sync termination. • #4626: Fix purge infos replicating to the wrong shards during shard splitting. • #4602: Fix error handling for the _index endpoint and document _in- dex/_bulk_delete. • #4555: Fix race condition when creating indexes. • #4524: Querying _all_docs with non-string key should return an empty list. • #4514: GET invalid path under _index should not cause 500 response. • #4509: Make remsh work with quoted cookie. • #4503: Add error_info clause for 410 Gone. • #4491: Fix couch_index to avoid crashes under certain conditions. • #4485: Catch and log any error from mem3:local_shards in in- dex_server. • #4473: Fix prometheus counter metric naming. • #4458: Mango: Fix text index selection for queries with $regex. • #4416: Allow _local doc writes to the replicator dbs. • #4370: Ensure design docs are uploaded individually when replicating with bulk_get. • #4363: Fix replication _scheduler/docs total_rows. • #4360: Fix handling forbidden exceptions from workers in fab- ric_doc_update. • #4353: Fix replication job_link. • #4348: Fix undefined function warning in weatherreport. • #4343: Fix undef when parsing replication doc body. Tests • #5219: Allow for overriding the host on running Mango tests. • #5192: Clean elixir build artifacts with make clean. • #5190: Remove flaky couch key tree test. • #5187: Do not test SpiderMonkey libs when it is disabled on Windows. • #5183: Remove redundant and racy assertion in the couchdb_os_proc_pool test. • #5182: Set minimum Elixir version to 1.15. • #5180: Bump Clouseau to 2.23.1 in CI. • #5128: Update Erlang in CI, support Elixir 1.17. • #5102: Use a shorter 4000 msec replicator scheduling interval for tests. • #5078, #5085: Make app and release versions uniform. Remove the un- used rel version. • #5068: Fix flakiness in fabric_bench. • #5054: Update a few deps and improve CI. • #5050: Update CI OSes. • #5048: Update CI Erlang versions. • #5040: Fix invalid call to exit/2 in couch_server. • #5039: Improve fabric all_dbs test. • #5024: Fix flaky _changes async test. • #4982: Fix flaky password hashing test. • #4980: Fix password test timeout. • #4973: Handling node number configuration in dev/run. • #4959: Enable Clouseau for more platforms. • #4953: Improve retries in dev/run cluster setup. • #4947: Add tests for _changes endpoint. • #4938: Add tests for _changes with different parameters. • #4903: Add extra rev tree changes tests. • #4902: Fix flaky tests by increasing timeout. • #4900: More flaky fixes for cluster setup. • #4899: Reduce EUnit log noise. • #4898: Simplify couch_changes_tests.erl using macro ?TDEF_FE. • #4893: Relax restriction on [admins] in dev local.ini. • #4889: Do not use admin party for integration tests. • #4873: Fix test for text index creation. • #4863: Fix flaky users_db_security test. • #4808: Fix flaky couch_stream test. • #4806: Mango: do not skip json tests when Clouseau installed. 
• #4803: Fix flaky ddoc_cache test some more. • #4765: Fix flaky mem3 reshard test. • #4763: Plug hole in unit test coverage of view cursor functions. • #4726: Support Elixir 1.15. • #4691: make elixir should match what we run in CI. • #4632: Fix test database recreation logic. • #4630: Add extra assert in flaky couch_file test. • #4620: Add Erlang/OTP 26 to Pull Request CI matrix. • #4552, #4553: Fix flaky couchjs error test. • #4453: Fix flaky LRU test that the new super fast macOS CI worker no- ticed. • #4422: Clean up JSON index selection and add unit tests. • #4345: Add test coverage for replicator user_ctx parser. Docs • #5221: Add notes about JavaScript engine compatibility issues and how to use the new scanner feature. • #5162: Update CVE backport policy. • #5134: Remove JSON2 reference as we no longer ship our own JSON. • #5063: Fix duplicate keys in find query. • #5045: Create Python virtualenv on Windows for docs. • #5038: Fix small detail about conflicts in Overview section. • #4999: Change server instance to cluster for UUID docs. • #4955: Revamp the installation instructions for FreeBSD. • #4951: Add extension for copying code blocks with just one click. • #4950: Improve changes feed API documentation. • #4948: Update Sphinx package version to 7.2.6. • #4946: Update Sphinx/RTD dependencies. • #4942: Fix invalid JSON in _db_updates example. • #4940: Re-wrote snap installation guide lines for 3.3. • #4933: Set docs version numbers dynamically from file. • #4928: Add missing installation OSes for convenience binaries. • #4925: Break long lines for better readability within tables. • #4774: Amend description of use_index on /{db}/_find. • #4743: Ban the last monster. • #4684: Add _design_docs/queries and _local_docs/queries. • #4645: Add authentication data to examples. • #4636: Clarify default quorum calculation. • #4561: Clarify encoding length in performance section. • #4402: Fix example code in partitioned databases. Builds • #4840: Add Debian 12 (bookworm) to CI and binary packages. Other Whats new, Scooby-Doo? 3.3.x Branch • Version 3.3.3 • Version 3.3.2 • Version 3.3.1 • Version 3.3.0 Version 3.3.3 Features and Enhancements • #4623: Handle replicator instance start time during upgrades better. • #4653: Fix the ability to use ; in config values. • #4626: Fix purge infos replicating to the wrong shards during shard splitting. • #4669: Make it possible to override [replicator] valid_socket_options with more than two items. • #4670: Allow setting of some ibrowse replication options such as {prefer_ipv6, true}. • #4679: Fix multipart parser attachment longer than expected error. Previously there was a small chance attachments were not replicated. • #4680: Allow restarting failed shard splitting jobs. • #4722: Fix badmatch error when purge requests time out. • #4736: Stop client process and clean up if client disconnects when processing streaming requests. • #4758: Remove sensitive headers from the mochiweb request in process dictionary. This should prevent sensitive headers leaking into logs when a request process crashes. • #4784: Extract the correct node name from ERL_FLAGS in remsh. • #4794: Fix incorrect raising of database_does_not_exist error. • #4821: Wait for compacted indexes to flip. Previously, a timeout dur- ing compact file flips could lead to crashes and a subsequent recom- paction. • #4837: Fix update bug in ets_lru. • #4847: Require auth for _replicate endpoint. • #4878: Add an option to scrub some sensitive headers from external json requests. 
Version 3.3.2 Features and Enhancements • #4529: In Javascript process manager, use a database tag in addition to a ddoc ID to quickly find processes. This should improve perfor- mance. • #4509, #4405: Make remsh work with quoted cookie values. • #4473: Avoid re-compiling filter view functions. This could speed up Javascript filter functions. • #4412: Remove Javascript json2 script and the try/except clause around seal. • #4513: Allow configurable timeouts for _view and _search. Search timeouts can be specified as [fabric] search_timeout and [fabric] search_permsg. View per-message timeout can be configured as [fabric] view_permsg_timeout. • #4438: Proxy auth can now use one of the configured hash algorithms from chttpd_auth/hash_algorithms to decode authentication tokens. • #4370: Ensure design docs are uploaded individually when replicating with _bulk_get. This restores previous behavior before version 3.3.0. • #4416: Allow _local doc writes to the replicator dbs. Previously this issue prevented replicating the replicator db itself, since checkpointing was not working properly. • #4363: Fix replication _scheduler/docs "total_rows" value. • #4380: Be more defensive about SpiderMonkey location. An error should be emitted early if the Spidermonkey library cannot be found. • #4388: Bump recon to 2.5.3. See the changelog for more details. • #4476, #4515, #4490, #4350, #4379: Various documentation cleanups and fixes. • Fix for CVE-2023-26268. Version 3.3.1 Features and Enhancements • #4343, #4344, #4345: Fix undef when parsing replication doc body with a user_ctx. • #4346: Add make target to find undef errors. • #4347: Remove failed couch_plugins experiment, fixes more undef er- rors. • #4348: Fix undef error in weatherreport. • #4353: Allow starting of more than one replication job. (DOH!) Version 3.3.0 Highlights • #4308: Replicator was optimized and should be faster. It now uses the _bulk_get endpoint on the source, and can statistically skip calling _revs_diff on the target. Benchmark tests replicating 1M documents, 10KB each, from UK to US East show a 3x speed improvement. [image: Replicator, Tea! Earl Grey! Hot! (Because Picard said so)] [image] Features and Enhancements • #3766, #3970, #3972, #4093, #4102, #4104, #4110, #4111, #4114, #4245, #4246:, #4266: Add smoosh queue persistence. This allows resuming smoosh operations after a node restart. This is disabled by default and can be enabled with [smoosh] persist = true. Optimise smoosh op- erations and increase test coverage to 90%. • #3798: Add libicu version and collation algorithm version to /_node/{node-name}/_versions. • #3837: The Erlang source tree is now auto-formatted with erlfmt. • #3845: Clean up the couch_ejson_compare C-module and squash Microsoft compiler warnings. • #3832: Add GET variant to _dbs_info endpoint, used to be POST only. • #3864: Improve erlang_ls configuration. • #3853: Remove legacy ddoc_cache_opener gen_server and speed up event routing. • #3879: Remove use of ERL_OPTS environment variable. All supported Er- lang versions now use ERL_COMPILER_OPTIONS for the same purpose. • #3883: Add support for SpiderMonkey 91. • #3889: Track libicu collator versions in the view header. • #3952: Make the timeout for receiving requests from attachment writ- ers configurable. • #3927: Include index signature in _search_info. • #3963: Optimtize key tree stemming by using maps instead of sets. This greatly reduced memory usage for heavily conflicted docs in some situations. 
• #3974: Create new config options in [couchdb] and [smoosh] sections to enable finer control of compaction logging levels. • #3983, #3984, #3985, #3987, #4033: Add various functions to couch_de- bug module. • #4000: Ensure Object.prototype.toSource() is always available. • #4018: Update jiffy to 1.1.1 and b64url to 1.0.3. • #4021: Reduce smoosh compaction log level to debug. • #4041: Allow and evaluate nested json claim roles in JWT token. • #4060, #4290: Add support for Erlang 25. • #4064: Enable replicating purge requests between nodes. Also avoid applying interactive purges more than once. • #4069, #4084: Drop support for Erlang < 23, update vm.args settings to match. Review this if you have customized your vm.args. • #4083: Support Elixir 13. • #4085: Add an option to let custodian always use [cluster] n value. • #4095: Implement winning_revs_only option for the replicator. It replicates only the winning revisions from the source to the target, effectively discarding conflicts. • #4135: Separate search IO from file IO. • #4140, #4162: Upgrade hash algorithm for cookie auth (sha1 -> sha256). This introduces a new config setting hash_algorithms. New cookie values are hashed with sha256, sha1 hashes are still accepted. Admins can set this to sha256 only. Sha1 will be disallowed in the next major release. Show supported hash algorithms in /_node/{node-name}/_versions endpoint. • #4179: Dont double-encode changes sequence strings in the replicator. • #4182: Explicitly maintain a fully connected cluster. Previously, it was possible for the nodes to disconnect, and for that state to per- sist until the nodes restarted. • #4198: Redact passwords in log file. • #4243: Update mochiweb to 3.1.1. • #4254: The _dbs_info access control is now configured with the [couchdb] admin_only_all_dbs setting. Defaults to true. This was a leftover from the 3.0.0 release. • #4264: active database sizes is now limited to leaf nodes. Previ- ously, it included intermediate tree nodes, which had the effect that deleting (large) documents did not decrease active database size. In addition, smoosh now picks up databases where large documents are deleted for compaction more eagerly, reclaiming the deleted space quicker. • #4270: Shard splitting now uses its own reshard IO priority. It can be configured to be safely run in the background with production loads, or with maximum IO available, if admins prefer quicker progress. • #4274: Improve validation of replicator job parameters & move _repli- cator VDU design doc to internal BDU. • #4280: Add CFLAGS and LDFLAGS to ICU build parameters. • #4284: Remove all usage of global to avoid potential deadlocks in replication jobs. • #4287: Allow = in config key names. • #4306: Fauxton was updated to version v1.2.9. Changes since v1.2.8 can be found here • #4317: Write Relax welcome message to standard out on Windows. Performance • #3860: Add sharding to couch_index_server, similar to #3366, avoids processing bottlenecks on servers with a lot of concurrent view in- dexing going on. • #3891: Avoid decoding JWT payloads when not necessary. • #4031: Default [rexi] use_kill_all to true. This improves intra-clus- ter-node messaging. Set to false if you run a cluster with nodes that have a version <3.0.0. • #4052: Optimise couch_util:reorder_results/2,3, which speeds up _bulk_docs and _revs_diff. • #4055: Avoid using length/1 guard for >0 or ==0 tests in couch_key_tree. • #4056: Optimise couch_key_tree:find_missing/2. This speeds up _revs_diff. 
• #4059: Reduce complexity of possible_ancestors from quadratic to lin- ear. This speeds up working with heavily conflicted documents signif- icantly. • #4091: Optimise couch_util:to_hex/1, this speeds up all operations that need to encode a revision id into JSON (this is most opera- tions). • #4106: Set io_priority in all IO paths. Introduces system io_prior- ity. • #4144, #4172: Implement _bulk_get support for the replicator. Back- ward compatibility is ensured. This speeds up all replications. Add option to disable new behaviour for legacy setups. • #4163: Statistically skip _revs_diff in the replicator. This improves performance for replications into empty targets. • #4177: Remove the long deprecated bigcouch 0.4 change sequence sup- port. • #4238: Optimise _bulk_get endpoint. This speeds up replication of 1M docs by ~2x. Individual _bulk_get requests are up to 8x faster. • #3517: Add experimental fix for reduce performance regression due to expensive repeated AST-transformations on newer SpiderMonkey ver- sions. Set COUCHDB_QUERY_SERVER_JAVASCRIPT env var to COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main-ast-bypass.js". • #4262: couchjs executable built against Spidermonkey >= 78 will re- turn the detailed major.minor.patch as opposed to just the major ver- sion as previously. Bugfixes • #3817: Fix undefined function call in weatherreport. • #3819: Return 400 instead of 500 response code for known invalid _bulk_docs with new_edits=false request. • #3861: Add SameSite setting when clearing session cookies. • #3863: Fix custom TLS distribution for Erlang 20. • #3870: Always send all cookie attributes. • #3886: Avoid changes feed rewind after shard move with no subsequent db updates. • #3888: Make _stats endpoint resilient against nodes that go offline. • #3901: Use db-creation time instead of 0 for instance_start_time to help replicator recognise whether a peer database was deleted and recreated. • #3909: Fix new_edits:false and VDU function_clause. • #3934: Fix replicated_changes typo for purge doc updates. • #3940: Ensure the multipart parser always monitors the worker and make sure to wait for attachment uploads before responding. • #3950: Ignore responses from timed-out or retried ibrowse calls. • #3969: Fix skip and limit for _all_dbs and _dbs_info. • #3979: Correctly respond with a 500 code when document updates time out under heavy load. • #3992: Show that Search is available if it was available before. Avoid Search availability disappearing just because a Search node was temporarily not available. • #3993: Return a 400 error when decoding a JWT token fails, rather than crashing and not responding at all. • #3990: Prevent creation of ddocs with no name through Mango index creation. • #4003: Improve index building during shard splitting. • #4016: Fix function_clause error for replicated changes with a target VDU. • #4020: Fix maybe_handle_error clauses. • #4037: Fix ES{256,384,512} support for JWTs. • #4040: Handle exit(shutdown) error in chttpd. • #4043: Fix purge request timeouts (5s -> infinity). • #4146: The devcontainer has been updated. • #4050: Handle all_dbs_active in fabric_doc_update. • #4160: Return a proper 400 error when an invalid object is sent to _bulk_get. • #4070: Prevent error:function_clause in check_security/3 if roles claim is malformed. • #4075: Fix couch_debug:opened_files* functions. • #4108: Trim X-Auth-CouchDB-Roles header after reading. • #4153: The require_valid_user setting is now under chttpd. 
• #4161: Fix content-type handling in _session. • #4176: Fix eventsource _changes feed. • #4197: Support large (and impractical as-of-yet) q values. Fix shard open timeouts for q > 64. • #4199: Fix spurious unlock in close_db_if_idle. • #4230: Avoid refresh messages piling up in prometheus server. • #4240: Implement global password hasher process. This fixes a race-condition when setting new admin passwords in quick succession on a multicore server. • #4261, #4271: Clean up stale view checkpoints, improve purge client cleanup logging • #4272: Kill all couch_server_N if database_dir changes. • #4313: Use chttpd config section when patching local _replicate end- points. • #4321: Downgrade jiffy to allow building on Windows again. • #4329, #4323: Ignore build windows binaries in git. Tests • #3825: Eliminate Elixir compiler warnings. • #3830: Reduce skipped Elixir integration tests. • #3890: Handle not_found lookups removing ddoc cache key. • #3892: Use Debian Stable for CI, add Erlang 24 to CI. • #3898: Remove CI support for Ubuntu 16.04. • #3903, #3914: Refactor Jenkins to dynamically generate stages. Drop MINIMUM_ERLANG_VERSION to 20, drop the packaging ERLANG_VERSION to 23, add the weatherreport-test as a build step, and add ARM and POWER back into the matrix. • #3921:, #3923: Execute various tests in clean database_dir to avoid subsequent test flakiness. • #3968: Ensure key tree rev stemming doest take too much memory. • #3980: Upgrade Mango test dependency nose to nose and fix flaky-on-Windows tests. • #4006: Remove CI support for Debian 9. • #4061, #4082: Update PPC CI builder. • #4096: Fix flaky validate_doc_update Elixir test. • #4123: Fix haproxy.cfg. • #4126: Return a 400 response for a single new_edits=false doc update without revision. • #4129: Fix proxyauth_test and removed it from skip list. • #4132: Address race condition in cpse_incref_decref test. • #4151: Refactor replication tests to use clustered endpoints. • #4178: Add test coverage to prevent junk in eventsource. • #4188: Enable eunit coverage for all applications instead of enabling it per-application. • #4202: Fix race condition in ddoc cache LRU test. • #4203, #4205: Reduce test log noise. • #4268: Improve flaky _dbs_info test. • #4319: Fix offline configure and make release. • #4328: Fix eaddrnotavail in Elixir tests under Windows. • #4330: Do not run source checks in main CI build. Docs • #4164: The CouchDB documentation has been moved into the main CouchDB repository. • #4307, #4174: Update Sphinx to version 5.3.0 • #4170: Document the /_node/{node-name}/_versions endpoint. Builds • #4097: Stop publication of nightly packages. They were not used any- where. • #4322: Reuse installed rebar and rebar3 for mix. Compatible with Elixir =< 13 only. Elixir 14 is not supported yet. • #4326: Move Elixir source checks to a separate build step. Other • Added pumpkin spice to selected endpoints. Thank you for reading the 3.3.0 release notes. 3.2.x Branch • Version 3.2.3 • Version 3.2.2 • Version 3.2.1 • Version 3.2.0 Version 3.2.3 Features and Enhancements • #4529: In Javascript process manager, use a database tag in addition to a ddoc ID to quickly find processes. This should improve perfor- mance. • #4509, #4405: Make remsh work with quoted cookie values. • Fix for CVE-2023-26268. Version 3.2.2 Bugfixes • Fix for CVE-2022-24706. This is a security release for a critical vulnerability. • #3963: Optimize compaction and doc updates for conflicted documents on Erlang versions higher than 21. 
• #3852: Add support for SpiderMonkey 91esr. Version 3.2.1 Features and Enhancements • #3746: couch_icu_driver collation driver has been removed. ICU colla- tion functionality is consolidated in the single couch_ejson_compare module. View performance might slightly increase as there are less corner cases when the C collation driver fails and falls back to Er- lang. • #3787: Update sequences generated from DB info and _changes?since=now&limit=0 now contain shard uuids as part of their internal, opaque, representation. As a result, there should be less chance of experiencing changes feed rewinds with these sequences. • #3798: ICU driver and collator algorithm versions are returned in the _node/{node-name}/_versions result. • #3801: Users with the _metrics role can now read _prometheus metrics. Bugfixes • #3780: Avoid changes feed rewinds after shard moves. • #3779, #3785: Prevent deleted view file cleanup from crashing when database is deleted while the cleanup process is running. • #3789: Fix badarith 500 errors when [fabric] request_timeout is set to infinity. • #3786: Fix off-by-one limit error for _all_dbs. Also, the auto-in- jected shard _dbs design doc is removed and replaced with an Erlang module. • #3788: Minimize changes feeds rewinds when a node is down. • #3807: Enable custodian application reporting. Previously, custodian was accidentally left disabled as it used a hard-coded shards db name different than _dbs. • #3805: Cluster setup correctly syncs admin passwords and uses the new (since 3.2.0) [chttpd_auth] config section instead of the previous [couch_httpd_auth] section. • #3810: Local development dev/run script now uses the [chttpd_auth] section in local.ini instead of [couch_httpd_auth]. • #3773: Fix reduce view collation results for unicode equivalent keys. Version 3.2.0 Features and Enhancements • #3364: CouchDBs replicator now implements a Fair Share replication scheduler. Rather than using a round-robin scheduling mechanism, this update allows specifying the relative priority of jobs via different _replicator databases. More information is available in the _replicator DB docs. [image: Robert Downey, Jr., thinks that's fair enough for him.] [im- age] • #3166: Allow custom JWT claims for roles, via the [jwt_auth] roles_claim_name config setting. • #3296, #3312: CouchDB now includes weatherreport and its dependency custodian, a diagnostic app forked from Bashos riaknostic tool. More documentation is available in the Cluster Troubleshooting section. • #2911, #3298, #3425: CouchDB now returns the version of SpiderMonkey to administrators in the GET /_node/{node-name}/_versions response. • #3303: CouchDB now treats a 408 response received by the replicator similar to any 5xx error (by retrying, as opposed to a permanent er- ror). CouchDB will never return a 408, but some reverse proxies in front of CouchDB may return this code. • #3322: _session now accepts gzip encoding. • #3254: The new $keyMapMatch operator allows Mango to query on the keys of a map. It is similar to the $elemMatch operator, but instead of operating on the elements of array, it operates on the keys of a map. • #3336: Developers now have access to a .devcontainer configuration for the 3.x version of CouchDB, right in the source code repository. • #3347: The default maximum attachment size has been reduced from in- finity to 1 GiB. • #3361: Compaction process suspension now appears in the active_tasks output, allowing administrators to verify that the strict_window value is being respected. 
• #3378: The [admins] section and the [replicator] password are now redacted from all logs. In addition, #3380 removes user credentials, user documents and design documents from logfiles as much as possi- ble. Further, #3489 no longer logs all of the messages received by a terminated internal Erlang process. • #3421, #3500: CouchDB now supports SpiderMonkey 78 and 86. • #3422: CouchDB now supports Erlang/OTP 23 and error_logger reports for Erlang/OTP >= 21. • #3566: CouchDB now also supports Erlang/OTP 24. • #3571: CouchDB no longer supports Erlang/OTP 19. • #3643: Contribute a custom Erlang network protocol to CouchDB, users can specify nodes to use TCP or TLS. [image: The SSL/TLS handshake enables the TLS client and server to establish the secret keys with which they communicate.] [image] • #3472, #3473, #3609: Migrate some config options from [httpd] to [chttpd], migrate some from [couch_httpd_auth] to [chttpd_auth], and comment all out in the default.ini. • Config options moved from [httpd] to [chttpd]: allow_jsonp, changes_timeout, config_whitelist, enable_cors, secure_rewrites, x_forwarded_host, x_forwarded_proto, x_forwarded_ssl, en- able_xframe_options, max_http_request_size. • Config options moved from [couch_httpd_auth] to [chttpd_auth]: au- thentication_redirect, timeout, auth_cache_size, allow_persis- tent_cookies, iterations, min_iterations, max_iterations, pass- word_scheme, proxy_use_secret, public_fields, secret, users_db_pub- lic, x_auth_roles, x_auth_token, x_auth_username, cookie_domain, same_site • #3586: We added a new way of specifying basic auth credentials which can include various characters previously not allowed to be included in the url info part of endpoint urls. • #3483: We added a way of specifying requirements for new user pass- words using a list of regular expressions. • #3506, #3416, #3377: CouchDB now provides a Prometheus compatible endpoint at GET /_node/{node-name}/_prometheus. A configuration op- tion allows for scraping via a different port (17986) that does not require authentication, if desired. More information is available at the Prometheus API endpoint summary. • #3697, COUCHDB-883 (JIRA): As an opt-in policy, CouchDB can now stop encoding the plus sign + in non-query parts of URLs, in compliance with the original CouchDB standards. The opt-in is via the [chttpd] decode_plus_to_space = true setting. In CouchDB 4.x, this is going to be an opt-out policy. • #3724: CouchDB now has new CSP settings for attachments and show/list functions. This deprecates the old [csp] enable and [csp] header_value settings, replacing them with the new [csp] utils_enable and [csp] utils_header_value settings respectively. In addition, new settings for attachments_enable, attachments_header_value, showlist_enable and showlist_header_value now are available. Documen- tation is in the default.ini file. • #3734, #3733: Users with databases that have low q and n values would often receive the No DB shards could be opened error when the cluster is overloaded, due to a hard-coded 100ms timeout. CouchDB now calcu- lates a more reasonable timeout, based on the number of shards and the overall maximum fabric request timeout limit, using a geometric series. Performance • #3337: Developer nodes now start faster when using the dev/run script. • #3366: The monolithic couch_server process has been sharded for per- formance. 
Previously, as a single gen_server, the process would have a finite throughput that, in busy clusters, is easily breached caus- ing a sizeable backlog in the message queue, ultimately leading to failure and errors. No more! The aggregate message queue info is still available in the _system output. ( #3370 ) • #3208: CouchDB now uses the latest ibrowse 4.4.2 client for the replicator. • #3600, #3047, #3019: The default slack channel for smoosh auto-com- paction has been increased to a more reasonable value, reducing load on systems that would have normally been idle in CouchDB 2.x (where no auto-compaction daemon exists). • #3711: Changes feeds may no longer rewind after shard moves, assuming the node and range specified by the changes feed nonce can still match an existing nodes shard. Bugfixes • Complete retirement of the JavaScript test suite - replaced by Elixir. Hooray! • #3165: Allow configurability of JWT claims that require a value. Also fixes #3232. Further, #3392 no longer validates claims provided that CouchDB does not require. • #3160, #3161: The run_queue statistic now returns valid information even when using Erlang BEAM dirty CPU and IO queues. • #3162: Makefiles updated to include local configs & clean configs when running make devclean. • #3195: The max_document_size parameter now has a clearer explanation in default.ini. • #3207, #2536: Improve the INSTALL.Unix.md file. • #3212: Base and extra headers are properly combined when making replicator requests that contain duplicate headers. • #3201: When using a POST with request body to pass parameters to a view-like request, the boolean parameters are accepting only JSON strings, but not booleans. Now, CouchDB accepts true and false for the stable parameter, in addition to "true" and "false". comment in • #1988: Attachment operations PUT /db/doc and POST /db now perform consistent attachment name validation. • #3249: Documents with lots of conflicts no longer blow up couchjs if the user calls _changes with a JS filter and with style=all_docs. • #3144: Respawning compaction jobs to catch up with intervening changes are now handled correctly by the smoosh monitor. • #3252: CouchDB now exports the couch_util:json_decode/2 function to support maps instead of the default data structure. • #3255, #2558: View files that have incorrect db_headers now reset the index forcing a rebuild. • #3271: Attachments that are stored uncompressed but later replicated to nodes that compress the attachment no longer fail an internal md5 check that would break eventual consistency between nodes. • #3277: req_body requests that have req_body set already now properly return the field without parsing. • #3279: Some default headers were missing from some responses in replication, including X-CouchDB-Body-Time and X-Couch-Request-ID. • #3329, #2962: CouchDB no longer returns broken couchjs processes to the internal viewserver process pool. • #3340, #1943: PUTs of multipart/related attachments now support a Transfer-Encoding value of chunked. Hooray! • #2858, #3359: The cluster setup wizard no longer fails when a request to / is not made before a request to finish_cluster. • #3368: Changing the max_dbs_open configuration setting correctly en- sures that each new couch_server_X property receives 1/num_servers() of it. 
• #3373: Requests to {db}/_changes with a custom filter no longer re- sult in a fabric request timeout if the request body is not available to additional cluster nodes, resulting in a more descriptive exit message and proper JSON object validation in the payload. • #3409: The internal chttpd_external:json_req_obj/2 function now reads the cached peer before falling back to a socket read operation. • #3335, #3617, #3708: The COUCHDB_FAUXTON_DOCROOT environment variable is now introduced to allow its explicit overriding at startup. • #3471: http clients should no longer receive stacktraces unexpect- edly. • #3491: libicu tests no longer fail on older OS releases such as Cen- tOS 6 and 7. • #3541: Usernames and passwords can now contain @ and not break the CouchDB replicator. • #3545: The dreyfus_index_manager process now supports offheap message queues. • #3551: The replication worker pool now properly cleans up worker processes as they are done via the worker_trap_exits = false setting. • #3633, #3631: All code paths for creating databases now fully respect db creation options, including partitioning options. • #3424, #3362: When using latest=true and an old revision with con- flicting children as rev is specified, CouchDB no longer returns an "error": "case_clause" response. • #3673: Non-existent attachments now return a 404 when the attachment is missing. • #3698: The dev/run development script now allows clusters where n > 5. • #3700: The maybe_close message is now sent to the correct internal process. • #3183: The smoosh operator guide now recommends to use the rpc:multi- call function. • #3712: Including a payload within a DELETE operation no longer hangs the next request made to the same mochiweb acceptor. • #3715: For clusters with databases where n > [cluster] n, attachments chunks are longer dropped on quorum writes. • #3507: If a file is truncated underneath CouchDB, CouchDB will now log the filename if it finds this situation with a file_truncate_er- ror. • #3739: Shards with large purge sequences no longer fail to split in a shard splitting job. • #3754: Always return views meta info when limit=0 and sorted=true. • #3757: Properly sort descending=true view results with a keys list. • #3763: Stabilize view row sorting order when they are merged by the coordinator. Other • Donuts for everyone! Er, not really - thank you for reading the 3.2 release notes. 3.1.x Branch • Version 3.1.2 • Version 3.1.1 • Version 3.1.0 Version 3.1.2 This is a security release for a low severity vulnerability. Details of the issue will be published one week after this release. See the CVE database for details at a later time. Version 3.1.1 Features and Enhancements • #3102, #1600, #2877, #2041: When a client disconnects unexpectedly, CouchDB will no longer log a normal : unknown error. Bring forth the rainbows. [image: The Gravity Falls gnome pukes some rainbows for us.] [image] • #3109: Drilldown parameters for text index searches may now be speci- fied as a list of lists, to avoid having to define this redundantly in a single query. (Some languages dont have this facility.) • #3132: The new [chttpd] buffer_response option can be enabled to de- lay the start of a response until the end has been calculated. This increases memory usage, but simplifies client error handling as it eliminates the possibility that a response may be deliberately termi- nated midway through, due to a timeout. This config value may be changed at runtime, without impacting any in-flight responses. 
Performance Bugfixes • #2935: The replicator now correctly picks jobs to restart during rescheduling, where previously with high load it may have failed to try to restart crashed jobs. • #2981: When handling extremely large documents (50MB), CouchDB can no longer time out on a gen_server:call if bypassing the IOQ. • #2941: CouchDB will no longer fail to compact databases if it finds files from a 2.x compaction process (prior to an upgrade) on disk. • #2955 CouchDB now sends the correct CSP header to ensure Fauxton op- erates correctly with newer browsers. • #3061, #3080: The couch_index server wont crash and log errors if a design document is deleted while that index is building, or when a ddoc is added immediately after database creation. • #3078: CouchDB now checks for and complains correctly about invalid parameters on database creation. • #3090: CouchDB now correctly encodes URLs correctly when encoding the atts_since query string. • #2953: Some parameters not allowed for text-index queries on parti- tioned database are now properly validated and rejected. • #3118: Text-based search indexes may now be cleaned up correctly, even if the design document is now invalid. • #3121: fips is now only reported in the welcome message if FIPS mode was enabled at boot (such as in vm.args). • #3128: Using COPY to copy a document will no longer return a JSON re- sult with two ok fields. • #3138: Malformed URLs in replication requests or documents will no longer throw an error. Other • JS tests skip faster now. • More JS tests ported into elixir: reader_acl, reduce_builtin, re- duce_false, rev_stemming, update_documents, view_collation_raw, view_compaction, all the view_multi_key tests, view_sandboxing, view_update_seq. Version 3.1.0 Features and Enhancements • #2648: Authentication via JSON Web Token (JWT). Full documentation is at the friendly link. • #2770: CouchDB now supports linking against SpiderMonkey 68, the cur- rent Mozilla SpiderMonkey ESR release. This provides direct support for packaging on the latest operating system variants, including Ubuntu 20.04 Focal Fossa. • A new Fauxton release is included, with updated dependencies, and a new optional CouchDB news page. Performance • #2754: Optimized compactor performance, resulting in a 40% speed im- provement when document revisions approach the revs_limit. The fixes also include additional metrics on size tracking during the sort and copy phases, accessible via the GET /_active_tasks endpoint. • A big bowl of candy! OK, no, not really. If you got this farthank you for reading. 3.0.x Branch • Upgrade Notes • Version 3.0.1 • Version 3.0.0 Upgrade Notes • #2228: The default maximum document size has been reduced to 8MB. This means that databases with larger documents will not be able to replicate into CouchDB 3.0 correctly without modification. This change has been made in preparation for anticipated hard upper limits on document size imposed by CouchDB 4.0. For 3.x, the max document size setting can be relaxed via the [couchdb] max_document_size con- fig setting. • #2228: The default database sharding factor q has been reduced to 2 by default. This, combined with automated database resharding (see below), is a better starting place for new CouchDB databases. As in CouchDB 2.x, specify ?q=# to change the value upon database creation if desired. The default can be changed via the config [cluster] q setting. • #1523, #2092, #2336, #2475: The node-local HTTP interface, by default exposed on port 5986, has been removed. 
All functionality previously available at that port is now available on the main, clustered interface (by default, port 5984). Examples:

   GET /_node/{nodename}/_stats
   GET /_node/{nodename}/_system
   GET /_node/{nodename}/_all_dbs
   GET /_node/{nodename}/_uuids
   GET /_node/{nodename}/_config
   GET /_node/{nodename}/_config/couchdb/uuid
   POST /_node/{nodename}/_config/_reload
   GET /_node/{nodename}/_nodes/_changes?include_docs=true
   PUT /_node/{nodename}/_dbs/{dbname}
   POST /_node/{nodename}/_restart
   GET /_node/{nodename}/{db-shard}
   GET /_node/{nodename}/{db-shard}/{doc}
   GET /_node/{nodename}/{db-shard}/{ddoc}/_info

and so on. Documentation has been updated to reflect this change.
WARNING: The _node endpoint is for administrative purposes only; it is NOT intended as an alternative to the regular endpoints (GET /dbname, PUT /dbname/docid, and so on).
• #2389: CouchDB 3.0 now requires a server admin user to be defined at startup, or will print an error message and exit. If you do not have one, be sure to create an admin user. (The Admin Party is now over.) [image: Dizzy the cat with a Santa hat.] CC-BY-NC 2.0: hehaden @ Flickr.
• #2576: CouchDB 3.0 now requires admin-level access for the /_all_dbs endpoint.
• #2339: All databases are now created by default as admin-only. That is, the default new database _security object is now:

   { "members" : { "roles" : [ "_admin" ] }, "admins" : { "roles" : [ "_admin" ] } }

This can be changed after database creation.
• Due to code changes in #2324, it is not possible to upgrade transparently from CouchDB 1.x to 3.x. In addition, the couchup utility has been removed from CouchDB 3.0 by #2399. If you are upgrading from CouchDB 1.x, you must first upgrade to CouchDB 2.3.1 to convert your database and indexes, using couchup if desired. You can then upgrade to CouchDB 3.0. Or, you can start a new CouchDB 3.0 installation and replicate directly from 1.x to 3.0.
• #1833, #2358, #1871, #1857: CouchDB 3.0 supports running only under the following Erlang/OTP versions:
  • 19.x - soft support only. No longer tested, but should work.
  • 20.x - must be newer than 20.3.8.11 (20.0, 20.1, 20.2 versions all invalid)
  • 21.x - for 21.2, must be newer than 21.2.3
  • 22.x - for 22.0, must be newer than 22.0.5
• #1804: By default, views are limited to return a maximum of 2**28 (268435456) results. This limit can be configured separately for views and partitioned views via the query_limit and partition_query_limit values in the ini file [query_server_config] section.
• After upgrading all nodes in a cluster to 3.0, add [rexi] use_kill_all = true to local.ini to save some intra-cluster network bandwidth.
Deprecated feature removal
The following features, deprecated in CouchDB 2.x, have been removed or replaced in CouchDB 3.0:
• #2089, #2128, #2251: Local endpoints for replication targets, which never functioned as expected in CouchDB 2.x, have been completely removed. When replicating databases, always specify a full URL for the source and target. In addition, the node-local _replicator database is no longer automatically created.
• #2163: The disk_size and data_size fields have been retired from the database info object returned by GET /{db}/. These were deprecated in CouchDB 2.x and replaced by the sizes object, which contains the improved file, active and external size metrics. Fauxton has been updated to match.
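As an informal illustration of the sizes object that replaces the retired fields (#2163), a database info request returns something along these lines; the database name and numbers below are made up:

   GET /mydb
   { "db_name": "mydb", ...,
     "sizes": { "file": 172428, "external": 76843, "active": 65912 },
     ... }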
• #2173: The ability to submit multiple queries against a view using the POST to /{db}/_design/{ddoc}/_view/{view} with the ?queries= op- tion has been replaced by the new queries endpoint. The same is true of the _all_docs, _design_docs, and _local_docs endpoints. Specify a keys object when POST-ing to these endpoints. • #2248: CouchDB externals (_external/) have been removed entirely. • #2208: CouchDB no longer supports the delayed_commits option in the configuration file. All writes are now full commits. The /_en- sure_full_commit API endpoint has been retained (as a no-op) for backwards compatibility with old CouchDB replicators. • #2395: The security object in the _users database cannot be edited by default. A setting exists in the configuration file to revert this behaviour. The ability to override the disable setting is expected to be removed in CouchDB 4.0. Deprecated feature warnings The following features are deprecated in CouchDB 3.0 and will be re- moved in CouchDB 4.0: • Show functions (/{db}/{ddoc}/_show) • List functions (/{db}/{ddoc}/_list) • Update functions (/{db}/{ddoc}/_update) • Virtual hosts and ini-file rewrites • Rewrite functions (/{db}/{ddoc}/_rewrite) Version 3.0.1 Features and Enhancements • Fauxton was updated to version v1.2.3. Bugfixes • #2441: A memory leak when encoding large binary content was patched. This should resolve a long-standing gradual memory increase bug in CouchDB. • #2613: Simultaneous attempts to create the same new database should no longer result in a 500 Internal Server Error error. • #2678: Defaults for the smoosh compaction daemon are now consistent with the shipped default.ini file. • #2680: The Windows CouchDB startup batch file will no longer fail to start CouchDB if incompatible versions of OpenSSL are on the PATH. • #2741: A small performance improvement in the couch_server process was made. • #2745: The require_valid_user exception logic was corrected. • #2643: The users_db_security_editable setting is now in the correct section of the default.ini file. • #2654: Filtered changes feeds that need to rewind partially should no longer rewind all the way to the beginning of the feed. • #2655: When deleting a session cookie, CouchDB should now respect the operator-specified cookie domain, if set. • #2690: Nodes that re-enter a cluster after a database was created (while the node was offline or in maintenance mode) should more cor- rectly handle creating local replicas of that database. • #2805: Mango operators more correctly handle being passed empty ar- rays. • #2716, #2738: The remsh utility will now try and guess the node name and Erlang cookie of the local installation. It will also respect the COUCHDB_ARGS_FILE environment variable. • #2797: The cluster setup workflow now uses the correct logging mod- ule. • #2818: Mango now uses a safer method of bookmark creation that pre- vents unexpectedly creating new Erlang atoms. • #2756: SpiderMonkey 60+ will no longer corrupt UTF-8 strings when various JS functions are applied to them. • Multiple test case improvements, including more ports of JS tests to Elixir. Version 3.0.0 Features and Enhancements • #1789: User-defined partitioned databases. These special databases support user-driven placement of documents into the same shard range. JavaScript views and Mango indexes have specific optimizations for partitioned databases as well. Two tweakable configuration parameters exist: • #1842: Partition size limits. By default, each partition is limited to 10 GiB. 
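As a brief sketch of the user-defined partitioned databases described above (#1789); the database name, document IDs, host, and credentials are placeholders:

   # create a partitioned database
   curl -X PUT 'http://adm:pass@127.0.0.1:5984/sensors?partitioned=true'
   # document IDs in a partitioned database take the form <partition>:<doc id>
   curl -X PUT 'http://adm:pass@127.0.0.1:5984/sensors/sensor-1234:reading-0001' \
        -H 'Content-Type: application/json' -d '{"value": 17}'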
• #1684: Partitioned database support can be disabled via feature flag in default.ini. • #1972, #2012: Automated shard splitting. Databases can now be re-sharded while online to increase the q factor to a larger number. This can be configured to require specific node and range parameters upon execution. • #1910: Automatic background indexing, internally known as ken. This subsystem ensures secondary indexes (such as JavaScript, Mango, and text search) are kept up to date, without requiring an external query to trigger building them. Many configuration parameters are avail- able. • #1904: Completely rewritten automatic compaction daemon, internally known as smoosh. This subsystem automatically triggers background compaction jobs for both databases and views, based on configurable thresholds. • #1889, #2408: New IO Queue subsystem implementation. This is highly configurable and well-documented. • #2436, #2455: CouchDB now regression tests against, and officially supports, running on the arm64v8 (aarch64) and ppc64le (ppc64el) ma- chine architectures. Convenience binaries are generated on these ar- chitectures for Debian 10.x (buster) packages, and for the Docker containers. • #1875, #2437, #2423: CouchDB now supports linking against SpiderMon- key 60 or SpiderMonkey 1.8.5. SpiderMonkey 60 provides enhanced sup- port for ES5, ES6, and ES2016+. Full compatibility information is available at the ECMAScript compatibility table. Click on Show obso- lete platforms, then look for FF 60 ESR in the list of engine types. However, it was discovered that on some ARM 64-bit distributions, SM 60 segfaults frequently, including the SM 60 packages on CentOS 8 and Debian 10. As a result, CouchDBs convenience binaries only link against SM 60 on the ``x86_64`` and ``ppc64le`` architectures. This includes the Docker image for these architectures. At present, CouchDB ships with SM 60 linked in on the following bi- nary distributions: • Debian buster (10.x) • CentOS / RedHat 8.x • macOS (10.10+) • Windows (7+) • Docker (3.0.0) • FreeBSD (CURRENT) We expect to add SM 60 support to Ubuntu with Focal Fossa (20.04 LTS) when it ships in April 2020. It is unlikely we will backport SM 60 packages to older versions of Debian, CentOS, RedHat, or Ubuntu. • The Windows installer has many improvements, including: • Prompts for an admin user/password as CouchDB 3.0 requires * Will not overwrite existing credentials if in place • No longer remove user-modified config files, closing #1989 * Also will not overwrite them on install. • Checkbox to disable installation of the Windows service • Silent install support. • Friendly link to these online release notes in the exit dialog • Higher resolution icon for HiDPI (500x500) WARNING: Windows 8, 8.1, and 10 require the .NET Framework v3.5 to be in- stalled. • #2037: Dreyfus, the CouchDB side of the Lucene-powered search solu- tion, is now shipped with CouchDB. When one or more Clouseau Java nodes are joined to the cluster, text-based indexes can be enabled in CouchDB. It is recommended to have as many Clouseau nodes as you have CouchDB nodes. Search is advertised in the feature list present at GET / if configured correctly (#2206). Configuration and installation documentation is available. • #2411: The /_up endpoint no longer requires authentication, even when require_valid_user is true. • #2392: A new _metrics role can be given to a user. This allows that user access only to the /_node/{node}/_stats and /_node/{node}/_sys- tem endpoints. 
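A hedged sketch of granting the new _metrics role described above (#2392); the user name, password, and host are illustrative:

   # an admin creates (or updates) a user document carrying the _metrics role
   curl -X PUT 'http://adm:pass@127.0.0.1:5984/_users/org.couchdb.user:metrics-bot' \
        -H 'Content-Type: application/json' \
        -d '{"name": "metrics-bot", "password": "s3cret", "roles": ["_metrics"], "type": "user"}'
   # that user may now read the stats endpoints, but nothing else
   curl 'http://metrics-bot:s3cret@127.0.0.1:5984/_node/_local/_stats'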
• #1912: A new alternative systemd-journald logging backend has been added, and can be enabled through the ini file. The new backend does not include CouchDB's microsecond-accurate timestamps, and uses the sd-daemon(3) logging levels.
• #2296, #1977: If the configuration file setting [couchdb] single_node is set to true, CouchDB will automatically create the system databases on startup if they are not present.
• #2338, #2343: POST requests to CouchDB views and the /{db}/_all_docs, /{db}/_local_docs and /{db}/_design_docs endpoints now support the same functionality as GET. Parameters are passed in the body as a JSON object, rather than in the URL, when using POST.
• #2292: The _scheduler/docs and _scheduler/info endpoints now return detailed replication stats for running and pending jobs.
• #2282, #2272, #2290: CouchDB now supports specifying separate proxies for both the source and target in a replication via the source_proxy and target_proxy keys. The API documentation has been updated.
• #2240: Headers are now returned from the /{db}/_changes feed immediately, even when there are no changes available. This avoids client blocking.
• #2005, #2006: The name of any node can now be retrieved through the new API endpoint GET /_node/{node-name}.
• #1766: Timeouts for requests, all_docs, attachments, views, and partitioned view requests can all be specified separately in the ini file under the [fabric] section. See default.ini for more detail.
• #1963: Metrics are now kept on the number of partition and global view queries, along with the number of timeouts that occur.
• #2452, #2221: A new configuration field [couch_httpd_auth] same_site has been added to set the value of the CouchDB auth cookie's SameSite attribute. It may be necessary to set this to strict for compatibility with future versions of Google Chrome. If CouchDB CORS support is enabled, set this to None.
Performance
• #2277: The couch_server process has been highly optimized, supporting significantly more load than before.
• #2360: It is now possible to make the rexi interface's unacked message limit configurable. A new, more optimized default (5, lowered from 10) has been set. This results in a ~50% improvement on view queries on large clusters with q >= 8.
• #2280: Connection sharing for replication now functions correctly when replicating through a forward proxy. Closes #2271.
• #2195, #2207: Metrics aggregation now supports CouchDB systems that sleep or hibernate, ensuring that wakeup does not trigger thousands of unnecessary function calls.
• #1795: Avoid calling fabric:update_docs with empty doc lists.
• #2497: The setup wizard no longer automatically creates the _global_changes database, as the majority of users do not need this functionality. This reduces overall CouchDB load.
Bugfixes
• #1752, #2398, #1803: The cluster setup wizard now ensures a consistent UUID and http secret across all nodes in a cluster. CouchDB admin passwords are also synced when the cluster setup wizard is used. This prevents being logged out when using Fauxton as a server admin user through a load balancer.
• #2388: A compatibility change has been made to support replication with future databases containing per-document access control fields.
• #2379: Any replicator error messages will provide an object in the response, or null, but never a string.
• #2244, #2310: CouchDB will no longer send more data than is requested when retrieving partial attachment data blocks.
• #2138: Manual operator updates to a database's shard map will not corrupt additional database properties, such as partitioning values.
• #1877: The _purge and _purged_infos_limit endpoints are now correctly restricted to server admin only.
• #1794: The minimum purge sequence value for a database is now gathered without a clustered _all_docs lookup.
• #2351: A timeout case clause in fabric_db_info has been normalised to match other case clauses.
• #1897: The /{db}/_bulk_docs endpoint now correctly catches invalid (i.e., non-hexadecimal) rev values and responds with a 400 Bad Request error.
• #2321: CouchDB no longer requires Basic auth credentials to reach the /_session endpoint for login, even when require_valid_user is enabled.
• #2295: CouchDB no longer marks a job as failed permanently if the internal doc processor crashes.
• #2178: View compaction files are now removed on view cleanup.
• #2179: The error message logged when CouchDB does not have a _users database is now less scary.
• #2153: CouchDB may no longer return a badmatch error when querying all_docs with a passed keys array.
• #2137: If search is not available, return a 400 Bad Request instead of a 500 Internal Server Error status code.
• #2077: Any failed fsync(2) calls are now correctly raised to avoid data corruption arising from retry attempts.
• #2027: Handle epoch mismatch when duplicate UUIDs are created through invalid operator intervention.
• #2019: If a database is deleted and re-created while internal cluster replication is still active, CouchDB will no longer retry to delete it continuously.
• #2003, #2438: CouchDB will no longer automatically reset an index file if any attempt to read its header fails (such as when the couch_file process terminates unexpectedly). CouchDB now also handles the case when a view file lacks a proper header.
• #1983: Improve database external size calculation to be more precise.
• #1971: Correctly compare ETags using weak comparison methods to support the W/ prefix added by some load balancer configurations.
• #1901: An invalid revision specified for a document update will no longer result in a badarg crash.
• #1845: The end_time field in /_replicate now correctly converts time to UTC.
• #1824: rexi stream workers are now cleaned up when the coordinator process is killed, such as when the ddoc cache is refreshed.
• #1770: Invalid database _security objects no longer return a function_clause error and stack trace.
• #2412: Mango execution stats now correctly count documents read which weren't followed by a match within a given shard.
• #2393, #2143: It is now possible to override the query server environment variables COUCHDB_QUERY_SERVER_JAVASCRIPT and COUCHDB_QUERY_SERVER_COFFEESCRIPT without overwriting the couchdb/couchdb.cmd startup scripts.
• #2426, #2415: The replicator now better handles the situation where design document writes to the target fail when replicating with non-admin credentials.
• #2444, #2413: Replicator error messages are now significantly improved, reducing function_clause responses.
• #2454: The replication auth session plugin now ignores other cookies it may receive without logging an error.
• #2458: Partitioned queries and dreyfus search functions no longer fail if there is a single failed node or rexi worker error.
• #1783: Mango text indexes no longer error when given an empty selector or operators with empty arrays.
• #2466: Mango text indexes no longer error if the indexed document revision no longer exists in the primary index.
• #2486: The $lt, $lte, $gt, and $gte Mango operators are correctly quoted internally when used in conjunction with a text index search. • #2493: The couch_auth_cache no longer has a runaway condition in which it creates millions of monitors on the _users database. Other The 3.0.0 release also includes the following minor improvements: • #2472: CouchDB now logs the correct, clustered URI at startup (by de- fault: port 5984.) • #2034, #2416: The path to the Fauxton installation can now be speci- fied via the COUCHDB_FAUXTON_DOCROOT environment variable. • #2447: Replication stats are both persisted when jobs are re-created, as well as properly handled when bulk document batches are split. • #2410, #2390, #1913: Many metrics were added for Mango use, including counts of unindexed queries, invalid index queries, docs examined that do and dont meet cluster quorum, query time, etc. • #2152, #2504: CouchDB can now be started via a symlink to the binary on UNIX-based platforms. • #1844: A new internal API has been added to write custom Erlang re- quest-level metrics reporting plugins. • #2293, #1095: The -args_file, -config and -couch_ini parameters may now be overridden via the COUCHDB_INI_FILES environment variable on UNIX-based systems. • #2352: The remsh utility now searches for the Erlang cookie in ERL_FLAGS as well as vm.args. • #2324: All traces of the (never fully functional) view-based _changes feed have been expunged from the code base. • #2337: The md5 shim (introduced to support FIPS-compliance) is now used consistently throughout the code base. • #2270: Negative and non-integer heartbeat values now return 400 Bad Request. • #2268: When rescheduling jobs, CouchDB now stops sufficient running jobs to make room for the pending jobs. • #2186: CouchDB plugin writers have a new field in which endpoint cre- dentials may be stashed for later use. • #2183: dev/run now supports an --extra-args flag to modify the Erlang runtime environment during development. • #2105: dev/run no longer fails on unexpected remote end connection close during cluster setup. • #2118: Improve couch_epi process replacement mechanism using map childspecs functionality in modern Erlang. • #2111: When more than MaxJobs replication jobs are defined, CouchDB now correctly handles job rotation when some jobs crash. • #2020: Fix full ring assertion in fabric stream shard replacements • #1925: Support list for docid when using couch_db:purge_docs/3. • #1642: io_priority is now set properly on view update and compaction processes. • #1865: Purge now supports >100 document IDs in a single request. • #1861: The vm.args file has improved commentary. • #1808: Pass document update type for additional checks in be- fore_doc_update. • #1835: Module lists are no longer hardcoded in .app files. • #1798, #1933: Multiple compilation warnings were eliminated. • #1826: The couch_replicator_manager shim has been fully removed. • #1820: After restarting CouchDB, JS and Elixir tests now wait up to 30s for it to be ready before timing out. • #1800: make elixir supports specifying individual tests to run with tests=. • #1805: dev/run supports --with-haproxy again. • #1774: dev/run now supports more than 3 nodes. • #1779: Refactor Elixir test suite initialization. • #1769: The Elixir test suite uses Credo for static analysis. • #1776: All Python code is now formatted using Python black. • #1786: dev/run: do not create needless dev/data/ directory. • #2482: A redundant get_ring_opts call has been removed from drey- fus_fabric_search. 
• #2506: CouchDBs release candidates no longer propagate the RC tags into each Erlang applications version string. • #2511: recon, the Erlang diagnostic toolkit, has been added to CouchDBs build process and ships in the release + convenience bina- ries. • Fauxton updated to v1.2.3, which includes: • Support multiple server-generated warnings when running queries • Partitioned database support • Search index support • Remove references to deprecated dbinfo fields • Improve accessibility for screen readers • Numerous CSS fixes • Improved test cases: • Many, many test race conditions and bugs have been removed (PR list too long to include here!) • More test cases were ported to Elixir, including: • Cluster with and without quorum tests (#1812) • delayed_commits (#1796) • multiple_rows (#1958) • invalid_docids (#1968) • replication (#2090) • All attachment_* tests (#1999) • copy_doc (#2000) • attachments (#1953) • erlang_views (#2237) • auth_cache, cookie_auth, lorem*, multiple_rows, users_db, utf8 (- #2394) • etags_head (#2464, #2469) • #2431: chttpd_purge_tests have been improved in light of CI fail- ures. • #2432: Address flaky test failure on t_invalid_view/1. • #2363: Elixir tests now run against a single node cluster, in line with the original design of the JavaScript test suite. This is a permanent change. • #1893: Add w:3 for lots of doc tests. • #1939, #1931: Multiple fixes to improve support in constrained CI environments. • #2346: Big-endian support for the couch_compress tests. • #2314: Do not auto-index when testing update=false in Mango. • #2141: Fix couch_views encoding test. • #2123: Timeout added for fold_docs-with_different_keys test. • #2114: EUnit tests now correctly inherit necessary environment variables. • #2122: :meck.unload() is now called automatically after every test. • #2098: Fix cpse_test_purge_replication eunit test. • #2085, #2086: Fix a flaky mem3_sync_event_listener test. • #2084: Increase timeouts on two slow btree tests. • #1960, #1961: Fix for chttpd_socket_buffer_size_test. • #1922: Tests added for shard splitting functionality. • #1869: New test added for doc reads with etag If-None-Match header. • #1831: Re-introduced cpse_test_purge_seqs test. • #1790: Reorganise couch_flag_config_tests into a proper suite. • #1785: Use devclean on elixir target for consistency of Makefile. • #2476: For testing, Triq has been replaced with PropEr as an op- tional dependency. • External dependency updates: • #1870: Mochiweb has been updated to 2.19.0. • #1938: Folsom has been updated to 0.8.3. • #2001: ibrowse has been updated to 4.0.1-1. • #2400: jiffy has been updated to 1.0.1. • A llama! OK, no, not really. If you got this farthank you for read- ing. 2.3.x Branch • Upgrade Notes • Version 2.3.1 • Version 2.3.0 Upgrade Notes • #1602: To improve security, there have been major changes in the con- figuration of query servers, SSL support, and HTTP global handlers: 1. Query servers Query servers are NO LONGER DEFINED in the .ini files, and can no longer be altered at run-time. The JavaScript and CoffeeScript query servers continue to be en- abled by default. Setup differences have been moved from de- fault.ini to the couchdb and couchdb.cmd start scripts respec- tively. Additional query servers can now be configured using environment variables: export COUCHDB_QUERY_SERVER_PYTHON="/path/to/python/query/server.py with args" couchdb where the last segment in the environment variable (_PYTHON) matches the usual lowercase(!) 
query language in the design doc language field (here, python.) Multiple query servers can be configured by using more environment variables. You can also override the default servers if you need to set com- mand- line options (such as couchjs stack size): export COUCHDB_QUERY_SERVER_JAVASCRIPT="/path/to/couchjs /path/to/main.js -S <STACKSIZE>" couchdb 2. Native Query Servers The mango query server continues to be enabled by default. The Er- lang query server continues to be disabled by default. This change adds a [native_query_servers] enable_erlang_query_server = BOOL setting (defaults to false) to enable the Erlang query server. If the legacy configuration for enabling the query server is de- tected, that is counted as a true setting as well, so existing configurations continue to work just fine. 3. SSL Support Enabling SSL support in the ini file is now easier: [ssl] enable = true If the legacy httpsd configuration is found in your ini file, this will still enable SSL support, so existing configurations do not need to be changed. 4. HTTP global handlers These are no longer defined in the default.ini file, but have been moved to the couch.app context. If you need to customize your han- dlers, you can modify the app context using a couchdb.config file as usual. • #1602: Also to improve security, the deprecated os_daemons and couch_httpd_proxy functionality has been completely removed ahead of the planned CouchDB 3.0 release. We recommend the use of OS-level daemons such as runit, sysvinit, systemd, upstart, etc. to launch and maintain OS daemons instead, and the use of a reverse proxy server in front of CouchDB (such as haproxy) to proxy access to other services or domains alongside CouchDB. • #1543: The node-local (default port 5986) /_restart endpoint has been replaced by the clustered (default port 5984) endpoint /_node/{node-name}/_restart and /_node/_local/_restart endpoints. The node-local endpoint has been removed. • #1764: All python scripts shipped with CouchDB, including couchup and the dev/run development cluster script, now specify and require Python 3.x. • #1396: CouchDB is now compatible with Erlang 21.x. • #1680: The embedded version of rebar used to build CouchDB has been updated to the last version of rebar2 available. This assists in building on non-x86 platforms. • #1857: Refuse building with known bad versions of Erlang. Version 2.3.1 Features • #1811: Add new /{db}/_sync_shards endpoint (admin-only). • #1870: Update to mochiweb 2.19.0. See also #1875. • #1857: Refuse building with known bad versions of Erlang. • #1880: Compaction: Add snooze_period_ms for finer tuning. Bugfixes • #1795: Filter out empty missing_revs results in mem3_rep. • #1384: Fix function_clause error on invalid DB _security objects. • #1841: Fix end_time field in /_replicate response. • #1860: Fix read repair in a mixed cluster environment. • #1862: Fix fabric_open_doc_revs. • #1865: Support purge requests with more than 100 doc ids. • #1867: Fix timeout in chttpd_purge_tests. • #1766: Add default fabric request timeouts. • #1810: Requests return 400 Bad Request when URL length exceeds 1460 characters. See #1870 for details. • #1799: Restrict _purge to server admin. • #1874: This fixes inability to set keys with regex symbols in them. • #1901: Fix badarg crash on invalid rev for individual doc update. • #1897: Fix from_json_obj_validate crash when provided rev isnt a valid hex. • #1803: Use the same salt for admin passwords on cluster setup. • #1053: Fix python2 compatibility for couchup. 
• #1905: Fix python3 compatibility for couchup. Version 2.3.0 Features • (Multiple) Clustered purge is now available. This feature restores the CouchDB 1.x ability to completely remove any record of a document from a database. Conditions apply; to use the feature safely, and for full details, read the complete Clustered Purge documentation. • #1658: A new config setting is available, allowing an administrator to configure an initial list of nodes that should be contacted when a node boots up. Nodes in the seedlist that are successfully reached will be added to that nodes _nodes database automatically, triggering a distributed Erlang connection and replication of the internal sys- tem databases to the new node. This can be used instead of manual config or the cluster setup wizard to bootstrap a cluster. The progress of the initial seeding of new nodes is exposed at the GET /_up endpoint. • Replication supports ipv6-only peers after updating ibrowse depen- dency. • #1708: The UUID of the server/cluster is once again exposed in the GET / response. This was a regression from CouchDB 1.x. • #1722: Stats counts between job runs of the replicator are no longer reset on job restart. • #1195, #1742: CouchDBs _bulk_get implementation now supports the mul- tipart/mixed and multipart/related content types if requested, ex- tending compatibility with third-party replication clients. Performance • #1409: CouchDB no longer forces the TCP receive buffer to a fixed size of 256KB, allowing the operating system to dynamically adjust the buffer size. This can lead to significantly improved network per- formance when transferring large attachments. • #1423: Mango selector matching now occurs at the shard level, reduc- ing the network traffic within a cluster for a mango query. • #1423: Long running operations at the node level could exceed the in- ter-node timeout, leading to a fabric timeout error in the logfile and a cancellation of the task. Nodes can now ping to stop that from happening. • #1560: An optimization to how external data sizes of attachments were recorded was made. • #1586: When cleaning up outdated secondary index files, the search is limited to the index directory of a specific database. • #1593: The couch_server ETS table now has the read_concurrency option set, improving access to the global list of open database handles. • #1593: Messages to update the least-recently used (LRU) cache are not sent when the [couchdb] update_lru_on_read setting is disabled. • #1625: All nodes in a cluster now run their own rexi server. Bugfixes • #1484: _stats now correctly handles the case where a map function emits an array of integers. This bug was introduced in 2.2.0. • #1544: Certain list functions could return a render_error error in- termittently. • #1550: Replicator _session support was incompatible with CouchDB in- stallations using the require_valid_user = true setting. • #1571: Under very heavy load, it was possible that rexi_server could die in such a way that its never restarted, leaving a cluster without the ability to issue RPC calls - effectively rendering the cluster useless. • #1574: The built-in _sum reduce function has been improved to check if the objects being summed are not overflowing the view storage. Previously, there was no protection for _sum-introduced overflows. • #1582: Database creation parameters now have improved validation, giving a more readable error on invalid input. 
• #1588: A missing security check has been restored for the noop /db/_ensure_full_commit call to restore database validation checks. • #1591: CouchDB now creates missing shard files when accessing a data- base if necessary. This handles the situation when, on database cre- ation, no nodes were capable of creating any of the shard files re- quired for that database. • #1568: CouchDB now logs a warning if a changes feed is rewound to 0. This can help diagnose problems in busy or malfunctioning clusters. • #1596: It is no longer possible that a busy couch_server, under a specific ordering and timing of events, will incorrectly track open_async messages in its mailbox. • #1601, #1654: CouchDB now logs better when an error causes it to read past the EOF of a database shard. The check for whether CouchDB is trying to read too many bytes has been correctly separated out from the error indicating it has attempted to read past the EOF. • #1613: Local nodes are now filtered out during read repair opera- tions. • #1636: A memory leak when replicating over HTTPS and a problem occurs has been squashed. • #1635: /_replicate jobs are no longer restarted if parameters havent changed. • #1612: JavaScript rewrite functions now send the body of the request to the rewritten endpoint. • #1631: The replicator no longer crashes if the user has placed an in- valid VDU function into one of the _replicator databases. • #1644, #1647: It is no longer possible to create illegally-named databases within the reserved system space (_ prefix.) • #1650: _bulk_get is once again operational for system databases such as _users. • #1652: Access to /_active_tasks is once again restricted to server admins only. • #1662: The couch_log application no longer crashes when new, addi- tional information is supplied by a crashing application, or when any of its own children are restarted. • #1666: Mango could return an error that would crash the couch_query_servers application. This is no longer the case. • #1655: Configuration of ets_lru in chttpd now performs proper error checking of the specified config value. • #1667: The snappy dependency has been updated to fix a memory alloca- tion error. • #1683: Attempting to create a local document with an invalid revision no longer throws a badarg exception. Also, when setting new_edits to false and performing a bulk write operation, local documents are no longer written into the wrong btree. Finally, it is no longer possi- ble to create a document with an empty ID during a bulk operation with new_edits set to false. • #1721: The couchup convenience script for upgrading from CouchDB 1.x now also copies a databases _security object on migration. • #1672: When checking the status of a view compaction immediately af- ter starting it, the total_changes and changes_done fields are now immediately populated with valid values. • #1717: If the .ini config file is read only, an attempt to update the config through the HTTP API will now result in a proper eacces error response. • #1603: CouchDB now returns the correct total_rows result when query- ing /{db}/_design_docs. • #1629: Internal load validation functions no longer incorrectly hold open a deleted database or its host process. • #1746: Server admins defined in the ini file accessing via HTTP API no longer result in the auth cache logging the access as a miss in the statistics. 
• #1607: The replicator no longer fails to re-authenticate to open a remote database when its session cookie times out due to a VDU function forbidding writes or a non-standard cookie expiration duration.
• #1579: The compaction daemon no longer incorrectly only compacts a single view shard for databases with a q value greater than 1.
• #1737: CouchDB 2.x now performs as well as 1.x when using a _doc_ids or _design_docs filter on a changes feed.
Mango
Other
The 2.3.0 release also includes the following minor improvements:
• Improved test cases:
  • The Elixir test suite has been merged. These test cases are intended to replace the aging, unmaintainable JavaScript test suite, and help reduce our dependency on Mozilla Spidermonkey 1.8.5. The test suite does not yet cover all of the tests that the JS test suite does. Once it achieves full coverage, the JS test suite will be removed.
  • Many racy test cases improved for reliable CI runs.
  • The Makefile targets for list-eunit-* now work correctly on macOS.
  • #1732, #1733, #1736: All of the test suites run and pass on the Windows platform once again.
• #1597: Off-heap messages, a new feature in Erlang 19+, can now be disabled per module if desired.
• #1682: A new [feature_flags] config section exists for the purpose of enabling or disabling experimental features by CouchDB developers.
• A narwhal! OK, no, not really. If you got this far, thank you for reading.
2.2.x Branch
• Upgrade Notes
• Version 2.2.0
Upgrade Notes
• The minimum supported version of Erlang is now 17, not R16B03. Support for Erlang 21 is still ongoing and will be provided in a future release.
• The CouchDB replication client can now use the /_session endpoint when authenticating against remote CouchDB instances, improving performance since re-authorization does not have to be performed with every request. Because of this performance improvement, it is recommended to increase the PBKDF2 work factor beyond the default 10 to a modern default such as 10000. This is done via the local ini file setting [couch_httpd_auth] iterations = 10000.
Do not do this if an older version of CouchDB is replicating TO this instance or cluster regularly, since CouchDB < 2.2.0 must perform authentication on every request and replication performance will suffer.
A future version will make this increased number of iterations a default.
• #820, #1032: Multiple queries can now be made at the POST /{db}/_all_docs/queries, POST /{db}/_design_docs/queries and POST /{db}/_local_docs/queries endpoints. Also, a new endpoint POST /{db}/_design/{ddoc}/_view/{view}/queries has been introduced to replace the ?queries parameter formerly provided for making multiple queries to a view. The old ?queries parameter is now deprecated and will be removed in a future release of CouchDB.
• The maximum http request limit, which had been lowered in 2.1.0, has been re-raised to a 4GB limit for now. (#1446). Ongoing discussion about the path forward for future releases is available in #1200 and #1253.
• #1118: The least recently used (LRU) cache of databases is now only updated on database write, not read. This has led to significant performance enhancements on very busy clusters. To restore the previous behaviour, your local ini file can contain the block [couchdb] update_lru_on_read = true.
• #1153: The CouchDB replicator can now make use of the /_session endpoint rather than relying entirely on HTTP basic authentication headers. This can greatly improve replication performance.
We encourage you to upgrade any nodes or clusters that regularly act as replica- tion clients to use this new feature, which is enabled by default (- #1462). • #1283: The [couchdb] enable_database_recovery feature, which only soft-deletes databases in response to a DELETE /{db} call, is now documented in default.ini. • #1330: CouchDB externals and OS daemons are now officially deprecated and no longer documented. Support for these features will be com- pletely removed in a future release of CouchDB (probably 3.0.0). • #1436: CouchDB proxy authentication now uses a proper chttpd_auth module, simplifying configuration in local ini files. While this is not a backward- compatible breaking change, it is best to update your local ini files to reference the new {chttpd_auth, proxy_authentica- tion_handler} handler rather than the couch_httpd_auth version, as couch_httpd is in the process of being deprecated completely. • #1476, #1477: The obsolete update_notification feature, which was re- placed by /{db}/_changes feeds c. CouchDB 1.2, has been completely removed. This feature never worked in 2.0 for databases, only for shards, making it effectively useless. Version 2.2.0 Features • Much improved documentation. Highlights include: • A complete rewrite of the sharding documentation. • Developer installation notes (INSTALL.*.rst) • Much of the content of the original CouchDB Wiki has been imported into the official docs. (The old CouchDB Wiki is in the process of being deprecated.) • Much improved Fauxton functionality. Highlights include: • Search support in the code editor • Support for relative Fauxton URLs (i.e., not always at /_utils) • Replication setup enhancements for various authentication mecha- nisms • Fixes for IE10, IE11, and Edge (we hope) • Resolving conflicts of design documents is now allowed • #496, COUCHDB-3287: New pluggable storage engine framework has landed in CouchDB. This internal refactor makes it possible for CouchDB to use different backends for storing the base database file itself. The refactor included a full migration of the existing legacy storage en- gine into the new framework. • #603: When creating a new database on a cluster without quorum, CouchDB will now return a 202 Accepted code if possible, indicating that at least one node has written the database record to disk, and that other nodes will be updated as they return to an online state. This replaces the former 500 internal error. • #1136, #1139: When deleting a database in a cluster without quorum, CouchDB will no longer throw a 500 error status, but a 202 as long as at least one node records the deletion, or a 200 when all nodes re- spond. This fix parallels the one made for #603. • #745: CouchDB no longer fails to complete replicating databases with large attachments. The fix for this issue included several related changes: • The maximum http request limit, which had been lowered in 2.1.0, has been re-raised to a 4GB limit for now. (#1446). Ongoing discus- sion about the path forward for future releases is available in - #1200 and #1253. • An update to the replicator http client that improves active socket accounting, without which CouchDB can cease to be responsive over the main http interface (#1117) • The replicators http client no longer performs unconditional re- tries on failure (#1177) • A path by which CouchDB could lose track of their RPC workers dur- ing multipart attachment processing was removed. 
(#1178)
• When CouchDB transmits a 413 Payload Too Large response on attachment upload, it now correctly flushes the receive socket before closing the connection to avoid a TCP reset, and to give the client a better chance of parsing the 413 response. In tandem, the replicator http client correctly closes its own socket after processing any 413 response. (#1234)
• A fabric process to receive unchunked attachments can no longer orphan processes that leave unprocessed binaries in memory until all available memory is exhausted. (#1264)
• When using CouchDB's native SSL responder (port 6984 by default), sessions are now timed out by default after 300s. This is to work around RAM explosion in the BEAM VM when using the Erlang-native SSL libraries. (#1321)
• #822: A new endpoint /_dbs_info has been added to return information about a list of specified databases. This endpoint can take the place of multiple queries to /{db}.
• #875, #1030: couch_peruser installations can now specify a default q value for each peruser-created database that is different from the cluster's q value. Set this in your local ini file, under [couch_peruser] q.
• #876, #1068: The couch_peruser database prefix is now configurable through your local ini file, under [couch_peruser] database_prefix.
• #887: Replicator documents can now include parameters for target database creation, such as "create_target_params": {"q": "1"}. This can assist in database resharding or placement.
• #977: When using COPY to copy a document, CouchDB no longer fails if the new ID includes Unicode characters.
• #1095: Recognize the environment variables ARGS_FILE, SYSCONFIG_FILE, COUCHDB_ARGS_FILE and COUCHDB_SYSCONFIG_FILE to override where CouchDB looks for the vm.args and sys.config files at startup.
• #1101, #1425: Mango can now be used to find conflicted documents in a database by adding conflicts: true to a mango selector.
• #1126: When queried back after saving, replication documents no longer contain sensitive credential information (such as basic authentication headers).
• #1203:
  • The compaction daemon now has a snooze period, during which it waits to start the next compaction after finishing the previous one. This value is useful in setups with many databases (e.g. with couch_peruser) or many design docs, which can cause a CPU spike every check_interval seconds. The setting can be adjusted in your local ini file via [compaction_daemon] snooze_period. The current default is a 3 second pause.
  • The check_interval has been raised from 300 seconds to 3600 seconds.
  • A notice-level log about closing view indexes has been demoted to the debug level. In a scenario with many design docs, this would create significant load on the logging subsystem every [compaction_daemon] check_interval for no discernible benefit.
• #1309, #1435: CouchDB now reports the git sha at the time of build in the top-level GET / version string, in a new git_sha key. This can be used to help ensure an unmodified version of CouchDB has been built and is running on any given machine.
• COUCHDB-2971, #1346: CouchDB now includes a new builtin reduce function _approx_count_distinct, that uses a HyperLogLog algorithm to estimate the number of distinct keys in the view index. The precision is currently fixed to 2^11 observables, and therefore uses approximately 1.5KB of memory.
• #1377: CouchDB finalization of view reduces now occurs at the coordinator node. This simplified the built-in _stats function.
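As an informal sketch of the _approx_count_distinct builtin reduce mentioned above (COUCHDB-2971, #1346); the design document and field names are illustrative:

   { "_id": "_design/stats",
     "views": {
       "distinct-types": {
         "map": "function (doc) { if (doc.type) { emit(doc.type, null); } }",
         "reduce": "_approx_count_distinct"
       } } }

Querying that view with reduce=true and group=false would then return an estimate of the number of distinct keys emitted by the map function.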
• #1392: When running CouchDB under Erlang 19.0 or newer, messages can now be stored off the process heap. This is extremely useful for Er- lang processes that can have huge number of messages in their mail- box, and is now enabled for couch_server, couch_log_server, ddoc_cache, mem3_shards, and rexi_server whenever possible. • #1424: The CouchDB native SSL/TLS server httpsd now accepts socket-level configuration options through the [httpsd] server_op- tions ini file setting. • #1440: CouchDB can now be configured to prevent non-admins from ac- cessing the GET /_all_dbs method by specifying [chttpd] ad- min_only_all_dbs = true in your local ini file(s). The true setting will become default in future versions. • #1171, #1445: CouchDB can now be configured to use the internal Er- lang MD5 hash function when not available in the external environment (e.g. FIPS enabled CentOS) at compile time with the configure flag --enable-md5. Because this implementation is slower, it is not recom- mended in the general case. Performance • #958: The revision stemming algorithm was optimized down from O(N^2) to O(N) via a depth-first search approach, and then further improved by calling the stemming operation only when necessary. This new algo- rithm can be disabled by setting the option [couchdb] stem_interac- tive_updates = false if necessary. • #1246: CouchDB now checks for request authorization only once per each database request, improving the performance of any request that requires authorization. Bugfixes • #832, #1064: Tracking of Couch logging stats has been added back into the per-node /_node/<node-name>/_stats endpoint. • #953, #973: Return 404 Not Found on GET /_scheduler, not 405 Method Not Allowed. • #955: The /{db}/_bulk_docs endpoint now correctly responds with a 400 Bad Request error if the new_edits parameter is not a boolean. • #969: CouchDB now returns offset and update_seq values when keys are provided to the GET or POST /{db}/_all_docs?update_seq=true end- points. This was affecting PouchDB compatibility. • #984, #1434: CouchDB views now retain their update_seq after com- paction, preventing potentially expensive client-side view rewinds after compaction. • #1012: Address a theoretical race condition the replication scheduler could encounter when trying to determine if the cluster is stable enough to resume handling replication-introduced document updates. • #1051: Return a user-friendly error message when attempting to create a CouchDB user with an invalid password field (non-string). • #1059: DB-specific compaction configurations were not working cor- rectly. The syntax now also supports shard-level custom compaction configuration if desired (which it probably isnt.) • #1097: Compaction daemon will not crash out when trying to check spe- cific file system mounts that are not real file systems (like /run on Linux). • #1198: Fauxton is no longer available on the node-local port (5986, by default). The node-local port is only to be used for specific ad- ministrative tasks; removing the Fauxton interface prevents mistaking the node-local port as the correct CouchDB port (5984, by default). • #1165: validate_doc_update view functions can once again be imple- mented directly in Erlang (after enabling the optional Erlang view server). • #1223: The couch_config application now correctly handles non-persis- tent integer and boolean-valued configuration changes. • #1242: couch_os_daemons may now reside in directories with spaces. 
• #1258: CouchDB will now successfully login users, even if password encryption is very slow. • #1276: The replication scheduler status for a repeatedly erroring job now correctly reflects the crashing state in more scenarios. • #1375: If CouchDB fails authorization but passes authentication, it no longer drops the user_ctx out of the request. • #1390: The active size of views (as returned in a database info re- sponse) no longer is incorrectly calculated in such a way that it could occasionally be larger than the actual on-disk file size. • #1401: CouchDB Erlang views no longer crash in the couch_native process with an unexpected function_clause error. • #1419: When deleting a file, CouchDB now properly ignores the config- uration flag enable_database_recovery when set when compacting data- bases, rather than always retaining the old, renamed, uncompacted database file. • #1439: The CouchDB setup wizard now correctly validates bind_ad- dresses. It also no longer logs credentials by moving logging of in- ternal wizard setup steps to the debug level from the notice level. Mango • #816, #962, #1038: If a user specifies a value for use_index that is not valid for the selector (does not meet coverage requirements or proper sort fields), attempt to fall back to a valid index or full DB scan rather than returning a 400. If we fall back, populate a warn- ing field in the response. Mango also tries to use indexes where $or may select a field only when certain values are present. • #849: When {"seq_indexed": true} is specified, a badmatch error was returned. This is now fixed. • #927, #1310: Error messages when attempting to sort incorrectly are now actually useful. • #951: When using GET /{db}/_index, only use a partial filter selector for an index if it is set to something other than the default. • #961: Do not prefix _design/ to a Mango index name whose user-speci- fied name already starts with _design/. • #988, #989: When specifying a use_index value with an invalid index, correctly return a 400 Bad Request showing that the requested index is invalid for the request specified. • #998: The fix for CVE 2017-12635 presented a breaking change to Man- gos /{db}/_find, which would evaluate all instances of all JSON fields in a selector. Mango is now tested to ensure it only considers the last instance of a field, silently ignoring those that appear be- fore it. • #1014: Correctly deduce list of indexed fields in a selector when nested $and operators are specified. • #1023: Fix an unexpected 500 error if startkey and endkey in a Mango selector were reversed. • #1067: Prevent an invalid_cast crash when the couch_proc_manager soft limit for processes is reached and mango idle processes are stopped. • #1336: The built-in fields _id and rev will always be covered by any index, and Mango now correctly ignores their presence in any index that explicitly includes them for selector matching purposes. • #1376: Mango now appropriately selects some indexes as usable for queries, even if not all columns for an index are added to the querys sort field list. • Multiple fixes related to using Mango as a front-end for full text indexing (a feature not shipped with couch, but for which support is in place as a compile-time addon). Other The 2.2.0 release also includes the following minor improvements: • Developers can, at build time, enable curl libraries & disable Faux- ton and documentation builds by specifying the new --dev option to the configure script. 
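For instance, a developer build that skips the Fauxton and documentation builds might look like this (a sketch, run from a CouchDB source checkout):

   ./configure --dev
   make
   dev/run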
• The mochiweb dependency was bumped to version 2.17.0, in part to ad- dress the difficult #745 issue. • Improved compatibility with newer versions of Erlang (20.x) • Improved release process for CouchDB maintainers and PMC members. • Multiple test suite improvements, focused on increased coverage, speed, and reliability. • Improvements to the Travis CI and Jenkins CI setups, focused on im- proved long-term project maintenance and automatability. • Related improvements to the CouchDB deb/rpm packaging and Docker repositories to make deployment even easier. • #1007: Move etc/default.ini entries back into [replicator] section (incorrectly moved to [couch_peruser] section) • #1245: Increased debug-level logging for shard open errors is now available. • #1296: CouchDB by default now always invokes the SMP-enabled BEAM VM, even on single-processor machines. A future release of Erlang will remove the non-SMP BEAM VM entirely. • A pony! OK, no, not really. If you got this farthank you for reading. 2.1.x Branch • Upgrade Notes • Version 2.1.2 • Version 2.1.1 • Version 2.1.0 • Fixed Issues Upgrade Notes • When upgrading from 2.x to 2.1.1, if you have not customized your node name in vm.args, be sure to retain your original vm.args file. The default node name has changed from couchdb@localhost to couchdb@127.0.0.1, which can prevent CouchDB from accessing existing databases on the system. You may also change the name option back to the old value by setting -name couchdb@localhost in etc/vm.args by hand. The default has changed to meet new guidelines and to provide additional functionality in the future. If you receive errors in the logfile, such as internal_server_error : No DB shards could be opened. or in Fauxton, such as This database failed to load. you need to make this change. • The deprecated (and broken) OAuth 1.0 implementation has been re- moved. • If user code reads or manipulates replicator document states, con- sider using the [replicator] update_docs = true compatibility parame- ter. In that case the replicator will continue updating documents with transient replication states. However, that will incur a perfor- mance cost. Consider instead using the _scheduler/docs HTTP endpoint. • The stale parameter for views and _find has been deprecated in favour of two new parameters: stable and update. The old stale=ok behaviour is equivalent to stable=true&update=false, and the old stale=up- date_after behaviour is equivalent to stable=true&update=lazy. The deprecated stale parameter will be removed in CouchDB 3.0. • The new :httpd/max_http_request_size configuration parameter was added. This has the same behavior as the old couchdb/max_document_size configuration parameter, which had been un- fortunately misnamed, and has now been updated to behave as the name would suggest. Both are documented in the shipped default.ini file. Note that the default for this new parameter is 64MB instead of 4GB. If you get errors when trying to PUT or POST and see HTTP 413 return codes in couchdb logs, this could be the culprit. This can affect couchup in-place upgrades as well. • #914: Certain critical config sections are blacklisted from being modified through the HTTP API. These sections can still be modified through the standard local.ini or local.d/*.ini files. • #916: couchjs now disables eval() and the Function() constructor by default. To restore the original behaviour, add the --eval flag to the definition of the javascript query server in your local.ini file. 
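A hedged sketch of the last note above (#916); the couchjs and main.js paths are illustrative and depend on the installation layout:

   [query_servers]
   javascript = /opt/couchdb/bin/couchjs --eval /opt/couchdb/share/server/main.js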
Version 2.1.2

Security
• CVE 2018-8007

Version 2.1.1

Security
• CVE 2017-12635
• CVE 2017-12636

General
• #617: CouchDB now supports compilation and running under Erlang/OTP 20.x.
• #756: The couch_peruser functionality is now really fixed. Really.
• #827: The cookie domain for AuthSession cookies, used in a proxy authentication configuration, can now be customized via the ini file.
• #858: It is now possible to modify shard maps for system databases.
• #732: Due to an Erlang bug (ERL-343), invalid paths can be returned if volumes are mounted containing whitespace in their name. This problem surfaced primarily on macOS (Time Machine volumes). CouchDB now works around this bug in unpatched versions of Erlang by skipping the free space check performed by the compaction daemon. Erlang itself will correctly perform free space checks in version 21.0.
• #824: The current node's local interface can now be accessed at /_node/_local/{endpoint} as well as at /_node/<nodename>@<hostname>/{endpoint}.
• The Dockerfile in the source repository has been retired. For a current Dockerfile, see the couchdb-docker repository.
• Fauxton now uses a version of React with a BSD license.

Performance
• #835: CouchDB no longer decompresses documents just to determine their uncompressed size. In tests, this has led to improvements of 10-40% in both CPU and wall-clock time for database compaction.
• The design document cache (ddoc_cache) has been rewritten to improve performance.

Mango
• #808: Mango now supports partial indexes. Partial indexes allow documents to be filtered at indexing time, potentially offering significant performance improvements for query selectors that don't map cleanly to a range query on an index.
• #740: Mango queries can now be paginated. Each query response includes a bookmark. The bookmark can be provided on a subsequent query to continue from a specific key.
• #768: Mango _find accepts an execution_stats parameter. If present, a new object is included in the response which contains information about the query executed. The object contains the count of total keys examined (0 for json indexes), total documents examined (when include_docs=true is used), and the total quorum documents examined (when fabric doc lookups are used).
• #816 and #866: Mango now requires that all of the fields in a candidate index must exist in a query's selector. Previously, this check was incorrect, and indexes that might only contain a subset of valid documents might be selected by the query planner if no explicit index was specified at query time. Further, if a sort field is specified at query time, that field needs to exist (but could be null) in the results returned.

Other
The 2.1.1 release also includes the following minor improvements:
• #635: Stop couch_index processes on ddoc update
• #721: Save migrated replicator checkpoint documents immediately
• #688: Reuse http-based replication checkpoints when upgrading to https
• #729: Recommend the use only of -name and not -sname in vm.args for compatibility.
• #738: Allow replicator application to always update replicator docs.
• #605: Add Prefer: return=minimal header options from RFC7240 to reduce the number of headers in the response.
• #744: Allow a 503 response to be returned to clients (with metric support)
• #746: Log additional information on crashes from rexi
• #752: Allow Mango $in queries without requiring the index to use an array
• (multiple) Additional debugging utilities have been added.
• (multiple) Hot code upgrades from 2.0 -> 2.1.1 are now possible. • (multiple) Improvements to the test suite have been made. • #765: Mango _explain now includes view parameters as requested by the user. • #653: _show and _list should now work for admin-only databases such as _users. • #807: Mango index selection should occur only once. • #804: Unhandled Mango errors are now logged. • #659: Improve accuracy of the max_document_size check. • #817: Invalid Base64 in inline attachments is now caught. • #825: Replication IDs no longer need to be URL encoded when using the _scheduler/jobs/<job_id> endpoint. • #838: Do not buffer rexi messages to disconnected nodes. • #830: The stats collection interval is now configurable in an ini file, not in the application context. The default value is 10, and the setting is reloaded every 600 seconds. • #812: The /{db} endpoint now includes a cluster block with the databases q, n, and default w and r values. This supplements the existing /{db}/_shards and /{db}/_shards/{id} detailed information on sharding and quorum. • #810: The replicator scheduler crashed counter gauge more reliably detects replication crashes by reducing the default number of re- tries from 10 to 5 (reducing the duration from 4 mins to 8 secs). • COUCHDB-3288: Tolerate mixed clusters for the upcoming pluggable storage engine work. • #839: Mango python tests now support Python 3 as well as 2. • #845: A convenience remsh script has been added to support live debugging of running systems. • #846: Replicator logging is now less verbose and more informative when replication terminates unexpectedly. • #797: Reduce overflow errors are now returned to the client, al- lowing views with a single bad reduce to build while not exhaust- ing the servers RAM usage. • #881: Mango now allows match on documents where the indexed value is an object if a range query is issued. Previously, query results might change in the presence of an index, and operators/selectors which explicitly depend on a full index scan (such as $exists) would not return a complete result set. • #883: Erlang time module compatibility has been improved for re- leases of Erlang newer than 18.0. • #933: 410 is now returned when attempting to make a temporary view request. • #934: The replicator now has a configurable delay before retrying to retrieve a document after receiving a missing_doc error. • #936: jiffy now deduplicates JSON keys. Version 2.1.0 • The Mango _find endpoint supports a new combination operator, $all- Match, which matches and returns all documents that contain an array field with all its elements matching all the specified query crite- ria. • New scheduling replicator. The core of the new replicator is a sched- uler which allows running a large number of replication jobs by switching between them, stopping some and starting others periodi- cally. Jobs which fail are backed off exponentially. There is also an improved inspection and querying API: _scheduler/jobs and _sched- uler/docs: • _scheduler/jobs : This endpoint shows active replication jobs. These are jobs managed by the scheduler. Some of them might be run- ning, some might be waiting to run, or backed off (penalized) be- cause they crashed too many times. Semantically this is somewhat equivalent to _active_tasks but focuses only on replications. Jobs which have completed or which were never created because of mal- formed replication documents will not be shown here as they are not managed by the scheduler. 
_replicate replications, started from the _replicate endpoint rather than from a document in a _replicator db, will also show up here.
• _scheduler/docs : This endpoint is an improvement on having to go back and read replication documents to query their state. It represents the state of all the replications started from documents in a _replicator db. Unlike _scheduler/jobs it will also show jobs which have failed or have completed.

By default, the scheduling replicator will no longer update documents with transient states like triggered or error; instead, the _scheduler/docs API should be used to query replication document states.

Other scheduling replicator improvements
• Network resource usage and performance were improved by implementing a shared connection pool. This should help in cases of a large number of connections to the same sources or target. Previously connection pools were shared only within a single replication job.
• Improved request rate limit handling. Replicator requests will auto-discover rate limit capacity on targets and sources based on a proven Additive Increase / Multiplicative Decrease feedback control algorithm.
• Improved performance by having exponential backoff for all replication job failures. Previously there were some scenarios where failure led to continuous repeated retries, consuming CPU and disk resources in the process.
• Improved recovery from long but temporary network failures. Previously, if replication jobs failed to start 10 times in a row, they would not be retried anymore. This is sometimes desirable, but in some cases, for example after a sustained DNS failure which eventually recovers, replications reach their retry limit, stop retrying and never recover, and user intervention was required to continue. The scheduling replicator never gives up retrying a valid scheduled replication job, so it should recover automatically.
• Better handling of filtered replications. Failing user filter code fetches from the source will not block the replicator manager and stall other replications. Failing filter fetches will also be backed off exponentially. Another improvement is that when filter code changes on the source, a running replication will detect that and restart itself with a new replication ID automatically.

The 2.1.0 release also includes the following minor improvements:
• COUCHDB-1946: Hibernate couch_stream after each write (up to 70% reduction in memory usage during replication of DBs with large attachments)
• COUCHDB-2964: Investigate switching replicator manager change feeds to using normal instead of longpoll
• COUCHDB-2988: (mango) Allow query selector as changes and replication filter
• COUCHDB-2992: Add additional support for document size
• COUCHDB-3046: Improve reduce function overflow protection
• COUCHDB-3061: Use vectored reads to search for buried headers in .couch files. On a modern linux system with SSD, we see improvements up to 15x.
• COUCHDB-3063: stale=ok option replaced with new stable and update options.
• COUCHDB-3180: Add features list in the welcome message • COUCHDB-3203: Make auth handlers configurable (in ini files) • COUCHDB-3234: Track open shard timeouts with a counter instead of logging • COUCHDB-3242: Make get view group info timeout in couch_indexer configurable • COUCHDB-3249: Add config to disable index all fields (text in- dexes) • COUCHDB-3251: Remove hot loop usage of filename:rootname/1 • COUCHDB-3284: 8Kb read-ahead in couch_file causes extra IO and bi- nary memory usage • COUCHDB-3298: Optimize writing btree nodes • COUCHDB-3302: (Improve) Attachment replication over low bandwidth network connections • COUCHDB-3307: Limit calls to maybe_add_sys_db_callbacks to once per db open • COUCHDB-3318: bypass couch_httpd_vhost if there are none • COUCHDB-3323: Idle dbs cause excessive overhead • COUCHDB-3324: Introduce couch_replicator_scheduler • COUCHDB-3337: End-point _local_docs doesnt conform to query params of _all_docs • COUCHDB-3358: (mango) Use efficient set storage for field names • COUCHDB-3425: Make _doc_ids _changes filter fast-path limit con- figurable • #457: TeX/LaTeX/texinfo removed from default docs build chain • #469: (mango) Choose index based on fields match • #483: couchup database migration tool • #582: Add X-Frame-Options support to help protect against click- jacking • #593: Allow bind address of 127.0.0.1 in _cluster_setup for single nodes • #624: Enable compaction daemon by default • #626: Allow enable node decom using string true • (mango) Configurable default limit, defaults to 25. • (mango) _design documents ignored when querying _all_docs • (mango) add $allMatch selector • Add local.d/default.d directories by default and document • Improved INSTALL.* text files Fixed Issues The 2.1.0 release includes fixes for the following issues: • COUCHDB-1447: X-Couch-Update-NewRev header is missed if custom head- ers are specified in response of _update handler (missed in 2.0 merge) • COUCHDB-2731: Authentication DB was not considered a system DB • COUCHDB-3010: (Superseded fix for replication exponential backoff) • COUCHDB-3090: Error when handling empty Access-Control-Request-Head- ers header • COUCHDB-3100: Fix documentation on require_valid_user • COUCHDB-3109: 500 when include_docs=true for linked documents • COUCHDB-3113: fabric:open_revs can return {ok, []} • COUCHDB-3149: Exception written to the log if db deleted while there is a change feed running • COUCHDB-3150: Update all shards with stale=update_after • COUCHDB-3158: Fix a crash when connection closes for _update • COUCHDB-3162: Default ssl settings cause a crash • COUCHDB-3164: Request fails when using _changes?feed=eventsource&heartbeat=30000 • COUCHDB-3168: Replicator doesnt handle well writing documents to a target db which has a small max_document_size • COUCHDB-3173: Views return corrupt data for text fields containing non-BMP characters • COUCHDB-3174: max_document_size setting can by bypassed by issuing multipart/related requests • COUCHDB-3178: Fabric does not send message when filtering lots of documents • COUCHDB-3181: function_clause error when adding attachment to doc in _users db • COUCHDB-3184: couch_mrview_compactor:recompact/1 does not handle er- rors in spawned process • COUCHDB-3193: fabric:open_revs returns multiple results when one of the shards has stem_interactive_updates=false • COUCHDB-3199: Replicator VDU function doesnt account for an already malformed document in replicator db • COUCHDB-3202: (mango) do not allow empty field names • COUCHDB-3220: Handle timeout in 
_revs_diff • COUCHDB-3222: (Fix) HTTP code 500 instead of 400 for invalid key dur- ing document creation • COUCHDB-3231: Allow fixing users documents (type and roles) • COUCHDB-3232: user context not passed down in fabric_view_all_docs • COUCHDB-3238: os_process_limit documentation wrong • COUCHDB-3241: race condition in couch_server if delete msg for a db is received before open_result msg • COUCHDB-3245: Make couchjs -S option take effect again • COUCHDB-3252: Include main-coffee.js in release artifact (broken Cof- feeScript view server) • COUCHDB-3255: Conflicts introduced by recreating docs with attach- ments • COUCHDB-3259: Dont trap exits in couch_file • COUCHDB-3264: POST to _all_docs does not respect conflicts=true • COUCHDB-3269: view response can hang with filter and limit specified • COUCHDB-3271: Replications crash with kaboom exit • COUCHDB-3274: eof in couch_file can be incorrect after error • COUCHDB-3277: Replication manager crashes when it finds _replicator db shards which are not part of a mem3 db • COUCHDB-3286: Validation function throwing unexpected json crashes with function_clause • COUCHDB-3289: handle error clause when calling fabric:open_revs • COUCHDB-3291: Excessively long document IDs prevent replicator from making progress • COUCHDB-3293: Allow limiting length of document ID (for CouchDB proper) • COUCHDB-3305: (mango) dont crash with invalid input to built in re- ducer function • COUCHDB-3362: DELETE attachment on non-existing document creates the document, rather than returning 404 • COUCHDB-3364: Dont crash compactor when compacting process fails. • COUCHDB-3367: Require server admin user for db/_compact and db_view_cleanup endpoints • COUCHDB-3376: Fix mem3_shards under load • COUCHDB-3378: Fix mango full text detection • COUCHDB-3379: Fix couch_auth_cache reinitialization logic • COUCHDB-3400: Notify couch_index_processes on all shards when ddoc updated • COUCHDB-3402: race condition in mem3 startup • #511: (mango) Return false for empty list • #595: Return 409 to PUT attachment with non-existent rev • #623: Ensure replicator _active_tasks entry reports recent pending changes value • #627: Pass UserCtx to fabrics all_docs from mango query • #631: fix couchdb_os_proc_pool eunit timeouts • #644: Make couch_event_sup:stop/1 synchronous • #645: Pass db open options to fabric_view_map for _view and _list queries on _users DB • #648: Fix couch_replicator_changes_reader:process_change • #649: Avoid a race when restarting an index updater • #667: Prevent a terrible race condition • #677: Make replication filter fetch error for _replicate return a 404 • Fix CORS max_age configuration parameter via Access-Control-Max-Age • Chunk missing revisions before attempting to save on target (improves replication for very conflicted, very deep revision tree documents) • Allow w parameter for attachments • Return Bad Request when count in /_uuids exceeds max • Fix crashes when replicator db is deleted • Skip internal replication if changes already replicated • Fix encoding issues on _update/../doc_id and PUT attachments 2.0.x Branch • Version 2.0.0 • Upgrade Notes • Known Issues • Breaking Changes Version 2.0.0 • Native clustering is now supported. Rather than use CouchDB replica- tion between multiple, distinct CouchDB servers, configure a cluster of CouchDB nodes. These nodes will use an optimized Erlang-driven in- ternal replication to ensure data durability and accessibility. 
Com- bine a clustered CouchDB with a load balancer (such as haproxy) to scale CouchDB out horizontally. More details of the clustering fea- ture are available in the Cluster Management. • Futon replaced by brand-new, completely re-engineered Fauxton inter- face. URL remains the same. • The new Mango Query Server provides a simple JSON-based way to per- form CouchDB queries without JavaScript or MapReduce. Mango Queries have a similar indexing speed advantage over JavaScript Queries than the Erlang Queries have (2x-10x faster indexing depending on doc size and system configuration). We recommend all new apps start using Mango as a default. Further details are available in the _find, _in- dex and _explain API. • Mango selectors can be used in _changes feeds instead of JavaScript MapReduce filters. Mango has been tested to be up to an order of mag- nitude (10x) faster than JavaScript in this application. • Rewrite rules for URLs can be performed using JavaScript functions. • Multiple queries can be made of a view with a single HTTP request. • Views can be queried with sorting turned off ( sorted=false) for a performance boost. • The global changes feed has been enhanced. It is now resumable and persistent. • New endpoints added (documentation forthcoming): • /_membership shows all nodes in a cluster • /_bulk_get speeds up the replication protocol over low-latency con- nections • /_node/ api to access individual nodes configuration and compaction features • /_cluster_setup api to set up a cluster from scratch. • /_up api to signal health of a node to a load-balancer • /db/_local_docs and /db/_design_docs (similar to /db/_all_docs) • The /_log endpoint was removed. • Backend interface on port 5986 used for specific cluster admin tasks. Of interest are the _nodes and _dbs databases visible only through this interface. • Support added for Erlang/OTP 17.x, 18.x and 19 • New streamlined build system written for Unix-like systems and Mi- crosoft Windows • Configuration has moved from /_config to /_node/{node-name}/_config • instance_start_time now always reports "0". Upgrade Notes • The update sequences returned by the /{db}/_changes feed are no longer integers. They can be any JSON value. Applications should treat them as opaque values and return them to CouchDB as-is. • Temporary views are no longer supported. • It is possible to have multiple replicator databases. replicator/db config option has been removed. Instead _replicator and any database names ending with the /_replicator suffix will be recognized as replicator databases by the system. • Note that the semantics of some API calls have changed due to the in- troduction of the clustering feature. Specifically, make note of the difference between receiving a 201 and a 202 when storing a document. • all_or_nothing is no longer supported by the bulk_docs API • After updating a design document containing a show, an immediate GET to that same show function may still return results from the previous definition. This is due to design document caching, which may take a few seconds to fully evict, or longer (up to ~30s) for a clustered installation. Known Issues All known issues filed against the 2.0 release are contained within the official CouchDB JIRA instance or CouchDB GitHub Issues. 
The following are some highlights of known issues for which fixes did not land in time for the 2.0.0 release:
• COUCHDB-2980: The replicator (whether invoked via _replicate or a document stored in the _replicator database) understands two kinds of source and target: 1. A URL (e.g., https://foo:bar@foo.com/db1), called a remote source or target; 2. A database name (e.g., db1), called a local source or target. Whenever the latter type is used, this refers to a local unclustered database, not a clustered one. In a future release we hope to support local source or target specs to clustered databases. For now, we recommend always using the URL format for both source and target specifications.
• COUCHDB-3034: CouchDB will occasionally return 500 errors when multiple clients attempt to PUT or DELETE the same database concurrently.
• COUCHDB-3119: Adding nodes to a cluster fails if the Erlang node name is not couchdb (of the form couchdb@hostname).
• COUCHDB-3050: Occasionally the dev/run script used for development purposes to start a local 3-node cluster will fail to start one or more nodes.
• COUCHDB-2817: The compaction daemon will only compact views for shards that contain the design document.
• COUCHDB-2804: The fast_view optimization is not enabled on the clustered interface.
• #656: The OAuth 1.0 support is broken and deprecated. It will be removed in a future version of CouchDB.

Breaking Changes
The following changes in 2.0 represent a significant deviation from CouchDB 1.x and may alter behaviour of systems designed to work with older versions of CouchDB:
• #620: POST /dbname no longer returns an ETag response header, in compliance with RFC 7231, Section 7.2.

1.7.x Branch
• Version 1.7.2
• Version 1.7.1
• Version 1.7.0

Version 1.7.2

Security
• CVE 2018-8007

Version 1.7.1

Bug Fix
• #974: Fix access to /db/_all_docs for database members.

Version 1.7.0

Security
• CVE 2017-12635
• CVE 2017-12636

API Changes
• COUCHDB-1356: Return username on POST /_session.
• COUCHDB-1876: Fix duplicated Content-Type for show/update functions.
• COUCHDB-2310: Implement POST /{db}/_bulk_get.
• COUCHDB-2375: 400 Bad Request returned when invalid revision specified.
• COUCHDB-2845: 400 Bad Request returned when revs is not a list.

Build
• COUCHDB-1964: Replace etap test suite with EUnit.
• COUCHDB-2225: Enforce that shared libraries can be built by the system.
• COUCHDB-2761: Support glibc >= 2.20.
• COUCHDB-2747: Support Erlang 18.
• #5b9742c: Support Erlang 19.
• #1545bf4: Remove broken benchmarks.

Database Core
• COUCHDB-2534: Improve checks for db admin/member.
• COUCHDB-2735: Duplicate document _ids created under high edit load.

Documentation
• #c3c9588: Improve documentation of cacert_file ssl option.
• #3266f23: Clarify the purpose of tombstones.
• #75887d9: Improve CouchDB Replication Protocol definition.
• #3b1dc0f: Remove mention of group_level=exact.
• #2a11daa: Remove mention of Test Suite in Futon.
• #01c60f1: Clarify type of key, startkey and endkey params.

Futon
• COUCHDB-241: Support document copying.
• COUCHDB-1011: Run replication filtered by document ids from Futon.
• COUCHDB-1275: Unescape database names in Futon recently used list.
• #f18f82a: Update jquery.ui to 1.10.4 with fixes of potential XSS issues.

HTTP Server
• COUCHDB-2430: Disable Nagle's algorithm by default.
• COUCHDB-2583: Don't drop the connection for endpoints that do not require a payload.
• COUCHDB-2673: Properly escape Location: HTTP header.
• COUCHDB-2677: Wrong Expires header weekday.
• COUCHDB-2783: Bind both to IPv4 and IPv6.
• #f30f3dd: Support for user configurable SSL ciphers.

Query Server
• COUCHDB-1447: Custom response headers from design functions get merged with default ones.
• #7779c11: Upgrade Coffeescript to version 1.10.

jquery.couch.js
• #f9095e7: Fix document copying.

1.6.x Branch
• Upgrade Notes
• Version 1.6.0

Upgrade Notes
The Proxy Authentication handler was renamed to proxy_authentication_handler to follow the *_authentication_handler form of all other handlers. The old proxy_authentification_handler name is marked as deprecated and will be removed in future releases. If you used that handler, it is strongly recommended to update the httpd/authentication_handlers option to the new value.

Version 1.6.0
• COUCHDB-2200: support Erlang/OTP 17.0 #35e16032
• Fauxton: many improvements in our experimental new user interface, including switching the code editor from CodeMirror to Ace as well as better support for various browsers.
• Add the max_count option (UUIDs Configuration) to allow rate-limiting the amount of UUIDs that can be requested from the /_uuids handler in a single request (CVE 2014-2668).
• COUCHDB-1986: increase socket buffer size to improve replication speed for large documents and attachments, and fix tests on BSD-like systems. #9a0e561b
• COUCHDB-1953: improve performance of multipart/related requests. #ce3e89dc
• COUCHDB-2221: verify that authentication-related configuration settings are well-formed. #dbe769c6
• COUCHDB-1922: fix CORS exposed headers. #4f619833
• Rename proxy_authentification_handler to proxy_authentication_handler. #c66ac4a8
• COUCHDB-1795: ensure the startup script clears the pid file on termination. #818ef4f9
• COUCHDB-1962: replication can now be performed without having write access to the source database (#1d5fe2aa), and the replication checkpoint interval is now configurable (#0693f98e).
• COUCHDB-2025: add support for SOCKS5 proxies for replication. #fcd76c9
• COUCHDB-1930: redirect to the correct page after submitting a new document with a different ID than the one suggested by Futon. #4906b591
• COUCHDB-1923: add support for attachments and att_encoding_info options (formerly only available on the documents API) to the view API. #ca41964b
• COUCHDB-1647: for failed replications originating from a document in the _replicator database, store the failure reason in the document. #08cac68b
• A number of improvements to the documentation.

1.5.x Branch
• Version 1.5.1
• Version 1.5.0

WARNING:
Version 1.5.1 contains important security fixes. Previous 1.5.x releases are not recommended for regular usage.

Version 1.5.1
• Add the max_count option (UUIDs Configuration) to allow rate-limiting the amount of UUIDs that can be requested from the /_uuids handler in a single request (CVE 2014-2668).

Version 1.5.0
• COUCHDB-1781: The official documentation has been overhauled. A lot of content from other sources has been merged, and the index page has been rebuilt to make the docs much more accessible. #54813a7
• A new administration UI, codenamed Fauxton, has been included as an experimental preview. It can be accessed at /_utils/fauxton/. There are too many improvements here to list them all. We are looking for feedback from the community on this preview release.
• COUCHDB-1888: Fixed an issue where admin users would be restricted by the public_fields feature.
• Fixed an issue with the JavaScript CLI test runner. #be76882, #54813a7
• COUCHDB-1867: An experimental plugin feature has been added.
See src/couch_plugin/README.md for details. We invite the community to test and report any findings. • COUCHDB-1894: An experimental Node.js-based query server runtime has been added. See Experimental Features for details. We invite the com- munity to test and report any findings. • COUCHDB-1901: Better retry mechanism for transferring attachments during replication. #4ca2cec 1.4.x Branch • Upgrade Notes • Version 1.4.0 WARNING: 1.4.x Branch is affected by the issue described in CVE-2014-2668: DoS (CPU and memory consumption) via the count parameter to /_uuids. Upgrading to a more recent release is strongly recommended. Upgrade Notes We now support Erlang/OTP R16B and R16B01; the minimum required version is R14B. User document role values must now be strings. Other types of values will be refused when saving the user document. Version 1.4.0 • COUCHDB-1139: its possible to apply list functions to _all_docs view. #54fd258e • COUCHDB-1632: Ignore epilogues in multipart/related MIME attachments. #2b4ab67a • COUCHDB-1634: Reduce PBKDF2 work factor. #f726bc4d • COUCHDB-1684: Support for server-wide changes feed reporting on cre- ation, updates and deletion of databases. #917d8988 • COUCHDB-1772: Prevent invalid JSON output when using all_or_nothing of bulk API. #dfd39d57 • Add a configurable whitelist of user document properties. #8d7ab8b1 • COUCHDB-1852: Support Last-Event-ID header in EventSource changes feeds. #dfd2199a • Allow storing pre-hashed admin passwords via config API. #c98ba561 • Automatic loading of CouchDB plugins. #3fab6bb5 • Much improved documentation, including an expanded description of validate_doc_update functions (commit:ef9ac469) and a description of how CouchDB handles JSON number values (#bbd93f77). • Split up replicator_db tests into multiple independent tests. 1.3.x Branch • Upgrade Notes • Version 1.3.1 • Version 1.3.0 WARNING: 1.3.x Branch is affected by the issue described in CVE-2014-2668: DoS (CPU and memory consumption) via the count parameter to /_uuids. Upgrading to a more recent release is strongly recommended. Upgrade Notes You can upgrade your existing CouchDB 1.0.x installation to 1.3.0 with- out any specific steps or migration. When you run CouchDB, the existing data and index files will be opened and used as normal. The first time you run a compaction routine on your database within 1.3.0, the data structure and indexes will be updated to the new ver- sion of the CouchDB database format that can only be read by CouchDB 1.3.0 and later. This step is not reversible. Once the data files have been updated and migrated to the new version the data files will no longer work with a CouchDB 1.0.x release. WARNING: If you want to retain support for opening the data files in CouchDB 1.0.x you must back up your data files before performing the upgrade and compaction process. Version 1.3.1 Replicator • COUCHDB-1788: Tolerate missing source and target fields in _replica- tor docs. #869f42e2 Log System • COUCHDB-1794: Fix bug in WARN level logging from 1.3.0. • Dont log about missing .compact files. #06f1a8dc View Server • COUCHDB-1792: Fix the -S option to couchjs to increase memory limits. #cfaa66cd Miscellaneous • COUCHDB-1784: Improvements to test suite and VPATH build system. - #01afaa4f • Improve documentation: better structure, improve language, less du- plication. Version 1.3.0 Database core • COUCHDB-1512: Validate bind address before assignment. #09ead8a0 • Restore max_document_size protection. 
#bf1eb135 Documentation • COUCHDB-1523: Import CouchBase documentation and convert them into - Sphinx docs Futon • COUCHDB-509: Added view request duration to Futon. #2d2c7d1e • COUCHDB-627: Support all timezones. #b1a049bb • COUCHDB-1383: Futon view editor wont allow you to save original view after saving a revision. #ce48342 • COUCHDB-1470: Futon raises pop-up on attempt to navigate to missed/deleted document. #5da40eef • COUCHDB-1473, COUCHDB-1472: Disable buttons for actions that the user doesnt have permissions to. #7156254d HTTP Interface • COUCHDB-431: Introduce experimental CORS support. #b90e4021 • COUCHDB-764, COUCHDB-514, COUCHDB-430: Fix sending HTTP headers from _list function, #2a74f88375 • COUCHDB-887: Fix bytes and offset parameters semantic for _log re- source (explanation) #ad700014 • COUCHDB-986: Added Server-Sent Events protocol to db changes API. See http://www.w3.org/TR/eventsource/ for details. #093d2aa6 • COUCHDB-1026: Database names are encoded with respect of special characters in the rewriter now. #272d6415 • COUCHDB-1097: Allow OPTIONS request to shows and lists functions. - #9f53704a • COUCHDB-1210: Files starting with underscore can be attached and up- dated now. #05858792 • COUCHDB-1277: Better query parameter support and code clarity: - #7e3c69ba • Responses to documents created/modified via form data POST to /db/doc or copied with COPY should now include Location header. • Form data POST to /db/doc now includes an ETag response header. • ?batch=ok is now supported for COPY and POST /db/doc updates. • ?new_edits=false is now supported for more operations. • COUCHDB-1285: Allow configuration of vendor and modules version in CouchDB welcome message. #3c24a94d • COUCHDB-1321: Variables in rewrite rules breaks OAuth authentication. #c307ba95 • COUCHDB-1337: Use MD5 for attachment ETag header value. #6d912c9f • COUCHDB-1381: Add jquery.couch support for Windows 8 Metro apps. - #dfc5d37c • COUCHDB-1441: Limit recursion depth in the URL rewriter. Defaults to a maximum of 100 invocations but is configurable. #d076976c • COUCHDB-1442: No longer rewrites the X-CouchDB-Requested-Path during recursive calls to the rewriter. #56744f2f • COUCHDB-1501: Changes feed now can take special parameter since=now to emit changes since current point of time. #3bbb2612 • COUCHDB-1502: Allow users to delete own _users doc. #f0d6f19bc8 • COUCHDB-1511: CouchDB checks roles field for _users database docu- ments with more care. #41205000 • COUCHDB-1537: Include user name in show/list ETags. #ac320479 • Send a 202 response for _restart. #b213e16f • Make password hashing synchronous when using the /_config/admins API. #08071a80 • Add support to serve single file with CouchDB, #2774531ff2 • Allow any 2xx code to indicate success, #0d50103cfd • Fix _session for IE7. • Restore 400 error for empty PUT, #2057b895 • Return X-Couch-Id header if doc is created, #98515bf0b9 • Support auth cookies with : characters, #d9566c831d Log System • COUCHDB-1380: Minor fixes for logrotate support. • Improve file I/O error logging and handling, #4b6475da • Module Level Logging, #b58f069167 • Log 5xx responses at error level, #e896b0b7 • Log problems opening database at ERROR level except for auto-created system dbs, #41667642f7 Replicator • COUCHDB-1248: HTTP 500 error now doesnt occurs when replicating with ?doc_ids=null. #bea76dbf • COUCHDB-1259: Stabilize replication id, #c6252d6d7f • COUCHDB-1323: Replicator now acts as standalone application. 
#f913ca6e
• COUCHDB-1363: Fix a rarely occurring race condition in the changes feed: if a quick burst of changes happens while replication is starting, the replication can go stale. #573a7bb9
• COUCHDB-1557: Upgrade some code to use BIFs, bringing good improvements for replication.

Security
• COUCHDB-1060: Passwords are now hashed using the PBKDF2 algorithm with a configurable work factor. #7d418134

Source Repository
• The source repository was migrated from SVN to Git.

Storage System
• Fixed unnecessary conflict when deleting and creating a document in the same batch.

Test Suite
• COUCHDB-1321: Moved the JS test suite to the CLI.
• COUCHDB-1338: Start CouchDB with port=0. While CouchDB might already be running on the default port 5984, port number 0 lets the TCP stack figure out a free port to run on. #127cbe3
• COUCHDB-1339: Use shell trap to catch dying beam processes during test runs. #2921c78
• COUCHDB-1389: Improved tracebacks printed by the JS CLI tests.
• COUCHDB-1563: Ensures urlPrefix is set in all ajax requests. #07a6af222
• Fix race condition for tests running on faster hardware.
• Improved the reliability of a number of tests.

URL Rewriter & Vhosts
• COUCHDB-1026: Database name is encoded during rewriting (allowing embedded /s, etc). #272d6415

UUID Algorithms
• COUCHDB-1373: Added the utc_id algorithm. #5ab712a2

Query and View Server
• COUCHDB-111: Improve the errors reported by the JavaScript view server to provide a more friendly error report when something goes wrong. #0c619ed
• COUCHDB-410: More graceful error handling for JavaScript validate_doc_update functions.
• COUCHDB-1372: _stats built-in reduce function no longer produces an error for an empty view result.
• COUCHDB-1444: Fix missed_named_view error that occurs on existing design documents and views. #b59ac98b
• COUCHDB-1445: CouchDB no longer tries to delete a view file if it couldn't open it, even if the error is emfile.
• COUCHDB-1483: Update handlers require valid doc ids. #72ea7e38
• COUCHDB-1491: Clean up view tables. #c37204b7
• Deprecate E4X support, #cdfdda2314

Windows
• COUCHDB-1482: Use correct linker flag to build snappy_nif.dll on Windows. #a6eaf9f1
• Allows building cleanly on Windows without cURL, #fb670f5712

1.2.x Branch
• Upgrade Notes
• Version 1.2.2
• Version 1.2.1
• Version 1.2.0

Upgrade Notes

WARNING:
This version drops support for the database format that was introduced in version 0.9.0. Compact your older databases (that have not been compacted for a long time) before upgrading, or they will become inaccessible.

WARNING:
Version 1.2.1 contains important security fixes. Previous 1.2.x releases are not recommended for regular usage.

Security changes
The interface to the _users and _replicator databases has been changed so that non-administrator users can see less information:
• In the _users database:
  • User documents can now only be read by the respective users, as well as administrators. Other users cannot read these documents.
  • Views can only be defined and queried by administrator users.
  • The _changes feed can only be queried by administrator users.
• In the _replicator database:
  • Documents now have a forced owner field that corresponds to the authenticated user that created them.
  • Non-owner users will not see confidential information like passwords or OAuth tokens in replication documents; they can still see the other contents of those documents. Administrators can see everything.
  • Views can only be defined and queried by administrators.
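A minimal sketch of the effect of the _users change listed above (the host, port, user names, and passwords are illustrative, and the exact error returned to non-admins may vary by installation):

   # a user can read their own user document
   curl -u alice:apple http://127.0.0.1:5984/_users/org.couchdb.user:alice
   # reading another user's document is rejected for non-admin users
   curl -u alice:apple http://127.0.0.1:5984/_users/org.couchdb.user:bob
   # administrators can still read any user document
   curl -u admin:secret http://127.0.0.1:5984/_users/org.couchdb.user:bob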
Database Compression The new optional (but enabled by default) compression of disk files re- quires an upgrade of the on-disk format (5 -> 6) which occurs on cre- ation for new databases and views, and on compaction for existing files. This format is not supported in previous releases, so rollback would require replication to the previous CouchDB release or restoring from backup. Compression can be disabled by setting compression = none in your lo- cal.ini [couchdb] section, but the on-disk format will still be up- graded. Version 1.2.2 Build System • Fixed issue in couchdb script where stopped status returns before process exits. HTTP Interface • Reset rewrite counter on new request, avoiding unnecessary request failures due to bogus rewrite limit reports. Version 1.2.1 Build System • Fix couchdb start script. • Win: fix linker invocations. Futon • Disable buttons that arent available for the logged-in user. HTTP Interface • No longer rewrites the X-CouchDB-Requested-Path during recursive calls to the rewriter. • Limit recursion depth in the URL rewriter. Defaults to a maximum of 100 invocations but is configurable. Security • Fixed CVE-2012-5641: Information disclosure via unescaped backslashes in URLs on Windows • Fixed CVE-2012-5649: JSONP arbitrary code execution with Adobe Flash • Fixed CVE-2012-5650: DOM based Cross-Site Scripting via Futon UI Replication • Fix potential timeouts. View Server • Change use of signals to avoid broken view groups. Version 1.2.0 Authentication • Fix use of OAuth with VHosts and URL rewriting. • OAuth secrets can now be stored in the users system database as an alternative to key value pairs in the .ini configuration. By default this is disabled (secrets are stored in the .ini) but can be enabled via the .ini configuration key use_users_db in the couch_httpd_oauth section. • Documents in the _users database are no longer publicly readable. • Confidential information in the _replication database is no longer publicly readable. • Password hashes are now calculated by CouchDB. Clients are no longer required to do this manually. • Cookies used for authentication can be made persistent by enabling the .ini configuration key allow_persistent_cookies in the couch_httpd_auth section. Build System • cURL is no longer required to build CouchDB as it is only used by the command line JS test runner. If cURL is available when building CouchJS you can enable the HTTP bindings by passing -H on the command line. • Temporarily made make check pass with R15B. A more thorough fix is in the works (COUCHDB-1424). • Fixed with-js-include and with-js-lib options. • Added with-js-lib-name option. Futon • The Status screen (active tasks) now displays two new task status fields: Started on and Updated on. • Futon remembers view code every time it is saved, allowing to save an edit that amounts to a revert. HTTP Interface • Added a native JSON parser. • The _active_tasks API now offers more granular fields. Each task type is now able to expose different properties. • Added built-in changes feed filter _view. • Fixes to the _changes feed heartbeat option which caused heartbeats to be missed when used with a filter. This caused timeouts of contin- uous pull replications with a filter. • Properly restart the SSL socket on configuration changes. OAuth • Updated bundled erlang_oauth library to the latest version. Replicator • A new replicator implementation. It offers more performance and con- figuration options. • Passing non-string values to query_params is now a 400 bad request. 
This is to reduce the surprise that all parameters are converted to strings internally. • Added optional field since_seq to replication objects/documents. It allows to bootstrap a replication from a specific source sequence number. • Simpler replication cancellation. In addition to the current method, replications can now be canceled by specifying the replication ID in- stead of the original replication object/document. Storage System • Added optional database and view index file compression (using Googles snappy or zlibs deflate). This feature is enabled by default, but it can be disabled by adapting local.ini accordingly. The on-disk format is upgraded on compaction and new DB/view creation to support this. • Several performance improvements, most notably regarding database writes and view indexing. • Computation of the size of the latest MVCC snapshot data and all its supporting metadata, both for database and view index files. This in- formation is exposed as the data_size attribute in the database and view group information URIs. • The size of the buffers used for database and view compaction is now configurable. • Added support for automatic database and view compaction. This fea- ture is disabled by default, but it can be enabled via the .ini con- figuration. • Performance improvements for the built-in changes feed filters _doc_ids and _design. View Server • Add CoffeeScript (http://coffeescript.org/) as a first class view server language. • Fixed old index file descriptor leaks after a view cleanup. • The requested_path property keeps the pre-rewrite path even when no VHost configuration is matched. • Fixed incorrect reduce query results when using pagination parame- ters. • Made icu_driver work with Erlang R15B and later. 1.1.x Branch • Upgrade Notes • Version 1.1.2 • Version 1.1.1 • Version 1.1.0 Upgrade Notes WARNING: Version 1.1.2 contains important security fixes. Previous 1.1.x re- leases are not recommended for regular usage. Version 1.1.2 Build System • Dont ln the couchjs install target on Windows • Remove ICU version dependency on Windows. • Improve SpiderMonkey version detection. HTTP Interface • ETag of attachment changes only when the attachment changes, not the document. • Fix retrieval of headers larger than 4k. • Allow OPTIONS HTTP method for list requests. • Dont attempt to encode invalid json. Log System • Improvements to log messages for file-related errors. Replicator • Fix pull replication of documents with many revisions. • Fix replication from an HTTP source to an HTTP target. Security • Fixed CVE-2012-5641: Information disclosure via unescaped backslashes in URLs on Windows • Fixed CVE-2012-5649: JSONP arbitrary code execution with Adobe Flash • Fixed CVE-2012-5650: DOM based Cross-Site Scripting via Futon UI View Server • Avoid invalidating view indexes when running out of file descriptors. Version 1.1.1 • Support SpiderMonkey 1.8.5 • Add configurable maximum to the number of bytes returned by _log. • Allow CommonJS modules to be an empty string. • Bump minimum Erlang version to R13B02. • Do not run deleted validate_doc_update functions. • ETags for views include current sequence if include_docs=true. • Fix bug where duplicates can appear in _changes feed. • Fix bug where update handlers break after conflict resolution. • Fix bug with _replicator where include filter could crash couch. • Fix crashes when compacting large views. • Fix file descriptor leak in _log • Fix missing revisions in _changes?style=all_docs. 
• Improve handling of compaction at max_dbs_open limit.
• JSONP responses now send text/javascript for Content-Type.
• Link to ICU 4.2 on Windows.
• Permit forward slashes in path to update functions.
• Reap couchjs processes that hit reduce_overflow error.
• Status code can be specified in update handlers.
• Support provides() in show functions.
• _view_cleanup when ddoc has no views now removes all index files.
• max_replication_retry_count now supports infinity.
• Fix replication crash when source database has a document with empty ID.
• Fix deadlock when assigning couchjs processes to serve requests.
• Fixes to the document multipart PUT API.
• Fixes regarding file descriptor leaks for databases with views.

Version 1.1.0

NOTE:
All CHANGES for 1.0.2 and 1.0.3 also apply to 1.1.0.

Externals
• Added OS Process module to manage daemons outside of CouchDB.
• Added HTTP Proxy handler for more scalable externals.

Futon
• Added a change-password feature to Futon.

HTTP Interface
• Native SSL support.
• Added support for HTTP range requests for attachments.
• Added built-in filters for _changes: _doc_ids and _design.
• Added configuration option for TCP_NODELAY aka Nagle.
• Allow POSTing arguments to _changes.
• Allow keys parameter for GET requests to views.
• Allow wildcards in vhosts definitions.
• More granular ETag support for views.
• More flexible URL rewriter.
• Added support for recognizing Q values and media parameters in HTTP Accept headers.
• Validate doc ids that come from a PUT to a URL.

Replicator
• Added _replicator database to manage replications.
• Fixed issues when an endpoint is a remote database accessible via SSL.
• Added support for continuous by-doc-IDs replication.
• Fix issue where revision info was omitted when replicating attachments.
• Integrity of attachment replication is now verified by MD5.

Storage System
• Multiple micro-optimizations when reading data.

URL Rewriter & Vhosts
• Fix for variable substitution.

View Server
• Added CommonJS support to map functions.
• Added stale=update_after query option that triggers a view update after returning a stale=ok response.
• Warn about empty result caused by startkey and endkey limiting.
• Built-in reduce function _sum now accepts lists of integers as input.
• Added view query aliases start_key, end_key, start_key_doc_id and end_key_doc_id.

1.0.x Branch
• Upgrade Notes
• Version 1.0.4
• Version 1.0.3
• Version 1.0.2
• Version 1.0.1
• Version 1.0.0

Upgrade Notes
Note: to replicate with a 1.0 CouchDB instance you must first upgrade your current CouchDB to 1.0 or 0.11.1 in place. (Backporting so that 0.10.x can replicate to 1.0 would not be that hard; all that is required is patching the replicator to use the application/json content type.)
• _log and _temp_views are now admin-only resources.
• _bulk_docs now requires a valid Content-Type header of application/json.
• JSONP is disabled by default. An .ini option was added to selectively enable it.
• The key, startkey and endkey properties of the request object passed to list and show functions now contain JSON objects representing the URL encoded string values in the query string. Previously, these properties contained strings which needed to be converted to JSON before using.

WARNING:
Version 1.0.4 contains important security fixes. Previous 1.0.x releases are not recommended for regular usage.

Version 1.0.4

HTTP Interface
• Fix missing revisions in _changes?style=all_docs.
• Fix validation of attachment names.

Log System
• Fix file descriptor leak in _log.
Replicator • Fix a race condition where replications can go stale Security • Fixed CVE-2012-5641: Information disclosure via unescaped backslashes in URLs on Windows • Fixed CVE-2012-5649: JSONP arbitrary code execution with Adobe Flash • Fixed CVE-2012-5650: DOM based Cross-Site Scripting via Futon UI View System • Avoid invalidating view indexes when running out of file descriptors. Version 1.0.3 General • Fixed compatibility issues with Erlang R14B02. Etap Test Suite • Etap tests no longer require use of port 5984. They now use a ran- domly selected port so they wont clash with a running CouchDB. Futon • Made compatible with jQuery 1.5.x. HTTP Interface • Fix bug that allows invalid UTF-8 after valid escapes. • The query parameter include_docs now honors the parameter conflicts. This applies to queries against map views, _all_docs and _changes. • Added support for inclusive_end with reduce views. Replicator • Enabled replication over IPv6. • Fixed for crashes in continuous and filtered changes feeds. • Fixed error when restarting replications in OTP R14B02. • Upgrade ibrowse to version 2.2.0. • Fixed bug when using a filter and a limit of 1. Security • Fixed OAuth signature computation in OTP R14B02. • Handle passwords with : in them. Storage System • More performant queries against _changes and _all_docs when using the include_docs parameter. Windows • Windows builds now require ICU >= 4.4.0 and Erlang >= R14B03. See - COUCHDB-1152, and COUCHDB-963 + OTP-9139 for more information. Version 1.0.2 Futon • Make test suite work with Safari and Chrome. • Fixed animated progress spinner. • Fix raw view document link due to overzealous URI encoding. • Spell javascript correctly in loadScript(uri). HTTP Interface • Allow reduce=false parameter in map-only views. • Fix parsing of Accept headers. • Fix for multipart GET APIs when an attachment was created during a local-local replication. See COUCHDB-1022 for details. Log System • Reduce lengthy stack traces. • Allow logging of native <xml> types. Replicator • Updated ibrowse library to 2.1.2 fixing numerous replication issues. • Make sure that the replicator respects HTTP settings defined in the config. • Fix error when the ibrowse connection closes unexpectedly. • Fix authenticated replication (with HTTP basic auth) of design docu- ments with attachments. • Various fixes to make replication more resilient for edge-cases. Storage System • Fix leaking file handles after compacting databases and views. • Fix databases forgetting their validation function after compaction. • Fix occasional timeout errors after successfully compacting large databases. • Fix occasional error when writing to a database that has just been compacted. • Fix occasional timeout errors on systems with slow or heavily loaded IO. • Fix for OOME when compactions include documents with many conflicts. • Fix for missing attachment compression when MIME types included para- meters. • Preserve purge metadata during compaction to avoid spurious view re- builds. • Fix spurious conflicts introduced when uploading an attachment after a doc has been in a conflict. See COUCHDB-902 for details. • Fix for frequently edited documents in multi-master deployments being duplicated in _changes and _all_docs. See COUCHDB-968 for details on how to repair. • Significantly higher read and write throughput against database and view index files. View Server • Dont trigger view updates when requesting _design/doc/_info. • Fix for circular references in CommonJS requires. 
• Made isArray() function available to functions executed in the query server. • Documents are now sealed before being passed to map functions. • Force view compaction failure when duplicated document data exists. When this error is seen in the logs users should rebuild their views from scratch to fix the issue. See COUCHDB-999 for details. Version 1.0.1 Authentication • Enable basic-auth popup when required to access the server, to pre- vent people from getting locked out. Build and System Integration • Included additional source files for distribution. Futon • User interface element for querying stale (cached) views. HTTP Interface • Expose committed_update_seq for monitoring purposes. • Show fields saved along with _deleted=true. Allows for auditing of deletes. • More robust Accept-header detection. Replicator • Added support for replication via an HTTP/HTTPS proxy. • Fix pull replication of attachments from 0.11 to 1.0.x. • Make the _changes feed work with non-integer seqnums. Storage System • Fix data corruption bug COUCHDB-844. Please see - http://couchdb.apache.org/notice/1.0.1.html for details. Version 1.0.0 Security • Added authentication caching, to avoid repeated opening and closing of the users database for each request requiring authentication. Storage System • Small optimization for reordering result lists. • More efficient header commits. • Use O_APPEND to save lseeks. • Faster implementation of pread_iolist(). Further improves performance on concurrent reads. View Server • Faster default view collation. • Added option to include update_seq in view responses. 0.11.x Branch • Upgrade Notes • Version 0.11.2 • Version 0.11.1 • Version 0.11.0 Upgrade Notes WARNING: Version 0.11.2 contains important security fixes. Previous 0.11.x releases are not recommended for regular usage. Changes Between 0.11.0 and 0.11.1 • _log and _temp_views are now admin-only resources. • _bulk_docs now requires a valid Content-Type header of applica- tion/json. • JSONP is disabled by default. An .ini option was added to selectively enable it. • The key, startkey and endkey properties of the request object passed to list and show functions now contain JSON objects representing the URL encoded string values in the query string. Previously, these properties contained strings which needed to be converted to JSON be- fore using. Changes Between 0.10.x and 0.11.0 show, list, update and validation functions The req argument to show, list, update and validation functions now contains the member method with the specified HTTP method of the cur- rent request. Previously, this member was called verb. method is fol- lowing RFC 2616 (HTTP 1.1) closer. _admins -> _security The /db/_admins handler has been removed and replaced with a /{db}/_security object. Any existing _admins will be dropped and need to be added to the security object again. The reason for this is that the old system made no distinction between names and roles, while the new one does, so there is no way to automatically upgrade the old ad- mins list. The security object has 2 special fields, admins and readers, which contain lists of names and roles which are admins or readers on that database. Anything else may be stored in other fields on the security object. The entire object is made available to validation functions. json2.js JSON handling in the query server has been upgraded to use json2.js. This allows us to use faster native JSON serialization when it is available. 
In previous versions, attempts to serialize undefined would throw an exception, causing the doc that emitted undefined to be dropped from the view index. The new behavior is to serialize undefined as null. Applications depending on the old behavior will need to explicitly check for undefined. Another change is that E4Xs XML objects will not automatically be stringified. XML users will need to call my_xml_object.toXMLString() to return a string value. #8d3b7ab3 WWW-Authenticate The default configuration has been changed to avoid causing basic-auth popups which result from sending the WWW-Authenticate header. To enable basic-auth popups, uncomment the config option httpd/WWW-Authenticate line in local.ini. Query server line protocol The query server line protocol has changed for all functions except map, reduce, and rereduce. This allows us to cache the entire design document in the query server process, which results in faster perfor- mance for common operations. It also gives more flexibility to query server implementers and shouldnt require major changes in the future when adding new query server features. UTF8 JSON JSON request bodies are validated for proper UTF-8 before saving, in- stead of waiting to fail on subsequent read requests. _changes line format Continuous changes are now newline delimited, instead of having each line followed by a comma. Version 0.11.2 Authentication • User documents can now be deleted by admins or the user. Futon • Add some Futon files that were missing from the Makefile. HTTP Interface • Better error messages on invalid URL requests. Replicator • Fix bug when pushing design docs by non-admins, which was hanging the replicator for no good reason. • Fix bug when pulling design documents from a source that requires ba- sic-auth. Security • Avoid potential DOS attack by guarding all creation of atoms. • Fixed CVE-2010-2234: Apache CouchDB Cross Site Request Forgery Attack Version 0.11.1 Build and System Integration • Output of couchdb help has been improved. • Fixed compatibility with the Erlang R14 series. • Fixed warnings on Linux builds. • Fixed build error when aclocal needs to be called during the build. • Require ICU 4.3.1. • Fixed compatibility with Solaris. Configuration System • Fixed timeout with large .ini files. Futon • Use expando links for over-long document values in Futon. • Added continuous replication option. • Added option to replicating test results anonymously to a community CouchDB instance. • Allow creation and deletion of config entries. • Fixed display issues with doc ids that have escaped characters. • Fixed various UI issues. HTTP Interface • Mask passwords in active tasks and logging. • Update mochijson2 to allow output of BigNums not in float form. • Added support for X-HTTP-METHOD-OVERRIDE. • Better error message for database names. • Disable jsonp by default. • Accept gzip encoded standalone attachments. • Made max_concurrent_connections configurable. • Made changes API more robust. • Send newly generated document rev to callers of an update function. JavaScript Clients • Added tests for couch.js and jquery.couch.js • Added changes handler to jquery.couch.js. • Added cache busting to jquery.couch.js if the user agent is msie. • Added support for multi-document-fetch (via _all_docs) to jquery.couch.js. • Added attachment versioning to jquery.couch.js. • Added option to control ensure_full_commit to jquery.couch.js. • Added list functionality to jquery.couch.js. • Fixed issues where bulkSave() wasnt sending a POST body. 
Log System • Log HEAD requests as HEAD, not GET. • Keep massive JSON blobs out of the error log. • Fixed a timeout issue. Replication System • Refactored various internal APIs related to attachment streaming. • Fixed hanging replication. • Fixed keepalive issue. Security • Added authentication redirect URL to log in clients. • Fixed query parameter encoding issue in oauth.js. • Made authentication timeout configurable. • Temporary views are now admin-only resources. Storage System • Dont require a revpos for attachment stubs. • Added checking to ensure when a revpos is sent with an attachment stub, its correct. • Make file deletions async to avoid pauses during compaction and db deletion. • Fixed for wrong offset when writing headers and converting them to blocks, only triggered when header is larger than 4k. • Preserve _revs_limit and instance_start_time after compaction. Test Suite • Made the test suite overall more reliable. View Server • Provide a UUID to update functions (and all other functions) that they can use to create new docs. • Upgrade CommonJS modules support to 1.1.1. • Fixed erlang filter funs and normalize filter fun API. • Fixed hang in view shutdown. URL Rewriter & Vhosts • Allow more complex keys in rewriter. • Allow global rewrites so system defaults are available in vhosts. • Allow isolation of databases with vhosts. • Fix issue with passing variables to query parameters. Version 0.11.0 Build and System Integration • Updated and improved source documentation. • Fixed distribution preparation for building on Mac OS X. • Added support for building a Windows installer as part of make dist. • Bug fix for building couch.apps module list. • ETap tests are now run during make distcheck. This included a number of updates to the build system to properly support VPATH builds. • Gavin McDonald set up a build-bot instance. More info can be found at http://ci.apache.org/buildbot.html Futon • Added a button for view compaction. • JSON strings are now displayed as-is in the document view, without the escaping of new-lines and quotes. That dramatically improves readability of multi-line strings. • Same goes for editing of JSON string values. When a change to a field value is submitted, and the value is not valid JSON it is assumed to be a string. This improves editing of multi-line strings a lot. • Hitting tab in textareas no longer moves focus to the next form field, but simply inserts a tab character at the current caret posi- tion. • Fixed some font declarations. HTTP Interface • Provide Content-MD5 header support for attachments. • Added URL Rewriter handler. • Added virtual host handling. Replication • Added option to implicitly create replication target databases. • Avoid leaking file descriptors on automatic replication restarts. • Added option to replicate a list of documents by id. • Allow continuous replication to be cancelled. Runtime Statistics • Statistics are now calculated for a moving window instead of non-overlapping timeframes. • Fixed a problem with statistics timers and system sleep. • Moved statistic names to a term file in the priv directory. Security • Fixed CVE-2010-0009: Apache CouchDB Timing Attack Vulnerability. • Added default cookie-authentication and users database. • Added Futon user interface for user signup and login. • Added per-database reader access control lists. • Added per-database security object for configuration data in valida- tion functions. 
• Added proxy authentication handler.

Storage System

• Adds batching of multiple updating requests, to improve throughput with many writers. Removed the now redundant couch_batch_save module.
• Adds configurable compression of attachments.

View Server

• Added optional raw binary collation for faster view builds where Unicode collation is not important.
• Improved view index build time by reducing ICU collation callouts.
• Improved view information objects.
• Bug fix for partial updates during view builds.
• Move query server to a design-doc based protocol.
• Use json2.js for JSON serialization for compatibility with native JSON.
• Major refactoring of couchjs to lay the groundwork for disabling cURL support. The new HTTP interaction acts like a synchronous XHR. Example usage of the new system is in the JavaScript CLI test runner.

0.10.x Branch

• Upgrade Notes
• Version 0.10.2
• Version 0.10.1
• Version 0.10.0

Upgrade Notes

WARNING: Version 0.10.2 contains important security fixes. Previous 0.10.x releases are not recommended for regular usage.

Modular Configuration Directories

CouchDB now loads configuration from the following places (glob(7) syntax) in order:

• PREFIX/default.ini
• PREFIX/default.d/*
• PREFIX/local.ini
• PREFIX/local.d/*

The configuration options for the couchdb script have changed to:

    -a FILE    add configuration FILE to chain
    -A DIR     add configuration DIR to chain
    -n         reset configuration file chain (including system default)
    -c         print configuration file chain and exit

Show and List API change

Show and List functions must have a new structure in 0.10. See Formatting_with_Show_and_List for details.

Stricter enforcing of reduciness in reduce-functions

Reduce functions are now required to reduce the number of values for a key.

View query reduce parameter strictness

CouchDB now considers the parameter reduce=false to be an error for queries of map-only views, and responds with status code 400.

Version 0.10.2

Build and System Integration

• Fixed distribution preparation for building on Mac OS X.

Security

• Fixed CVE-2010-0009: Apache CouchDB Timing Attack Vulnerability.

Replicator

• Avoid leaking file descriptors on automatic replication restarts.

Version 0.10.1

Build and System Integration

• Test suite now works with the distcheck target.

Replicator

• Stability enhancements regarding redirects, timeouts, OAuth.

Query Server

• Avoid process leaks.
• Allow list and view to span languages.

Stats

• Eliminate new process flood on system wake.

Version 0.10.0

Build and System Integration

• Changed couchdb script configuration options.
• Added default.d and local.d configuration directories to load sequence.

HTTP Interface

• Added optional cookie-based authentication handler.
• Added optional two-legged OAuth authentication handler.

Storage Format

• Add move headers with checksums to the end of database files for extra robust storage and faster storage.

View Server

• Added native Erlang views for high-performance applications.

0.9.x Branch

• Upgrade Notes
• Version 0.9.2
• Version 0.9.1
• Version 0.9.0

Upgrade Notes

Response to Bulk Creation/Updates

The response to a bulk creation / update now looks like this:

    [
        {"id": "0", "rev": "3682408536"},
        {"id": "1", "rev": "3206753266"},
        {"id": "2", "error": "conflict", "reason": "Document update conflict."}
    ]

Database File Format

The database file format has changed. CouchDB itself does not yet provide any tools for migrating your data.
In the meantime, you can use third-party scripts to deal with the migration, such as the dump/load tools that come with the development version (trunk) of couchdb-python. Renamed count to limit The view query API has been changed: count has become limit. This is a better description of what the parameter does, and should be a simple update in any client code. Moved View URLs The view URLs have been moved to design document resources. This means that paths that used to be like: http://hostname:5984/mydb/_view/designname/viewname?limit=10 will now look like: http://hostname:5984/mydb/_design/designname/_view/viewname?limit=10. See the REST, Hypermedia, and CouchApps thread on dev for details. Attachments Names of attachments are no longer allowed to start with an underscore. Error Codes Some refinements have been made to error handling. CouchDB will send 400 instead of 500 on invalid query parameters. Most notably, document update conflicts now respond with 409 Conflict instead of 412 Precondi- tion Failed. The error code for when attempting to create a database that already exists is now 412 instead of 409. ini file format CouchDB 0.9 changes sections and configuration variable names in con- figuration files. Old .ini files wont work. Also note that CouchDB now ships with two .ini files where 0.8 used couch.ini there are now de- fault.ini and local.ini. default.ini contains CouchDBs standard con- figuration values. local.ini is meant for local changes. local.ini is not overwritten on CouchDB updates, so your edits are safe. In addi- tion, the new runtime configuration system persists changes to the con- figuration in local.ini. Version 0.9.2 Build and System Integration • Remove branch callbacks to allow building couchjs against newer ver- sions of Spidermonkey. Replication • Fix replication with 0.10 servers initiated by an 0.9 server (- COUCHDB-559). Version 0.9.1 Build and System Integration • PID file directory is now created by the SysV/BSD daemon scripts. • Fixed the environment variables shown by the configure script. • Fixed the build instructions shown by the configure script. • Updated ownership and permission advice in README for better secu- rity. Configuration and stats system • Corrected missing configuration file error message. • Fixed incorrect recording of request time. Database Core • Document validation for underscore prefixed variables. • Made attachment storage less sparse. • Fixed problems when a database with delayed commits pending is con- sidered idle, and subject to losing changes when shutdown. (- COUCHDB-334) External Handlers • Fix POST requests. Futon • Redirect when loading a deleted view URI from the cookie. HTTP Interface • Attachment requests respect the rev query-string parameter. JavaScript View Server • Useful JavaScript Error messages. Replication • Added support for Unicode characters transmitted as UTF-16 surrogate pairs. • URL-encode attachment names when necessary. • Pull specific revisions of an attachment, instead of just the latest one. • Work around a rare chunk-merging problem in ibrowse. • Work with documents containing Unicode characters outside the Basic Multilingual Plane. Version 0.9.0 Build and System Integration • The couchdb script now supports system chainable configuration files. • The Mac OS X daemon script now redirects STDOUT and STDERR like SysV/BSD. • The build and system integration have been improved for portability. • Added COUCHDB_OPTIONS to etc/default/couchdb file. 
• Remove COUCHDB_INI_FILE and COUCHDB_PID_FILE from etc/default/couchdb file. • Updated configure.ac to manually link libm for portability. • Updated configure.ac to extended default library paths. • Removed inets configuration files. • Added command line test runner. • Created dev target for make. Configuration and stats system • Separate default and local configuration files. • HTTP interface for configuration changes. • Statistics framework with HTTP query API. Database Core • Faster B-tree implementation. • Changed internal JSON term format. • Improvements to Erlang VM interactions under heavy load. • User context and administrator role. • Update validations with design document validation functions. • Document purge functionality. • Ref-counting for database file handles. Design Document Resource Paths • Added httpd_design_handlers config section. • Moved _view to httpd_design_handlers. • Added ability to render documents as non-JSON content-types with _show and _list functions, which are also httpd_design_handlers. Futon Utility Client • Added pagination to the database listing page. • Implemented attachment uploading from the document page. • Added page that shows the current configuration, and allows modifica- tion of option values. • Added a JSON source view for document display. • JSON data in view rows is now syntax highlighted. • Removed the use of an iframe for better integration with browser his- tory and bookmarking. • Full database listing in the sidebar has been replaced by a short list of recent databases. • The view editor now allows selection of the view language if there is more than one configured. • Added links to go to the raw view or document URI. • Added status page to display currently running tasks in CouchDB. • JavaScript test suite split into multiple files. • Pagination for reduce views. HTTP Interface • Added client side UUIDs for idempotent document creation • HTTP COPY for documents • Streaming of chunked attachment PUTs to disk • Remove negative count feature • Add include_docs option for view queries • Add multi-key view post for views • Query parameter validation • Use stale=ok to request potentially cached view index • External query handler module for full-text or other indexers. • Etags for attachments, views, shows and lists • Show and list functions for rendering documents and views as devel- oper controlled content-types. • Attachment names may use slashes to allow uploading of nested direc- tories (useful for static web hosting). • Option for a view to run over design documents. • Added newline to JSON responses. Closes bike-shed. Replication • Using ibrowse. • Checkpoint replications so failures are less expensive. • Automatically retry of failed replications. • Stream attachments in pull-replication. 0.8.x Branch • Version 0.8.1-incubating • Version 0.8.0-incubating Version 0.8.1-incubating Build and System Integration • The couchdb script no longer uses awk for configuration checks as this was causing portability problems. • Updated sudo example in README to use the -i option, this fixes prob- lems when invoking from a directory the couchdb user cannot access. Database Core • Fix for replication problems where the write queues can get backed up if the writes arent happening fast enough to keep up with the reads. For a large replication, this can exhaust memory and crash, or slow down the machine dramatically. The fix keeps only one document in the write queue at a time. 
• Fix for databases sometimes incorrectly reporting that they contain 0 documents after compaction. • CouchDB now uses ibrowse instead of inets for its internal HTTP client implementation. This means better replication stability. Futon • The view selector dropdown should now work in Opera and Internet Ex- plorer even when it includes optgroups for design documents. (- COUCHDB-81) JavaScript View Server • Sealing of documents has been disabled due to an incompatibility with SpiderMonkey 1.9. • Improve error handling for undefined values emitted by map functions. (COUCHDB-83) HTTP Interface • Fix for chunked responses where chunks were always being split into multiple TCP packets, which caused problems with the test suite under Safari, and in some other cases. • Fix for an invalid JSON response body being returned for some kinds of views. (COUCHDB-84) • Fix for connections not getting closed after rejecting a chunked re- quest. (COUCHDB-55) • CouchDB can now be bound to IPv6 addresses. • The HTTP Server header now contains the versions of CouchDB and Er- lang. Version 0.8.0-incubating Build and System Integration • CouchDB can automatically respawn following a server crash. • Database server no longer refuses to start with a stale PID file. • System logrotate configuration provided. • Improved handling of ICU shared libraries. • The couchdb script now automatically enables SMP support in Erlang. • The couchdb and couchjs scripts have been improved for portability. • The build and system integration have been improved for portability. Database Core • The view engine has been completely decoupled from the storage en- gine. Index data is now stored in separate files, and the format of the main database file has changed. • Databases can now be compacted to reclaim space used for deleted doc- uments and old document revisions. • Support for incremental map/reduce views has been added. • To support map/reduce, the structure of design documents has changed. View values are now JSON objects containing at least a map member, and optionally a reduce member. • View servers are now identified by name (for example javascript) in- stead of by media type. • Automatically generated document IDs are now based on proper UUID generation using the crypto module. • The field content-type in the JSON representation of attachments has been renamed to content_type (underscore). Futon • When adding a field to a document, Futon now just adds a field with an autogenerated name instead of prompting for the name with a dia- log. The name is automatically put into edit mode so that it can be changed immediately. • Fields are now sorted alphabetically by name when a document is dis- played. • Futon can be used to create and update permanent views. • The maximum number of rows to display per page on the database page can now be adjusted. • Futon now uses the XMLHTTPRequest API asynchronously to communicate with the CouchDB HTTP server, so that most operations no longer block the browser. • View results sorting can now be switched between ascending and de- scending by clicking on the Key column header. • Fixed a bug where documents that contained a @ character could not be viewed. (COUCHDB-12) • The database page now provides a Compact button to trigger database compaction. (COUCHDB-38) • Fixed portential double encoding of document IDs and other URI seg- ments in many instances. (COUCHDB-39) • Improved display of attachments. • The JavaScript Shell has been removed due to unresolved licensing is- sues. 
JavaScript View Server

• SpiderMonkey is no longer included with CouchDB, but rather treated as a normal external dependency. A simple C program (_couchjs) is provided that links against an existing SpiderMonkey installation and uses the interpreter embedding API.
• View functions using the default JavaScript view server can now do logging using the global log(message) function. Log messages are directed into the CouchDB log at INFO level. (COUCHDB-59)
• The global map(key, value) function made available to view code has been renamed to emit(key, value).
• Fixed handling of exceptions raised by view functions.

HTTP Interface

• CouchDB now uses MochiWeb instead of inets for the HTTP server implementation. Among other things, this means that the extra configuration files needed for inets (such as couch_httpd.conf) are no longer used.
• The HTTP interface now completely supports the HEAD method. (COUCHDB-3)
• Improved compliance of Etag handling with the HTTP specification. (COUCHDB-13)
• Etags are no longer included in responses to document GET requests that include query string parameters causing the JSON response to change without the revision or the URI having changed.
• The bulk document update API has changed slightly on both the request and the response side. In addition, bulk updates are now atomic.
• CouchDB now uses TCP_NODELAY to fix performance problems with persistent connections on some platforms due to nagling.
• Including a ?descending=false query string parameter in requests to views no longer raises an error.
• Requests to unknown top-level reserved URLs (anything with a leading underscore) now return an unknown_private_path error instead of the confusing illegal_database_name.
• The Temporary view handling now expects a JSON request body, where the JSON is an object with at least a map member, and optional reduce and language members.
• Temporary views no longer determine the view server based on the Content-Type header of the POST request, but rather by looking for a language member in the JSON body of the request.
• The status code of responses to DELETE requests is now 200 to reflect that the deletion is performed synchronously.

SECURITY ISSUES / CVES

In the event of a CVE, the Apache CouchDB project will publish a fix as a patch to the current release series and its immediate predecessor only (e.g., if the current release is 3.3.3 and the predecessor is 3.2.3, we would publish a 3.3.4 release and a 3.2.4 release). Further backports may be published at our discretion.

CVE-2010-0009: Apache CouchDB Timing Attack Vulnerability

Date       31.03.2010
Affected   Apache CouchDB 0.8.0 to 0.10.1
Severity   Important
Vendor     The Apache Software Foundation

Description

Apache CouchDB versions prior to version 0.11.0 are vulnerable to timing attacks, also known as side-channel information leakage, due to using simple break-on-inequality string comparisons when verifying hashes and passwords.

Mitigation

All users should upgrade to CouchDB 0.11.0. Upgrades from the 0.10.x series should be seamless. Users on earlier versions should consult the upgrade notes.

Example

A canonical description of the attack can be found at http://codahale.com/a-lesson-in-timing-attacks/

Credit

This issue was discovered by Jason Davies of the Apache CouchDB development team.
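For illustration, a break-on-inequality comparison returns as soon as the first mismatching character is found, so the response time reveals how long the matching prefix is; a constant-time comparison inspects every character regardless. The following JavaScript sketch is not CouchDB's own code, it only contrasts the two approaches under those assumptions.

    // Naive comparison: returns at the first mismatch, so its running time
    // depends on the length of the common prefix -- the property exploited
    // by the timing attack described above.
    function naiveEqual(a, b) {
      if (a.length !== b.length) return false;
      for (var i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) return false;
      }
      return true;
    }

    // Constant-time comparison: always walks the full string and accumulates
    // differences, so the running time does not depend on where they occur.
    function constantTimeEqual(a, b) {
      if (a.length !== b.length) return false;
      var diff = 0;
      for (var i = 0; i < a.length; i++) {
        diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
      }
      return diff === 0;
    }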
CVE-2010-2234: Apache CouchDB Cross Site Request Forgery Attack Date 21.02.2010 Affected Apache CouchDB 0.8.0 to 0.11.1 Severity Important Vendor The Apache Software Foundation Description Apache CouchDB versions prior to version 0.11.1 are vulnerable to Cross Site Request Forgery (CSRF) attacks. Mitigation All users should upgrade to CouchDB 0.11.2 or 1.0.1. Upgrades from the 0.11.x and 0.10.x series should be seamless. Users on earlier versions should consult with upgrade notes. Example A malicious website can POST arbitrary JavaScript code to well known CouchDB installation URLs (like http://localhost:5984/) and make the browser execute the injected JavaScript in the security context of CouchDBs admin interface Futon. Unrelated, but in addition the JSONP API has been turned off by default to avoid potential information leakage. Credit This CSRF issue was discovered by a source that wishes to stay anony- mous. CVE-2010-3854: Apache CouchDB Cross Site Scripting Issue Date 28.01.2011 Affected Apache CouchDB 0.8.0 to 1.0.1 Severity Important Vendor The Apache Software Foundation Description Apache CouchDB versions prior to version 1.0.2 are vulnerable to Cross Site Scripting (XSS) attacks. Mitigation All users should upgrade to CouchDB 1.0.2. Upgrades from the 0.11.x and 0.10.x series should be seamless. Users on earlier versions should consult with upgrade notes. Example Due to inadequate validation of request parameters and cookie data in Futon, CouchDBs web-based administration UI, a malicious site can exe- cute arbitrary code in the context of a users browsing session. Credit This XSS issue was discovered by a source that wishes to stay anony- mous. CVE-2012-5641: Information disclosure via unescaped backslashes in URLs on Windows Date 14.01.2013 Affected All Windows-based releases of Apache CouchDB, up to and includ- ing 1.0.3, 1.1.1, and 1.2.0 are vulnerable. Severity Moderate Vendor The Apache Software Foundation Description A specially crafted request could be used to access content directly that would otherwise be protected by inbuilt CouchDB security mecha- nisms. This request could retrieve in binary form any CouchDB database, including the _users or _replication databases, or any other file that the user account used to run CouchDB might have read access to on the local filesystem. This exploit is due to a vulnerability in the in- cluded MochiWeb HTTP library. Mitigation Upgrade to a supported CouchDB release that includes this fix, such as: • 1.0.4 • 1.1.2 • 1.2.1 • 1.3.x All listed releases have included a specific fix for the MochiWeb com- ponent. Work-Around Users may simply exclude any file-based web serving components directly within their configuration file, typically in local.ini. On a default CouchDB installation, this requires amending the httpd_global_han- dlers/favicon.ico and httpd_global_handlers/_utils lines within httpd_global_handlers: [httpd_global_handlers] favicon.ico = {couch_httpd_misc_handlers, handle_welcome_req, <<"Forbidden">>} _utils = {couch_httpd_misc_handlers, handle_welcome_req, <<"Forbidden">>} If additional handlers have been added, such as to support Adobes Flash crossdomain.xml files, these would also need to be excluded. Acknowledgement The issue was found and reported by Sriram Melkote to the upstream MochiWeb project. 
References

• https://github.com/melkote/mochiweb/commit/ac2bf

CVE-2012-5649: JSONP arbitrary code execution with Adobe Flash

Date       14.01.2013
Affected   Releases up to and including 1.0.3, 1.1.1, and 1.2.0 are vulnerable, if administrators have enabled JSONP.
Severity   Moderate
Vendor     The Apache Software Foundation

Description

A hand-crafted JSONP callback and response can be used to run arbitrary code inside client-side browsers via Adobe Flash.

Mitigation

Upgrade to a supported CouchDB release that includes this fix, such as:

• 1.0.4
• 1.1.2
• 1.2.1
• 1.3.x

All listed releases have included a specific fix.

Work-Around

Disable JSONP, or simply do not enable it, since it is disabled by default.

CVE-2012-5650: DOM based Cross-Site Scripting via Futon UI

Date       14.01.2013
Affected   Apache CouchDB releases up to and including 1.0.3, 1.1.1, and 1.2.0 are vulnerable.
Severity   Moderate
Vendor     The Apache Software Foundation

Description

Query parameters passed into the browser-based test suite are not sanitised, and can be used to load external resources. An attacker may execute JavaScript code in the browser, using the context of the remote user.

Mitigation

Upgrade to a supported CouchDB release that includes this fix, such as:

• 1.0.4
• 1.1.2
• 1.2.1
• 1.3.x

All listed releases have included a specific fix.

Work-Around

Disable the Futon user interface completely, by adapting local.ini and restarting CouchDB:

    [httpd_global_handlers]
    _utils = {couch_httpd_misc_handlers, handle_welcome_req, <<"Forbidden">>}

Or by removing the UI test suite components:

• share/www/verify_install.html
• share/www/couch_tests.html
• share/www/custom_test.html

Acknowledgement

This vulnerability was discovered & reported to the Apache Software Foundation by Frederik Braun.

CVE-2014-2668: DoS (CPU and memory consumption) via the count parameter to /_uuids

Date       26.03.2014
Affected   Apache CouchDB releases up to and including 1.3.1, 1.4.0, and 1.5.0 are vulnerable.
Severity   Moderate
Vendor     The Apache Software Foundation

Description

The count query parameter of the /_uuids resource accepts unreasonably large numeric values, which leads to exhaustion of server resources (CPU and memory) and, as a result, to denial of service.

Mitigation

Upgrade to a supported CouchDB release that includes this fix, such as:

• 1.5.1
• 1.6.0

All listed releases have included a specific fix.

Work-Around

Disable the /_uuids handler completely, by adapting local.ini and restarting CouchDB:

    [httpd_global_handlers]
    _uuids =

CVE-2017-12635: Apache CouchDB Remote Privilege Escalation

Date       14.11.2017
Affected   All Versions of Apache CouchDB
Severity   Critical
Vendor     The Apache Software Foundation

Description

Due to differences in CouchDB's Erlang-based JSON parser and JavaScript-based JSON parser, it is possible to submit _users documents with duplicate keys for roles used for access control within the database, including the special case _admin role, which denotes administrative users. In combination with CVE-2017-12636 (Remote Code Execution), this can be used to give non-admin users access to arbitrary shell commands on the server as the database system user.

Mitigation

All users should upgrade to CouchDB 1.7.1 or 2.1.1.

Upgrades from previous 1.x and 2.x versions in the same series should be seamless.

Users on earlier versions, or users upgrading from 1.x to 2.x, should consult the upgrade notes.
Example

Because of the parser differences, if two roles keys are present in the JSON, the second one is used to authorise the document write, but the first roles key is used for subsequent authorisation of the newly created user. By design, users cannot assign themselves roles. The vulnerability allows non-admin users to give themselves admin privileges.

We addressed this issue by updating the way CouchDB parses JSON in Erlang, mimicking the JavaScript behaviour of picking the last key, if duplicates exist.

Credit

This issue was discovered by Max Justicz.

CVE-2017-12636: Apache CouchDB Remote Code Execution

Date       14.11.2017
Affected   All Versions of Apache CouchDB
Severity   Critical
Vendor     The Apache Software Foundation

Description

CouchDB administrative users can configure the database server via HTTP(S). Some of the configuration options include paths for operating system-level binaries that are subsequently launched by CouchDB. This allows a CouchDB admin user to execute arbitrary shell commands as the CouchDB user, including downloading and executing scripts from the public internet.

Mitigation

All users should upgrade to CouchDB 1.7.1 or 2.1.1.

Upgrades from previous 1.x and 2.x versions in the same series should be seamless.

Users on earlier versions, or users upgrading from 1.x to 2.x, should consult the upgrade notes.

Credit

This issue was discovered by Joan Touzet of the CouchDB Security team during the investigation of CVE-2017-12635.
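As a concrete reference point for the duplicate-key behaviour described in the Example for CVE-2017-12635 above: JavaScript's JSON.parse keeps the last occurrence of a repeated key, while a parser that keeps the first occurrence sees a different value for the same body. The snippet below only demonstrates the JavaScript side of that asymmetry; the field values are illustrative.

    // JSON.parse keeps the LAST occurrence of a duplicated key, so the
    // JavaScript view of this body has an empty roles list; a parser that
    // keeps the FIRST occurrence would instead see ["_admin"].
    var body = '{"name": "eve", "roles": ["_admin"], "roles": []}';
    console.log(JSON.parse(body).roles);  // prints []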
These vulnerabilities were fixed and disclosed in the following CVE reports:

• CVE-2018-11769: Apache CouchDB Remote Code Execution
• CVE-2018-8007: Apache CouchDB Remote Code Execution
• CVE-2017-12636: Apache CouchDB Remote Code Execution
• CVE-2017-12635: Apache CouchDB Remote Privilege Escalation

Rather than waiting for new vulnerabilities to be discovered, and fixing them as they come up, the CouchDB development team decided to make changes to avoid this entire class of vulnerabilities.

With CouchDB version 2.3.0, CouchDB can no longer configure key components at runtime. While some flexibility is needed for speciality configurations of CouchDB, this configuration was moved from runtime to start-up time, and as such it now requires shell access to the CouchDB server. This closes all future paths for vulnerabilities of this type.

Mitigation

All users should upgrade to CouchDB 2.3.0.

Upgrades from previous 2.x versions in the same series should be seamless.

Users on earlier versions should consult the upgrade notes.

Credit

This issue was discovered by the Apple Information Security team.

CVE-2018-8007: Apache CouchDB Remote Code Execution

Date       30.04.2018
Affected   All Versions of Apache CouchDB
Severity   Low
Vendor     The Apache Software Foundation

Description

CouchDB administrative users can configure the database server via HTTP(S). Due to insufficient validation of administrator-supplied configuration settings via the HTTP API, it is possible for a CouchDB administrator user to escalate their privileges to those of the operating system's user that CouchDB runs under, by bypassing the blacklist of configuration settings that are not allowed to be modified via the HTTP API.

This privilege escalation effectively allows a CouchDB admin user to gain arbitrary remote code execution, bypassing the mitigation for CVE-2017-12636.

Mitigation

All users should upgrade to CouchDB 1.7.2 or 2.1.2.

Upgrades from previous 1.x and 2.x versions in the same series should be seamless.

Users on earlier versions, or users upgrading from 1.x to 2.x, should consult the upgrade notes.

Credit

This issue was discovered by Francesco Oddo of MDSec Labs.

CVE-2020-1955: Apache CouchDB Remote Privilege Escalation

Date       19.05.2020
Affected   3.0.0
Severity   Medium
Vendor     The Apache Software Foundation

Description

CouchDB version 3.0.0 shipped with a new configuration setting that governs access control to the entire database server, called require_valid_user_except_for_up. It was meant as an extension to the long-standing setting require_valid_user, which requires that any and all requests to CouchDB be made with valid credentials, effectively forbidding any anonymous requests.

The new require_valid_user_except_for_up is an off-by-default setting that was meant to allow requiring valid credentials for all endpoints except for the /_up endpoint.

However, an implementation error meant that, when the setting was enabled, credentials were not enforced on any endpoint. CouchDB versions 3.0.1 and 3.1.0 fix this issue.

Mitigation

Users who have not enabled require_valid_user_except_for_up are not affected.

Users who have it enabled can either disable it again, or upgrade to CouchDB version 3.0.1 or 3.1.0.

Credit

This issue was discovered by Stefan Klein.
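For reference, the settings discussed for CVE-2020-1955 live in the server configuration files. The sketch below is illustrative only: the option names are taken from the advisory text above, but the [chttpd] section placement is an assumption based on current CouchDB documentation, so verify it against the documentation for your release before relying on it.

    [chttpd]
    ; Long-standing option: require valid credentials for every request.
    require_valid_user = true
    ; 3.0.x variant that exempts only /_up; broken in 3.0.0 (CVE-2020-1955),
    ; fixed in 3.0.1 and 3.1.0.
    ;require_valid_user_except_for_up = true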
CVE-2021-38295: Apache CouchDB Privilege Escalation

Date       12.10.2021
Affected   3.1.1 and below
Severity   Low
Vendor     The Apache Software Foundation

Description

A malicious user with permission to create documents in a database is able to attach an HTML attachment to a document. If a CouchDB admin opens that attachment in a browser, e.g. via the CouchDB admin interface Fauxton, any JavaScript code embedded in that HTML attachment will be executed within the security context of that admin. A similar route is available with the already deprecated _show and _list functionality.

This privilege escalation vulnerability allows an attacker to add or remove data in any database or make configuration changes.

Mitigation

CouchDB 3.2.0 and onwards adds Content-Security-Policy headers for all attachment, _show and _list requests. This breaks certain niche use-cases, and there are configuration options to restore the previous behaviour for those who need it.

CouchDB 3.1.2 defaults to the previous behaviour, but adds configuration options to turn Content-Security-Policy headers on for all affected requests.

Credit

This issue was identified by Cory Sabol of Secure Ideas.

CVE-2022-24706: Apache CouchDB Remote Privilege Escalation

Date       25.04.2022
Affected   3.2.1 and below
Severity   Critical
Vendor     The Apache Software Foundation

Description

An attacker can access an improperly secured default installation without authenticating and gain admin privileges.

1. CouchDB opens a random network port, bound to all available interfaces, in anticipation of clustered operation and/or runtime introspection. A utility process called epmd advertises that random port to the network. epmd itself listens on a fixed port.

2. CouchDB packaging previously chose a default cookie value for single-node as well as clustered installations. That cookie authenticates any communication between Erlang nodes.

The CouchDB documentation has always made recommendations for properly securing an installation, but not all users follow the advice.

We recommend a firewall in front of all CouchDB installations. The full CouchDB API is available on registered port 5984, and this is the only port that needs to be exposed for a single-node install. Installations that do not expose the separate distribution port to external access are not vulnerable.

Mitigation

CouchDB 3.2.2 and onwards will refuse to start with the former default Erlang cookie value of monster. Installations that upgrade to these versions are forced to choose a different value.

In addition, all binary packages have been updated to bind epmd as well as the CouchDB distribution port to 127.0.0.1 and/or ::1, respectively.

Credit

This issue was identified by Alex Vandiver.

CVE-2023-26268: Apache CouchDB: Information sharing via couchjs processes

Date       02.05.2023
Affected   3.3.1 and below, 3.2.2 and below
Severity   Medium
Vendor     The Apache Software Foundation

Description

Design documents with matching document IDs, from databases on the same cluster, may share a mutable JavaScript environment when using these design document functions:

• validate_doc_update
• list
• filter
• filter views (using view functions as filters)
• rewrite
• update

This doesn't affect map/reduce or search (Dreyfus) index functions.

Mitigation

CouchDB 3.3.2 and 3.2.3 and onwards match JavaScript execution processes by database name in addition to design document ID when processing the affected design document functions.
Workarounds

Avoid using design documents from untrusted sources which may attempt to cache or store data in the JavaScript environment.

Credit

This issue was identified by Nick Vatamaniuc.

CVE-2023-45725: Apache CouchDB: Privilege Escalation Using Design Documents

Date       12.12.2023
Affected   3.3.2 and below
Severity   Medium
Vendor     The Apache Software Foundation

Description

Design document functions which receive a user HTTP request object may expose authorization or session cookie headers of the user who accesses the document.

These design document functions are:

• list
• show
• rewrite
• update

An attacker can leak the session component using an HTML-like output, insert the session as an external resource (such as an image), or store the credential in a _local document with an update function.

For the attack to succeed the attacker has to be able to insert the design documents into the database, then manipulate a user into accessing a function from that design document.

Mitigation

CouchDB 3.3.3 scrubs the sensitive headers from HTTP request objects passed to the query server execution environment. For versions older than 3.3.3, this patch applied to the loop.js file would also mitigate the issue:

    diff --git a/share/server/loop.js b/share/server/loop.js
    --- a/share/server/loop.js
    +++ b/share/server/loop.js
    @@ -49,6 +49,20 @@ function create_nouveau_sandbox() {
       return sandbox;
     }

    +function scrubReq(args) {
    +  var req = args.pop()
    +  if (req.method && req.headers && req.peer && req.userCtx) {
    +    delete req.cookie
    +    for (var p in req.headers) {
    +      if (req.headers.hasOwnProperty(p) && ["authorization", "cookie"].indexOf(p.toLowerCase()) !== -1) {
    +        delete req.headers[p]
    +      }
    +    }
    +  }
    +  args.push(req)
    +  return args
    +}
    +
     // Commands are in the form of json arrays:
     // ["commandname",..optional args...]\n
     //
    @@ -85,7 +99,7 @@ var DDoc = (function() {
           var funPath = args.shift();
           var cmd = funPath[0];
           // the first member of the fun path determines the type of operation
    -      var funArgs = args.shift();
    +      var funArgs = scrubReq(args.shift());
           if (ddoc_dispatch[cmd]) {
             // get the function, call the command with it
             var point = ddoc;

Workarounds

Avoid using design documents from untrusted sources which may attempt to access or manipulate the request object's headers.

Credit

This issue was found by Natan Nehorai and reported by Or Peles from the JFrog Vulnerability Research Team.

It was also independently found by Richard Ellis and Mike Rhodes from IBM/Cloudant.

REPORTING NEW SECURITY PROBLEMS WITH APACHE COUCHDB

The Apache Software Foundation takes a very active stance in eliminating security problems and denial of service attacks against Apache CouchDB.

We strongly encourage folks to report such problems to our private security mailing list first, before disclosing them in a public forum.

Please note that the security mailing list should only be used for reporting undisclosed security vulnerabilities in Apache CouchDB and managing the process of fixing such vulnerabilities. We cannot accept regular bug reports or other queries at this address. All mail sent to this address that does not relate to an undisclosed security problem in the Apache CouchDB source code will be ignored.

If you need to report a bug that isn't an undisclosed security vulnerability, please use the bug reporting page.
Questions about: • How to configure CouchDB securely • If a vulnerability applies to your particular application • Obtaining further information on a published vulnerability • Availability of patches and/or new releases should be address to the users mailing list. Please see the mailing lists page for details of how to subscribe. The private security mailing address is: security@couchdb.apache.org Please read how the Apache Software Foundation handles security reports to know what to expect. Note that all networked servers are subject to denial of service at- tacks, and we cannot promise magic workarounds to generic problems (such as a client streaming lots of data to your server, or re-request- ing the same URL repeatedly). In general our philosophy is to avoid any attacks which can cause the server to consume resources in a non-linear relationship to the size of inputs. ABOUT COUCHDB DOCUMENTATION License Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. 
For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. 
You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) 
The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright 2020 The Apache Foundation Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. CONTRIBUTING TO THIS DOCUMENTATION The documentation lives in its own source tree. Well start by forking and cloning the CouchDB documentation GitHub mirror. That will allow us to send the contribution to CouchDB with a pull request. If you dont have a GitHub account yet, it is a good time to get one, they are free. If you dont want to use GitHub, there are alternate ways to contributing back, that well cover next time. Go to https://github.com/apache/couchdb and click the fork button in the top right. This will create a fork of CouchDB in your GitHub ac- count. If your account is username, your fork lives at - https://github.com/username/couchdb. In the header, it tells me my GitHub Clone URL. We need to copy that and start a terminal: $ git clone https://github.com/username/couchdb.git $ cd couchdb/src/docs $ subl . Im opening the whole CouchDB documentation source tree in my favourite editor. It gives me the usual directory listing: ebin/ ext/ .git/ .gitignore images/ LICENSE make.bat Makefile NOTICE rebar.config src/ static/ templates/ themes/ .travis.yml The documentation sources live in src/docs/src, you can safely ignore all the other files and directories. First we should determine where we want to document this inside the documentation. We can look through http://docs.couchdb.org/en/latest/ for inspiration. The JSON Structure Reference looks like a fine place to write this up. The current state includes mostly tables describing the JSON structure (after all, thats the title of this chapter), but some prose about the number representation cant hurt. For future reference, since the topic in the thread includes views and different encoding in views (as op- posed to the storage engine), we should remember to make a note in the views documentation as well, but well leave this for later. Lets try and find the source file that builds the file - http://docs.couchdb.org/en/latest/json-structure.html we are in luck, under share/doc/src we find the file json-structure.rst. That looks promising. .rst stands for ReStructured Text (see - http://thomas-cokelaer.info/tutorials/sphinx/rest_syntax.html for a markup reference), which is an ASCII format for writing documents, doc- umentation in this case. Lets have a look and open it. We see ASCII tables with some additional formatting, all looking like the final HTML. So far so easy. For now, lets just add to the bottom of this. We can worry about organising this better later. We start by adding a new headline: Number Handling =============== Now we paste in the rest of the main email of the thread. It is mostly text, but it includes some code listings. Lets mark them up. Well turn: ejson:encode(ejson:decode(<<"1.1">>)). 
<<"1.1000000000000000888">> Into: .. code-block:: erlang ejson:encode(ejson:decode(<<"1.1">>)). <<"1.1000000000000000888">> And we follow along with the other code samples. We turn: Spidermonkey $ js -h 2>&1 | head -n 1 JavaScript-C 1.8.5 2011-03-31 $ js js> JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) "1.0123456789012346" js> var f = JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) js> JSON.stringify(JSON.parse(f)) "1.0123456789012346" into: Spidermonkey:: $ js -h 2>&1 | head -n 1 JavaScript-C 1.8.5 2011-03-31 $ js js> JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) "1.0123456789012346" js> var f = JSON.stringify(JSON.parse("1.01234567890123456789012345678901234567890")) js> JSON.stringify(JSON.parse(f)) "1.0123456789012346" And then follow all the other ones. I cleaned up the text a little but to make it sound more like a docu- mentation entry as opposed to a post on a mailing list. The next step would be to validate that we got all the markup right. Ill leave this for later. For now well contribute our change back to CouchDB. First, we commit our changes: $ > git commit -am 'document number encoding' [main a84b2cf] document number encoding 1 file changed, 199 insertions(+) Then we push the commit to our CouchDB fork: $ git push origin main Next, we go back to our GitHub page https://github.com/username/couchdb and click the Pull Request button. Fill in the description with some- thing useful and hit the Send Pull Request button. And were done! Style Guidelines for this Documentation When you make a change to the documentation, you should make sure that you follow the style. Look through some files and you will see that the style is quite straightforward. If you do not know if your formatting is in compliance with the style, ask yourself the following question: Is it needed for correct syntax? If the answer is No. then it is probably not. These guidelines strive be simple, without contradictions and excep- tions. The best style is the one that is followed because it seems to be the natural way of doing it. The guidelines The guidelines are in descending priority. 1. Syntax • Correct syntax is always more important than style. This includes configuration files, HTML responses, etc. 2. Encoding • All files are UTF-8. 3. Line ending • All lines end with \n. • No trailing whitespace. 4. Line length • The maximum line length is 90 characters. 5. Links • All internal links are relative. 6. Indentation • 4 spaces. 7. Titles • The highest level titles in a file is over and underlined with =. • Lower level titles are underlined with the following characters in descending order: = - ^ * + # ` : . " ~ _ • Over and underline match the title length. 8. Empty lines • No empty line at the end of the file. • Lists may separated each item with an empty line. AUTHOR Author name not set COPYRIGHT 2025, Apache Software Foundation. CouchDB is a registered trademark of the Apache Software Foundation 3.4 Mar 13, 2025 APACHECOUCHDB(1)