/*
 * Copyright 2015-2021 Howard Chu, Symas Corp.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in the file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */
/** @page starting Getting Started

LMDB is compact, fast, powerful, and robust and implements a simplified
variant of the BerkeleyDB (BDB) API. (BDB is also very powerful, and verbosely
documented in its own right.) After reading this page, the main
\ref mdb documentation should make sense. Thanks to Bert Hubert
for creating the
<a href="https://github.com/ahupowerdns/ahutils/blob/master/lmdb-semantics.md">
initial version</a> of this writeup.

Everything starts with an environment, created by #mdb_env_create().
Once created, this environment must also be opened with #mdb_env_open().

#mdb_env_open() gets passed a name which is interpreted as a directory
path. Note that this directory must already exist; it is not created
for you. Within that directory, a lock file and a storage file will be
generated. If you don't want to use a directory, you can pass the
#MDB_NOSUBDIR option, in which case the path you provided is used
directly as the data file, and another file with a "-lock" suffix
added will be used for the lock file.

Once the environment is open, a transaction can be created within it
using #mdb_txn_begin(). Transactions may be read-write or read-only,
and read-write transactions may be nested. A transaction must only
be used by one thread at a time. Transactions are always required,
even for read-only access. The transaction provides a consistent
view of the data.

Once a transaction has been created, a database can be opened within it
using #mdb_dbi_open(). If only one database will ever be used in the
environment, a NULL can be passed as the database name. For named
databases, the #MDB_CREATE flag must be used to create the database
if it doesn't already exist. Also, #mdb_env_set_maxdbs() must be
called after #mdb_env_create() and before #mdb_env_open() to set the
maximum number of named databases you want to support.

Note: a single transaction can open multiple databases. Generally
databases should only be opened once, by the first transaction in
the process. After the first transaction completes, the database
handles can freely be used by all subsequent transactions.

Within a transaction, #mdb_get() and #mdb_put() can retrieve and store
single key/value pairs if that is all you need to do (but see \ref Cursors
below if you want to do more).

A key/value pair is expressed as two #MDB_val structures. This struct
has two fields, \c mv_size and \c mv_data. \c mv_data is a \c void pointer
to an array of \c mv_size bytes.

Because LMDB is very efficient (and usually zero-copy), the data returned
in an #MDB_val structure may be memory-mapped straight from disk. In
other words <b>look but do not touch</b> (or free() for that matter).
Once a transaction is closed, the values can no longer be used, so
make a copy if you need to keep them after that.

@section Cursors Cursors

To do more powerful things, we must use a cursor.

Within the transaction, a cursor can be created with #mdb_cursor_open().
With this cursor we can store/retrieve/delete (multiple) values using
#mdb_cursor_get(), #mdb_cursor_put(), and #mdb_cursor_del().

#mdb_cursor_get() positions itself depending on the cursor operation
requested, and for some operations, on the supplied key.
For example, to list all key/value pairs in a database, use operation
#MDB_FIRST for the first call to #mdb_cursor_get(), and #MDB_NEXT on
subsequent calls, until the end is hit.

To retrieve all keys starting from a specified key value, use #MDB_SET.
For more cursor operations, see the \ref mdb docs.

When using #mdb_cursor_put(), either the function will position the
cursor for you based on the \b key, or you can use operation
#MDB_CURRENT to use the current position of the cursor. Note that
\b key must then match the current position's key.

@subsection summary Summarizing the Opening

So we have a cursor in a transaction which opened a database in an
environment which is opened from a filesystem after it was
separately created.

Or, we create an environment, open it from a filesystem, create a
transaction within it, open a database within that transaction,
and create a cursor within all of the above.

Got it?

@section thrproc Threads and Processes

LMDB uses POSIX locks on files, and these locks have issues if one
process opens a file multiple times. Because of this, do not
#mdb_env_open() a file multiple times from a single process. Instead,
share the LMDB environment that has opened the file across all threads.
Otherwise, if a single process opens the same environment multiple times,
closing it once will remove all the locks held on it, and the other
instances will be vulnerable to corruption from other processes.

Also note that a transaction is tied to one thread by default using
Thread Local Storage. If you want to pass read-only transactions across
threads, you can use the #MDB_NOTLS option on the environment.

@section txns Transactions, Rollbacks, etc.

To actually get anything done, a transaction must be committed using
#mdb_txn_commit().
Alternatively, all of a transaction's operations
can be discarded using #mdb_txn_abort(). In a read-only transaction,
any cursors will \b not automatically be freed. In a read-write
transaction, all cursors will be freed and must not be used again.

For read-only transactions, obviously there is nothing to commit to
storage. The transaction still must eventually be aborted to close
any database handle(s) opened in it, or committed to keep the
database handles around for reuse in new transactions.

In addition, as long as a transaction is open, a consistent view of
the database is kept alive, which requires storage. A read-only
transaction should therefore be terminated (committed or aborted)
as soon as this consistent view is no longer needed (but see below
for an optimization).

There can be multiple simultaneously active read-only transactions
but only one that can write. Once a single read-write transaction
is opened, all further attempts to begin one will block until the
first one is committed or aborted. This has no effect on read-only
transactions, however, and they may continue to be opened at any time.

@section dupkeys Duplicate Keys

#mdb_get() has no support, and #mdb_put() only limited support,
for multiple key/value pairs with identical keys. If there are multiple
values for a key, #mdb_get() will only return the first value.

When multiple values for one key are required, pass the #MDB_DUPSORT
flag to #mdb_dbi_open(). In an #MDB_DUPSORT database, by default
#mdb_put() will not replace the value for a key if the key existed
already. Instead it will add the new value to the key. In addition,
#mdb_del() will pay attention to the value field too, allowing for
specific values of a key to be deleted.

Finally, additional cursor operations become available for
traversing through and retrieving duplicate values.

@section optim Some Optimization

If you frequently begin and abort read-only transactions, as an
optimization, it is possible to reset and renew a transaction instead.

#mdb_txn_reset() releases any old copies of data kept around for
a read-only transaction. To reuse this reset transaction, call
#mdb_txn_renew() on it. Any cursors in this transaction must also
be renewed using #mdb_cursor_renew().

Note that #mdb_txn_reset() is similar to #mdb_txn_abort() and will
close any databases you opened within the transaction.

To permanently free a transaction, reset or not, use #mdb_txn_abort().

@section cleanup Cleaning Up

For a read-only transaction, any cursors created within it must
be closed using #mdb_cursor_close().

It is very rarely necessary to close a database handle, and in
general they should just be left open.

@section onward The Full API

The full \ref mdb documentation lists further details, like how to:

  \li size a database (the default limits are intentionally small)
  \li drop and clean a database
  \li detect and report errors
  \li optimize (bulk) loading speed
  \li (temporarily) reduce robustness to gain even more speed
  \li gather statistics about the database
  \li define custom sort orders

*/