# Copyright 1999, The OpenLDAP Foundation, All Rights Reserved.
# COPYING RESTRICTIONS APPLY, see COPYRIGHT.
H1: Database Creation and Maintenance Tools

This section tells you how to create a slapd database from
scratch, and how to troubleshoot if you run into
problems. There are two ways to create a database. First,
you can create the database on-line using LDAP. With this
method, you simply start up slapd and add entries using the
LDAP client of your choice. This method is fine for relatively
small databases (a few hundred or thousand entries,
depending on your requirements).

The second method of database creation is to do it off-line,
using the index generation tools. This method is best if you
have many thousands of entries to create, which would take
an unacceptably long time using the LDAP method, or if you
want to ensure the database is not accessed while it is
being created.

23 H2: Creating a database over LDAP
With this method, you use the LDAP client of your choice
(e.g., the ldapadd(1) tool) to add entries, just as you would
once the database is created. Be sure to set the
following configuration options before starting slapd:

E: suffix <dn>

As described in the preceding section, this option says what
entries are to be held by this database. You should set this
to the DN of the root of the subtree you are trying to create.
For example:

39 E: suffix "dc=OpenLDAP, dc=org"
You should also specify the directory where the index
files are to be created:

E: directory <directory>

For example:

48 E: directory /usr/local/openldap/slapd
Next, make sure you can connect to slapd as a user
with permission to add entries. This is done
through the following two options in the database definition:

E: rootdn <dn>
E: rootpw <passwd>

These options specify a DN and password that can be used
to authenticate as the "superuser" entry of the database (i.e.,
the entry allowed to do anything). The DN and password
specified here will always work, regardless of whether the
entry named actually exists or has the password given. This
solves the chicken-and-egg problem of how to authenticate
and add entries before any entries exist.

65 Finally, you should make sure that the database definition
66 contains the index definitions you want:
68 E: index {<attrlist> | default} [pres,eq,approx,sub,none]
For example, to index the cn, sn, uid and objectclass
attributes, the following index configuration lines could be
used:

E: index cn,sn,uid
75 E: index objectclass pres,eq
See Section 4 on the configuration file for more details on
this option. Once you have configured things to your liking,
start up slapd, connect with your LDAP client, and start
adding entries. For example, to add the organizational entry
followed by a Postmaster entry using the {{I:ldapadd}} tool, you
could create a file called {{EX:/tmp/newentry}} with the contents:

86 E: dc=OpenLDAP, dc=org
87 E: objectClass=dcObject
88 E: objectClass=organization
92 E: o=OpenLDAP Foundation
93 E: description=The OpenLDAP Foundation
94 E: description=The OpenLDAP Project
96 E: cn=Postmaster, dc=OpenLDAP, dc=org
97 E: objectClass=organizationalRole
99 E: description=OpenLDAP Postmaster <Postmaster@OpenLDAP.org>
and then use a command like this to actually create the
entries:

104 E: ldapadd -f /tmp/newentry -D \
105 "cn=Manager, dc=OpenLDAP, dc=org" -w secret
The above command assumes that you have set {{EX: rootdn}} to
"cn=Manager, dc=OpenLDAP, dc=org" and {{EX: rootpw}}
to "secret".

111 H2: Creating a database off-line
The second method of database creation is to do it off-line,
using the index generation tools described below. This
method is best if you have many thousands of entries to
create, which would take an unacceptably long time using
the LDAP method described above. These tools read the
slapd configuration file and an input file containing a text
representation of the entries to add. They produce the LDBM
index files directly. There are several important configuration
options you will want to be sure to set in the config file
database definition first:

E: suffix <dn>

As described in the preceding section, this option says what
entries are to be held by this database. You should set this
to the DN of the root of the subtree you are trying to create.
For example:

131 E: suffix "dc=OpenLDAP, dc=org"
You should also specify the directory where the index
files are to be created:

E: directory <directory>

For example:

140 E: directory /usr/local/var/openldap
Next, you probably want to increase the size of the in-core
cache used by each open index file. For best performance
during index creation, the entire index should fit in memory. If
your data is too big for this, or your memory too small, you
can still set the cache size generously and let the paging
system do the work. The size, in bytes, is set with the
following option:

E: dbcachesize <integer>

For example:

153 E: dbcachesize 50000000
This would create a 50 MB cache, which is fairly large (at
U-M, our database has about 125K entries, and our biggest
index file is about 45 MB). Experiment with this number,
and with the degree of parallelism (explained below), to see
what works best for your system. Remember to turn this number
back down once your index files are created and before you
run slapd.

Finally, you need to specify which indexes you want to build.
This is done with one or more index options:

E: index {<attrlist> | default} [pres,eq,approx,sub,none]

For example:

170 E: index cn,sn,uid pres,eq,approx
171 E: index default none
173 This would create presence, equality and approximate
174 indexes for the cn, sn, and uid attributes, and no indexes for
175 any other attributes. See the configuration file section for
176 more information on this option.
178 H3: The {{EX: ldif2ldbm}} program
180 Once you've configured things to your liking, you create the
181 indexes by running the ldif2ldbm program:
183 E: ldif2ldbm -i <inputfile> -f <slapdconfigfile>
184 E: [-d <debuglevel>] [-j <integer>]
185 E: [-n <databasenumber>] [-e <etcdir>]
The arguments have the following meanings:

E: -i <inputfile>

Specifies the LDIF input file containing the entries to add in
text form (described below in Section 8.3).

194 E: -f <slapdconfigfile>
196 Specifies the slapd configuration file that tells where to
197 create the indexes, what indexes to create, etc.
E: -d <debuglevel>

Turn on debugging, as specified by {{EX: <debuglevel>}}. The
debug levels are the same as for slapd (see Section 6.1).

E: -j <integer>

An optional argument that specifies that at most {{EX: <integer>}}
processes should be started in parallel when building the
indexes. The default is 1. If set to a value greater than one,
{{I: ldif2ldbm}} will create at most that many subprocesses at a
time when building the indexes. A separate subprocess is
created to build each attribute index. Running these
processes in parallel can speed things up greatly, but
beware of creating too many processes, all competing for
memory and disk resources.

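The effect of the -j limit can be pictured with a small Python sketch. This is not how ldif2ldbm is implemented: a thread pool stands in for subprocess management, and fake_build_index is a hypothetical placeholder for one index-building subprocess. The sketch only shows the idea of one job per attribute index with a cap on how many run at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def build_all(attrs, jobs, build_index):
    """Run build_index(attr) for every attribute, at most `jobs` at a time."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # list() waits for every task and re-raises the first failure
        return list(pool.map(build_index, attrs))

# Demonstration: record the highest number of builds running at once.
_lock = threading.Lock()
_running = 0
peak = 0

def fake_build_index(attr):
    """Hypothetical stand-in for one index-building subprocess."""
    global _running, peak
    with _lock:
        _running += 1
        peak = max(peak, _running)
    time.sleep(0.01)  # pretend to do some work
    with _lock:
        _running -= 1
    return attr

built = build_all(["cn", "sn", "uid", "objectclass"], jobs=2,
                  build_index=fake_build_index)
```

With jobs=2, the recorded peak never exceeds two, no matter how many attributes are indexed.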
216 E: -n <databasenumber>
An optional argument that specifies the configuration file
database for which to build indexes. The first database listed
is "1", the second "2", etc. By default, the first ldbm database
in the configuration file is used.

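The numbering rule can be illustrated with a short Python sketch. It assumes that each database definition starts with a {{EX: database <type>}} line and that comment lines start with "#"; continuation lines and include files are ignored for simplicity. The function name select_database and the sample configuration are hypothetical, and the tools' real parsing may differ.

```python
def select_database(config_text, number=None):
    """Return (number, type) of the database a tool would work on.

    Databases are numbered from 1 in the order their `database` lines
    appear. With no number given, the first ldbm database is chosen.
    """
    databases = []
    for line in config_text.splitlines():
        words = line.split()
        # Skip blank lines and comments (assumed to start with '#')
        if not words or words[0].startswith("#"):
            continue
        if words[0].lower() == "database" and len(words) > 1:
            databases.append(words[1].lower())
    if number is not None:
        if not 1 <= number <= len(databases):
            raise ValueError(f"no database number {number}")
        return number, databases[number - 1]
    for i, db_type in enumerate(databases, start=1):
        if db_type == "ldbm":
            return i, db_type
    raise ValueError("no ldbm database in configuration")

# Hypothetical configuration fragment used only for illustration
sample_config = """\
# two database definitions
database shell
suffix "dc=example"
database ldbm
suffix "dc=OpenLDAP, dc=org"
"""
```

Here select_database(sample_config) picks database 2, the first ldbm database, while select_database(sample_config, 1) picks the shell database explicitly.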
E: -e <etcdir>

An optional argument that specifies the directory where
{{EX: ldif2ldbm}} can find the other database conversion tools it
needs to execute ({{EX: ldif2index}} and friends). The default is the
installation {{EX: ETCDIR}}.

230 The next sections describe the programs invoked by
231 {{I: ldif2ldbm}} when it is building indexes. Normally, these
232 programs are invoked for you, but occasionally you may
233 want to invoke them yourself.
237 H3: The {{EX: ldif2index}} program
Sometimes it may be necessary to create a new attribute
index file without disturbing the rest of the database. This is
possible using the {{EX: ldif2index}} program. {{EX: ldif2index}} is invoked
like this:

244 E: ldif2index -i <inputfile> -f <slapdconfigfile>
245 E: [-d <debuglevel>] [-n <databasenumber>] <attr>
The -i, -f, -d, and -n options are the same as for the
{{I: ldif2ldbm}} program. {{EX: <attr>}} is the attribute to build an index for.
Which indexes are built (e.g., equality, substring, etc.) is
controlled by the corresponding index line in the slapd
configuration file.

253 You can use the ldbmcat program to create a suitable LDIF
254 input file from an existing LDBM database.
258 H3: The {{EX: ldif2id2entry}} program
The {{EX: ldif2id2entry}} program is normally invoked from {{EX: ldif2ldbm}}.
It is used to convert an LDIF text file into an {{EX: id2entry}} index.
It is unlikely that you would need to invoke it yourself, but if
you do, it works like this:

265 E: ldif2id2entry -i <inputfile> -f <slapdconfigfile>
266 E: [-d <debuglevel>] [-n <databasenumber>]
268 The arguments are the same as for the {{EX: ldif2ldbm}} program.
272 H3: The {{EX: ldif2id2children}} program
The {{EX: ldif2id2children}} program is normally invoked from
{{EX: ldif2ldbm}}. It is used to convert an LDIF text file into
{{EX: id2children}} and {{EX: dn2id}} indexes. Occasionally, it may be
necessary to run this program yourself, for example if one of
these indexes has become corrupted. {{EX: ldif2id2children}} is
invoked like this:

281 E: ldif2id2children -i <inputfile> -f <slapdconfigfile>
282 E: [-d <debuglevel>] [-n <databasenumber>]
284 The arguments are the same as for the {{EX: ldif2ldbm}} program.
285 You can use the ldbmcat program to create a suitable LDIF
286 input file from an existing LDBM database.
290 H3: The {{EX: ldbmcat}} program
292 The {{EX: ldbmcat}} program is used to convert an {{EX: id2entry}} index
293 back into its LDIF text format. This can be useful when you
294 want to make a human-readable backup of your database,
295 or as an intermediate step in creating a new index using the
296 {{EX: ldif2index}} program. The program is invoked like this:
298 E: ldbmcat [-n] <filename>
300 where {{EX: <filename>}} is the name of the {{EX: id2entry}} index file. The
301 corresponding LDIF output is written to standard output.
The -n option can be used to prevent the printing of entry
IDs in the LDIF output. If you are creating an LDIF file for
use as input to {{EX: ldif2index}} or anything but {{EX: ldif2ldbm}}, you
should not use the -n option (because the entry IDs must
match those already in the id2entry file). If you are just
making a backup of your data, you can use the -n option to
omit the entry IDs.

313 H3: The {{EX: ldif}} program
315 The ldif program is used to convert arbitrary data values to
316 LDIF format. This can be useful when writing a program or
317 script to create the LDIF file you will feed into the ldif2ldbm
318 program, or when writing a SHELL backend. ldif takes an
319 attribute name as an argument, and reads the attribute
320 value(s) from standard input. It produces the LDIF formatted
321 attribute line(s) on standard output. The usage is:
323 E: ldif [-b] <attrname>
325 where {{EX: <attrname>}} is the name of the attribute. Without the
326 -b option, ldif considers each line of standard input to be a
327 separate value of the attribute.
The -b option can be used to force ldif to interpret its input
as a single raw binary value. This option is useful when
converting binary data such as a {{EX: jpegPhoto}} or {{EX: audio}}
attribute.

335 H2: The LDIF text entry format
The LDAP Data Interchange Format (LDIF) is used to
represent LDAP entries in a simple text format. The basic
form of an entry is:

E: [<id>]
E: dn: <distinguished name>
E: <attrtype>: <attrvalue>
E: <attrtype>: <attrvalue>

348 where {{EX: <id>}} is the optional entry ID (a positive decimal
349 number). Normally, you would not supply the {{EX: <id>}}, allowing
350 the database creation tools to do that for you. The ldbmcat
351 program, however, produces an LDIF format that includes
352 {{EX: <id>}} so that new indexes created will be consistent.
A line may be continued by starting the next line with a
single space or tab character. For example:

E: dn: cn=Barbara J Jensen, dc=Open
E:  LDAP, dc=org

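As an illustration of the continuation rule, the following Python sketch joins continued lines back into logical lines. The function name unfold and the sample input are chosen here for illustration; the sketch assumes the continuation marker is exactly one leading space or tab, which is dropped when the lines are joined.

```python
def unfold(lines):
    """Join LDIF continuation lines onto the line they continue."""
    logical = []
    for line in lines:
        if line[:1] in (" ", "\t") and logical:
            # Drop the single leading space or tab, keep the rest verbatim
            logical[-1] += line[1:]
        else:
            logical.append(line)
    return logical

folded = [
    "dn: cn=Barbara J Jensen, dc=Open",
    " LDAP, dc=org",
    "cn: Barbara J Jensen",
]
unfolded = unfold(folded)
```

After unfolding, the first two physical lines form the single logical line "dn: cn=Barbara J Jensen, dc=OpenLDAP, dc=org".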
Multiple attribute values are specified on separate lines.
For example:

E: cn: Barbara J Jensen
E: cn: Barbara Jensen

If an {{EX: <attrvalue>}} contains a non-printing character, or
begins with a space or a colon `:', the {{EX: <attrtype>}} is followed
by a double colon and the value is encoded in base 64
notation. For example, the value " begins with a space" would be
encoded like this:

370 E: cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=
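The encoding rule can be sketched in Python as follows. The function name ldif_line is hypothetical, and "non-printing" is taken here to mean any byte outside the printable ASCII range 0x20 to 0x7E, which is an assumption rather than the exact test the tools apply.

```python
import base64

def ldif_line(attrtype, value):
    """Format one attribute value (given as bytes) as an LDIF line.

    Plain values are written after a single colon. Values that start
    with a space or colon, or contain a non-printing byte, are base 64
    encoded and written after a double colon.
    """
    needs_b64 = (
        value[:1] in (b" ", b":")
        # Assumption: non-printing means outside printable ASCII
        or any(b < 0x20 or b > 0x7E for b in value)
    )
    if needs_b64:
        return f"{attrtype}:: {base64.b64encode(value).decode('ascii')}"
    return f"{attrtype}: {value.decode('ascii')}"
```

Calling ldif_line("cn", b" begins with a space") reproduces the encoded line shown above, while an ordinary value such as b"Barbara J Jensen" is written unchanged after a single colon.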
Multiple entries within the same LDIF file are separated by
blank lines. Here's an example of an LDIF file containing
three entries:

376 E: dn: cn=Barbara J Jensen, dc=OpenLDAP, dc=org
377 E: cn: Barbara J Jensen
379 E: objectclass: person
383 E: dn: cn=Bjorn J Jensen, dc=OpenLDAP, dc=org
384 E: cn: Bjorn J Jensen
386 E: objectclass: person
389 E: dn: cn=Jennifer J Jensen, dc=OpenLDAP, dc=org
390 E: cn: Jennifer J Jensen
391 E: cn: Jennifer Jensen
392 E: objectclass: person
394 E: jpegPhoto:: /9j/4AAQSkZJRgABAAAAAQABAAD/2wBDABALD
395 E: A4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQ
396 E: ERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVG
400 Notice that the {{EX: jpegPhoto}} in Jennifer Jensen's entry is
401 encoded using base 64. The {{EX: ldif}} program (described in
402 Section 8.2.6) can be used to produce the LDIF format.
404 Note: Trailing spaces are not trimmed from values in an
405 LDIF file. Nor are multiple internal spaces compressed. If
406 you don't want them in your data, don't put them there.
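To make these rules concrete, here is a minimal Python sketch of an LDIF reader. It is not the parser used by the slapd tools. It assumes that continuation lines have already been joined, that no entry begins with an entry ID line, and that exactly one separator space follows the colon; the name parse_ldif and the sample text are hypothetical.

```python
import base64

def parse_ldif(text):
    """Split LDIF text into entries: a list of (attrtype, value) lists."""
    entries, current = [], []
    for line in text.split("\n"):
        if line == "":
            # A blank line ends the current entry
            if current:
                entries.append(current)
                current = []
            continue
        attrtype, _, rest = line.partition(":")
        if rest.startswith(":"):
            # Double colon: the value is base 64 encoded
            value = base64.b64decode(rest[1:].strip())
        else:
            # Drop only the separator space; trailing spaces are kept
            value = rest[1:] if rest.startswith(" ") else rest
            value = value.encode("ascii")
        current.append((attrtype, value))
    if current:
        entries.append(current)
    return entries

sample_ldif = (
    "dn: cn=Bjorn J Jensen, dc=OpenLDAP, dc=org\n"
    "cn: Bjorn J Jensen  \n"
    "\n"
    "dn: cn=Jennifer J Jensen, dc=OpenLDAP, dc=org\n"
    "cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=\n"
)
entries = parse_ldif(sample_ldif)
```

The two trailing spaces after "Bjorn J Jensen" survive parsing, which is exactly the behaviour the note above warns about.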
409 H2: Converting from QUIPU EDB format to LDIF format
411 If you have directory data that is or was held in a QUIPU
412 DSA (available as part of the ISODE package), you will want
413 to convert the EDB files used by QUIPU into an LDIF file.
The edb2ldif program is provided to do most of the
conversion for you. Once you have an LDIF file, you should
follow the steps outlined in Section 8.2 above to build an
LDBM database for slapd.

421 H3: The {{EX: edb2ldif}} program
423 The edb2ldif program is invoked like this:
425 E: edb2ldif [-d] [-v] [-r] [-o] [-b <basedn>]
426 E: [-a <addvalsfile>] [-f <fileattrdir>]
427 E: [-i <ignoreattr...>] [<edbfile...>]
429 The LDIF data is written to standard output. The arguments
430 have the following meanings:
E: -d

This option enables some debugging output on standard
error.

E: -v

Enable verbose mode that writes status information to
standard error, such as which EDB file is being processed,
how many entries have been converted so far, etc.

E: -r

Recurse through child directories, processing all EDB files
found.

E: -o

Cause local .add file definitions to override the global add
file given with -a.

E: -b <basedn>

Specify the Distinguished Name that all EDB file entries
reside below.

E: -a <addvalsfile>

The LDIF information contained in this file will be appended
to each entry.

E: -f <fileattrdir>

Specify a single directory where all file-based attributes
(typically sounds and images) can be found. If this option is
not given, file attributes are assumed to be located in the
same directory as the EDB file that refers to them.

E: -i <ignoreattr>

Specify an attribute that should not be converted. You can
include as many -i flags as necessary.

E: <edbfile>

Specify a particular EDB file (or files) to read data from. By
default, the EDB.root (if it exists) and EDB files in the current
directory are used.

481 When {{EX: edb2ldif}} is invoked, it will also look for files named
482 .add in the directories where EDB files are found and append
483 the contents of the .add file to each entry. Typically, this
484 feature is used to include inherited attribute values (e.g.,
485 {{EX: objectClass}}) that do not appear in the EDB files.
489 H3: Step-by-step EDB to LDIF conversion
491 The basic steps to follow when converting your EDB format
492 data to an LDIF file are:
494 ^ Locate the directory at the top of the EDB file hierarchy
495 .that your QUIPU DSA masters. The EDB file located there
496 .should contain the entries for the first level of your
.organization or organizational unit. If you are using an
.indexed database with QUIPU, you may need to create EDB
.files from your index files (using the synctree or qb2edb
.tools).

503 +If you do not have a file named EDB.root in the same
504 .directory that contains your organizational or organizational
505 .unit entry, create it now by hand. Its contents should look
506 .something like this:
512 .{{EX: objectClass= top & organization & domainRelatedObject &\}}
513 .{{EX: quipuObject & quipuNonLeafObject}}
514 .{{EX: l= Redwood City, California}}
515 .{{EX: st= California}}
516 .{{EX: o=OpenLDAP Project & OpenLDAP Foundation & OpenLDAP}}
517 .{{EX: description=The OpenLDAP Project}}
518 .{{EX: associatedDomain= openldap.org}}
519 .{{EX: masterDSA= c=US@cn=Woolly Monkey}}
522 + (Optional) Create a global add file and/or local .add files to
523 .take care of adding any attribute values that do not appear in
524 .the EDB files. For example, if all entries in a particular EDB
525 .are person entries and you want to add the appropriate
526 .objectClass attribute value for them, create a file called .add
.in the same directory as the person EDB that contains the
.line:

530 .{{EX: objectClass: person }}
533 + Run the edb2ldif program to do the actual conversion.
534 .Make sure you are in the directory that contains the root of
535 .the EDB hierarchy (the one where the EDB.root file resides).
536 .Include a -b flag with a base DN one level above your
537 .organizational entry, and include -i flags to ignore any
538 .attributes that are not useful to slapd. E.g., the command:
540 .{{EX: edb2ldif -v -r -b "c=US" -i iattr -i acl -i xacl -i sacl}}
541 .{{EX: -i lacl -i masterDSA -i slaveDSA > ldif}}
543 .will convert the entire EDB hierarchy to LDIF format and
544 .write the result to a file named ldif. Some attributes that are
.not useful when running slapd are ignored. The EDB
.hierarchy is assumed to reside logically below the base DN
."c=US".

550 + Follow the steps outlined in section 8.2 above to produce
551 .an LDBM database from your new LDIF file.
555 H2: The ldbmtest program
Occasionally you may find it useful to look at the LDBM
database and index files directly (i.e., without going through
slapd). The {{EX: ldbmtest}} program is provided for this purpose. It
gives you raw access to the database itself. {{EX: ldbmtest}} should
be invoked like this:

563 E: ldbmtest [-d <debuglevel>] [-f <slapdconfigfile>]
The default configuration file in the {{EX: ETCDIR}} is used if you
don't supply one. By default, ldbmtest operates on the last
database listed in the config file. You can specify an
alternate database, or see the current database, with the
following commands:

571 E: b specify an alternate backend database
572 E: B print out the current backend database
574 The {{EX: b}} command will prompt you for the suffix associated with
575 the database you want. The database you select can be
576 viewed and modified using a set of two-letter commands.
577 The first letter selects the command function to perform.
578 Possible commands and their meanings are as follows.
580 E: l lookup (do not follow indirection)
581 E: L lookup (follow indirection)
582 E: t traverse and print keys and data
583 E: T traverse and print keys only
584 E: x delete an index item
585 E: e edit an index item
586 E: a add an index item
587 E: c create an index file
588 E: i insert an entry into an index item
590 The second letter indicates which index the command
591 applies to. The possible index selections are as follows.
593 E: c id2children index
596 E: f arbitrary file name
599 Each command may require additional arguments which
600 ldbmtest will prompt you for.
602 To exit {{EX: ldbmtest}}, type {{EX: control-D}} or {{EX: control-C}}.
604 Note that this is a very raw interface originally developed
605 when testing the database format. It is provided and
606 minimally documented here for interested parties, but it is not
607 meant to be used by the inexperienced. See the next section
608 for a brief description of the LDBM database format.
612 H2: The LDBM database format
614 In normal operation, it is not necessary for you to know much
615 about the LDBM database format. If you are going to use the
616 ldbmtest program to look at or alter the database, or if you
617 want a deeper understanding of how indexes are maintained,
618 some knowledge of how it works could be useful. This
619 section gives an overview of the database format and how
620 slapd makes use of it.
626 The LDBM database works by assigning a compact
627 four-byte unique identifier to each entry in the database. It
628 uses this identifier to refer to entries in indexes. The
629 database consists of one main index file, called id2entry,
630 which maps from an entry's unique identifier (EID) to a text
631 representation of the entry itself. Other index files are
632 maintained, for each indexed attribute for example, that map
633 values people are likely to search on to lists of EIDs.
635 Using this simple scheme, many LDAP queries can be
636 answered efficiently. For example, to answer a search for
637 entries with a surname of "Jensen", slapd would first consult
638 the surname attribute index, look up the value "Jensen" and
639 retrieve the corresponding list of EIDs. Next, slapd would
640 look up each EID in the id2entry index, retrieve the
641 corresponding entry, convert it from text to LDAP format, and
642 return it to the client.
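The two-step lookup described above can be sketched in Python with dictionaries standing in for the on-disk index files. The entry contents, EIDs, and function name are made up for illustration; key prefixes are left out, and the upper-casing of the search value is a simplifying assumption about how values are normalized.

```python
# Toy stand-ins for the index files; the real ones are on-disk LDBM files.
id2entry = {
    1: "dn: cn=Barbara J Jensen, dc=OpenLDAP, dc=org\nsn: Jensen\n",
    2: "dn: cn=Bjorn J Jensen, dc=OpenLDAP, dc=org\nsn: Jensen\n",
    3: "dn: cn=Postmaster, dc=OpenLDAP, dc=org\n",
}
# Attribute index for the surname attribute: search value -> list of EIDs
sn_index = {"JENSEN": [1, 2]}

def search_equality(attr_index, value):
    """Answer an equality search using an attribute index and id2entry."""
    eids = attr_index.get(value.upper(), [])   # step 1: value -> EIDs
    return [id2entry[eid] for eid in eids]     # step 2: EID -> entry text

results = search_equality(sn_index, "Jensen")
```

Only the two matching entries are read from id2entry; the Postmaster entry is never touched, which is why the scheme answers such queries efficiently.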
The following sections give a very brief overview of each
type of index and what it contains. For more detailed
information see the paper "An X.500 and LDAP Database:
Design and Implementation," available in PostScript format
from:

650 {{CMD[jump="ftp://terminator.rs.itd.umich.edu/ldap/papers/xldbm.ps"]ftp://terminator.rs.itd.umich.edu/ldap/papers/xldbm.ps}}
654 H3: Attribute index format
656 The LDBM backend will maintain one index file for each
657 attribute it is asked to index. Several sets of keys must
658 coexist in this file (e.g., keys for equality and approximate
659 equality), so the keys are prefixed with a character to ensure
uniqueness. The prefixes are given in the table below:

E: =    equality keys
E: ~    approximate equality keys
E: *    substring equality keys
E: \    continuation keys

667 Key values are also normalized (e.g., converted to upper
668 case for case ignore attributes). So, for example, to look up
669 the surname equality value in the example above using the
670 ldbmtest program, you would look up the value "{{EX: =JENSEN}}".
672 Substring indexes are maintained by generating all possible
673 N-character substrings for a value (N is 3 by default). These
674 substrings are then stored in the attribute index, prefixed by
675 "*". Additional anchors of "^" and "$" are added at the
676 beginning and end of words. So, for example the surname of
677 Jensen would cause the following keys to be entered in the
678 index: {{EX: ^JE, JEN, ENS, NSE, SEN, EN$}}.
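The key generation for a single word can be sketched in a few lines of Python. The function name substring_keys is hypothetical, the sketch handles one word at a time, and it returns the keys without the "*" prefix that is added when they are stored.

```python
def substring_keys(word, n=3):
    """Return the N-character substring keys for one word.

    The word is upper-cased and anchored with '^' and '$' before the
    substrings are taken. The '*' index prefix is not included.
    """
    anchored = "^" + word.upper() + "$"
    return [anchored[i:i + n] for i in range(len(anchored) - n + 1)]

jensen_keys = substring_keys("Jensen")
stored_keys = ["*" + key for key in jensen_keys]
```

For "Jensen" this yields the six keys listed above, from ^JE through EN$.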
680 Approximate values are handled in a similar way, with
681 phonetic codes being generated for each word in a value
682 and then stored in the index, prefixed by "~".
684 Large blocks in the index are split into smaller ones. The
685 smaller blocks are accessed through a level of indirection
686 provided by the original block. They are stored in the index
687 using the continuation key prefix of "\".
693 In addition to the {{EX: id2entry}} and attribute indexes, LDBM
694 maintains a number of other indexes, including the {{EX: dn2id}}
695 index and the {{EX: id2children}} index. These indexes provide the
696 mapping between a DN and the corresponding EID, and the
697 mapping between an EID and the EIDs of the corresponding
698 entry's children, respectively.
700 The {{EX: dn2id}} index stores normalized DNs as keys. The data
701 stored is the corresponding EID.
703 The {{EX: id2children}} index stores EIDs as keys. The data stored
704 is a list of EIDs, just as for the attribute indexes.
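The relationship between these two indexes can be sketched in Python. The normalization shown (upper case, no spaces around commas) is only a stand-in for whatever normalization slapd applies, the parent DN is found by naively splitting at the first comma (so escaped commas are not handled), and the function names and EIDs are hypothetical.

```python
def normalize_dn(dn):
    """Stand-in normalization: upper case, no spaces around commas."""
    return ",".join(part.strip().upper() for part in dn.split(","))

def build_dn_indexes(entries):
    """Build dn2id and id2children dictionaries from (eid, dn) pairs."""
    dn2id = {normalize_dn(dn): eid for eid, dn in entries}
    id2children = {}
    for eid, dn in entries:
        # The parent DN is everything after the first RDN
        _, _, parent = normalize_dn(dn).partition(",")
        parent_eid = dn2id.get(parent)
        if parent_eid is not None:
            id2children.setdefault(parent_eid, []).append(eid)
    return dn2id, id2children

dn2id, id2children = build_dn_indexes([
    (1, "dc=OpenLDAP, dc=org"),
    (2, "cn=Postmaster, dc=OpenLDAP, dc=org"),
    (3, "cn=Barbara J Jensen, dc=OpenLDAP, dc=org"),
])
```

In this example the organizational entry (EID 1) ends up with the children list [2, 3], and the two leaf entries have no id2children record at all.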