   1 # $OpenLDAP$
   2 # Copyright 1999-2007 The OpenLDAP Foundation, All Rights Reserved.
   3 # COPYING RESTRICTIONS APPLY, see COPYRIGHT.
   4 H1: Introduction to OpenLDAP Directory Services
   5
   6 This document describes how to build, configure, and operate
   7 {{PRD:OpenLDAP}} Software to provide directory services.  This
   8 includes details on how to configure and run the Standalone
   9 {{TERM:LDAP}} Daemon, {{slapd}}(8).  It is intended for new and
  10 experienced administrators alike.  This section provides a basic
  11 introduction to directory services and, in particular, the directory
  12 services provided by {{slapd}}(8).  This introduction is only
  13 intended to provide enough information so one might get started
  14 learning about {{TERM:LDAP}}, {{TERM:X.500}}, and directory services.
  15
  16
  17 H2: What is a directory service?
  18
  19 A directory is a specialized database specifically designed for
  20 searching and browsing, in addition to supporting basic lookup
  21 and update functions.
  22
  23 Note: A directory is defined by some as merely a database optimized
  24 for read access.  This definition, at best, is overly simplistic.
  25
  26 Directories tend to contain descriptive, attribute-based information
  27 and support sophisticated filtering capabilities.  Directories
  28 generally do not support complicated transaction or roll-back schemes
  29 found in database management systems designed for handling high-volume
  30 complex updates.  Directory updates are typically simple all-or-nothing
  31 changes, if they are allowed at all.  Directories are generally
  32 tuned to give quick response to high-volume lookup or search
  33 operations. They may have the ability to replicate information
  34 widely in order to increase availability and reliability, while
  35 reducing response time.  When directory information is replicated,
  36 temporary inconsistencies between the replicas may be okay, as long
  37 as inconsistencies are resolved in a timely manner.
  38
  39 There are many different ways to provide a directory service.
  40 Different methods allow different kinds of information to be stored
  41 in the directory, place different requirements on how that information
  42 can be referenced, queried and updated, how it is protected from
  43 unauthorized access, etc.  Some directory services are {{local}},
  44 providing service to a restricted context (e.g., the finger service
  45 on a single machine). Other services are global, providing service
  46 to a much broader context (e.g., the entire Internet).  Global
  47 services are usually {{distributed}}, meaning that the data they
  48 contain is spread across many machines, all of which cooperate to
  49 provide the directory service. Typically a global service defines
  50 a uniform {{namespace}} which gives the same view of the data no
  51 matter where you are in relation to the data itself.
  52
  53 A web directory, such as provided by the {{Open Directory Project}}
  54 <{{URL:http://dmoz.org}}>, is a good example of a directory service.
  55 These services catalog web pages and are specifically designed to
  56 support browsing and searching.
  57
  58 While some consider the Internet {{TERM[expand]DNS}} (DNS) to be an
  59 example of a globally distributed directory service, DNS is neither
  60 browseable nor searchable.  It is more properly described as a
  61 globally distributed {{lookup}} service.
  62
  63
  64 H2: What is LDAP?
  65
  66 {{TERM:LDAP}} stands for {{TERM[expand]LDAP}}.  As the name suggests,
  67 it is a lightweight protocol for accessing directory services,
  68 specifically {{TERM:X.500}}-based directory services.  LDAP runs
  69 over {{TERM:TCP}}/{{TERM:IP}} or other connection oriented transfer
  70 services.  LDAP is an {{ORG:IETF}} Standards Track protocol and is
  71 specified in "Lightweight Directory Access Protocol (LDAP) Technical
  72 Specification Road Map" {{REF:RFC4510}}.
  73
  74 This section gives an overview of LDAP from a user's perspective.
  75
  76 {{What kind of information can be stored in the directory?}} The
  77 LDAP information model is based on {{entries}}. An entry is a
  78 collection of attributes that has a globally-unique {{TERM[expand]DN}}
  79 (DN).  The DN is used to refer to the entry unambiguously. Each of
  80 the entry's attributes has a {{type}} and one or more {{values}}.
  81 The types are typically mnemonic strings, like "{{EX:cn}}" for
  82 common name, or "{{EX:mail}}" for email address. The syntax of
  83 values depends on the attribute type.  For example, a {{EX:cn}}
  84 attribute might contain the value {{EX:Babs Jensen}}.  A {{EX:mail}}
  85 attribute might contain the value "{{EX:babs@example.com}}". A
  86 {{EX:jpegPhoto}} attribute would contain a photograph in the
  87 {{TERM:JPEG}} (binary) format.
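
The information model described above can be pictured with a small Python sketch. This is only an illustration and not an OpenLDAP API: the dictionary layout and the helper {{EX:values_of}} are assumptions made for this example, and the photograph is truncated to the first four bytes of a JPEG file.

```python
# Hypothetical in-memory picture of one LDAP entry; not an OpenLDAP API.
entry = {
    "dn": "uid=babs,ou=People,dc=example,dc=com",
    "attributes": {
        # Each attribute type maps to one or more values.
        "cn": ["Babs Jensen", "Barbara Jensen"],
        "mail": ["babs@example.com"],
        "jpegPhoto": [b"\xff\xd8\xff\xe0"],  # binary value, truncated for the example
    },
}


def values_of(entry, attr_type):
    """Return all values of an attribute type; type names match case-insensitively."""
    wanted = attr_type.lower()
    for name, values in entry["attributes"].items():
        if name.lower() == wanted:
            return list(values)
    return []


print(values_of(entry, "CN"))
```

An attribute type the entry does not hold simply yields an empty list in this sketch.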
  88
  89 {{How is the information arranged?}} In LDAP, directory entries
  90 are arranged in a hierarchical tree-like structure.  Traditionally,
  91 this structure reflected the geographic and/or organizational
  92 boundaries.  Entries representing countries appear at the top of
  93 the tree. Below them are entries representing states and national
  94 organizations. Below them might be entries representing organizational
  95 units, people, printers, documents, or just about anything else
  96 you can think of.  Figure 1.1 shows an example LDAP directory tree
  97 using traditional naming.
  98
  99 !import "intro_tree.png"; align="center"; \
 100         title="LDAP directory tree (traditional naming)"
 101 FT[align="Center"] Figure 1.1: LDAP directory tree (traditional naming)
 102
 103 The tree may also be arranged based upon Internet domain names.
 104 This naming approach is becoming increasingly popular as it allows
 105 for directory services to be located using the {{DNS}}.
 106 Figure 1.2 shows an example LDAP directory tree using domain-based
 107 naming.
 108
 109 !import "intro_dctree.png"; align="center"; \
 110         title="LDAP directory tree (Internet naming)"
 111 FT[align="Center"] Figure 1.2: LDAP directory tree (Internet naming)
 112
 113 In addition, LDAP allows you to control which attributes are required
 114 and allowed in an entry through the use of a special attribute
 115 called {{EX:objectClass}}.  The values of the {{EX:objectClass}}
 116 attribute determine the {{schema}} rules the entry must obey.
 117
 118 {{How is the information referenced?}} An entry is referenced by
 119 its distinguished name, which is constructed by taking the name of
 120 the entry itself (called the {{TERM[expand]RDN}} or RDN) and
 121 concatenating the names of its ancestor entries. For example, the
 122 entry for Barbara Jensen in the Internet naming example above has
 123 an RDN of {{EX:uid=babs}} and a DN of
 124 {{EX:uid=babs,ou=People,dc=example,dc=com}}. The full DN format is
 125 described in {{REF:RFC4514}}, "LDAP: String Representation of
 126 Distinguished Names."
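
The construction of a DN from an RDN and the names of its ancestors can be sketched as simple string handling. The helpers {{EX:build_dn}} and {{EX:parent_dn}} are hypothetical names for this example, and the sketch deliberately ignores the escaping rules of {{REF:RFC4514}}, so it is only safe for names without special characters.

```python
def build_dn(rdn, ancestor_rdns):
    """Join an entry's RDN with its ancestors' RDNs, nearest ancestor first."""
    return ",".join([rdn, *ancestor_rdns])


def parent_dn(dn):
    """Drop the leftmost RDN. Naive: assumes the DN contains no escaped commas."""
    _, _, rest = dn.partition(",")
    return rest


dn = build_dn("uid=babs", ["ou=People", "dc=example", "dc=com"])
print(dn)
print(parent_dn(dn))
```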
 127
 128 {{How is the information accessed?}} LDAP defines operations for
 129 interrogating and updating the directory.  Operations are provided
 130 for adding and deleting an entry from the directory, changing an
 131 existing entry, and changing the name of an entry. Most of the
 132 time, though, LDAP is used to search for information in the directory.
 133 The LDAP search operation allows some portion of the directory to
 134 be searched for entries that match some criteria specified by a
 135 search filter. Information can be requested from each entry that
 136 matches the criteria.
 137
 138 For example, you might want to search the entire directory subtree
 139 at and below {{EX:dc=example,dc=com}} for people with the name
 140 {{EX:Barbara Jensen}}, retrieving the email address of each entry
 141 found. LDAP lets you do this easily.  Or you might want to search
 142 the entries directly below the {{EX:st=California,c=US}} entry for
 143 organizations with the string {{EX:Acme}} in their name, and that
 144 have a fax number. LDAP lets you do this too. The next section
 145 describes in more detail what you can do with LDAP and how it might
 146 be useful to you.
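
To make the idea of a search base, scope, and filter more concrete, here is a minimal Python sketch over a small in-memory directory. It does not talk to an LDAP server. The {{EX:DIRECTORY}} contents (including the second person entry), the {{EX:in_scope}} and {{EX:search}} helpers, and the use of a Python function in place of an LDAP filter string are all assumptions made for this example.

```python
# Hypothetical directory contents: DN -> attributes (type -> list of values).
DIRECTORY = {
    "dc=example,dc=com": {"dc": ["example"]},
    "ou=People,dc=example,dc=com": {"ou": ["People"]},
    "uid=babs,ou=People,dc=example,dc=com": {
        "cn": ["Barbara Jensen", "Babs Jensen"],
        "mail": ["babs@example.com"],
    },
    "uid=bjorn,ou=People,dc=example,dc=com": {
        "cn": ["Bjorn Jensen"],
        "mail": ["bjorn@example.com"],
    },
}


def in_scope(dn, base, scope):
    """Naive scope test on DN strings; assumes no escaped commas and matching case."""
    if scope == "base":
        return dn == base
    if scope == "one":  # entries directly below the base
        return dn.partition(",")[2] == base
    # "sub": the base entry itself and everything below it
    return dn == base or dn.endswith("," + base)


def search(base, scope, matches, attrs):
    """Return {dn: requested attributes} for in-scope entries accepted by `matches`."""
    results = {}
    for dn, entry in DIRECTORY.items():
        if in_scope(dn, base, scope) and matches(entry):
            results[dn] = {a: entry[a] for a in attrs if a in entry}
    return results


# Roughly the filter (cn=Barbara Jensen), written as a Python predicate.
found = search(
    "dc=example,dc=com",
    "sub",
    lambda e: "Barbara Jensen" in e.get("cn", []),
    ["mail"],
)
print(found)
```

Changing the scope to {{EX:one}} restricts the same search to entries directly below the base, which corresponds to the second example in the text.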
 147
 148 {{How is the information protected from unauthorized access?}} Some
 149 directory services provide no protection, allowing anyone to see
 150 the information. LDAP provides a mechanism for a client to authenticate,
 151 or prove its identity to a directory server, paving the way for
 152 rich access control to protect the information the server contains.
 153 LDAP also supports data security (integrity and confidentiality)
 154 services.
 155
 156
 157 H2: When should I use LDAP?
 158
 159 This is a very good question. In general, you should use a directory
 160 server when you require data to be centrally managed, stored, and accessible via
 161 standards-based methods.
 162
 163 Common examples found throughout the industry include, but are not limited to:
 164
 165 * Machine Authentication
 166 * User Authentication
 167 * User/System Groups
 168 * Address book
 169 * Organization Representation
 170 * Asset Tracking
 171 * Telephony Information Store
 172 * User resource management
 173 * E-mail address lookups
 174 * Application Configuration store
 175 * PBX Configuration store
 176 * etc.
 177
 178 There are always new ways to use a directory and apply LDAP principles to address
 179 certain problems, so there is no simple answer to this question.
 180
 181 If in doubt, join the general LDAP forum for non-commercial discussions and
 182 information relating to LDAP at
 183 {{URL:http://www.umich.edu/~dirsvcs/ldap/mailinglist.html}} and ask.
 184
 185 H2: When should I not use LDAP?
 186
 187 When you start finding yourself bending the directory to do what you require,
 188 maybe a redesign is needed. LDAP may also be the wrong tool if only one application needs to use and
 189 manipulate your data (for discussion of LDAP vs RDBMS, please read the
 190 {{SECT:LDAP vs RDBMS}} section).
 191
 192 It will become obvious when LDAP is the right tool for the job.
 193
 194
 195 H2: How does LDAP work?
 196
 197 LDAP utilizes a {{client-server model}}. One or more LDAP servers
 198 contain the data making up the directory information tree ({{TERM:DIT}}).
 199 The client connects to a server and asks it a question.  The server
 200 responds with an answer and/or with a pointer to where the client
 201 can get additional information (typically, another LDAP server).
 202 No matter which LDAP server a client connects to, it sees the same
 203 view of the directory; a name presented to one LDAP server references
 204 the same entry it would at another LDAP server.  This is an important
 205 feature of a global directory service.
 206
 207
 208 H2: What about X.500?
 209
 210 Technically, {{TERM:LDAP}} is a directory access protocol to an
 211 {{TERM:X.500}} directory service, the {{TERM:OSI}} directory service.
 212 Initially, LDAP clients accessed gateways to the X.500 directory service.
 213 Such a gateway ran LDAP between the client and the gateway and X.500's
 214 {{TERM[expand]DAP}} ({{TERM:DAP}}) between the gateway and the
 215 X.500 server.  DAP is a heavyweight protocol that operates over a
 216 full OSI protocol stack and requires a significant amount of
 217 computing resources.  LDAP is designed to operate over
 218 {{TERM:TCP}}/{{TERM:IP}} and provides most of the functionality of
 219 DAP at a much lower cost.
 220
 221 While LDAP is still used to access X.500 directory services via
 222 gateways, LDAP is now more commonly directly implemented in X.500
 223 servers.
 224
 225 The Standalone LDAP Daemon, or {{slapd}}(8), can be viewed as a
 226 {{lightweight}} X.500 directory server.  That is, it does not
 227 implement X.500's DAP, nor does it support the complete X.500
 228 models.
 229
 230 If you are already running an X.500 DAP service and you want to
 231 continue to do so, you can probably stop reading this guide.  This
 232 guide is all about running LDAP via {{slapd}}(8), without running
 233 X.500 DAP.  If you are not running X.500 DAP, want to stop running
 234 X.500 DAP, or have no immediate plans to run X.500 DAP, read on.
 235
 236 It is possible to replicate data from an LDAP directory server to
 237 an X.500 DAP {{TERM:DSA}}.  This requires an LDAP/DAP gateway.
 238 OpenLDAP Software does not include such a gateway.
 239
 240
 241 H2: What is the difference between LDAPv2 and LDAPv3?
 242
 243 LDAPv3 was developed in the late 1990s to replace LDAPv2.
 244 LDAPv3 adds the following features to LDAP:
 245
 246  * Strong authentication and data security services via {{TERM:SASL}}
 247  * Certificate authentication and data security services via {{TERM:TLS}} (SSL)
 248  * Internationalization through the use of Unicode
 249  * Referrals and Continuations
 250  * Schema Discovery
 251  * Extensibility (controls, extended operations, and more)
 252
 253 LDAPv2 is historic ({{REF:RFC3494}}).  As most {{so-called}} LDAPv2
 254 implementations (including {{slapd}}(8)) do not conform to the
 255 LDAPv2 technical specification, interoperability amongst
 256 implementations claiming LDAPv2 support is limited.  As LDAPv2
 257 differs significantly from LDAPv3, deploying both LDAPv2 and LDAPv3
 258 simultaneously is quite problematic.  LDAPv2 should be avoided.
 259 LDAPv2 is disabled by default.
 260
 261
 262 H2: LDAP vs RDBMS
 263
 264 This question is raised many times, in different forms. The most common,
 265 however, is: {{Why doesn't OpenLDAP drop Berkeley DB and use a relational
 266 database management system (RDBMS) instead?}} The usual expectation is that the
 267 sophisticated algorithms implemented by a commercial-grade RDBMS would make
 268 {{OpenLDAP}} faster or somehow better and, at the same time, permit
 269 sharing of data with other applications.
 270
 271 The short answer is that use of an embedded database and custom indexing system
 272 allows OpenLDAP to provide greater performance and scalability without loss of
 273 reliability. OpenLDAP, since release 2.1, in its main storage-oriented backends
 274 (back-bdb and, since 2.2, back-hdb) uses Berkeley DB concurrent / transactional
 275 database software. This is the same software used by leading commercial
 276 directory software.
 277
 278 Now for the long answer. We are all confronted all the time with the choice
 279 between RDBMSes and directories. It is a hard choice and no simple answer exists.
 280
 281 It is tempting to think that having an RDBMS backend to the directory solves all
 282 problems. However, it is a pig. This is because the data models are very
 283 different. Representing directory data with a relational database is going to
 284 require splitting data into multiple tables.
 285
 286 Think for a moment about the person objectclass. Its definition requires
 287 attribute types objectclass, sn and cn and allows attribute types userPassword,
 288 telephoneNumber, seeAlso and description. All of these attributes are multivalued,
 289 so a normalization requires putting each attribute type in a separate table.
 290
 291 Now you have to decide on appropriate keys for those tables. The primary key
 292 might be a combination of the DN, but this becomes rather inefficient on most
 293 database implementations.
 294
 295 The big problem now is that accessing data from one entry requires seeking on
 296 different disk areas. On some applications this may be OK but in many
 297 applications performance suffers.
 298
 299 The only attribute types that can be put in the main table entry are those that
 300 are mandatory and single-valued. You may also add the optional single-valued
 301 attributes and set them to NULL or something if not present.
 302
 303 But wait, the entry can have multiple objectclasses and they are organized in
 304 an inheritance hierarchy. An entry of objectclass organizationalPerson now has
 305 the attributes from person plus a few others and some formerly optional attribute
 306 types are now mandatory.
 307
 308 What to do? Should we have different tables for the different objectclasses?
 309 This way the person would have an entry on the person table, another on
 310 organizationalPerson, etc. Or should we get rid of person and put everything on
 311 the second table?
 312
 313 But what do we do with a filter like (cn=*), where cn is an attribute type that
 314 appears in many, many objectclasses? Should we search all possible tables for
 315 matching entries? Not very attractive.
 316
 317 Once this point is reached, three approaches come to mind. One is to do full
 318 normalization so that each attribute type, no matter what, has its own separate
 319 table. The simplistic approach where the DN is part of the primary key is
 320 extremely wasteful, and calls for an approach where the entry has a unique
 321 numeric id that is used instead for the keys and a main table that maps DNs to
 322 ids. This approach, in any case, is very inefficient when several attribute types from
 323 one or more entries are requested. Such a database, though cumbersome,
 324 can be managed from SQL applications.
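
The fully normalized layout can be sketched with Python's standard {{EX:sqlite3}} module. This is an illustration only, not the schema used by any OpenLDAP backend: the table names {{EX:entry}}, {{EX:attr_cn}}, and {{EX:attr_mail}} and the helper {{EX:read_entry}} are assumptions made for this example. It shows why reading several attribute types of one entry costs one query per attribute table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Main table mapping DNs to numeric ids.
    CREATE TABLE entry (id INTEGER PRIMARY KEY, dn TEXT UNIQUE NOT NULL);
    -- One table per attribute type; several rows per entry allow multiple values.
    CREATE TABLE attr_cn   (entry_id INTEGER REFERENCES entry(id), value TEXT);
    CREATE TABLE attr_mail (entry_id INTEGER REFERENCES entry(id), value TEXT);
""")
db.execute("INSERT INTO entry VALUES (1, 'uid=babs,ou=People,dc=example,dc=com')")
db.executemany(
    "INSERT INTO attr_cn VALUES (1, ?)",
    [("Barbara Jensen",), ("Babs Jensen",)],
)
db.execute("INSERT INTO attr_mail VALUES (1, 'babs@example.com')")


def read_entry(dn, attr_types):
    """Reassemble an entry: one lookup for the id, then one query per attribute table."""
    row = db.execute("SELECT id FROM entry WHERE dn = ?", (dn,)).fetchone()
    if row is None:
        return None
    entry = {}
    for attr in attr_types:
        # Table names cannot be bound as parameters; attr comes from a fixed list here.
        rows = db.execute(
            f"SELECT value FROM attr_{attr} WHERE entry_id = ? ORDER BY rowid",
            (row[0],),
        ).fetchall()
        entry[attr] = [value for (value,) in rows]
    return entry


print(read_entry("uid=babs,ou=People,dc=example,dc=com", ["cn", "mail"]))
```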
 325
 326 The second approach is to put the whole entry as a blob in a table shared by all
 327 entries regardless of the objectclass and have additional tables that act as
 328 indices for the first table. Index tables are not database indices, but are
 329 fully managed by the LDAP server-side implementation. However, the database
 330 becomes unusable from SQL, and thus a fully fledged database system provides
 331 little or no advantage. The full generality of the database is unneeded.
 332 It is much better to use something light and fast, like Berkeley DB.
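
The second approach, whole entries stored as blobs plus index tables maintained by the server code, can be sketched with plain Python dictionaries standing in for a key/value store. This does not reflect the actual on-disk format of back-bdb. The names {{EX:entries}}, {{EX:eq_index}}, {{EX:add_entry}}, and {{EX:find_equal}}, the JSON encoding, and the lowercased index keys are all assumptions made for this example.

```python
import json

entries = {}   # entry id -> whole entry serialized as one blob
eq_index = {}  # (attribute type, value) -> set of entry ids, maintained by hand


def add_entry(entry_id, dn, attributes, indexed=("cn", "mail")):
    """Store the entry as a single blob and update the equality index."""
    entries[entry_id] = json.dumps({"dn": dn, "attributes": attributes})
    for attr in indexed:
        for value in attributes.get(attr, []):
            eq_index.setdefault((attr, value.lower()), set()).add(entry_id)


def find_equal(attr, value):
    """Answer an equality filter from the index, then decode one blob per hit."""
    ids = sorted(eq_index.get((attr, value.lower()), ()))
    return [json.loads(entries[i])["dn"] for i in ids]


add_entry(
    1,
    "uid=babs,ou=People,dc=example,dc=com",
    {"cn": ["Barbara Jensen", "Babs Jensen"], "mail": ["babs@example.com"]},
)
print(find_equal("cn", "babs jensen"))
```

Reading a matching entry takes a single blob fetch here, in contrast to the one-query-per-attribute cost of the normalized layout.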
 333
 334 A completely different way to see this is to give up any hopes of implementing
 335 the directory data model. In this case, LDAP is used as an access protocol to
 336 data that provides only superficially the directory data model. For instance,
 337 it may be read only or, where updates are allowed, restrictions are applied,
 338 such as making single-valued those attribute types that would otherwise allow multiple values,
 339 or making it impossible to add new objectclasses to an existing entry or remove
 340 one of those present. The restrictions span the range from allowed restrictions
 341 (that might be elsewhere the result of access control) to outright violations of
 342 the data model. It can be, however, a method to provide LDAP access to preexisting
 343 data that is used by other applications, with the understanding that we don't
 344 really have a "directory".
 345
 346 Existing commercial LDAP server implementations that use a relational database
 347 are either of the first kind or the third. I don't know of any implementation
 348 that uses a relational database to do inefficiently what BDB does efficiently.
 349 For those who are interested in the "third way" (exposing EXISTING data from an RDBMS
 350 as an LDAP tree, with some limitations compared to the classic LDAP model, but making
 351 it possible to interoperate between LDAP and SQL applications):
 352
 353 OpenLDAP includes back-sql, the backend that makes this possible. It uses ODBC plus
 354 additional metainformation about translating LDAP queries to SQL queries in your
 355 RDBMS schema, providing different levels of access, from read-only to full
 356 access, depending on the RDBMS you use and your schema.
 357
 358 For more information on concepts and limitations, see the {{slapd-sql}}(5) man page
 359 or the {{SECT: Backends}} section. There are also examples for several
 360 RDBMSes in the {{F:back-sql/rdbms_depend/*}} subdirectories.
 361
 367 H2: What is slapd and what can it do?
 368
 369 {{slapd}}(8) is an LDAP directory server that runs on many different
 370 platforms. You can use it to provide a directory service of your
 371 very own.  Your directory can contain pretty much anything you want
 372 to put in it. You can connect it to the global LDAP directory
 373 service, or run a service all by yourself. Some of slapd's more
 374 interesting features and capabilities include:
 375
 376 {{B:LDAPv3}}: {{slapd}} implements version 3 of {{TERM[expand]LDAP}}.
 377 {{slapd}} supports LDAP over both {{TERM:IPv4}} and {{TERM:IPv6}}
 378 and Unix {{TERM:IPC}}.
 379
 380 {{B:{{TERM[expand]SASL}}}}: {{slapd}} supports strong authentication
 381 and data security (integrity and confidentiality) services through
 382 the use of SASL.  {{slapd}}'s SASL implementation utilizes {{PRD:Cyrus
 383 SASL}} software which supports a number of mechanisms including
 384 {{TERM:DIGEST-MD5}}, {{TERM:EXTERNAL}}, and {{TERM:GSSAPI}}.
 385
 386 {{B:{{TERM[expand]TLS}}}}: {{slapd}} supports certificate-based
 387 authentication and data security (integrity and confidentiality)
 388 services through the use of TLS (or SSL).  {{slapd}}'s TLS
 389 implementation can utilize either {{PRD:OpenSSL}} or {{PRD:GnuTLS}} software.
 390
 391 {{B:Topology control}}: {{slapd}} can be configured to restrict
 392 access at the socket layer based upon network topology information.
 393 This feature utilizes {{TCP wrappers}}.
 394
 395 {{B:Access control}}: {{slapd}} provides a rich and powerful access
 396 control facility, allowing you to control access to the information
 397 in your database(s). You can control access to entries based on
 398 LDAP authorization information, {{TERM:IP}} address, domain name
 399 and other criteria.  {{slapd}} supports both {{static}} and {{dynamic}}
 400 access control information.
 401
 402 {{B:Internationalization}}: {{slapd}} supports Unicode and language
 403 tags.
 404
 405 {{B:Choice of database backends}}: {{slapd}} comes with a variety
 406 of different database backends you can choose from. They include
 407 {{TERM:BDB}}, a high-performance transactional database backend;
 408 {{TERM:HDB}}, a hierarchical high-performance transactional
 409 backend; {{SHELL}}, a backend interface to arbitrary shell scripts;
 410 and PASSWD, a simple backend interface to the {{passwd}}(5) file.
 411 The BDB and HDB backends utilize {{ORG:Oracle}} {{PRD:Berkeley
 412 DB}}.
 413
 414 {{B:Multiple database instances}}: {{slapd}} can be configured to
 415 serve multiple databases at the same time. This means that a single
 416 {{slapd}} server can respond to requests for many logically different
 417 portions of the LDAP tree, using the same or different database
 418 backends.
 419
 420 {{B:Generic modules API}}:  If you require even more customization,
 421 {{slapd}} lets you write your own modules easily. {{slapd}} consists
 422 of two distinct parts: a front end that handles protocol communication
 423 with LDAP clients; and modules which handle specific tasks such as
 424 database operations.  Because these two pieces communicate via a
 425 well-defined {{TERM:C}} {{TERM:API}}, you can write your own
 426 customized modules which extend {{slapd}} in numerous ways.  Also,
 427 a number of {{programmable database}} modules are provided.  These
 428 allow you to expose external data sources to {{slapd}} using popular
 429 programming languages ({{PRD:Perl}}, {{shell}}, and {{TERM:SQL}}).
 430
 431 {{B:Threads}}: {{slapd}} is threaded for high performance.  A single
 432 multi-threaded {{slapd}} process handles all incoming requests using
 433 a pool of threads.  This reduces the amount of system overhead
 434 required while providing high performance.
 435
 436 {{B:Replication}}: {{slapd}} can be configured to maintain shadow
 437 copies of directory information.  This {{single-master/multiple-slave}}
 438 replication scheme is vital in high-volume environments where a
 439 single {{slapd}} installation just doesn't provide the necessary availability
 440 or reliability.  For extremely demanding environments where a
 441 single point of failure is not acceptable, {{multi-master}} replication
 442 is also available.  {{slapd}} includes support for {{LDAP Sync}}-based
 443 replication.
 444
 445 {{B:Proxy Cache}}: {{slapd}} can be configured as a caching
 446 LDAP proxy service.
 447
 448 {{B:Configuration}}: {{slapd}} is highly configurable through a
 449 single configuration file which allows you to change just about
 450 everything you'd ever want to change.  Configuration options have
 451 reasonable defaults, making your job much easier. Configuration can
 452 also be performed dynamically using LDAP itself, which greatly
 453 improves manageability.
 454