Merge remote branch 'origin/mdb.master'

diff --git a/doc/guide/admin/intro.sdf b/doc/guide/admin/intro.sdf

index 6fd1383bfce6e5ff4844a90d902414719cec3308..5655032a8aaa72cadcc18401a56d48f85c59cab4 100644
--- a/doc/guide/admin/intro.sdf
+++ b/doc/guide/admin/intro.sdf
@@ -1,5 +1,5 @@
  # $OpenLDAP$
-# Copyright 1999-2007 The OpenLDAP Foundation, All Rights Reserved.
+# Copyright 1999-2011 The OpenLDAP Foundation, All Rights Reserved.
  # COPYING RESTRICTIONS APPLY, see COPYRIGHT.
  H1: Introduction to OpenLDAP Directory Services
  
@@ -57,8 +57,8 @@ support browsing and searching.
  
  While some consider the Internet {{TERM[expand]DNS}} (DNS) is an
  example of a globally distributed directory service, DNS is not
-browsable nor searchable.  It is more properly described as a
-globaly distributed {{lookup}} service.
+browseable nor searchable.  It is more properly described as a
+globally distributed {{lookup}} service.
  
  
  H2: What is LDAP?
@@ -156,9 +156,44 @@ services.
  
  H2: When should I use LDAP?
  
+This is a very good question. In general, you should use a directory
+server when you require data to be centrally managed, stored, and accessible via
+standards-based methods.
+
+Common examples found throughout the industry include, but are not limited to:
+
+* Machine Authentication
+* User Authentication
+* User/System Groups
+* Address book
+* Organization Representation
+* Asset Tracking
+* Telephony Information Store
+* User resource management
+* E-mail address lookups
+* Application Configuration store
+* PBX Configuration store
+* etc.
+
+There are various {{SECT:Distributed Schema Files}} that are standards-based, but
+you can always create your own {{SECT:Schema Specification}}.
+
+There are always new ways to use a directory and apply LDAP principles to address
+certain problems, so there is no simple answer to this question.
+
+If in doubt, join the general LDAP forum for non-commercial discussions and
+information relating to LDAP at
+{{URL:http://www.umich.edu/~dirsvcs/ldap/mailinglist.html}} and ask.
  
  H2: When should I not use LDAP?
  
+When you find yourself bending the directory to do what you require, a redesign
+may be needed. LDAP may also be the wrong choice if only one application will
+use and manipulate your data (for a discussion of LDAP vs RDBMS, please read the
+{{SECT:LDAP vs RDBMS}} section).
+
+It will become obvious when LDAP is the right tool for the job.
+
  
  H2: How does LDAP work?
  
@@ -211,16 +246,16 @@ H2: What is the difference between LDAPv2 and LDAPv3?
  LDAPv3 was developed in the late 1990's to replace LDAPv2.
  LDAPv3 adds the following features to LDAP:
  
- - Strong authentication and data security services via {{TERM:SASL}}
- - Certificate authentication and data security services via {{TERM:TLS}} (SSL)
- - Internationalization through the use of Unicode
- - Referrals and Continuations
- - Schema Discovery
- - Extensibility (controls, extended operations, and more)
+ * Strong authentication and data security services via {{TERM:SASL}}
+ * Certificate authentication and data security services via {{TERM:TLS}} (SSL)
+ * Internationalization through the use of Unicode
+ * Referrals and Continuations
+ * Schema Discovery
+ * Extensibility (controls, extended operations, and more)
  
  LDAPv2 is historic ({{REF:RFC3494}}).  As most {{so-called}} LDAPv2
  implementations (including {{slapd}}(8)) do not conform to the
-LDAPv2 technical specification, interoperatibility amongst
+LDAPv2 technical specification, interoperability amongst
  implementations claiming LDAPv2 support is limited.  As LDAPv2
  differs significantly from LDAPv3, deploying both LDAPv2 and LDAPv3
  simultaneously is quite problematic.  LDAPv2 should be avoided.
@@ -229,10 +264,103 @@ LDAPv2 is disabled by default.
  
  H2: LDAP vs RDBMS
  
-To reference:
+This question is raised many times, in different forms. The most common,
+however, is: {{Why doesn't OpenLDAP drop Berkeley DB and use a relational
+database management system (RDBMS) instead?}} The expectation is usually that
+the sophisticated algorithms implemented by a commercial-grade RDBMS would make
+{{OpenLDAP}} faster or somehow better and, at the same time, permit
+sharing of data with other applications.
+
+The short answer is that use of an embedded database and custom indexing system 
+allows OpenLDAP to provide greater performance and scalability without loss of 
+reliability. OpenLDAP uses Berkeley DB concurrent / transactional 
+database software. This is the same software used by leading commercial 
+directory software.
+
+Now for the long answer. We are all regularly confronted with the choice
+between RDBMSes and directories. It is a hard choice and no simple answer exists.
+
+It is tempting to think that having an RDBMS backend to the directory solves all
+problems. However, such a backend is slow and cumbersome, because the data models
+are very different. Representing directory data with a relational database is
+going to require splitting data into multiple tables.
+
+Think for a moment about the person objectclass. Its definition requires 
+attribute types objectclass, sn and cn and allows attribute types userPassword, 
+telephoneNumber, seeAlso and description. All of these attributes are multivalued,
+so normalization requires putting each attribute type in a separate table.
+
+Now you have to decide on appropriate keys for those tables. The primary key 
+might be a combination of the DN, but this becomes rather inefficient on most 
+database implementations.
+
+The big problem now is that accessing data from one entry requires seeking on
+different disk areas. For some applications this may be OK, but in many
+applications performance suffers.
+
+The only attribute types that can be put in the main table entry are those that
+are mandatory and single-valued. You may also add the optional single-valued
+attributes and set them to NULL if not present.
+
+But wait, the entry can have multiple objectclasses and they are organized in 
+an inheritance hierarchy. An entry of objectclass organizationalPerson now has 
+the attributes from person plus a few others and some formerly optional attribute 
+types are now mandatory.
+
+What to do? Should we have different tables for the different objectclasses?
+This way the person would have an entry in the person table, another in
+organizationalPerson, etc. Or should we get rid of person and put everything in
+the second table?
+
+But what do we do with a filter like (cn=*), where cn is an attribute type that
+appears in many, many objectclasses? Should we search all possible tables for
+matching entries? Not very attractive.
+
+Once this point is reached, three approaches come to mind. One is to do full 
+normalization so that each attribute type, no matter what, has its own separate 
+table. The simplistic approach where the DN is part of the primary key is 
+extremely wasteful, and calls for an approach where the entry has a unique 
+numeric id that is used instead for the keys and a main table that maps DNs to 
+ids. This approach, in any case, is very inefficient when several attribute types
+from one or more entries are requested. Such a database, though cumbersome,
+can be managed from SQL applications.
+
+The second approach is to put the whole entry as a blob in a table shared by all 
+entries regardless of the objectclass and have additional tables that act as 
+indices for the first table. Index tables are not database indices, but are 
+fully managed by the LDAP server-side implementation. However, the database
+becomes unusable from SQL, and thus a fully fledged database system provides
+little or no advantage. The full generality of the database is unneeded.
+It is much better to use something light and fast, like Berkeley DB.
+
+A completely different way to see this is to give up any hopes of implementing 
+the directory data model. In this case, LDAP is used as an access protocol to 
+data that provides only superficially the directory data model. For instance,
+it may be read only or, where updates are allowed, restrictions are applied,
+such as making single-valued attribute types that would otherwise allow multiple
+values, or making it impossible to add new objectclasses to an existing entry or
+remove one of those present. The restrictions span the range from allowed
+restrictions (that might elsewhere be the result of access control) to outright
+violations of the data model. It can, however, be a method to provide LDAP access
+to preexisting data that is used by other applications, with the understanding
+that we don't really have a "directory".
+
+Existing commercial LDAP server implementations that use a relational database
+are either of the first kind or the third. I don't know of any implementation
+that uses a relational database to do inefficiently what BDB does efficiently.
+For those who are interested in the "third way" (exposing EXISTING data from an
+RDBMS as an LDAP tree, with some limitations compared to the classic LDAP model,
+but making it possible to interoperate between LDAP and SQL applications):
+
+OpenLDAP includes back-sql, the backend that makes this possible. It uses ODBC
+plus additional metainformation about translating LDAP queries to SQL queries in
+your RDBMS schema, providing different levels of access, from read-only to full
+access, depending on the RDBMS you use and your schema.
+
+For more information on the concept and its limitations, see the {{slapd-sql}}(5)
+man page or the {{SECT: Backends}} section. There are also examples for several
+RDBMSes in the {{F:back-sql/rdbms_depend/*}} subdirectories.
  
-http://blogs.sun.com/treydrake/entry/ldap_vs_relational_database
-http://blogs.sun.com/treydrake/entry/ldap_vs_relational_database_part
  
  H2: What is slapd and what can it do?
  
@@ -256,7 +384,8 @@ SASL}} software which supports a number of mechanisms including
  {{B:{{TERM[expand]TLS}}}}: {{slapd}} supports certificate-based
  authentication and data security (integrity and confidentiality)
  services through the use of TLS (or SSL).  {{slapd}}'s TLS
-implementation utilizes {{PRD:OpenSSL}} software.
+implementation can utilize {{PRD:OpenSSL}}, {{PRD:GnuTLS}},
+or {{PRD:MozNSS}} software.
  
  {{B:Topology control}}: {{slapd}} can be configured to restrict
  access at the socket layer based upon network topology information.
@@ -296,8 +425,7 @@ well-defined {{TERM:C}} {{TERM:API}}, you can write your own
  customized modules which extend {{slapd}} in numerous ways.  Also,
  a number of {{programmable database}} modules are provided.  These
  allow you to expose external data sources to {{slapd}} using popular
-programming languages ({{PRD:Perl}}, {{shell}}, {{TERM:SQL}}, and
-{{PRD:TCL}}).
+programming languages ({{PRD:Perl}}, {{shell}}, and {{TERM:SQL}}).
  
  {{B:Threads}}: {{slapd}} is threaded for high performance.  A single
  multi-threaded {{slapd}} process handles all incoming requests using
@@ -307,8 +435,10 @@ required while providing high performance.
  {{B:Replication}}: {{slapd}} can be configured to maintain shadow
  copies of directory information.  This {{single-master/multiple-slave}}
  replication scheme is vital in high-volume environments where a
-single {{slapd}} just doesn't provide the necessary availability
-or reliability.  {{slapd}} includes support for {{LDAP Sync}}-based
+single {{slapd}} installation just doesn't provide the necessary availability
+or reliability.  For extremely demanding environments where a
+single point of failure is not acceptable, {{multi-master}} replication
+is also available.  {{slapd}} includes support for {{LDAP Sync}}-based
  replication.
  
  {{B:Proxy Cache}}: {{slapd}} can be configured as a caching
@@ -317,5 +447,7 @@ LDAP proxy service.
  {{B:Configuration}}: {{slapd}} is highly configurable through a
  single configuration file which allows you to change just about
  everything you'd ever want to change.  Configuration options have
-reasonable defaults, making your job much easier.
+reasonable defaults, making your job much easier. Configuration can
+also be performed dynamically using LDAP itself, which greatly
+improves manageability.
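
Review note on the new "LDAP vs RDBMS" text: the "full normalization" approach it describes (a main table that maps DNs to numeric ids, plus one table per attribute type) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how back-sql or any OpenLDAP backend stores data. It uses Python's standard sqlite3 module; the table names (entries, attr_<type>), the helper functions, and the sample DN are all hypothetical.

```python
import sqlite3

# Attribute types from the person objectclass; the subset is chosen for illustration.
ATTRS = ("objectClass", "cn", "sn", "telephoneNumber")


def create_schema(conn):
    """Create the DN-to-id table plus one value table per attribute type."""
    conn.execute(
        "CREATE TABLE entries (id INTEGER PRIMARY KEY, dn TEXT UNIQUE NOT NULL)"
    )
    for attr in ATTRS:
        # Several rows per entry_id represent a multivalued attribute.
        conn.execute(
            f"CREATE TABLE attr_{attr} ("
            "entry_id INTEGER NOT NULL REFERENCES entries(id), "
            "value TEXT NOT NULL)"
        )


def add_entry(conn, dn, attrs):
    """Insert one entry; attrs maps an attribute type to a list of values."""
    unknown = set(attrs) - set(ATTRS)
    if unknown:
        raise ValueError(f"no table for attribute types: {sorted(unknown)}")
    entry_id = conn.execute("INSERT INTO entries (dn) VALUES (?)", (dn,)).lastrowid
    for attr, values in attrs.items():
        conn.executemany(
            f"INSERT INTO attr_{attr} (entry_id, value) VALUES (?, ?)",
            [(entry_id, value) for value in values],
        )
    return entry_id


def fetch_entry(conn, dn):
    """Reassemble an entry; note the separate query against every attribute table."""
    # Exact string match on the DN; real DN matching rules are not modeled here.
    row = conn.execute("SELECT id FROM entries WHERE dn = ?", (dn,)).fetchone()
    if row is None:
        return None
    entry = {}
    for attr in ATTRS:
        rows = conn.execute(
            f"SELECT value FROM attr_{attr} WHERE entry_id = ? ORDER BY rowid",
            (row[0],),
        ).fetchall()
        if rows:
            entry[attr] = [value for (value,) in rows]
    return entry


def search_present(conn, attr):
    """Return the DNs matching a presence filter such as (cn=*)."""
    if attr not in ATTRS:
        raise ValueError(f"no table for attribute type: {attr}")
    rows = conn.execute(
        "SELECT DISTINCT e.dn FROM entries e "
        f"JOIN attr_{attr} a ON a.entry_id = e.id ORDER BY e.dn"
    ).fetchall()
    return [dn for (dn,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    add_entry(
        conn,
        "cn=Barbara Jensen,dc=example,dc=com",
        {"objectClass": ["person"], "cn": ["Barbara Jensen", "Babs Jensen"], "sn": ["Jensen"]},
    )
    print(fetch_entry(conn, "cn=Barbara Jensen,dc=example,dc=com"))
    print(search_present(conn, "cn"))
```

Even in this toy form, reading back a single entry takes one lookup in the DN table and then one query per attribute table, which is the cost the new text points to when it calls this layout inefficient for requests that span several attribute types.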