Merge remote-tracking branch 'origin/mdb.master'

diff --git a/doc/guide/admin/tuning.sdf b/doc/guide/admin/tuning.sdf

index 486aef445aacb8a78913fe61d3a3947ad2505569..6ad12a67cff70c8ede4336e044451a91a72540c3 100644
--- a/doc/guide/admin/tuning.sdf
+++ b/doc/guide/admin/tuning.sdf
@@ -1,5 +1,5 @@
  # $OpenLDAP$
-# Copyright 1999-2007 The OpenLDAP Foundation, All Rights Reserved.
+# Copyright 1999-2013 The OpenLDAP Foundation, All Rights Reserved.
  # COPYING RESTRICTIONS APPLY, see COPYRIGHT.
  
  H1: Tuning
@@ -27,14 +27,26 @@ H3: Memory
  
  Scale your cache to use available memory and increase system memory if you can.
  
-More info here.
+See {{SECT:Caching}} for BDB cache tuning hints.
+Note that LMDB uses no cache of its own and has no tuning options, so the Caching
+section can be ignored when using LMDB.
  
  
  H3: Disks
  
-Use fast subsystems. Put each database and logs on separate disks.
+Use fast filesystems, and conduct your own testing to see which filesystem
+types perform best with your workload. (On our own Linux testing, EXT2 and JFS
+tend to provide better write performance than everything else, including
+newer filesystems like EXT4, BTRFS, etc.)
  
-Example showing config settings
+Use fast subsystems. Put each database and logs on separate disks
+(for BDB this is configurable via {{DB_CONFIG}}):
+
+>       # Data Directory
+>       set_data_dir /data/db
+>       
+>       # Transaction Log settings
+>       set_lg_dir /logs
  
  
  H3: Network Topology
@@ -78,7 +90,7 @@ General rule: don't go overboard with indexes. Unused indexes must be maintained
  See {{slapd.conf}}(8) and {{slapdindex}}(8) for more information
  
  
-H3: Presense indexing
+H3: Presence indexing
  
  If your client application uses presence filters and if the
  target attribute exists on the majority of entries in your target scope, then
@@ -106,7 +118,7 @@ H2: Logging
  
  H3: What log level to use
  
-The default of {{loglevel 256}} is really the best bet. There's a corollary to 
+The default of {{loglevel stats}} (256) is really the best bet. There's a corollary to 
  this when problems *do* arise, don't try to trace them using syslog. 
  Use the debug flag instead, and capture slapd's stderr output. syslog is too 
  slow for debug tracing, and it's inherently lossy - it will throw away messages when it
@@ -119,14 +131,14 @@ H3: What to watch out for
  
  The most common message you'll see that you should pay attention to is:
  
->  "<= bdb_equality_candidates: (foo) index_param failed (18)"
+>       "<= bdb_equality_candidates: (foo) index_param failed (18)"
  
  That means that some application tried to use an equality filter ({{foo=<somevalue>}}) 
  and attribute {{foo}} does not have an equality index. If you see a lot of these
  messages, you should add the index. If you see one every month or so, it may
  be acceptable to ignore it.
  
-The default syslog level is 256 which logs the basic parameters of each
+The default syslog level is stats (256) which logs the basic parameters of each
  request; it usually produces 1-3 lines of output. On Solaris and systems that
  only provide synchronous syslog, you may want to turn it off completely, but
  usually you want to leave it enabled so that you'll be able to see index
@@ -141,17 +153,17 @@ to sync the file system with every write ({{man syslogd/syslog.conf}}). In Linux
  you can prepend the log file name with a "-" in {{syslog.conf}}. For example, 
  if you are using the default LOCAL4 logging you could try:
  
->   # LDAP logs
->   LOCAL4.*         -/var/log/ldap
+>       # LDAP logs
+>       LOCAL4.*         -/var/log/ldap
  
  For syslog-ng, add or modify the following line in {{syslog-ng.conf}}:
  
-   options { sync(n); };
+>       options { sync(n); };
  
  where n is the number of lines which will be buffered before a write.
  
  
-H2: BDB/HDB Database Caching
+H2: Caching
  
  We all know what caching is, don't we? 
  
@@ -164,7 +176,24 @@ entry cache and {{TERM:IDL}} (IDL) cache.
  
  H3: Berkeley DB Cache
  
-BerkeleyDB's own data cache operates on page-sized blocks of raw data.
+There are two ways to tune for the BDB cachesize:
+
+(a) BDB cache size necessary to load the database via slapadd in optimal time
+
+(b) BDB cache size necessary to have a high performing running slapd once the data is loaded
+
+For (a), the optimal cachesize is the size of the entire database.  If you 
+already have the database loaded, this is simply a 
+
+>       du -c -h *.bdb 
+
+in the directory containing the OpenLDAP ({{/usr/local/var/openldap-data}}) data.
+
+For (b), the optimal cachesize is just the size of the {{id2entry.bdb}} file, 
+plus about 10% for growth.
+
+The tuning of {{DB_CONFIG}} should be done for each BDB type database 
+instantiated (back-bdb, back-hdb).
  
  Note that while the {{TERM:BDB}} cache is just raw chunks of memory and 
  configured as a memory size, the {{slapd}}(8) entry cache holds parsed entries, 
@@ -186,7 +215,7 @@ that's large enough for your "working set."
  That means, large enough to hold all of the most frequently accessed data, 
  plus a few less-frequently accessed items.
  
-ORACLE LINKS HERE
+For more information, please see: {{URL:http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_conf/cachesize.html}}
  
  H4: Calculating Cachesize
  
@@ -206,16 +235,16 @@ along the path from the root of the tree down to the particular data item
  you're accessing. That's enough cache for a single search. For the general case, 
  you want enough cache to contain all the internal nodes in the database. 
  
->   db_stat -d
+>       db_stat -d
  
  will tell you how many internal pages are present in a database. You should 
  check this number for both dn2id and id2entry.
  
  Also note that {{id2entry}} always uses 16KB per "page", while {{dn2id}} uses whatever 
-the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing the, 
+the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing,
  your cache must be at least as large as the number of internal pages in both 
-the {{dn2id}} and {{id2entry}} databases, plus some extra space to accomodate the actual 
-leaf data pages.
+the {{dn2id}} and {{id2entry}} databases, plus some extra space to accommodate
+the actual leaf data pages.
  
  For example, in my OpenLDAP 2.4 test database, I have an input LDIF file that's 
  about 360MB. With the back-hdb backend this creates a {{dn2id.bdb}} that's 68MB, 
@@ -224,29 +253,23 @@ and an {{id2entry}} that's 800MB. db_stat tells me that {{dn2id}} uses 4KB pages
  internal pages, and 45912 leaf pages. In order to efficiently retrieve any 
  single entry in this database, the cache should be at least
  
->   (433+1) * 4KB + (52+1) * 16KB in size: 1736KB + 848KB =~ 2.5MB.
+>       (433+1) * 4KB + (52+1) * 16KB in size: 1736KB + 848KB =~ 2.5MB.
  
  This doesn't take into account other library overhead, so this is even lower 
  than the barest minimum. The default cache size, when nothing is configured, 
  is only 256KB. 
  
-This 2.5MB number also doesn't take indexing into account. Each indexed attribute 
-uses another database file of its own, using a Hash structure. 
+This 2.5MB number also doesn't take indexing into account. Each indexed
+attribute results in another database file.  Earlier versions of OpenLDAP
+kept these index databases in Hash format, but from OpenLDAP 2.2 onward
+the index databases are in B-tree format so the same procedure can
+be used to calculate the necessary amount of cache for each index database.
  
-Unlike the B-trees, where you only need to touch one data page to find an entry 
-of interest, doing an index lookup generally touches multiple keys, and the 
-point of a hash structure is that the keys are evenly distributed across the 
-data space. That means there's no convenient compact subset of the database that 
-you can keep in the cache to insure quick operation, you can pretty much expect 
-references to be scattered across the whole thing. My strategy here would be to 
-provide enough cache for at least 50% of all of the hash data. 
+For example, if your only index is for the objectClass attribute and db_stat
+reveals that {{objectClass.bdb}} has 339 internal pages and uses 4096 byte
+pages, the additional cache needed for just this attribute index is
  
->   (Number of hash buckets + number of overflow pages + number of duplicate pages) * page size / 2.
-
-The objectClass index for my example database is 5.9MB and uses 3 hash buckets 
-and 656 duplicate pages. So:
-
->   ( 3 + 656 ) * 4KB / 2 =~ 1.3MB.
+>       (339+1) * 4KB =~ 1.3MB.
  
  With only this index enabled, I'd figure at least a 4MB cache for this backend. 
  (Of course you're using a single cache shared among all of the database files, 
@@ -263,8 +286,9 @@ id2entry data, so 4MB is good enough.
  With back-bdb and back-hdb you can use "db_stat -m" to check how well the 
  database cache is performing. 
  
+For more information on {{db_stat}}: {{URL:http://www.oracle.com/technology/documentation/berkeley-db/db/utility/db_stat.html}}
  
-H3: {{slapd}}(8) Entry Cache
+H3: {{slapd}}(8) Entry Cache (cachesize)
  
  The {{slapd}}(8) entry cache operates on decoded entries. The rationale - entries 
  in the entry cache can be used directly, giving the fastest response. If an entry 
@@ -275,6 +299,10 @@ If the entry is in neither cache then BDB will have to flush some of its current
  cached pages and bring in the needed pages, resulting in a couple of expensive 
  I/Os as well as parsing.
  
+The optimal value is, of course, the total number of entries in the database.
+However, most directory servers don't consistently serve out their entire database, so a smaller value that more closely matches the expected working set of data is
+sufficient. This is the second most important parameter for the DB.
+
  As far as balancing the entry cache vs the BDB cache - parsed entries in memory 
  are generally about twice as large as they are on disk. 
  
@@ -284,62 +312,27 @@ occurring in the database. It is merely the fact that the cache is thrashing
  itself that causes performance/response time to slowdown. 
  
  
-MOVE BELOW AROUND:
-
-
-If you want to setup the cache size, please read:
-
- (Xref) How do I configure the BDB backend?
- (Xref) What are the DB_CONFIG configuration directives?
- http://www.sleepycat.com/docs/utility/db_recover.html
-
-A default config can be found in the answer:
-
- (Xref) What are the DB_CONFIG configuration directives?
-
-just change the set_lg_dir to point to your .log directory or comment that line.
-
-Quick guide:
-- Create a DB_CONFIG file in your ldap home directory (/var/lib/ldap/DB_CONFIG) with the correct "set_cachesize" value
-- stop your ldap server and run db_recover -h /var/lib/ldap
-- start your ldap server and check the new cache size with:
-
-  db_stat -h /var/lib/ldap -m | head -n 2
-
-- this procedure is only needed if you use OpenLDAP 2.2 with the BDB or HDB backends; In OpenLDAP 2.3 DB recovery is performed automatically whenever the DB_CONFIG file is changed or when an unclean shutdown is detected.
-
-
---On Tuesday, February 22, 2005 12:15 PM -0500 Dusty Doris <openldap@mail.doris.cc> wrote:
-
-    Few questions, if you change the cachesize and idlecachesize entries, do
-    you have to do anything special aside from restarting slapd, such as run
-    slapindex or db_recover?
-
-
-    Also, is there any way to tell how much memory these caches are taking up
-    to make sure they are not set too large?  What happens if you set your
-    cachesize too large and you don't have enough available memory to store
-    these?  Will that cause an issue with openldap, or will it just not cache
-    those entries that would make it exceed its available memory.  Will it
-    just use some sort of FIFO on those caches?
-
-
-It will consume the memory resources of your system, and likely cause issues.
-
-    Finally, what do most people try to achieve with these values?  Would the
-    goal be to make these as big as the directory?  So, if I have 400,000 dn's
-    in my directory, would it be safe to set these at 400000 or would
-    something like 20,000 be good enough to get a nice performance increase?
-
+H3: {{TERM:IDL}} Cache (idlcachesize)
  
-I try to cache the most actively used entries. Unless you expect all 400,000 entries of your DB to be accessed regularly, there is no need to cache that many entries. My entry cache is set to 20,000 (out of a little over 400,000 entries).
+Each IDL holds the search results from a given query, so the IDL cache will 
+end up holding the most frequently requested search results.  For back-bdb, 
+it is generally recommended to match the "cachesize" setting.  For back-hdb, 
+it is generally recommended to be 3x"cachesize".
  
-The idl cache has to do with how many unique result sets of searches you want to store in memory. Setting up this cache will allow your most frequently placed searches to get results much faster, but I doubt you want to try and cache the results of every search that hits your system. ;)
+{NOTE: The idlcachesize setting directly affects search performance}
  
---Quanah
  
+H2: {{slapd}}(8) Threads
  
-H3: {{TERM:IDL}} Cache
+{{slapd}}(8) can process requests via a configurable number of threads, which
+in turn affects the in/out rate of connections.
  
+This value should generally be a function of the number of "real" cores on 
+the system. For example, on a server with 2 CPUs with one core each, set this 
+to 8, or 4 threads per real core.  This is a "read" maximized value. The more 
+threads that are configured per core, the slower {{slapd}}(8) responds for 
+"read" operations.  On the flip side, it appears to handle write operations 
+faster in a heavy write/low read scenario.
  
-http://www.openldap.org/faq/data/cache/1076.html
+The upper bound for good read performance appears to be 16 threads (which
+also happens to be the default setting).