From: Tim Mooney
Date: Tue, 12 Apr 2011 22:57:57 +0000 (-0500)
Subject: ITS#6906 Update cachesize recommendations
X-Git-Url: https://git.sur5r.net/?a=commitdiff_plain;h=45b41fb41afe1411e329a7c208f3934bbf03d028;p=openldap

ITS#6906 Update cachesize recommendations
to remove references to indexes in Hash format
Fix whitespace error -- hyc
---

diff --git a/doc/guide/admin/tuning.sdf b/doc/guide/admin/tuning.sdf
index 5ba305a69f..03b866b3e5 100644
--- a/doc/guide/admin/tuning.sdf
+++ b/doc/guide/admin/tuning.sdf
@@ -234,10 +234,10 @@ will tell you how many internal pages are present in a database. You should
 check this number for both dn2id and id2entry.
 
 Also note that {{id2entry}} always uses 16KB per "page", while {{dn2id}} uses whatever
-the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing the,
+the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing,
 your cache must be at least as large as the number of internal pages in both
-the {{dn2id}} and {{id2entry}} databases, plus some extra space to accommodate the actual
-leaf data pages.
+the {{dn2id}} and {{id2entry}} databases, plus some extra space to accommodate
+the actual leaf data pages.
 
 For example, in my OpenLDAP 2.4 test database, I have an input LDIF file that's
 about 360MB. With the back-hdb backend this creates a {{dn2id.bdb}} that's 68MB,
@@ -252,23 +252,17 @@ This doesn't take into account other library overhead, so this is even lower
 than the barest minimum. The default cache size, when nothing is configured,
 is only 256KB.
 
-This 2.5MB number also doesn't take indexing into account. Each indexed attribute
-uses another database file of its own, using a Hash structure.
+This 2.5MB number also doesn't take indexing into account. Each indexed
+attribute results in another database file. Earlier versions of OpenLDAP
+kept these index databases in Hash format, but from OpenLDAP 2.2 onward
+the index databases are in B-tree format so the same procedure can
+be used to calculate the necessary amount of cache for each index database.
 
-Unlike the B-trees, where you only need to touch one data page to find an entry
-of interest, doing an index lookup generally touches multiple keys, and the
-point of a hash structure is that the keys are evenly distributed across the
-data space. That means there's no convenient compact subset of the database that
-you can keep in the cache to insure quick operation, you can pretty much expect
-references to be scattered across the whole thing. My strategy here would be to
-provide enough cache for at least 50% of all of the hash data.
+For example, if your only index is for the objectClass attribute and db_stat
+reveals that {{objectClass.bdb}} has 339 internal pages and uses 4096 byte
+pages, the additional cache needed for just this attribute index is
 
-> (Number of hash buckets + number of overflow pages + number of duplicate pages) * page size / 2.
-
-The objectClass index for my example database is 5.9MB and uses 3 hash buckets
-and 656 duplicate pages. So:
-
-> ( 3 + 656 ) * 4KB / 2 =~ 1.3MB.
+> (339+1) * 4KB =~ 1.3MB.
 
 With only this index enabled, I'd figure at least a 4MB cache for this backend.
 (Of course you're using a single cache shared among all of the database files,