# $OpenLDAP$
# Copyright 2003, The OpenLDAP Foundation, All Rights Reserved.
# COPYING RESTRICTIONS APPLY, see COPYRIGHT.

H1: LDAP Sync Replication

The LDAP Sync replication engine is designed to function as an
improved alternative to {{slurpd}}(8).  While replication with
{{slurpd}}(8) provides improved capacity, availability, and
reliability, it has some drawbacks:

^ It is not stateful, hence lacks a resynchronization capability.
Because replication with {{slurpd}}(8) keeps no representation of
replica state, it cannot provide an efficient mechanism for making
the slave replica consistent with the master replica once they
become out of sync. For instance, if the slave database content
is damaged, the slave replica must be re-primed from the master
replica. With state-based replication, it would be possible
to recover the slave replica from a local backup. The slave replica
would then be synchronized by calculating and transmitting the
differences between the slave replica and the master replica based
on their states. The LDAP Sync replication is stateful.

+ It is history-based, not state-based. Replication with
{{slurpd}}(8) relies on the history information in the replication log
file generated by {{slapd}}(8). If a portion of the log file that
contains updates yet to be synchronized to the slave is truncated
or damaged, a full reload is required. State-based replication,
on the other hand, does not rely on a separate history store.
In the LDAP Sync replication, every directory entry carries its state
information in the entryCSN operational attribute. The replica
contents are calculated from the consumer cookie and the entryCSN
of the directory entries.

+ It is push-based, not pull-based. In replication with
{{slurpd}}(8), it is the master that decides when to synchronize the
replica. Pull-based, polling replication is not possible with
{{slurpd}}(8). For example, in order to make a daily directory backup
that is an exact image at a point in time, the slave replica must be
made read-only by stopping {{slurpd}}(8) during the backup. After the
backup, {{slurpd}}(8) can be run in one-shot mode to resynchronize
the slave replica with the updates made during the backup. In
pull-based, polling replication, the replica is guaranteed to be
read-only between two polling points. The LDAP Sync replication
supports both push-based and pull-based replication.

+ It only supports fractional replication (replicating a subset of
each entry's attributes) and does not support sparse replication
(replicating a subset of the entries). The LDAP Sync replication
supports both fractional and sparse replication. A general search
specification can be used to initiate a synchronization session
only for the interesting subset of the context.

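As a sketch of how a search specification selects such a subset, the
fragment below uses the syncrepl directives described later in this
chapter. The filter= line restricts the replica to a subset of entries
(sparse) and the attrs= line restricts it to a subset of attributes
(fractional). The host name, the search base, the filter, and the
attribute list are illustrative assumptions; the bind, update DN, and
type directives are left out for brevity.

>       syncrepl id = 1
>               provider=ldap://provider.example.com:389
>               searchbase="dc=example,dc=com"
>               scope=sub
>               filter="(objectClass=organizationalPerson)"
>               attrs="cn,sn,telephoneNumber"
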
H2: LDAP Content Sync Protocol Description

The LDAP Sync replication uses the LDAP Content Sync protocol (refer
to the Internet Draft entitled "The LDAP Content Synchronization
Operation") for replica synchronization. The LDAP Content Sync
protocol operation is based on the replica state, which is transmitted
between replicas as synchronization cookies. There are two
operating modes: refreshOnly and refreshAndPersist. In both modes,
a consumer {{slapd}}(8) connects to a provider {{slapd}}(8) with a cookie
value representing the state of the consumer replica. The non-persistent
part of the synchronization consists of two phases.

The first is the state-base phase. The entries updated after the
point in time that the consumer cookie represents are transmitted
to the consumer. Because the unit of synchronization is the entry,
all the requested attributes are transmitted even if only some
of them have changed. For the rest of the entries, present
messages consisting only of the name and the synchronization control
are sent to the consumer. After the consumer receives all the
updated and present entries, it can reliably make its replica
consistent with the provider replica. The consumer adds all the
newly added entries, replaces the entries for which updated versions
were received, and deletes the entries in the local replica that
were neither updated nor reported as present.

The second is the log-base phase. This phase is incorporated to
optimize the protocol with respect to the volume of present
traffic. If the provider maintains a history store from which the
content to be synchronized can be reliably calculated, this log-base
phase follows the state-base phase. In this phase, the actual directory
update operations such as delete, modify, and add are transmitted.
There is no need to send present messages in the log-base phase.

If the protocol operates in the refreshOnly mode, the synchronization
terminates once these phases are complete. The provider sends the
consumer a synchronization cookie that reflects the new state. The
consumer presents the new cookie the next time it requests a
synchronization. If the protocol operates in the refreshAndPersist
mode, the synchronization operation remains persistent in the
provider. Every update made to the provider replica is transmitted
to the consumer. Cookies can be sent to the consumer at any time by
using the SyncInfo intermediate response and at the end of the
synchronization by using the SyncDone control attached to the
SearchResultDone message.

Entries are uniquely identified by the entryUUID attribute value
in the LDAP Content Sync protocol. It serves as a reliable entry
identifier, whereas the DN of an entry can be changed by modrdn
operations. The entryUUID is attached to each SearchResultEntry or
SearchResultReference as a part of the Sync State control.

H2: LDAP Sync Replication Details

The LDAP Sync replication uses both the refreshOnly and the
refreshAndPersist modes of synchronization. If an LDAP Sync replication
is specified in a database definition, {{slapd}}(8) schedules an
execution of the LDAP Sync replication engine. In the refreshOnly
mode, the engine is rescheduled to run again one interval after a
replication session ends. In the refreshAndPersist mode, the engine
remains active to process the SearchResultEntry messages from
the provider.

The LDAP Sync replication uses only the state-base synchronization
phase.  Because {{slapd}}(8) does not currently implement a history
store such as a changelog or tombstones, it depends only on the
state-base phase. A null log-base phase follows the state-base phase.

As an optimization, no entries are transmitted to a consumer
if there has been no update in the master replica since the last
synchronization with the consumer. Even present messages for the
unchanged entries are not transmitted. The consumer retains its
replica contents.

H3: entryCSN

The LDAP Sync replication implemented in OpenLDAP stores state
information for every entry in the entryCSN attribute. The entryCSN
of an entry is the CSN (change sequence number), a refined
timestamp, at which the entry was most recently updated. The CSN
consists of three parts: the time, a replica ID, and a change count
within a single second.

H3: contextCSN

contextCSN represents the current state of the provider replica.
It is the largest entryCSN of all entries in the context such that
no transaction having a smaller entryCSN value remains outstanding.
Because the entryCSN value is obtained before the transaction starts
and transactions are not committed in entryCSN order, special care
is needed to manage the proper contextCSN value in a
transactional environment. Also, the state of the search result set
is required to correspond to the contextCSN value returned to the
consumer as a sync cookie.

contextCSN, the provider replica state, is stored in the
syncProviderSubentry. The value of the contextCSN is transmitted
to the consumer replica as a Sync Cookie. The cookie is stored in
the syncreplCookie attribute of the syncConsumerSubentry subentry. The
consumer uses the stored cookie value to represent its replica
state when it connects to the provider in the future.

H3: Glue Entry

Because a general search filter can be used in the LDAP Sync
replication, an entry might be created without a parent if the parent
entry was filtered out. The LDAP Sync replication engine creates glue
entries to fill such holes in the replica. Glue entries are not
returned in response to a search of the consumer {{slapd}}(8) unless
manageDSAit is set on the search request.

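To inspect glue entries on a consumer, a search can be issued with
manageDSAit set. The command below is a sketch that assumes the
ldapsearch(1) client, whose -M option enables the manageDSAit control.
The consumer host name and the search base are hypothetical, and the
filter assumes that the glue entries carry the glue object class.

>       ldapsearch -x -M -H ldap://consumer.example.com \
>               -b "dc=example,dc=com" "(objectClass=glue)"

Running the same search without -M should return none of these
entries, which is a quick way to confirm that they are glue.
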
H2: Configuring slapd for LDAP Sync Replication

Compared to replication with {{slurpd}}(8), it is relatively simple
to bring up a replicated OpenLDAP environment with the LDAP Sync
replication. First, configure both the provider and the consumer
{{slapd}}(8) servers appropriately. Then start the provider slapd
instance, followed by the consumer slapd instance.
Administrative tasks such as copying the database and temporarily
shutting down the provider (or demoting it to read-only) are not
required.

H3: Set up the provider slapd

There is no special slapd.conf(5) directive for the provider {{slapd}}(8).
Because the LDAP Sync searches are subject to access control, proper
access control privileges should be set up for the replicated
content.

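For instance, the identity that the consumer binds as needs read
access to whatever it replicates. The slapd.conf(5) access directive
below is a minimal sketch for the provider, not a recommended policy.
The subtree and the bind DN are assumptions that match the consumer
example later in this chapter, and the final clause is a placeholder
for local policy.

>       access to dn.subtree="dc=example,dc=com"
>               by dn.exact="cn=syncuser,dc=example,dc=com" read
>               by anonymous auth
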
When creating a provider database from an LDIF file using slapadd(8),
you must also create an up-to-date state indicator for the database
context. slapadd(8) stores the contextCSN in the
syncProviderSubentry if it is given the -w flag. It is also possible
to create the syncProviderSubentry with an appropriate contextCSN
value by including it directly in the LDIF file. If slapadd(8) runs
without the -w flag, the contextCSN provided in the LDIF file is
stored. With the -w flag, a new value based on the current time is
stored as the contextCSN. slapcat(8) can be used to retrieve the
directory together with the contextCSN when it is run with the -m flag.

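The two commands below sketch this workflow. The first loads the
provider database and writes a fresh contextCSN, and the second dumps
the directory together with its contextCSN. The configuration file
path and the LDIF file names are hypothetical.

>       slapadd -w -f /usr/local/etc/openldap/slapd.conf -l provider.ldif
>       slapcat -m -f /usr/local/etc/openldap/slapd.conf -l provider-dump.ldif
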
Only the back-bdb and the back-hdb backends can act as an LDAP
Sync replication provider. Back-ldbm currently does not have the
LDAP Content Sync protocol functionality.

H3: Set up the consumer slapd

The consumer slapd is configured through the slapd.conf(5)
configuration file. For the configuration directives, see the syncrepl
section of the slapd Configuration File chapter. In the configuration
file, make sure the DN given in the updatedn= directive of the syncrepl
specification has permission to write to the database. Below is an
example syncrepl specification for the consumer replica:

>       syncrepl id = 1
>               provider=ldap://provider.example.com:389
>               updatedn="cn=replica,dc=example,dc=com"
>               binddn="cn=syncuser,dc=example,dc=com"
>               bindmethod=simple
>               credentials=secret
>               searchbase="dc=example,dc=com"
>               filter="(objectClass=organizationalPerson)"
>               attrs="cn,sn,ou,telephoneNumber,title,l"
>               schemachecking=on
>               scope=sub
>               type=refreshOnly
>               interval=01:00:00

In this example, the consumer connects to the provider slapd
at ldap://provider.example.com:389 to perform a polling
(refreshOnly) mode of synchronization once a day. It binds as
"cn=syncuser,dc=example,dc=com" using simple authentication with
password "secret". Note that the access control privilege of the DN
specified by the binddn= directive should be set properly to
synchronize the desired replica content. The consumer writes to
its database with the privilege of the "cn=replica,dc=example,dc=com"
entry as specified by the updatedn= directive. The updatedn entry
should have write permission to the database.

The synchronization search in the example searches for entries
whose objectClass is organizationalPerson in the entire subtree
under the "dc=example,dc=com" search base, including the base entry
itself. The requested attributes are cn, sn, ou, telephoneNumber,
title, and l. Schema checking is turned on, so the consumer
{{slapd}}(8) enforces entry schema checking when it processes updates
from the provider {{slapd}}(8).

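For comparison, a persistent variant of the same specification might
look as follows. This is a sketch rather than a tested configuration.
It changes only the type= value to refreshAndPersist, the other mode
named in this chapter, and drops interval= on the assumption that a
polling interval is meaningful only in the refreshOnly mode.

>       syncrepl id = 1
>               provider=ldap://provider.example.com:389
>               updatedn="cn=replica,dc=example,dc=com"
>               binddn="cn=syncuser,dc=example,dc=com"
>               bindmethod=simple
>               credentials=secret
>               searchbase="dc=example,dc=com"
>               filter="(objectClass=organizationalPerson)"
>               attrs="cn,sn,ou,telephoneNumber,title,l"
>               schemachecking=on
>               scope=sub
>               type=refreshAndPersist
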
The LDAP Sync replication engine is backend independent. All three
native backends (back-bdb, back-hdb, and back-ldbm) can act as an
LDAP Sync replication consumer.

H3: Start the provider and the consumer slapd

If the currently running provider {{slapd}}(8) already has the
syncProviderSubentry in its database, you do not need to restart it
when you start a replicated LDAP service. When you run a consumer
{{slapd}}(8), it immediately performs either an initial full reload,
if the cookie is NULL or too far out of date, or an incremental
synchronization, if a valid cookie is provided. In the refreshOnly
mode, the next synchronization session is scheduled to run one
interval after the completion of the current session. In the
refreshAndPersist mode, the synchronization session stays open
between the consumer and the provider. The provider sends an update
message whenever the provider replica is updated.
 252 The provider will send update message whenever there are updates
 253 in the provider replica.