/******************************************************************************
 *
 * Copyright (C) 2000 Pierangelo Masarati, <ando@sys-net.it>
 *
 * Permission is granted to anyone to use this software for any purpose
 * on any computer system, and to alter it and redistribute it, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits should appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits should appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 *
 ******************************************************************************/
 *
 * A string is rewritten according to a set of rules, called
 * a `rewrite context'.
 * The rules are based on Regular Expressions (POSIX regex) with
 * substring matching; extensions are planned to allow basic variable
 * substitution and map resolution of substrings.
 * The behavior of pattern matching/substitution can be altered by a
 * set of flags.
 *
 * The underlying concept is to build a lightweight rewrite module
 * for the slapd server (initially dedicated to the back-ldap module).
 *
 * An incoming string is matched against a set of rules. Rules are made
 * of a match pattern, a substitution pattern and a set of actions.
 * On a match, the string is rewritten according to the substitution
 * pattern, which can refer to substrings matched in the incoming
 * string. The actions, if any, are performed last.
 * The substitution pattern allows map resolution of substrings.
 * A map is a generic object that maps a substitution pattern to a
 * value.
 *
 * Pattern Matching Flags
 *
 *      'C'     honors case in matching (default is case insensitive)
 *      'R'     uses POSIX Basic Regular Expressions (default is Extended)
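 *
 * As an illustration only (this is not the module's actual code), the
 * two pattern matching flags above could be mapped onto the cflags of
 * POSIX regcomp() roughly as follows. The helper names pattern_cflags()
 * and rule_matches() are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

// Hypothetical helper: translate the pattern matching flag letters
// into cflags for regcomp(). Returns -1 on an unknown letter.
int pattern_cflags(const char *flags)
{
    int cflags = REG_EXTENDED | REG_ICASE;   // the documented defaults

    for (; flags != NULL && *flags != '\0'; flags++) {
        switch (*flags) {
        case 'C': cflags &= ~REG_ICASE;    break;  // honor case
        case 'R': cflags &= ~REG_EXTENDED; break;  // basic regex
        default:  return -1;
        }
    }
    return cflags;
}

// Compile `pattern` under `flags` and test it against `input`.
// Returns 1 on match, 0 on no match, -1 on error.
int rule_matches(const char *pattern, const char *flags, const char *input)
{
    regex_t re;
    int cflags = pattern_cflags(flags);
    int rc;

    if (cflags < 0 || regcomp(&re, pattern, cflags) != 0)
        return -1;
    rc = regexec(&re, input, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

 * With no flags, "^dc=home" matches "DC=Home,dc=net"; with 'C' it does
 * not, because case is then significant.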
 *
 * Action Flags
 *
 *      ':'     apply the rule once only (default is recursive)
 *      '@'     stop applying rules in case of match.
 *      '#'     stop current operation if the rule matches, and issue an
 *              `unwilling to perform' error.
 *      'G{n}'  jump n rules backward or forward (watch for loops!). Note
 *              that 'G{1}' is implicit in every rule.
 *      'I'     ignore errors in the rule; this means that an error, e.g.
 *              one issued by a map, is treated as a missed match.
 *              The `unwilling to perform' error is not overridden.
 *
 * The ordering of the flags is significant. For instance:
 *
 *      'IG{2}' means ignore errors and jump two lines ahead both in case
 *              of match and in case of error, while
 *      'G{2}I' means ignore errors, but jump two lines ahead only in case
 *              of match.
 *
 * More flags (mainly Action Flags) will be added as needed.
 *
 * String Substitution:
 *
 * the string substitution happens according to a substitution pattern.
 *      - substring substitution is allowed with the syntax '\d'
 *        where 'd' is a digit ranging 0-9 (0 is the full match).
 *        I see that 0-9 digit expansion is a widely accepted
 *        practice; however there is no technical reason to use
 *        such a strict limit. A syntax of the form '\{ddd}'
 *        should be fine if there is any need to use a higher
 *        number of possible submatches.
 *      - variable substitution will be allowed (at least when I
 *        figure out which kind of variable could be profitably
 *        used).
 *      - map lookup will be allowed (map lookup of substring matches
 *        in gdbm, ldap(!), math(?) and so on maps 'a la sendmail').
 *      - subroutine invocation will make it possible to rewrite a
 *        submatch in terms of the output of another rewriteContext.
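 *
 * The basic '\d' substring substitution listed above can be sketched
 * with POSIX regexec() submatches. This is a minimal illustration under
 * stated assumptions, not the module's implementation: expand_rule() is
 * a hypothetical name, the result is taken to be the expanded
 * substitution pattern alone, and '\\' is copied through unchanged.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBMATCH 10   // covers \0 .. \9

// Expand '\d' references in `subst` with the submatches of `pattern`
// in `input`, writing a NUL-terminated result into `out`.
// Returns 0 on success, 1 if the pattern does not match, -1 on error.
int expand_rule(const char *pattern, const char *subst,
                const char *input, char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[MAX_SUBMATCH];
    size_t used = 0;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return -1;
    rc = regexec(&re, input, MAX_SUBMATCH, m, 0);
    regfree(&re);                 // the offsets in m[] stay valid
    if (rc != 0)
        return rc == REG_NOMATCH ? 1 : -1;

    for (const char *p = subst; *p != '\0'; p++) {
        const char *src = p;      // default: copy one literal character
        size_t len = 1;

        if (p[0] == '\\' && p[1] >= '0' && p[1] <= '9') {
            const regmatch_t *sm = &m[p[1] - '0'];
            // a group that did not take part expands to nothing
            src = input + (sm->rm_so < 0 ? 0 : sm->rm_so);
            len = sm->rm_so < 0 ? 0 : (size_t)(sm->rm_eo - sm->rm_so);
            p++;
        } else if (p[0] == '\\' && p[1] == '\\') {
            len = 2;              // escaped backslash, left as is
            p++;
        }
        if (used + len + 1 > outlen)
            return -1;            // output buffer too small
        memcpy(out + used, src, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

 * For instance, pattern "(.*)dc=home,[ ]?dc=net" with substitution
 * "\1dc=OpenLDAP, dc=org" turns "cn=me,dc=home,dc=net" into
 * "cn=me,dc=OpenLDAP, dc=org".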
 *
 *      '\' {0-9} [ '{' <name> [ '(' <args> ')' ] '}' ]
 *
 * where <name> is the name of a built-in map, and
 * <args> are optional arguments to the map, if
 * the map <name> requires them.
 * The following experimental maps have been implemented:
 *
 *      \n{xpasswd}
 *              maps the n-th substring match as uid to
 *              the gecos field in /etc/passwd;
 *
 *      \n{xfile(/absolute/path)}
 *              maps the n-th substring match
 *              to a 'key value' style plain text file.
 *
 *      \n{xldap(ldap://url/with?%0?in?filter)}
 *              maps the n-th substring match to an
 *              attribute retrieved by means of an LDAP
 *              url with substitution of %0 in the filter
 *
 *      - everything starting with '\' requires substitution;
 *      - the only obvious exception is '\\', which is left as is;
 *      - the basic substitution is '\d', where 'd' is a digit;
 *        0 means the whole string, while 1-9 is a submatch;
 *      - in the outdated schema, the digit may be optionally
 *        followed by a '{', which means pipe the submatch into
 *        the map described by the string up to the following '}';
 *      - the output of the map is used instead of the submatch;
 *      - in the new schema, a '\' followed by a '{' invokes an
 *        advanced substitution scheme. The pattern is:
 *
 *          '\' '{' [{ <op> }] <name> '(' <substitution schema> ')' '}'
 *
 *        where <name> must be a legal name for the map, i.e.
 *
 *          <name> ::= [a-z][a-z0-9]* (case insensitive)
 *          <op> ::= '>' '|' '&' '&&' '*' '**' '$'
 *
 *        and <substitution schema> must be a legal substitution
 *        schema, with no limits on the nesting level.
 *
 *          >   sub context invocation; <name> must be a legal,
 *              already defined rewrite context name
 *          |   external command invocation; <name> must refer
 *              to a legal, already defined command name (NOT IMPL.)
 *          &   variable assignment; <name> defines a variable
 *              in the running operation structure which can be
 *              dereferenced later (NOT IMPL.)
 *          *   variable dereferencing; <name> must refer to a
 *              variable that is defined and assigned for the
 *              running operation (NOT IMPL.)
 *          $   parameter dereferencing; <name> must refer to
 *              an existing parameter; the idea is to make
 *              some run-time parameters set by the system
 *              available to the rewrite engine, such as the client
 *              host name, the bind dn if any, constant
 *              parameters initialized at config time, and so
 *              on.
 *
 * Note: as the slapd parsing routines escape backslashes ('\'),
 * a double backslash is required inside substitution patterns.
 * To overcome the resulting heavy notation, the substitution escaping
 * has been delegated to the '%' symbol, which should be used
 * instead of '\' in string substitution patterns. The symbol can
 * be altered at will by redefining the related macro in "rewrite-int.h".
 * In the current snapshot, all the '\' on the left side of each rule
 * (the regex pattern) must be converted to '\\'; all the '\' on the
 * right side of the rule (the substitution pattern) must be turned
 * into '%'. In the following examples, the original (more readable)
 * syntax is used; however, in the servers/slapd/back-ldap/slapd.conf
 * example file, the working syntax is used.
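 *
 * The conversion from the readable syntax to the working syntax
 * described in the note above is mechanical. A minimal sketch follows;
 * the function name to_working_syntax() and the 'P'/'S' mode letters
 * are hypothetical, and '%' is assumed to be the configured escape
 * symbol.

```c
#include <stddef.h>

// Copy `src` into `dst`, converting each backslash according to `mode`:
//   'P' (regex pattern):        one backslash becomes two
//   'S' (substitution pattern): one backslash becomes '%'
// Returns the length written, or -1 if `dst` is too small or `mode`
// is unknown.
long to_working_syntax(const char *src, char mode, char *dst, size_t dstlen)
{
    size_t used = 0;

    if (dstlen == 0 || (mode != 'P' && mode != 'S'))
        return -1;
    for (; *src != '\0'; src++) {
        size_t need = (*src == '\\' && mode == 'P') ? 2 : 1;

        if (used + need + 1 > dstlen)
            return -1;              // no room for the text plus NUL
        if (*src != '\\') {
            dst[used++] = *src;     // ordinary character
        } else if (mode == 'P') {
            dst[used++] = '\\';
            dst[used++] = '\\';
        } else {
            dst[used++] = '%';
        }
    }
    dst[used] = '\0';
    return (long)used;
}
```

 * In mode 'S' the substitution pattern "\1dc=org" becomes "%1dc=org".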
 *
 * a rewrite context is a set of rules which are applied in sequence.
 * The basic idea is to have an application initialize a rewrite
 * engine (think of Apache's mod_rewrite ...) with a set of rewrite
 * contexts; when string rewriting is required, one invokes the
 * appropriate rewrite context with the input string and obtains the
 * newly rewritten one if no errors occur.
 *
 * An interesting application, in back-ldap or in slapd itself,
 * could associate each basic server operation to a rewrite context
 * (most of them possibly aliasing the default one). Then, DN rewriting
 * could take place at any invocation of a backend operation.
 *
 *      default         if defined and no specific context is available
 *      searchFilter    search
 *      newSuperiorDn   modrdn
 *
 *      searchResult    search (only if defined; no default)
 *
 * Configuration syntax:
 *
 *      rewriteEngine     { on | off }
 *
 *      rewriteContext    <context name> [ alias <aliased context name> ]
 *
 *      rewriteRule       <regex pattern> <substitution pattern> [ <flags> ]
 *
 *      rewriteMap        <map type> <map name> [ <map attrs> ]
 *
 *      rewriteParam      <param name> <param value>
 *
 *      rewriteMaxPasses  <number of passes>
 *
 * rewriteEngine:
 *      if 'on', the requested rewriting is performed; if 'off', no
 *      rewriting takes place (an easy way to stop rewriting without
 *      altering the configuration file too much)
 *
 * rewriteContext:
 *      <context name> is the name that identifies the context, i.e.
 *      the name used by the application to refer to the set of rules
 *      it contains. It is also used to reference sub contexts in
 *      string rewriting. A context may alias another one. In this
 *      case the alias context contains no rule, and any reference to
 *      it will result in accessing the aliased one.
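 *
 * The alias behavior just described (an alias holds no rules and any
 * reference to it reaches the aliased context) might be modeled as in
 * the sketch below. The struct layout and the find_context() name are
 * assumptions made for illustration; the hop limit is an added guard
 * against alias cycles, not something the text specifies.

```c
#include <stddef.h>
#include <string.h>

struct context {
    const char *name;       // context name used by the application
    const char *alias_of;   // NULL for a context that owns rules
    int nrules;             // stand-in for the real rule list
};

// Look up `name` in `tab`; an alias resolves to the aliased context.
// Returns NULL if the name is unknown or the alias chain never ends.
const struct context *find_context(const struct context *tab, size_t n,
                                   const char *name)
{
    for (size_t hops = 0; hops <= n; hops++) {
        const struct context *hit = NULL;

        for (size_t i = 0; i < n; i++) {
            if (strcmp(tab[i].name, name) == 0) {
                hit = &tab[i];
                break;
            }
        }
        if (hit == NULL)
            return NULL;            // no such context
        if (hit->alias_of == NULL)
            return hit;             // a real context: done
        name = hit->alias_of;       // follow the alias
    }
    return NULL;                    // too many hops: alias cycle
}
```

 * With a table holding 'default' and 'searchBase alias default', a
 * lookup of 'searchBase' yields the 'default' entry.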
 *
 * rewriteRule:
 *      determines how a string can be rewritten if a pattern is matched.
 *      Examples are reported below.
 *
 * rewriteMap:
 *      defines a map that transforms substring rewriting into
 *      something else. The map is referenced inside the substitution
 *      pattern of a rule.
 *
 * rewriteParam:
 *      sets a value with global scope, that can be dereferenced by the
 *      command '\{$paramName}'.
 *
 * rewriteMaxPasses:
 *      sets the maximum number of total rewriting passes that can be
 *      performed in a single rewriting operation (to avoid loops).
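 *
 * To show why a pass limit matters for recursive rules, the sketch
 * below applies one rule, "(.*),[ ](.*)" -> "\1,\2", again and again
 * until it no longer matches or a pass budget runs out. It is only an
 * illustration: eat_blanks() is a hypothetical name, and counting one
 * pass per rule application is an assumption about how passes are
 * counted.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

// Repeatedly rewrite `s` in place (buffer size `slen`) with the rule
//   "(.*),[ ](.*)"  ->  "\1,\2"
// Returns the number of passes used, or -1 on error or when more than
// `max_passes` passes would be needed.
int eat_blanks(char *s, size_t slen, int max_passes)
{
    regex_t re;
    regmatch_t m[3];
    int passes = 0;

    if (regcomp(&re, "(.*),[ ](.*)", REG_EXTENDED) != 0)
        return -1;
    while (regexec(&re, s, 3, m, 0) == 0) {
        char tmp[256];
        int n;

        if (++passes > max_passes) {
            passes = -1;            // budget exhausted: give up
            break;
        }
        // build "\1,\2" from the two submatches
        n = snprintf(tmp, sizeof tmp, "%.*s,%.*s",
                     (int)(m[1].rm_eo - m[1].rm_so), s + m[1].rm_so,
                     (int)(m[2].rm_eo - m[2].rm_so), s + m[2].rm_so);
        if (n < 0 || (size_t)n >= sizeof tmp || (size_t)n >= slen) {
            passes = -1;            // result does not fit
            break;
        }
        memcpy(s, tmp, (size_t)n + 1);
    }
    regfree(&re);
    return passes;
}
```

 * Each pass removes one blank, so "a, b, c" needs two passes to become
 * "a,b,c"; with a budget of one pass the operation fails instead.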
 *
 * Configuration examples:
 *
 * # set to 'off' to disable rewriting
 *
 * # everything defined here goes into the 'default' context
 * # this rule changes the naming context of anything sent to
 * # 'dc=home,dc=net' to 'dc=OpenLDAP, dc=org'
 *
 * rewriteRule "(.*)dc=home,[ ]?dc=net" "\1dc=OpenLDAP, dc=org" ":"
 *
 * # start a new context (ends input of the previous one)
 * # this rule adds blanks between dn parts if not present.
 *
 * rewriteContext addBlancs
 * rewriteRule "(.*),([^ ].*)" "\1, \2"
 *
 * # this one eats blanks
 *
 * rewriteContext eatBlancs
 * rewriteRule "(.*),[ ](.*)" "\1,\2"
 *
 * # here control goes back to the default rewrite context; rules are
 * # appended to the existing ones.
 * # anything that gets here is piped into context 'addBlancs'
 *
 * rewriteContext default
 * rewriteRule ".*" "\{>addBlancs(\0)}" ":"
 *
 * # anything with 'uid=username' gets looked up in /etc/passwd for
 * # gecos (I know it's nearly useless, but it is there just to
 * # test something fancy!). Note the 'I' flag that leaves
 * # 'uid=username' in place if 'username' does not have a valid
 * # account. Note also the ':' that forces the rule to be processed
 * # only once.
 *
 * rewriteContext uid2Gecos
 * rewriteRule "(.*)uid=([a-z0-9]+),(.+)" "\1cn=\2{xpasswd},\3" "I:"
 *
 * # finally, in case of bind, if one uses a 'uid=username' dn,
 * # it is rewritten to 'cn=name surname' if possible.
 *
 * rewriteContext bindDn
 * rewriteRule ".*" "\{>addBlancs(\{>uid2Gecos(\0)})}" ":"
 *
 * # the search base is rewritten according to 'default' rules
 *
 * rewriteContext searchBase alias default
 *
 * # search results with OpenLDAP dn are rewritten back with
 * # 'dc=home,dc=net' naming context, with spaces eaten.
 *
 * rewriteContext searchResult
 * rewriteRule "(.*[^ ]?)[ ]?dc=OpenLDAP,[ ]?dc=org"
 *              "\{>eatBlancs(\1)}dc=home,dc=net" ":"
 *
 * # bind with email instead of full dn: we first need an ldap map
 * # that turns attributes into a dn (the filter is provided by the
 * # substitution string):
 *
 * rewriteMap ldap attr2dn "ldap://host/dc=my,dc=org?dn?sub"
 *
 * # then we need to detect emails; note that the rule in case of match
 * # stops rewriting; in case of error, it is ignored.
 * # In case we are mapping virtual to real naming contexts, we also
 * # need to rewrite regular dns, because the definition of a bindDn
 * # rewrite context overrides the default definition.
 *
 * rewriteContext bindDn
 * rewriteRule "(mail=[^,]+@[^,]+)" "\{attr2dn(\1)}" "@I"
 *
 * # This is a rather sophisticated example. It massages a search filter
 * # if the user performing the search has administrative privileges.
 * # First we need to keep track of the bind dn of the incoming request:
 *
 * rewriteContext bindDn
 * rewriteRule ".+" "\{**&binddn(\0)}" ":"
 *
 * # a search filter containing 'uid=' is rewritten only if an
 * # appropriate dn is bound.
 * # to do this, in the first rule the bound dn is dereferenced, while
 * # the filter is decomposed into a prefix, the argument of 'uid=',
 * # and a suffix. A tag '<>' is appended to the dn. If the dn
 * # refers to an entry in the 'ou=admin' subtree, the filter is
 * # rewritten OR-ing the 'uid=<arg>' with 'cn=<arg>'; otherwise
 * # it is left as is. This could be useful, for instance, to allow
 * # apache's auth_ldap-1.4 module to authenticate users with both
 * # 'uid' and 'cn', but only if the request comes from a possible
 * # 'dn: cn=Web auth, ou=admin, dc=home, dc=net' user.
 *
 * rewriteContext searchFilter
 * rewriteRule "(.*\()uid=([a-z0-9_]+)(\).*)"
 *              "\{**binddn}<>\{&prefix(\1)}\{&arg(\2)}\{&suffix(\3)}" ":I"
 * rewriteRule "[^,]+,[ ]?ou=admin,[ ]?dc=home,[ ]?dc=net"
 *              "\{*prefix}|(uid=\{*arg})(cn=\{*arg})\{*suffix}" "@I"
 * rewriteRule ".*<>" "\{*prefix}uid=\{*arg}\{*suffix}"
 *
 * LDAP Proxy resolution (a possible evolution of the back-ldap):
 *
 * in case the rewritten dn is an LDAP URL, the operation is initiated
 * towards the host[:port] indicated in the url, if it does not refer
 * to the local server.
 *
 * rewriteRule '^cn=root,.*' '\0'                       'G{3}'
 * rewriteRule '^cn=[a-l].*' 'ldap://ldap1.my.org/\0'   '@'
 * rewriteRule '^cn=[m-z].*' 'ldap://ldap2.my.org/\0'   '@'
 * rewriteRule '.*'          'ldap://ldap3.my.org/\0'   '@'
 *
 * (rule 1 is simply there to illustrate the 'G{n}' action; it could
 * have been written as
 *
 * rewriteRule '^cn=root,.*' 'ldap://ldap3.my.org/\0'   '@'
 *
 * with the advantage of saving one rewrite pass ...)
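 *
 * The effect of an ordered rule list where every rule carries the '@'
 * flag (first match wins) can be sketched as a small routing table.
 * This is an illustration only: route_dn() and the struct are
 * hypothetical, the table uses the single-pass form of rule 1, and
 * actually contacting the remote host is out of scope.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

struct route {
    const char *pattern;      // regex pattern of the rule
    const char *url_prefix;   // text that precedes \0 in the substitution
};

// Hypothetical table mirroring the proxy rules shown above.
static const struct route routes[] = {
    { "^cn=root,.*", "ldap://ldap3.my.org/" },
    { "^cn=[a-l].*", "ldap://ldap1.my.org/" },
    { "^cn=[m-z].*", "ldap://ldap2.my.org/" },
    { ".*",          "ldap://ldap3.my.org/" },
};

// Write the URL for `dn` into `out` using the first matching rule.
// Returns the index of that rule, or -1 on error.
int route_dn(const char *dn, char *out, size_t outlen)
{
    size_t n = sizeof routes / sizeof routes[0];

    for (size_t i = 0; i < n; i++) {
        regex_t re;
        int rc;

        if (regcomp(&re, routes[i].pattern,
                    REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
            return -1;
        rc = regexec(&re, dn, 0, NULL, 0);
        regfree(&re);
        if (rc == 0) {
            // '@' semantics: stop at the first rule that matches
            int len = snprintf(out, outlen, "%s%s",
                               routes[i].url_prefix, dn);
            return (len >= 0 && (size_t)len < outlen) ? (int)i : -1;
        }
    }
    return -1;
}
```

 * A dn such as 'cn=root,dc=my,dc=org' is caught by the first entry
 * even though 'r' also falls in the [m-z] range of the third.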