X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;f=libraries%2Fliblunicode%2FCompositionExclusions.txt;h=07a60b8b92047a209a345811010eae6689d823d4;hb=0ea43c9d7d2fbf08b2078931bebfcaf443230878;hp=53f846729d5655fd35e59b62cfc0eda948f378d9;hpb=1bea1fdd343666e70f68be36aa075fde2b342a0e;p=openldap diff --git a/libraries/liblunicode/CompositionExclusions.txt b/libraries/liblunicode/CompositionExclusions.txt index 53f846729d..07a60b8b92 100644 --- a/libraries/liblunicode/CompositionExclusions.txt +++ b/libraries/liblunicode/CompositionExclusions.txt @@ -1,145 +1,176 @@ -# CompositionExclusions-2.txt +# CompositionExclusions-3.2.0.txt +# Date: 2002-03-19,23:30:28 GMT [MD] # -# Composition Exclusions -# This file lists the characters from the UTR #15 Composition Exclusion Table. +# This file lists the characters from the UAX #15 Composition Exclusion Table. +# +# The format of the comments in this file has been updated since the last version, +# CompositionExclusions-3.txt. The only substantive change to this file between that +# version and this one is the addition of U+2ADC FORKING. # # For more information, see # http://www.unicode.org/unicode/reports/tr15/#Primary Exclusion List Table +# ================================================ # (1) Script Specifics # This list of characters cannot be derived from the UnicodeData file. +# ================================================ + +0958 # DEVANAGARI LETTER QA +0959 # DEVANAGARI LETTER KHHA +095A # DEVANAGARI LETTER GHHA +095B # DEVANAGARI LETTER ZA +095C # DEVANAGARI LETTER DDDHA +095D # DEVANAGARI LETTER RHA +095E # DEVANAGARI LETTER FA +095F # DEVANAGARI LETTER YYA +09DC # BENGALI LETTER RRA +09DD # BENGALI LETTER RHA +09DF # BENGALI LETTER YYA +0A33 # GURMUKHI LETTER LLA +0A36 # GURMUKHI LETTER SHA +0A59 # GURMUKHI LETTER KHHA +0A5A # GURMUKHI LETTER GHHA +0A5B # GURMUKHI LETTER ZA +0A5E # GURMUKHI LETTER FA +0B5C # ORIYA LETTER RRA +0B5D # ORIYA LETTER RHA +0F43 # TIBETAN LETTER GHA +0F4D # TIBETAN LETTER DDHA +0F52 # TIBETAN LETTER DHA +0F57 # TIBETAN LETTER BHA +0F5C # TIBETAN LETTER DZHA +0F69 # TIBETAN LETTER KSSA +0F76 # TIBETAN VOWEL SIGN VOCALIC R +0F78 # TIBETAN VOWEL SIGN VOCALIC L +0F93 # TIBETAN SUBJOINED LETTER GHA +0F9D # TIBETAN SUBJOINED LETTER DDHA +0FA2 # TIBETAN SUBJOINED LETTER DHA +0FA7 # TIBETAN SUBJOINED LETTER BHA +0FAC # TIBETAN SUBJOINED LETTER DZHA +0FB9 # TIBETAN SUBJOINED LETTER KSSA +FB1D # HEBREW LETTER YOD WITH HIRIQ +FB1F # HEBREW LIGATURE YIDDISH YOD YOD PATAH +FB2A # HEBREW LETTER SHIN WITH SHIN DOT +FB2B # HEBREW LETTER SHIN WITH SIN DOT +FB2C # HEBREW LETTER SHIN WITH DAGESH AND SHIN DOT +FB2D # HEBREW LETTER SHIN WITH DAGESH AND SIN DOT +FB2E # HEBREW LETTER ALEF WITH PATAH +FB2F # HEBREW LETTER ALEF WITH QAMATS +FB30 # HEBREW LETTER ALEF WITH MAPIQ +FB31 # HEBREW LETTER BET WITH DAGESH +FB32 # HEBREW LETTER GIMEL WITH DAGESH +FB33 # HEBREW LETTER DALET WITH DAGESH +FB34 # HEBREW LETTER HE WITH MAPIQ +FB35 # HEBREW LETTER VAV WITH DAGESH +FB36 # HEBREW LETTER ZAYIN WITH DAGESH +FB38 # HEBREW LETTER TET WITH DAGESH +FB39 # HEBREW LETTER YOD WITH DAGESH +FB3A # HEBREW LETTER FINAL KAF WITH DAGESH +FB3B # HEBREW LETTER KAF WITH DAGESH +FB3C # HEBREW LETTER LAMED WITH DAGESH +FB3E # HEBREW LETTER MEM WITH DAGESH +FB40 # HEBREW LETTER NUN WITH DAGESH +FB41 # HEBREW LETTER SAMEKH WITH DAGESH +FB43 # HEBREW LETTER FINAL PE WITH DAGESH +FB44 # HEBREW LETTER PE WITH DAGESH +FB46 # HEBREW LETTER TSADI WITH DAGESH +FB47 # HEBREW LETTER QOF WITH DAGESH +FB48 # HEBREW LETTER RESH WITH DAGESH +FB49 # HEBREW LETTER SHIN WITH DAGESH +FB4A # HEBREW LETTER TAV WITH DAGESH +FB4B # HEBREW LETTER VAV WITH HOLAM +FB4C # HEBREW LETTER BET WITH RAFE +FB4D # HEBREW LETTER KAF WITH RAFE +FB4E # HEBREW LETTER PE WITH RAFE + +# Total code points: 67 + +# ================================================ +# (2) Post Composition Version precomposed characters +# These characters cannot be derived solely from the UnicodeData.txt file +# in this version of Unicode. +# ================================================ -0958 # DEVANAGARI LETTER QA -0959 # DEVANAGARI LETTER KHHA -095A # DEVANAGARI LETTER GHHA -095B # DEVANAGARI LETTER ZA -095C # DEVANAGARI LETTER DDDHA -095D # DEVANAGARI LETTER RHA -095E # DEVANAGARI LETTER FA -095F # DEVANAGARI LETTER YYA -09DC # BENGALI LETTER RRA -09DD # BENGALI LETTER RHA -09DF # BENGALI LETTER YYA -0A33 # GURMUKHI LETTER LLA -0A36 # GURMUKHI LETTER SHA -0A59 # GURMUKHI LETTER KHHA -0A5A # GURMUKHI LETTER GHHA -0A5B # GURMUKHI LETTER ZA -0A5E # GURMUKHI LETTER FA -0B5C # ORIYA LETTER RRA -0B5D # ORIYA LETTER RHA -0F43 # TIBETAN LETTER GHA -0F4D # TIBETAN LETTER DDHA -0F52 # TIBETAN LETTER DHA -0F57 # TIBETAN LETTER BHA -0F5C # TIBETAN LETTER DZHA -0F69 # TIBETAN LETTER KSSA -0F76 # TIBETAN VOWEL SIGN VOCALIC R -0F78 # TIBETAN VOWEL SIGN VOCALIC L -0F93 # TIBETAN SUBJOINED LETTER GHA -0F9D # TIBETAN SUBJOINED LETTER DDHA -0FA2 # TIBETAN SUBJOINED LETTER DHA -0FA7 # TIBETAN SUBJOINED LETTER BHA -0FAC # TIBETAN SUBJOINED LETTER DZHA -0FB9 # TIBETAN SUBJOINED LETTER KSSA -FB1F # HEBREW LIGATURE YIDDISH YOD YOD PATAH -FB2A # HEBREW LETTER SHIN WITH SHIN DOT -FB2B # HEBREW LETTER SHIN WITH SIN DOT -FB2C # HEBREW LETTER SHIN WITH DAGESH AND SHIN DOT -FB2D # HEBREW LETTER SHIN WITH DAGESH AND SIN DOT -FB2E # HEBREW LETTER ALEF WITH PATAH -FB2F # HEBREW LETTER ALEF WITH QAMATS -FB30 # HEBREW LETTER ALEF WITH MAPIQ -FB31 # HEBREW LETTER BET WITH DAGESH -FB32 # HEBREW LETTER GIMEL WITH DAGESH -FB33 # HEBREW LETTER DALET WITH DAGESH -FB34 # HEBREW LETTER HE WITH MAPIQ -FB35 # HEBREW LETTER VAV WITH DAGESH -FB36 # HEBREW LETTER ZAYIN WITH DAGESH -FB38 # HEBREW LETTER TET WITH DAGESH -FB39 # HEBREW LETTER YOD WITH DAGESH -FB3A # HEBREW LETTER FINAL KAF WITH DAGESH -FB3B # HEBREW LETTER KAF WITH DAGESH -FB3C # HEBREW LETTER LAMED WITH DAGESH -FB3E # HEBREW LETTER MEM WITH DAGESH -FB40 # HEBREW LETTER NUN WITH DAGESH -FB41 # HEBREW LETTER SAMEKH WITH DAGESH -FB43 # HEBREW LETTER FINAL PE WITH DAGESH -FB44 # HEBREW LETTER PE WITH DAGESH -FB46 # HEBREW LETTER TSADI WITH DAGESH -FB47 # HEBREW LETTER QOF WITH DAGESH -FB48 # HEBREW LETTER RESH WITH DAGESH -FB49 # HEBREW LETTER SHIN WITH DAGESH -FB4A # HEBREW LETTER TAV WITH DAGESH -FB4B # HEBREW LETTER VAV WITH HOLAM -FB4C # HEBREW LETTER BET WITH RAFE -FB4D # HEBREW LETTER KAF WITH RAFE -FB4E # HEBREW LETTER PE WITH RAFE +2ADC # FORKING +1D15E # MUSICAL SYMBOL HALF NOTE +1D15F # MUSICAL SYMBOL QUARTER NOTE +1D160 # MUSICAL SYMBOL EIGHTH NOTE +1D161 # MUSICAL SYMBOL SIXTEENTH NOTE +1D162 # MUSICAL SYMBOL THIRTY-SECOND NOTE +1D163 # MUSICAL SYMBOL SIXTY-FOURTH NOTE +1D164 # MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE +1D1BB # MUSICAL SYMBOL MINIMA +1D1BC # MUSICAL SYMBOL MINIMA BLACK +1D1BD # MUSICAL SYMBOL SEMIMINIMA WHITE +1D1BE # MUSICAL SYMBOL SEMIMINIMA BLACK +1D1BF # MUSICAL SYMBOL FUSA WHITE +1D1C0 # MUSICAL SYMBOL FUSA BLACK -# (2) Post Composition Version characters -# These characters cannot be derived from the UnicodeData file. -# (There are no characters in this category in this version of Unicode.) +# Total code points: 14 +# ================================================ # (3) Singleton Decompositions # These characters can be derived from the UnicodeData file # by including all characters whose canonical decomposition # consists of a single character. # These characters are simply quoted here for reference. +# ================================================ -# 0340 COMBINING GRAVE TONE MARK -# 0341 COMBINING ACUTE TONE MARK -# 0343 COMBINING GREEK KORONIS -# 0374 GREEK NUMERAL SIGN -# 037E GREEK QUESTION MARK -# 0387 GREEK ANO TELEIA -# 1F71 GREEK SMALL LETTER ALPHA WITH OXIA -# 1F73 GREEK SMALL LETTER EPSILON WITH OXIA -# 1F75 GREEK SMALL LETTER ETA WITH OXIA -# 1F77 GREEK SMALL LETTER IOTA WITH OXIA -# 1F79 GREEK SMALL LETTER OMICRON WITH OXIA -# 1F7B GREEK SMALL LETTER UPSILON WITH OXIA -# 1F7D GREEK SMALL LETTER OMEGA WITH OXIA -# 1FBB GREEK CAPITAL LETTER ALPHA WITH OXIA -# 1FBE GREEK PROSGEGRAMMENI -# 1FC9 GREEK CAPITAL LETTER EPSILON WITH OXIA -# 1FCB GREEK CAPITAL LETTER ETA WITH OXIA -# 1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA -# 1FDB GREEK CAPITAL LETTER IOTA WITH OXIA -# 1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA -# 1FEB GREEK CAPITAL LETTER UPSILON WITH OXIA -# 1FEE GREEK DIALYTIKA AND OXIA -# 1FEF GREEK VARIA -# 1FF9 GREEK CAPITAL LETTER OMICRON WITH OXIA -# 1FFB GREEK CAPITAL LETTER OMEGA WITH OXIA -# 1FFD GREEK OXIA -# 2000 EN QUAD -# 2001 EM QUAD -# 2126 OHM SIGN -# 212A KELVIN SIGN -# 212B ANGSTROM SIGN -# 2329 LEFT-POINTING ANGLE BRACKET -# 232A RIGHT-POINTING ANGLE BRACKET -# F900 CJK COMPATIBILITY IDEOGRAPH-F900 -#.. FA0D CJK COMPATIBILITY IDEOGRAPH-FA0D -# FA10 CJK COMPATIBILITY IDEOGRAPH-FA10 -# FA12 CJK COMPATIBILITY IDEOGRAPH-FA12 -# FA15 CJK COMPATIBILITY IDEOGRAPH-FA15 -#.. FA1E CJK COMPATIBILITY IDEOGRAPH-FA1E -# FA20 CJK COMPATIBILITY IDEOGRAPH-FA20 -# FA22 CJK COMPATIBILITY IDEOGRAPH-FA22 -# FA25 CJK COMPATIBILITY IDEOGRAPH-FA25 -# FA26 CJK COMPATIBILITY IDEOGRAPH-FA26 -# FA2A CJK COMPATIBILITY IDEOGRAPH-FA2A -#.. FA2D CJK COMPATIBILITY IDEOGRAPH-FA2D +# 0340..0341 [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK +# 0343 COMBINING GREEK KORONIS +# 0374 GREEK NUMERAL SIGN +# 037E GREEK QUESTION MARK +# 0387 GREEK ANO TELEIA +# 1F71 GREEK SMALL LETTER ALPHA WITH OXIA +# 1F73 GREEK SMALL LETTER EPSILON WITH OXIA +# 1F75 GREEK SMALL LETTER ETA WITH OXIA +# 1F77 GREEK SMALL LETTER IOTA WITH OXIA +# 1F79 GREEK SMALL LETTER OMICRON WITH OXIA +# 1F7B GREEK SMALL LETTER UPSILON WITH OXIA +# 1F7D GREEK SMALL LETTER OMEGA WITH OXIA +# 1FBB GREEK CAPITAL LETTER ALPHA WITH OXIA +# 1FBE GREEK PROSGEGRAMMENI +# 1FC9 GREEK CAPITAL LETTER EPSILON WITH OXIA +# 1FCB GREEK CAPITAL LETTER ETA WITH OXIA +# 1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA +# 1FDB GREEK CAPITAL LETTER IOTA WITH OXIA +# 1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA +# 1FEB GREEK CAPITAL LETTER UPSILON WITH OXIA +# 1FEE..1FEF [2] GREEK DIALYTIKA AND OXIA..GREEK VARIA +# 1FF9 GREEK CAPITAL LETTER OMICRON WITH OXIA +# 1FFB GREEK CAPITAL LETTER OMEGA WITH OXIA +# 1FFD GREEK OXIA +# 2000..2001 [2] EN QUAD..EM QUAD +# 2126 OHM SIGN +# 212A..212B [2] KELVIN SIGN..ANGSTROM SIGN +# 2329 LEFT-POINTING ANGLE BRACKET +# 232A RIGHT-POINTING ANGLE BRACKET +# F900..FA0D [270] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA0D +# FA10 CJK COMPATIBILITY IDEOGRAPH-FA10 +# FA12 CJK COMPATIBILITY IDEOGRAPH-FA12 +# FA15..FA1E [10] CJK COMPATIBILITY IDEOGRAPH-FA15..CJK COMPATIBILITY IDEOGRAPH-FA1E +# FA20 CJK COMPATIBILITY IDEOGRAPH-FA20 +# FA22 CJK COMPATIBILITY IDEOGRAPH-FA22 +# FA25..FA26 [2] CJK COMPATIBILITY IDEOGRAPH-FA25..CJK COMPATIBILITY IDEOGRAPH-FA26 +# FA2A..FA2D [4] CJK COMPATIBILITY IDEOGRAPH-FA2A..CJK COMPATIBILITY IDEOGRAPH-FA2D +# FA30..FA6A [59] CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPATIBILITY IDEOGRAPH-FA6A +# 2F800..2FA1D [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D +# Total code points: 924 + +# ================================================ # (4) Non-Starter Decompositions # These characters can be derived from the UnicodeData file # by including all characters whose canonical decomposition consists # of a sequence of characters, the first of which has a non-zero # combining class. # These characters are simply quoted here for reference. +# ================================================ + +# 0344 COMBINING GREEK DIALYTIKA TONOS +# 0F73 TIBETAN VOWEL SIGN II +# 0F75 TIBETAN VOWEL SIGN UU +# 0F81 TIBETAN VOWEL SIGN REVERSED II + +# Total code points: 4 -# 0344 COMBINING GREEK DIALYTIKA TONOS -# 0F73 TIBETAN VOWEL SIGN II -# 0F75 TIBETAN VOWEL SIGN UU -# 0F81 TIBETAN VOWEL SIGN REVERSED II