.../unicodetools/org/unicode/props
This is a set of revised Unicode property tools. The old-style tools were written a long time ago, when Java was much more primitive (slower, and with tighter memory restrictions). These tools instead read the Unicode data files and construct “modern” versions of the properties. Each Unicode property is represented by an enum value, and its property values are backed by a UnicodeMap. The reading process is data-driven and uses regexes to check the values.
Occasionally you need to add a ‘non-standard’ property. Here's what to do, with some examples of changes in the links.
If you are building before the UCD tools have been completely updated to new release X.Y.Z, you need to:
GenerateEnums generates the enums corresponding to the properties and property values. Run it whenever the PropertyAliases.txt or PropertyValueAliases.txt files change.
It will regenerate the following files (see commit 2ff83c6 for examples of changes):
Run UCD.Main to generate new PropertyAliases.txt and PropertyValueAliases.txt.
Note: For some properties and values, it is sufficient to add them to the input PA.txt & PVA.txt files and then run GenerateEnums and UCD.Main. In other cases you also need to change additional .java files.
The properties can be directly compared, such as
    if (prop == UcdProperty.Unicode_1_Name) {
        NOT_IN_ICU.add(prop.toString());
        return;
    }
From the enum you can get the property's type and its names, and you can obtain the enum value from any of those names.
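As a rough illustration of that pattern, here is a minimal self-contained sketch. It does not use the real UcdProperty class: the `DemoProperty` and `DemoType` enums, the method names `getType`, `getNames`, and `forName`, and the loose name matching (ignoring case, spaces, underscores, and hyphens) are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class PropertyEnumDemo {
    public static void main(String[] args) {
        // Look a property up by its short alias, then inspect it.
        DemoProperty p = DemoProperty.forName("bc");
        System.out.println(p + " type=" + p.getType() + " names=" + p.getNames());
    }
}

// Hypothetical property types; only the three needed for this sketch.
enum DemoType { Catalog, Enumerated, Miscellaneous }

// Hypothetical stand-in for a generated property enum (not the real UcdProperty).
enum DemoProperty {
    Age(DemoType.Catalog, "age"),
    Bidi_Class(DemoType.Enumerated, "bc"),
    Unicode_1_Name(DemoType.Miscellaneous, "na1");

    // Index from every normalized name to its enum constant.
    private static final Map<String, DemoProperty> BY_NAME = new HashMap<>();
    static {
        for (DemoProperty p : values()) {
            for (String n : p.names) {
                BY_NAME.put(normalize(n), p);
            }
        }
    }

    private final DemoType type;
    private final List<String> names;

    DemoProperty(DemoType type, String shortName) {
        this.type = type;
        // Short alias first, then the long name taken from the constant itself.
        this.names = Collections.unmodifiableList(Arrays.asList(shortName, name()));
    }

    DemoType getType() { return type; }

    List<String> getNames() { return names; }

    // Returns the constant for any of its names, or throws if the name is unknown.
    static DemoProperty forName(String name) {
        DemoProperty p = BY_NAME.get(normalize(name));
        if (p == null) {
            throw new IllegalArgumentException("Unknown property name: " + name);
        }
        return p;
    }

    // Assumed loose matching: ignore case, whitespace, underscores, and hyphens.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[\\s_-]", "");
    }
}
```

Because the enum constants are singletons, a value obtained through `forName` can be compared with `==`, in the same way as the direct comparison shown earlier.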
To use the property values, call:
final IndexUnicodeProperties iup = IndexUnicodeProperties.make("6.2.0");
When any property is accessed, the in-memory version is checked first. If there is none, the on-disk version is checked (in Generated/BIN/x.y.z). If that is also missing, the relevant Unicode file(s) are read to build the property, which is then cached on disk and in memory.
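The lookup order described above can be sketched as a small layered cache. This is only an illustration under stated assumptions, not the actual IndexUnicodeProperties code: the class `LayeredPropertyCache`, its `get` method, the `.bin` file naming, and the use of plain strings as the cached payload are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical three-step lookup: memory, then disk, then a rebuild from source.
class LayeredPropertyCache {
    private final Map<String, String> inMemory = new HashMap<>();
    private final Path cacheDir;                    // stands in for Generated/BIN/x.y.z
    private final Function<String, String> builder; // stands in for parsing the data files
    int buildCount = 0;                             // exposed only so the sketch can be checked

    LayeredPropertyCache(Path cacheDir, Function<String, String> builder) {
        this.cacheDir = cacheDir;
        this.builder = builder;
    }

    String get(String propertyName) {
        // 1. In-memory copy.
        String value = inMemory.get(propertyName);
        if (value != null) {
            return value;
        }
        Path file = cacheDir.resolve(propertyName + ".bin");
        try {
            if (Files.exists(file)) {
                // 2. On-disk copy.
                value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } else {
                // 3. Build from source, then cache on disk.
                value = builder.apply(propertyName);
                buildCount++;
                Files.createDirectories(cacheDir);
                Files.write(file, value.getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Either way, remember the value in memory.
        inMemory.put(propertyName, value);
        return value;
    }

    // Helper that creates a scratch directory without a checked exception.
    static Path newTempDir() {
        try {
            return Files.createTempDirectory("propcache");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = newTempDir();
        LayeredPropertyCache cache = new LayeredPropertyCache(dir, name -> "values of " + name);
        System.out.println(cache.get("Age"));
        System.out.println(cache.get("Age"));
        System.out.println("builds: " + cache.buildCount);
    }
}
```

In this sketch a second cache instance pointed at the same directory returns the on-disk copy even if its builder would now produce something different, which is the kind of staleness that deleting the cached files avoids.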
IMPORTANT NOTE: If you change the files in ucd/, you must delete the files in Generated/BIN/x.y.z; otherwise the stale cached versions will continue to be used.
To test the XML properties from https://www.unicode.org/Public/XXX/ucdxml/
NOTE: the following (until the test is fixed) are false differences:
*FAIL* kAccountingNumeric with 1114086 errors.
*FAIL* kOtherNumeric with 1114082 errors.
*FAIL* kPrimaryNumeric with 1114095 errors.
*FAIL* kCompatibilityVariant with 1113110 errors.
*FAIL* Bidi_Paired_Bracket with 11 errors.
*FAIL* Name with 11 errors.
The problem is a difference in how missing values are handled.
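To see how a missing-value mismatch can produce error counts close to the total number of code points (0x110000 = 1,114,112), consider the following sketch. It is an assumption about the mechanism, not a description of the actual test: the class `MissingValueDemo`, the choice of `null` versus `"NaN"` as the two missing-value conventions, and the three sample entries are all illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MissingValueDemo {
    static final int CODE_POINT_COUNT = 0x110000; // U+0000..U+10FFFF

    // Compares two views of the same explicit data that differ only in the
    // value they report for code points without an explicit entry.
    static int countDifferences(Map<Integer, String> explicit,
                                String missingInA, String missingInB) {
        int errors = 0;
        for (int cp = 0; cp < CODE_POINT_COUNT; cp++) {
            String a = explicit.getOrDefault(cp, missingInA);
            String b = explicit.getOrDefault(cp, missingInB);
            if (!Objects.equals(a, b)) {
                errors++;
            }
        }
        return errors;
    }

    // Illustrative sample data only; not taken from the real data files.
    static Map<Integer, String> sampleData() {
        Map<Integer, String> explicit = new HashMap<>();
        explicit.put(0x58F9, "1");
        explicit.put(0x8CB3, "2");
        explicit.put(0x53C3, "3");
        return explicit;
    }

    public static void main(String[] args) {
        Map<Integer, String> explicit = sampleData();
        // One side reports null for missing values, the other reports "NaN".
        System.out.println(countDifferences(explicit, null, "NaN"));
        // With the same convention on both sides there is nothing to report.
        System.out.println(countDifferences(explicit, "NaN", "NaN"));
    }
}
```

Under this assumption every code point without an explicit value counts as a difference, so a sparsely populated property reports nearly as many errors as there are code points even though the explicit values agree.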
For a general test of properties, run CheckProperties. You can supply any of the following as parameters:
enum Action {SHOW, COMPARE, ICU, EMPTY, INFO, SPACES, DETAILS, DEFAULTS, JSON, NAMES}
enum Extent {SOME, ALL}
or a version, e.g. 6.2.0
The defaults are: COMPARE ALL {lastversion}
For example, the ICU action compares the ICU values.
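The way such parameters might be interpreted can be sketched as follows. This is not the CheckProperties source: the class `CheckArgs`, its `parse` method, the case-insensitive matching, and the x.y.z version pattern are assumptions for illustration, and the value for {lastversion} is passed in by the caller because it is not specified here.

```java
import java.util.Locale;

// Hypothetical holder for the parsed parameters.
class CheckArgs {
    enum Action {SHOW, COMPARE, ICU, EMPTY, INFO, SPACES, DETAILS, DEFAULTS, JSON, NAMES}
    enum Extent {SOME, ALL}

    Action action = Action.COMPARE; // default action
    Extent extent = Extent.ALL;     // default extent
    String version;                 // defaults to the caller-supplied last version

    // Each argument may be an Action, an Extent, or a version such as 6.2.0.
    static CheckArgs parse(String lastVersion, String... args) {
        CheckArgs result = new CheckArgs();
        result.version = lastVersion;
        for (String arg : args) {
            if (arg.matches("\\d+\\.\\d+\\.\\d+")) {
                result.version = arg;
                continue;
            }
            String upper = arg.toUpperCase(Locale.ROOT);
            Action action = find(Action.class, upper);
            Extent extent = find(Extent.class, upper);
            if (action != null) {
                result.action = action;
            } else if (extent != null) {
                result.extent = extent;
            } else {
                throw new IllegalArgumentException("Unrecognized parameter: " + arg);
            }
        }
        return result;
    }

    // Returns the constant with the given name, or null if there is none.
    private static <E extends Enum<E>> E find(Class<E> type, String name) {
        for (E constant : type.getEnumConstants()) {
            if (constant.name().equals(name)) {
                return constant;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "6.2.0" is only a placeholder for the last version.
        CheckArgs parsed = parse("6.2.0", args);
        System.out.println(parsed.action + " " + parsed.extent + " " + parsed.version);
    }
}
```

With no arguments the sketch yields the defaults described above; passing, say, `ICU SOME 6.1.0` in any order overrides all three.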
NOTE: false differences with:
Age «2.0» ≠ «V2_0», etc.