The Unicode Character Database (UCD) Tools is a set of Python tools and a C library with a C++ API binding. The Python tools are designed to support extracting and processing data from the text-based UCD source files, while the C library is designed to provide easy access to this information within a C or C++ program.
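As a rough illustration of what "processing the text-based UCD source files" involves, the sketch below parses lines in the common UCD property-file layout (semicolon-separated fields, `#` comments, codepoints written as `XXXX` or `XXXX..YYYY`). It is not taken from ucd-tools: the function names `parse_ucd_line` and `parse_codepoint_range` and the inline sample data are hypothetical, chosen only for this example.

```python
# Hypothetical sketch; these helpers are NOT part of ucd-tools.

def parse_ucd_line(line):
    """Return the fields of a UCD data line, or None for blank/comment-only lines."""
    # Everything after '#' is a comment in UCD property files.
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    return [field.strip() for field in content.split(";")]


def parse_codepoint_range(text):
    """Convert '0041' or '0041..005A' into an inclusive (first, last) tuple."""
    first, _, last = text.partition("..")
    # A single codepoint is treated as a range of length one.
    return int(first, 16), int(last or first, 16)


# Sample data in the style of a UCD property file (illustrative only).
sample = """\
# Scripts-style sample data
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
"""

records = []
for raw in sample.splitlines():
    fields = parse_ucd_line(raw)
    if fields is not None:
        records.append((parse_codepoint_range(fields[0]), fields[1]))
```

Each entry in `records` pairs an inclusive codepoint range with its property value, which is a convenient intermediate form before generating lookup tables for a C library.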
The project uses and supports the following sources of Unicode codepoint data:
In order to build ucd-tools, you need:
NOTE: The C++ compiler is used to build the test for the C++ API.
To build the documentation, you need:
UCD Tools supports the standard GNU autotools build system. The source code does not contain the generated configure files, so to build it you need to run:
    ./autogen.sh
    ./configure --prefix=/usr
    make
The tests can be run by using:
The program can be installed using:
sudo make install
The documentation can be built using:
To regenerate the source files from the UCD data when a new version of Unicode is released, run:
    ./configure --prefix=/usr --with-unicode-version=VERSION
    make ucd-update
VERSION is the Unicode version (e.g.
Additionally, you can use the UCD_FLAGS option to control how the data is generated. The following flags are supported:
| Flag | Description |
|------|-------------|
| --with-csur | Add ConScript Unicode Registry data. |
Report bugs to the ucd-tools issues page on GitHub.
UCD Tools is released under the GPL version 3 or later license.
The UCD data files in
Copyright © 1991-2014 Unicode, Inc. All rights reserved.
The files in data/csur are based on the information from the ConScript Unicode Registry maintained by John Cowan and Michael Everson.