Starting with Unicode 15.1, the “source of truth” for not-yet-released versions of most of the data files is in https://github.com/unicode-org/unicodetools/tree/main/unicodetools/data/ucd/dev and in the parallel .../uca/dev and similar folders.
First: Files with a special process.
List of these files (see https://www.unicode.org/Public/UCD/latest/ucd/):
Process:
The “source of truth” for these is outside of GitHub for now. KenW updates or vets these files and posts them to https://www.unicode.org/Public/draft/ . A unicodetools GitHub contributor fetches these files and creates a pull request as above.
See https://github.com/unicode-org/properties/issues/8 “simplify versioning of readme files”
Changes are made in a GitHub pull request.
Pull request cycle:
One difference here: There are multiple stages, and charts are generated along the way. Initial draft versions of data such as annotations for candidates are maintained in addition to the current data for the next version of Unicode. Once the candidates get code points, they go into the regular files. The emoji tools also read annotation data from CLDR and from the candidates file, and they use the CLDR emoji collation data (interpolating the candidates data).
Another difference: The charts are not normative; they get updated out of cycle, for example with new vendor images.
https://github.com/unicode-org/unicodetools/tree/main/unicodetools/data/emoji/dev
Certain snapshots of the .../dev/ files are copied into https://www.unicode.org/Public/draft/ for Unicode alpha, beta, and final releases, and at other times as appropriate.
Make sure to publish exactly the intended set of files. Skip the NamesList.txt and Unihan data files (see above), and skip any others that are only for internal use.
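The skip rule above can be sketched as a small filter. This is only an illustration, not what the pub/ scripts actually do: the names `SKIP_NAMES`, `SKIP_PREFIXES`, `should_publish`, and `select_files` are hypothetical, and the rule that Unihan data files can be recognized by a `Unihan` file-name prefix is an assumption. Files that are only for internal use would have to be added to the skip list by hand.

```python
from pathlib import PurePosixPath

# Hypothetical skip rules for illustration; the real scripts may differ.
SKIP_NAMES = {"NamesList.txt"}
# Assumption: Unihan data files have names that start with "Unihan".
SKIP_PREFIXES = ("Unihan",)


def should_publish(rel_path: str) -> bool:
    """Return True if a file under .../dev/ belongs in the published set."""
    name = PurePosixPath(rel_path).name
    if name in SKIP_NAMES:
        return False
    if name.startswith(SKIP_PREFIXES):
        return False
    return True


def select_files(rel_paths):
    """Return the sorted list of relative paths that pass the skip rules."""
    return sorted(p for p in rel_paths if should_publish(p))
```

Calling `select_files` on a listing of the dev folders gives a candidate file list that can be reviewed before anything is copied.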
For the alpha review, publish (at least) the UCD and emoji files, and the charts.
Run the pub/copy-alpha-to-draft.sh script from an up-to-date repo workspace. The script copies the set of the .../dev/ data files for an alpha snapshot from a unicodetools workspace to a target folder with the layout of https://www.unicode.org/Public/draft/ .
Send the resulting zip file to Rick for posting to https://www.unicode.org/Public/draft/ . Ask Rick to add other files that are not tracked in the unicodetools repo:
Note: The names of data files carry no version/delta infixes. We simply use the “draft” folder and the file-internal time stamps for versioning.
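A quick check for leftover infixes could look like the following sketch. It assumes that a version/delta infix has the form `-<version>` optionally followed by `d<delta>`, directly before the file extension (for example `UnicodeData-15.1.0d3.txt`); that pattern and the helper name `has_version_infix` are assumptions made for illustration.

```python
import re

# Assumed infix shape: "-15.1.0" or "-15.1.0d3", right before the extension.
VERSION_INFIX = re.compile(r"-\d+(?:\.\d+)+(?:d\d+)?(?=\.\w+$)")


def has_version_infix(file_name: str) -> bool:
    """Return True if the file name appears to carry a version/delta infix."""
    return VERSION_INFIX.search(file_name) is not None
```

Names such as `emoji-test.txt` are not flagged, because the hyphen there is not followed by a dotted version number.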
For the beta review, publish all of the data files, and the charts.
Run the pub/copy-beta-to-draft.sh script from an up-to-date repo workspace. The script copies the set of the .../dev/ data files for a beta snapshot from a unicodetools workspace to a target folder with the layout of https://www.unicode.org/Public/draft/ .
Send the resulting zip file to Rick for posting to https://www.unicode.org/Public/draft/ . Ask Rick to add other files that are not tracked in the unicodetools repo:
TODO: Write a script like /pub/copy-release-to-draft.sh that will run on the unicode.org server and copy the set of the .../dev/ data files for a release snapshot from a unicodetools workspace to the location behind https://www.unicode.org/Public/draft/ .
Verify the final set of files in the draft folder.
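One way to do this verification is to compare file listings. The sketch below assumes that both the intended set of files and a copy of the draft folder are available as local directory trees; the function names are hypothetical, and the comparison covers relative paths only, not file contents or time stamps.

```python
from pathlib import Path


def list_files(root):
    """Return the set of relative POSIX-style paths of all files under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_file_sets(expected_root, actual_root):
    """Return (missing, unexpected) relative paths, each as a sorted list."""
    expected = list_files(expected_root)
    actual = list_files(actual_root)
    missing = sorted(expected - actual)      # intended, but not in the draft copy
    unexpected = sorted(actual - expected)   # in the draft copy, but not intended
    return missing, unexpected
```

Both lists being empty means the two trees contain the same set of file paths.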
TODO: Write a script like /pub/copy-draft-to-release.sh that will be run on the unicode.org server and copy the files from the location behind https://www.unicode.org/Public/draft/ to the locations behind the version-specific release folders. For example:
After a Unicode release, copy a snapshot of the unicodetools repo .../dev/ files (matching the released files, of course) to a versioned unicodetools folder; for example: .../unicodetools/data/ucd/15.1.0/ . (We no longer append a “-Update” suffix to the folder name.)
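As a rough sketch, the snapshot copy amounts to duplicating a dev folder under a version-named sibling. The helper `snapshot_dev` and its parameters are hypothetical; in practice the copy may be done by hand or with other tooling, and the result still needs to be committed through the usual pull request.

```python
import shutil
from pathlib import Path


def snapshot_dev(data_root, component, version):
    """Copy <data_root>/<component>/dev to <data_root>/<component>/<version>.

    Example arguments: component="ucd", version="15.1.0".
    Refuses to overwrite an existing versioned folder.
    """
    src = Path(data_root) / component / "dev"
    dst = Path(data_root) / component / version
    if dst.exists():
        raise FileExistsError(f"versioned folder already exists: {dst}")
    shutil.copytree(src, dst)
    return dst
```

Raising an error when the versioned folder already exists is a deliberate choice here, so that a released snapshot is never silently overwritten.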