Once the emoji are finalized for a new version of TR51, or there is a new version of CLDR, run AacOrder.java to generate three new files, which will be checked in.
Fix the versions at the top of the file, such as:
    private static final VersionInfo VERSION = Emoji.VERSION12;
    private static final VersionInfo UCD_VERSION = Emoji.VERSION12;

The emoji version will be ≥ the UCD version.
Results:
:construction: TODO: Work with Mark on working replacements for “draft” URLs.
Before committing, diff the new files against the current versions using SVN. Spot-check that:
This file can be used to determine AAC validity, to get the sort order and emoji names for the sponsors' page, and to get the names for the sponsor badges.
AacOrder.java uses ICU (as checked into CLDR) for sorting all characters other than emoji.
It sorts all emoji at the end, in the CLDR sorting order.
The file format is:
    # Format: codepoint/range/string ; index ; name (if emoji)
    0488..0489 ; 1
    0591..05AF ; 3
    05BD ; 34
    ...
    1F1FF 1F1E6 ; 129029 ; South Africa
    1F1FF 1F1F2 ; 129030 ; Zambia
    1F1FF 1F1FC ; 129031 ; Zimbabwe
An abbreviated version of the file, aac-order.txt, omits the sort index. Instead of reading the index from a separate field, you compute it by adding up the number of items on each preceding line.
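The index computation can be sketched as follows. This is a minimal illustration, not the code in AacOrder.java: the class and method names are hypothetical, and it assumes that the first field has the same codepoint/range/string syntax in both files, that the first index is 1, and that a multi-code-point string counts as one item (both consistent with the sample above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch (hypothetical names): recompute sort indexes from the first field of each line. */
final class AacIndexSketch {

    /** Number of items a first field stands for: a code point, a range, or a string. */
    static int itemCount(String field) {
        String f = field.trim();
        int dots = f.indexOf("..");
        if (dots < 0) {
            // A single code point ("05BD") or a multi-code-point string
            // ("1F1FF 1F1E6") counts as one item.
            return 1;
        }
        int start = Integer.parseInt(f.substring(0, dots).trim(), 16);
        int end = Integer.parseInt(f.substring(dots + 2).trim(), 16);
        return end - start + 1; // ranges are inclusive
    }

    /** Returns the index of each data line, in order. */
    static List<Integer> computeIndexes(List<String> lines) {
        List<Integer> result = new ArrayList<>();
        int next = 1; // assumption: the first entry has index 1, as in the sample
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String field = trimmed.split(";", 2)[0];
            result.add(next);
            next += itemCount(field);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "# Format: codepoint/range/string ; index ; name (if emoji)",
            "0488..0489 ; 1",
            "0591..05AF ; 3",
            "05BD ; 34");
        System.out.println(computeIndexes(sample)); // prints [1, 3, 34]
    }
}
```

Running the sketch over the first three sample lines reproduces the indexes 1, 3, and 34 listed in the full file, which is a quick way to cross-check the two files against each other.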
This file is used for programmatic verification of input from users. It simply defines a UnicodeSet. The format is:
    UnicodeSet EMOJI_ALLOWED = new UnicodeSet(
    // (!) EXCLAMATION MARK .. (~) TILDE
    0x21,0x7e,
    // (¡) INVERTED EXCLAMATION MARK .. (¬) NOT SIGN
    0xa1,0xac,
    ...
    .add("🧟‍♀") // (🧟‍♀) woman zombie
    .add("🧟‍♂") // (🧟‍♂) man zombie
    .add("π§ββ")
    .freeze();
    // Total code points: 136314
    // Total strings: 1577
    // Total: 137891
If the file is used with C or C++, make sure that both the file encoding and the C/C++ settings are UTF-8 (ASCII would corrupt the strings).
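To illustrate how such a set can verify user input, here is a minimal sketch that uses only the Java standard library. It is a stand-in for the idea, not the ICU UnicodeSet API and not the generated file: the class name, the method name, and the tiny data subset are hypothetical. The sequence is written with escapes, which is one way to sidestep the file-encoding concern.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch (hypothetical names): stdlib-only stand-in for the generated set. */
final class AllowedSetSketch {

    // Inclusive code point ranges as start,end pairs, mirroring the first
    // two ranges in the sample (a small illustrative subset of the data).
    private static final int[] RANGES = {0x21, 0x7e, 0xa1, 0xac};

    // Multi-code-point sequences, written with escapes so that the source
    // file encoding cannot corrupt them.
    // Here: U+1F9DF ZOMBIE, U+200D ZERO WIDTH JOINER, U+2640 FEMALE SIGN.
    private static final Set<String> STRINGS =
        new HashSet<>(Arrays.asList("\uD83E\uDDDF\u200D\u2640"));

    /** True if the input is an allowed sequence or a single allowed code point. */
    static boolean isAllowed(String input) {
        if (input == null || input.isEmpty()) {
            return false;
        }
        if (STRINGS.contains(input)) {
            return true;
        }
        // Otherwise the input must be exactly one code point inside a range.
        if (input.codePointCount(0, input.length()) != 1) {
            return false;
        }
        int cp = input.codePointAt(0);
        for (int i = 0; i < RANGES.length; i += 2) {
            if (cp >= RANGES[i] && cp <= RANGES[i + 1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("A"));      // true: U+0041 is inside 0x21..0x7e
        System.out.println(isAllowed(" "));      // false: U+0020 is below the first range
        System.out.println(isAllowed("AB"));     // false: two code points, not a listed string
    }
}
```

With ICU available, the generated UnicodeSet would replace both the range array and the string set; the sketch only shows the shape of the check.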
If an entry has a name field in aac-order-ranges.txt, use that name instead of the Unicode name. Use the Unicode name only if the name field is blank.
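The name rule can be sketched as below. The class and method names are hypothetical, and the fallback uses the JDK's `Character.getName`, which is an assumption made for illustration: it reflects the Unicode version built into the running JDK and returns null for code points that version does not know, so the real page build may take Unicode names from another source.

```java
/** Sketch (hypothetical names): pick the display name for an entry. */
final class AacNameSketch {

    /** Extracts the third field (name) of a line, or "" if it is absent. */
    static String nameField(String line) {
        String[] fields = line.split(";", -1);
        return fields.length > 2 ? fields[2].trim() : "";
    }

    /**
     * Returns the name field if it is non-blank; otherwise falls back to
     * the Unicode name of the given code point (may be null if the JDK's
     * Unicode data does not include that code point).
     */
    static String displayName(String nameField, int codePoint) {
        if (nameField != null && !nameField.trim().isEmpty()) {
            return nameField.trim();
        }
        return Character.getName(codePoint);
    }

    public static void main(String[] args) {
        String flag = "1F1FF 1F1E6 ; 129029 ; South Africa";
        // The name field is present, so it wins over any Unicode name.
        System.out.println(displayName(nameField(flag), 0x1F1FF));

        String mark = "05BD ; 34";
        // The name field is blank, so the Unicode name of U+05BD is used.
        System.out.println(displayName(nameField(mark), 0x05BD));
    }
}
```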
Use aac-order-ranges.txt to rebuild the sponsors page: adopted-characters.html
Compare with the previous version of the page to make sure that any changes are understood. A simple way to do that is to copy the contents of each page, paste them into plain-text files, and diff the two. Each should look something like:
    [Unicode] Adopt-a-Character
    Sponsors of Adopted Characters
    AAC Animation
    Character sponsors help support the work of the Unicode Consortium, to help modern software and computing systems support the widest range of human languages. More than 120,000 characters can be adopted; see Adopt a Character.
    The Unicode Consortium gratefully acknowledges the following generous character sponsors. Each adoption is permanent.
    13 Gold Sponsors
    , Mark Davis and Anne Gundelfinger
    { Elastic
    } Elastic
    & Adobe Systems Incorporated
    α Ann Lewnes and Greg Welch
    💩 Jason Jenkins
    ...
Use aac-order-ranges.txt to rebuild the backend for adopt-a-character.html#sponsorship_form
Spot-check that a few new characters and code point numbers from emoji-released.html can be entered.
🤣 1F923
...
🛒 1F6D2
Spot-check that a few currently-adopted characters and code point numbers from adopted-characters.html and choosing.html can be entered.
Including one of each of the following:
Spot-check that a few invalid characters and code point numbers cannot be entered, such as from:
list-unicodeset.jsp?abb=on&g=gc&a=[[:c:][:z:][:di:]ΰΏ-ΰΏεε]
Ignore the Unicode set generated at the top; for compactness it uses [^...], where ... are characters that are not in the invalid set.
Note: all of the charts are built so that if you select an image (drag across) and paste into a plain-text browser box, you get the character. To get the hex, you can use Inspect (or whatever it is called in your browser) to see the source, then copy.
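As an alternative to reading the hex from the page source, the code points of a pasted character can be printed with a few lines of Java. This is only a convenience sketch with hypothetical names; it formats each code point as at least four uppercase hex digits, matching the style used in the examples above.

```java
import java.util.Locale;
import java.util.stream.Collectors;

/** Sketch (hypothetical names): show the hex code points of a pasted string. */
final class HexSketch {

    /** Returns the code points of s as space-separated uppercase hex. */
    static String toHex(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format(Locale.ROOT, "%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // U+1F923, written as a surrogate pair so the source stays ASCII.
        System.out.println(toHex("\uD83E\uDD23")); // prints 1F923
        // A flag sequence is two code points.
        System.out.println(toHex("\uD83C\uDDFF\uD83C\uDDE6")); // prints 1F1FF 1F1E6
    }
}
```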
Update pages for AAC that have char counts in the repo, as needed:
Update https://www.unicode.org/consortium/adopt-a-character.html in the repo to have images of the newest emoji (you can use the ones in the recent blog post) and sync back to the repo.
Prepare a blog post in http://goo.gl/lSaQNE, and run it by pr-unicode@googlegroups.com and unicode-web-presence-comm@googlegroups.com
Once everything checks out,