This README serves as an documentation entry point of Blink's text stack.
It can be viewed in formatted form here.
Blink's font and text stack covers those functional parts of the layout engine that provide measurement, geometric operations and drawing for runs of CSS-styled HTML text.
From source HTML to visual output this roughly comprises the following stages:
During the style resolution stage of layout
ComputedStyle objects are calculated for each element of the DOM tree. Each
ComputedStyle will own a
Font and a
FontDescription object. For this to work, after CSS is parsed into the various specialized
CSSValue-derived types, the
ConvertFont… methods in
style_builder_converter.cc convert from
FontDescription data structures. In the opposite direction, the
ValueForFont… methods in
computed_style_css_value_mapping.cc convert back from
FontDescription back to CSS values.
Then, during style resolution,
FontBuilder::CreateFont is called to update the
Font object based on the properties stored in
FontDescription and thus link those two objects together. The key to linking them together is
Font is assigned a
CSSFontSelector, which serves as a looking glass to know about which fonts are available in the document's scope. This includes fonts from the realm of web fonts as well as system fonts.
Font object does not yet represent a particular typeface. Similarly,
FontDescription does not point to an individual typeface yet. This is because
FontDescription contains a
FontFamily list of family names originating from the CSS
Font objects represent the API for style and layout operations to retrieve geometric information about a font used in the DOM: retrieve metrics such as line height, x-height etc. and measure text to retrieve its bounding box. The paint stage uses
Font objects to draw text.
Only when such geometric or painting operations are requested from a
Font object, internally
Font will perform font matching.
As soon as a
Font object is requested to perform any operation, it needs to find an individual typeface to work with. The information stored in its
FontDescription member needs to be resolved against available web fonts or system fonts. This process is called font matching. The detailed algorithm for this process is described in the CSS Fonts Module, section Matching Font Styles.
For this purpose, a
Font object has an
Update(CSSFontSelector*) method to inform it about an updated set of available fonts.
Font internally delegates the resolution of CSS family names from the
font-family against available fonts to its
FontFallbackList member, and thus hands the
CSSFontSelector down to
CSSFontSelector::GetFontData with a
FontDescription to perform a lookup.
CSSFontSelector will in turn ask
FontFaceCache to find a FontData object among the available web fonts. If there is already a match for
FontDescription in the cache, this
FontData object is returned. If not, the
FontSelectionAlgorithm::IsBetterMatchForRequest comparison function is used to find the best match among the available web fonts. This comparison function implements the CSS font matching algorithm.
CSSFontSelector can't find a font using this approach, it will try to find a font from the system. To this end, it will query
FontCache is for system fonts what
FontFaceCache is for web fonts.
FontCache has OS/system specific implementations in font_cache_skia.cc, font_cache_mac.mm, font_cache_linux.cc, font_cache_android.cc in order to perform system font lookups using the respective OS/system API.
Before looking at the next stages, we need to understand what text shaping means.
Shaping text is the process of mapping a unicode string to a sequence of glyph IDs from a font plus their exact geometrical positions through performing OpenType layout operations defined in the font. For Latin script, the output is mostly glyph IDs and horizontal advances, but for complex scripts, the output also describes positioning in the vertical direction, reordered glyphs, and association into grapheme clusters.
In more detail, a unicode string is not the only required input for this mapping. Instead, a number of variables are required before shaping can be performed. For a single run of text ready for shaping, the following input variables need to be isolated and constant:
This means that before shaping can be performed, incoming text and
FontDescription information coming from the layout code needs to be segmented into sub-runs where the above inputs stay the same. For example, if the incoming text contains Unicode sequences of different scripts, the incoming text run needs to be broken up into sub-runs of only one script. Similarly, if the incoming text contains differing orientations, for example Latin text embedded in Japanese vertical layout, then these sub-runs need to be isolated and shaped separately.
Emoji place additional requirements in isolating sub-runs for shaping. Emoji Unicode code points and code point sequences have different default presentation styles, text-default, or emoji-default. This is defined in the section Presentation Style of Unicode Technical Standard #51. So in order to select the correct font for emoji presentation — either a color font, or a regular contour font — the incoming text needs to be segmented and isolated by its emoji properties as well.
Because text shaping and font fallback are costly operations and geometric operations on text runs are performed over and over again during layout, a word cache is used to speed up these operations.
The basic unit for storing shaping results in a cache is a word, separated by spaces. Since CJK text often does not use spaces to separate words, for CJK text, each individual CJK character is treated as a word.
As the excursion on text shaping explains, multiple variables go into the shaping equation, such as a fixed font, a fixed font size, script, et cetera. This in turn means that the word cache becomes invalid if the set of available font changes changes because
FontFallbackList would change in what it returns when available fonts change. We cannot reduce the cache key computation for the word cache to the string/word itself plus the requested font as represented by
FontDescription, but we also need to capture the set of available fonts at the time shaping for this word and its
FontDescription was performed. This state is captured by computing a composite key off of the
cachin_gword_shaper.h is the entry point for retrieving shaping results through the word cache. It defers to
caching_word_shape_iterator.h for word/space or CJK segmentation and responds to requests for a
Width() or returns a
ShapeResultBuffercontaining a list of
ShapeResult objects. If not found in the cache
ShapeResult objects are produced by the text shaping stage. So
CachingWordShaper serves as the accelerating caching layer between
Font operations and
The section Setting up Shaping described the requirements for constant input requirements before shaping can be performed.
RunSegmenter is the top level API for segmenting incoming text runs using sub-segmenters. It splits text by their Unicode script property (via
ScriptRunIterator), orientation and direction — horizontal LTR/RTL, vertical LTR/RTL (via
OrientationIterator), and emoji presentation attributes (via
RunSegmenter is constructed from a text run in UTF-16
UChar format, then functions as an iterator returning sub-runs of constant script, emoji presentation, and orientation. These sub-runs are then suitable as units for shaping.
Shaping text runs is split into several stages: Run segmentation, shaping the initial segment starting with the primary font, identifying shaped and non-shaped sequences of the shaping result, and processing unshaped sub-runs by trying to shape using the remaining list of fonts, then trying fallback fonts until the last resort font is reached.
If small/petite caps formatting is requested, an additional lowercase/uppercase segmentation pass is required. In this stage, OpenType features in the font are matched against the requested formatting. If the respective caps feature is found, the feature is used in shaping and activated from the font. Otherwise small-caps glyphs are synthesized as required by the CSS Level 3 Fonts Module.
Below we will go through one example — for simplicity without caps formatting — to illustrate the process: The following is a run of vertical text to be shaped. After run segmentation in
RunSegmenter it is split into 4 segments. The segments indicated by the segementation results showing the script, orientation information and font fallback preference (text, emoji presentation) of the individual segment. The Japanese text at the beginning has script “Hiragana”, does not need rotation when laid out vertically and does not require an emoji font, as indicated by
0 い 1 ろ 2 は USCRIPT_HIRAGANA, OrientationIterator::OrientationKeep, FontFallbackPriority::kText 3 a 4 ̄ (Combining Macron) 5 a 6 A USCRIPT_LATIN, OrientationIterator::OrientationRotateSideways, FontFallbackPriority::kText 7 い 8 ろ 9 は USCRIPT_HIRAGANA, OrientationIterator::OrientationKeep, FontFallbackPriority::kText
Let's assume the CSS for this text run is as follows:
font-family: "Heiti SC", Tinos, sans-serif; where Tinos is a web font, defined as a composite font (i.e. two separate CSS
@font-face instances with different
unicode-range subsetting ranges), one for Latin
U+00-U+FF and one unrestricted
FontFallbackIterator provides these fonts to the shaper. It will start with Heiti SC, then the first part of Tinos for the restricted unicode-range, then the unrestricted full unicode-range part of Tinos, then a system sans-serif.
The initial segment 0-2 is sent to the shaper, together with the segmentation properties and the initial Heiti SC font. Characters 0-2 are shaped successfully with Heiti SC.
Glyphpos: 0 1 2 Cluster: 0 1 2 Glyph: い ろ は
The next segment, 3-6 is passed to the shaper. The shaper attempts to shape it with Heiti SC, which fails for the Combining Macron. So the shaping result for this segment would look similar to this.
Glyphpos: 3 4 5 6 Cluster: 3 3 5 6 Glyph: a ☐ a A (where ☐ is .notdef)
Now in the
extractShapeResults() step we notice that there is more work to do, since Heiti SC does not have a glyph for the Combining Macron combined with an a. So, cluster 3 consisting of the characters a plus ̄ (Combining Macron) is placed in the
HolesQueue for clusters that need to be processed after switching to the next fallback font.
After shaping the first segment as whole, the font is cycled to the next font coming from
FontFallbackIterator and the remaining items in the
HolesQueue are processed, picking them from the head of the queue.
In this case, the next font is Tinos (for the range
U+00-U+FF). Shaping using this font, assuming it is subsetted, fails again since there is no combining mark available in the unicode range
U+00-U+FF. This triggers requesting yet another font. This time, the Tinos font for the full range. With this, shaping succeeds with the following HarfBuzz result:
Glyphpos: 3 4 5 6 Cluster: 3 3 5 6 Glyph: a ◌̄ a A (with glyph coordinates placing the ◌̄ above the first a)
Now this sub run is successfully processed and can be appended to
ShapeResult. A new
ShapeResult::RunInfo is created. The logic in
ShapeResult::insertRun then takes care of merging the shape result into the right position the vector of
Shaping then continues analogously for the remaining Hiragana Japanese sub-run, and the result is inserted into
ShapeResult as well.
The section Text Shaping illustrates that font selection during shaping is part of an iterative process, which first tries to use as many glyphs as possible from the primary font, then in subsequent iterations proceeds to fill gaps from the secondary font and so on until there are no more so called
.notdef glyphs, i.e. no more boxes of text for which no glyph was found.
FontFallbackIterator meets the needs of HarfBuzzShaper to deliver new fonts to fill such gaps.
FontFallbackIterator is an iterator style API, which on calling
next() will deliver the first suitable font to try. A
FontFallbackList is the internal representation of fonts resolved from the CSS
FontFallbackList attempts to resolve font family names from the CSS
font-family: property in descending order. It tries to find them among the list of available web fonts which were declared by
@font-face rules or added to the document using the
FontFallbackIterator is intialized with a
FontFallbackList and starts retrieving fonts from this list as its first source for fonts. If during shaping a run of text
HarfBuzzShaper keeps requesting additional fonts after
FontFallbackList is exhausted, this means that
HarfBuzzShaper still tries to fill gaps in the run. In other words, the fonts specified in
font-family did not have sufficient glyph coverage to draw the whole run of text on the screen. In this situation, system font fallback is invoked, which means attempting to find a surrogate font which contains those glyphs that were missing so far. To this end
FontCache::FallbackFontForCharacter() in order to retrieve a font that has a glyph for the requested Unicode character. This means, beyond what is listed in
font-family there are additional system fonts pulled in to the shaping process.
FontFallbackIterator feeds fonts from the CSS
font-family list as well as system fallback fonts to
HarfBuzzShaper for use in the shaping iterations until ideally all gaps are filled and the full text run can be drawn with the correct glyphs. When there are gaps, and the .notdef tofu character must be rendered, the primary font is used for this.