blob: 1c5370c11e0280c186144fca62e46d0fdda2b947 [file] [log] [blame] [edit]
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
<!ENTITY version SYSTEM "version.xml">
]>
<chapter id="utilities">
<title>Utilities</title>
<para>
HarfBuzz includes several auxiliary components in addition to the
main APIs. These include a set of command-line tools, a set of
lower-level APIs for common data types that may be of interest to
client programs, and an embedded library for working with
Unicode Character Database (UCD) data.
</para>
<section id="utilities-command-line-tools">
<title>Command-line tools</title>
<para>
HarfBuzz include three command-line tools:
<program>hb-shape</program>, <program>hb-view</program>, and
<program>hb-subset</program>. They can be used to examine
HarfBuzz's functionality, debug font binaries, or explore the
various shaping models and features from a terminal.
</para>
<section id="utilities-command-line-hbshape">
<title>hb-shape</title>
<para>
<emphasis><program>hb-shape</program></emphasis> allows you to run HarfBuzz's
<function>hb_shape()</function> function on an input string and
to examine the outcome, in human-readable form, as terminal
output. <program>hb-shape</program> does
<emphasis>not</emphasis> render the results of the shaping call
into rendered text (you can use <program>hb-view</program>, below, for
that). Instead, it prints out the final glyph indices and
positions, taking all shaping operations into account, as if the
input string were a HarfBuzz input buffer.
</para>
<para>
You can specify the font to be used for shaping and, with
command-line options, you can add various aspects of the
internal state to the output that is sent to the terminal. The
general format is
</para>
<programlisting>
<command>hb-shape</command> <optional>[OPTIONS]</optional>
<parameter>path/to/font/file.ttf</parameter>
<parameter>yourinputtext</parameter>
</programlisting>
<para>
The default output format is plain text (although JSON output
can be selected instead by specifying the option
<optional>--output-format=json</optional>). The default output
syntax reports each glyph name (or glyph index if there is no
name) followed by its cluster value, its horizontal and vertical
position displacement, and its horizontal and vertical advances.
</para>
<para>
Output options exist to skip any of these elements in the
output, and to include additional data, such as Unicode
code-point values, glyph extents, glyph flags, or interim
shaping results.
</para>
<para>
Output can also be redirected to a file, or input read from a
file. Additional options enable you to enable or disable
specific font features, to set variation-font axis values, to
alter the language, script, direction, and clustering settings
used, to enable sanity checks, or to change which shaping engine is used.
</para>
<para>
For a complete explanation of the options available, run
</para>
<programlisting>
<command>hb-shape</command> <parameter>--help</parameter>
</programlisting>
</section>
<section id="utilities-command-line-hbview">
<title>hb-view</title>
<para>
<emphasis><program>hb-view</program></emphasis> allows you to
see the shaped output of an input string in rendered
form. Like <program>hb-shape</program>,
<program>hb-view</program> takes a font file and a text string
as its arguments:
</para>
<programlisting>
<command>hb-view</command> <optional>[OPTIONS]</optional>
<parameter>path/to/font/file.ttf</parameter>
<parameter>yourinputtext</parameter>
</programlisting>
<para>
By default, <program>hb-view</program> renders the shaped
text in ASCII block-character images as terminal output. By
appending the
<command>--output-file=<optional>filename</optional></command>
switch, you can write the output to a PNG, SVG, or PDF file
(among other formats).
</para>
<para>
As with <program>hb-shape</program>, a lengthy set of options
is available, with which you can enable or disable
specific font features, set variation-font axis values,
alter the language, script, direction, and clustering settings
used, enable sanity checks, or change which shaping engine is
used.
</para>
<para>
You can also set the foreground and background colors used for
the output, independently control the width of all four
margins, alter the line spacing, and annotate the output image
with
</para>
<para>
In general, <program>hb-view</program> is a quick way to
verify that the output of HarfBuzz's shaping operation looks
correct for a given text-and-font combination, but you may
want to use <program>hb-shape</program> to figure out exactly
why something does not appear as expected.
</para>
</section>
<section id="utilities-command-line-hbsubset">
<title>hb-subset</title>
<para>
<emphasis><program>hb-subset</program></emphasis> allows you
to generate a subset of a given font, with a limited set of
supported characters, features, and variation settings.
</para>
<para>
By default, you provide an input font and an input text string
as the arguments to <program>hb-subset</program>, and it will
generate a font that covers the input text exactly like the
input font does, but includes no other characters or features.
</para>
<programlisting>
<command>hb-subset</command> <optional>[OPTIONS]</optional>
<parameter>path/to/font/file.ttf</parameter>
<parameter>yourinputtext</parameter>
</programlisting>
<para>
For example, to create a subset of Noto Serif that just includes the
numerals and the lowercase Latin alphabet, you could run
</para>
<programlisting>
<command>hb-subset</command> <optional>[OPTIONS]</optional>
<parameter>NotoSerif-Regular.ttf</parameter>
<parameter>0123456789abcdefghijklmnopqrstuvwxyz</parameter>
</programlisting>
<para>
There are options available to remove hinting from the
subsetted font and to specify a list of variation-axis settings.
</para>
</section>
</section>
<section id="utilities-common-types-apis">
<title>Common data types and APIs</title>
<para>
HarfBuzz includes several APIs for working with general-purpose
data that you may find convenient to leverage in your own
software. They include set operations and integer-to-integer
mapping operations.
</para>
<para>
HarfBuzz uses set operations for internal bookkeeping, such as
when it collects all of the glyph IDs covered by a particular
font feature. You can also use the set API to build sets, add
and remove elements, test whether or not sets contain particular
elements, or compute the unions, intersections, or differences
between sets.
</para>
<para>
All set elements are integers (specifically,
<type>hb_codepoint_t</type> 32-bit unsigned ints), and there are
functions for fetching the minimum and maximum element from a
set. The set API also includes some functions that might not
be part of a generic set facility, such as the ability to add a
contiguous range of integer elements to a set in bulk, and the
ability to fetch the next-smallest or next-largest element.
</para>
<para>
The HarfBuzz set API includes some conveniences as well. All
sets are lifecycle-managed, just like other HarfBuzz
objects. You increase the reference count on a set with
<function>hb_set_reference()</function> and decrease it with
<function>hb_set_destroy()</function>. You can also attach
user data to a set, just like you can to blobs, buffers, faces,
fonts, and other objects, and set destroy callbacks.
</para>
<para>
HarfBuzz also provides an API for keeping track of
integer-to-integer mappings. As with the set API, each integer is
stored as an unsigned 32-bit <type>hb_codepoint_t</type>
element. Maps, like other objects, are reference counted with
reference and destroy functions, and you can attach user data to
them. The mapping operations include adding and deleting
integer-to-integer key:value pairs to the map, testing for the
presence of a key, fetching the population of the map, and so on.
</para>
<para>
There are several other internal HarfBuzz facilities that are
exposed publicly and which you may want to take advantage of
while processing text. HarfBuzz uses a common
<type>hb_tag_t</type> for a variety of OpenType tag identifiers (for
scripts, languages, font features, table names, variation-axis
names, and more), and provides functions for converting strings
to tags and vice-versa.
</para>
<para>
Finally, HarfBuzz also includes data type for Booleans, bit
masks, and other simple types.
</para>
</section>
<section id="utilities-ucdn">
<title>UCDN</title>
<para>
HarfBuzz includes a copy of the <ulink
url="https://github.com/grigorig/ucdn">UCDN</ulink> (Unicode
Database and Normalization) library, which provides functions
for accessing basic Unicode character properties, performing
canonical composition, and performing both canonical and
compatibility decomposition.
</para>
<para>
Currently, UCDN supports direct queries for several more character
properties than HarfBuzz's built-in set of Unicode functions
does, such as the BiDirectional Class, East Asian Width, Paired
Bracket and Resolved Linebreak properties. If you need to access
more properties than HarfBuzz's internal implementation
provides, using the built-in UCDN functions may be a useful solution.
</para>
<para>
The built-in UCDN functions are compiled by default when
building HarfBuzz from source, but this can be disabled with a
compile-time switch.
</para>
</section>
</chapter>