 Internationalized Domain Names in Applications (IDNA)
 =====================================================

 Support for the Internationalised Domain Names in Applications
 (IDNA) protocol as specified in `RFC 5891 <https://tools.ietf.org/html/rfc5891>`_.
 This is the latest version of the protocol and is sometimes referred to as
 “IDNA 2008”.

 This library also provides support for Unicode Technical Standard 46,
 `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.

 This acts as a suitable replacement for the “encodings.idna” module that
 comes with the Python standard library, but which only supports the
 old, deprecated IDNA specification (`RFC 3490 <https://tools.ietf.org/html/rfc3490>`_).

 The basic functions are straightforward to use:

 .. code-block:: pycon

     >>> import idna
     >>> idna.encode('ドメイン.テスト')
     b'xn--eckwd4c7c.xn--zckzah'
     >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
     ドメイン.テスト

 Packages
 --------

 The latest tagged release is published in the PyPI repository:

 .. image:: https://badge.fury.io/py/idna.svg
    :target: https://badge.fury.io/py/idna


 Installation
 ------------

 To install this library, you can use pip:

 .. code-block:: bash

     $ pip install idna

 Alternatively, you can install the package using the bundled setup script:

 .. code-block:: bash

     $ python setup.py install

 This library works with Python 3.4 or later. Earlier versions of this
 library support Python 2 - use "idna<3" in your requirements file if
 you need this library for a Python 2 application.


 Usage
 -----

 For typical usage, the ``encode`` and ``decode`` functions will take a domain
 name argument and perform a conversion to A-labels or U-labels respectively.

 .. code-block:: pycon

     >>> import idna
     >>> idna.encode('ドメイン.テスト')
     b'xn--eckwd4c7c.xn--zckzah'
     >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
     ドメイン.テスト

 You may use the codec encoding and decoding methods using the
 ``idna.codec`` module:

 .. code-block:: pycon

     >>> import idna.codec
     >>> print('домена.испытание'.encode('idna'))
     b'xn--80ahd1agd.xn--80akhbyknj4f'
     >>> print(b'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna'))
     домена.испытание

 Conversions can be applied on a per-label basis using the ``ulabel`` or ``alabel``
 functions if necessary:

 .. code-block:: pycon

     >>> idna.alabel('测试')
     b'xn--0zwm56d'
     >>> print(idna.ulabel('xn--0zwm56d'))
     测试
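
 To make the A-label form concrete: an A-label is the ACE prefix ``xn--``
 followed by the Punycode encoding of the label. The sketch below uses only
 the standard library's ``punycode`` codec. The helper names ``naive_alabel``
 and ``naive_ulabel`` are hypothetical, and unlike ``idna.alabel`` and
 ``idna.ulabel`` they perform none of the IDNA 2008 validity checks, so treat
 them as an illustration rather than a substitute.

 .. code-block:: python

     ACE_PREFIX = b"xn--"


     def naive_alabel(label: str) -> bytes:
         """Punycode-encode a single label and add the ACE prefix (no validation)."""
         try:
             # Pure ASCII labels are passed through unchanged.
             return label.encode("ascii")
         except UnicodeEncodeError:
             return ACE_PREFIX + label.encode("punycode")


     def naive_ulabel(label: bytes) -> str:
         """Reverse of naive_alabel: strip the ACE prefix and Punycode-decode."""
         if label.lower().startswith(ACE_PREFIX):
             return label[len(ACE_PREFIX):].decode("punycode")
         return label.decode("ascii")


     print(naive_alabel("测试"))  # b'xn--0zwm56d', the same bytes idna.alabel returns
     print(naive_ulabel(b"xn--0zwm56d"))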

 Compatibility Mapping (UTS #46)
 +++++++++++++++++++++++++++++++

 As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the IDNA
 specification does not normalize input from different potential ways a user
 may input a domain name. This functionality, known as a “mapping”, is
 considered by the specification to be a local user-interface issue distinct
 from IDNA conversion functionality.

 This library provides one such mapping, developed by the Unicode
 Consortium. Known as `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_,
 it provides both a regular mapping for typical applications and
 a transitional mapping to help migrate from older IDNA 2003 applications.

 For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
 LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
 convert this into lower case prior to applying the IDNA conversion.

 .. code-block:: pycon

     >>> import idna
     >>> idna.encode('Königsgäßchen')
     ...
     idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
     >>> idna.encode('Königsgäßchen', uts46=True)
     b'xn--knigsgchen-b4a3dun'
     >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
     königsgäßchen

 Transitional processing provides conversions to help transition from the older
 2003 standard to the current standard. For example, in the original IDNA
 specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
 *LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
 conversion is not performed.

 .. code-block:: pycon

     >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
     b'xn--knigsgsschen-lcb0w'

 Implementors should use transitional processing with caution, only in rare
 cases where conversion from legacy labels to current labels must be performed
 (i.e. IDNA implementations that pre-date 2008). For typical applications
 that just need to convert labels, transitional processing is unlikely to be
 beneficial and could produce unexpected incompatible results.
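
 The incompatibility is easy to see by comparing the two A-labels from the
 examples above: they are different byte strings, so they name different
 entries in the DNS. The following standard-library sketch reproduces both
 without calling this library. It assumes the label is already lower-cased,
 and it uses ``str.casefold()`` purely as a stand-in for the transitional
 mapping of ß to ``ss``; it is not how UTS 46 processing is implemented.

 .. code-block:: python

     ACE_PREFIX = b"xn--"

     label = "königsgäßchen"  # already lower-cased, as the UTS 46 mapping would do

     # casefold() folds the sharp s to "ss"; used here only to imitate the
     # transitional (IDNA 2003 style) treatment of this one character.
     transitional_style = label.casefold()

     current = ACE_PREFIX + label.encode("punycode")
     legacy = ACE_PREFIX + transitional_style.encode("punycode")

     print(current)            # b'xn--knigsgchen-b4a3dun'
     print(legacy)             # b'xn--knigsgsschen-lcb0w'
     print(current == legacy)  # False: the two forms are distinct domain names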

 ``encodings.idna`` Compatibility
 ++++++++++++++++++++++++++++++++

 Function calls from the Python built-in ``encodings.idna`` module are
 mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
 Simply change the ``import`` statement in your code to refer to the
 new module name.
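
 As a minimal sketch of that substitution, assuming existing code that calls
 ``ToASCII`` and ``ToUnicode`` from ``encodings.idna``, only the import line
 changes. The sample label is reused from the per-label example above.

 .. code-block:: python

     # Before (IDNA 2003, standard library):
     #     from encodings.idna import ToASCII, ToUnicode
     # After (IDNA 2008, this library):
     from idna.compat import ToASCII, ToUnicode

     print(ToASCII("测试"))
     print(ToUnicode(b"xn--0zwm56d"))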

 Exceptions
 ----------

 Any error encountered while converting according to the specification should
 raise an exception derived from the ``idna.IDNAError`` base class.

 More specific exceptions may be raised: ``idna.IDNABidiError``
 when the error reflects an illegal combination of left-to-right and
 right-to-left characters in a label; ``idna.InvalidCodepoint`` when
 a specific codepoint is an illegal character in an IDN label (i.e.
 INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is
 illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ
 but the contextual requirements are not satisfied).
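
 One way to handle these exceptions is sketched below. The helper name
 ``encode_or_report`` is hypothetical. Catching the specific subclass first
 and the ``idna.IDNAError`` base class last is a design choice for this
 example, not a requirement of the library.

 .. code-block:: python

     import idna


     def encode_or_report(domain):
         """Return the A-label form of domain, or None if conversion fails."""
         try:
             return idna.encode(domain)
         except idna.InvalidCodepoint as exc:
             # A specific codepoint is not permitted in a label.
             print("invalid codepoint:", exc)
         except idna.IDNAError as exc:
             # Base class: covers bidi, contextual and all other IDNA failures.
             print("IDNA error:", exc)
         return None


     print(encode_or_report("ドメイン.テスト"))
     print(encode_or_report("Königsgäßchen"))  # capital K is rejected, so None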

 Building and Diagnostics
 ------------------------

 The IDNA and UTS 46 functionality relies upon pre-calculated lookup
 tables for performance. These tables are derived by evaluating codepoints
 against the eligibility criteria in the respective standards, and are
 generated using the command-line script ``tools/idna-data``.

 This tool will fetch relevant codepoint data from the Unicode repository
 and perform the required calculations to identify eligibility. There are
 three main modes:

 * ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
   the pre-calculated lookup tables used for IDNA and UTS 46 conversions. Implementors
   who wish to track this library against a different Unicode version may use this tool
   to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
   files.

 * ``idna-data make-table``. Generates a table of the IDNA disposition
   (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
   5892 and the pre-computed tables published by `IANA <https://www.iana.org/>`_.

 * ``idna-data U+0061``. Prints debugging output on the various properties
   associated with an individual Unicode codepoint (in this case, U+0061), that are
   used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
   or analysis.

 The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
 the ``--version`` argument allows the specification of the version of Unicode to use
 in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
 will generate library data against Unicode 9.0.0.
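
 For a rough idea of the kind of per-codepoint properties involved, the
 standard library's ``unicodedata`` module exposes a few of them. The sketch
 below is not the output of ``idna-data``. The helper name
 ``describe_codepoint`` is hypothetical, and the selection of properties is an
 assumption made for illustration. It also reports the Unicode version bundled
 with the running Python, which may differ from the one the lookup tables were
 built against.

 .. code-block:: python

     import unicodedata


     def describe_codepoint(spec: str) -> dict:
         """Parse a 'U+XXXX' string and return a few Unicode properties."""
         char = chr(int(spec[2:], 16))
         return {
             "codepoint": spec.upper(),
             "name": unicodedata.name(char, "<unnamed>"),
             "general_category": unicodedata.category(char),
             "bidi_class": unicodedata.bidirectional(char),
             "combining_class": unicodedata.combining(char),
             "unicode_version": unicodedata.unidata_version,
         }


     for key, value in describe_codepoint("U+0061").items():
         print(f"{key}: {value}")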


 Testing
 -------

 The library has a test suite based on each rule of the IDNA specification, as
 well as tests that are provided as part of the Unicode Technical Standard 46,
 `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.