 Additions
 ---------

 More of the jQuery API: nextUntil?
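
 For reference, jQuery's nextUntil returns the siblings that follow an
 element, up to but not including the first one that matches a
 selector. A minimal sketch of that behavior, assuming only that nodes
 expose a next_sibling attribute as Beautiful Soup elements do. The
 names next_until, Node and chain are hypothetical, and a predicate
 stands in for the selector; none of this is part of Beautiful Soup.

```python
def next_until(node, stop):
    """Yield the siblings after `node`, stopping before the first one
    for which `stop(sibling)` is true (that sibling is not yielded)."""
    sibling = getattr(node, "next_sibling", None)
    while sibling is not None and not stop(sibling):
        yield sibling
        sibling = getattr(sibling, "next_sibling", None)


class Node:
    """Hypothetical stand-in for a parsed element, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.next_sibling = None


def chain(*names):
    """Build a list of Node objects linked through next_sibling."""
    nodes = [Node(n) for n in names]
    for left, right in zip(nodes, nodes[1:]):
        left.next_sibling = right
    return nodes


nodes = chain("h2", "p", "ul", "h2", "p")
result = [n.name for n in next_until(nodes[0], lambda n: n.name == "h2")]
```

 Here result holds the names of the siblings between the first h2 and
 the next one.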

 Optimizations
 -------------

 The html5lib tree builder doesn't use the standard tree-building API,
 which worries me and has resulted in a number of bugs.

 markup_attr_map can be optimized since it's always a map now.

 Upon encountering UTF-16LE data or some other uncommon serialization
 of Unicode, UnicodeDammit will convert the data to Unicode, then
 encode it as UTF-8. This is wasteful because it will just get decoded
 back to Unicode.
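
 The redundant round trip can be illustrated with the standard library
 alone. This is a sketch of the pattern described above, not
 UnicodeDammit's actual code; both function names are hypothetical.

```python
def to_unicode_with_round_trip(data: bytes, encoding: str) -> str:
    """Mimic the wasteful path: decode, re-encode as UTF-8, decode again."""
    text = data.decode(encoding)        # bytes -> str (already Unicode)
    utf8_bytes = text.encode("utf-8")   # str -> UTF-8 bytes (extra work)
    return utf8_bytes.decode("utf-8")   # UTF-8 bytes -> str again


def to_unicode_direct(data: bytes, encoding: str) -> str:
    """Decode once and keep the result."""
    return data.decode(encoding)


sample = "caf\u00e9 \u2603".encode("utf-16-le")
round_trip = to_unicode_with_round_trip(sample, "utf-16-le")
direct = to_unicode_direct(sample, "utf-16-le")
```

 Both paths produce the same string; the first simply performs one
 extra encode and one extra decode.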

 CDATA
 -----

 The elementtree XMLParser has a strip_cdata argument that, when set to
 False, should allow Beautiful Soup to preserve CDATA sections instead
 of treating them as text. Except it doesn't. (This argument is also
 present for HTMLParser, and also does nothing there.)

 Currently, html5lib converts CDATA sections into comments. An
 as-yet-unreleased version of html5lib changes the parser's handling of
 CDATA sections to allow CDATA sections in tags like <svg> and
 <math>. The HTML5TreeBuilder will need to be updated to create CData
 objects instead of Comment objects in this situation.