Commit Graph

7 Commits

Author SHA1 Message Date
Jeroen van Rijn
2dd67dba89 [core:encoding/entity] Add new package to decode &<entity>; entities.
Includes a generator that builds a lookup table for named entities.
2021-12-05 02:52:23 +01:00
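
The entity decoding described in this commit can be illustrated with a minimal sketch. It is written in Python, not Odin, and does not reflect the package's actual API: `decode_entities` is a hypothetical name, and Python's `html.entities.name2codepoint` (an HTML 4 table) stands in for the generated named-entity lookup.

```python
import re
from html.entities import name2codepoint  # stand-in for a generated lookup table

# Matches &#xHEX;, &#DEC; and &name; forms.
_ENTITY = re.compile(r"&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text: str) -> str:
    """Replace numeric and named entities; leave unknown ones untouched."""

    def replace(match: re.Match) -> str:
        body = match.group(1)
        if body[0] == "#":
            # Numeric reference: hexadecimal if it starts with #x, else decimal.
            if body[1] in "xX":
                code = int(body[2:], 16)
            else:
                code = int(body[1:])
            try:
                return chr(code)
            except (ValueError, OverflowError):
                return match.group(0)  # not a valid code point
        # Named reference: look it up, fall back to the original text.
        code = name2codepoint.get(body)
        return chr(code) if code is not None else match.group(0)

    return _ENTITY.sub(replace, text)
```

Leaving unknown or out-of-range entities in place is one possible policy; a stricter decoder might report an error instead.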
Jeroen van Rijn
5807214406 [xml] Improvements. 2021-12-05 02:52:23 +01:00
Jeroen van Rijn
23baf56c87 [xml] Improve CDATA + comment handling in tag body. 2021-12-05 02:52:23 +01:00
Jeroen van Rijn
beff90e1d1 [xml] Slight optimization.
About a 5% speed bump.

More rigorous optimization later.
2021-12-05 02:52:23 +01:00
Jeroen van Rijn
ec63d0bbd2 [xml] Robustness improvement.
Can now parse https://www.w3.org/2003/entities/2007xml/unicode.xml without problems.
2021-12-05 02:52:22 +01:00
Jeroen van Rijn
46a4927aca [xml] Use io.Writer for xml.print(doc). 2021-12-05 02:52:22 +01:00
Jeroen van Rijn
b5c828fe4e [xml] Initial implementation of core:encoding/xml.
A from-scratch XML implementation, loosely modeled on the [spec](https://www.w3.org/TR/2006/REC-xml11-20060816).

Features:
- Supports enough of the XML 1.0/1.1 spec to handle 99.9% of the XML documents in common use today.
- Simple to understand and use. Small.

Caveats:
- We do NOT support HTML in this package, as that may or may not be valid XML.
  If it works, great. If it doesn't, that's not considered a bug.
- We do NOT support UTF-16. If you have a UTF-16 XML file, please convert it to UTF-8 first. Also, our condolences.
- `<!ELEMENT` and `<!ATTLIST` are not supported, and will either be ignored or return an error, depending on the parser options.

TODO:
- Optional CDATA unboxing.
- Optional `&gt;`, `&#32;`, `&#x20;` and other escape substitution in tag bodies.
- Test suite

MAYBE:
- XML writer?
- Serialize/deserialize Odin types?
2021-12-05 02:52:22 +01:00
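
As a rough illustration of the "CDATA unboxing" TODO item above, the sketch below strips the `<![CDATA[` and `]]>` delimiters from a tag body and keeps the enclosed text verbatim. It is a Python sketch under assumptions, not the package's Odin implementation; `unbox_cdata` is a hypothetical name.

```python
import re

# Non-greedy match so that each CDATA section ends at its first "]]>".
_CDATA = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def unbox_cdata(body: str) -> str:
    """Replace every CDATA section with the raw text it contains."""
    return _CDATA.sub(lambda match: match.group(1), body)
```

The unboxed text is returned as-is, so characters such as `<` and `&` that were protected by the CDATA section appear unescaped in the result.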