Subject : Re: Wondering ...
From : rich (at) *nospam* example.invalid (Rich)
Newsgroups : comp.misc
Date : 28. May 2025, 20:30:20
Organization : A noiseless patient Spider
Message-ID : <1017o8c$3d7rp$1@dont-email.me>
References : 1 2 3 4 5 6
User-Agent : tin/2.6.1-20211226 ("Convalmore") (Linux/5.15.139 (x86_64))
hrtuybxi@outlook.com wrote:
> On 24/04/2025 02:16, Lawrence D'Oliveiro wrote:
>> I was slightly disillusioned when I found that there were certain
>> characters that were not allowed in XML files, even when entity-encoded.
>
> Not even in CDATA section? I wonder what characters those might be.
https://www.w3.org/TR/REC-xml/#charsets

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] |
             [#xE000-#xFFFD] | [#x10000-#x10FFFF]
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
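Most of the control characters are explicitly excluded (everything
below #x20 except tab, LF, and CR), plus some Unicode code points:
the surrogate block #xD800-#xDFFF, and #xFFFE and #xFFFF.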
And, yes, even in CDATA sections, because CDATA references the above
"Char" definition to define what characters are allowed in CDATA:
[20] CData ::= (Char* - (Char* ']]>' Char*))
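
If you want to check this for yourself, here is a rough Python sketch.
The helper names is_xml_char() and parses() are my own, made up for
illustration; the only library used is the standard
xml.etree.ElementTree module, which sits on top of expat. The first
function is a direct transcription of the "Char" production above; the
second just reports whether the parser accepts a document.

```python
import xml.etree.ElementTree as ET


def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches production [2] Char of XML 1.0."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(doc: str) -> bool:
    """Return True if the standard-library parser accepts doc as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # A few code points on either side of the allowed ranges.
    for cp in (0x1, 0x9, 0x1F, 0x20, 0xD800, 0xFFFD, 0xFFFE, 0x10000):
        print(f"U+{cp:04X}", is_xml_char(cp))

    # U+0001 as a character reference, and raw inside a CDATA section.
    print(parses("<a>&#x1;</a>"))
    print(parses("<a><![CDATA[\x01]]></a>"))
    # The same two forms with an allowed character (tab) for comparison.
    print(parses("<a>&#x9;</a>"))
    print(parses("<a><![CDATA[\t]]></a>"))
```

With the expat-based parser I would expect both U+0001 documents to be
rejected and both tab documents to be accepted, which matches the
grammar: neither entity-encoding nor CDATA gets a disallowed character
through.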