Subject: Comments about 0.24.1 specification changes
From: news (at) *nospam* zzo38computer.org.invalid
Newsgroups: comp.infosystems.gemini
Date: 31 Aug 2024, 23:15:54
Organization: A noiseless patient Spider
Message-ID: <1724962885.bystand@zzo38computer.org>
User-Agent: bystand/1.3.0pre1
This article comments on:
gemini://geminiprotocol.net/news/2024_08_28.gmi
Clarification on encoding queries
This part is good.
Clarification on error reporting
This part is also good.
Correction to the example for multiple languages
While I think that there are some problems with the mechanism (and with
MIME in general), that is what it is, and there is no point in changing
it now. The specification should describe the mechanism correctly, and
now that the example has been fixed to do so, this part is good.
> it is not and never has been true that MIME media types are big trucks
> you can just dump parameters on whenever you feel like it, each type has
> a registered set of defined parameters. But it's not uncommon for people
> to think otherwise and I have had this pointed out more than once as a
> place where Gemini can be illicitly extended, so it can't hurt to be
> explicit about this.
There are other places where extensions can be added anyway, such as
extensions in the X.509 certificate.
I do not think that the "non-extensibility" actually works very well.
> Because empty text lines are valid (and widely used) in Gemtext
> documents. If a formulation like the above were used, it would be ambiguous
> whether or not every document which did end with a CRLF did or did not also
> include an empty text line after it which didn't include the optional final
> newline. Since empty text lines are supposed to be rendered individually
> each time they occur, this ambiguity actually has consequences. Absolutely
> trivial consequences, it's true, but the problem of documents without final
> newlines being ill-formed is trivial too.
These consequences raise a few issues.
When viewing a document on the screen, an extra blank line might not be
very significant, but it can be significant for paged media: you might
end up with an extra page which is blank (unless the formatter detects
and discards the trailing blank line for this reason).
I think that a final line break should be required (and should not result
in an extra blank line). If it is not present, implementations SHOULD
treat the document as though it were present, even though such a document
is not valid.
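A minimal sketch of the rule proposed above, in Python, assuming the line
break is treated as a line terminator rather than a separator. The name
split_gemtext_lines and the well_formed flag are hypothetical, and the
handling of an empty document is my own assumption, not something from
the specification.

```python
def split_gemtext_lines(text: str) -> tuple[list[str], bool]:
    """Split a document into lines, treating each line break as a terminator.

    Returns (lines, well_formed). well_formed is False when the final line
    break is missing; the document is still processed as though it were there.
    """
    if not text:
        # Assumption: an empty document has no lines and no final line break.
        return [], False
    normalized = text.replace("\r\n", "\n")  # accept CRLF and bare LF alike
    well_formed = normalized.endswith("\n")
    if not well_formed:
        normalized += "\n"  # tolerate the missing final line break
    lines = normalized.split("\n")
    lines.pop()  # the empty string after the last terminator is not a line
    return lines, well_formed
```

With this reading, "a" followed by one line break is one line, and "a"
followed by two line breaks is "a" plus one empty line, so no ambiguous
extra blank line (or blank page) appears at the end.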
Permit use of non-ASCII characters in text lines
I think that the ABNF should use only ASCII and should define "non-ASCII
characters" as bytes with the high bit set. Which combinations of such
bytes are valid depends on the character encoding in use, but that does
not affect the structure of the document, so it does not need its own ABNF.
There are character sets that cannot (and/or should not) be mapped to
Unicode, so writing the ABNF in terms of Unicode won't do, and I think
that writing the ABNF in terms of the "canonical form" is unnecessary too.
My proposal would be to disallow character encodings that are not a
superset of ASCII, but to allow any others, independently of whether or
not they are subsets of Unicode. For example, UTF-16 would be disallowed,
but EUC-JP would be allowed (and UTF-8 would still be allowed too).
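One way a client might approximate the "superset of ASCII" test is
sketched below in Python. The function name is_ascii_superset is
hypothetical, and the check is only a necessary condition: it confirms
that the 128 ASCII bytes decode to the same code points, but it does not
detect encodings whose multi-byte sequences reuse bytes below 0x80
(Shift_JIS trail bytes, for example), which would still break the
"non-ASCII means high bit set" assumption.

```python
def is_ascii_superset(encoding: str) -> bool:
    """Return True if the bytes 0x00-0x7F decode to the same ASCII code points.

    This is a necessary check only; see the caveat in the text above.
    """
    ascii_bytes = bytes(range(0x80))
    try:
        return ascii_bytes.decode(encoding) == ascii_bytes.decode("ascii")
    except (UnicodeDecodeError, LookupError):
        # Undecodable input or an encoding name Python does not know.
        return False
```

Under this sketch, UTF-8 and EUC-JP pass and UTF-16 fails, matching the
examples given above.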
The ABNF can have its own definition for line breaks: instead of CRLF, it
can define one that is either LF or CRLF. For example:
gemtext-document = [bom] 1*gemtext-line
linebreak = [CR] LF
nonascii = %x80-FF
VCHAR /= nonascii
bom = %xEF %xBB %xBF ; if document character encoding is UTF-8 or unspecified
bom = "" ; if document character encoding is specified and is not UTF-8
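The grammar above could be read at the byte level roughly as follows.
This is only a sketch in Python under my own assumptions: the names
UTF8_BOM and split_document are hypothetical, the charset argument stands
for whatever the response's MIME type declared (None if unspecified), and
bytes with the high bit set are passed through untouched, as the
"nonascii" rule suggests.

```python
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def split_document(data: bytes, charset: Optional[str] = None) -> list[bytes]:
    """Split raw document bytes into lines using linebreak = [CR] LF."""
    # bom: only recognised when the charset is UTF-8 or unspecified.
    if charset is None or charset.lower() in ("utf-8", "utf8"):
        if data.startswith(UTF8_BOM):
            data = data[len(UTF8_BOM):]
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # nothing follows the final line break
    # Drop the optional CR that may precede each LF.
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]
```

Decoding each line with the declared character encoding would happen
afterwards, so the line structure never depends on Unicode.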
-- Don't laugh at the moon when it is day time in France.