Subject: Escapes (was String-Based Macro Systems)
From: james.harris.1 (at) *nospam* gmail.com (James Harris)
Newsgroups: comp.lang.misc
Date: 03 May 2024, 10:01:57
Organization: A noiseless patient Spider
Message-ID: <v12966$euk9$1@dont-email.me>
References: 1 2
User-Agent: Mozilla Thunderbird
On 13/04/2024 06:09, Blue-Maned_Hawk wrote:
Lawrence D'Oliveiro wrote:
The big difference with m4 is that it does away with these special
symbols; the mere occurrence of a name matching a defined macro (or an
argument of the macro currently being expanded) is sufficient to trigger
substitution. Do you think this is a good idea?
>
There are all kinds of pitfalls with such macro systems. The original
Macrogenerator could not cope with substitutions containing unpaired “<
... >” quote symbols, and even GNU m4 lacks something as simple as a
backslash-style “escape next single character, whatever it is”. While m4
lets you switch the quoting symbols, it still insists that they occur in
pairs.
>
Would adding such an escape character be useful?
Yes, of course.
Whenever a system has a system to escape symbols, there are two ways to go
about it: either the symbol is magic by default, and the escape makes it
normal, or the symbol is normal by default, and the escape makes it magic.
Having both of the systems at once is generally confusing, because it
makes it difficult to remember which symbols are which. It's more
practical to have all of them be one or the other.
One could say that having the symbols only become magic upon escapement is
better, because it clearly indicates when a symbol has magic properties.
This is analogous to the logic used to defend sigils, a form of
disambiguation repeatedly found to be pointless because names already do
that disambiguation. Therefore, the correct choice is magic by default.
Interesting points, though I am not sure how you reached that conclusion (or what you mean by "the logic used to defend sigils").
In particular, magic characters are sometimes used in contexts in which there are no "names" with which to do any disambiguation. For example, the regular expression to match "parts" and "party" might be
"part[sy]"
I presume you would take that as magic-by-default, so that any literal occurrence of a magic symbol needs to be escaped, as in a[b] appearing as
"a\[b\]"
Alternatively, if magic symbols were prefixed with ~ then the above two strings would appear as
"part~[sy~]"
"a[b]"
Is that really worse?
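To make the comparison concrete, here is a rough Python sketch of the magic-by-request idea. It is only my illustration: the function name tilde_to_regex is made up, and it assumes that ~ is the sole escape and that only single-character metacharacters matter. It translates the ~ notation into the syntax Python's re module expects, so that the two example strings can be tried out.

```python
import re


def tilde_to_regex(pattern: str, escape: str = "~") -> str:
    """Translate a magic-by-request pattern into Python `re` syntax.

    Hypothetical helper: a character is magic only when preceded by
    `escape`; every other character is matched literally.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 >= len(pattern):
                raise ValueError("dangling escape at end of pattern")
            # The escaped character is passed through with its magic meaning.
            out.append(pattern[i + 1])
            i += 2
        else:
            # Unescaped characters are literal, so neutralise any magic.
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


magic_on_request = tilde_to_regex("part~[sy~]")  # behaves like part[sy]
plain_literal = tilde_to_regex("a[b]")           # behaves like a\[b\]
```

With this, re.fullmatch(magic_on_request, "party") succeeds, while plain_literal matches only the four characters a[b] themselves.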
One fallacious argument i've heard used to justify magic by default is
that it means that the treatment of the escape symbol itself is consistent
with all the other symbols in that it's magic by default unless escaped by
itself. I consider this fallacious because in a system where magic must
be explicit, the escape symbol would be the _only_ exception, and it would
be _impossible_ to make any others—what i'd say is a worthwhile sacrifice.
Indeed. In C, the backslash does double duty:
\n - backslash /gives/ significance to n
\" - backslash /removes/ the significance of the double quote
That inconsistency does seem odd.
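The same double duty can be seen in Python, whose string literals follow broadly the same backslash conventions as C. A small sketch (the variable names are mine):

```python
# Backslash GIVES significance: the ordinary letter n becomes a line feed.
newline = "\n"

# Backslash REMOVES significance: the quote no longer ends the literal.
quote = "\""

# Either way the source shows two characters but the string holds one.
lengths = (len(newline), len(quote))
code_points = (ord(newline), ord(quote))
```

Here lengths is (1, 1) in both cases, even though the backslash is doing opposite jobs.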
Either way, figuring out the solution to the problem of “Magic: by
default or by request?” is almost certainly a lower priority than the
majority of other problems.
It's an important issue nonetheless. And aren't there two contexts, as follows?
(1) A string which has to be converted by the compiler into binary encodings, e.g.
"Hello\nWorld"
(2) A string which /after any conversions/ means something to a function which processes it, e.g.
"Hello\:space:world"
where "\:space:" is meant to indicate whitespace to some program which processes the string, which implies that the backslash has to remain in the encoded string.
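A sketch of the two contexts in Python, under the assumption that some consumer treats "\:space:" as a marker for a single blank. The function expand_space is hypothetical and stands in for whatever program processes the string.

```python
# Context (1): the compiler consumes the escape; no backslash is stored.
compiled = "Hello\nWorld"

# Context (2): the backslash must survive into the stored string, so it is
# either doubled or written in a raw literal.
passed_on = "Hello\\:space:world"
passed_on_raw = r"Hello\:space:world"

MARKER = "\\:space:"  # backslash, colon, "space", colon


def expand_space(text: str) -> str:
    """Hypothetical consumer: replace each marker with one space."""
    return text.replace(MARKER, " ")


result = expand_space(passed_on)
```

The stored form of compiled contains a line feed and no backslash, whereas passed_on still contains the backslash for expand_space to interpret later.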
-- James Harris