Subject : Re: architectural goals, Byte Addressability And Beyond
From : monnier (at) *nospam* iro.umontreal.ca (Stefan Monnier)
Newsgroups : comp.arch
Date : 04. Jun 2024, 21:58:06
Organisation : A noiseless patient Spider
Message-ID : <jwv1q5cmnrs.fsf-monnier+comp.arch@gnu.org>
References : 1 2 3 4 5
User-Agent : Gnus/5.13 (Gnus v5.13)
Bottom line: Code point conversion instructions like CU14 solve a
problem imagined by people who have no experience working with UTF-8.
The original instructions were CU12 and CU21 which convert between
UTF-8 and UTF-16. That really is useful, e.g., read a file of UTF-8
into a program in Java or JavaScript which uses UTF-16. I agree the
UTF-32 versions added in zSeries are less likely to be useful.
These are all really instances of the same thing: conversion between
UTF-N1 and UTF-N2 is only ever worthwhile if you receive something using
UTF-N1 and you have to return something that uses UTF-N2.
If your task is described at a higher level and you're not constrained
by some arbitrary choices in intermediate APIs then you're almost always
better off working straight from the encoding you receive.
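
As a rough illustration of working straight from the received encoding,
here is a minimal Python sketch that counts code points and searches for
a substring directly on UTF-8 bytes, with no conversion to UTF-16 or
UTF-32. The function names are made up for this example, and both
functions assume the input is well-formed UTF-8.

```python
def count_code_points(data: bytes) -> int:
    """Count code points in well-formed UTF-8 without decoding it."""
    # Each code point contributes exactly one byte that is NOT a
    # continuation byte (continuation bytes look like 0b10xxxxxx).
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def find_utf8(haystack: bytes, needle: bytes) -> int:
    """Return the byte offset of needle in haystack, or -1 if absent."""
    # UTF-8 is self-synchronizing: in well-formed input, a byte-level
    # match of a well-formed needle always starts on a code point
    # boundary, so a plain byte search is enough.
    return haystack.find(needle)


text = "naïve café".encode("utf-8")           # 12 bytes, 10 code points
print(count_code_points(text))                # 10
print(find_utf8(text, "café".encode("utf-8")))  # 7 (a byte offset)
```

Note that the search returns a byte offset, which is usually what you
want if you keep processing the same buffer.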
Stefan