On 04/06/2025 20:58, Paul Edwards wrote:
"David Brown" <david.brown@hesbynett.no> wrote in message
news:101poqm$t350$1@dont-email.me...
On 04/06/2025 11:23, Paul Edwards wrote:
And I know what you're thinking - all the data is in EBCDIC.
There are no other EBCDIC systems I could possibly jump to.
We would need an 80386 EBCDIC version of Win32 in order
for this to be remotely possible - which doesn't exist, and likely
never will exist.
>
For it to exist it would need some sort of pseudo-bios concept
that allowed charset conversion. And no such thing exists as far
as I am aware!
You don't need an EBCDIC operating system, or "pseudo-bios" (whatever
/that/ might be) to use data using EBCDIC character encoding. It is no
different from working with any other character encoding - ASCII,
different 8-bit code pages, or whatever. If the data is just passing
through your code, read it in and pass it out without a care. If you
need to convert it or mix it with some other encoding, work with a
common encoding - UTF-8 is normally the right choice.
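To make the "convert it" case concrete, here is a minimal sketch in C. It assumes the program is compiled on an ordinary ASCII platform, and it only handles letters, digits and space, which sit at the same code points in the common EBCDIC code pages. The function names are made up for illustration, and every other byte simply becomes '?'.

```c
#include <assert.h>
#include <stddef.h>

/* Map one EBCDIC byte to ASCII. Only space, letters and digits are
   handled in this sketch; anything else becomes '?'.
   Assumes the host character set is ASCII, so 'a'..'z' are contiguous. */
static unsigned char ebcdic_to_ascii(unsigned char c)
{
    if (c == 0x40) return ' ';
    if (c >= 0x81 && c <= 0x89) return (unsigned char)('a' + (c - 0x81));
    if (c >= 0x91 && c <= 0x99) return (unsigned char)('j' + (c - 0x91));
    if (c >= 0xA2 && c <= 0xA9) return (unsigned char)('s' + (c - 0xA2));
    if (c >= 0xC1 && c <= 0xC9) return (unsigned char)('A' + (c - 0xC1));
    if (c >= 0xD1 && c <= 0xD9) return (unsigned char)('J' + (c - 0xD1));
    if (c >= 0xE2 && c <= 0xE9) return (unsigned char)('S' + (c - 0xE2));
    if (c >= 0xF0 && c <= 0xF9) return (unsigned char)('0' + (c - 0xF0));
    return '?';
}

/* Convert n bytes of EBCDIC in src into ASCII in dst.
   dst must have room for n bytes; src and dst may be the same buffer. */
static void ebcdic_buf_to_ascii(const unsigned char *src,
                                unsigned char *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = ebcdic_to_ascii(src[i]);
}
```

Data that is only passing through never needs to touch these functions; they matter only at the point where the bytes have to be compared with, or mixed into, text in another encoding.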
If I have existing C code that does:
>
fopen("test1.dat", "rb");
fread into buf
if (memcmp(buf + 5, "XYZ", 3) == 0)
>
and test1.dat is in EBCDIC, the above program on the 80386
has been compiled with EBCDIC strings, so it works, and then
now you do:
>
printf("It matches!\n");
>
where do you expect those characters in the printf string - all
currently EBCDIC - to be translated to ASCII on a modern
Windows 10/11 system?
>
In /your/ code!
>
/You/ are responsible for writing the code that handles the data, and
which gets the encoding right. If you want to handle data in an odd
encoding, write code to handle it. That's what everyone else does when
dealing with data in different encodings.
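One possible reading of "in your code", sketched in C for the test1.dat example above: the program is built as a normal native program, so the file name and the printf string are ordinary native literals, and only the comparison against the file's contents uses EBCDIC byte values. The names record_matches and check_file are hypothetical, and the 64-byte read is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* "XYZ" spelled out as EBCDIC bytes: X = 0xE7, Y = 0xE8, Z = 0xE9. */
static const unsigned char XYZ_EBCDIC[3] = { 0xE7, 0xE8, 0xE9 };

/* Returns 1 if the record holds EBCDIC "XYZ" at offset 5, else 0. */
static int record_matches(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 5 + sizeof XYZ_EBCDIC)
        return 0;
    return memcmp(buf + 5, XYZ_EBCDIC, sizeof XYZ_EBCDIC) == 0;
}

/* Reads the start of a data file and reports whether it matches.
   Returns 1 on a match, 0 on no match, -1 if the file cannot be opened. */
static int check_file(const char *path)
{
    unsigned char buf[64];
    FILE *fp = fopen(path, "rb");   /* path is a native string */
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    if (record_matches(buf, n)) {
        printf("It matches!\n");    /* native literal, no translation step */
        return 1;
    }
    return 0;
}
```

Calling check_file("test1.dat") then works on any system, because the only EBCDIC in the program is the data it was written to examine.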
And how do you expect Windows 10/11 to find "test1.dat" -
all EBCDIC on its current ASCII filesystem?
Convert the character encoding for the string.
>
People do this all the time. They write code that uses UTF-8, and have
to deal with Windows' crappy UCS-2 / partial UTF-16 / wchar_t mess. Or
they have code that supports different code pages because they started
it in the pre-Unicode days and don't live in a little American-only
ASCII bubble.
>
C does not make this stuff particularly convenient, though it has
improved a little since C90 - other languages have vastly superior
string and encoding handling. But that does not mean you can't do it,
or should not do it.
>
Maybe if you actually wanted to contribute something useful to the C
world - something that other people might find useful - you could put
your efforts into writing a library that has functions for converting
encodings back and forth with UTF-8 as the base. Include support for
the dozen different EBCDIC versions.
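As a rough idea of the shape such a library could take, here is a sketch in C: each EBCDIC variant gets a table from byte to Unicode code point, and one shared routine turns code points into UTF-8. Everything here is hypothetical, including the function names. The table is a four-entry excerpt of code page 037 rather than the full 256 entries, and the encoder only covers code points below U+0800, which is enough for the characters in that code page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a code point below U+0800 as UTF-8.
   Returns the number of bytes written (1 or 2). */
static size_t utf8_put(uint32_t cp, unsigned char *out)
{
    assert(cp < 0x800);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}

/* Tiny excerpt of a code page 037 table. A real library would carry
   one full 256-entry table per supported EBCDIC variant. */
static uint32_t cp037_to_unicode(unsigned char c)
{
    switch (c) {
    case 0x40: return 0x0020;   /* space */
    case 0x4A: return 0x00A2;   /* cent sign */
    case 0x5F: return 0x00AC;   /* not sign */
    case 0xC1: return 0x0041;   /* A */
    default:   return 0x003F;   /* '?' for everything this excerpt omits */
    }
}

/* Convert n bytes of code page 037 text to UTF-8.
   dst must have room for 2 * n bytes. Returns the UTF-8 length. */
static size_t cp037_to_utf8(const unsigned char *src, size_t n,
                            unsigned char *dst)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++)
        len += utf8_put(cp037_to_unicode(src[i]), dst + len);
    return len;
}
```

With UTF-8 as the common base, supporting another variant means adding another table, not another pairwise converter.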
>
Or do you really think that if someone sent me a file to read that was
in EBCDIC encoding, I'd be happy to install an EBCDIC "pseudo-bios" and
EBCDIC version of Windows so that I could read it?