Subject : Re: China: Government Starts Phasing Out American Processors, Operating Systems on Government Computers
From : mds (at) *nospam* bogus.nodomain.nowhere (Mike Spencer)
Newsgroups : misc.news.internet.discuss, comp.misc
Date : 31 Mar 2024, 06:01:52
Organization : Bridgewater Institute for Advanced Study - Blacksmith Shop
Message-ID : <87y19zm1tr.fsf@enoch.nodomain.nowhere>
References : 1 2 3 4 5 6 7
User-Agent : Gnus v5.7/Emacs 20.7
Rich <rich@example.invalid> writes:
In comp.misc Mike Spencer <mds@bogus.nodomain.nowhere> wrote:
The PDF author had used the ff ligature from whatever
$CURRENTLY-KEWL-CHARSET which was rendered readably. But the xpdf
author wasn't clueful enough to realize that no user ever enters a
ligature character code from the keyboard as a search target and write
compensating translations into the source code.
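For illustration only, here is a minimal Python sketch of the kind of compensating translation described above. It is not xpdf's code. It assumes the extracted text really does contain the Unicode ligature character U+FB00, and the names `fold_ligatures` and `find` are made up for this example.

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    # NFKC normalization replaces compatibility characters such as
    # U+FB00 (LATIN SMALL LIGATURE FF) with their plain-letter equivalents.
    return unicodedata.normalize("NFKC", text)


def find(haystack: str, needle: str) -> int:
    # Fold both sides so that a typed "ff" matches an extracted ligature.
    # The returned index refers to the folded text, not the original.
    return fold_ligatures(haystack).find(fold_ligatures(needle))


# Hypothetical extracted text containing the ff ligature (U+FB00).
extracted = "a di\ufb00erent glyph"

print(find(extracted, "different"))
```

NFKC folds other compatibility characters as well (superscripts, full-width forms and so on), so a real search feature might want a narrower table. This only helps when the reader can recover the ligature's character code in the first place.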
It may not be xpdf's author's fault. If the pdf creator did not
provide a proper reverse map table from the code point used for the ff
ligature to its actual character (or characters) then there's nothing a
pdf reader can do to fix the problem.
The problem is that the PDF specification allows for the PDF creator to
create arbitrary mappings from byte values used in the PDF file to any
given glyph in a font file. But it makes optional the reverse mapping
table which would define to a PDF reader program that "byte value 0x32
in this portion of this PDF [1] represents the 'ff' ligature".
Without that reverse table, PDF is effectively a "write only medium".
It will print a perfect document, but you can't search, nor copy out,
anything from it.
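A toy Python model of the situation described above, as a sketch under loud assumptions: the tables, the byte values other than 0x32 and the function name `extract_text` are all invented, and real PDFs store these mappings quite differently. As far as I know, the optional reverse table corresponds to what the PDF spec calls a ToUnicode CMap.

```python
from typing import Dict, Optional

# Hypothetical forward mapping chosen by the PDF creator:
# byte value -> glyph name in the embedded font. Enough to print the page.
GLYPH_FOR_CODE: Dict[int, str] = {0x01: "o", 0x32: "f_f", 0x02: "e", 0x03: "r"}

# Hypothetical optional reverse mapping: byte value -> actual character(s).
REVERSE_MAP: Dict[int, str] = {0x01: "o", 0x32: "ff", 0x02: "e", 0x03: "r"}

# The word "offer" as stored in the file: four bytes, one of them a ligature.
codes = bytes([0x01, 0x32, 0x02, 0x03])


def extract_text(codes: bytes, reverse_map: Optional[Dict[int, str]]) -> str:
    if reverse_map is None:
        # No reverse table: this sketch simply gives up and emits U+FFFD
        # for every byte. The page still prints, but cannot be searched.
        return "\ufffd" * len(codes)
    return "".join(reverse_map.get(c, "\ufffd") for c in codes)


print(extract_text(codes, REVERSE_MAP))  # searchable text
print(extract_text(codes, None))         # "write only" text
```

With the reverse table, a search for "offer" can succeed. Without it, the reader has four bytes and four glyph shapes but no statement of what characters they stand for.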
Thank you for that. Groveling through the PDF spec is well above my
pay grade. It sounds like a can of worms to me -- creeping
featuritis, make any weird hack the devs can think of possible.
Huh. Tnx.
[1] 0x32 can be made to represent any number of different glyphs within
a single given PDF. In fact, if one were so devious as to do so, every
byte in the pdf representing a text character could be 0x32, and each
one could "print" to the electronic sheet of paper a different font
glyph.
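The devious case in the footnote can be modelled in a few lines of Python. This is a sketch only: the font names F1 to F3, their one-entry mappings and the `render` function are hypothetical and stand in for the per-font encodings a real PDF would carry.

```python
# Hypothetical fonts, each with its own byte -> glyph mapping.
# In every one of them, 0x32 points at a different glyph.
FONTS = {
    "F1": {0x32: "h"},
    "F2": {0x32: "e"},
    "F3": {0x32: "y"},
}

# A made-up content stream as (font, byte) pairs: every text byte is 0x32.
stream = [("F1", 0x32), ("F2", 0x32), ("F3", 0x32)]


def render(stream):
    # What lands on the electronic sheet of paper: the glyph each
    # font assigns to the byte, in order.
    return "".join(FONTS[font][code] for font, code in stream)


# What a naive copy or search sees: the raw bytes, all identical.
raw = bytes(code for _, code in stream)

print(render(stream))
print(raw)
```

The page shows three different glyphs, while the bytes in the file are the same value three times over, so nothing useful can be copied out without per-font reverse tables.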
-- Mike Spencer Nova Scotia, Canada