Subject : Re: Microsoft makes a lot of money, Is Intel exceptionally unsuccessful as an architecture designer?
From : david.brown (at) *nospam* hesbynett.no (David Brown)
Newsgroups : comp.arch
Date : 25 Sep 2024, 10:15:24
Organization : A noiseless patient Spider
Message-ID : <vd0kbc$3keo5$2@dont-email.me>
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
On 24/09/2024 22:28, Brett wrote:
> MitchAlsup1 <mitchalsup@aol.com> wrote:
>> But I do notice that when converted to *.pdf the file shrinks by 5×
> Word has unlimited undo and the file saves all versions of your doc.
> Do a Save As and all that crap will be gone, shrinking the file by also 5x.
The key thing that impacts the size of MS Office documents is the unbelievable inefficiency of its XML / HTML. You can try exporting a Word document as HTML, or unpacking a .docx file and looking at the XML, or even just looking at the HTML source in an email produced by Outlook. Every little bit of text, often divided up by line or even by word, comes with several dozen attributes to specify the class, alignment, font, colour, language, and so on. These are repeated every time, for every word, line or sentence, despite nothing changing. And then when you edit the file on a different system, they are all wrapped again in a new layer of the same pointless crap.
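To put a rough number on that repetition, something like the following Python sketch could be used. It assumes you have already extracted word/document.xml from a .docx file, and it only looks at runs (w:r) that are direct children of a paragraph (w:p), so runs nested inside hyperlinks and similar wrappers are ignored. The function names are made up for illustration; only the standard library is used.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's "{uri}" notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def signature(elem):
    """Hashable summary of an element: tag, attributes, child signatures."""
    if elem is None:
        return None
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        tuple(signature(child) for child in elem),
    )


def count_repeated_formatting(document_xml):
    """Return (total_runs, runs_repeating_the_previous_run's_formatting).

    A run counts as "repeating" when its run properties (w:rPr) are
    identical to those of the run just before it in the same paragraph.
    """
    root = ET.fromstring(document_xml)
    total = repeated = 0
    for para in root.iter(W + "p"):
        previous = None
        for run in para.findall(W + "r"):
            current = signature(run.find(W + "rPr"))
            total += 1
            if current is not None and current == previous:
                repeated += 1
            previous = current
    return total, repeated
```

Calling count_repeated_formatting() on the contents of word/document.xml gives a crude idea of how many runs carry formatting that merely duplicates their neighbour's.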
Undo lists and change history lists make the file several times bigger, but these countless extra layers of tags can increase the raw XML data by several orders of magnitude. Of course they compress well, but they still affect the file size significantly, as well as the speed and memory usage.
Try taking a well-worn Word document, unpacking it and looking at the XML file. Compare it to one you get after doing a "save as". Then try opening it in LibreOffice, saving it again in .docx format, and looking at the XML there.
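If you would rather compare numbers than read the XML by eye, a small Python sketch along these lines could do it. A .docx file is an ordinary ZIP archive, so the standard zipfile module can report the raw and compressed size of each XML part without unpacking anything to disk. The function names and the idea of passing the files on the command line are just assumptions for the example.

```python
import sys
import zipfile


def xml_sizes(docx):
    """Map each XML part of a .docx to (uncompressed_bytes, compressed_bytes).

    `docx` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        return {
            info.filename: (info.file_size, info.compress_size)
            for info in archive.infolist()
            if info.filename.endswith(".xml")
        }


def report(docx):
    """Print one line per XML part with its raw and compressed size."""
    for name, (raw, packed) in sorted(xml_sizes(docx).items()):
        print(f"{name}: {raw} bytes raw, {packed} bytes compressed")


if __name__ == "__main__":
    # Pass the files to compare, e.g. the well-worn original, the
    # "save as" copy, and the copy re-saved from LibreOffice.
    for path in sys.argv[1:]:
        print(path)
        report(path)
```

The main text of the document lives in word/document.xml, so that is the entry to compare across the three versions.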
The "undo" and change lists also have a big privacy impact, on top of the inefficiencies.