Robert Swindells <rjs@fdy2.co.uk> writes:
>You could compare sizes of applications in the base.tgz tarball for each
>architecture, this is available for RISC-V as well as all the others.
I did that, see below. There is one problem: RISC-V is only available
in the daily builds, many of the other architectures are not. So I
used 10.0 for all architectures except RISC-V. I also measured the
daily builds for AMD64, to see how the difference in versions of the
source affects the code sizes.
The .text section sizes are (sorted by the ksh column):
    libc      ksh      pax       ed
 1102054   124726    66218    26226 riscv-riscv32
 1077192   127050    62748    26550 riscv-riscv64
 1030288   150686    79852    31492 mvme68k
  779393   155764    75795    31813 vax
 1302254   171505    83249    35085 amd64
 1229032   178332    89180    36876 evbarm-aarch64
 1539052   179055    82280    34717 amd64-daily
 1374961   184458    96971    37218 i386
 1247476   185792    96728    42028 evbarm-earmv7hf
 1333952   187452    96328    39472 sparc
 1586608   204032   106896    45408 evbppc
 1536144   204320   106768    43232 hppa
 1397024   216832   109792    48512 sparc64
 1538536   222336   107776    44912 evbmips-mips64eb
 1623952   243008   122096    50640 evbmips-mipseb
 1689920   251376   120672    51168 alpha
          2324752  2259984  1378000 ia64
libc seems to differ quite a bit between architectures, probably
because it contains architecture-specific code (with vax being an
outlier towards small sizes, see below); the programs seem to be less
specialized. Looking at the two amd64 results, the differences
between 10.0 and daily seem to be small for pax and ed, while ksh and
libc seem to have grown quite a bit. The RISC-V variants use
compressed instructions; evbarm-earmv7hf uses A32 (no 16-bit
instructions). The ia64 binaries are statically linked, and no
shared libraries are present in the base package.
I looked at what the largest functions in libc are:
For vax:
000afab4 g F .text 0000266d __ns_sprintrrf
00079d06 g F .text 00002c09 __vfwprintf_unlocked_l
000cd5f0 g F .text 00003888 __vfprintf_unlocked_l
For riscv-riscv32:
000792ac g F .text 0000238a __vfwprintf_unlocked_l
001169ec g F .text 000026aa __vfprintf_unlocked_l
000f64b2 g F .text 00002af0 __ns_sprintrrf
000c798c l F .text 00002c74 malloc_conf_init_helper
0013be64 l F .text 0000503a stats_arena_print
The last two functions do not occur in the vax's libc (nor do a lot of
others), which probably explains much of the size difference.
__ns_sprintrrf is larger on RISC-V, while __vfprintf_unlocked_l and
__vfwprintf_unlocked_l are smaller; in total, these three functions
are a factor of 1.186 larger on the vax than on riscv-riscv32. So the
difference in libc sizes is probably due to additional functions in
riscv-riscv32.
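As a cross-check of that factor, here is a minimal sketch that adds up
the three sizes from the listings above. The variable names are mine,
and it assumes a shell whose arithmetic accepts 0x constants:

```shell
# Sizes (hex) of __ns_sprintrrf, __vfwprintf_unlocked_l and
# __vfprintf_unlocked_l, copied from the objdump listings above.
vax=$(( 0x266d + 0x2c09 + 0x3888 ))
rv32=$(( 0x238a + 0x26aa + 0x2af0 ))
echo "vax: $vax bytes, riscv-riscv32: $rv32 bytes"

# The shell only does integer arithmetic, so leave the division to awk.
ratio=$(awk -v a="$vax" -v b="$rv32" 'BEGIN { printf("%.4f", a / b) }')
echo "vax/riscv-riscv32: $ratio"
```

This should report 35582 bytes for the vax and 29988 bytes for
riscv-riscv32, i.e. a ratio of about 1.1865.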
Looking at ksh, pax and ed, the RISC-V variants have the smallest code
sizes, even for ksh. The VAX code is significantly larger, even
though it is still small relative to most other architectures.
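To put a number on "significantly larger", the following sketch
divides the vax .text sizes by the riscv-riscv32 sizes from the table
above. The numbers are copied from the table; the variable name and
the output format are my own choice:

```shell
# .text sizes for ksh, pax and ed, copied from the table above.
ratios=$(awk 'BEGIN {
    split("ksh pax ed", name, " ")
    split("155764 75795 31813", vax, " ")
    split("124726 66218 26226", rv32, " ")
    # Print the vax size relative to riscv-riscv32 for each program.
    for (i = 1; i <= 3; i++)
        printf("%s %.2f\n", name[i], vax[i] / rv32[i])
}')
echo "$ratios"
```

By this measure the vax code is roughly 14% (pax) to 25% (ksh) larger
than the riscv-riscv32 code.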
So if a major goal of the VAX project was to have small code sizes,
going for RV32GC (riscv-riscv32) would have been a good idea. And an
implementation somewhat similar to the HP-PA TS-1 (with smaller cache
due to the SRAM technology at the time) plus a PDP-11-to-microcode
decoder would not have increased the cost compared to the actual VAX,
and would probably have resulted in faster execution.
In an earlier posting I suggested a PDP-11->RV32G decoder, but that
would not be a good match given the condition-code architecture of the
PDP-11 and the CC-less architecture of RISC-V. So one solution is to
have a microarchitecture that has a CC register for PDP-11 emulation
and decode the PDP-11 code to that. Another approach would be to add
carry and overflow to the GPRs as a RISC-V extension as I suggested
elsewhere, and I guess then PDP-11 -> extended RISC-V would be
possible.
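To illustrate the mismatch: a PDP-11 ADD sets N, Z, V and C as a side
effect, while a CC-less architecture has to compute each flag it needs
with separate operations. The sketch below models that for a 16-bit
add in shell arithmetic. pdp11_add_flags is a hypothetical helper of
mine, not something from an existing emulator, and the carry test
mirrors the usual sltu idiom on RISC-V:

```shell
# Hypothetical helper: 16-bit ADD that derives the PDP-11 N, Z, V and C
# flags from plain integer operations, as a CC-less ISA would have to.
pdp11_add_flags() {
    a=$(( $1 & 0xFFFF ))
    b=$(( $2 & 0xFFFF ))
    sum=$(( (a + b) & 0xFFFF ))
    n=$(( (sum >> 15) & 1 ))          # N: sign bit of the result
    z=$(( sum == 0 ))                 # Z: result is zero
    # V: both operands have the same sign, and the result has the other
    v=$(( ((~(a ^ b) & (a ^ sum)) >> 15) & 1 ))
    # C: the unsigned result wrapped around (sum < a, i.e. sltu)
    c=$(( sum < a ))
    echo "sum=$sum N=$n Z=$z V=$v C=$c"
}

pdp11_add_flags 0x7FFF 1    # signed overflow, no carry
pdp11_add_flags 0xFFFF 1    # carry out, result zero
```

Each flag costs one or more extra operations here, which is the
overhead that a CC register in the microarchitecture (or carry and
overflow bits attached to the GPRs) would avoid.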
Instead of having a cache, an interleaved memory subsystem might also
be able to provide the memory bandwidth to make better use of the RISC
execution rate potential. Also, the compressed instructions reduce
the instruction bandwidth requirements (compared to RISC-V without
compressed instructions), but require an additional instruction buffer
(additional TTL chips).
Here are the scripts I used:
for i in alpha evbarm-earmv7hf evbmips-mips64eb evbmips-mipseb evbppc hppa i386 ia64 mvme68k sparc vax; do mkdir -p $i/unpacked && (cd $i && wget http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-10.0/$i/binary/sets/base.tgz); done
for i in amd64 evbarm-aarch64 sparc64; do mkdir -p $i/unpacked && (cd $i && wget http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-10.0/$i/binary/sets/base.tar.xz); done
for i in riscv-riscv32 riscv-riscv64; do mkdir -p $i/unpacked && (cd $i && wget http://ftp.fr.netbsd.org/pub/NetBSD-daily/HEAD/latest/$i/binary/sets/base.tgz); done
mkdir -p amd64-daily/unpacked
cd amd64-daily
wget http://ftp.fr.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/sets/base.tar.xz
cd ..
for i in *; do (cd $i/unpacked; if test -f ../base.tgz; then tar xfz ../base.tgz; else tar xfJ ../base.tar.xz; fi); done
for i in *; do for j in lib/libc.so bin/ksh bin/pax bin/ed; do (cd $i/unpacked; if test -f $j; then objdump -h $j|awk --non-decimal-data '/[.]text/ {printf("%8d ","0x"$3)}'; else printf "%9s" ""; fi); done; echo $i; done|sort -nk2
For determining the largest functions in libc (in an unpacked/lib
directory):
objdump -t libc.so|grep '[.]text'|sort -t '\0' -k1.25
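A variant that does not depend on the column position is sketched
below. largest_text_syms is a hypothetical helper name of mine; it
assumes that the size is the field following ".text" in the objdump -t
output and that the symbol name is the last field. Because the sizes
are zero-padded hex of fixed width, a plain byte-wise sort orders them
by size:

```shell
# Hypothetical helper: read `objdump -t` output on stdin and print the
# N largest .text symbols (default 5) with their sizes in decimal.
largest_text_syms() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == ".text") { print $(i + 1), $NF; break } }' |
        LC_ALL=C sort | tail -n "${1:-5}" |
        while read -r size name; do
            # printf converts the hex size when given a 0x prefix
            printf '%8d %s\n' "0x$size" "$name"
        done
}
```

Usage would be something like: objdump -t libc.so | largest_text_syms 5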
- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>