Newsportal USENET - Re: Roman numerals , recognizer "0r".

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:

0
value: rdigit-value
2value: rdigit-string
constant rdigit-size
>
>
: romandigit ( u "romandigit" -- )
   , parse-name save-mem 2, ;
>
create romandigits
\ this table contains variants with 4 repetitions, you can comment
\ them out if desired
900 romandigit CM
500 romandigit D
400 romandigit CD
400 romandigit CCCC
300 romandigit CCC
200 romandigit CC
100 romandigit C
90 romandigit XC
50 romandigit L
40 romandigit XL
40 romandigit XXXX
30 romandigit XXX
20 romandigit XX
10 romandigit X
9 romandigit IX
5 romandigit V
4 romandigit IV
4 romandigit IIII
3 romandigit III
2 romandigit II
1 romandigit I
here constant end-romandigits
>
: roman>n? ( c-addr u -- n f )
\ if c-addr u contains a roman numeral, f is true and n is the value,
\ otherwise f is false.
   dup >r 'M' skip r> over - 1000 *
   romandigits case {: d: str1 n1 rd1 :}
   rd1 end-romandigits = ?of n1 str1 nip 0= endof
   str1 rd1 rdigit-string string-prefix? ?of
   str1 rd1 rdigit-string nip /string
   n1 rd1 rdigit-value +
   rd1 rdigit-size + contof
   str1 n1 rd1 rdigit-size + next-case ;

[...]

For ROMAN>N? I first tried an orthodox approach with data and return
stack only, and BEGIN etc., but with 4 stack items that have to be
updated possibly at every iteration that was somewhat unwieldy, and I
produced a buggy version. Then I tried this approach with the
extended CASE and locals, and I got it right on first try, despite its
bulk. I leave it to dxf to show how much better this becomes in
orthodox Forth.

After writing .ROMAN a good way to write a close-to-orthodox ROMAN>N?
became clear to me:

: roman>n? ( c-addr u -- n f )
\ if c-addr u contains a roman numeral, f is true and n is the value,
\ otherwise f is false.
dup >r 'M' skip r> over - 1000 * -rot
end-romandigits romandigits u+do ( n1 c-addr1 u1 )
2dup i rdigit-string string-prefix? if
i rdigit-string nip /string
rot i rdigit-value + -rot then
rdigit-size +loop
nip 0= ;

As so often, xDO ... xLOOP came to the rescue and meant that the stack
items to be managed were reduced by one. On the first attempt, I did
not go for that approach, because I wanted to keep two changing cells
on the return stack, and xDO ... xLOOP does not allow having the other
value on the return stack on entering the loop, and also restricts the
use of the return stack inside the loop. However, dealing with three
stack items on the data stack is not too bad, even though they all
change.

Compared to the extended case + locals variant, this variant is
shorter and also clearer.

Let's see if locals can still do something for us:

First, let's try the single-assignment style I usually prefer:

: roman>n? ( c-addr u -- n f )
\ if c-addr u contains a roman numeral, f is true and n is the value,
\ otherwise f is false.
dup >r 'M' skip r> over - 1000 *
end-romandigits romandigits u+do ( c-addr1 u1 n1 )
{: u1 :}
2dup i rdigit-string string-prefix? if
i rdigit-string nip /string
u1 i rdigit-value +
else
u1 then
rdigit-size +loop
over 0= 2nip ;

Not so great. How about assigning to the local?

: roman>n? ( c-addr u -- n f )
\ if c-addr u contains a roman numeral, f is true and n is the value,
\ otherwise f is false.
dup >r 'M' skip r> over - 1000 * {: n1 :}
end-romandigits romandigits u+do ( c-addr1 u1 )
2dup i rdigit-string string-prefix? if
i rdigit-string nip /string
i rdigit-value +to n1 then
rdigit-size +loop
nip n1 swap 0= ;

That looks best to me, although purists may disagree. A look at the
code between if...then is also interesting:

rot...-rot local
i 1->2    i 1->2
mov    r15,[r14]    mov    r15,[r14]
lit+ 2->2 lit+ 2->2
#8 #8
sub    rbx,$40    sub    rbx,$40
add    r15,-$08[rbx]    add    r15,-$08[rbx]
2@ 2->3 2@ 2->3
mov    r9,[r15] mov    r9,[r15]
mov    r15,$08[r15] mov    r15,$08[r15]
nip 3->2    nip 3->2
mov    r15,r9 mov    r15,r9
/string 2->2    /string 2->2
add    r10,$08    add    r10,$08
mov    r9,r15 mov    r9,r15
mov    r15,r13    mov    r15,r13
mov    r13,[r10]    mov    r13,[r10]
add    r13,r9 add    r13,r9
sub    r15,r9 sub    r15,r9
rot 2->3    i 2->2
mov    r9,$08[r10]    mov    [r10],r13
add    r10,$08    sub    r10,$08
i 3->2    mov    r13,r15
mov    [r10],r13    mov    r15,[r14]
sub    r10,$10    @ 2->2
mov    r13,r9 mov    r15,[r15]
mov    $08[r10],r15 lit 2->3
mov    r15,[r14]    #0
@ 2->2    mov    r9,$30[rbx]
mov    r15,[r15]    +!localn 3->1
+ 2->1    add    $00[rbp][r9],r15
add    r13,r15    add    rbx,$40
-rot 1->1
mov    rdx,$10[r10]
mov    rax,$08[r10]
mov    $10[r10],r13
mov    $08[r10],rdx
mov    r13,rax
add    rbx,$40

VFX might produce better code for the rot...-rot variant, though, but
the code uses some Gforth-specific features. Let's just try some
fragment for equivalent code:

: foo ?do if i 8 + 2@ nip /string rot i @ + -rot then 24 +loop ;
: bar {: n1 :} ?do if i 8 + 2@ nip /string i @ +to n1 then 24 +loop n1 ;

The following only shows the code between the IF and the THEN.

   VFX64 lxf
foo (rot -rot) bar (local)    foo (rot -rot)    bar (local)
MOV RDX, R14 MOV RDX, R14 mov eax , [esp] mov eax , [esp]
ADD RDX, # 08    ADD RDX, # 08    add eax , [esp+4h] add eax , [esp+4h]
MOV RCX, RDX MOV RCX, RDX add eax , # 8h    add eax , # 8h
MOV RCX, 0 [RDX] MOV RCX, 0 [RDX] mov ecx , [eax] mov ecx , [eax]
SUB RBX, RCX SUB RBX, RCX mov eax , [eax+4h] mov eax , [eax+4h]
ADD RCX, [RBP] ADD RCX, [RBP] sub ebx , ecx sub ebx , ecx
MOV RDX, R14 MOV RDX, R14 add ecx , [ebp] add ecx , [ebp]
MOV RAX, 0 [RDX] MOV RAX, 0 [RDX] mov eax , [esp] mov eax , [esp]
ADD RAX, [RBP+08] ADD [RDI+10], RAX add eax , [esp+4h] add eax , [esp+4h]
MOV [RBP], RCX MOV [RBP], RCX mov eax , [eax] mov eax , [eax]
MOV [RBP+08], RAX    add eax , [ebp+4h] add eax , [esp+8h]
mov [ebp] , ecx mov [esp+8h] , eax
mov [ebp+4h] , eax mov [ebp] , ecx
mov eax , [esp]

The overall code for BAR is quite a bit longer on both VFX64 and lxf,
though.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
   New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/

Date	Sujet	#	Auteur
8 Jun 25	Re: Roman numerals , recognizer "0r".	3	Anton Ertl
9 Jun 25	Re: Roman numerals , recognizer "0r".	1	Anton Ertl
9 Jun 25	Re: Roman numerals , recognizer "0r".	1	Anton Ertl