On 4/24/2025 6:00 PM, MitchAlsup1 wrote:
On Thu, 24 Apr 2025 20:10:29 +0000, BGB wrote:
----------
But, a recent line of fiddling has gone in an odd direction.
I felt a need for a basic script language for some tasks;
I have been doing something similar--except I wrote my script
translator in eXcel.
Errm?...
I use it to read *.h files and spit out *.c files that translate
type mathfunction(type arguments) into a series of lines Brian's
compiler interprets as transcendental instructions in My 66000
ISA.
So, code contains the prototype::
extern type_r recognized_name( type_a1 name );
and my scripter punts out::
type_r recognized_spelling( type_a name )
{
register type_a R1 __asm__("R1") = name;
__asm__("instruction_spelling\t%0,%1" : "=r" (R1) : "r" (R1) );
return R1;
}
So when user codes (in visibility of math.h)
y = sinpi( x );
compiler spits out:
SINPI Ry,Rx
OK.
In my case, the script interpreter is written in C.
Code needed to get the core interpreter working: Around 1000 lines;
Code needed after adding more stuff, around 1500 lines.
Though, not counting the ~ 600 lines needed mostly for the dynamic type-system and similar.
A vaguely similar design was implemented inside the TestKern shell, but was written to make use of BGBCC extensions. For this case, I needed to write something that would also work in MSVC and GCC. Core design for the dialect was still similar though (and I chose BASIC as a base partly because I already knew I could get something that was fairly usable with comparably small code).
Language sort of looks like:
//comment, contents entirely ignored by parser
rem stuff //also comment, but subject to token rules
x=a+b //basic assignment
let x=a+b //also assignment, creates vars in global scope
temp y=a+b //similar to let, but dynamically scoped
x=a*b+c*d //does compound statements with a normal-ish precedence.
x=12345 //integer literal, decimal
x=0x1234 //hexadecimal
x="string" //string, uses C style escapes
dim a(128) //creates a global array
if x<10 goto label
label:
goto label //goto
gosub label //subroutine call to label
return //return from most recent gosub
end //script terminates
print stuff //print stuff to console
x=arr(i) //load from array
arr(i)=x //store to array
Atypical stuff:
Dynamically typed;
Traditional BASIC used suffixes to encode type.
With no-suffix typically for a default numeric type.
QBasic and Visual Basic using static types.
Dynamically scoped;
Like Emacs Lisp.
Callee can see variables in the caller;
Variables can be created that do not affect the caller.
...
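As a rough sketch of the dynamic scoping described above (not the actual interpreter code), one way to get "callee sees caller's variables" plus "temp" bindings that vanish on return is a single binding stack. All names here (Binding, var_temp, scope_enter, ...) are hypothetical, and having plain assignment create a binding when none exists is an assumption of the sketch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical binding stack: the newest binding of a name wins. */
#define MAX_BINDINGS 256

typedef struct { const char *name; long value; } Binding;

static Binding bind_stack[MAX_BINDINGS];
static int bind_top = 0;   /* number of live bindings */

/* Look up a variable by searching from newest to oldest binding. */
static Binding *var_lookup(const char *name)
{
    for (int i = bind_top - 1; i >= 0; i--)
        if (strcmp(bind_stack[i].name, name) == 0)
            return &bind_stack[i];
    return NULL;
}

/* "temp x=v": always push a fresh binding that shadows any older one. */
static void var_temp(const char *name, long value)
{
    assert(bind_top < MAX_BINDINGS);
    bind_stack[bind_top].name = name;
    bind_stack[bind_top].value = value;
    bind_top++;
}

/* "x=v": update the nearest visible binding (assumption: create one if none). */
static void var_assign(const char *name, long value)
{
    Binding *b = var_lookup(name);
    if (b) b->value = value;
    else   var_temp(name, value);
}

/* gosub records the stack height; return pops back to it. */
static int  scope_enter(void)     { return bind_top; }
static void scope_leave(int mark) { bind_top = mark; }
```

With this layout, a subroutine that does a plain assignment modifies the caller's variable, while one that uses a temp binding leaves the caller's value intact once scope_leave() runs.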
Atypical syntax:
x = gosub label a=3, b=4 //gosub with return values and parameters.
return expr //return with expression
v=(vec 1,2,3) //vector type
m=(vec (vec 1,0,0),(vec 0,1,0),(vec 0,0,1)) //poor man's matrix
...
Precedence:
Literal values;
Unary operators (+, -, !, ~)
*, /, %
+, -
&, |, ^
<<, >>
== (=), !=, <, >, <=, >=
&&, ||
No assignment or comma operators; assignment is a statement.
Precedence rules differ here from C.
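To make the difference from C concrete, here is a small precedence-climbing evaluator over integers that follows the levels listed above. This is only a sketch under assumptions: the function names are made up, the ">=" spelling is assumed, operators are taken to be left-associative, and division by zero just yields 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char *src;   /* cursor into the expression text */

static void skip_ws(void) { while (*src == ' ') src++; }

/* Binding level per the list above: higher binds tighter, 0 = unknown. */
static int op_level(const char *op)
{
    static const struct { const char *op; int level; } tab[] = {
        {"*",6},{"/",6},{"%",6}, {"+",5},{"-",5}, {"&",4},{"|",4},{"^",4},
        {"<<",3},{">>",3},
        {"==",2},{"=",2},{"!=",2},{"<",2},{">",2},{"<=",2},{">=",2},
        {"&&",1},{"||",1},
    };
    for (size_t i = 0; i < sizeof tab / sizeof tab[0]; i++)
        if (strcmp(tab[i].op, op) == 0) return tab[i].level;
    return 0;
}

/* Copy the next run of operator characters into buf without consuming it. */
static size_t peek_op(char *buf, size_t cap)
{
    size_t n = 0;
    skip_ws();
    while (src[n] && strchr("+-*/%&|^<>=!~", src[n]) && n + 1 < cap) {
        buf[n] = src[n];
        n++;
    }
    buf[n] = '\0';
    return n;
}

/* Only called with operators that op_level() accepted. */
static long apply(const char *op, long a, long b)
{
    switch (op[0]) {
    case '*': return a * b;
    case '/': return b ? a / b : 0;
    case '%': return b ? a % b : 0;
    case '+': return a + b;
    case '-': return a - b;
    case '^': return a ^ b;
    case '&': return op[1] ? (a && b) : (a & b);
    case '|': return op[1] ? (a || b) : (a | b);
    case '=': return a == b;                 /* "=" and "==" */
    case '!': return a != b;
    case '<': return op[1] == '<' ? a << b : op[1] == '=' ? a <= b : a < b;
    default:  return op[1] == '>' ? a >> b : op[1] == '=' ? a >= b : a > b;
    }
}

static long parse_expr(int min_level);

/* Literals, parenthesized expressions, and the unary operators. */
static long parse_unary(void)
{
    skip_ws();
    if (*src == '-') { src++; return -parse_unary(); }
    if (*src == '+') { src++; return  parse_unary(); }
    if (*src == '!') { src++; return !parse_unary(); }
    if (*src == '~') { src++; return ~parse_unary(); }
    if (*src == '(') {
        src++;
        long v = parse_expr(1);
        skip_ws();
        if (*src == ')') src++;
        return v;
    }
    char *end;
    long v = strtol(src, &end, 0);           /* base 0 accepts 0x... too */
    src = end;
    return v;
}

static long parse_expr(int min_level)
{
    long lhs = parse_unary();
    for (;;) {
        char op[4];
        size_t n = peek_op(op, sizeof op);
        int level = op_level(op);
        if (level == 0 || level < min_level) return lhs;
        src += n;
        long rhs = parse_expr(level + 1);    /* +1 gives left associativity */
        lhs = apply(op, lhs, rhs);
    }
}

static long eval(const char *text) { src = text; return parse_expr(1); }
```

With these levels, "6 & 3 == 2" groups as (6 & 3) == 2, whereas C would group it as 6 & (3 == 2); likewise "1 << 2 | 1" groups as 1 << (2 | 1). Since a whole run of operator characters is read as one operator, something like "2*-3" needs a space or parentheses in this sketch.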
Unlike a C style tokenizer, any combination of operator symbols will be parsed as a single operator, regardless of whether or not such an operator exists (this shaves a big chunk of code off the tokenizer logic).
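A minimal sketch of that tokenizer rule, assuming a particular set of operator characters and hypothetical function names (the real tokenizer also has to deal with strings, comments, and so on, which are left out here):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as operator symbols (an assumed set). */
static int is_opchar(int c)
{
    return c != '\0' && strchr("+-*/%&|^~!<>=", c) != NULL;
}

/* Characters that make up names and numbers. */
static int is_word(int c) { return isalnum(c) || c == '_'; }

/*
 * Copy the next token from *src into buf (capacity cap, at least 2) and
 * advance *src past it. Returns 0 at end of input. Over-long tokens are
 * silently truncated in this sketch.
 */
static int next_token(const char **src, char *buf, size_t cap)
{
    const unsigned char *s = (const unsigned char *)*src;
    size_t n = 0;

    assert(cap >= 2);
    while (*s == ' ' || *s == '\t') s++;          /* skip blanks */
    if (*s != '\0') {
        /* Pick the character class from the first character; anything
           else (parens, commas, ...) is a one-character token. */
        int (*same_kind)(int) =
            is_opchar(*s) ? is_opchar : is_word(*s) ? is_word : NULL;
        do {
            if (n + 1 < cap) buf[n++] = (char)*s;
            s++;
        } while (same_kind && same_kind(*s));
    }
    buf[n] = '\0';
    *src = (const char *)s;
    return n > 0;
}
```

There is no table of valid operators in the tokenizer at all; "<=" and a nonsense run like "<=>" both come out as single tokens, and it is up to the parser to reject the ones it does not know.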
For now, the language lacks any ability to define proper functions in-language, so the only functions that exist are built-in.
For the first time in a very long time, this interpreter has an "eval" command in the console. Though, one needs to use parentheses to eval an expression as (unlike JS or similar) statements and expressions are different and one may not have an implicit expression in a statement context. For my first major script language (JS based), there was an eval. However, with the design of my later BS2 language, eval was no longer viable.
Where, there is a split between design choices that make sense for a light-duty script language, and one meant for "serious work" (more features, better performance, etc). Sometimes, one might climb the ladder of the language being better for implementation tasks, while ignoring things that are useful for light-duty scripting tasks (trying to make a language that does both but maybe ultimately does neither task particularly well).
So, say, the fate that befell my original BGBScript language, was that the VM became increasingly heavyweight (more code, more complex, ...) and less well suited for implementation tasks (as it tried to take on work that might have otherwise been left to C). BGBScript2 had essentially turned into a Java like language, not as good at implementing stuff as "just write everything in C", yet no longer great for scripting either (namely, Java-style code structure is not particularly amenable to interactive use of "eval"; nor is a statically-typed language particularly amenable to "hot patching" live code, etc...).
Like, when a scripting VM expands to 300 or 500 kLOC, using it for scripting a project is no longer as attractive of an option. A partial fork of this VM still survives though, I just now call it "BGBCC" and am using it mostly as a C compiler for my custom ISA project.
Though, from what I can see, modern JavaScript seems headed down a similar path.
A similar issue seems common in many long lived script VM projects. They get faster and more powerful, all while losing the properties that made them useful for their original use cases.
Granted, the other option is to effectively "roll the clock backwards", and revert a language to a simpler form.
Judging by the past, could probably do another JavaScript style VM in around 10k LOC or so. Maybe less if the design priority is keeping code small. Besides the block structuring, there are "gotcha" things like break/continue handling that one needs to deal with. Naive AST-walking interpreters don't deal well with non-local control transfers (like break/continue/goto).
So, say, if the minimum becomes:
Parse language to AST;
Flatten AST into some sort of linear IR;
Interpret linear IR.
Then this would set a lower limit on the size of the interpreter.
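For a sense of what the flatten-then-interpret route involves, here is a toy sketch (hypothetical node kinds and opcodes, nothing from an actual VM): a tiny AST is flattened into a stack-style linear IR, where "break" simply becomes a jump that is patched once the end of the loop is known.

```c
#include <assert.h>

/* Hypothetical mini-AST: just enough to show a loop with 'break'. */
typedef enum { N_NUM, N_VAR, N_ADD, N_LT, N_SET, N_SEQ, N_IF, N_WHILE, N_BREAK } NodeKind;
typedef struct Node {
    NodeKind kind;
    int val;                    /* N_NUM: literal; N_VAR/N_SET: variable slot */
    const struct Node *a, *b;   /* children (condition/body, operands, ...) */
} Node;

typedef enum { OP_PUSH, OP_LOAD, OP_STORE, OP_ADD, OP_LT, OP_JMP, OP_JZ, OP_HALT } Op;
typedef struct { Op op; int arg; } Insn;

#define MAX_CODE 128
static Insn code[MAX_CODE];
static int ncode;
static int break_fix[16], nbreak;   /* 'break' jumps waiting for a target */

static int emit(Op op, int arg)
{
    assert(ncode < MAX_CODE);
    code[ncode].op = op;
    code[ncode].arg = arg;
    return ncode++;
}

/* Step 2: flatten the AST into linear IR. */
static void flatten(const Node *n)
{
    switch (n->kind) {
    case N_NUM: emit(OP_PUSH, n->val); break;
    case N_VAR: emit(OP_LOAD, n->val); break;
    case N_ADD: flatten(n->a); flatten(n->b); emit(OP_ADD, 0); break;
    case N_LT:  flatten(n->a); flatten(n->b); emit(OP_LT, 0); break;
    case N_SET: flatten(n->a); emit(OP_STORE, n->val); break;
    case N_SEQ: flatten(n->a); flatten(n->b); break;
    case N_IF: {
        flatten(n->a);
        int jz = emit(OP_JZ, 0);
        flatten(n->b);
        code[jz].arg = ncode;              /* skip body when condition is 0 */
        break;
    }
    case N_BREAK:
        assert(nbreak < 16);
        break_fix[nbreak++] = emit(OP_JMP, 0);   /* target patched later */
        break;
    case N_WHILE: {
        int first_break = nbreak;
        int top = ncode;
        flatten(n->a);
        int jz = emit(OP_JZ, 0);
        flatten(n->b);
        emit(OP_JMP, top);
        code[jz].arg = ncode;
        while (nbreak > first_break)       /* patch this loop's breaks */
            code[break_fix[--nbreak]].arg = ncode;
        break;
    }
    }
}

static void compile(const Node *root)
{
    ncode = 0;
    nbreak = 0;
    flatten(root);
    emit(OP_HALT, 0);
}

/* Step 3: interpret the linear IR with a small operand stack. */
static void run(int *vars)
{
    int stack[32], sp = 0, pc = 0;
    for (;;) {
        Insn in = code[pc++];
        switch (in.op) {
        case OP_PUSH:  stack[sp++] = in.arg; break;
        case OP_LOAD:  stack[sp++] = vars[in.arg]; break;
        case OP_STORE: vars[in.arg] = stack[--sp]; break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case OP_LT:    sp--; stack[sp - 1] = stack[sp - 1] < stack[sp]; break;
        case OP_JMP:   pc = in.arg; break;
        case OP_JZ:    if (!stack[--sp]) pc = in.arg; break;
        case OP_HALT:  return;
        }
    }
}
```

Even this toy version needs a second data structure (the IR), a jump-patching pass, and a separate interpreter loop on top of the parser, which is roughly where the extra code size comes from.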
Well, and giving up on 'break' and 'continue' wouldn't be great for usability. Then again, maybe there could be a "break/continue" flag, where the AST walker would simply walk outwards until getting back to a place where the break/continue status could be handled. Could maybe save a few kLOC.
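A sketch of how the flag idea could look in a direct AST walker. Here the "flag" is the return value of exec() rather than a global, and the statement kinds are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Status handed back up the tree after executing a statement. */
typedef enum { FLOW_NORMAL, FLOW_BREAK, FLOW_CONTINUE } Flow;

/* Hypothetical statement kinds, kept minimal for the example. */
typedef enum { S_ADDI, S_IF_GT, S_BREAK, S_CONTINUE, S_BLOCK, S_LOOP } StmtKind;

typedef struct Stmt {
    StmtKind kind;
    int var, imm;                /* S_ADDI: vars[var] += imm; S_IF_GT: vars[var] > imm */
    const struct Stmt *body;     /* S_IF_GT, S_LOOP: child; S_BLOCK: first child */
    const struct Stmt *next;     /* next sibling inside a block */
} Stmt;

static Flow exec(const Stmt *s, int *vars)
{
    Flow f;
    assert(s != NULL);
    switch (s->kind) {
    case S_ADDI:     vars[s->var] += s->imm; return FLOW_NORMAL;
    case S_BREAK:    return FLOW_BREAK;
    case S_CONTINUE: return FLOW_CONTINUE;
    case S_IF_GT:
        return (vars[s->var] > s->imm) ? exec(s->body, vars) : FLOW_NORMAL;
    case S_BLOCK:
        for (const Stmt *c = s->body; c; c = c->next) {
            f = exec(c, vars);
            if (f != FLOW_NORMAL) return f;   /* stop, pass the flag outwards */
        }
        return FLOW_NORMAL;
    case S_LOOP:
        for (;;) {
            f = exec(s->body, vars);
            if (f == FLOW_BREAK) break;       /* the loop consumes the flag */
            /* FLOW_CONTINUE and FLOW_NORMAL both start the next iteration */
        }
        return FLOW_NORMAL;
    }
    return FLOW_NORMAL;
}
```

Every compound statement has to check the status and bail out early, but no IR or jump patching is needed. A goto to an arbitrary label is the case this approach does not cover.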
Other design goal basically being to limit it to a similar feature set to ES3 (though, leaving out some of ECMAScript's misfeatures).
Could do something Lisp-like, but I suspect the minimal interpreter for a Lisp dialect will still be larger than for an 80s style BASIC dialect.
It is different tradeoffs:
+ more expressive
+ less cruft
- Actually using Lisp syntax sucks worse than BASIC
Or, smaller still, something like Forth or PostScript, but, the general experience of trying to write code in these languages is a lot worse than BASIC.
I had also half wanted a SCAD interpreter, but, FWIW, the syntax for SCAD is vaguely similar to that of JavaScript (and, the idea of doing SCAD style stuff in unstructured BASIC does seem a bit ugly).
But, then again, when I threw together this interpreter, I had imagined using it for NPC event scripting (as opposed to CSG).
Where, say, SCAD code looks sorta like, say:
color("black")
{
translate([-1,7,10-jawrot*0.08])
cube([2,0.25,jawrot*0.1]);
translate([-1.75,7.8,14])
cube([0.75,0.25,0.75]);
translate([1.0,7.8,14])
cube([0.75,0.25,0.75]);
}
color("brown")
{
translate([-4,-1,14])
cube([8,8,5]);
translate([0,5,16])
rotate([0,0,45])
cylinder(h=4,r1=5,r2=3,center=false,$fn=4);
}
...