Subject: Re: Higher Order Logic Programming and Autograd
From: janburse (at) *nospam* fastmail.fm (Mild Shock)
Newsgroups: comp.lang.prolog
Date: 11. Mar 2025, 13:14:54
Message-ID: <vqp9fr$1bfso$1@solani.org>
References: 1 2
User-Agent : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101 Firefox/128.0 SeaMonkey/2.53.20
But where is Autograd, the automatic derivation from
some symbolic input? In general you can objectify
neural networks, which I already did with the Prolog
list, and routines such as back/3 are pure Prolog.
Basically you could symbolically derive expit
(the activation), mulderiv (the product with the derivative
of the activation) and mattran (the Jacobian without
activation) from a DAG of vector functions. In a linear
neural network, the Jacobian without activation is
the same as the weights, and expit has a simple derivative
that is based on the expit result itself, which is
already stored as the activation:
/* g(x) = logistic function */
expit(X, Y) :- Y is 1/(1+exp(-X)).
/* g'(x) = g(x)*(1-g(x)) */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).
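As a quick sanity check of the g'(x) = g(x)*(1-g(x)) identity, here is a
small sketch that compares mulderiv/3 against a central finite difference.
numderiv/2 and check_deriv/1 are hypothetical helper names that are not
part of the original code, and the step size and tolerance are arbitrary
choices:

```prolog
/* g(x) = logistic function, as in the post */
expit(X, Y) :- Y is 1/(1+exp(-X)).

/* g'(x) = g(x)*(1-g(x)), as in the post */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).

% numderiv(+X, -D): central finite difference of expit at X
% (hypothetical helper, step size 1.0e-5 is an assumption)
numderiv(X, D) :-
    H = 1.0e-5,
    X1 is X+H, X0 is X-H,
    expit(X1, Y1), expit(X0, Y0),
    D is (Y1-Y0)/(2*H).

% check_deriv(+X): succeeds if the stored-activation formula and the
% finite difference agree within 1.0e-8 (hypothetical helper)
check_deriv(X) :-
    expit(X, Y),
    mulderiv(1, Y, A),
    numderiv(X, N),
    abs(A-N) < 1.0e-8.
```

Passing 1 as the first argument of mulderiv/3 yields the bare derivative,
for example ?- check_deriv(1.5). should succeed.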
See also:
A Gentle Introduction to torch.autograd
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
Mild Shock wrote:
What can we do with these new toys? We
can implement vector operations and matrix
operations, and then apply them, for example,
to layered neural networks by
representing them as:
/**
* Network is represented as [N0,M1,N1,...,Mn,Nn]
* - Where N0 are the input neurons vector
* - Where N1 .. Nn-1 are the hidden neurons vectors
* - Where Nn are the output neurons vector
* - Where M1 .. Mn are the transition weight matrices
*/
?- mknet([3,2], X).
X = [''(-1, 1, 1), ''(''(1, 1, -1), ''(1, 1, -1)), ''(-1, 1)].
The model evaluation at a data point
is straightforward:
eval([V], [V]) :- !.
eval([V,M,_|L], [V,M|R]) :- !,
matmul(M, V, H),
vecact(H, expit, J),
eval([J|L], R).
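The post does not show matmul/3 and vecact/3. One possible sketch, assuming
vectors are plain lists and matrices are lists of rows (the post itself
uses compound terms with the functor ''), could look like this. dot/3 is a
hypothetical helper, and expit/2 and eval/2 are repeated from the post so
that the block runs on its own:

```prolog
/* g(x) = logistic function, as in the post */
expit(X, Y) :- Y is 1/(1+exp(-X)).

% dot(+Vector, +Row, -Sum): scalar product of two lists (hypothetical helper)
dot([], [], 0).
dot([X|Xs], [Y|Ys], S) :-
    dot(Xs, Ys, S0),
    S is S0+X*Y.

% matmul(+Matrix, +Vector, -Vector): matrix times vector,
% assuming the matrix is a list of rows
matmul(M, V, H) :- maplist(dot(V), M, H).

% vecact(+Vector, +Closure, -Vector): apply a binary closure elementwise
vecact(V, F, R) :- maplist(F, V, R).

/* eval/2 as in the post */
eval([V], [V]) :- !.
eval([V,M,_|L], [V,M|R]) :- !,
    matmul(M, V, H),
    vecact(H, expit, J),
    eval([J|L], R).
```

With all weights zero every output neuron lands on expit(0) = 0.5:

?- eval([[1,0],[[0,0],[0,0]],_], R).
R = [[1, 0], [[0, 0], [0, 0]], [0.5, 0.5]].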
The backward calculation of deltas
is straightforward:
back([V], U, [D]) :- !,
vecact(U, V, sub, E),
vecact(E, V, mulderiv, D).
back([V,M,W|L], U, [D2,M,D|R]) :-
back([W|L], U, [D|R]),
mattran(M, M2),
matmul(M2, D, E),
vecact(E, V, mulderiv, D2).
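back/3 relies on vecact/4, sub/3 and mattran/2, which the post does not
show. A minimal sketch under the same list-based assumption (vectors as
lists, matrices as lists of rows) might be the following. sub/3 is assumed
to be plain subtraction, dot/3 and heads_tails/3 are hypothetical helpers,
and mulderiv/3 and back/3 are repeated from the post:

```prolog
/* as in the post */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).

% sub(+X, +Y, -Z): assumed to be target minus output
sub(X, Y, Z) :- Z is X-Y.

% vecact(+Vector, +Vector, +Closure, -Vector): elementwise ternary closure
vecact(U, V, F, R) :- maplist(F, U, V, R).

% dot/3 and matmul/3: matrix times vector over lists (hypothetical helpers)
dot([], [], 0).
dot([X|Xs], [Y|Ys], S) :- dot(Xs, Ys, S0), S is S0+X*Y.
matmul(M, V, H) :- maplist(dot(V), M, H).

% heads_tails(+Rows, -Heads, -Tails): split off the first column
heads_tails([], [], []).
heads_tails([[H|T]|Rs], [H|Hs], [T|Ts]) :- heads_tails(Rs, Hs, Ts).

% mattran(+Matrix, -Matrix): transpose of a list of rows
mattran([], []) :- !.
mattran([[]|_], []) :- !.
mattran(M, [C|Cs]) :- heads_tails(M, C, Rest), mattran(Rest, Cs).

/* back/3 as in the post */
back([V], U, [D]) :- !,
    vecact(U, V, sub, E),
    vecact(E, V, mulderiv, D).
back([V,M,W|L], U, [D2,M,D|R]) :-
    back([W|L], U, [D|R]),
    mattran(M, M2),
    matmul(M2, D, E),
    vecact(E, V, mulderiv, D2).
```

For a single output neuron with activation 0.5 and target 1 the delta is
(1-0.5)*0.5*(1-0.5):

?- back([[0.5]], [1], D).
D = [[0.125]].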
You can use this to compute weight changes
and drive a gradient algorithm.
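The post does not show the weight update itself. One way it could look,
assuming plain gradient descent on the squared error, list-based vectors
and matrices, and a learning rate Eta, is sketched below. update/4,
update_row/5 and update_weight/5 are hypothetical names. Each weight
M[i][j] is moved by Eta times the delta of neuron i in the next layer
times the activation of neuron j in the previous layer:

```prolog
% update(+Evaluated, +Deltas, +Eta, -NewNet)
% Evaluated is the result of eval/2: [V0,M1,V1,...,Mn,Vn]
% Deltas is the result of back/3:   [D0,M1,D1,...,Mn,Dn]
update([V], [_], _, [V]).
update([V,M,W|L], [_,_,D|R], Eta, [V,M2|T]) :-
    maplist(update_row(Eta, V), M, D, M2),
    update([W|L], [D|R], Eta, T).

% update_row(+Eta, +Inputs, +Row, +Delta, -NewRow)
update_row(Eta, V, Row, Di, Row2) :-
    maplist(update_weight(Eta, Di), Row, V, Row2).

% update_weight(+Eta, +Delta, +Weight, +Input, -NewWeight)
% The sign is positive because the delta was built from target minus output.
update_weight(Eta, Di, Wij, Vj, W2) :-
    W2 is Wij+Eta*Di*Vj.
```

A training step would then chain eval/2, back/3 and update/4 for each
data point and repeat until the error is small enough.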
Mild Shock wrote:
Somehow I shied away from implementing call/n for
my new Prolog system. I thought that, since my new Prolog
system has only monomorphic caches, I would never be able to
replicate what I did for my old Prolog system with
arity polymorphic caches. This changed when I had
the idea to dynamically add a cache for the duration
of a higher order loop such as maplist/n, foldl/n etc.
So this is the new implementation of maplist/3:
% maplist(+Closure, +List, -List)
maplist(C, L, R) :-
sys_callable_cacheable(C, D),
sys_maplist(L, D, R).
% sys_maplist(+List, +Closure, -List)
sys_maplist([], _, []).
sys_maplist([X|L], C, [Y|R]) :-
call(C, X, Y),
sys_maplist(L, C, R).
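sys_callable_cacheable/2 is specific to the system described here, so it
cannot be reproduced portably. The argument reordering and the call/3
invocation, however, can be tried in any Prolog with call/n. The sketch
below drops the cache step entirely, and my_maplist/3 and my_maplist_/3
are hypothetical names chosen to avoid clashing with a library maplist/3:

```prolog
% my_maplist(+Closure, +List, -List)
% Same shape as the maplist/3 above, but without the cache preparation.
my_maplist(C, L, R) :-
    my_maplist_(L, C, R).

% my_maplist_(+List, +Closure, -List)
% The list comes first so that first argument indexing can
% distinguish [] from [_|_].
my_maplist_([], _, []).
my_maplist_([X|L], C, [Y|R]) :-
    call(C, X, Y),            % call/3 appends X and Y to the closure
    my_maplist_(L, C, R).
```

A closure that already carries an argument shows how call/3 extends it:

?- my_maplist(atom_concat(x), [a,b], R).
R = [xa, xb].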
It's similar to the SWI-Prolog implementation in that
it reorders the arguments for better first argument
indexing. But the new thing is sys_callable_cacheable/2,
which prepares the closure to be called more
efficiently. The invocation of the closure is already
quite fast since call/3 is implemented natively,
but the cache adds a bit more speed. Here are some
measurements that I did:
/* SWI-Prolog 9.3.20 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% 2,003,000 inferences, 0.078 CPU in 0.094 seconds
/* Scryer Prolog 0.9.4-350 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% CPU time: 0.318s, 3_007_105 inferences
/* Dogelog Player 1.3.1 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18
/* Trealla Prolog 2.64.6-2 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips
Not surprisingly, SWI-Prolog is fastest. What was
a little surprising is that Scryer Prolog can do it quite
fast, possibly because they use maplist/n heavily all
over the place and came up with things like '$fast_call'
etc. in their call/n implementation. Trealla Prolog is
a little bit disappointing at the moment.