Somebody wrote:
> It seems that it reads in as ðŸ‘\u008D but writes out as ðŸ‘\x8D\.
Can one then do ‘\uXXXX’ in 100% Prolog as
well? Even including surrogates? Of course.
Here is a DCG generator snippet from Dogelog
Player, which is 100% Prolog. It is from the
Java backend, because I didn’t introduce ‘\uXXXX’
in my Prolog system, since it is not part of the
ISO core standard. The ISO core standard would want '\xXX\':
crossj_escape_code2(X) --> {X =< 0xFFFF}, !,
    {atom_integer(J, 16, X), atom_codes(J, H),
     length(H, N), M is 4-N}, [0'\\, 0'u],
    crossj_escape_zeros(M),
    crossj_escape_codes2(H).
crossj_escape_code2(X) -->
    {crossj_high_surrogate(X, Y),
     crossj_low_surrogate(X, Z)},
    crossj_escape_code2(Y),
    crossj_escape_code2(Z).
crossj_high_surrogate(X, Y) :- Y is (X >> 10) + 0xD7C0.
crossj_low_surrogate(X, Y) :- Y is (X /\ 0x3FF) + 0xDC00.
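As a sanity check on the surrogate arithmetic above, here is a minimal sketch (the predicate names high_surrogate/2 and low_surrogate/2 are mine, not Dogelog's) splitting the supplementary code point U+1F44D, the thumbs-up emoji, into its well-known UTF-16 pair \uD83D\uDC4D:

```prolog
% Same arithmetic as crossj_high_surrogate/2 and
% crossj_low_surrogate/2, for X in 0x10000..0x10FFFF.
high_surrogate(X, Y) :- Y is (X >> 10) + 0xD7C0.
low_surrogate(X, Y) :- Y is (X /\ 0x3FF) + 0xDC00.

% ?- high_surrogate(0x1F44D, H), low_surrogate(0x1F44D, L).
% H = 55357, L = 56397.    % i.e. 0xD83D and 0xDC4D
```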
Mild Shock wrote:
> The official replacement character is 0xFFFD:
>
> Replacement Character
> https://www.compart.com/de/unicode/U+FFFD
>
Well, that is what people did in the past: replace
non-printables by the ever-same code, instead of
using ‘\uXXXX’ notation. I have studied
library(portray_text) extensively, and my conclusion
is still that it is extremely ancient.
>
For example I find:
mostly_codes([H|T], Yes, No, MinFactor) :-
    integer(H),
    H >= 0,
    H =< 0x1ffff,
    [...]
    ;   catch(code_type(H, print), error(_,_), fail),
    [...]
>
https://github.com/SWI-Prolog/swipl-devel/blob/eddbde61be09b95eb3ca2e160e73c2340744a3d2/library/portray_text.pl#L235
>
Why even 0x1ffff and not 0x10ffff? This is a bug;
do you want to starve is_text_code/1? The official
Unicode range is 0x0 to 0x10ffff. Ulrich Neumerkel
often confused the range in some of his code snippets,
maybe based on a limited interpretation of Unicode.
But if one switched to chars, one could easily
support any Unicode code point, even without
knowing the range. Just do this:
>
mostly_chars([H|T], Yes, No, MinFactor) :-
    atom(H),
    atom_length(H, 1),
    [...]
    ;   /* printable check not needed */
    [...]
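The point of the chars variant can be stated in one line: any one-character atom is a valid char, so no Unicode range check is needed. A minimal sketch (the predicate name text_char/1 is mine, not from library(portray_text)):

```prolog
% A char is simply an atom of length 1; no range check required.
text_char(H) :-
    atom(H),
    atom_length(H, 1).

% ?- char_code(C, 0x10FFFF), text_char(C).
% Succeeds in systems whose atoms cover the full Unicode range.
```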
>
Mild Shock wrote:
>
Hi,
>
The most radical approach is Novacore from
Dogelog Player. It consists of the following
major incisions in the ISO core standard:
>
- We do not forbid chars: for example, we allow
lists of the form [a,b,c], and we also
provide the char_code/2 predicate bidirectionally.
>
- We do not provide any _chars built-in
predicates, and there is nothing _strings either. The
Prolog system is clever enough not to put
every atom it sees into an atom table. There
is only a predicate table.
>
- Some host languages have garbage collection that
deduplicates strings. For example, some Java
versions have an option to do that. But we
make no effort to deduplicate atoms,
which are simply plain strings.
>
- Some languages have constant pools. For example,
the Java byte code format includes a constant
pool in every class file. We do not do that
during transpilation, but we could of course.
But it begs the question: why deduplicate only
strings and not other constant expressions as well?
>
- We are totally happy that we have only codes;
there is a chance that the host languages use
tagged pointers to represent them. So they
are represented similarly to the tagged pointers
in SWI-Prolog, which work for small integers.
>
- But the tagged pointer argument is moot,
since atoms of length 1 can also be
represented as tagged pointers, and some
programming languages do that. Dogelog Player
would use such tagged pointers without
polluting the atom table.
>
- What else?
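The bidirectional char_code/2 mentioned in the first point is standard ISO behavior; either argument may be unbound:

```prolog
?- char_code(a, X).
X = 97.

?- char_code(C, 97).
C = a.
```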
>
Bye
>
Mild Shock wrote:
Technically SWI-Prolog doesn't prefer codes.
Library library(pure_input) might prefer codes.
But this is again an issue of improving the
library, by a non-existent SWI-Prolog community.
>
The ISO core standard is silent about a flag
back_quotes, but has a lot of API requirements
that support both codes and chars; for example, it
requires atom_codes/2 and atom_chars/2.
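The dual requirement can be seen directly at the top level; both conversions are in the ISO core standard:

```prolog
?- atom_codes(abc, X).
X = [97, 98, 99].

?- atom_chars(abc, X).
X = [a, b, c].
```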
>
Implementation-wise there can be an issue:
one might decide to implement the atoms
of length 1 more efficiently, since with Unicode
there is now an explosion of them.
>
Not sure whether Trealla Prolog and Scryer
Prolog have thought about this problem, that the
atom table gets quite large, whereas codes don't
eat the atom table. Maybe they forbid predicates
>
whose head has a name that is an atom of length 1:
>
h(X) :-
    write('Hello '), write(X), write('!'), nl.
>
Does this still work?
>
Mild Shock wrote:
>
Concerning library(portray_text), which is in limbo:
>
> Libraries are (often) written for either
> and thus the libraries make the choice.
>
But who writes these libraries? The SWI-Prolog
community. And who doesn't improve these libraries,
but instead floods the web with workaround tips?
The SWI-Prolog community.
>
Conclusion: the SWI-Prolog community has trapped
itself in an ancient status quo, creating an island.
It cannot improve its own tooling, and is not willing
to support code from elsewhere (*) that uses chars.
>
Same with the missed AI Boom.
>
(*) Code from elsewhere is dangerous: people
might use Prolog systems other than SWI-Prolog,
like for example Trealla Prolog and Scryer Prolog.
>
(**) Keeping the status quo is comfy. No need to
think in terms of program code. It's like biology
teachers versus pathology staff: biology teachers
do not see opened corpses every day.
>
>
Mild Shock wrote:
Inductive logic programming at 30
https://arxiv.org/abs/2102.10556
>
The paper contains not a single reference to autoencoders!
Still, they show this example:
>
> Fig. 1 ILP systems struggle with structured examples that
> exhibit observational noise. All three examples clearly
> spell the word "ILP", with some alterations: 3 noisy pixels,
> shifted and elongated letters. If we were to learn a
> program that simply draws "ILP" in the middle of the picture,
> without noisy pixels and elongated letters, that would
> be a correct program.
>
I guess ILP is 30 years behind the AI boom. An early autoencoder
turned into a transformer was already reported here (*):
>
SERIAL ORDER, Michael I. Jordan - May 1986
https://cseweb.ucsd.edu/~gary/PAPER-SUGGESTIONS/Jordan-TR-8604-OCRed.pdf
>
Well, ILP might have its merits; maybe we should not ask
for a marriage of LLMs and Prolog, but of autoencoders and ILP.
But it's tricky: I am still trying to decode the da Vinci code of
things like stacked tensors; are they related to k-literal clauses?
The paper I referenced is found in this excellent video:
>
The Making of ChatGPT (35 Year History)
https://www.youtube.com/watch?v=OFS90-FX6pg
>