Sujet : Re: ( Substring function in Python, Lisp) -- [Hijack] contains [hijk]
De : enometh (at) *nospam* meer.net (Madhu)
Groupes : comp.lang.lispDate : 16. Feb 2025, 16:43:20
Autres entêtes
Organisation : Motzarella
Message-ID : <m3r03xvqsn.fsf@leonis4.robolove.meer.net>
References : 1 2 3
* Paul Rubin <
874j0ucpn1.fsf@nightsong.com> :
Wrote on Sat, 15 Feb 2025 23:30:58 -0800:
Madhu <enometh@meer.net> writes:
(intern-runs "hijacks")
;; => (0 . 3) - longest substr of length 3 at position 0
>
I think that is not the question that was asked.
oops.
Although OP used the
term "substring", I think what was actually wanted is usually called a
subsequence, i.e. the characters in it don't have to be consecutive. So
"hijacks" has HIJK which has length 4. "firefighting" and
"prizefighting" both have EFGHI which has length 5.
This becomes the Longest Increasing Subsequence Problem
https://en.wikipedia.org/wiki/Longest_increasing_subsequenceThere are several good writeups about this on the web and lecture notes,
much better than the implementation I once cobbled up from the wiki
In common lisp:
https://kaygun.github.io/clean/2014-10-26-longest_increasing_subsequence.htmlhttps://kaygun.github.io/clean/2015-10-26-longest_increasing_subsequence_revisited.htmlUsing a suitable implementaion in a stupid way:
(defun lca (string)
(map 'string 'code-char (lca::longest-inc-seq (map 'list 'char-code
string))))
and running it on to extract into a hashtable with the keys as lcas
(hash-table-count $h2)
;; => 20437
(gethash "abcde" $h2)
("oxylabracidae" "cerambycidae" "bambocciade" "amoebicide" "ambuscade"
"absconded" "aborticide")
(sort (mapcar (lambda (x) (cons (car x) (length (cdr x))))
(group2 (hash-keys $h2) :test #'= :key #'length))
#'< :key #'car)
;; "length of lca . number of words"
((2 . 241) (3 . 1596) (4 . 4833) (5 . 7024) (6 . 4961) (7 . 1545) (8 . 217)
(9 . 19) (10 . 1))