* Paul Rubin <
87bjv1vaos.fsf@nightsong.com> :
Wrote on Sun, 16 Feb 2025 13:31:15 -0800:
Madhu <enometh@meer.net> writes:
This becomes the Longest Increasing Subsequence Problem
>
Still not quite, the subsequence is supposed to be consecutive, not just
increasing. Like "abcde" not "acegi".
Hen's posts really bring out the my reading comprehension skills.
"Longest Increasing Consecutive Subsequence Problem"
Google finds plenty of code but I guess it is not of any research
interest.
FWIW My code is pretty clunky, somehow TIME SBCL says it conses 0 bytes
when I run it on individual words without interning although when I'm
explicitly manipulating plists and arrays.
(defun intern-lics (word &optional hash-table dp)
"Returns the longest increasing consective sequence (lics) of WORD.
If DP is supplied it should be an array with fill-pointer with enough
capacity to contain (LENGTH WORD) integers. If hash-table is supplied
it should be an EQUAL hashtable with STRING keys and LIST values.
WORD is pushed onto the list which is the hash-value of the lics.
WORD is assumed consist of lowercase ASCII characters.
Ignores subsequences of length 1."
(let* ((n (length word))
(map nil) ;; (getf map c) -> last index of c in word
(dp ;; dp[i] = length of the lics that end at index i of word
(cond (dp (setf (fill-pointer dp) n) dp)
(t (make-array n))))
(maximum -1) ;;the length of the lics
(index -1) ;;index in word where where the lics ends
)
(loop for i from 0 for c across word
for prev = (code-char (1- (char-code c)))
;; if prev is present in word before the ith index, then c
;; will add to the increasing subsequence which has prev.
for prev-idx = (getf map prev)
for newl = (if prev-idx
(1+ (elt dp prev-idx))
1)
do
(setf (elt dp i) newl)
(setf (getf map c) i)
(when (< maximum (elt dp i))
(setq maximum (elt dp i))
(setq index i)))
(when (> maximum 2) ; otherwise not interesting
(let ((subseq (map 'string 'code-char
(reverse (loop for i downfrom (char-code (elt word index))
repeat maximum
collect i)))))
(when hash-table
(pushnew word (gethash subseq hash-table nil)
:test #'equal))
subseq))))
(intern-lics "absconder") ;"abcde"
(intern-lics "afterstudy") ; => "rstu"
(time (loop repeat 10 do (intern-lics "absconder")))
#||
0.000 seconds of real time
0.000074 seconds of total run time (0.000073 user, 0.000001 system)
100.00% CPU
148,248 processor cycles
0 bytes consed
||#
;; 0 bytes? rly?
(defvar $a(slurp-lines "/usr/share/dict/words"))
(length $a)
;; => 234937
(map-into $a 'string-downcase $a)
(setq $a (delete-duplicates $a :test #'equalp))
(length $a)
;; => 233615
(defvar $h (make-hash-table :test #'equal))
(clrhash $h)
(setq $array (make-array 124 :fill-pointer t))
(time (map nil (lambda (w) (intern-lics w $h $array)) $a))
#||
1.916 seconds of real time
1.916782 seconds of total run time (1.916782 user, 0.000000 system)
100.05% CPU
4,047,899,870 processor cycles
61,527,200 bytes consed
||#
(hash-table-count $h)
;; => 44
(setq $keys (sort (hash-keys $h) #'string<))
(sort (group2 $keys :test #'equal :key #'length) #'< :key #'car)
=> ((3 "xyz" "wxy" "uvw" "tuv" "stu" "rst" "qrs" "pqr" "opq" "nop" "mno" "lmn"
"klm" "jkl" "ijk" "hij" "ghi" "fgh" "efg" "def" "cde" "bcd" "abc")
(4 "stuv" "rstu" "qrst" "nopq" "mnop" "lmno" "klmn" "ijkl" "hijk" "fghi"
"efgh" "defg" "cdef" "bcde" "abcd")
(5 "rstuv" "lmnop" "klmno" "efghi" "defgh" "abcde"))
(gethash "abcde" $h)
("oxylabracidae" "cerambycidae" "carbacidometer" "bambocciade" "amoebicide"
"ambuscader" "ambuscade" "amblycephalidae" "abstractedness" "absconder"
"abscondence" "abscondedly" "absconded" "aborticide" "abjectedness")