Re: From JoyceUlysses.txt -- words occurring exactly once

Subject: Re: From JoyceUlysses.txt -- words occurring exactly once
From: No_spamming (at) *nospam* noWhere_7073.org (B. Pym)
Newsgroups: comp.lang.lisp comp.lang.scheme
Date: 31 May 2024, 12:13:50
Organization: A noiseless patient Spider
Message-ID: <v3c7st$26biv$1@dont-email.me>
References: 1
User-Agent: XanaNews/1.18.1.6
On 5/30/2024, HenHanna wrote:

 
> i'd not use Gauche for this, but maybe someone can change my mind.
>
> _______________________
> From JoyceUlysses.txt -- words occurring exactly once
>
> Given a text file of a novel (JoyceUlysses.txt) ...
>
> could someone give me a pretty fast (and simple) program that'd give me a list of all words occurring exactly once?
>
>               -- Also, a list of words occurring once, twice or 3 times
>
> re: hyphenated words        (you can treat it anyway you like)
>
>        ideally, i'd treat  [editor-in-chief]
>                            [go-ahead]  [pen-knife]
>                            [know-how]  [far-fetched] ...
>        as one unit.

Gauche Scheme

(use file.util)  ;; file->string
(use srfi-13)    ;; string-tokenize, string-upcase
(use srfi-14)    ;; character sets (char-set-adjoin, char-set:letter)

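;; Maps each upper-cased word to its number of occurrences.
(define h (make-hash-table 'string=?))

;; A token is a maximal run of letters and hyphens; hyphens at either
;; end of a token are stripped before it is counted.
(dolist
  (s
    (string-tokenize (file->string "Alice.txt")
      (char-set-adjoin char-set:letter #\-)))
  (hash-table-update! h
    (regexp-replace* (string-upcase s) #/^-+/ "" #/-+$/ "")
    (pa$ + 1) 0))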

;; Keep the words seen fewer than 3 times, i.e. once or twice.
(filter (lambda (kv) (< (cdr kv) 3))
  (hash-table->alist h))

  ===>

(("LASTED" . 2) ("WAY--NEVER" . 1) ("VISIT" . 1) ("CHANCED" . 1)
 ("WILDLY" . 2) ("BEHEAD" . 1) ("PROMISE" . 1) ("MEANWHILE" . 1)
 ("ENGAGED" . 1) ("KNIFE" . 2) ("ROARED" . 1) ("RETIRE" . 1)
 ("BLACKING" . 1) ("HATED" . 1) ("BRIGHT-EYED" . 1)
 ("SHEEP-BELLS" . 1) ("PROTECTION" . 1) ("CRIES" . 1) ("ADA" . 1)
 ("ENJOY" . 1) ("WRITHING" . 1) ("RAW" . 1) ("APPEALED" . 1)
 ("RELIEVED" . 1) ("CHILDHOOD" . 1) ("WEPT" . 1) ("RACE-COURSE" . 1)
 ("THEIRS" . 1) ("MAD--AT" . 1) ("SPOKEN" . 1) ("PENCILS" . 1)
 ("CLEAR" . 2) ("TREADING" . 2) ("RETURNED" . 2) ("CHERRY-TART" . 1)
 ("UNEASY" . 1) ("LOW-SPIRITED" . 1) ("BONE" . 1) ("PROMISED" . 1)
 ("HAPPENING" . 1) ("OYSTER" . 1) ("PATIENTLY" . 2) ("NEEDS" . 1)
 ("LESSON-BOOK" . 1) ("PITIED" . 1) ("UNCOMFORTABLY" . 1)
 ("ANTIPATHIES" . 1) ("PICTURED" . 1) ("DESPERATE" . 1)
 ("ENGRAVED" . 1)
 ...
)
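
Two things are visible in the output above: entries such as "WAY--NEVER" and "MAD--AT" are really two words joined by a dash, and the filter (< (cdr kv) 3) returns words seen once or twice rather than exactly once. A minimal sketch of one way to handle both, assuming that any run of two or more hyphens should count as a separator; the helper names text->words and rare-words are made up for this illustration and are not part of Gauche:

```scheme
(use srfi-13)  ;; string-tokenize, string-upcase, string-null?
(use srfi-14)  ;; char-set-adjoin, char-set:letter

;; Hypothetical helper: split TEXT into upper-cased words. Single
;; internal hyphens are kept, runs of two or more hyphens are treated
;; as separators (an assumption), and empty tokens are dropped.
(define (text->words text)
  (filter (lambda (w) (not (string-null? w)))
    (map (lambda (s)
           (regexp-replace* (string-upcase s) #/^-+/ "" #/-+$/ ""))
         (string-tokenize (regexp-replace-all #/--+/ text " ")
           (char-set-adjoin char-set:letter #\-)))))

;; Hypothetical helper: alist of (word . count) for every word that
;; occurs at most MAX-COUNT times in TEXT.
(define (rare-words text max-count)
  (let ((h (make-hash-table 'string=?)))
    (dolist (w (text->words text))
      (hash-table-update! h w (pa$ + 1) 0))
    (filter (lambda (kv) (<= (cdr kv) max-count))
      (hash-table->alist h))))

;; Words occurring exactly once in a small sample string.
(rare-words "Mad--at a tea-party - a mad tea-party" 1)
```

With a file, (rare-words (file->string "Alice.txt") 1) would list the words occurring exactly once and (rare-words (file->string "Alice.txt") 3) those occurring up to three times; file->string needs (use file.util) as in the program above.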

Date       Subject                                                          #  Author
30 May 24  * From JoyceUlysses.txt -- words occurring exactly once          6  HenHanna
31 May 24  +- Re: From JoyceUlysses.txt -- words occurring exactly once     1  Jeff Barnett
31 May 24  +* Re: From JoyceUlysses.txt -- words occurring exactly once     2  Stefan Monnier
31 May 24  |`- Re: From JoyceUlysses.txt -- words occurring exactly once    1  Kaz Kylheku
31 May 24  +- Re: From JoyceUlysses.txt -- words occurring exactly once     1  Paul Rubin
31 May 24  `- Re: From JoyceUlysses.txt -- words occurring exactly once     1  B. Pym
