Subject : Re: (Long post) Metaphone Algorithm In AWK
From : ben (at) *nospam* bsb.me.uk (Ben Bacarisse)
Newsgroups : comp.lang.awk
Date : 21. Aug 2024, 00:58:06
Organization : A noiseless patient Spider
Message-ID : <87wmkapx0x.fsf@bsb.me.uk>
References : 1 2 3
User-Agent : Gnus/5.13 (Gnus v5.13)
porkchop@invalid.foo (Mike Sanders) writes:
> Ben Bacarisse <ben@bsb.me.uk> wrote:
>
>> Using a word list, I found some odd matches. For example:
>>
>> $ echo "drunkeness indigestion" | awk -f metaphone.awk -v find=texas
>> drunkeness
>> indigestion
>>
>> Are these really metaphone matches for "texas"? It's possible (I don't
>> know the algorithm at all well) but I found it surprising.
>
> Ben, give this a try when you can. Finally starting to wrap my mind around
> its usage a little more...

I don't know what you are asking for, as this (your latest AWK) is not
just an implementation of the metaphone algorithm. With the extra
Levenshtein test, "texas" matches only a few words.

However, if I remove the extra condition (that levenshtein($x, find) <=
2), your AWK code matches a different set of words to the C
implementation. Looking a bit deeper, your AWK code gives the code TKSS
to the word "texas" but the C code assigns it "TKS".
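
For anyone following along, here is a minimal sketch of what an edit-distance
filter of this kind amounts to. It is not the code from metaphone.awk (that
levenshtein() is not shown in this post and may well differ); the names lev and
min3 are hypothetical, and the AWK function below is just the textbook
dynamic-programming version, wrapped in a shell function so it can be tried
from the command line.

```shell
# lev WORD1 WORD2: print the Levenshtein (edit) distance between two words.
# Hypothetical helper for illustration only; not taken from metaphone.awk.
lev() {
    awk -v a="$1" -v b="$2" '
    # Smallest of three numbers.
    function min3(x, y, z,    m) {
        m = x
        if (y < m) m = y
        if (z < m) m = z
        return m
    }
    # Textbook dynamic-programming edit distance.
    # d[i, j] holds the distance between the first i characters of s
    # and the first j characters of t.
    function levenshtein(s, t,    m, n, i, j, d, cost) {
        m = length(s)
        n = length(t)
        for (i = 0; i <= m; i++) d[i, 0] = i
        for (j = 0; j <= n; j++) d[0, j] = j
        for (i = 1; i <= m; i++)
            for (j = 1; j <= n; j++) {
                cost = (substr(s, i, 1) == substr(t, j, 1)) ? 0 : 1
                d[i, j] = min3(d[i - 1, j] + 1,        # deletion
                               d[i, j - 1] + 1,        # insertion
                               d[i - 1, j - 1] + cost) # substitution
            }
        return d[m, n]
    }
    BEGIN { print levenshtein(a, b) }'
}
```

With this sketch, `lev texas taxes` prints 2, while `lev drunkeness texas` is
well above 2. A condition such as levenshtein($x, find) <= 2 therefore throws
away most candidate words whatever their metaphone code is, which may be why a
difference like TKSS versus TKS is hard to see while the filter is switched on.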
-- Ben.