Subject : Re: slow fileutil::foreachLine
From : rich (at) *nospam* example.invalid (Rich)
Newsgroups : comp.lang.tcl
Date : 17. Jun 2024, 16:40:29
Organization : A noiseless patient Spider
Message-ID : <v4pldd$noq7$1@dont-email.me>
References : 1
User-Agent : tin/2.6.1-20211226 ("Convalmore") (Linux/5.15.139 (x86_64))
Mark Summerfield <mark@qtrac.eu> wrote:
> I have this function:
>
> proc ws::get_words {wordfile} {
>     set in [open $wordfile r]
>     try {
>         while {[gets $in line] >= 0} {
>             if {[regexp {^[a-z]+$} $line matched]} {
>                 lappend ::ws::Words [string tolower $matched]
>             }
>         }
>     } finally {
>         close $in
>     }
> }
>
> It reads about 100_000 lines and ends up keeping about 65_000 of them
> (from /usr/share/dict/words)
>
> I tried replacing it with:
>
> proc ws::get_words {wordfile} {
>     ::fileutil::foreachLine line $wordfile {
>         if {[regexp {^[a-z]+$} $line matched]} {
>             lappend ::ws::Words [string tolower $matched]
>         }
>     }
> }
>
> The first version loads "instantly"; but the second version (with
> foreachLine) takes seconds.
If you check the implementation of fileutil::foreachLine, you find:

    set code [catch {uplevel 1 $cmd} result options]

where "$cmd" is a variable holding, as a string, the "command" body you
passed to foreachLine.
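
That catch sits inside an ordinary gets loop, so the structure is
roughly the following (a simplified sketch, not the exact tcllib
source; the real proc also propagates break/continue/return/error out
of $cmd):

    proc foreachLineSketch {var filename cmd} {
        upvar 1 $var line
        set fp [open $filename r]
        try {
            while {[gets $fp line] >= 0} {
                # $cmd arrives as a plain string and is re-evaluated,
                # in the caller's scope, once per line of the file
                set code [catch {uplevel 1 $cmd} result options]
            }
        } finally {
            close $fp
        }
    }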
Your original version is all in a single procedure, so it gets bytecode
compiled, and every execution after the first runs that compiled
bytecode.
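
On Tcl 8.6 you can see that compiled form for yourself with the
(unsupported, but long-available) disassembler:

    # dumps the bytecode the compiler generated for the proc
    tcl::unsupported::disassemble proc ws::get_words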
The foreachLine version, since the "cmd" is a string evaluated through
uplevel, receives little to no bytecode compilation, and the difference
in time is the overhead of re-evaluating that uncompiled "command"
string once for every line of the file.
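
If you want the brevity of foreachLine without that penalty, keep the
per-line work inside a compiled proc body.  One common pattern (a
sketch along the lines of your first version, reading the whole file
in one go) is:

    proc ws::get_words {wordfile} {
        set in [open $wordfile r]
        try {
            # read + split keeps the loop inside this proc's own
            # bytecode, so every line is handled by compiled code
            foreach line [split [read $in] \n] {
                if {[regexp {^[a-z]+$} $line matched]} {
                    lappend ::ws::Words [string tolower $matched]
                }
            }
        } finally {
            close $in
        }
    }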