Subject : Re: slow fileutil::foreachLine
From : rich (at) *nospam* example.invalid (Rich)
Newsgroups : comp.lang.tcl
Date : 17. Jun 2024, 16:40:29
Organization : A noiseless patient Spider
Message-ID : <v4pldd$noq7$1@dont-email.me>
References : 1
User-Agent : tin/2.6.1-20211226 ("Convalmore") (Linux/5.15.139 (x86_64))
Mark Summerfield <mark@qtrac.eu> wrote:
> I have this function:
>
> proc ws::get_words {wordfile} {
>     set in [open $wordfile r]
>     try {
>         while {[gets $in line] >= 0} {
>             if {[regexp {^[a-z]+$} $line matched]} {
>                 lappend ::ws::Words [string tolower $matched]
>             }
>         }
>     } finally {
>         close $in
>     }
> }
>
> It reads about 100_000 lines and ends up keeping about 65_000 of them
> (from /usr/share/dict/words)
>
> I tried replacing it with:
>
> proc ws::get_words {wordfile} {
>     ::fileutil::foreachLine line $wordfile {
>         if {[regexp {^[a-z]+$} $line matched]} {
>             lappend ::ws::Words [string tolower $matched]
>         }
>     }
> }
>
> The first version loads "instantly"; but the second version (with
> foreachLine) takes seconds.
If you check the implementation of fileutil::foreachLine, you find:

    set code [catch {uplevel 1 $cmd} result options]

where "$cmd" is a variable holding, as a string, the "command" body you
passed to foreachLine.
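
That catch sits inside an ordinary gets loop, so the structure is
roughly the following (a simplified sketch, not the exact tcllib
source; the real proc also propagates break/continue/return/error out
of $cmd):

    proc foreachLineSketch {var filename cmd} {
        upvar 1 $var line
        set fp [open $filename r]
        try {
            while {[gets $fp line] >= 0} {
                # $cmd arrives as a plain string and is re-evaluated,
                # in the caller's scope, once per line of the file
                set code [catch {uplevel 1 $cmd} result options]
            }
        } finally {
            close $fp
        }
    }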
Your original version is all in a single procedure, so it gets bytecode
compiled, and every execution after the first runs that compiled
bytecode.
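
On Tcl 8.6 you can see that compiled form for yourself with the
(unsupported, but long-available) disassembler:

    # dumps the bytecode the compiler generated for the proc
    tcl::unsupported::disassemble proc ws::get_words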
The foreachLine version, since the "cmd" is a string evaluated through
uplevel, receives little to no bytecode compilation, and the difference
in time is the overhead of re-evaluating that uncompiled "command"
string once for every line of the file.
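
If you want the brevity of foreachLine without that penalty, keep the
per-line work inside a compiled proc body.  One common pattern (a
sketch along the lines of your first version, reading the whole file
in one go) is:

    proc ws::get_words {wordfile} {
        set in [open $wordfile r]
        try {
            # read + split keeps the loop inside this proc's own
            # bytecode, so every line is handled by compiled code
            foreach line [split [read $in] \n] {
                if {[regexp {^[a-z]+$} $line matched]} {
                    lappend ::ws::Words [string tolower $matched]
                }
            }
        } finally {
            close $in
        }
    }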