threads with package bug (probably timing error) on magicsplat 9.01/02 and bawt 9.01

Subject: threads with package bug (probably timing error) on magicsplat 9.01/02 and bawt 9.01
From: et99 (at) *nospam* rocketship1.me (et99)
Newsgroups: comp.lang.tcl
Date: 17 Jul 2025, 04:22:30
Organization: A noiseless patient Spider
Message-ID : <1059q9m$15ube$1@dont-email.me>
User-Agent : Mozilla Thunderbird
I have found a bug in Tcl/Tk 9.0x that occurs on Windows when several threads do a [package require math] concurrently. I have tested several distributions: magicsplat 9.01 and 9.02, a bawt 9.01 distro and tclkit, and, as a control, magicsplat 8.6.16 (which does not fail).
My guess is that some kind of optimization was done in Tcl 9.0 with regard to the package command that has caused a race condition or perhaps a synchronization bug. I can get the failure with other packages besides math, but it happens more easily with the math package.
I wrote up a ticket after 9.01 was out. I am posting this in hopes that someone who knows how to debug tcl on windows might have some insight or the ability to find the cause of the failure.
With the bawt 9.01 tclkit, the failure results in an access violation crash instead, always on the same instruction. Unfortunately, I don't have any symbols, so I can't tell where the bug occurs in the Tcl source code.
----------------------- details (sorry for the length) ----------
I have a test program and a Windows batch script to run the test program in a loop.
The paths of the test script and wish.exe need to be changed in the batch script before trying out the code. If anyone does want to try it, use the most recent addition to the ticket found here:
  https://core.tcl-lang.org/tcl/tktview/61c01e0edb
The batch script is run from a cmd.exe window with one argument, 1-4, which selects which of my 4 distros to test. None of the paths is likely to be correct on another system, however.
The resulting problem is that [package require math] fails with the message "cannot find package". It can occur rarely or often, depending on which distro is used. For example, magicsplat 9.02 got 163 failures in 43050 runs, or about 0.4% of the time. Calling package require math a second time usually succeeds, however. Calling package require on a non-existent package, e.g. foobar, first often lets the subsequent package require math succeed as well.
Failures occur more often with bawt 9.01, however: for example, 6 failures in only 31 runs. I attribute this to a timing difference, since bawt uses //zipfs in the auto_path while magicsplat does not. On a much faster (2x) computer, running against magicsplat 9.01, the failure rate was about 5%.
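For anyone comparing their own numbers, here is the arithmetic behind the rates quoted above as a small sketch. The proc name failureRate is made up for this illustration; only the run and failure counts come from my tests.

```tcl
# Failure rate as a percentage, from a failure count and a run count.
# failureRate is a hypothetical helper name used only in this sketch.
proc failureRate {failures runs} {
    return [expr {100.0 * $failures / $runs}]
}

puts [format "magicsplat 9.02: %.2f%%" [failureRate 163 43050]]  ;# 0.38%
puts [format "bawt 9.01:       %.2f%%" [failureRate 6 31]]       ;# 19.35%
```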
When examining the global variables in the failing thread, the key appears to be that auto_path does not include the path that an ifneeded should have set up, while it does appear in threads that do not fail.
I cannot get it to fail if I create only one thread (besides the main thread) and so my test script creates 3 threads. All the test code does in each thread is issue a [package require math] inside of a catch to trigger the error. After the error some diagnostic code is run, to output the auto_path and the error message to a tk dialog box.
If a thread does not fail, it simply vwaits forever; the main thread checks for any failures (reported in a tsv shared variable) and exits quickly if there are none, so the test can run again.
If there is an error and the tk message box is displayed with no interaction, the process exits in 10 seconds and another run occurs; the batch script counts the runs and errors. There is a "no" button that cancels the 10 second exit to allow inspection from the console.
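The shape of the test described above can be sketched roughly as follows. This is a simplified illustration, not the script attached to the ticket: it assumes the Thread package is available, it joins the worker threads instead of leaving them in vwait, it prints to stdout instead of using a tk dialog, and the tsv array name "results" is made up for the example.

```tcl
package require Thread

# Script run by each worker thread. On failure it records the thread id,
# the error message and that thread's auto_path in the shared (tsv) array
# "results" (a name chosen only for this sketch).
set worker {
    if {[catch {package require math} msg]} {
        tsv::lappend results failures [list [thread::id] $msg $::auto_path]
    }
}

tsv::set results failures {}

# A single extra thread never triggered the problem, so start three.
set tids {}
for {set i 0} {$i < 3} {incr i} {
    lappend tids [thread::create -joinable $worker]
}
foreach tid $tids {
    thread::join $tid
}

# Report what the worker threads recorded.
set failures [tsv::get results failures]
puts "[llength $failures] of [llength $tids] threads failed"
foreach f $failures {
    lassign $f tid msg path
    puts "  $tid: $msg"
    puts "    auto_path: $path"
}
```

Because the failure is intermittent, a single run of a sketch like this will usually report 0 failures; it has to be run in a loop, as the batch script does.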
In looking at the pkgIndex.tcl script for math, there is the line:
package ifneeded math  1.2.6 [list source [file join $dir math.tcl]]
and by instrumenting these 2 scripts (the pkgIndex.tcl and math.tcl), it was determined that this ifneeded was indeed executed (in all 3 threads), however, the math.tcl script had not been sourced in the failing thread.
That agrees with the setup of auto_path which is done in math.tcl:
variable home [file join [pwd] [file dirname [info script]]]
if {[lsearch -exact $::auto_path $home] == -1} {
    lappend ::auto_path $home
}
So, the *mystery* is that when it fails, it appears as if the package database is common to all the threads and that something is not being sync'd correctly, so that executing the ifneeded in one thread makes another thread think it's already been done.
Anyhow, that's my guess, but I don't know if my thinking of a global common package database is actually correct.
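One way to look at what each thread's interpreter believes about a package is to dump its package state from inside that thread. The following is only a sketch: pkgState is a hypothetical helper name, while package provide, package versions and package ifneeded are the standard subcommands. Note that package versions only lists versions for which an ifneeded script has already been registered in that interpreter.

```tcl
# Return a dict describing what the current interpreter knows about a package:
#   provided - version already provided, or "" if not yet loaded
#   ifneeded - dict mapping each known version to its ifneeded script
# pkgState is a hypothetical helper name used only in this sketch.
proc pkgState {name} {
    set state [dict create provided [package provide $name] ifneeded {}]
    foreach v [package versions $name] {
        dict set state ifneeded $v [package ifneeded $name $v]
    }
    return $state
}
```

Calling [pkgState math] in a failing thread and in a working one, right after the catch, would show whether the failing thread has the ifneeded script registered but no provided version.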
This is as far as I can get with this, since I don't know how to debug the package code on Windows, and I don't have a full distro on Linux, like magicsplat or bawt.
I created the ticket a while back, but there have been no additional entries on it (except by me). I don't know how serious this is for other users, but for me it means I can't reliably use threaded code in version 9.
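Since a second package require usually succeeds, a stopgap could be a retry wrapper along these lines. It is only a sketch that papers over the symptom, not a fix, and the name requireWithRetry and the default of 2 attempts are made up for the example.

```tcl
# Try [package require] up to $attempts times and return the loaded version.
# Raises an error carrying the last message if every attempt fails.
# requireWithRetry is a hypothetical helper, not part of Tcl.
proc requireWithRetry {pkg {attempts 2}} {
    set msg ""
    for {set i 1} {$i <= $attempts} {incr i} {
        if {![catch {package require $pkg} msg]} {
            return $msg
        }
    }
    return -code error "package $pkg failed after $attempts attempts: $msg"
}
```

Inside a thread this would be used as [requireWithRetry math] in place of [package require math]. It does nothing for the access violation seen with the bawt tclkit.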
The latest test code and batch script are in the most recent addition to the ticket dated 2025-07-16.
The test script is a barebones whittling down of my tasks module, which is where I first found the failure. I could not get a failure using any simpler coding examples. However, my test script only calls tcl/tk core code plus the 2 package calls.
I would appreciate knowing if others have similar results and/or suggestions.
-eric
