On 05/03/2024 08:47 PM, Chris M. Thomasson wrote:

The timeout logic was fun to play with back when I was programming
server code. A connection would come in, get its job done very quickly,
and get its result: over and out. But when a connection came in, did a
little something, and then stalled for a while, my timing code would
flag it as a potentially stalled connection. The problem is that a bad
actor can make a connection, send some data, and then stop; make a
thousand other connections that do the same; and make another ten
thousand connections that do it via infected proxy computers. I wrote a
program that simulated these scenarios. The timeout code needed to
consult a little database the server kept about prior "potential" bad
actors. It's a touchy situation, to say the least.

On 5/3/2024 8:44 PM, Chris M. Thomasson wrote:

In re-routines, timeout logic is implemented because they eventually

On 4/30/2024 2:04 AM, Stefan Ram wrote:

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:

The GIL only prevents multiple Python statements from being
interpreted simultaneously, but if you're waiting on inputs (like
sockets), it's not active, so that could be distributed across
multiple cores.
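To illustrate the point, here is a small sketch (my own, not from the
thread): time.sleep releases the GIL just as a blocking socket read
does, so five 0.2-second waits spread across five threads finish in
roughly 0.2 seconds rather than 1.0.

```python
import threading
import time

def wait():
    # Like a blocking socket recv(), time.sleep releases the GIL,
    # so the five waits below overlap instead of running back to back.
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=wait) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start  # roughly 0.2 s, not 1.0 s
```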
Disclaimer: This is not on-topic here as it discusses Python,
not C or C++.

FWIW, here's some multithreaded Python code modeled after what
I use in an application.
I am using Python to prepare a press review for me, getting article
headers from several news sites, removing all headers matching a list
of regexps, and integrating everything into a single HTML resource.
(I do not like to read about Lindsay Lohan, for example, so articles
with the text "Lindsay Lohan" will not show up in my HTML review.)

I'm usually downloading all pages at once using Python threads,
which makes sure that one thread uses the CPU while another
thread is waiting for TCP/IP data. This is the code, taken from
my Python program and a bit simplified:
from multiprocessing.dummy import Pool

...

with Pool( 9 if fast_internet else 1 ) as pool:
    for i in range( 9 ):
        content[ i ] = pool.apply_async( fetch, [ uris[ i ] ])
    pool.close()
    pool.join()
. I'm using my "fetch" function to fetch a single URI, and the
loop starts nine threads within a thread pool to fetch the
content of those nine URIs "in parallel". This is observably
faster than corresponding sequential code.
(However, sometimes I have a slow connection and have to download
sequentially in order not to overload the slow connection, which
would result in stalled downloads. To accomplish this, I just
change the "9" to "1" in the first line above.)

In case you wonder about the "dummy":
|The multiprocessing.dummy module provides a wrapper
|for the multiprocessing module, except implemented using
|thread-based concurrency.
|
|It provides a drop-in replacement for multiprocessing,
|allowing a program that uses the multiprocessing API to
|switch to threads with a single change to import statements.
. So, this is an area where multithreading the Python way is easy
to use and enhances performance even in the presence of the GIL!
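For comparison, a sketch of the same fan-out using the standard
library's concurrent.futures instead of multiprocessing.dummy (the
fetch stub and the URIs here are placeholders of my own, not the code
from the post):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(uri):
    # Stand-in for a real network fetch; assumed to return the body.
    return "content of " + uri

uris = ["https://example.com/%d" % i for i in range(9)]

# map() preserves input order; while one worker blocks on I/O the
# GIL is released and the other workers proceed.
with ThreadPoolExecutor(max_workers=9) as pool:
    content = list(pool.map(fetch, uris))
```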
Agreed. However, it's a very small sample. Try to download 60,000 files
concurrently from different sources all at once. That can be where the
single lock messes with performance...
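A toy sketch of that kind of timeout bookkeeping, in the spirit of the
"potential bad actors" database described above (all names and the
shrinking-leash policy are my own invention, not the actual server
code):

```python
STALL_LIMIT = 5.0   # seconds of silence before a connection is suspect
strikes = {}        # address -> count of prior "potential bad actor" flags

class Conn:
    def __init__(self, addr, now):
        self.addr = addr
        self.last_activity = now

def check(conn, now):
    # Known repeat offenders get a proportionally shorter leash.
    limit = STALL_LIMIT / (1 + strikes.get(conn.addr, 0))
    if now - conn.last_activity > limit:
        strikes[conn.addr] = strikes.get(conn.addr, 0) + 1
        return "stalled"
    return "ok"
```

A real server would use a monotonic clock and persist the strike
database across connections, as described in the post.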
Certain sources are faster than others. That's always fun... Think of
timeout logic... ;^D
come up and, if expired, then are retired.

Now, using a word like "retire" gets involved when it's contextual
all the way down to the micro-ops of the processor core's pipeline,
and the usual model of speculative execution in modern chips:
micro-ops, pipelines, caches, execution order, and the memory
barriers and instruction-ordering guarantees according to the chip.

Here, though, it means that implementing timeout on open
items involves checking each item at an interval
that represents the hard timeout vis-a-vis the "it's expired"
timeout.
So in re-routines there's, simply enough, an auxiliary
data structure: a task-set besides the task-queue, with a pass
through its items to find the expired ones. Yet that is its
own sort of busy-work data structure, in a world where
items each have their own granular timeout lifetimes and intervals.
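A minimal sketch of such a task-set with per-item lifetimes and a
periodic sweep (the names and the interface are hypothetical, not
re-routines' actual API):

```python
from collections import deque

task_queue = deque()   # tasks in submission order
task_set = {}          # task id -> absolute expiry deadline

def submit(task_id, now, lifetime):
    # Each item carries its own granular timeout.
    task_queue.append(task_id)
    task_set[task_id] = now + lifetime

def sweep(now):
    # Run at the hard-timeout interval: retire expired items.
    expired = [t for t, deadline in task_set.items() if deadline <= now]
    for t in expired:
        del task_set[t]   # a real system would also cancel the work
    return expired
```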
It's similar for open connections and something like a sweeper/closer,
with regards to protocol timeouts, socket timeouts, and these kinds
of things, for whatever streams are implemented in
whatever system, or user-space streams over sockets or datagrams.
Something like XmlHttpRequest or whatwg fetch runs in its
own threads, sort of invisibly to the usual event loop.