Subject: Re: A technique from a chatbot
From: ram (at) *nospam* zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.python
Date: 05. Apr 2024, 19:29:22
Organization: Stefan Ram
Message-ID : <benchmark-20240405190253@ram.dialup.fu-berlin.de>
References : 1 2 3 4 5
Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:
> I don't think there's a tuple being created. If you mean:
>     ( word for word in list_ if word[ 0 ]== 'e' )
> ...that's not creating a tuple. It's a generator expression, which
> generates the next value each time it's called for. If you only ever
> ask for the first item, it only generates that one.

Yes, that's also how I understand it!
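The laziness described in the quote can be checked directly. The following is only a small sketch: the helper "starts_with_e" and the list "examined" are hypothetical names introduced here to record which words the filter actually looks at.

```python
# Hypothetical helper: records every word the filter is asked about.
examined = []

def starts_with_e( word ):
    examined.append( word )
    return word[ 0 ]== 'e'

list_ = [ 'apple', 'egg', 'eel', 'fig' ]

# Creating the generator expression does not run the filter yet.
generator = ( word for word in list_ if starts_with_e( word ))
assert examined == []

# Asking for the first item runs the filter only up to the first match.
first = next( generator, '' )
print( first )      # egg
print( examined )   # ['apple', 'egg']
```

The words 'eel' and 'fig' are never examined, because nothing asks the generator for a second item.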
In the meantime, I wrote a microbenchmark, shown below.
On my computer, it shows the next+generator approach to be
a bit faster than the procedural break approach. But when
I swap the order of the two approaches in the loop,
next+generator comes out a bit slower. So let's say the two
take about the same time.
However, I also tested code with an early return (not shown below),
and it turned out to be faster than both the code using break and
the code using next+generator by a factor of about 1.6, even though
the code with return has the "function call overhead"!
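The early-return code itself is not part of this post. Purely as an illustration of what such a variant might look like, here is a sketch; the function name "first_e_word" is an assumption, not the code that was actually measured.

```python
# Hypothetical early-return variant; not the code that was benchmarked.
def first_e_word( list_ ):
    for word in list_:
        if word[ 0 ]== 'e':
            return word   # leave the function at the first match
    return ''             # no word starts with 'e'

print( first_e_word( [ 'apple', 'egg', 'eel' ] ))   # egg
print( repr( first_e_word( [ 'apple', 'fig' ] )))   # ''
```

Like the break and next+generator versions, it yields the empty string when no word matches.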
But please be aware that such results depend on the Python
implementation and version used for the benchmark, and also
on the details of how exactly the benchmark is written.
import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
'so please bear with me.' )

time_using_break = 0
time_using_next = 0

for repetition in range( 100 ):
    for i in range( 100 ): # Yes, this nesting is redundant!

        list_ = \
        [ ''.join \
          ( random.choices \
            ( string.ascii_lowercase, k=random.randint( 1, 30 )))
          for i in range( random.randint( 0, 50 ))]

        start_time = timeit.default_timer()
        for word in list_:
            if word[ 0 ]== 'e':
                word_using_break = word
                break
        else:
            word_using_break = ''
        time_using_break += timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        word_using_next = \
        next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
        time_using_next += timeit.default_timer() - start_time

        if word_using_next != word_using_break:
            raise Exception( 'word_using_next != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_next = }' )
print( f'{time_using_next / time_using_break = }' )
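Since the outcome depends on how the benchmark is written, here is one alternative way to write it, as a sketch only. It uses timeit.repeat on a fixed, seeded data set and takes the best of several runs. The names "lists", "using_break", "using_next" and "run" are assumptions introduced for this sketch. Note that both approaches are wrapped in functions here, so both pay the same function call overhead, which differs from the inline measurement above.

```python
import random
import string
import timeit

random.seed( 0 )   # fixed seed, so both approaches see the same data

# 1000 random word lists, built the same way as in the benchmark above.
lists = \
[ [ ''.join( random.choices( string.ascii_lowercase,
                             k=random.randint( 1, 30 )))
    for _ in range( random.randint( 0, 50 )) ]
  for _ in range( 1000 ) ]

def using_break( list_ ):
    for word in list_:
        if word[ 0 ]== 'e':
            result = word
            break
    else:
        result = ''
    return result

def using_next( list_ ):
    return next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )

def run( function ):
    # Apply one approach to every list of the data set.
    for list_ in lists:
        function( list_ )

results = {}
for function in ( using_break, using_next ):
    # Five measurements of ten passes each; keep the fastest one.
    results[ function.__name__ ] = \
    min( timeit.repeat( lambda: run( function ), number=10, repeat=5 ))

for name, seconds in results.items():
    print( f'{name}: {seconds:.4f} s' )
```

Taking the minimum of several repetitions is a common way to damp noise from other processes, but the numbers will still vary with the Python implementation and version.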