> A rather artificial task that you have chosen so that it can be done as a Python one-liner, for the main body.
It is an artificial task that matches Michael's description of a "very simple thing like reading numbers from text file". Perhaps I should have asked for the median and mode as well as the mean. In Python, that would mean adding these lines:

On 18/06/2024 13:36, David Brown wrote:
On 18/06/2024 10:56, Michael S wrote:
On Mon, 17 Jun 2024 15:23:55 +0200
David Brown <david.brown@hesbynett.no> wrote:
I use Python rather than C because for
PC code, that can often involve files, text manipulation, networking,
and various data structures, the Python code is at least an order of
magnitude shorter and faster to write. When I see the amount of
faffing around in order to read and parse a file consisting of a list
of integers, I find it amazing that anyone would actively choose C
for the task (unless it is for the fun of it).
>
The faffing (what does it mean, BTW ?) is caused by unrealistic
requirements. More specifically, by requirements of (A) to support
arbitrary line length (B) to process file line by line. Drop just one
of those requirements and everything becomes quite simple.
"Faffing around" or "faffing about" means messing around doing unimportant or unnecessary things instead of useful things. In this case, it means writing lots of code for handling memory management to read a file instead of using a higher-level language and just reading the file.
>
Yes, dropping requirements might make the task easier in C. But you still don't get close to being as easy as it is in a higher level language. (That does not have to be Python - I simply use that as an example that I am familiar with, and many others here will also have at least some experience of it.)
>>>
For a task like that Python could indeed be several times shorter, but
only if you wrote your Python script exclusively for yourself, cutting
all corners, like not providing short help for the user, not testing that
the input format matches expectations and, most importantly, not reporting
input format problems in a potentially useful manner.
No, even if that were part of the specifications, it would still be far easier in Python. The brief Python samples I have posted don't cover such user help, options, error checking, etc., but that's because they are brief samples.
OTOH, if we write our utility in a more "anal" manner, as we should if
we expect it to be used by other people or by ourselves a long time after
it was written (at my age, a couple of months is long enough and I am not
that much older than you) then the code size difference between the Python and
C variants will be much smaller, probably a factor of 2 or so.
Unless half the code is a text string for a help page, I'd expect a bigger factor. And I'd expect the development time difference to be an even bigger factor - with Python you avoid a number of issues that are easy to get wrong in C (such as memory management). Of course that would require a reasonable familiarity with both languages for a fair comparison.
>
C and Python are both great languages, with their pros and cons and different areas where they shine. There can be good reasons for writing a program like this in C rather than Python, but C is often used without good technical reasons. To me, it is important to know a number of tools and pick the best one for any given job.
>>>
W.r.t. faster to code, it very strongly depends on familiarity.
You haven't done that sort of task in 'C' since your school days, right?
Or ever? And you are doing them in Python quite regularly? Then that is
a much bigger reason for the difference than the language itself.
Sure - familiarity with a particular tool is a big reason for choosing it.
Now, for more complicated tasks Python, as the language, and even more
importantly, Python as a massive set of useful libraries could have
a very big productivity advantage over 'C'. But it does not apply to a very
simple thing like reading numbers from text file.
IMHO, it does. I have slightly lost track of which programs were being discussed in which thread, but the Python code for the task is a small fraction of the size of the C code. I agree that if you want to add help messages and nicer error messages, the difference will go down.
>
Here is a simple task - take a file name as a command-line argument, then read all white-space (space, tab, newlines, mixtures) separated integers. Add them up and print the count, sum, and average (as an integer). Give a brief usage message if the file name is missing, and a brief error if there is something that is not an integer. This should be a task that you see as very simple in C.
>
>
#!/usr/bin/python3
import sys

if len(sys.argv) < 2 :
    print("Usage: sums.py <input-file>")
    sys.exit(1)

data = list(map(int, open(sys.argv[1], "r").read().split()))
n = len(data)
s = sum(data)
print("Count: %i, sum %i, average %i" % (n, s, s // n))
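For illustration only, here is a sketch of how the script above might grow if it also reported the non-integer error that the task statement asks for, plus the median and mode mentioned in the thread. The function name `summarize` and the wording of the messages are hypothetical and not taken from any post; `statistics.median` and `statistics.mode` come from the standard library.

```python
import statistics


def summarize(text):
    """Parse whitespace-separated integers and return basic statistics.

    Hypothetical helper: the name and the error wording are assumptions.
    """
    data = []
    for position, token in enumerate(text.split(), start=1):
        try:
            data.append(int(token))
        except ValueError:
            # Say which item was bad instead of showing a bare traceback.
            raise ValueError(
                "item %d is not an integer: %r" % (position, token)
            ) from None
    if not data:
        # Avoids a division by zero on an empty input.
        raise ValueError("no integers found")
    total = sum(data)
    return {
        "count": len(data),
        "sum": total,
        "average": total // len(data),  # integer average, as in the task
        "median": statistics.median(data),
        "mode": statistics.mode(data),
    }


# Small in-memory demonstration instead of a real file.
print(summarize("3 1 4 1 5"))
```

On Python 3.8 and later, `statistics.mode` returns the first of several equally common values; older versions raise `StatisticsError` in that case.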
> Some characteristics of how it is done are that the whole file is read into memory as effectively a single string, and all the numbers are collated into an in-memory array before it is processed.
Yes. And that's fine.

> Numbers are also conveniently separated by white-space (no commas!), so that .split can be used.
Yes, that was the specification. But if you want it to support spaces, newlines, tabs and commas, you can write the split() as
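The expression that followed is not shown here. One possible form, offered as an assumption and not as the author's code, splits on runs of whitespace and commas with `re.split`; the helper name `parse_ints` is hypothetical.

```python
import re


def parse_ints(text):
    """Split on runs of whitespace and/or commas, then convert to int.

    Hypothetical helper, not the expression from the original post.
    """
    # Leading or trailing separators produce empty strings, so drop them.
    return [int(tok) for tok in re.split(r"[\s,]+", text) if tok]


print(parse_ints("1, 2,3\t4\n5 ,6"))  # prints [1, 2, 3, 4, 5, 6]
```

Note that this treats consecutive commas as a single separator, which may or may not be what a stricter format would want.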
> You are using features from Python that allow arbitrary large integers that also avoid any overflow on that sum.
I'm using features from Python in my Python code when showing that Python has features making it more convenient than C for this kind of task! What a horror! That's downright /evil/ of me!

> A C version wouldn't have all those built-ins to draw on (presumably you expect the starting point to be 'int main(int n ,char** args){}'; using existing libraries is not allowed).
/Exactly/.

> Some would write it so that the file is processed serially and doesn't have to occupy memory, or needed to deal with files that might fill up memory.
> They might also try and avoid building a large data[] array that may need to grow in size unless the bounds are determined in advance.
Run-time speed was not at issue. We all know that it is possible to write C code for a task like this which will run a great deal faster than the Python code, especially if you can give extra restrictions to the incoming data.
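The serial approach described above can be approximated in Python too. This is a rough sketch, with a hypothetical `running_totals` helper, that keeps only a running count and sum and never builds the full list. It still holds one whole line in memory at a time, so it does not address arbitrarily long lines.

```python
import io


def running_totals(lines):
    """Accumulate count and sum while iterating over lines.

    `lines` can be any iterable of strings, for example an open text file.
    Hypothetical helper for illustration only.
    """
    count = 0
    total = 0
    for line in lines:
        for token in line.split():
            total += int(token)  # Python ints do not overflow
            count += 1
    return count, total


# In-memory stand-in for a file opened with open(path, "r").
sample = io.StringIO("1 2 3\n4 5\n\n6\n")
count, total = running_totals(sample)
print("Count: %i, sum %i, average %i" % (count, total, total // count))
```

With a real file, the same function could be called as `running_totals(open(path))`, since a text file object yields its lines one at a time.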
> The C version would be doing it in a different manner, and likely to be more efficient.
> I haven't tried it directly in C (I don't have a C 'readfile' to hand); I tried it in my language on a 100MB test input of 15M random numbers ranging up to one million.
No one is interested in that - that was not part of the task.

> With a more arbitrary input format, this would be the kind of job that a compiler's lexer does. But nobody seriously writes lexers in Python.
Yes, people do. (Look up the PLY project, for example.) Nobody seriously writes lexers in C these days. They use Python or another high level language during development, prototyping and experimentation, and if the language takes off as a realistic general-purpose language, they either write the lexer and the rest of the tools in the new language itself, or they use C++.
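To give a feel for what a small lexer in Python can look like, here is a sketch built on the standard `re` module, along the lines of the tokenizer example in the Python `re` documentation. It does not use PLY's API; the token names and the tiny grammar are assumptions chosen for illustration.

```python
import re

# Hypothetical token set for a toy expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),      # whitespace, discarded
    ("MISMATCH", r"."),    # any other single character is an error
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, value) pairs for each token in text."""
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                "unexpected character %r at offset %d"
                % (match.group(), match.start())
            )
        yield kind, match.group()


print(list(tokenize("total = count + 42")))
```

Each alternative is a named group, so `match.lastgroup` reports which rule matched; the order of `TOKEN_SPEC` decides which rule wins when more than one could match at the same position.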