Subject: Re: Whaddaya think?
From: nospam (at) *nospam* dfs.com (DFS)
Newsgroups: comp.lang.c
Date: 16 Jun 2024, 17:20:07
Message-ID : <666f10b7$0$1412896$882e4bbb@reader.netnews.com>
References : 1 2
User-Agent : Betterbird (Windows)
On 6/15/2024 6:22 PM, Keith Thompson wrote:
DFS <nospam@dfs.com> writes:
I want to read numbers in from a file, say:
>
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118
245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144
245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195
32 4 54 79 193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55
259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195
7 104 47 291
>
>
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
>
>
Any issues with this method?
>
Any 'better' way?
>
Thanks
In a quick test, your code compiles without errors and runs correctly
with your input. I do get a warning about argc being unused, which you
should address.
-Wall doesn't warn about that, but -Wall -Wextra does.
In the bigger program of which this is a part, argc IS used.
----------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
>
int main(int argc, char *argv[]) {
>
int N=0, i=0, j=0;
The usual convention is to use all-caps for macro names. Calling your
variable N is not a real problem, but could be slightly confusing.
N is the number of integers in the input. i is an index. j is a value
read from the file. That's not at all clear from the names.
I suggest using longer and more descriptive names in lower case.
"N" could be "count". "i" is fine for an index, but "j" could be
"value".
N is used in statistics, and this is a stats program.
Consider using size_t rather than int for the count and index. That's
mostly a style point; it's not going to make any practical difference
unless you have at least INT_MAX elements.
int *nums;
FILE* datafile = fopen(argv[1], "r");
Undefined behavior if no argument was provided, i.e., argc < 2.
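A minimal sketch of guarding against the missing argument, assuming a hypothetical helper named open_input (it is not part of the posted program). With no file name on the command line, argv[1] is a null pointer, so the check has to happen before fopen is called:

```c
#include <stdio.h>

/* Hypothetical helper: returns an open stream, or NULL after printing
   a diagnostic if the file name is missing or the file cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    if (argc < 2) {
        /* argv[0] may itself be missing, so fall back to a fixed name */
        fprintf(stderr, "usage: %s datafile\n",
                argc > 0 && argv[0] ? argv[0] : "program");
        return NULL;
    }
    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL)
        perror(argv[1]);    /* report why the open failed */
    return fp;
}
```

The caller would then test the result once, for example `if (datafile == NULL) return EXIT_FAILURE;`.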
while(fscanf(datafile, "%d", &j) != EOF){
Numeric input with the *scanf functions has undefined behavior if the
scanned value is outside the range of the target type. For example, if
the input contains "99999999999999999999999999999999999999999999999999",
arbitrary bad things could happen. (Most likely it will just store some
incorrect value in j, with no indication that there was an error.)
strtol is trickier to use, but you can detect errors.
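As a rough illustration of the strtol approach, here is a sketch with a hypothetical helper named parse_int (the name and the return convention are assumptions, not anything from the thread). It rejects input with no digits and values that do not fit in an int:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text at s to an int.
   Returns 1 on success (value in *out, *end just past the number),
   0 if no number was found or the value does not fit in an int. */
int parse_int(const char *s, char **end, int *out)
{
    errno = 0;
    long v = strtol(s, end, 10);
    if (*end == s)            /* no digits consumed */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;             /* out of range for long or for int */
    *out = (int)v;
    return 1;
}
```

A caller would typically read a line with fgets and call parse_int repeatedly, passing the previous `end` as the next `s`.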
fscanf returns EOF on reaching the end of the file or on a read error,
and that's the only condition you check. It returns the number of items
scanned. If the input doesn't contain a string that can be interpreted
as an integer, fscanf will return 0, and you'll be stuck in an infinite
loop. `while (fscanf(...) == 1)` is more robust, but it doesn't
distinguish between a read error and bad data. It's up to you how and
whether to distinguish among different kinds of errors.
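One possible way to tell the three outcomes apart is sketched below. The helper name count_ints and the choice to report both failures as -1 are assumptions made for illustration. Note that it still relies on fscanf, so out-of-range values remain undefined behavior:

```c
#include <stdio.h>

/* Hypothetical helper: counts the integers in fp.
   Returns the count, or -1 on a read error or non-numeric input. */
long count_ints(FILE *fp)
{
    long count = 0;
    int value, rc;

    /* Loop only while exactly one item was converted */
    while ((rc = fscanf(fp, "%d", &value)) == 1)
        count++;

    if (rc == EOF && ferror(fp))
        return -1;            /* read error */
    if (rc == 0)
        return -1;            /* matching failure: input is not a number */
    return count;             /* clean end of file */
}
```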
Your sample input consists of decimal integers with no sign. Decide
whether you want to handle "-123" or "+123". (fscanf will do so; so will
strtol.)
A change I might make down the road is to process positive floats. For now it's just positive ints.
N++;
}
nums = calloc(N, sizeof(int));
Consider using `sizeof *nums` rather than `sizeof(int)`. That way you
don't have to change the type in two places if the element type changes.
You'll be updating all the elements of the nums array, so there's not
much point in zeroing it. If you use malloc:
nums = malloc(N * sizeof *nums);
Whether you use calloc() or malloc(), you should check the return
value. If it returns a null pointer, it means the allocation failed.
Aborting the program is probably a good way to handle it.
I usually don't do error checking on my personal code.
(There are complications on Linux-based systems which I won't get into
here. Google "OOM killer" and "overcommit" for details.)
rewind(datafile);
This can fail if the input file is not seekable. For example, on a
Linux-based system you could do something like:
./your_program /dev/stdin < file
Perhaps that's an acceptable restriction, but be aware of it.
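Since rewind() returns nothing, a failure goes unnoticed unless errno is inspected. A sketch of a checked alternative using fseek, assuming a hypothetical helper named rewind_checked:

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 if the stream was repositioned to its
   start, nonzero if it cannot be (for example, when reading a pipe). */
int rewind_checked(FILE *fp)
{
    if (fseek(fp, 0L, SEEK_SET) != 0) {
        perror("fseek");
        return -1;
    }
    clearerr(fp);   /* rewind() also clears the error indicator */
    return 0;
}
```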
while(fscanf(datafile, "%d", &j) != EOF){
Again, UB for out of range values.
It's not guaranteed that you'll get the same data the second time you
read the file; some other process could modify it. This might not be
worth worrying about.
I updated the code to do one fscanf() thru the file.
I looked for an easy way to lock it while reading, but as I understand it, flock() only places an 'advisory lock' on the file, and other processes are still free to modify it.
nums[i++] = j;
}
fclose (datafile);
printf("\n");
You haven't produced any output yet; why print a blank line? (Of course
you can if you want to.)
for(i=0;i<N;i++) {
printf("%d. %d\n", i+1, nums[i]);
}
printf("\n");
free(nums);
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses makes it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
Can't omit it. It's required by my brain.
}
A method that doesn't require rescanning the input file is to initially
allocate some reasonable amount of memory, then use realloc() to
expand the array as needed. Doubling the array size is probably
reasonable. It will consume more memory than a single allocation.
Done in a way, as you'll see below.
Thanks for the thorough analysis and good tips.
Updated
* dropped 2 variable declarations
* allocate 'on the fly'
* one fscanf thru the file
* 4 fewer lines of code (not including braces)
----------------------------------------------------------
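#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int N = 0;
    int *nums = malloc(2 * sizeof(int));
    FILE *datafile = fopen(argv[1], "r");

    /* N is incremented on every call, including the final failed read */
    while (fscanf(datafile, "%d", &nums[N++]) == 1) {
        /* grow by one element so the next value has a slot */
        nums = realloc(nums, (N + 1) * sizeof(int));
    }
    fclose(datafile);
    N--;  /* undo the extra increment from the failed read */

    for (int i = 0; i < N; i++) {
        printf("%d.%d ", i + 1, nums[i]);
    }
    free(nums);
    printf("\n");
    return 0;
}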
----------------------------------------------------------
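For comparison, here is a sketch of the doubling strategy Keith described, with the allocation checks he recommended. The helper name read_ints, its signature, and the initial capacity of 16 are all assumptions for illustration. The realloc result goes into a temporary so the original block is not lost if the call fails:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads whitespace-separated ints from fp into a
   growing array. Stores the element count in *count and returns the
   array (caller frees it), or NULL if allocation fails. */
int *read_ints(FILE *fp, size_t *count)
{
    size_t cap = 16, n = 0;          /* initial capacity is arbitrary */
    int *nums = malloc(cap * sizeof *nums);
    int value;

    if (nums == NULL)
        return NULL;

    while (fscanf(fp, "%d", &value) == 1) {
        if (n == cap) {              /* full: double the capacity */
            int *tmp = realloc(nums, 2 * cap * sizeof *tmp);
            if (tmp == NULL) {       /* nums is still valid here */
                free(nums);
                return NULL;
            }
            nums = tmp;
            cap *= 2;
        }
        nums[n++] = value;
    }
    *count = n;
    return nums;
}
```

Reading into a local `value` first means the count is only bumped on a successful read, so no `N--` correction is needed afterwards, and the 100 or so values in the sample input need only three reallocations instead of one per number.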