DFS <nospam@dfs.com> writes:
On 6/12/2024 6:30 PM, Keith Thompson wrote:
DFS <nospam@dfs.com> writes:
On 6/12/2024 5:30 PM, Barry Schwarz wrote:
On Wed, 12 Jun 2024 16:47:23 -0400, DFS <nospam@dfs.com> wrote:
>
Wrote a C program to mimic the stats shown on:
>
https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
>
My code compiles and works fine - every stat matches - except for one
anomaly: when using a dataset of consecutive numbers 1 to N, all values > 40
are flagged as outliers. Up to 40, no problem. Random numbers
dataset of any size: no problem.
>
And values 41+ definitely don't meet the conditions for outliers (using
the IQR * 1.5 rule).
>
Very strange.
>
Edit: I just noticed I didn't initialize a char:
before: char outliers[100];
after : char outliers[100] = "";
>
And the problem went away. Reset it to before and problem came back.
>
Makes no sense. What could cause the program to go FUBAR at data point
41+ only when the dataset is consecutive numbers?
Also, why doesn't gcc just do you a solid and initialize to "" for you?
Makes perfect sense. The first rule of undefined behavior is
"Whatever happens is exactly correct." You are not entitled to any
expectations and none of the behavior (or perhaps all of the behavior)
can be called unexpected.
>
I HATE bogus answers like this.
>
Aren't you embarrassed to say things like that?
He has nothing to be embarrassed about. What he wrote is correct.
>
No it's not.
>
"Whatever happens is exactly correct." is nonsense.
>
"You are not entitled to any expectations" is nonsense.
Neither statement is nonsense.
I quoted the C standard's definition of "undefined behavior".
The C standard *imposes no requirements* on code whose behavior
is undefined.
Perhaps "Nothing that happens is incorrect" would be clearer.
The standard joke is that code with undefined behavior can make
demons fly out of your nose. Obviously it can't, but the point
is that if it did, it would not violate the requirements of the
C standard.
If your code has undefined behavior, and you have any expectations
at all about how it will behave, none of those expectations are
supported by the C standard. Compilers perform optimizations
under the assumption that the code's behavior is not undefined,
which can and does result in arbitrarily weird behavior if you lie
to the compiler by feeding it code whose behavior is undefined.
The C standard's definition of "undefined behavior" is "behavior, upon
use of a nonportable or erroneous program construct or of erroneous
data, for which this International Standard imposes no requirements".
If you don't like the way C deals with undefined behavior, that's
perfectly valid, and a lot of people are likely to agree with you.
>
Thanks for feeling my pain!
>
It's frustrating. By now I spent a half-hour dealing with it. gcc
could've just filled the char[] variable with 0s by default. I bet
that would save a LOT of people time and headaches.
Perhaps -- but it would also hurt the performance of code that
doesn't *need* automatic objects to be initialized implicitly.
I am neither defending nor attacking the decisions that went into
the ISO C standard. I am explaining what it says.
If you think that the language *should* require automatic objects
to be initialized to zero, that's a perfectly valid opinion.
But you need to accept the fact that the language doesn't
require such initialization, it never has, and most C compilers do
not perform such initializations (unless perhaps you specify some
obscure option).
If that half hour led you to learn that, I suggest it was time
well spent.
And leaving out such a requirement was not accidental. It was a
deliberate decision made for reasons the authors felt were valid.
Zero-initializing all uninitialized automatic objects might be
an idea worth considering for a future standard, but it *would*
hurt performance.
But I advise against lashing out at people who are correctly explaining
what the C standard says.
>
The C standard really says "Whatever happens is exactly correct."?
Not in those words, but that's what "imposes no requirements" means.
If you write:
printf("%d\n", INT_MAX + 1);
and your program prints "0", or "hello, world", or invokes nethack, that
behavior does not violate the requirements of the C standard.
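If you want to stay clear of that particular case, one common approach
is to test for overflow *before* doing the addition. The following is
only a sketch; checked_add is a name made up for this example, not a
standard function.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not part of the standard library).
   Stores a + b in *sum and returns true if the result fits in an int.
   Otherwise returns false and leaves *sum untouched.
   The range tests never overflow themselves. */
static bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* a + b would have undefined behavior */
    }
    *sum = a + b;
    return true;
}
```

With this, checked_add(INT_MAX, 1, &x) reports failure instead of
evaluating INT_MAX + 1.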
DFS, since you've been posting in comp.lang.c for at least ten years,
>
Time flies.
>
How do you know I've posted here that long?
I have a collection of saved articles from this newsgroup.
I took a quick look and found some of your posts from 10 years ago.
There are a number of other ways to search for old articles. That,
and I recognized the name "DFS" well enough to infer that you're
a semi-regular.
I'm surprised you're having difficulties with this.
>
I'm surprised at some of the wonkiness of gcc and C.
By all means be surprised, but *learn*.
* warns relentlessly when the printf specifier doesn't match the var
type, but gives no warning when you use an int with memset (instead of
the size_t specified in the function prototype).
Of course. The warning for a bad print specifier is not required,
but it's useful and fairly easy to generate.
Passing an int as the third argument in a call to memset() is
perfectly valid and well defined, and does not justify a warning.
Yes, the parameter is defined with type size_t, which is an unsigned
integer type. Given that the prototype is visible (which it always
will be if you have the required `#include <string.h>`), you can
pass an argument of any integer or floating-point type and it will
be implicitly converted to size_t. There is no ambiguity.
(If the int value exceeds SIZE_MAX, which is impossible in most
implementations, then the conversion still yields a well-defined
result.)
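As a rough illustration of that conversion (the helper name
clear_prefix is invented for this sketch):

```c
#include <string.h>

/* Hypothetical helper: zeroes the first n bytes of buf.
   n has type int.  Because <string.h> provides a prototype for
   memset, the argument is implicitly converted to size_t. */
static void clear_prefix(char *buf, int n)
{
    if (n > 0) {
        memset(buf, 0, n);   /* int -> size_t, well defined */
    }
}
```

The n > 0 test is there because a negative int converts to a very
large size_t. The conversion itself is well defined, but asking memset
to write that many bytes would run off the end of the array.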
printf is a variadic function, so the types of the arguments after
the format string are not specified in its declaration. The printf
function has to *assume* that arguments have the types specified
by the format string. This:
printf("%d\n", foo);
(probably) has undefined behavior if foo is of type size_t.
There is no implicit conversion to the expected type. Note that
the format string doesn't have to be a string literal, so it's
not always even possible for the compiler to check the types.
Variadic functions give you a lot of flexibility at the cost of
making some type errors difficult to detect.
(I wrote "probably" because size_t *might* be a typedef for unsigned
int, and there are special rules about arguments of corresponding
signed and unsigned types.)
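A sketch of the two usual ways to get the printf case right. The
function names are made up for the example, and snprintf is used
instead of printf only so the result ends up in a buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a size_t using %zu, the conversion
   specifier for size_t (C99 and later). */
static int format_size(char *out, size_t cap, size_t value)
{
    return snprintf(out, cap, "%zu", value);
}

/* Hypothetical helper: keeps %d but converts the argument explicitly,
   so the type passed really is int.  Only sensible when value is
   known to fit in an int. */
static int format_size_as_int(char *out, size_t cap, size_t value)
{
    return snprintf(out, cap, "%d", (int)value);
}
```

Either way, the type of the argument now matches what the format
string says, which is what the variadic interface requires.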
* a missing bracket } throws 50 nonsensical compiler errors.
Recovering from parse errors can be difficult. Look for the *first*
reported syntax error, fix it, and recompile.
* warns of unused vars but not uninitialized ones
Compilers warn about what they *can* warn about. No warnings are
required, but good compilers try to do as much analysis as they can.
In a simple example:
char buf[100];
printf("%d\n", buf[50]);
with "gcc -Wall" I get a "warning: ‘buf’ is used uninitialized".
In your code, you had something like:
char outliers[200];
// ...
if (some_condition) {
strcat(outliers, "some string literal");
}
That has undefined behavior, but it's a bit more difficult
to diagnose than a direct reference to the array. In a quick
experiment, gcc warned about the reference if it's unconditional,
but not if it's in an if statement.
No such warnings are required by the language. It's a matter of how
much effort the compiler developers have put into it while trying to
avoid *too many* warnings.
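Here is a sketch of the conditional strcat pattern with the array
initialized. The names append_if and flagged_length are invented for
this example, and the capacity check is an addition of mine, not
something taken from your program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends text to list when flagged is nonzero.
   list must already hold a valid string; cap is the size of the array.
   Returns 1 on success (or when there is nothing to do), 0 if the text
   would not fit. */
static int append_if(char *list, size_t cap, int flagged, const char *text)
{
    if (!flagged) {
        return 1;
    }
    if (strlen(list) + strlen(text) + 1 > cap) {
        return 0;                 /* no room, list left unchanged */
    }
    strcat(list, text);           /* safe: list is a valid string */
    return 1;
}

/* Hypothetical caller showing the declaration that matters. */
static size_t flagged_length(int flagged)
{
    char outliers[200] = "";      /* initialized: a valid empty string */
    append_if(outliers, sizeof outliers, flagged, "some string literal");
    return strlen(outliers);
}
```

Without the `= ""`, strcat would start by searching indeterminate
bytes for a terminating '\0', which is where the undefined behavior
comes from.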
* one uninitialized var makes your program do crazy things. Worse than
crazy is it's identically crazy each time.
That's just the nature of undefined behavior. Consistent incorrect
behavior is allowed. Inconsistent incorrect behavior is allowed.
Seemingly correct behavior is allowed (and is perhaps the worst,
because it means you have a hidden bug that you can't test for that
might manifest in a future version).
You defined and did not initialize a character array. There could
be any number of reasons that its contents happened to be the same
from one run of your program to another.
And now you know how to fix the problem: make sure all objects are
initialized before you try to use their values. That's a fairly
easy rule to follow, even if compilers don't always do enough to
help enforce it.
-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */