Re: "undefined behavior"?

Subject : Re: "undefined behavior"?
From : nospam (at) *nospam* dfs.com (DFS)
Newsgroups : comp.lang.c
Date : 12 Jun 2024, 22:53:35
Message-ID : <666a18de$0$958$882e4bbb@reader.netnews.com>
References : 1 2
User-Agent : Betterbird (Windows)
On 6/12/2024 5:30 PM, Barry Schwarz wrote:
> On Wed, 12 Jun 2024 16:47:23 -0400, DFS <nospam@dfs.com> wrote:
>
>> Wrote a C program to mimic the stats shown on:
>>
>> https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
>>
>> My code compiles and works fine - every stat matches - except for one
>> anomaly: when using a dataset of consecutive numbers 1 to N, all values > 40
>> are flagged as outliers.  Up to 40, no problem.  Random numbers
>> dataset of any size: no problem.
>>
>> And values 41+ definitely don't meet the conditions for outliers (using
>> the IQR * 1.5 rule).
>>
>> Very strange.
>>
>> Edit: I just noticed I didn't initialize a char:
>> before: char outliers[100];
>> after : char outliers[100] = "";
>>
>> And the problem went away.  Reset it to before and problem came back.
>>
>> Makes no sense.  What could cause the program to go FUBAR at data point
>> 41+ only when the dataset is consecutive numbers?
>>
>> Also, why doesn't gcc just do you a solid and initialize to "" for you?
>
> Makes perfect sense.  The first rule of undefined behavior is
> "Whatever happens is exactly correct."  You are not entitled to any
> expectations and none of the behavior (or perhaps all of the behavior)
> can be called unexpected.

I HATE bogus answers like this.
Aren't you embarrassed to say things like that?

> Since we cannot see your code, I will guess that you use a non-zero
> value in outliers[i] to indicate that the corresponding value has been
> identified as an outlier.

No.
I compare the data point to the lower and upper bounds of a stat rule commonly called the "IQR Rule":
lo = Q1 - (1.5 * IQR)
hi = Q3 + (1.5 * IQR)
If it falls outside the range lo to hi, I strcat the value to a char array.
The outlier routine starts at line 170.
If you change
char outliers[200]="", temp[10]="";
to
char outliers[200], temp[10];
you might see what happens when you run the program for consecutive values:
$ ./prog 100 -c
=========================================================================
//this code is hereby released to the public domain
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <time.h>
/*
  this program computes the descriptive statistics of a randomly generated set of N integers
  1.0 release Dec 2020
  2.0 release Jun 2024
  used the population skewness and Kurtosis formulas from:
 https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
  also test the results of this code against that site
  compile: gcc -Wall prog.c -o prog -lm
  usage  : ./prog N -option (where N is 2 or higher, and option is -r or -c or -o)
            -r generates N random numbers
            -c generates consecutive numbers 1 to N
            -o generates random numbers with outliers
*/
//random ints
int randNbr(int low, int high) {
return (low + rand() / (RAND_MAX / (high - low + 1) + 1));
}
//comparator function used with qsort
int compareint (const void * a, const void * b)
{
   if (*(int*)a > *(int*)b) return 1;
   else if (*(int*)a < *(int*)b) return -1;
   else return 0;
}
int main(int argc, char *argv[])
{
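if(argc < 3) {
printf("Missing argument:\n");
printf(" * enter a number 2 or higher\n");
printf(" * enter an option -r -c or -o\n");
exit(EXIT_FAILURE);
}
//an unknown option would leave the dataset unfilled
if(strcmp(argv[2],"-r") != 0 && strcmp(argv[2],"-c") != 0 && strcmp(argv[2],"-o") != 0) {
printf("Unknown option %s: use -r -c or -o\n", argv[2]);
exit(EXIT_FAILURE);
}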


//vars
int i=0, lastmode=0;
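int N = atoi(argv[1]);
if(N < 2) {  //a variable length array needs a positive size
printf("N must be 2 or higher\n");
exit(EXIT_FAILURE);
}
int nums[N];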
//int *nums = malloc(N * sizeof(int));

double sumN=0.0, median=0.0, Q1=0.0, Q2=0.0, Q3=0.0, IQR=0.0;
double stddev = 0.0, kurtosis = 0.0;
double sqrdiffmean = 0.0, cubediffmean = 0.0, quaddiffmean = 0.0;
double meanabsdev = 0.0, rootmeansqr = 0.0;
char mode[100] = "", tmp[12] = "";  //strcat needs mode to start as a terminated string

//generate random dataset
if(strcmp(argv[2],"-r") == 0) {
srand(time(NULL));
for(i=0;i<N;i++) { nums[i] = randNbr(1,N*3); }

printf("%d Randoms:\n", N);
printf("No commas  : ");   for(i=0;i<N;i++) { printf("%d ", nums[i]); }
printf("\nWith commas: "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
qsort(nums,N,sizeof(int),compareint);
printf("\nSorted     : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
printf("\nSorted     : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
}

//generate random dataset with outliers
if(strcmp(argv[2],"-o") == 0) {
srand(time(NULL));
nums[0] = 1; nums[1] = 3;
for(i=2;i<N-2;i++) { nums[i] = randNbr(100,N*30); }
nums[N-2] = 1000; nums[N-1] = 2000;

printf("%d Randoms with outliers:\n", N);
printf("No commas  : ");   for(i=0;i<N;i++) { printf("%d ", nums[i]); }
printf("\nWith commas: "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
qsort(nums,N,sizeof(int),compareint);
printf("\nSorted     : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
printf("\nSorted     : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
}


//generate consecutive numbers 1 to N
if(strcmp(argv[2],"-c") == 0) {
for(i=0;i<N;i++) { nums[i] = i + 1; }

printf("%d Consecutive:\n", N);
printf("No commas     : ");   for(i=0;i<N;i++) { printf("%d ", nums[i]); }
printf("\nWith commas   : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
}

//various
for(i=0;i<N;i++) {sumN += nums[i];}
double min = nums[0], max = nums[N-1];

//calc descriptive stats
double mean = sumN / (double)N;
int ucnt = 1, umaxcnt=1;
for(i = 0; i < N; i++)
{
sqrdiffmean  += pow(nums[i] - mean, 2);  // for variance and sum squares
cubediffmean += pow(nums[i] - mean, 3);  // for skewness
quaddiffmean += pow(nums[i] - mean, 4);  // for Kurtosis
meanabsdev   += fabs((nums[i] - mean));  // for mean absolute deviation
rootmeansqr  += (double)nums[i] * nums[i];  // for root mean square (double avoids int overflow)

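//mode
if(ucnt == umaxcnt && lastmode != nums[i])
{
sprintf(tmp,"%d ",nums[i]);
//append only while it fits: with no repeats every value lands here,
//which overflows mode[100] for a dataset such as 1 to 100
if(strlen(mode) + strlen(tmp) < sizeof(mode)) {strcat(mode,tmp);}
}

//compare with the next element only when there is one
if(i+1 < N && nums[i] == nums[i+1]) {ucnt++;} else {ucnt=1;}

if(ucnt>umaxcnt)
{
umaxcnt=ucnt;
memset(mode, '\0', sizeof(mode));
sprintf(tmp, "%d ", nums[i]);
strcat(mode, tmp);
lastmode = nums[i];
}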
}


// median and quartiles
// quartiles divide sorted dataset into four sections
// Q1 = median of values less than Q2
// Q2 = median of the data set
// Q3 = median of values greater than Q2
if(N % 2 == 0) {
Q2 = median = (nums[(N/2)-1] + nums[N/2]) / 2.0;
i = N/2;
if(i % 2 == 0) {
Q1 = (nums[(i/2)-1] + nums[i/2]) / 2.0;
Q3 = (nums[i + ((i-1)/2)] + nums[i+(i/2)]) / 2.0;
}
if(i % 2 != 0) {
Q1 = nums[(i-1)/2];
Q3 = nums[i + ((i-1)/2)];
}
}
if(N % 2 != 0) {
Q2 = median = nums[(N-1)/2];
i = (N-1)/2;
if(i % 2 == 0) {
Q1 = (nums[(i/2)-1] + nums[i/2]) / 2.0;
Q3 = (nums[i + (i/2)] + nums[i + (i/2) + 1]) / 2.0;
}
if(i % 2 != 0) {
Q1 = nums[(i-1)/2];
Q3 = nums[i + ((i+1)/2)];
}
}


// outliers: below Q1−1.5xIQR, or above Q3+1.5xIQR
IQR = Q3 - Q1;
char outliers[200]="", temp[10]="";
if (N > 3) {

//range for outliers
double lo = Q1 - (1.5 * IQR);
double hi = Q3 + (1.5 * IQR);

//no outliers
if ( min > lo && max < hi) {
strcat(outliers,"none      (using IQR * 1.5 rule)");
}
//at least one outlier
if ( min < lo || max > hi) {
for(i = 0; i < N; i++) {
double val = (double)nums[i];
if(val < lo || val > hi) {
snprintf(temp,sizeof(temp),"%.0f ",val);
//append only while the value and the 23-character suffix below still fit
if(strlen(outliers) + strlen(temp) + 23 < sizeof(outliers)) {strcat(outliers,temp);}
}
}
strcat(outliers," (using IQR * 1.5 rule)");
}
}


stddev   = sqrt(sqrdiffmean/N);
kurtosis = quaddiffmean / (N * pow(sqrt(sqrdiffmean/N),4));

//output
printf("\n--------------------------------------------------------------\n");
printf("Minimum            = %.0f\n", min);
printf("Maximum            = %.0f\n", max);
printf("Range              = %.0f\n", max - min);
printf("Size N             = %d\n"  , N);
printf("Sum  N             = %.0f\n", sumN);
printf("Mean μ             = %.2f\n", mean);
printf("Median             = %.1f\n", median);
if(umaxcnt > 1) {
printf("Mode(s)            = %s (%d occurrences ea)\n", mode,umaxcnt);}
if(umaxcnt < 2) {
printf("Mode(s)            = na (no repeating values)\n");}
printf("Std Dev  σ         = %.4f\n", stddev);
printf("Variance σ^2       = %.4f\n", sqrdiffmean/N);
printf("Mid Range          = %.1f\n", (max + min)/2);
printf("Quartiles");
if(N > 3) {printf("       Q1 = %.1f\n", Q1);}
if(N < 4) {printf("       Q1 = na\n");}
printf("                Q2 = %.1f      (median)\n", Q2);
if(N > 3) {printf("                Q3 = %.1f\n", Q3);}
if(N < 4) {printf("                Q3 = na\n");}
printf("IQR                = %.1f      (interquartile range)\n", IQR);
if(N > 3) {printf("Outliers           = %s\n", outliers);}
if(N < 4) {printf("Outliers           = na\n");}
printf("Sum Squares SS     = %.2f\n", sqrdiffmean);
printf("MAD                = %.4f    (mean absolute deviation)\n", meanabsdev / N);
printf("Root Mean Sqr      = %.4f\n", sqrt(rootmeansqr / N));
printf("Std Error Mean     = %.4f\n", stddev / sqrt(N));
printf("Skewness  γ1       = %.4f\n", cubediffmean / (N * pow(sqrt(sqrdiffmean/N),3)));
printf("Kurtosis  β2       = %.4f\n", kurtosis);
printf("Kurtosis Excess α4 = %.4f\n", kurtosis - 3);
printf("CV                 = %.6f  (coefficient of variation)\n", sqrt(sqrdiffmean/N) / mean);
printf("RSD                = %.4f%%  (relative std deviation)\n", 100 * (sqrt(sqrdiffmean/N) / mean));
printf("--------------------------------------------------------------\n");
printf("Check results against\n");
printf("https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php");
printf("\n\n");
//free(nums);
return(0);
}
=========================================================================

> Since you did not initialize the array
> outliers, you have no idea what indeterminate value any element of the
> array contains when your program begins execution.  Apparently some of
> them are non-zero.  The fact that the first 40 are zero and the
> remaining non-zero is merely an artifact of how your system builds
> this particular program with that particular set of compile and link
> options.  Change anything and you could see completely different
> behavior, or not.
>
> I don't use gcc but, in debug mode, some compilers will put
> recognizable "garbage values" in uninitialized variables so you can
> spot the condition more easily.
>
> In any case, the C language does not prevent you from shooting
> yourself in the foot if you choose to.  Evaluating an indeterminate
> value is one fairly common way to do this.
 
