Subject : Re: how cast works?
From : bc (at) *nospam* freeuk.com (Bart)
Newsgroups : comp.lang.c
Date : 08. Aug 2024, 18:29:40
Organisation : A noiseless patient Spider
Message-ID : <v92va5$4msg$1@dont-email.me>
References : 1 2 3 4 5
User-Agent : Mozilla Thunderbird
On 08/08/2024 17:32, Michael S wrote:
> On Thu, 8 Aug 2024 14:23:44 +0100
> Bart <bc@freeuk.com> wrote:
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend an opposite - use -O2 so the cast that does nothing
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }

The OP is curious about what's involved when a conversion is done. Hiding or eliminating code isn't helpful in that case, and the results can also be misleading. Take this example:
void fred(void) {
    _Bool b;
    int i;
    i = b;
}
Unoptimised, it generates this code:
    push    rbp
    mov     rbp, rsp
    mov     al, byte ptr [rbp - 1]
    and     al, 1
    movzx   eax, al
    mov     dword ptr [rbp - 8], eax
    pop     rbp
    ret
You can see from this that a _Bool occupies one byte; the loaded value is masked to 0/1 (so the compiler doesn't trust it to contain only 0 or 1), then it is widened to the size of an int.
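As a rough C-level model of those three instructions (load a byte, mask, zero-extend), something like the sketch below could be used. The function name bool_byte_to_int and the idea of passing the raw byte as an unsigned char are my own illustration, not anything the compiler emits or defines:

```c
/* Hypothetical helper: models the unoptimised instruction sequence
   on a raw byte, not the C semantics of converting a value to _Bool. */
int bool_byte_to_int(unsigned char raw)
{
    unsigned char masked = raw & 1;  /* and al, 1: keep only bit 0 */
    return (int)masked;              /* movzx eax, al: zero-extend to int */
}
```

Note that under this model a stray byte value of 2 would read as 0, whereas converting the int value 2 to _Bool in C gives 1. The mask only guards against garbage in the upper bits of the stored byte.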
With optimisation turned on, even at -O1, it produces this:
    ret
That strikes me as rather less enlightening!
Meanwhile your foo_b2i function contains this optimised code:
    mov     eax, edi
    ret
The masking and widening are not present. Presumably, the compiler is taking advantage of the fact that a _Bool argument will be converted and widened to int at the call site, even though the parameter type is also _Bool. So the conversion has already been done.
You will see this if you also write a call to foo_b2i() and look at the /non-elided/ code.
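A fragment along these lines could be pasted into godbolt to look at the caller's side. The wrapper name call_foo_b2i is hypothetical, and taking an int argument is just one way to force a real int-to-_Bool conversion at the call:

```c
int foo_b2i(_Bool x) { return (int)x; }

/* Hypothetical caller: the int argument must be converted to _Bool
   before the call, so any nonzero n is passed as 1. */
int call_foo_b2i(int n)
{
    return foo_b2i(n);
}
```

Whatever the generated code looks like, the C semantics mean the call returns 0 for n == 0 and 1 for every other n, so the normalising to 0/1 has to happen somewhere on the caller's side before foo_b2i sees the value.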
The unoptimised code for foo_b2i is pretty awful (it masks twice, with a pointless write to memory in between). But sometimes with gcc there is no sensible middle ground between terrible code and having most of it eliminated.
The unoptimised code from my C compiler for foo_b2i, excluding entry/exit code, is:
    movsx   eax, byte [rbp + foo_b2i.x]
My compiler assumes that a _Bool type already contains 0 or 1.