On 9/15/2024 2:40 PM, David Brown wrote:
You have tried it, I have not, so I will take your word for it. Perhaps those I heard of using it were, as you say, compiling in C++ mode - my understanding is that this is very common with MSVC.
On 14/09/2024 08:34, BGB wrote:
Go and try to write C with variables not declared at the start of a block in VS2008 or similar and see how far you get...
On 9/13/2024 10:30 AM, David Brown wrote:
On 12/09/2024 23:14, BGB wrote:
On 9/12/2024 9:18 AM, David Brown wrote:
On 11/09/2024 20:51, BGB wrote:
On 9/11/2024 5:38 AM, Anton Ertl wrote:
Josh Vanderhoof <x@y.z> writes:
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>
<snip lots>
>>> Though, it generally takes a few years before new features become usable.
Like, it is only in recent years that it has become "safe" to use most parts of C99.
>
Most of the commonly used parts of C99 have been "safe" to use for 20 years. There were a few bits that MSVC did not implement until relatively recently, but I think even they have caught up now.
>
Until VS2013, the most one could really use was:
// comments
long long
Otherwise, it was basically C90.
'stdint.h'? Nope.
Ability to declare variables wherever? Nope.
...
Nonsense.
>
MS basically gave up on C and concentrated on C++ (then later C# and other languages). Their C compiler gained the parts of C99 that were in common with C++ - and anyway, most people (that I have heard of) using MSVC for C programming actually use the C++ compiler but stick approximately to a C subset. And this has been the case for a /long/ time - long before 2013.
>
While it may work in C++ mode, it did not work in C mode.
IIRC, the ability to declare variables wherever got added in VS2013.
<stdint.h> became part of C++ in C++11, but most C and C++ compilers have had it since shortly after C99 came out, even if they did not support much more of C99.
Looks like 'stdint.h' got added in VS2010.
It also depends on what one considers optimizing.
Yes, that's a fair point. As far as the C language is concerned, there's no such thing - any generated code that gives the same (or equally valid) observable behaviour is simply an alternative output for the compiler. But it generally means that the compiler makes more than a minimal effort to generate more efficient results.
Whether or not the target/compiler allows misaligned memory access;
If set, one may use misaligned access.
Why would you need that? Any decent compiler will know what is allowed for the target (perhaps partly on the basis of compiler flags), and will generate the best allowed code for accesses like foo3() above.
>
Imagine you have compilers that are smart enough to turn "memcpy()" into a load and store, but not smart enough to optimize away the memory accesses, or fully optimize away the wrapper functions...
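The memcpy-into-load/store pattern under discussion looks roughly like this; a minimal sketch, where the helper name is mine and not from the thread:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the idiom: a constant-size memcpy through a small wrapper.
   A compiler that recognises the pattern lowers this to a single
   (possibly misaligned) 64-bit load; a naive one emits a library call
   and a stack round-trip. The helper name is hypothetical. */
static inline uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);  /* no alignment requirement on p */
    return v;
}
```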
>
Why would I do that? If I want to have efficient object code, I use a good compiler. Under what realistic circumstances would you need to have highly efficient results but be unable to use a good optimising compiler? Compilers have been inlining code for 30 years at least (that's when I first saw it) - this is not something new and rare.
>
Say, you are using a target where you can't use GCC or similar.
Which target would that be? Excluding personal projects, some very niche devices, and long-outdated small CISC chips, there really aren't many devices that don't have a GCC and clang port. Of course there /are/ processors that gcc does not support, but almost nobody writes code that has to be portable to such devices.
>
And as for optimising compilers, I used at least two different optimising compilers in the mid nineties that inlined code automatically, before using gcc. (I can't remember if they inlined memcpy - it was a long time ago!). Optimising compilers are not a new concept, and are not limited to gcc and clang.
>
But, like:
Allocates variables into registers;
Evaluates expressions involving constants;
Turns "memcpy()" into inlined loads/stores in some cases;
Essentially treating it like a builtin function.
...
Well, at least BGBCC does this much.
Very good.
Things it doesn't do though:
Loop unrolling;
Inline functions;
...
Loop unrolling can be difficult in a compiler - it's also not always a good thing in the end (cache arrangements can sometimes mean a real loop is faster than an unrolled loop).
Inlining small functions is a /very/ useful optimisation, IMHO, especially when it happens before other optimisations like constant propagation.
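The inlining-then-constant-propagation point can be made concrete; a small sketch with hypothetical function names:

```c
/* Sketch: why inlining before constant propagation pays off. Once
   extract_field() is inlined into demo(), the shift amount 4 is a
   compile-time constant, and (x >> 4) & 15 can fold to a single
   bitfield-extract instruction on many targets. */
static inline int extract_field(int x, int shift)
{
    return (x >> shift) & 15;
}

static int demo(int x)
{
    return extract_field(x, 4);  /* inlining makes 'shift' a constant */
}
```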
There is a partial feature to cache member loads and array loads within a basic block, but it will flush any such cached values whenever a memory store happens.
Slow and always correct is better than fast and sometimes wrong!
Say:
i=foo->bar->x + foo->bar->y;
Will cache and reuse the first foo->bar.
But, if you do:
*ptr=0;
Or:
foo->z=3;
It will flush any memory of the cached values (unless the pointers are 'restrict').
There is an option to disable this caching though (at which point it will always do each member load). But, unlike TBAA, this optimization is less prone to break stuff.
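The caching and flushing described above corresponds to what a careful programmer would otherwise do by hand; a hedged sketch with made-up struct types:

```c
/* Hypothetical types illustrating the basic-block load caching
   described above. In sum_cached() the compiler may load foo->bar once
   and reuse it; in sum_flushed() the store through 'ptr' forces the
   cached value to be discarded and foo->bar to be reloaded, unless
   restrict qualifiers rule out aliasing. */
struct Bar { int x, y; };
struct Foo { struct Bar *bar; int z; };

static int sum_cached(struct Foo *foo)
{
    return foo->bar->x + foo->bar->y;  /* foo->bar loaded once, reused */
}

static int sum_flushed(struct Foo *foo, int *ptr)
{
    int a = foo->bar->x;
    *ptr = 0;                /* store: cached foo->bar must be flushed */
    return a + foo->bar->y;  /* foo->bar reloaded here */
}
```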
It also has a special feature that small leaf functions which can fit entirely in scratch registers may skip creation of a stack frame.
But, I can note that even with these limitations, BGBCC+BJX2 still seems to be beating RV64G + "GCC -O3" in terms of performance in my tests (well, mostly because a clever compiler can't beat ISA limitations).
Hitachi did release an ISA spec for SH-5 at least (and it might have worked OK, if Renesas had pushed "upwards" rather than focusing almost exclusively on the small embedded / microcontroller space).
Pushing upwards would have been a waste of money.
Say:
BJX2, haven't ported GCC as it looks like a pain;
Also GCC is big and slow to recompile.
>
6502 and 65C816, because these are old and probably not worth the effort from GCC's POV.
>
Various other obscure/niche targets.
>
>
Say, SH-5, which never saw a production run (it was a 64-bit successor to SH-4), but seemingly around the time Hitachi spun out Renesas, the SH-5 essentially got canned. And, it apparently wasn't worth it for GCC to maintain a target for which there were no actual chips (by comparison, the SH-2 and SH-4 lived on a lot longer due to having niche uses).
>
It would be quite ridiculous to limit the way you write code because of possible limitations for non-existent compilers for target devices that have never been made.
>
But, at present, worrying about portability to things with non-power-of-2 integers, non-8-bit bytes, non-twos-complement arithmetic, etc., has a similar level of validity (or non-validity) to writing code for ISAs which never saw a release in "actual silicon".
Agreed. There /are/ cores that have such features, like DSPs and very specialised cores, but the code you use on them is equally specialised. You don't need to port back and forth between such cores and "normal" targets.
Can I recommend you try to implement gcc's __builtin_constant_p() function, which determines whether the result of an expression is known at compile time? (It's fine to have false negatives for complicated cases.) But it needs to be evaluated at compile time and used for dead-code elimination, otherwise there's little point.
If the compiler is naive (wrt inline memcpy):
memcpy(&v, cs, 8);
rl=(v>>4)&15;
Needs 5 instructions, but:
v=*(uint64_t *)cs;
rl=(v>>4)&15;
Uses 3 instructions.
Having the compiler turn the former into the latter is possible, but would require more complex pattern matching, and would likely need to be handled in the frontend, rather than in the function-call operation in the backend.
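The __builtin_constant_p() pattern suggested above can be sketched as follows; note this is a GCC/Clang extension, not standard C, so it is guarded here and the fallback is plain memcpy:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the __builtin_constant_p() pattern: take a fast fixed-size
   path when the copy size is a compile-time constant, fall back to a
   generic memcpy otherwise. When the predicate folds to 1, the untaken
   branch is removed by dead-code elimination, which is the whole point. */
static inline void copy_bytes(void *dst, const void *src, size_t n)
{
#if defined(__GNUC__)
    if (__builtin_constant_p(n) && n == 8) {
        uint64_t v;
        memcpy(&v, src, 8);   /* becomes one load... */
        memcpy(dst, &v, 8);   /* ...and one store */
        return;
    }
#endif
    memcpy(dst, src, n);      /* generic path */
}
```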
Tradition dictates that struct members are padded and aligned to their native alignment (usually equal to the size of the base type), unless the struct is 'packed'.
No, tradition dictates that there is a maximum to the alignment, matching the size of the architecture: 16-bit implementations rarely have any type alignment greater than 16 bits, 32-bit implementations rarely have any alignment greater than 32 bits, and 8-bit implementations rarely have any alignment greater than 8 bits.
Not necessarily; it wouldn't make sense for _Alignof to return 1 for all the basic integer types.
Of course it makes sense to do that, on targets where an alignment of 1 is safe and efficient.
>
An implementation where all structs are packed by default could have unforeseen consequences...
Yes - such as poor performance. And of course some programmers make unwarranted assumptions about alignments and paddings.
Presumably, _Alignof would give the same alignment as would appear in structs or similar.
Yes. C requires that.
But, for "minimum alignment" it may make sense to return 1 for anything that can be accessed unaligned.
The main alternatives:
Detect target architecture and "know" whether the architecture is unaligned-safe (ye olde mess of ifdefs);
Have a global preprocessor define that applies to all types, but this doesn't allow for cases where some types are unaligned-safe while others are not.
Again, I see no use for this.
Scrap all that and have functions to read or write from a given address with specified sizes, using whatever method the compiler sees as most efficient and supported by the target. Or implement memcpy() optimisations for small known sizes, and use that.
One possibility could be __minalign__(type), but (unlike doing it with preprocessor defines) one likely could not use it in preprocessor expressions.
#if __MINALIGN_LONG__==1
...
#else
...
#endif
Works, but:
#if _Alignof(long)==1
...
Poses problems, as generally the preprocessor is not able to evaluate things like this.
Probably for unaligned derefs on targets where "memcpy()" is a less desirable option (say, if it takes several additional CPU instructions).
Make a better memcpy() implementation instead.
Where, _Alignof(int32_t) will give 4, but __MINALIGN_INT32__ would give 1 if the target supports misaligned pointers.
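A sketch of how such a macro could be consumed: __MINALIGN_INT32__ is the hypothetical name from the proposal above, not a real compiler macro, so on today's compilers the portable memcpy fallback is what actually gets built.

```c
#include <stdint.h>
#include <string.h>

/* If a compiler defined __MINALIGN_INT32__ == 1 (hypothetical), the
   preprocessor would pick the direct dereference; otherwise fall back
   to the memcpy idiom, which is correct on every target. */
#if defined(__MINALIGN_INT32__) && (__MINALIGN_INT32__ == 1)
static inline int32_t read_i32(const void *p)
{
    return *(const int32_t *)p;   /* target promises this is safe */
}
#else
static inline int32_t read_i32(const void *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);      /* portable unaligned read */
    return v;
}
#endif
```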
>
The alignment of types in C is given by _Alignof. Hardware may support unaligned accesses - C does not. (By that, I mean that unaligned accesses are UB.)
>
The point of __MINALIGN_type__ would be:
If the compiler defines it, and it is defined as 1, then the compiler is telling the program that it is safe to use this type in an unaligned way.
>
For what purpose?
>
Are you saying that you have no alignment restrictions for types up to 64 bits (that is, they are placed at any address), but /do/ have alignment restrictions for 128-bit types? That would be so strange that I suspect I am misunderstanding you.
Note that a lot of what I am describing here is true of BJX2.
This also applies to targets where some types are unaligned but others are not:
Say, if all integer types 64 bits or less are unaligned, but 128-bit types are not.
>
For what purpose? And why do you want to worry about totally hypothetical systems?
>
This is all quite simple to handle - don't faff around converting pointer types unless you know exactly what you are doing, and you know it is safe to do and your alignments are correct according to the ABI requirements. A decent C compiler is not going to give you incorrect alignments unless you go out of your way to create them via explicit code (i.e., using casts).
It is also true of __m128 and similar in MSVC:
__m128 v;
v=*(__m128 *)someptr;
May explode if someptr is not 16-byte aligned, as it may emit a "MOVDQA" or similar (rather than MOVDQU).
But, in both cases, if an "int *" or "long *" is misaligned, both compilers are fine with it.
There may be other compilers in a similar camp.
I've worked with targets where unaligned access does not work - or where it is immensely slow. This is something that the compiler should get right, and the user should rely on the compiler.
But, then again, the claim that one can't cast and deref a pointer is kinda hypothetical, since on most existing targets it works without issue (except that on GCC one may also need to use 'volatile').
What you mean is that for some bad code, you have to supply these flags or you face "garbage in, garbage out". The code was already broken if these flags are needed for it to behave as the programmer intended.
I haven't seen any issues with MSVC, and this sort of code usually works as expected...
Most of this is being compiled by BGBCC for a 50 MHz CPU.
>
So, the CPU is slow and the compiler doesn't generate particularly efficient code unless one writes it in a way it can use effectively.
>
Which often means trying to write C like it was assembler and manually organizing statements to try to minimize value dependencies (often caching any values in variables, and using lots of variables).
>
>
In this case, the equivalent of "-fwrapv -fno-strict-aliasing" is the default semantics.
>
Generally, MSVC also responds well to a similar coding style as used for BGBCC (or, as it more happened, the coding styles that gave good results in MSVC also tended to work well in BGBCC).
>
Note that MSVC most certainly does /not/ work like "gcc -fwrapv" - signed integer overflow is UB in MSVC, and it generates code that assumes it never happens. There is an obscure officially undocumented (or documented unofficially, if you prefer) flag to turn off such optimisations.
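The practical difference can also be sidestepped in source: doing the arithmetic in unsigned gives defined wrap-around without relying on -fwrapv. A sketch (the conversion back to int is implementation-defined before C23, but all mainstream compilers define it as two's-complement truncation):

```c
#include <limits.h>

/* Wrapping signed add without signed-overflow UB: unsigned arithmetic
   wraps modulo 2^N by definition, so neither gcc nor MSVC may assume
   the overflow "cannot happen". The cast back to int is
   implementation-defined pre-C23, but GCC, Clang, and MSVC all define
   it as two's-complement truncation. */
static int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```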
>
Last I read about it, they had no plans to do any type-based alias analysis, but nor did they rule out the possibility in the future.
>
But, a lot of times, one has to supply these options to GCC otherwise the code will break. So, it almost makes sense to assume these semantics as a default.
In the case of BGBCC, I decided to make these semantics the default as a matter of a policy decision.
And IMHO that's a /really/ bad idea. Instead of telling users "we know you write shit code - so I'll assume your source code might be shit, even if the results are worse when you write good code", why not encourage people to write code correctly by giving them the best results for correct code? And if possible, give them tools - static and run-time - to help spot their mistakes, rather than blessing those mistakes as a new norm.
There is some talk about pointer provenance semantics for C (apparently semi-controversial), but admittedly thus far I don't fully understand the idea.
It is complicated, but has big potential for improving static analysis, run-time checkers, and code optimisations. One thing you can be sure of is that encouraging people to break the current C rules is only going to make it more likely that they will have trouble in the future.