Subject: Re: Is Intel exceptionally unsuccessful as an architecture designer?
From: ldo (at) *nospam* nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.arch
Date: 22 Sep 2024, 00:50:46
Organization: A noiseless patient Spider
Message-ID: <vcnm4m$1p6a3$5@dont-email.me>
References: 1 2
User-Agent: Pan/0.160 (Toresk; )
On Fri, 20 Sep 2024 21:06 +0100 (BST), John Dallman wrote:
> All the threads are executing exactly the same instructions, on the same
> code path.
Yes, but look at the things that GPUs, for example, typically do: a large
part of their execution time is spent in pieces of code called “fragment
shaders”. In OpenGL, a “fragment” means a “pixel before compositing” --
one or more “fragments” get composited together to produce a final image
pixel. They could have just called it a “pixel in an intermediate image
buffer”, and avoided introducing yet another mysterious-sounding technical
term.
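To make “composited together” concrete, here is a minimal sketch in Python of blending several fragments into one final pixel. It assumes plain source-alpha blending, applied back to front, which is only one of the blend configurations OpenGL allows (roughly what glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) selects). The name composite_over and the sample colours are made up for illustration.

```python
def composite_over(fragments, background=(0.0, 0.0, 0.0)):
    """Blend RGBA fragments, back to front, over a background colour.

    Assumption: simple source-alpha blending; real pipelines may use
    other blend functions, or discard fragments via depth/stencil tests.
    """
    r, g, b = background
    for fr, fg, fb, alpha in fragments:
        # new = source * alpha + destination * (1 - alpha)
        r = fr * alpha + r * (1.0 - alpha)
        g = fg * alpha + g * (1.0 - alpha)
        b = fb * alpha + b * (1.0 - alpha)
    return (r, g, b)


# Two fragments landing on the same pixel: opaque red, then
# half-transparent blue drawn on top of it.
fragments = [(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 0.5)]
pixel = composite_over(fragments)
print(pixel)  # (0.5, 0.0, 0.5)
```

Each fragment here reads the current destination value and writes a new one, which is part of why the buffer traffic adds up so quickly.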
There are a lot of memory accesses involved in a typical fragment shader:
reading from texture buffers, reading/writing other image buffers. Then
you have things like stencil buffers and depth buffers, that play their
part in the computation. And geometry buffers, though these tend to be
smaller. Buffers coming out your ears, basically. So the proportion of
instructions that access memory is much higher than in a typical CPU
workload -- probably not far short of 100%, certainly in execution time.
As I recall, DRAM access involves specifying “row” and “column” addresses.
As I further recall, if the “row” address does not change from one access
to the next, then you can specify multiple successive “column”-only
addresses and do faster sequential access to the memory (until you hit the
end of the row). GPUs would take full advantage of this, and their
patterns of memory usage should suit it quite well.
On the other hand, such heavily sequential access has poor caching
behaviour: data that is streamed through once offers little reuse for a
cache to exploit.
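A small Python sketch of why a streaming pass gets so little out of a cache. It assumes an LRU cache that tracks whole cache lines, with a made-up capacity of 64 lines, and it counts accesses at line granularity, so the hits that come from several neighbouring accesses falling within one line are deliberately left out.

```python
from collections import OrderedDict


def lru_hit_rate(line_addresses, capacity):
    """Fraction of accesses that hit in a simple LRU cache of `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    for line in line_addresses:
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(line_addresses)


# Streaming: 10,000 distinct lines, each touched exactly once.
streaming = list(range(10_000))
# Small working set: the same 32 lines revisited over and over.
small_loop = [i % 32 for i in range(10_000)]

stream_rate = lru_hit_rate(streaming, capacity=64)
loop_rate = lru_hit_rate(small_loop, capacity=64)
print(stream_rate)  # 0.0
print(loop_rate)    # 0.9968
```

With no reuse, the cache never sees the same line twice, whereas the small loop misses only on its first 32 accesses.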
So you see the difference in memory behaviour between GPUs and CPUs: CPUs
have (or allow) more complex patterns of memory access, necessitating
elaborate memory controllers with multiple levels of caching to get the
necessary performance, while GPUs can make do with much simpler memory
interfaces that don’t benefit from caching.
This also complicates any ability to share memory between GPUs and CPUs.
Which brings us back to the point I made before: CPU RAM on the
motherboard is typically upgradeable, while GPU RAM comes on the same card
as the GPU, and is typically not upgradeable.