Re: Cost of handling misaligned access

Subject : Re: Cost of handling misaligned access
From : already5chosen (at) *nospam* yahoo.com (Michael S)
Newsgroups : comp.arch
Date : 23 Feb 2025, 23:08:24
Organization : A noiseless patient Spider
Message-ID : <20250224000824.00003afd@yahoo.com>
User-Agent : Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)
On Sun, 23 Feb 2025 11:13:53 -0500
EricP <ThatWouldBeTelling@thevillage.com> wrote:

BGB wrote:
On 2/22/2025 1:25 PM, Robert Finch wrote: 
On 2025-02-22 10:16 a.m., EricP wrote: 
BGB wrote: 
On 2/21/2025 1:51 PM, EricP wrote: 
>
and this does 64-bit ADD up to 428 MHz (2.3 ns) on a Virtex-6:
>
Fast and Area Efficient Adder for Wide Data in Recent Xilinx
FPGAs, 2016
http://www.diva-portal.org/smash/get/diva2:967655/FULLTEXT02.pdf
 
>
Errm, skim, this doesn't really look like something you can pull
off in normal Verilog. 
>
Well that's what I'm trying to figure out because it's not just
this paper
but a lot, like many hundreds, of papers I've read from
commercial or academic sources that seem to be able to control the
FPGA results to a fine degree.
 
 
You could invoke some of the LE's directly as primitives in
Verilog, but then one has an ugly mess that will only work on a
specific class of FPGA.
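
For readers who have not seen it, direct primitive instantiation looks roughly like the sketch below. It assumes a Xilinx part whose vendor library provides the CARRY4 primitive (7-series, Virtex-5/6); the module name add4_carry4 and the signal names are made up for illustration, and the code will not elaborate without the vendor's primitive library, which is exactly the portability cost mentioned above.

```verilog
// Hypothetical 4-bit adder slice built directly on the Xilinx CARRY4
// primitive. Only CARRY4 and its port names come from the vendor library;
// the module name and signal names are illustrative.
module add4_carry4 (
    input  wire [3:0] a,
    input  wire [3:0] b,
    input  wire       cin,
    output wire [3:0] sum,
    output wire       cout
);
    wire [3:0] prop = a ^ b;  // per-bit propagate, computed in the LUTs
    wire [3:0] co;            // carry out of each bit position

    CARRY4 u_carry4 (
        .CO     (co),     // carry outputs
        .O      (sum),    // propagate XOR incoming carry = sum bits
        .CI     (1'b0),   // cascade input from a previous CARRY4; unused here
        .CYINIT (cin),    // initial carry for the first block of a chain
        .DI     (a),      // value sent to carry out when propagate is 0
        .S      (prop)    // carry-mux select (1 = pass the incoming carry)
    );

    assign cout = co[3];
endmodule
```

A tool-neutral simulator or a non-Xilinx synthesis flow would need its own model of CARRY4, or a behavioural fallback selected with a compile-time define.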
 
Generally though, one has access in terms of said primitives,
rather than control over the logic block.
 
 
Vs, say, code that will work with Verilator, Vivado, and Quartus,
without needing to be entirely rewritten for each.
 
 
Though, that said, my design might still need some reworking to be
"effective" with Quartus or Altera hardware; or to use the
available hardware. 
 
Ok but this "portability" appears to be costing you dearly.
 
Say, rather than like on a Spartan or Artix (pure FPGA), the
Cyclone FPGA's tend to include ARM hard processors, with the FPGA
and ARM cores able to communicate over a bus. The FPGA part of the
DE10 apparently has its own RAM chip, but it is SDRAM (rather than
DDR2 or DDR3 like in a lot of the Xilinx based boards).
 
Well, apart from some low-end boards which use QSPI SRAMs (though,
having looked, a lot of these RAMs are DRAM internally, but the RAM
module has its own RAM refresh logic).
 
 
>
This can't just be left to the random luck of the wire router.
There must be something else that these commercial and academic
users are able to do to reliably optimize their design.
Maybe it's a tool only available to big bucks customers.
>
This has me curious. I'm going to keep looking around.
>
 
I am sure it can be done as I have seen a lot of papers too with
results in the hundreds of megahertz. It has got to be the manual
placement and routing that helps. The routing in my design
typically takes up about 80% of the delay. One can build circuits
up out of individual primitive gates in Verilog (or(), and(), etc)
but for behavioral purposes I do not do that, instead relying on
the tools to generate the best combinations of gates. It is a ton
of work to do everything manually. I am happy to have things work
at 40 MHz even though 200 MHz may be possible with 10x the work
put into it. I typically run behavioural code, doing things
mostly for my own edification. (I have got my memory controller
working at 200 MHz, so it is possible.)
One thing that I have found that helps is to use smaller modules
and tasks for repetitive code where possible. The tools seem to
put together a faster design if everything is smaller modules. I
ponder it may have to do with making place and route easier.
 
 
It is also possible to get higher speeds with smaller/simple
designs.
 
But, yeah, also I can note in Vivado, that the timing does tend to
be dominated more by "net delay" rather than "logic delay".
 
 
 
This is why my thoughts for a possible 75 MHz focused core would be
to drop down to 2-wide superscalar. It is more a question of what
could be done to try to leverage the higher clock-speed to an
advantage (and not lose too much performance in other areas). 
 
You are missing my point. You are trying to work around a problem with
low level module design by rearranging high level architecture
components.
 
It sounds like your ALU stage is taking about 20 ns to do an ADD
and that is having consequences that ripple through the design,
like taking an extra clock for result forwarding,
which causes performance issues when considering Compare And Branch,
and would cause a stall with back-to-back operations.
 
This goes back to module optimization where you said:
 
BGB wrote:
On 2/21/2025 1:51 PM, EricP wrote:
 
and this does 64-bit ADD up to 428 MHz (2.3 ns) on a Virtex-6:
>
Fast and Area Efficient Adder for Wide Data in Recent Xilinx
FPGAs, 2016
http://www.diva-portal.org/smash/get/diva2:967655/FULLTEXT02.pdf
>
Errm, skim, this doesn't really look like something you can pull
off in normal Verilog.
>
Generally, one doesn't have control over how the components hook
together; one can only influence what happens based on how one
writes the Verilog.
>
You can just write:
  reg [63:0] tValA;
  reg [63:0] tValB;
  reg [63:0] tValC;
  always @*
    tValC = tValA + tValB;  // synthesis infers the adder and its carry chain
>
>
But, then it spits out something with a chain of 16 CARRY4's, so
there is a fairly high latency on the high order bits of the
result. 
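
The count follows from the width: each CARRY4 covers 4 bits, so a 64-bit add needs 64 / 4 = 16 of them in series. One textbook way to shorten the chain in portable Verilog is a carry-select adder, sketched below. This is not the technique from the linked paper, the module names are hypothetical, and whether a given synthesis tool preserves the structure or produces a faster result on a given part would have to be checked in the timing report.

```verilog
// Sketch of a 64-bit carry-select adder. The upper half is computed twice
// (assuming carry-in 0 and carry-in 1) in parallel with the lower half,
// and the lower half's carry-out selects the correct upper result.
// Each 32-bit segment needs 8 CARRY4s in series instead of 16.
module add64_csel (
    input  wire [63:0] a,
    input  wire [63:0] b,
    output wire [63:0] sum
);
    wire [32:0] lo  = {1'b0, a[31:0]} + {1'b0, b[31:0]};  // bit 32 is the carry
    wire [31:0] hi0 = a[63:32] + b[63:32];                // carry-in assumed 0
    wire [31:0] hi1 = a[63:32] + b[63:32] + 32'd1;        // carry-in assumed 1
    wire [31:0] hi  = lo[32] ? hi1 : hi0;                 // late select

    assign sum = {hi, lo[31:0]};
endmodule

// Minimal self-checking testbench (simulation only).
module add64_csel_tb;
    reg  [63:0] a, b;
    wire [63:0] sum;

    add64_csel dut (.a(a), .b(b), .sum(sum));

    initial begin
        // Operands chosen so the carry crosses the 32-bit boundary.
        a = 64'h0000_0000_FFFF_FFFF;
        b = 64'd1;
        #1;
        if (sum !== a + b) $display("FAIL");
        else               $display("PASS");
        $finish;
    end
endmodule
```

The trade-off is area (the upper half is duplicated) and one extra multiplexer level after the carry chain.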
 
It looks to me that Vivado intends that after you get your basic
design working, this module optimization is *exactly* what one is
supposed to do.
 
In this case the prototype design establishes that you need multiple
64-bit adders and the generic ones synthesis spits out are slow.
So you isolate that module off, use Verilog to drive the basic LE
selections, then iterate doing relative LE placement specifiers,
route the module, and when you get the fastest 64-bit adder you can
then lock down the netlist and save the module design.
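
As a rough illustration of the first step only (isolating the module so it can be timed on its own), a wrapper along the following lines is one option. It is a sketch under assumptions: the module name add64_isolated is hypothetical, the keep_hierarchy attribute is the Vivado synthesis attribute for keeping a module from being flattened, and the placement and netlist lock-down steps described above are tool-specific and not shown.

```verilog
// Hypothetical wrapper for timing an adder in isolation. Registers on
// both sides make the adder the only logic on the register-to-register
// path, so the timing report reflects the adder alone.
(* keep_hierarchy = "yes" *)   // ask synthesis not to flatten this module
module add64_isolated (
    input  wire        clk,
    input  wire [63:0] a,
    input  wire [63:0] b,
    output reg  [63:0] sum
);
    reg [63:0] a_q, b_q;

    always @(posedge clk) begin
        a_q <= a;           // input registers: the timed path starts here
        b_q <= b;
        sum <= a_q + b_q;   // placeholder body; swap in the hand-tuned adder
    end
endmodule
```

Synthesizing this module alone against a tight clock constraint gives a quick read on how fast the adder body can go before it is dropped into the larger design.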
 
Now you have a plug-in 64-bit adder module that runs at (I don't know
the speed difference between Virtex and your Spartan-7 so wild guess)
oh, say, 4 ns, to use in multiple places: fetch, decode, ALU, AGU.
 
Then plug that into your ALU, add in the SUB, AND, OR, and XOR functions,
isolate that module, optimize placement, route, lock down netlist,
and now you have a 5 ns plug-in ALU module.
 
Doing this you build up your own IP library of optimized hardware
modules.
 
As more and more modules are optimized the system synthesis gets
faster because much of the fine grain work and routing is already
done.
 


It sounds like your first-hand FPGA design experience is VERY outdated.





