Subject : Re: DRAM accommodations
From : blockedofcourse (at) *nospam* foo.invalid (Don Y)
Newsgroups : sci.electronics.design
Date : 06. Sep 2024, 20:31:48
Organization : A noiseless patient Spider
Message-ID : <vbflbe$tlhp$7@dont-email.me>
References : 1
User-Agent : Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.2.2
On 9/5/2024 3:54 PM, Don Y wrote:
> Given the high rate of memory errors in DRAM, what steps
> are folks taking to mitigate the effects of these?
> Or, is ignorance truly bliss? <frown>

From discussions with colleagues, apparently, adding (external) ECC to
most MCUs is simply not possible; too much of the memory and DRAM
controller logic is built in (unlike older multi-chip microprocessors).
There's no easy way to generate a bus fault to rerun the bus cycle
or delay for the write-after-read correction.

And, among those devices that *do* support ECC, it's just a conventional
SECDED implementation. So, a fair number of uncorrectable errors (UCEs)
will plague any design with an appreciable amount of DRAM (can you even
BUY *small* amounts of DRAM??)
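
To make the SECDED limitation concrete, here is a toy sketch of an
extended Hamming (8,4) code. It is NOT what any particular controller
implements (DRAM ECC typically protects 64 data bits with 8 check bits),
but the behaviour is the same in kind: one flipped bit is corrected, two
are detected but not corrected, and three can be "corrected" into the
wrong data without any complaint. The function names are my own.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4,5,6,7
    code[0] = sum(code[1:]) & 1             # makes total parity even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    overall = sum(code) & 1             # 1 means an odd number of bits flipped
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: the decoder ASSUMES exactly one and fixes it
        # (syndrome 0 here means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: a double-bit error.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Flip three bits of a codeword and decode() still reports "corrected",
but hands back a different nibble. That is the case ECC status bits
can never tell you about.
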
For devices with PMMUs, it's possible to address the UCEs -- sort of.
But, this places an additional burden on the software and raises
the problem of "If you are getting UCEs, how sure are you that
undetected CEs aren't slipping through??" (again, you can only
detect the UCEs via an explicit effort so you pay the fee and take
your chances!)
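
One possible reading of "explicit effort" is a software scrubber. The
sketch below is only a host-side simulation under my own assumptions:
the region is treated as immutable (code or constants), a known-good
copy exists somewhere (flash, backing store), and the page size is 4 KiB.
snapshot(), scrub() and the `golden` buffer are hypothetical names, not
any vendor's API. Writable data would need a different scheme.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size; use whatever the PMMU actually maps


def snapshot(memory, page_size=PAGE_SIZE):
    """Record one CRC-32 per page of a region that is not expected to change."""
    return [zlib.crc32(memory[off:off + page_size])
            for off in range(0, len(memory), page_size)]


def scrub(memory, crcs, golden, page_size=PAGE_SIZE):
    """Re-check every page against its recorded CRC.

    Pages that no longer match are reloaded from the golden copy.
    Returns the indices of the pages that were repaired.
    """
    repaired = []
    for idx, expected in enumerate(crcs):
        off = idx * page_size
        if zlib.crc32(memory[off:off + page_size]) != expected:
            memory[off:off + page_size] = golden[off:off + page_size]
            repaired.append(idx)
    return repaired
```

Note what this does and doesn't buy you: an error is only noticed the
next time the scrubber reaches that page, so anything that consumed the
bad data in the meantime has already done so.
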
For devices without PMMUs, you have to rely on POST or BIST. And,
*hope* that everything works in the periods between (restart often! :> )

Back-of-the-napkin figures suggest many errors are (silently!) encountered
in an 8-hour shift. For XIP implementations, it's mainly data that is at
risk (though that can also include control flow information from, e.g.,
the pushdown stack). For implementations that load their application
into DRAM, the code is suspect as well as the data!
[Which is likely to cause more detectable/undetectable problems?]
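
For anyone who wants to redo the napkin arithmetic with their own
numbers, a minimal sketch follows. It assumes the error rate is given in
FIT per Mbit (FIT = failures per 10^9 device-hours), that errors scale
linearly with capacity and time, and that arrivals are Poisson. The rate
itself is an input you must supply; I'm not asserting any particular
value here.

```python
import math


def expected_errors(fit_per_mbit, capacity_mbytes, hours):
    """Expected number of bit errors over the interval.

    fit_per_mbit    -- failures per 1e9 hours per Mbit (caller-supplied)
    capacity_mbytes -- DRAM size in megabytes
    hours           -- length of the interval, e.g. 8 for one shift
    """
    mbits = capacity_mbytes * 8
    return fit_per_mbit * mbits * hours / 1e9


def prob_at_least_one(expected):
    """Chance of one or more errors, assuming Poisson-distributed arrivals."""
    return 1.0 - math.exp(-expected)
```

Call it as expected_errors(rate, 512, 8) with whatever rate you trust
for your parts and environment. Because the result is linear in both
capacity and rate, a 10x disagreement about the rate is a 10x
disagreement about the answer.
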