Subject: Re: "RESET"
From: '''newspam''' (at) *nospam* nonad.co.uk (Martin Brown)
Newsgroups: sci.electronics.design
Date: 28 May 2025, 09:43:00
Organization: A noiseless patient Spider
Message-ID : <1016iak$35lml$1@dont-email.me>
References : 1 2 3
User-Agent : Mozilla Thunderbird
On 25/05/2025 20:33, Joe Gwinn wrote:
On Sun, 25 May 2025 02:37:09 +0200, "Carlos E. R."
<robin_listas@es.invalid> wrote:
On 2025-05-25 00:34, Don Y wrote:
I don't quite understand the need for "reset" buttons on products.
>
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
>
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
>
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
>
Even the initial microprocessors have a reset pin. When they are powered
up, the status of the electronics is unknown, so a small time after
power up, the line is triggered by a timer (555 or whatever).
>
Then, there are many designs where you can not pull power, because there
is an unreachable battery.
>
Then, it is impossible to guarantee that the device will never find
itself in a pickle. No matter how fantastic the designers are.
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable lines was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
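
The arithmetic above can be checked with a short sketch. It assumes, as the post does, that all 20,000 IF statements are independent two-way branches; the variable names are illustrative only.

```python
import math

# Assumption taken from the post: about 20,000 independent two-way branches.
if_statements = 20_000

# Each independent IF doubles the number of paths, giving 2**if_statements.
# Convert to a power of ten: log10(2**n) = n * log10(2).
exponent = if_statements * math.log10(2)

print(f"2^{if_statements} is about 10^{exponent:.1f}")
```

The exponent comes out just above 6020, which matches the figure quoted.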
Executing each path in the code at least once is a much more tractable problem. McCabe's CCI metric will tell you how many test vectors it will take for code of a given complexity. And it should be done.
A sufficiently high CCI for a routine also means that such code is highly unlikely to be correct.
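
As a rough illustration of the metric, here is a minimal sketch that approximates McCabe's cyclomatic complexity as one plus the number of decision points. It is written in Python and analyses Python source using the standard `ast` module purely for convenience; the set of node types counted and the `SAMPLE` function are assumptions for this example, not a full implementation of any particular tool.

```python
import ast

# Node types counted as one decision point each (a simplification).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" contributes two extra branches.
            decisions += len(node.values) - 1
    return 1 + decisions

# Hypothetical routine used only to exercise the counter.
SAMPLE = '''
def classify(x):
    if x < 0:
        return "negative"
    if x == 0 or x is None:
        return "zero"
    for _ in range(3):
        pass
    return "positive"
'''

print(cyclomatic_complexity(SAMPLE))
```

The resulting number is the count of linearly independent paths through the routine, which is why it serves as a guide to how many test cases are needed, in contrast to the 2^N figure for all path combinations.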
I recall one instance on a mainframe (brand withheld to protect the guilty) where a rogue program that ran continuously was slowly using up IO handles repeatedly opening the tracker ball interface each time a new user accessed it (and never letting go).
One day after a particularly long uptime it completely ran out of IO handles. Guess what the first thing the error handler tried to do?
Yup! It tried to obtain a new IO handle to report the error!
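
One common defence against that failure mode is to reserve whatever the error path needs before anything can go wrong. The sketch below is illustrative only: `HandlePool` and `ErrorReporter` are hypothetical names, and the pool merely stands in for the mainframe's IO handle table, about which the post gives no details.

```python
class HandlePool:
    """Hypothetical fixed-size pool standing in for the system's IO handles."""

    def __init__(self, size):
        self._free = size

    def acquire(self):
        if self._free == 0:
            raise RuntimeError("out of IO handles")
        self._free -= 1
        return self._free  # handle id; the value itself is arbitrary


class ErrorReporter:
    """Takes its handle at startup so reporting never needs a new one."""

    def __init__(self, pool):
        self._handle = pool.acquire()
        self.messages = []

    def report(self, message):
        # Uses only the reserved handle; nothing is allocated on the error path.
        self.messages.append(f"[handle {self._handle}] {message}")


pool = HandlePool(size=4)
reporter = ErrorReporter(pool)   # reserve before anything can leak

try:
    while True:                  # simulate the leak: acquire, never release
        pool.acquire()
except RuntimeError as exc:
    reporter.report(str(exc))    # still works with the pool exhausted
```

The design choice is simply that the error handler must not depend on the resource whose exhaustion it is reporting.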
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
You can't test every possible combination of events, but you can make sure that each code path, when executed in at least one scenario, doesn't do anything horribly bad. A lot of faults can lurk in the rarely used error recovery code that only runs after something else has gone wrong.
Ariane 5 was an example of that sort of thing.
comp.risks is littered with them.
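
One way to exercise rarely used recovery code is fault injection: force the failure in a test so the recovery path runs at least once. This is a minimal sketch of that idea, not something described in the post; `read_config`, `failing_opener`, and the file name are hypothetical.

```python
import io

def read_config(opener, path, default):
    """Return the file contents, or `default` if the file cannot be opened."""
    try:
        with opener(path) as fh:
            return fh.read()
    except OSError:
        # Recovery path: rarely taken in normal operation, so test it on purpose.
        return default

def failing_opener(path):
    """Injected fault: fails the way open() does on an unreadable file."""
    raise OSError(f"simulated failure opening {path}")

def working_opener(path):
    """Stand-in for open() that returns fixed contents."""
    return io.StringIO("key=1")

recovered = read_config(failing_opener, "settings.ini", default="")
normal = read_config(working_opener, "settings.ini", default="")
```

Passing the opener in as a parameter is what makes the fault easy to inject; in production code the caller would pass the built-in `open`.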
-- Martin Brown