little "dependable" I/O drift
1980, IBM STL (since renamed SVL) was bursting at the seams and they
were moving 300 people (and their 3270 terminals) from the IMS (DBMS)
group to an offsite bldg with dataprocessing service back to the STL
datacenter. They had tried "remote 3270", but found the human factors
unacceptable. I get con'ed into implementing channel-extender support
(A220, A710/A715/A720, A510/A515) ... allowing channel-attached 3270
controllers to be located at the offsite bldg, connected to mainframes
back in STL ... with no perceived difference in human factors (quarter
second or better trivial response).
https://en.wikipedia.org/wiki/Network_Systems_Corporation
https://en.wikipedia.org/wiki/HYPERchannel
STL had spread 3270 controller boxes across all the channels with 3830
disk controller boxes. It turns out the A220 mainframe channel-attach
boxes (used for the channel extender) had significantly lower channel
busy for the same amount of 3270 terminal traffic (than the directly
channel-attached 3270 controllers), and as a result throughput for the
IMS group's 168s (with NSC A220s) increased by 10-15% ... and STL
considered using the NSC HYPERchannel A220 channel-extender
configuration for all 3270 controllers (even those inside STL). NSC
tried to get IBM to release my support, but a group in POK playing with
some fiber stuff got it vetoed (concerned that if it was in the market,
it would make it harder to release their own stuff).
trivia: The vendor eventually duplicated my support, and then the 3090
product administrator tracked me down. He said that 3090 channels were
designed to have an aggregate total of 3-5 channel errors (EREP
reported) across all systems & customers over a year period, and there
were instead 20 (the extras turned out to be channel-extender support).
When I got an unrecoverable telco transmission error, I would reflect a
CSW "channel check" to the host software. I did some research and found
that if an IFCC (interface control check) was reflected instead, it
resulted in basically the same system recovery activity (and got the
vendor to change their software from "CC" to "IFCC").
I was asked to give a talk at a NASA dependable computing workshop and
used the 3090 example as part of the talk
https://web.archive.org/web/20011004023230/http://www.hdcc.cs.cmu.edu/may01/index.html
About the same time, the IBM communication group was fighting off the
release of mainframe TCP/IP ... and when that got reversed, they changed
their tactic and claimed that, since they had corporate ownership of
everything that crossed datacenter walls, TCP/IP had to be released
through them; what shipped got 44kbytes/sec aggregate while using nearly
a whole 3090 processor. I then did RFC1044 support and, in some tuning
tests at Cray Research between a Cray and an IBM 4341, got sustained
4341 channel throughput using only a modest amount of 4341 CPU
(something like a 500 times improvement in bytes moved per instruction
executed).
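(Back-of-envelope in C: the post only gives 44kbytes/sec for the base
case, so the MIPS ratings, throughput, and CPU fractions below are
illustrative placeholders, chosen just to show how a ~500x
bytes-per-instruction ratio falls out:)

#include <stdio.h>

/* bytes moved per instruction executed, given throughput and the
 * fraction of a CPU (rated in MIPS) consumed moving them */
static double bytes_per_instr(double bytes_per_sec, double mips,
                              double cpu_frac)
{
    return bytes_per_sec / (mips * 1e6 * cpu_frac);
}

int main(void)
{
    /* base mainframe TCP/IP: 44kbytes/sec, nearly a whole 3090
     * processor (the 10 MIPS rating is an assumed placeholder) */
    double base = bytes_per_instr(44e3, 10.0, 1.0);

    /* RFC1044 path: sustained 4341 channel throughput on a modest
     * CPU fraction (1mbyte/sec, 1.2 MIPS, 35% all assumed) */
    double rfc1044 = bytes_per_instr(1e6, 1.2, 0.35);

    printf("base:    %.4f bytes/instr\n", base);
    printf("rfc1044: %.2f bytes/instr\n", rfc1044);
    printf("ratio:   %.0fx\n", rfc1044 / base);   /* ~540x */
    return 0;
}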
other trivia: 1988, the IBM branch office asks me if I could help LLNL
(national lab) "standardize" some fiber stuff they were playing with,
which quickly becomes FCS (fibre-channel standard, including some stuff
I had done in 1980), initially 1gbit/sec, full-duplex, 200mbyte/sec
aggregate. Then the POK "fiber" group gets their stuff released in the
90s with ES/9000 as ESCON, when it was already obsolete at
17mbytes/sec. Then some POK engineers get involved with FCS and define
a heavy-weight protocol that drastically cuts the native throughput,
which eventually ships as FICON. The most recent public benchmark I've
found is z196 "Peak I/O" getting 2M IOPS using 104 FICON (over 104
FCS). About the same time, a FCS was announced for E5-2600 server
blades claiming over a million IOPS (two such FCS having higher
throughput than 104 FICON). Note also, IBM documents keeping SAPs
(system assist processors that do the I/O) to 70% CPU (which would be
more like 1.5M IOPS).
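(The per-channel arithmetic in a small C sketch, using only the figures
quoted above:)

#include <stdio.h>

int main(void)
{
    double z196_iops = 2e6;     /* z196 "Peak I/O" benchmark   */
    double ficon     = 104.0;   /* FICON channels in that test */
    double fcs_iops  = 1e6;     /* claimed per E5-2600-era FCS */
    double sap_cap   = 0.70;    /* documented SAP CPU ceiling  */

    printf("per FICON:   %.0f IOPS\n", z196_iops / ficon);    /* ~19,231 */
    printf("two FCS:     %.0f IOPS\n", 2 * fcs_iops);         /* beats 104 FICON */
    printf("70%% SAP cap: %.0f IOPS\n", z196_iops * sap_cap); /* ~1.4M sustained */
    return 0;
}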
after leaving IBM in the early 90s, I was brought in as a consultant to
a small client/server company; two former Oracle employees (that I had
worked with on cluster scale-up for IBM HA/CMP) were there, responsible
for something called the "commerce server" doing credit-card
transactions. The startup had also done this invention called "SSL"
they were using; it is now frequently called "electronic commerce". I
had responsibility for everything between the webservers and the
financial payment networks. I then did a talk on "Why The Internet
Wasn't Business Critical Dataprocessing" (that Postel sponsored at
ISI/USC), based on the reliability, recovery & diagnostic software,
procedures, etc. that I did for e-commerce. The payment networks had a
requirement that their trouble desks do first-level problem
determination within five minutes. Early trials had a major sports
store chain doing internet e-commerce ads during weekend national
football game half-times, and there were problems connecting to the
payment networks for credit-card transactions ... after three hrs, it
was closed as "NTF" (no trouble found).
-- virtualization experience starting Jan1968, online at home since Mar1970