Re: Cinematronics Testing (Was: Re: Speech board idea...)

From: Zonn <zonn_at_concentric.net>
Date: Fri Jun 27 1997 - 17:00:00 EDT

At 02:32 PM 6/27/97 -0500, you wrote:
>> NOOOOOOOOOO!!!!! [Hands held up in the form of a cross] I have plenty to
>> do, and for the amount of work I'd put into, I'd be able to fix all the
>> non-working CPU cards I have and still have time for a nice vacation.
>
> This is the main reason why I made this a low priority -- I fixed all my
> CPU boards!
>
>> Now you're talking about the Cinematronics Exercisor. Steve O. has one of
>> these; you'd be better off getting his working, since all the expected
>> signals are already accounted for and marked on the CPU schematics.
>
> Not quite......I've talked to Steve about the exercisor. From what I
> remember about our discussion(s), the exercisor needs to be used with a
> signature analyzer, and what it does is basically keep pumping NOPs through
> the CCPU. You look at the signatures (over time) of a certain point with
> the signature analyzer, and you eventually find the signature(s) that are
> bad. IMHO, you can do the same thing almost as well with 1 NOP and a logic
> probe.
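
Just to make the signature analyzer part concrete for anyone following
along: at each probed node the analyzer compresses the bit stream it sees
into a short signature, which you then compare against the signature
recorded from a known-good board. Something roughly like this (in Python,
and the feedback taps are just the ones the common 16-bit analyzers use --
treat them as an assumption, not a description of Steve's unit):

# Sketch of what a signature analyzer does at one probed node: clock every
# sampled bit into a feedback shift register and report the final register
# contents as the node's "signature" for that measurement window.
def node_signature(samples, taps=(6, 8, 11, 15)):
    reg = 0
    for bit in samples:
        feedback = bit
        for t in taps:                     # XOR the feedback taps into the new bit
            feedback ^= (reg >> t) & 1
        reg = ((reg << 1) | feedback) & 0xFFFF
    return reg

# While the exercisor pumps NOPs, probe a node for one window and compare
# against the signature taken from a working board (streams made up here).
good = node_signature([1, 0, 1, 1, 0, 0, 1, 0] * 64)
bad  = node_signature([1, 0, 1, 1, 0, 1, 1, 0] * 64)
print(hex(good), hex(bad), "MATCH" if good == bad else "DIFFER")

A single flipped bit anywhere in the window changes the signature, which is
why pumping the same NOP loop forever gives you something stable to compare
against.
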
>
>> If nothing works you obviously can't use software to test anything. If you
>> light up all the outputs by placing LED's on them and one doesn't work,
>> you've obviously found a bad output. Software testing gets grey between
>> those two points. I spent 2 1/2 years as a hardware test engineer (as a
>> software engineer writing hardware test code) and know it can be pretty
>> tricky at times.
>
> If nothing works you have to replace every TTL chip anyways, so what's the
> point in testing...

Very funny! :^) I obviously (at least I thought it was obvious) meant that
if your software test routine can't run well enough to indicate what
problems it might have found -- assuming it could run well enough to even
find the problems -- then from an observer's point of view nothing appears
to be working. That sentence just seemed rather long, so I shortened it to
a more cryptic one.

> Any test program methodology is based upon the assumption that you use
> what's working to test what isn't. Since you don't know what's working and
> what isn't, you have to go through and assume, each time, that something
> different is working. After you've accumulated the results of all your
> tests, you will see a pattern, depending on what is working and what isn't.

All of this, of course, assumes the test program can even run one
instruction. One CPU card I worked on had a bad clock -- hard to test with
software, etc.
>
> There are all kinds of algorithms to come up with test code nowadays, and
> that's why I wrote the Verilog model. I was going to run it (with the
> controllability and observability points that I defined) through an ATPG
> (Automatic Test Pattern Generator) and create ROM code based upon the
> vectors that get returned. The ATPG will also tell you what sort of fault
> coverage you get based upon those vectors. This is the only time-effective
> way to test VLSI circuits (which I design for a living), but there is no
> reason why it should only work on VLSI stuff. ATPG should work on any
> [known] arbitrary network of combinational and sequential logic whose
> controllability and observability points are defined. The CCPU board meets
> all those criteria, and the controllability and observability points are
> actually pretty good (some were created by Cinematronics themselves, for
> the very purpose of testing the CCPU board with their exercisor.)

I think this is an excellent idea! The analogy between this and VLSI is a
very good one: if you can just imagine this circuit much smaller, you would
have a C-CPU on a chip. The C-CPU is much like taking a simplified version
of, oh, a PIC processor and blowing it up to a board-level design.
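
To put a picture to the ATPG idea for anyone who hasn't run into it: the
tool takes a model of the network, injects one stuck-at fault at a time,
applies candidate vectors, and counts which faults become visible at an
observation point -- detected faults over total faults is your coverage. A
toy Python version (the little three-gate circuit is invented purely for
illustration; it has nothing to do with the actual CCPU netlist):

from itertools import product

def circuit(a, b, c, fault=None):
    """y = (a AND b) OR (NOT c).  'fault' is (net_name, stuck_value) or None."""
    def net(name, value):                       # let a fault override one net
        return fault[1] if fault and fault[0] == name else value
    n1 = net('n1', a & b)
    n2 = net('n2', 1 - c)
    return net('y', n1 | n2)

vectors = list(product((0, 1), repeat=3))       # exhaustive here; ATPG picks fewer
faults = [(n, v) for n in ('n1', 'n2', 'y') for v in (0, 1)]

detected = set()
for vec in vectors:
    good = circuit(*vec)                        # fault-free response
    for f in faults:
        if circuit(*vec, fault=f) != good:      # fault visible at the output
            detected.add(f)

print("fault coverage: %d/%d = %.0f%%"
      % (len(detected), len(faults), 100.0 * len(detected) / len(faults)))

Scale the same bookkeeping up to the Verilog model and the DIP shunt points
and you get the ROM code plus a number telling you how much of the board it
actually exercises.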

>
>> And just to be disagreeable (since I have no intention of verifying this ;^)
>> I think the 50% is a bit too high. There are too many parts on the board
>> that, if they were to die, would bring all software execution to a halt.
>> I've fixed a few of these boards, and the parts I've had to replace
>> (outside of I/O) would have kept even the first instruction from executing
>> properly.
>
> You can't reliably test any complete system (using its defined inputs and
> outputs) with any reasonable amount of fault coverage -- you need to break
> it up by defining several [internal] controllability and observability
> points. Thus, you can put some vectors on the "real" inputs of the system,
> and observe them at some intermediate point before the output, where
> you've "broken the loop." Then, on the other side of the break, you put a
> new set of vectors in and observe them at some other observability point,
> etc. If all but one of those sets of vectors pass, you've narrowed down
> the cause of the problem (i.e. you don't have to bother testing ANYTHING
> in those other areas.) Am I making sense?

Perfectly.
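
For anyone else following the thread, the narrowing-down step is just this
(the section names are hypothetical -- they're not the real partition of
the board):

# If the board is treated as a chain of sections with an observation point
# after each, the first section whose vectors fail is where to start
# probing; everything upstream of it has already been exonerated.
sections = ["instruction fetch", "opcode decode", "ALU / accumulator",
            "vector generator output"]

def diagnose(results):
    # results[i] is True if section i's vectors matched at its observation point
    for name, passed in zip(sections, results):
        if not passed:
            return "start probing in: " + name
    return "all sections passed"

print(diagnose([True, True, False, False]))   # -> start probing in: ALU / accumulator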

>
> If you're still feeling disagreeable, I LOVE talking about this stuff.

The only thing I disagreed on was the 50 percent instruction count for what
I've found to be typical failures on the board. A single PROM failure will
always take out more than 50% of the instructions; they interact so closely
when decoding an instruction that my guess would be closer to 90%, depending
on the PROM. I can think of two of the PROMs that, if they were to fail,
would take out 100% of the instruction set. But I think we're talking about
different things when we use these percentages...
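
Here's the kind of arithmetic I mean, with a made-up decode map (the PROM
designators and the opcode groupings below are invented for the example --
they are NOT the real CCPU decode):

# Which PROMs participate in decoding each group of opcodes (hypothetical).
opcode_proms = {
    "load/store": {"E5", "E6"},
    "arithmetic": {"E5", "E6", "E7"},
    "branches":   {"E5", "E7"},
    "I/O":        {"E6"},
}

def survivors(dead_prom):
    # groups that can still decode with one PROM dead, and the percentage
    ok = [g for g, proms in opcode_proms.items() if dead_prom not in proms]
    return ok, 100.0 * len(ok) / len(opcode_proms)

for prom in ("E5", "E6", "E7"):
    ok, pct = survivors(prom)
    print("%s dead: %3.0f%% of opcode groups survive (%s)" % (prom, pct, ok or "none"))

With a decode that tangled, the fraction of instructions still working
collapses pretty fast, which is all I meant by the 90 to 100% figures.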

> I'm going back to school at the end of August to get a Ph.D. in Electrical
> Engineering, most likely specializing in, guess what, VLSI test methodology
> (the Prof. I'm most likely studying under has done a lot of work with
> Built-In Self-Test, etc...) In modern-day VLSI design, any sort of coverage
> less than 90 or even 95% is entirely unacceptable, so there are all sorts
> of sophisticated tools around for coming up with these vectors. The only
> difference between the testing work that I do and the CCPU board is that I
> have the luxury of setting my controllability and observability points
> wherever I want to (i.e. the best possible places.) Since Cinematronics
> DID define controllability and observability points (those DIP shunts),
> I'm assuming that they were as well thought out......I could easily be
> wrong.

I think the only misunderstanding has been what we considered test programs.
My definition was: write some software that, by toggling an output line,
could indicate which RAM chip has failed its test. Then extend this
methodology beyond RAM, using a more limited instruction set, until there
was simply no way for the software to run well enough to indicate errors. I
believe this sort of software would be very limited in finding hardware
problems on the CPU card. This is based on the problems I've found in the
past. With the exception of I/O, I have yet to fix a CPU card whose problem
could have been found by software I'd written.
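
Here's roughly what I had in mind, sketched in Python rather than anything
the CCPU could actually run (the chip count, word count, and the spare
output line are stand-ins, not the real hardware):

# Walk a few patterns through each RAM chip; on a failure, blink an output
# line a number of times that identifies the bad chip.
RAM_CHIPS = 2                 # assumed for the sketch
WORDS_PER_CHIP = 256

ram = [[0] * WORDS_PER_CHIP for _ in range(RAM_CHIPS)]   # stand-in for real RAM

def pulse_output(times):
    # stand-in for toggling a spare output line (coin counter, LED, whatever)
    print("output line pulsed %d time(s)" % times)

def test_ram():
    for chip in range(RAM_CHIPS):
        for pattern in (0x00, 0xFF, 0xAA, 0x55):
            for addr in range(WORDS_PER_CHIP):
                ram[chip][addr] = pattern
            for addr in range(WORDS_PER_CHIP):
                if ram[chip][addr] != pattern:
                    pulse_output(chip + 1)    # blink count names the bad chip
                    return False
    return True

print("RAM test", "passed" if test_ram() else "failed")

The catch is that to get even this far the clock, the program counter, the
instruction decode, and at least part of the data path already have to work,
which is exactly the limitation I was getting at.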

Where this differs from what you describe is your probing of "vectors" to
allow you to observe failure modes that software alone could not detect.
I'm assuming that to "observe" these points you will need more than a logic
probe with an LED that lights up, and to place data on the "real" inputs,
you're going to need more than a 4.7k resistor pulled up to +5V.

Since I think we were also thinking of different things when we spoke of
percentage of coverage, I'll bet that, using your definition, you can get
pretty damn close to 100% coverage by choosing your vectors properly.
>
> I'm getting excited about this project again. Anybody got a broken board
> to sell? ;)

Not to sell!

-Zonn
Received on Fri Jun 27 14:00:22 1997
