7854 Bad display board

Paul
 

So after going down the ROM rabbit hole, procuring the appropriate EPROMs & programmer (BP Microsystems EP-1, super easy to use, and available for cheap on eBay), and successfully bringing the ROM board back to life, I'm now confronted with a bad display board, as evidenced by all vertical & horizontal mode switches being illuminated. The manual states that pressing mode switches will continue the self test, but they are unresponsive. I've done the usual 're-seat everything' move, but no luck.
What are my options? Does anyone have experience troubleshooting this board? Does anyone have a parts unit they'd be willing to sell a board from?

thanks!
Paul

Dan G
 

On Sun, May 10, 2020 at 06:44 PM, Paul wrote:

I'm now confronted with a bad display board, as evidenced by all vertical & horizontal
mode switches being illuminated.
Hi Paul,

Normally, a failure of the display subsystem self-test should result in all
vertical and only ALT horizontal mode switches being illuminated.

If you are really seeing all mode lamps lit, then it would suggest that the
self-test code got stuck somehow, rather than any specific
self-test failing. Is there any other indication that makes you
suspect the display board?


dan

Paul
 

/facepalm/ I read 'ALT' as 'All.'
Okie, so back to the ROM board, it would seem!

I might just delete this post.
thanks Dan!

Dan G
 

At this point, it would be good to narrow down just where the CPU
is getting stuck. The fact that you have all lights on means that the
CPU has started self-tests, but it never finishes.

So, let's see if you are now at least passing the RAM and ROM
self-tests: remove jumper P1306 (labeled IRTC in the upper right
corner of the MPU board), and power up the scope. If the CPU
has successfully passed the ROM self-test, then it will start the
RTC test next, and by removing P1306, we will force it to fail.

If you now get CHOP and B to light up after a few seconds,
then the RTC self-test has failed as expected, and it at least
confirms that your ROM checksum tests are passing, and that
the CPU can make it through the entire RTC self-test code
sequence.

Don't forget to replace the P1306 jumper after this.


dan

Paul
 

Hey Dan -

Last night I replaced all the ROMs & EPROMs with the patched -02 version and pulled the FPLA and I'm still getting all the lights. No change if I pull the jumper. My next course of action this evening:
1) check all the supply voltages
2) Check some of the test points on the top of the ROM & MPU with a LA to see if I can tell how far along it's getting.

If I pull one of the 4 ROMs (now EPROMs), I'll just get a single, dim 'A' illuminated, so it's at least getting somewhere before failing. When I get all lights on, there's no discernible delay.
I've also got a card edge connector en route for P130 on the MPU so I can get eyes on all the address & data lines.

thanks -
Paul

Harvey White
 

Don't know if they did this for this processor, but often if a firmware test fails, there will be some sort of output somewhere identifying the test.  I don't absolutely mean the complicated ones, but some of the most basic tests.  It might just be that the processor loops on a particular address with a fail.  I've seen that, and equipment of this vintage had a lot of "startup" and "check the processor" tests that could behave like that.  IIRC, the DM5010 has a bunch.  Whether or not you can find the data and extract anything useful from it depends much on the depth of the manual.

If you have a signature analyzer, some of the same equipment (vintage wise) makes extensive use of that function.  It generally places the processor in a no-op loop and you get to see what all the addressing does.

Harvey

Dan G
 

On Mon, May 11, 2020 at 10:47 AM, Paul wrote:

Last night I replaced all the ROMs & EPROMs with the patched -02 version and
pulled the FPLA and I'm still getting all the lights. No change if I pull the
jumper.
That is a somewhat surprising result, given that the RTC self-test takes only
a small number of fairly linear CPU instructions after the ROM test, and,
with P1306 removed, cannot jump to any code that could loop indefinitely.
(That is, not until it has changed the lamp pattern. It will then loop
between 0x73A and 0x742, waiting for you to press one of the
mode switches.)
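
As a rough illustration of how a logic analyzer capture could be checked against that wait loop, here is a minimal Python sketch. The function name and the sample captures are made up for illustration; only the 0x73A to 0x742 range comes from the description above.

```python
# Hypothetical helper: decide whether a captured list of fetch addresses
# stays inside the post-failure wait loop described above.
WAIT_LOOP_START = 0x073A  # from the description above
WAIT_LOOP_END = 0x0742    # from the description above


def in_wait_loop(fetch_addresses):
    """Return True if every captured address lies within the wait loop."""
    if not fetch_addresses:
        return False  # an empty capture proves nothing
    return all(WAIT_LOOP_START <= a <= WAIT_LOOP_END for a in fetch_addresses)


# Made-up example captures (not real 7854 data).
looping = [0x073A, 0x073C, 0x0740, 0x0742, 0x073A, 0x073C]
wandering = [0x073A, 0x073C, 0x1F00, 0x0742]
print(in_wait_loop(looping))    # True
print(in_wait_loop(wandering))  # False
```

If a capture taken with the jumper removed never settles into this range, the CPU is presumably not reaching the end of the RTC self-test.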

I have a sneaking suspicion that another fault developed in the scope
since the beginning of your ROM board saga, and that if you put all the
original ROMs and EPROMs back, the scope will still just get stuck
anyway and won't give you a ROM error code any more.

Either that, or the new EPROM ICs that you are using are doing something
really strange to your address and/or data lines.

In any event, I agree that breaking out the logic analyzer is probably the
best way forward.


Good luck,
dan

Paul
 

yep - you called it! Old ROMs in, all lights on. Haven't broken out the LA yet, but scoped the test points on the ROM board, and am seeing activity on the chip select test points. It's clearly in a pretty short loop, but it's not always the same. Is there a safe way to reset the CPU? It looks like grounding TP1200-1 might do it?
All the supply rails are spot on.

Dan G
 

On Mon, May 11, 2020 at 08:50 PM, Paul wrote:

Is there a safe way to reset the CPU? It looks like
grounding TP1200-1 might do it?
Grounding TP1200-1 should be safe -- that's what the 067-0911-00
diagnostic interface does. The CPU requires the _RESET line
to be held low for at least 3 clock cycles, so you may need to use
a de-bounced switch for reliable results.
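
To put a number on that requirement, here is a small Python sketch. It assumes the 3 MHz CPU clock figure given elsewhere in this thread; the function name and the example pulse widths are only illustrative.

```python
# Minimum _RESET low time, assuming a 3 MHz CPU clock (assumption taken
# from elsewhere in this thread) and the 3-cycle requirement stated above.
CPU_CLOCK_HZ = 3_000_000
MIN_RESET_CYCLES = 3

min_reset_seconds = MIN_RESET_CYCLES / CPU_CLOCK_HZ


def reset_ok(low_time_seconds):
    """Return True if a single clean low pulse is long enough (hypothetical helper)."""
    return low_time_seconds >= min_reset_seconds


print(f"minimum low time: {min_reset_seconds * 1e6:.1f} us")  # 1.0 us
print(reset_ok(2e-3))    # a 2 ms button press: True
print(reset_ok(0.5e-6))  # a 0.5 us glitch: False
```

Any manual button press is far longer than this minimum, so the point of de-bouncing is to deliver one clean pulse rather than a burst of short ones as the contacts settle.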


dan

Paul
 

Rigged up a reset and it's helping. I won't really be able to tell what's going on at startup until the card edge connector arrives and I can look at the address lines, but I can tell by looking at the chip selects once it's up that it ends up in the same loop. I'm clocking the LA off the ROM type selector jumper, which I think is giving me a pulse for every fetch. If that's the case, the loop is around 106 instructions long.
We'll see how useful this little 338 logic analyzer ends up being - it's pretty rudimentary, and only 256 words deep, but it is rather adorable.
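
For anyone wanting to estimate a loop length from a capture like this, here is a minimal Python sketch. It assumes one captured sample per fetch (as guessed above) and a 256-sample buffer; the function name and the synthetic capture are invented for illustration and are not real 7854 data.

```python
LA_DEPTH = 256  # buffer depth of the logic analyzer, per the post above


def loop_period(samples):
    """Return the smallest p > 0 such that samples[i] == samples[i + p]
    for every valid i, or None if no such p fits twice in the capture."""
    n = len(samples)
    for p in range(1, n // 2 + 1):
        if all(samples[i] == samples[i + p] for i in range(n - p)):
            return p
    return None


# Synthetic capture: a made-up 106-step pattern repeated to fill the buffer.
pattern = list(range(106))
capture = (pattern * 3)[:LA_DEPTH]
print(loop_period(capture))  # 106
```

The search stops at half the buffer depth so that at least two full passes of the loop are visible, which means a 256-sample buffer can only confirm loops up to 128 samples long.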

Paul
 

One thing to note - All lights illuminate immediately. The manual seems to indicate that it takes a few seconds to indicate an error code, but even when I was getting the bad ROM light pattern, it was immediate.

Dan G
 

On Thu, May 14, 2020 at 11:54 AM, Paul wrote:

One thing to note - All lights illuminate immediately. The manual seems to
indicate that it takes a few seconds to indicate an error code, but even when
I was getting the bad ROM light pattern, it was immediate.
That doesn't sound right. There should be about a 3 second delay between
pressing the POWER button, and the front panel lamps lighting up. That's
because the lamps are normally explicitly enabled by the CPU.

The normal sequence goes like this:

1. Turn on AC power.
2. The _RESET line should be held low by power-up (PUP)
logic until the low voltage regulator has stabilized. The _RESET
line is also applied directly to U71/U72 pin 1 (_CLR), so
all front panel lamps should be off as long as the _RESET
line is asserted.
3. _RESET is de-asserted, allowing the CPU to start running the
firmware code.
4. The CPU first examines RAM addresses D908 and D90A. If they
contain a special bit pattern, the firmware decides that memory contents
have been preserved by external power, skips self-tests, and goes
straight to ready mode.
5. Otherwise, it turns on all front panel lamps by writing to
memory-mapped register U71/U72 at address E010,
and then starts the self-test sequence.
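
The decision made in steps 2 to 5 can be sketched roughly in Python as follows. This is only a model of the sequence as described, not the firmware: the addresses D908, D90A and E010 come from the list above, while the pattern values, the lamp value and all names are placeholders, since the real ones are not given here.

```python
RAM_FLAG_ADDRS = (0xD908, 0xD90A)  # from the sequence above
LAMP_REGISTER = 0xE010             # U71/U72, from the sequence above

# ASSUMPTION: the real "special bit pattern" is not given in this thread.
PLACEHOLDER_PATTERN = {0xD908: 0x1234, 0xD90A: 0x5678}
# ASSUMPTION: the value that lights every lamp is also a placeholder.
ALL_LAMPS_ON = 0xFFFF

IN_RESET = "held in reset, lamps off"
READY = "ready mode, self-tests skipped"
SELF_TEST = "self-test started, all lamps on"


def power_up(memory, reset_asserted):
    """Model the power-up decision; memory is a dict of address -> word."""
    if reset_asserted:
        return IN_RESET  # _RESET also clears the lamp register
    if all(memory.get(a) == PLACEHOLDER_PATTERN[a] for a in RAM_FLAG_ADDRS):
        return READY
    memory[LAMP_REGISTER] = ALL_LAMPS_ON
    return SELF_TEST


print(power_up({}, reset_asserted=True))
print(power_up(dict(PLACEHOLDER_PATTERN), reset_asserted=False))
print(power_up({}, reset_asserted=False))
```

In this model the lamps can only come on after _RESET has been released, which is why lamps lighting the instant power is applied point at the reset timing.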

If your lamps turn on immediately at power-on, then the two most likely
causes are: (1) your _RESET is being de-asserted too soon, and the
CPU starts racing before voltages have stabilized, or (2) the _RESET
line is not reaching your mode switch board A2.

If you still have your reset button attached to TP1200-1, you can try
pressing it. All front panel lamps should be off as long as you hold down
the button. If that works, then I would look at whether your _RESET
line is being de-asserted too soon.


dan

Paul
 

Interesting... The front panel lights extinguish when I press reset, and illuminate immediately upon releasing the button. The 'Busy' LED stays illuminated at all times.
I just got the edge connector (and printed manual) today, so I can start looking at what's going on in the address bus.
thanks
Paul

Dan G
 

On Thu, May 14, 2020 at 03:00 PM, Paul wrote:

Interesting... The front panel lights extinguish when I press reset, and
illuminate immediately upon releasing the button. The 'Busy' LED stays
illuminated at all times.
This confirms two things:

1) your A2 board is receiving the _RESET signal correctly
2) _RESET is de-asserted prematurely at power-on

At this point, I would deal with 2) first, before even looking at
what the CPU is doing, since if it starts running too early, it's
mostly a case of garbage-in-garbage-out.

It seems there is something wrong either with the PUP signal
from the low voltage regulator board, or else the little bit of
glue logic on the MPU board that takes the PUP signal from
pin 3 of the motherboard edge connector, and feeds it to
U920 pin 5.

You can test a portion of the glue logic by installing a jumper
across P101 on the MPU board. With it in place, the power-up
delay should be maintained indefinitely, and your mode switch
lamps should never light up at all.


dan

Paul
 

It seems there is something wrong either with the PUP signal
from the low voltage regulator board,
On power-up, PUP goes high immediately and stays high... OK, making my way through the PS docs... damn, I was hoping this wouldn't take me back in there.

You can test a portion of the glue logic by installing a jumper
across P101 on the MPU board. With it in place, the power-up
delay should be maintained indefinitely, and your mode switch
lamps should never light up at all.
Yep, that works as expected

Dan G
 

On Thu, May 14, 2020 at 08:38 PM, Paul wrote:

On power-up, PUP goes high immediately and stays high... OK, making my
way through the PS docs... damn, I was hoping this wouldn't take me back in
there.
PUP shouldn't go high immediately. That is where the 2-3 second delay
is supposed to come from: it should start low after AC power is first applied,
then after a few seconds it should go high and stay high. This circuit is
described on page 2-50 of the service manual.


dan

Paul
 

That delay is generated by C172. Maybe one of the gates in U179 is stuck high? In either case, that shouldn't impact the ability to troubleshoot the boot issue if I've got the reset wired up (right?). I'm now using the TESTPUP signal on P130 to reset. I think I'm at a point where I can at least view the address lines, 256 bytes at a time, Triggering off Address 0000 && TESTPUP = 0. The first instruction is 6D8F, but I'm not sure where to start looking next.
thanks,
Paul

Dan G
 

On Sat, May 16, 2020 at 12:54 PM, Paul wrote:

I think I'm at a point where I can at least view the
address lines, 256 bytes at a time, Triggering off Address 0000 && TESTPUP =
0. The first instruction is 6D8F, but I'm not sure where to start looking
next.
Hi Paul,

After _RESET is de-asserted, the CPU should immediately read the following
addresses in this sequence:

Address, Data, Explanation:
0000, DB00, read and set the workspace pointer
0002, 04E2, read the address of the first instruction, then jump there
04E2, 0300
04E4, 0007, read and execute the first 2-word instruction: LIMI 7
04E6, 04E0
04E8, DA14, read and execute the next 2-word instruction: CLR @>DA14

It should then continue to run code in the lower half of the ROM address space,
as it executes the self-tests. Interrupts should not be affecting the CPU until
it starts running the RTC self-test (so, after the ROM checksum test).
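
A capture could be compared against that table with something like the Python sketch below. It assumes the capture has already been reduced to (address, data) pairs for the ROM reads listed above; a raw bus capture may contain other memory cycles in between, which this sketch does not handle. The function name is invented for illustration.

```python
# Expected ROM reads after reset, copied from the table above.
EXPECTED_BOOT = [
    (0x0000, 0xDB00),  # workspace pointer
    (0x0002, 0x04E2),  # address of the first instruction
    (0x04E2, 0x0300),  # LIMI ...
    (0x04E4, 0x0007),  # ... 7
    (0x04E6, 0x04E0),  # CLR ...
    (0x04E8, 0xDA14),  # ... @>DA14
]


def first_mismatch(capture):
    """Return the index of the first pair that differs from EXPECTED_BOOT
    (or is missing), or None if all six expected pairs are present."""
    for i, expected in enumerate(EXPECTED_BOOT):
        if i >= len(capture) or capture[i] != expected:
            return i
    return None


# Made-up captures for illustration.
print(first_mismatch(EXPECTED_BOOT))                       # None
print(first_mismatch([(0x0000, 0xDB00), (0x0002, 0x6D8F)]))  # 1
```

The index of the first mismatch tells you which row of the table to start probing at.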

I've searched the entire ROM image, and 6D8F is not in the ROM. Where are
you reading this value? It also makes no sense as an address, since it is odd,
and all memory access is on even addresses.
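
Both checks are easy to reproduce on a ROM dump. The Python sketch below assumes a flat binary image with the high byte of each word first; how the real 7854 ROMs are split across chips is not covered here, and the four-byte image is just a stand-in built from the two reset vector words in the table above.

```python
def find_word(image: bytes, word: int):
    """Return the even byte offsets at which a 16-bit big-endian word occurs."""
    target = word.to_bytes(2, "big")
    return [off for off in range(0, len(image) - 1, 2)
            if image[off:off + 2] == target]


def is_word_address(addr: int) -> bool:
    """Memory is accessed on even addresses only, as noted above."""
    return addr % 2 == 0


# Stand-in image (assumption): only the two reset vector words.
image = bytes.fromhex("DB0004E2")
print(find_word(image, 0x04E2))   # [2]
print(find_word(image, 0x6D8F))   # []
print(is_word_address(0x6D8F))    # False
```

Stepping through the image two bytes at a time avoids false hits that straddle two adjacent words.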

If you are not seeing the above sequence of address fetches, then there is
something very wrong, or perhaps you are not clocking your LA on the
correct clock phase (remember, there are 4 distinct phases of the 3 MHz
CPU clock generated by U920). You may also want to get your LA to acquire
the state of the IAQ signal coming from the CPU (TP500-3) -- it tells you
when the CPU is fetching an instruction. You also need to acquire the
_MEMEN (TP1200-5) line. The CPU uses the address lines for things other
than memory when this line is high. You will need a 36 channel (or more)
LA to fully see what the CPU is doing.

With your PUP circuit broken, you will need to simulate it manually. I would suggest
adding a de-bounced switch to P101: hold it down while the scope is powered
off, then turn the scope on, then release the switch after a few seconds.
I don't have access to the internal design of the TMS9900, so I cannot tell
you the exact effects of attempting to run it before all voltages have stabilized.
This CPU needs -5V, +5V and +12V. It is possible that an internal latch-up
may occur if the CPU starts running before all three voltages are stable, in such a
way that _RESET won't be able to recover from it. I suspect that we would need
someone from TI's TMS9900 design team (and an amazing memory!) to clarify
this edge case. It is probably best to simply avoid it.

In any case, I encourage you to address the PUP circuit first. The P101 work-around
will simulate it for the MPU board, but the RAM board (and possibly others -- I didn't
check all digital circuits in the 7854) also depend on this signal, and manual resets
will not help there. Also, make sure all four clock phases are generated correctly
with the required inter-phase relationships - a two channel scope should be enough
to do this quickly.


Good luck,
dan

Paul
 

I've searched the entire ROM image, and 6D8F is not in the ROM. Where are
you reading this value? It also makes no sense as an address, since it is odd,
and all memory access is on even addresses.
Right - that sounds like a set-up error on my part. I didn't think I even had the 0 bit set up in that group, but I must have made a mistake. I'm looking at addresses now, since I've only got 16 channels of pods.

If you are not seeing the above sequence of address fetches, then there is
something very wrong, or perhaps you are not clocking your LA on the
correct clock phase (remember, there are 4 distinct phases of the 3 MHz
CPU clock generated by U920).
I was clocking on the 3rd or 4th phase, as those are what's on the edge connector

You will need a 36 channel (or more) LA to fully see what the CPU is doing.
Yeah, I'm thinking this 338 is the wrong tool for the job. Time to level up to something more substantial.

In any case, I encourage you to address the PUP circuit first.
Good call, will do.

BTW, is there a program listing or disassembly somewhere that you were referring to, or are you just intimately familiar with this beast?
thanks again!
Paul

Dan G
 

On Mon, May 18, 2020 at 08:03 PM, Paul wrote:

I was clocking on the 3rd or 4th phase, as those are what's on the edge
connector
Hi Paul,

I recommend taking a look at the "TMS 9900 Memory Bus Timing"
diagram in the TMS 9900 Microprocessor Data Manual, if you
haven't seen it already. It should give you an idea of what CPU signals
you need to acquire with your LA, and how to interpret them.

I don't have the 338 LA, but it appears to have 32 channels, based on
its User's Manual. For initial diagnostics, I think you only need to monitor
the 15 address lines, plus _MEMEN, IAQ, WAIT, HOLDA, DBIN, _WE,
READY and _RESET. You can use whatever channels remain to get a
partial view of the data lines. While not perfect, that should be sufficient
for this task.
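
The channel budget works out as in the short Python tally below. The 32-channel figure is the one read from the User's Manual above and should be treated as an assumption; the signal names are the ones just listed.

```python
LA_CHANNELS = 32  # assumption: figure read from the 338 User's Manual
ADDRESS_LINES = 15
CONTROL_SIGNALS = ["_MEMEN", "IAQ", "WAIT", "HOLDA",
                   "DBIN", "_WE", "READY", "_RESET"]

used = ADDRESS_LINES + len(CONTROL_SIGNALS)
spare = LA_CHANNELS - used

print(f"channels used for address and control: {used}")  # 23
print(f"channels left for data lines: {spare}")          # 9
```

With fewer pods installed, the same tally shows which signals have to be dropped first.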

BTW, is there a program listing or disassembly somewhere that you were
referring to, or are you just intimately familiar with this beast?
There is no published firmware source code that I am aware of. I have
a side pet project that may change this in the future, but there is no
timeline yet. For now, let's just say that I happen to be familiar with portions
of the 7854 firmware that your scope does not seem to be able to get through.


dan