viletim wrote: The grey background bug technical report.
The bug has been there since the beginning. Of the first batch (100 units) I received no reports of problems that could be attributed to this. On the second batch many people complained that the background colour (at h'3f00') was always grey in many games.
I confirmed for myself that the bug affected only the background colour, and only in some (most) games. I suspected that the problem might be related to the RAM chip, as I had substituted the part used in the original production for one which was faster and of a different brand. Swapping the RAM chip between boards from the first and second batch showed that the bug moved with the RAM chip.
I decided to modify the logic in the PLD to use a copy of the background colour byte stored in the PLD instead of the one in the RAM chip. I did this and it appeared to solve the problem. This became version 1.1 POF. This was a quick solution that avoided the need to spend time on solving the root cause when I really needed time to deal with customers.
While this did solve the bug, it was reported by some (a bit later) that when using version 1.1 the background colour can appear misaligned to the rest of the picture by a fraction of a pixel. I didn't think of this at the time I made the modifications, but it does make sense. During image rendering the data is stored in two different places (RAM and PLD), and if one of these is faster than the other by a significant amount the difference may be visible in the output video. Of five boards from my stock that I tested, only one shows a visible background shift. However, I don't have the game Mega Man II, which is what some were complaining about.
To get to the bottom of the real problem I started looking at the data on the bus. I noticed that games which suffered from the bug were writing to the palette registers on every frame, while those that did not were only writing the data once on the scene change. Here's a capture of the data from an affected game. It writes the whole 32 bytes to the palette every vertical blanking interval. Here's a trace from the RAM chip's /WE line during the palette write:
Zooming in, I noticed there was a short glitch some time after the last write of the block. Here's the event from the CRO. 1/yellow is the RAM's /WE, R/white is the PPU's /CE, and 2/cyan is A0. I would show A1 and A2 also, but this is only a dual channel CRO.
What's happening here? The RAM's /WE line may only go low when writing to the PPU data register, and then only if the address register holds a location in the range 3F00-3F1F. In the above picture, you see the last two writes to the data register, destined for palette area 3F1E and 3F1F. A0 is visible and is high, while A1 and A2 are also high (though not shown), so this is definitely to the data register.
A little while later a glitch is visible on the /WE line. The data on the bus is h'10' at this point (not shown) -- looking up the colour I find it is indeed grey. If you look at the A0 trace you will see it is low at the time of the glitch. This is not data meant for the data register (address 7), but instead it's meant for the address register (address 6).
What is happening is that all events in the PLD are edge driven: the address is latched on the falling edge of /CE, the data on the rising edge. The address (I mean A0, A1, A2) is latched for every read/write to the PPU. Once the palette block has been written, the latch still holds the data register (address 7) when the game then writes to the address register. What is the logic for /WE then?
IF the latched address is in the palette area
AND the write is to the data register
THEN /WE = R/W or /CE
I added another condition: for /WE to be asserted, the write must be to the data register according to the current values of A0, A1 and A2 as well as the latched version (as this is a mix of edge and level logic). This eliminates the glitch and with it the corruption of the background colour byte. This is version 1.3 POF.
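To make the difference between the two versions of the /WE logic concrete, here is a minimal Python sketch of the condition before and after the fix. It is only a behavioural model, not the actual PLD source: the function names, the in_palette flag, and the idea of sampling the signals at the moment when /CE is already low but the latch still holds the previous register select are all assumptions made for illustration.

```python
# Behavioural sketch only (hypothetical names, not the real PLD equations).
# Signal levels: 0 = low, 1 = high. /WE and /CE are active low.

PPU_ADDR_REG = 6  # A2..A0 = 110, the PPU address register
PPU_DATA_REG = 7  # A2..A0 = 111, the PPU data register


def we_n_original(in_palette, latched_reg, live_reg, rw, ce_n):
    """Original logic: only the latched register select gates /WE."""
    if in_palette and latched_reg == PPU_DATA_REG:
        return rw | ce_n  # /WE = R/W or /CE
    return 1  # /WE stays high (inactive)


def we_n_fixed(in_palette, latched_reg, live_reg, rw, ce_n):
    """Fixed logic: the live A0..A2 must also select the data register."""
    if (in_palette
            and latched_reg == PPU_DATA_REG
            and live_reg == PPU_DATA_REG):
        return rw | ce_n
    return 1


# Assumed glitch window: the game starts a write to the address register
# (live select = 6, R/W low, /CE low) while the latch still holds 7 from
# the last palette write.
glitch = dict(in_palette=True, latched_reg=PPU_DATA_REG,
              live_reg=PPU_ADDR_REG, rw=0, ce_n=0)

# A genuine palette write: latched and live selects both point at register 7.
palette_write = dict(in_palette=True, latched_reg=PPU_DATA_REG,
                     live_reg=PPU_DATA_REG, rw=0, ce_n=0)

for name, signals in (("glitch window", glitch),
                      ("palette write", palette_write)):
    print(name,
          "original /WE =", we_n_original(**signals),
          "fixed /WE =", we_n_fixed(**signals))
```

In this model the original condition pulls /WE low in the glitch window, while the extra level-sensitive check on A0, A1, A2 keeps it high; a genuine palette write still asserts /WE in both versions.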
Here is a close up of the glitch while running version 1.0 POF.
