I bought another Hitachi H2 MSX last year, mostly because I wanted the manual, which I’ve scanned. Unfortunately for my free time but fortunately for my, um, education in retro computing, this computer had issues with its video RAM. Often, the computer would boot up with a garbled screen. Resetting after a couple minutes would usually fix the issue. The video RAM is made by Toshiba, and is called TMM416P-2 (also marked 4116-2). If you have this memory, I’d recommend you look out for issues, because all eight ICs had the same issue, namely: crazy-ass noise on the -5V line. (How much noise is “crazy-ass” noise? In this case, it’s +-3V.) The noise sort of comes and goes, or at least gets stronger and weaker, randomly, which made it too hard for me to find a combination of capacitors to tame it. (Though it’s more likely to be present after turning the computer on after a long while.) I ended up socketing them all, replacing one that unfortunately died during the very professional desoldering process, and added 103 ceramic capacitors to (almost) every one, between the -5 and GND pins, which seems to have a slight positive effect. (The bottom part of the case has a hook that requires some clearance and prevents two of the chips from getting their capacitor.) I also replaced the zener diode with a 7905, which fit perfectly after bending the legs a little bit.
Details
The -5V rail for the 4116 VRAM chips is generated using a zener diode. Replacing this, or the capacitor on the rail, unfortunately didn’t have any effect. Hmm, odd!
Next, I decided to desolder the -5V pin on the first 4116 IC, and drive it using my own known good -5V supply (using a standard 7905 regulator). Result: noise both on the first chip and all the others. Hmm, odd!
Next, I did this for the rest of the 4116 ICs, and was able to see that each and every one generates noise.
Next, I decided to desolder all of them and individually test them on my 4116 tester. (They still produced the noise while in the tester.) I decided to desolder all of them because the H2 seemed to support 4416 ICs for the video RAM, and I happened to have some of those that were waiting to be put to use. I.e., there are holes of the right size, right next to the VDP, and the silkscreen on those holes says “TMS4416”. ;)
Well, today’s lesson is, do not necessarily trust the silkscreen. The TMS9928A doesn’t even support 4416 VRAM! The holes where the data pins go didn’t even have any traces on them. The TMS9928A can be made to support 4416 RAM using a custom circuit, though. Maybe I should have implemented this circuit. I even bought the two required parts! But then decided against it for complexity management reasons.
Unfortunately, expecting to be able to use the 4416 slots, I had desoldered the original VRAM ICs in a rather brutish manner, losing vias and traces in the process, which meant that I needed to add a bunch of bodge wires to get them to work again. At least the bodge wires aren’t too complex to figure out, if at some point one of them decides to become loose again. I ended up keeping the original, noisy, RAM chips. But since they’re now all socketed, it shouldn’t be too hard to replace them at some point, if necessary.
Pictures
After
Before looking at the picture of the bodge wires below, please keep in mind that it is rude to stare.
Hi, just a quick thing that may be useful if you’re trying to save memory while using libwebp to encode a monochrome image.
We assume that you already have the monochrome data, and you just want to encode it. Your data size is width*height*1, and is 100% equivalent to the Y channel in YUV.
Allocating a bunch of memory (width*height/2) for the UV part would be silly, but looks like it’s required, right?
Well, we can actually get away with just width/2, by doing this:
picture.y = y_data; // straightforward
picture.y_stride = width; // straightforward
picture.u = dummy_uv_data; // array of length width/2 full of 0x80 bytes
picture.v = dummy_uv_data; // it's the same array!
picture.uv_stride = 0; // stride is 0! this means we'll always read UV from the same location for every single pixel row
BTW, you can generate your y_data with the following ImageMagick command:
convert test.jpg y:test.y
Below is the full code demonstrating the use of this trick. Note that the code is about 90% generated by ChatGPT, and I haven’t cleaned it up beyond the minimum necessary to get it to work. Don’t forget to adjust the width and height variables to match your input image.
Ah yes, printf debugging. If you’re like me and occasionally need to place a dozen “got here”s at once, you may find this, or something like this helpful.
You need some kind of facility to create global shortcuts. If you, like many sensible people in the world, are a KDE user, you’ll find such a facility right in the settings:
Define two shortcuts, perhaps name them “Next” and “Redefine”. Perhaps Meta+Ctrl+Alt+Shift isn’t very ergonomic, but it’s probably unique at least.
Next, we’ll add actions. “Next” should type something like ‘printf(“Got here23”);’, and “redefine” allows you to change the ‘printf(“Got here’ prefix and the ‘”);’ suffix.
Here are two example shell scripts to accomplish this. Dependencies: xclip, xdotool. (Note: these scripts probably won’t work on Wayland, but I’d assume there are Wayland-compatible replacements for these two programs.)
#!/bin/bash
cd $(dirname -- "${BASH_SOURCE[0]}")
touch prefix_or_suffix
prefix_or_suffix=$(cat prefix_or_suffix)
if [ "$prefix_or_suffix" == 1 ]; then
xclip -o -selection primary > suffix
prefix_or_suffix=0
else # 0 or blank or junk
xclip -o -selection primary > prefix
prefix_or_suffix=1
fi
echo -n $prefix_or_suffix > prefix_or_suffix
echo -n 1 > next_i
If your system is kind of slow and xdotool’s output gets chopped up somehow, maybe try xdotool key –delay 50. You could also do echo $string_to_type | xclip, and then xdotool to send Ctrl-V in order to paste. That might be a little faster for long strings.
Here’s a short video clip that shows how this works:
By the way, this is the 100th post on this blog. :O
Hi! My sabbatical ended and I’ve been working again since two months ago. Boo. However there’s this thing I just wanted to get off my chest, so I spent a few hours that I did not really have and wrote some code and this blog post about it!
Last year, I made a 4164/4116 DRAM tester for the Raspberry Pi Pico, which works just as you would expect, you program the Pico, place it on a breadboard, add some wires and something to drop the 5V to 3.3V for the Q output, place the 4164 or 4116 chip you’d like to test on the breadboard, connect a USB cable from the Pico to a computer, and look at the terminal output (or just the on-board LED). This is useful if you have already extracted a 4164 chip that you have determined to be bad. (I have written previously how you could determine whether a 4164 chip is bad, here and here.)
We “allocate” 64 KB of RAM on the Pico. We need to read in two 8-bit addresses, and combine them to a 16-bit address. If a write is being attempted, we write the same bit value into the appropriate address of Pico’s RAM. If a read is being attempted, we check if the DRAM’s output is the same as what we have in the Pico’s RAM. (We could also use 64 Kb of RAM on the Pico at a minimum, but as we’ll see in the next section we do not really have a lot of time for such shenanigans.)
Note: to use this software, you need to at least mostly know what you’re doing.
One note on hooking up the Pico directly to 5V components, as seen in the above pictures
Not guaranteed to not fry your Pico (note the double negation), do this at your own risk. Your Pico will possibly also draw more current than a normal TTL chip when driven above 3.6V or so, which could easily damage your precious hardware! (The current on the address pins will likely be supplied by a pair of 74LS157 chips, on the Q pins by the RAM chip, and the current on the RAM’s data in pin by the CPU or any other. \RAS and \CAS probably by custom logic chips.) Use a 74HCT245 between the Pico and the device you’d like to test.
Caveat 1
The Pico is very fast when compared to an 8-bit computer from the 1980s, but 4164 transition times are extremely fast too. If you look at a timing diagram for the 4164 (which you will find in any 4164 datasheet), you will notice that all transition times listed are on the order of <ten, tens, or low hundreds of nanoseconds. Most 4164 chips have a -20, -15, -12, or -10 suffix in their part number. This indicates the minimum allowable number of nanoseconds × 10 for the sum of all transitions. (If the DRAM is driven faster, it probably won’t work correctly. However, if it’s driven slower, most things will generally work out, though if your system e.g. reads the DRAM’s output too slowly it might be too late and not work out.)
The stock Pico runs at 125 MHz, which means that one CPU cycle is 8 ns. From hearsay, you can probably overclock any Pico to 200 MHz (clock cycles are 5 ns), and many people report that their Pico runs fine even at 400 MHz (clock cycles are 2.5 ns). For -15 DRAMs, you have 150/8 = 18.75 CPU cycles per transition if the DRAM is driven at its max speed. (Note: it isn’t on MSX machines, at least.) 18 CPU cycles isn’t a lot. Remember, we need to convert two 8-bit addresses to a 16-bit address, and then check if the newly read value matches our previously recorded value. Is that doable in 18 CPU cycles? I don’t think so, but I’m not an ARM assembly expert.
So, did I get it to work? Well… sort of but not quite.
Caveat 2
Note that there are many failure modes for DRAM chips. For example, if the chip gets super-hot within a few seconds, it’s probably shorted. I’d expect there to be a very low resistance between ground and another pin. Before hooking up the “live” tester I’m going to explain on this page, check for that kind of stuff. I would not recommend using the live tester on a chip that gets super-hot within seconds. You could risk melting your connectors, and if the short is not between VCC and GND, potentially also risk your Pico due to excessive current on a pin driven by the Pico.
Caveat 3
Untested with 4116 chips, only tested with my MSX’s main RAM.
Current status
The live tester successfully verifies that a TMS4164-15 DRAM chip under test in my stand-alone 4164 RAM tester is outputting the correct values. (There is no reason why it shouldn’t work with a 4116 chip. The tester certainly does! You just need to re-wire slightly. Also, 5V on the Pico, yeah, it doesn’t seem to “explode immediately.” But -5V or 12V? You’d better leave those pins unconnected!)
On a real system (my trusty Hitachi MB-H2 MSX) with TMS4164-15NL DRAM chips, the live tester manages just fine from power up, up until the first ~11000 comparisons (which is a split second), but at some point reads a 0 when it should have been a 1, and prints an error. Printing errors takes a long time at 115200 bps, so we go completely off the rails once we’ve encountered the first error. (That’s slightly configurable though, see “Lnobs” section for details.)
However, in the live tester’s “DEBUG” mode, it just collects a lot of samples into memory, and prints them out when the sample memory is full. Using a simple script (the Perl script included in the repo), I can then verify that all the samples check out. Note that the DEBUG code also prints out who many times it had to wait until it got data from the PIO. The answer is 0 times every time, which means that we’re too slow or almost too slow. (Sometimes there is a handful of mismatches, I’ll look into those at some point. Could be that we were just too slow, or the 5V is messing with the system ;D)
There’s a lot that could be improved, hence the “WIP” attribute in title. The first obvious improvement would be to try a little harder in the non-debug mode. The Pico has two CPU cores, and we’re only using one. We could attain more throughput by running according to the following scheme:
Core 1: Wait for sample 1 Tell core 2 to wait for sample 2 Process sample 1 Wait for sample 3 Tell core 2 to wait for sample 4 Process sample 3 …
Core 2: Wait for instructions from CPU 1 Wait for sample 2 Process sample 2 Wait for instructions from CPU 1 Wait for sample 4 Process sample 2 …
Another potential optimization would be to write the processing code in ARM assembly. (My experience with ARM assembly is mostly read-only, so not sure how much better I can get without spending way too much effort.)
Also I haven’t tried overclocking yet. Probably should!
Some more technical details
We use two PIO state machines. One waits for RAS high→low (“RAS SM”, and the other one waits for CAS high→low (“CAS SM”, which comes after the RAS transition.
Not all RAS transitions are followed by CAS transitions. For example, refresh is mostly RAS-only. In addition, though perhaps not used on the MSX(?), RAS transitions may be followed by multiple CAS transitions.
In the C code, we wait for events on the CAS SM, and then read from both the RAS SM’s FIFO and the CAS SM’s FIFO. In the PIO code, the CAS SM tells the RAS SM whether to push its address or not. (We could alternatively (maybe) always push and have the CPU make sure the FIFO never gets full, but my experiments in that regard didn’t go that well.)
There are a lot of defines that change the way the system works.
Knobs
Setting PRINT_ERROR_THRESHOLD to something above 0 only starts printing errors after encountering that many errors.
CORRECT_ERRORS causes the Pico’s memory to be updated when we encounter a mismatch
VERBOSE_STATUS_LEDS causes the Pico to perform GPIO writes at GPIO16+ (or so, I recommend you check the source to find the exact GPIO pin number) to indicate whether we’re reading or writing. This isn’t very beneficial performance-wise.
SWAP_RAS_AND_CAS_ADDRESSES: my Hitachi MB-H2 MSX applies the CPU’s A0-A7 to the RAM pins at RAS time, and A8-A15 at CAS time. When thinking “rows” and “columns”, most people would probably assume that “rows” use the more significant bits, but that is not necessarily the case, and it doesn’t matter. When operating in DEBUG mode, you’ll see accesses that are mostly linear if this define is set correctly. Otherwise each access will be 256 apart.
I haven’t seen any DRAM chip failures (except on YouTube) where only some addresses were broken and the chip appeared to work otherwise. (SRAM chips are different story. I’m sort of planning on doing an SRAM tester too, but probably not too soon.) Most DRAM chips I’ve seen are an all-or-nothing affair. For all-or-nothing affairs, this chip tester is very likely to find the problem immediately, especially if you compare all 8 chips and only one is weird. For hypothetical chips with just a single address problem (or perhaps, a single broken row), either it’ll be difficult with the code not 100% working right now, or it might take several attempts and statistics.