User-mode memory testing in Linux

As discussed here, I recently updated the hardware on my home server. I ended up reusing 2x2GB of DDR3 SDRAM that I had left over from an older upgrade. Newegg then had a great deal on 2x4GB of DDR3 SDRAM (Mushkin DDR3 1600 for $35), so I decided to jump on it. When the memory arrived, I took a chance on mixing it with my old 2x2GB Corsair kit. (This is a really bad idea; don't try it. If the clock speed and timings of your memory don't match, you're definitely going to have a bad time.) I popped in the new memory and the machine booted up just fine! I was impressed and thought that maybe I'd gotten really lucky 🙂 However, I had a nagging suspicion that this was too good to be true. Sure enough, when I tried to download a new version of Plex Media Server, tar zxvf failed, saying the download was corrupt.

I was immediately sure that this must be bad memory-mixing mojo, but since I'm lazy, I decided to figure out a way to do memory testing in user mode. memtest is the gold standard for testing your memory for issues, but it requires you to create a bootable disk and reboot your machine in order to run the diagnostics. I was too lazy to do this, so I searched around for something that I could run on my system without having to reboot it. (Again, don't try this. It's dumb when your memory has already proven to be flaky.) Some searching led me to memtester, which looked like it would do the trick. I downloaded it and ran memtester 8G 100, which asks for 8 GB of RAM to be tested for 100 iterations. Since I had 12 GB of RAM, I felt comfortable giving 8G to memtester to play with, hoping the OOM killer would take care of any issues. memtester found a bunch of bit flips and other assorted errors very quickly. I had to make a choice at this point: do I pull out my 2x2GB kit and continue with just the 2x4GB, or do I figure out a way to make this work?
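
For the curious, here is a rough sketch of the basic idea behind a user-mode memory test: write a known bit pattern across a big buffer, read it back, and report anything that changed. This is only a toy illustration in Python and not how memtester itself is implemented. The function names (find_mismatches, pattern_test) and the choice of patterns are my own assumptions. It also only touches virtual memory handed out by the OS, with no control over which physical addresses get exercised, whereas, as far as I know, a real tool like memtester at least tries to lock its memory so it cannot be swapped out.

```python
def find_mismatches(buf, pattern):
    """Return (offset, expected, actual) for every byte that differs from pattern."""
    return [(i, pattern, b) for i, b in enumerate(buf) if b != pattern]


def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern, read it back, and collect mismatches.

    Hypothetical helper for illustration only; the patterns are common
    choices (all zeros, all ones, alternating bits), not memtester's own.
    """
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected                 # write the pattern over the whole buffer
        if buf != expected:               # cheap whole-buffer comparison first
            errors.extend(find_mismatches(buf, pattern))
    return errors


if __name__ == "__main__":
    # 16 MiB keeps the demo quick; a real test would cover far more memory.
    bad = pattern_test(16 * 1024 * 1024)
    print("mismatches:", len(bad))
```

On healthy memory the list of mismatches should come back empty. A flipped bit would show up as a tuple giving the offset, the byte that was written, and the byte that was read back.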

I decided to try to make it work, since the clock speed and timings for the two kits were identical (although the voltages were different). I lowered the memory clock frequency in the BIOS from 1600 MHz to 1333 MHz, rebooted the machine, and ran memtester overnight. This time the system was stable, and I've decided to keep it running this way. Note that a single memtester iteration takes a long time (around an hour), so I didn't actually run all 100 iterations. I arbitrarily stopped the test after 10 hours of successful iterations.
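
Rather than watching the clock and killing the test by hand, a time-boxed run can be scripted. The following is a minimal sketch, assuming memtester is on the PATH and accepts the same arguments used above. The helper name run_time_boxed is made up for illustration, and I'm assuming that an exit code of 0 means all completed tests passed.

```python
import subprocess


def run_time_boxed(cmd, max_seconds):
    """Run cmd and return its exit code, or None if the time limit was hit.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        return subprocess.run(cmd, timeout=max_seconds).returncode
    except subprocess.TimeoutExpired:
        return None


# Example usage (left commented out because it runs for up to 10 hours):
# result = run_time_boxed(["memtester", "8G", "100"], 10 * 60 * 60)
# if result is None:
#     print("stopped at the time limit; check the output for reported failures")
# elif result == 0:
#     print("all iterations finished cleanly")   # assumption: 0 means success
# else:
#     print("memtester exited with code", result)
```

Hitting the time limit says nothing by itself about whether the memory is good, so the memtester output for the completed iterations still needs to be read.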
