AHCI breaky Samsung EVO 860 SSD blues

bullfrog · January 7, 2019, 1:46pm

I recently purchased a Samsung EVO 860 250GB SSD. It has turned out to be a strange beast. I originally split this drive up into a bunch of 25GB chunks to multiboot OpenBSD, FreeBSD, ArchLinux, and a slew of Haikus. Beta1 64, Nightlies 32 and 64, and custom builds are co-existing on this laptop along with the above BSDs and Linux. Old school MBR scheme: 3 primary, 1 extended, the rest logical.

My first OS installed to /dev/disk/scsi/0/0/0/0 was OpenBSD 6.4. I could barely get it installed. Once I did, it quickly became unusable, AHCI errors galore. Did I mention data corruption? Yes. Lots. And irreparable fragmentation. This happened no matter how many times I did a fresh format and install on that partition. So I gave up on OpenBSD. I tried FreeBSD 12 on that partition. Better luck, but my console was still filled with constant AHCI messages. No apparent data corruption. Things were looking better.

Meanwhile, Haiku and Linux were happily working in their upper partition spaces. No errors at all. That was until I decided to check how Haiku handled that first partition. The following is a small sampling of my nonstop syslog puke gleaned from tail -f:

KERN: add_memory_type_range(10830, 0xff801000, 0x1000, 0)
KERN: set MTRRs to:
KERN:   mtrr:  0: base: 0x2ce8b000, size:     0x1000, type: 0
KERN:   mtrr:  1: base: 0xc7a00000, size:   0x100000, type: 0
KERN:   mtrr:  2: base: 0xf0000000, size: 0x10000000, type: 0
KERN:   mtrr:  3: base: 0xe0000000, size: 0x20000000, type: 1
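
For anyone who wants to put a number on how noisy this is, here is a minimal sketch that counts the "set MTRRs to:" events and the distinct MTRR entries in a batch of syslog lines. It assumes the line format is exactly as in the sample above; the function name summarize_mtrr is made up for illustration, and where the lines come from (a saved syslog, a tail -f capture) is up to you.

```python
import re
from collections import Counter

# Matches lines like:
#   KERN:   mtrr:  0: base: 0x2ce8b000, size:     0x1000, type: 0
# Assumption: the format is exactly what the sample in this post shows.
MTRR_LINE = re.compile(
    r"KERN:\s+mtrr:\s+(\d+): base: (0x[0-9a-fA-F]+), "
    r"size:\s+(0x[0-9a-fA-F]+), type: (\d+)"
)


def summarize_mtrr(lines):
    """Return (number of 'set MTRRs to:' events, Counter of MTRR entries).

    Each Counter key is a tuple (index, base, size, type) with the
    numeric fields already converted to integers.
    """
    updates = 0
    entries = Counter()
    for line in lines:
        if "set MTRRs to:" in line:
            updates += 1
            continue
        match = MTRR_LINE.search(line)
        if match:
            index, base, size, mtype = match.groups()
            key = (int(index), int(base, 16), int(size, 16), int(mtype))
            entries[key] += 1
    return updates, entries


if __name__ == "__main__":
    import sys

    # Usage: python mtrr_count.py < saved_syslog.txt
    count, seen = summarize_mtrr(sys.stdin)
    print(f"MTRR updates logged: {count}")
    for (index, base, size, mtype), hits in sorted(seen.items()):
        print(f"  mtrr {index}: base={base:#x} size={size:#x} "
              f"type={mtype} seen {hits}x")
```

Comparing the update count against the time span of the capture gives a rough messages-per-second figure, which is more useful in a report than "a lot".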

So. I have an SSD that has a rather unusable first partition. I’m not the only one who has reported this issue, and it isn’t limited to Haiku. I found somebody on https://forums.freebsd.org who had the same issue, but gave up and returned the drives as faulty. I’m going to do some further testing. I think I can work around this by putting a small offset before the first partition. Besides this issue, I love this drive. It may be flawed, but I’ve got 5 years and 145+TBW of warrantied use left on it. I’m not ready for an RMA. All important stuff is on redundant backup. I ain’t skeered. Besides, it’s really really fast. I like that.

I’ll report back after I test putting an offset on the first partition.

PulkoMandy · January 7, 2019, 2:16pm

Hi,

MTRRs are used for enabling or disabling write combining and caching on RAM. It is completely unrelated to SSD storage.

If you want to check your partition layout, you can do so in DriveSetup.

bullfrog · January 7, 2019, 2:55pm

I understand this. But is there a good reason why I have an endless stream of this message being logged upon boot? With only one partition doing so? And only on this particular disk, which is giving not just me but others issues on particular hardware?

bullfrog · January 7, 2019, 3:04pm

A quick trip to gparted land, a 200MB offset on /dev/disk/scsi/0/0/0/0, a quick install and new boot record, the MTRR messages go away. I’ll test how the other OSs go with this new offset partition before seeing how small of an offset it takes to get the job done. If nothing else, I just found an issue and solution to Haiku writing a constant stream of messages to a log file on a particular SSD. Which in anyone’s book should be considered a GoodThing™. SSDs don’t like constant writing for prolonged periods.
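
If you want to double check where the first partition actually starts after a change like this, here is a rough sketch that decodes the four primary entries of an old school MBR and prints each start offset. It assumes 512-byte logical sectors and that you feed it the first sector of the whole disk (not of a partition); parse_mbr is a made-up name, and logical partitions inside the extended one are not followed.

```python
import struct

SECTOR_SIZE = 512          # assumption: 512-byte logical sectors
TABLE_OFFSET = 446         # partition table starts here in sector 0
ENTRY_SIZE = 16            # four 16-byte primary entries
MIB = 1024 * 1024


def parse_mbr(sector0):
    """Decode the primary partition entries from a 512-byte MBR sector.

    Returns a list of dicts with the slot number, type byte, start LBA,
    length in sectors, and start offset in bytes. Empty slots are skipped.
    """
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature, not an MBR sector")

    partitions = []
    for slot in range(4):
        start = TABLE_OFFSET + slot * ENTRY_SIZE
        entry = sector0[start:start + ENTRY_SIZE]
        # status, (CHS start skipped), type, (CHS end skipped),
        # start LBA, sector count; all little-endian.
        status, ptype, lba_start, num_sectors = struct.unpack(
            "<B3xB3xII", entry
        )
        if ptype == 0:
            continue  # unused slot
        partitions.append({
            "slot": slot,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "num_sectors": num_sectors,
            "start_bytes": lba_start * SECTOR_SIZE,
        })
    return partitions


if __name__ == "__main__":
    import sys

    # Usage: python mbr_offsets.py <path to whole-disk device or image>
    with open(sys.argv[1], "rb") as disk:
        for part in parse_mbr(disk.read(SECTOR_SIZE)):
            print(f"slot {part['slot']}: type {part['type']:#04x}, "
                  f"starts at LBA {part['lba_start']} "
                  f"({part['start_bytes'] / MIB:.1f} MiB)")
```

With a 200MB offset as described above, the first entry should report a start around the 200 MiB mark rather than at sector 63 or 2048.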

PulkoMandy · January 7, 2019, 3:32pm

Current SSDs are durable enough that you don’t need to worry about that more than you would for a plain old spinning hard disk drive.

The MTRR output is not really a helpful part of the log; we would need to know why this was requested (something is trying to allocate a memory range with specific needs, maybe to perform DMA access to the disk?). I don’t see a reason why an offset on the disk would make any difference, but maybe a complete syslog would shed some light on it.

bullfrog · January 7, 2019, 3:58pm

The exact analog to this in both FreeBSD and OpenBSD when installed at the beginning of this particular drive is a constant stream of ahci errors. Others using this exact model of drive with FreeBSD and OpenBSD are experiencing an identical steady stream of ahci errors.

I have a sneaking suspicion it’s either an engineering glitch on Samsung’s part, or they made a design choice for proprietary reasons. Either way, I’ll duplicate this again and provide syslogs for further diagnosis. I don’t think I’ll make a Trac report, as I doubt this is a bug in Haiku or any other OS. In any case, even the latest greatest SSD tech is only good for so many TBW. If this stream of errors went unnoticed on a 100% duty cycle system, those bytes add up fast. I’m talking about a constant stream. Not an occasional message. It all adds up.
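
To put rough numbers on "it all adds up", here is a back-of-the-envelope sketch. The 145 TBW figure is the remaining warranty mentioned earlier in the thread; the log rate and the write amplification factor are pure assumptions that you would need to measure on your own system, and years_to_write is just an illustrative name.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_write(tbw_terabytes, log_bytes_per_second,
                   write_amplification=1.0):
    """Years of nonstop logging needed to write the given TBW budget.

    tbw_terabytes:        write budget in decimal terabytes (10**12 bytes)
    log_bytes_per_second: sustained log rate (an assumption, measure it)
    write_amplification:  NAND bytes written per host byte (assumption;
                          small synced writes can push this well above 1)
    """
    total_bytes = tbw_terabytes * 10**12
    effective_rate = log_bytes_per_second * write_amplification
    return total_bytes / effective_rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical log rates, not measured values.
    for rate in (1_000, 10_000, 100_000):
        for amplification in (1.0, 10.0):
            years = years_to_write(145, rate, amplification)
            print(f"{rate:>7} B/s, amplification {amplification:>4}: "
                  f"{years:,.1f} years")
```

How worrying the result looks depends almost entirely on the two assumed inputs, so measuring the actual syslog growth rate is the first step before drawing conclusions either way.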

extrowerk · January 7, 2019, 4:02pm

Haiku’s kernel is still built to be verbose, for debugging purposes.

bullfrog · January 7, 2019, 6:50pm

True, but this is like the micro machines man on meth verbose.

extrowerk · January 7, 2019, 8:47pm

AFAIK: Beta isn’t for production.

bullfrog · January 7, 2019, 10:40pm

This isn’t a beta issue; it’s a workaround for hardware that consistently does the same weird thing. It does the same thing on production stable FreeBSD and OpenBSD. And not just on my hardware. I’ve run into others with the same problem.