Wow. ST3000NM0033 just 4.3 years shy came back with a 188 Command_Timeout of 60130459662.
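
That raw number looks absurd, but it is probably not one single count. On many Seagate drives the raw value of attribute 188 is reported to pack three 16-bit counters into one 48-bit field. That layout is an assumption here, not something this post or a datasheet confirms. A minimal Python sketch, with a made-up helper name `split_words16`:

```python
def split_words16(raw: int, words: int = 3) -> list[int]:
    """Split a packed SMART raw value into 16-bit words, lowest word first.

    Assumption: the drive packs three 16-bit counters into the 48-bit raw field.
    """
    return [(raw >> (16 * i)) & 0xFFFF for i in range(words)]


raw_188 = 60130459662  # raw Command_Timeout value quoted above
print(hex(raw_188))            # 0xe000e000e
print(split_words16(raw_188))  # [14, 14, 14]
```

If that reading holds, the disk logged 14 in each counter rather than 60 billion timeouts. Still not a healthy number, and still consistent with a bad cable.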

This server houses either a vermin nest of some sort or someone fcked up plugging a gorram cable. Either way it’s pants on fire.

Update:

2nd act as a poem.

We complained.
They rejected,
no error to be detected.

Bewildered we insisted,
and grumbling they assisted.

They pulled out all the disks,
and concluded all at risk.

Wait, what?
Only one was shut!

Now we’ve none,
the raid is gone.

Next is a letter we have to sign,
that losing all our data is just fine.

You wonder if we have a backup?
oc, always prepared for such a f… mess.
Beko Pharm

Sorry. I’m not used to this sort of chaos engineering.

RAID 1 degraded today. The SSD's "percentage used" value is 215%, with a total of 598 TB of data units written. That’s fine I guess 😀
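
For a rough feel of what 215% means: if the "percentage used" figure scaled linearly with bytes written (an assumption, since the drive computes it by its own vendor-specific estimate), the two numbers above imply how much writing would have counted as 100%. A quick sketch, with the hypothetical helper `implied_endurance_tb`:

```python
def implied_endurance_tb(written_tb: float, percentage_used: float) -> float:
    """TB written that would correspond to 100% used, assuming linear wear."""
    return written_tb / (percentage_used / 100.0)


# 598 TB written at 215% used, both figures taken from the post above
print(round(implied_endurance_tb(598, 215), 1))  # 278.1
```

So, under that assumption, the SSD passed its estimated endurance at roughly 278 TB and has been running on borrowed time since.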

Friendly reminder to my not-so-tech-savvy fellows: It’s _when_ and not _whether_ hard disks die. No backup = no important data. A RAID is not a backup.