Postmortem September 17th outage and rollback

@Null: are you serious about stickers taking that much time?

If it's an SQL dump recovery, drop the indexes and FOREIGN KEY constraints on the sticker table before you COPY the table data in. Afterwards you recreate the indexes and add the foreign key constraints back. This will be much faster, since each index gets built once instead of being updated for every row.

120M tuples of one byte each at a slow 30 MB/s will import in four seconds. So even if your sticker tuples are an insane hundred bytes wide (about 12 GB in total), it should still be done in less than seven minutes on slow hardware.
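For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. It assumes decimal megabytes, a sustained sequential rate, and no per-row overhead, none of which is confirmed for the actual restore; the function name is made up for illustration.

```python
def import_seconds(rows: int, bytes_per_row: int, mb_per_second: float) -> float:
    """Seconds needed to stream rows * bytes_per_row at a sustained rate.

    Assumes decimal megabytes (1 MB = 1,000,000 bytes) and ignores any
    per-row or per-transaction overhead.
    """
    total_bytes = rows * bytes_per_row
    return total_bytes / (mb_per_second * 1_000_000)


ROWS = 120_000_000   # sticker tuples, figure taken from the post
RATE_MB_S = 30.0     # the "slow" throughput from the post

narrow = import_seconds(ROWS, 1, RATE_MB_S)    # one byte per tuple
wide = import_seconds(ROWS, 100, RATE_MB_S)    # hundred bytes per tuple

print(f"1-byte rows:   {narrow:.0f} s")
print(f"100-byte rows: {wide:.0f} s ({wide / 60:.1f} min)")
```

Real imports carry index maintenance and constraint checks on top of raw throughput, which is why the load-first, index-afterwards order matters.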
 
They may take our drives, but they'll never take our STIIIIICKERS!!!
Ya cunt I was about to post that myself ya culture stealing bastard! Gonna come to your house and wipe my hairy bawsack all over your door handles.

Dear feeder once again proving nothing can keep us down though I was watching the news during downtime expecting to see Lucas or Mr Consent Accident had suicide bombed Kiwi HQ.
 
Why not remove stickers? I don't mean as a function in general, but would it be possible to skip the process of reacquiring sticker data for system restoring? Sure it would make the forum fresh, but it seems silly to make sure each post has the right stickers.
 
Checking Kiwifarms is like checking the morning paper but every now and then the paper randomly catches fire.
No, checking Kiwifarms is like checking the morning paper but every now and then some deranged lunatic runs up to you and sets the paper on fire in your hand, cackles MENacingly and runs off.
 
They're WD M.2 1.6TB enterprise drives
Every WD drive I had died on me, tho with enough time to move everything.

Meanwhile every seagate drive I had completely shat the bed without warning including a fucking backup seagate HDD that died while copying stuff being recovered from another dead seagate drive.

Meanwhile I got 15-year-old HGST drives that plain refuse to die, but HGST sold out to WD and doesn't exist anymore.

No idea which SSD brand is the most reliable right now.
 
I have the exact opposite experience. From time to time I decide to listen to people claiming WDs are the most reliable and Seagates are shit. So I replace the Seagate with a WD and it dies within days (4 times this has happened), or it flat out refuses to work at all (one occurrence). So I tuck my tail between my legs and go back to Seagate, which all work well beyond their supposed lifespan. I won't shit on WD tho, because it works for other people. But at this point I stopped trying. The universe has decided it's how it has to be and I won't fight it.
 
Meanwhile I got 15-year-old HGST drives that plain refuse to die, but HGST sold out to WD and doesn't exist anymore.
It's not very comparable because WD's SSD business was actually acquired from SanDisk and has only been relevant in maybe the last 5 years since they basically ignored the market in its nascent state. HDDs are such a crapshoot it's unreal, everyone has their favorite brand to hate but Toshibas have been the most unreliable in my experience, which is interesting as they're well regarded by Backblaze.
At best a firmware bug, at worst, the drives sudoku'd.
SSDs really do appear to operate on black magic nowadays. The number of batshit insane failure modes that keep cropping up lately really beggars belief and this brave new world of UEFI seems to have made everything unbelievably complicated. I've got a pair of 3.84 TB SN630s in one of my boxes and now I'm just hoping I don't get felted too.
 
Made an account to clarify things. I'm the dude that helps Null with this stuff when he needs it. No, I don't monitor the server 'cause it's Null's. But maybe I'll set up some tooling for Null to monitor things, including drive health.

The storage on the server uses ZFS pools. The SATA SSD array (SNEED Pool) bypasses the RAID Controller and is entirely JBOD passthrough.
The 4 x NVMe drives are U.2 drives in the front and are in the FEED ZFS Pool. The backplane handles SATA, SAS, and U.2. U.2 has its own area for those drives and connects to the motherboard with an OCuLink cable to a JNVMe header.
The 4 x 1.6TB WD Ultrastar DC SN620 NVMe U.2 drives disappeared from the server last night. But before that, the kernel reported write errors to one of them.

There are a few reasons this could have happened, from most likely to least likely:
- BIOS/UEFI firmware stopped communicating with the NVMe drives. This happened with a certain BIOS setting when the server was initially set up.
- The drives actually died from the workload. Unlikely considering these are rated for 1.7 drive writes per day, but still feasible. These are 2nd hand enterprise drives.
- The backplane/JNVMe headers exploded. Super unlikely.
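
To put the 1.7 drive writes per day rating in perspective, here is a rough sketch of what it means for a 1.6TB drive. The observed write volume below is a purely hypothetical placeholder, not a measurement from this server, and the function name is invented for illustration.

```python
def daily_write_budget_tb(capacity_tb: float, dwpd: float) -> float:
    """Rated host writes per day in TB: drive capacity times its DWPD rating."""
    return capacity_tb * dwpd


CAPACITY_TB = 1.6   # per-drive capacity from the post
DWPD = 1.7          # rated drive writes per day from the post

budget = daily_write_budget_tb(CAPACITY_TB, DWPD)
print(f"Rated budget per drive: {budget:.2f} TB/day")

# Hypothetical workload figure, only here to show the comparison.
observed_tb_per_day = 0.5
print(f"Share of rated budget used: {observed_tb_per_day / budget:.0%}")
```

A DWPD rating is quoted over the drive's warranty period, so second-hand drives may already have used up part of that endurance before they were installed.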

The drives are likely still alive, and the server's firmware probably took a shit.
We need to inspect the server's BIOS settings or possibly even update the firmware. Then we can determine if the drives are toast or still usable.
There was no foul play here. At best a firmware bug, at worst, the drives sudoku'd.
You sound like you’re from Reddit. I don’t like Reddit niggers.
 
From time to time I decide to listen to people claiming WDs are the most reliable and Seagates are shit.
WD are far from the most reliable, all I was saying is that at least that WD drive gave me a warning instead of just assploding itself with my data on it like the seagate drives did several times.

I even got a Seagate drive that for some fucking reason refuses to work with Windows but will work with Linux, even tho it's NTFS.
Toshibas have been the most unreliable in my experience, which is interesting as they're well regarded by Backblaze.
Never had or even seen a 3.5" Toshiba drive, only 2.5" ones, be it from laptops or portable drives, but none failed on me. Frankly I wouldn't be surprised if that division has dropped the ball in terms of quality since the main company is not doing well. Or maybe they are pulling a Seagate and doing shit drives on purpose so you'll have to buy a new one.
 
I thought I still didn't have access to the clearnet site, but I was just trying to access the .pl address lolololololol
 
I thought Chris had the highest reaction score from when a mod set his to like a billion as a goof? Was it undone?
https://kiwifarms.st/members/ (future readers, substitute in the link of choice if .st is ever Dong'd)
 