Previous incidents
Outage overnight
Resolved Jul 31 at 09:19am PDT
Looking into what happened overnight that knocked us offline for a while. We lost network link on the server, so we're following up with OVH.
Update: It was an OVH billing issue. Even though we had auto-renew enabled, they didn't renew properly :(
Outage
Resolved Jul 28 at 08:14am PDT
08:14 - Went to upgrade to lemmy 0.18.3 and it triggered a kernel panic. System is rebooting now.
08:20 - System is back up but DB upgrades are currently running for 0.18.3, should be back online shortly.
08:30 - All sorted out and online with 0.18.3, enjoy!
Memory issues
Resolved Jul 23 at 08:13pm PDT
Something strange is going on with the memory in our server. We're currently investigating, but things should be back online and stable for now.
Edit: Fixed fully now, was a hugepages misconfig.
Database locks
Resolved Jul 21 at 08:41am PDT
Seeing more database locks again, have killed off the offending queries while we dig more.
Update: This morning's slow delete query came in from the fediverse and impacted multiple lemmy sites at the same time. The core issue is https://github.com/LemmyNet/lemmy/issues/3165
Services restarted
Resolved Jul 18 at 01:57pm PDT
Just restarted services quickly to fix the Hot feed - https://lemmy.ca/post/1685851?scrollToComments=true
Database locking again
Resolved Jul 17 at 03:37pm PDT
We're hitting a lemmy bug, triggered when someone deletes their account, that is locking the database. I'm disabling account removal temporarily until we can identify the fix.
Database locking
Resolved Jul 17 at 01:18pm PDT
Saw some database locking that took things down. Restarted postgres and we're investigating further.
Services restarted
Resolved Jul 16 at 04:07pm PDT
Restarted services to do some memory tuning; we've been seeing pictrs crashing recently due to high memory consumption.