It just happened again. I couldn’t SSH in despite the limits on Docker resources, which leads me to believe it may not be related to Docker or Lemmy.
This time it lasted only 20 minutes or so. Once it was over I could log back in and investigate a little. There isn’t much to see. lemmy-ui was killed sometime during the event:
IMAGE                        COMMAND                  CREATED      STATUS         PORTS
nginx:1-alpine               "/docker-entrypoint.…"   9 days ago   Up 25 hours    80/tcp, 0.0.0.0:14252->8536/tcp, :::14252->8536/tcp
dessalines/lemmy-ui:0.18.0   "docker-entrypoint.s…"   9 days ago   Up 3 minutes   1234/tcp
dessalines/lemmy:0.18.0      "/app/lemmy"             9 days ago   Up 25 hours
asonix/pictrs:0.4.0-rc.7     "/sbin/tini -- /usr/…"   9 days ago   Up 25 hours    6669/tcp, 8080/tcp
mwader/postfix-relay         "/root/run"              9 days ago   Up 25 hours    25/tcp
postgres:15-alpine           "docker-entrypoint.s…"   9 days ago   Up 25 hours
I still have no idea what’s going on.
Here’s an update. I set up atop on my VPS and waited until the issue occurred again. Here’s the atop log from the event.
ATOP - ip-172-31-7-27 2023/07/22 18:40:02 ----------------- 10m0s elapsed
PRC | sys 9m49s | user 12.66s | #proc 134 | #zombie 0 | #exit 3 |
CPU | sys 99% | user 0% | irq 0% | idle 0% | wait 0% |
MEM | tot 957.1M | free 49.8M | buff 0.1M | slab 95.1M | numnode 1 |
SWP | tot 0.0M | free 0.0M | swcac 0.0M | vmcom 2.4G | vmlim 478.6M |
PAG | numamig 0 | migrate 0 | swin 0 | swout 0 | oomkill 0 |
PSI | cpusome 63% | memsome 99% | memfull 88% | iosome 99% | iofull 0% |
DSK | xvda | busy 100% | read 461505 | write 171 | avio 1.30 ms |
DSK | xvda1 | busy 100% | read 461505 | write 171 | avio 1.30 ms |
NET | transport | tcpi 2004 | tcpo 1477 | udpi 9 | udpo 11 |
NET | network | ipi 2035 | ipo 1521 | ipfrw 20 | deliv 2015 |
NET | eth0 ---- | pcki 2028 | pcko 1500 | si 4 Kbps | so 1 Kbps |

  PID   SYSCPU   USRCPU   VGROW    RGROW    RDDSK   WRDSK   CPU   CMD
   41    5m17s    0.00s      0B       0B       0B      0B   53%   kswapd0
    1   21.87s    0.00s      0B   -80.0K     1.2G      0B    4%   systemd
21681   20.28s    0.00s      0B     4.0K     4.2G      0B    3%   lemmy
  435   18.00s    0.00s      0B   392.0K   163.1M      0B    3%   snapd
21576   17.20s    0.00s      0B       0B     4.2G      0B    3%   pict-rs
The culprit seems to be kswapd0 trying to move memory to swap space, although there is no swap space.
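For anyone who wants to check their own atop logs for the same pattern, here is a rough Python sketch that pulls out the two figures pointing at memory pressure: the swap total (zero here) and PSI memfull, the share of time tasks were fully stalled waiting on memory. The function names and the 50% threshold are made up for illustration and are not part of atop; the input is just the MEM, SWP and PSI fields copied from the log above.

```python
# Summary fields copied from the atop log above (MEM, SWP and PSI lines only).
ATOP_SUMMARY = (
    "MEM | tot 957.1M | free 49.8M | buff 0.1M | slab 95.1M | numnode 1 | "
    "SWP | tot 0.0M | free 0.0M | swcac 0.0M | vmcom 2.4G | vmlim 478.6M | "
    "PSI | cpusome 63% | memsome 99% | memfull 88% | iosome 99% | iofull 0% |"
)


def parse_sections(summary: str) -> dict:
    """Group 'name value' cells under the three-letter section label that precedes them."""
    sections = {}
    current = None
    for cell in (c.strip() for c in summary.split("|")):
        if not cell:
            continue
        if len(cell) == 3 and cell.isupper():
            # A bare label such as MEM, SWP or PSI starts a new section.
            current = sections.setdefault(cell, {})
        elif current is not None:
            name, _, value = cell.partition(" ")
            current[name] = value
    return sections


def looks_like_memory_thrash(sections: dict, memfull_threshold: float = 50.0) -> bool:
    """Hypothetical heuristic: no swap configured and memfull above an arbitrary threshold."""
    swap_total = float(sections["SWP"]["tot"].rstrip("KMG"))
    memfull = float(sections["PSI"]["memfull"].rstrip("%"))
    return swap_total == 0.0 and memfull >= memfull_threshold


sections = parse_sections(ATOP_SUMMARY)
print(sections["SWP"]["tot"], sections["PSI"]["memfull"])
print(looks_like_memory_thrash(sections))
```

This only reads numbers out of the log; it does not prove what kswapd0 was actually doing during the event.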
I set memory swappiness to 0 on the system for now. I’ll check whether that makes a difference.
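To confirm the setting actually took effect, the current value can be read back from /proc/sys/vm/swappiness, which is where Linux exposes it. A minimal Python sketch, assuming a Linux host; the helper name is hypothetical and the path is a parameter only so the function can be pointed at another file:

```python
from pathlib import Path

# Linux exposes the vm.swappiness sysctl at this path.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path: Path = SWAPPINESS_PATH) -> int:
    """Return the swappiness value stored in the given file as an integer."""
    return int(path.read_text().strip())
```

Calling `read_swappiness()` on the VPS should return 0 if the change is active. A value set at runtime with `sysctl vm.swappiness=0` does not survive a reboot unless it is also added to the sysctl configuration.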