On Wed, Aug 20, 2025 at 05:25:46PM +0200, Michael van Elst wrote:
> On Wed, Aug 20, 2025 at 12:30:50PM +0200, Christof Meerwald wrote:
> > 
> > So the lighttpd process (that was the first one that got into tstiles)
> > is waiting for ffffdadd0c6dc500 and that's ioflush waiting in biowait:
> > 
> > db(0)> bt/a ffffdadd0c6dc500
> > trace: pid 0 lid 195 at 0xffff8981544e0d60
> > sleepq_block() at netbsd:sleepq_block+0x13a
> > cv_wait() at netbsd:cv_wait+0xb7
> > biowait() at netbsd:biowait+0x42
> > wapbl_buffered_flush() at netbsd:wapbl_buffered_flush+0xa2
> > wapbl_write_commit() at netbsd:wapbl_write_commit+0x28
> > wapbl_flush() at netbsd:wapbl_flush+0x552
> > ffs_sync() at netbsd:ffs_sync+0x176
> > VFS_SYNC() at netbsd:VFS_SYNC+0x22
> > sched_sync() at netbsd:sched_sync+0x90
> > db(0)>
> 
> If that's really the stalling process, then there is probably an issue
> with disk I/O. Do you see where writes happen? As you can see, it's
> an FFS filesystem with logging (WAPBL) enabled.
> 
> Most drivers however should dump errors on console, at least after
> some timeout.

It's a virtio block device, and I am not seeing any errors on the
console. And everything starts working again when I enter a username
on the VNC console (once I get asked for a password).

So to me it looks like something gets stuck in the virtio
communication.

BTW, I then tried running this loop in the background:

  while sleep 0.5; do echo Hello world >some-test-file.txt; sync; rm some-test-file.txt; done

and the NetBSD VPS survived less than 2 hours before "sync" got stuck
(with everything else that tried to do I/O then waiting for that sync).
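
For anyone who wants to narrow down when the stall happens, here is a
minimal sketch of a variant of that loop that logs a timestamp before and
after each sync. It is only an illustration: the function name
stress_sync and the log file name sync-test.log are hypothetical, and it
assumes a POSIX sh plus a sleep that accepts fractional seconds (as the
loop above already does).

```shell
#!/bin/sh
# Hypothetical helper, not part of any NetBSD tool.
# Runs COUNT write/sync/remove cycles and logs a timestamp around each
# sync, so a stall shows up as a "sync start" line with no matching
# "sync done" line.
stress_sync() {
    count=$1                      # number of cycles to run
    log=${2:-sync-test.log}       # log file (assumed name)
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        echo "Hello world" > some-test-file.txt
        echo "$(date '+%Y-%m-%d %H:%M:%S') iteration $i: sync start" >> "$log"
        sync
        echo "$(date '+%Y-%m-%d %H:%M:%S') iteration $i: sync done" >> "$log"
        rm some-test-file.txt
        sleep 0.5
    done
}
```

Calling e.g. "stress_sync 20000" after sourcing the file runs for a bit
under 3 hours at 0.5 s per cycle, plus the time spent in sync. If the log
lives on the same file system that stalls, reading it back may block as
well, so pointing the second argument at a memory-backed file system is
probably the safer choice.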


Christof

-- 
https://cmeerw.org                             sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org                   xmpp:cmeerw at cmeerw.org
