On Sat, 18 Mar 2017 03:50:39 -0700 (PDT) Vitaly Isaev <vitalyisa...@gmail.com> wrote:
[...]
> Assume that application does some heavy lifting with multiple file
> descriptors (e.g., opening - writing data - syncing - closing), what
> actually happens to Go runtime? Does it block all the goroutines at
> the time when expensive syscall occurs (like syscall.Fsync)? Or only
> the calling goroutine is blocked while the others are still operating?

IIUC, since there's no general mechanism for the kernel to notify the
process of the completion of an arbitrary syscall, when a goroutine
enters a syscall it essentially occupies its underlying OS thread until
the syscall completes.  The scheduler detects that the goroutine is
about to sleep in the syscall and schedules other goroutines to run,
but the underlying OS thread is not freed.

This is in contrast to network I/O, which uses the platform-specific
poller (such as IOCP on Windows, epoll on Linux, kqueue on FreeBSD and
so on): when an I/O operation on a socket is about to block, the
goroutine which performed that syscall is suspended and put on a wait
list, its socket is added to the set the poller monitors, and its
underlying OS thread is freed to serve another runnable goroutine.

> So does it make sense to write programs with multiple workers that do
> a lot of user space - kernel space context switching? Does it make
> sense to use multithreading patterns for disk input?

It may or may not.  A syscall-heavy workload might degrade the
goroutine scheduling from M:N down to effectively N:N, with one OS
thread wired to each goroutine blocked in a syscall.
This might not be a problem in itself (not counting a big number of OS
threads allocated and mostly sleeping), but concurrent access to the
same slow resource, such as a rotating medium, is almost always a bad
idea: your HDD (and the file system on it) can only provide so much
read bandwidth, so spreading the processing of the data being read
across multiple goroutines is only worth it if that processing is so
computationally expensive that a single goroutine can't keep up with
the full bandwidth.  If one goroutine can keep up, having two
goroutines read that same data will leave each with only half the
bandwidth, so they will sleep more than 50% of the time.

Note that reading two files in parallel off a filesystem located on the
same rotating medium will usually lower the total bandwidth due to the
seek times required to jump between the blocks of the different files.
SSDs and other kinds of media may have much better performance
characteristics, so it's worth measuring.

IOW, I'd say that trying to parallelize might be a premature
optimization.  It's worth keeping in mind that goroutines serve two
separate purposes:

1) they allow you to write natural sequential control flow instead of
   callback-ridden spaghetti code;

2) they allow performing tasks truly in parallel--if the hardware
   supports it (multiple CPUs and/or cores).

This (2) is tricky because it assumes such goroutines have something to
do; if they instead contend on some shared resource, the
parallelization won't really happen.