public inbox for goredo-devel@lists.stargrave.org
From: "Niklas Böhm" <mail@jnboehm•com>
To: goredo-devel@lists.cypherpunks.su
Subject: Handling EINTR in unix.FcntlFlock
Date: Sat, 4 Jan 2025 10:59:09 +0100
Message-ID: <98f44f62-1f44-4375-8cf3-d10b0e1c81f9@jnboehm.com>
[-- Attachment #1: Type: text/plain, Size: 1932 bytes --]
Greetings everyone,
I was using goredo on an NFS mount and noticed that my program would
sometimes fail with the following error:
run.go:234: interrupted system call /gpfs01/.../folders/.redo/1.zip.lock
After some digging, the problem seems to be that unix.FcntlFlock with
F_SETLKW can block long enough over NFS that the call is interrupted by a
signal and returns EINTR (see `man 2 fcntl`, section ERRORS [1]). Apparently
there is an automatic restart mechanism [2], but it does not seem to be
reliable here, so I thought it better to handle EINTR explicitly and extend
the error check from
if errors.Is(err, unix.EDEADLK) {
to
if errors.Is(err, unix.EDEADLK) || errors.Is(err, unix.EINTR) {
This seems to resolve the interrupted system call error above.
Unfortunately I cannot reliably reproduce this error, but since the fix
is reasonably easy, I was hoping that it could be incorporated into
goredo proper.
I have attached the small diff, and here it is also reproduced, in case
the explicit line is not clear:
diff --git a/run.go b/run.go
index 506fd35..5423b49 100644
--- a/run.go
+++ b/run.go
@@ -227,7 +227,7 @@ func runScript(tgt *Tgt, errs chan error, forced, traced bool) error {
 	tracef(CLock, "LOCK_EX: %s", fdLock.Name())
 LockAgain:
 	if err = unix.FcntlFlock(fdLock.Fd(), unix.F_SETLKW, &flock); err != nil {
-		if errors.Is(err, unix.EDEADLK) {
+		if errors.Is(err, unix.EDEADLK) || errors.Is(err, unix.EINTR) {
 			time.Sleep(10 * time.Millisecond)
 			goto LockAgain
 		}
Cheers and happy belated new year
Nik
[1]: https://www.man7.org/linux/man-pages/man2/fcntl.2.html#ERRORS
[2]: https://unix.stackexchange.com/questions/509375/what-is-interrupted-system-call
[-- Attachment #2: goredo-eintr.diff --]
[-- Type: text/x-patch, Size: 493 bytes --]
diff --git a/run.go b/run.go
index 506fd35..5423b49 100644
--- a/run.go
+++ b/run.go
@@ -227,7 +227,7 @@ func runScript(tgt *Tgt, errs chan error, forced, traced bool) error {
 	tracef(CLock, "LOCK_EX: %s", fdLock.Name())
 LockAgain:
 	if err = unix.FcntlFlock(fdLock.Fd(), unix.F_SETLKW, &flock); err != nil {
-		if errors.Is(err, unix.EDEADLK) {
+		if errors.Is(err, unix.EDEADLK) || errors.Is(err, unix.EINTR) {
 			time.Sleep(10 * time.Millisecond)
 			goto LockAgain
 		}
Thread overview: 4+ messages
2025-01-04 9:59 Niklas Böhm [this message]
2025-01-04 12:43 ` Handling EINTR in unix.FcntlFlock Sergey Matveev
2025-01-04 14:04 ` Niklas Böhm
2025-01-07 11:05 ` Sergey Matveev