-
Notifications
You must be signed in to change notification settings - Fork 7
[BUG] properties.msgpack.lock is not deleted after changing the file #127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
So it seems that the file is created by the lock procedure, but it is leave to the application to remove it https://pkg.go.dev/github.com/gofrs/flock#Flock.Unlock. That seems strange but probably ok. |
I guess this behavior is intended by the library. If there is a lot of concurrency, the following scenario can happen:
In our case, this scenario is highly improbable, so we can prioritize filesystem cleanliness. |
No, this isn't correct. B will see that another process creates the file and either continue to wait or just fail. At least that is how it should work. Removing the file shouldn't make any difference because the important part is that you unlock the file and if the file isn't there means that no one is locking it. The open question is, how the interleave with remove, create and lock file works: So I don't know, for me it seems safe in theory, although I would prefer to do some tests regarding data consistency, we could spin up multiple process that does concurrent changes, and see if we have any data race. |
|
@lucarin91 The problem is that the processes are working on file descriptors pointing to inodes: A creates a file I tested it with the following script: package main
import (
"fmt"
"os"
"syscall"
"github.com/gofrs/flock"
)
func main() {
path := "race.lock"
fileLock := flock.New(path)
fmt.Printf("try taking the lock on %s...\n", path)
locked, err := fileLock.TryLock()
if err != nil {
panic(err)
}
if locked {
fileInfo, _ := os.Stat(path)
stat := fileInfo.Sys().(*syscall.Stat_t)
fmt.Printf("Lock Acquired!!\n")
fmt.Printf("File name: %s\n", path)
fmt.Printf("Inode ID : %d \n", stat.Ino)
fmt.Println("delete lock file from another shell and push enter")
var input string
fmt.Scanln(&input)
fileLock.Unlock()
fmt.Println("Unlocked.")
} else {
fmt.Println("Unable to acquire the lock.")
}
}
|
Yeah, that makes sense. You should use file permissions to prevent removing a locked file. But just for completeness, this doesn't mean that the flock system doesn't work; it just means that the file should be there to make the locking system work. You can even check with |
|
At this point, I am wondering if we should not only keep the .lock file but also use the |
|
I thought the same, but in I see two solutions:
|
Yeah, at this point, we could remove renameio and write to the file directly, not sure why we used both renameio and flock |
I guess we need both because they solve 2 different problems:
|
Right, so at this point, we could convert this task to improve the documentation regarding why implemented in that way. |
lucarin91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After our conversation, I would revert the change and add some documentation comments about the behavior of the lock file, and the reason of using that toghether with renameio
This reverts commit 27e28db.
done.
|
| if err := flock.Unlock(); err != nil { | ||
| slog.Error("failed to unlock file lock", "path", flock.Path(), "error", err) | ||
| } | ||
| if err := os.Remove(flock.Path()); err != nil { | ||
| slog.Error("failed to delete lock file", "path", flock.Path(), "error", err) | ||
| } | ||
| locked = false | ||
| slog.Warn("lock file removed due to timeout", "path", flock.Path()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would reintroduce a way to force the lock in case of a timeout error. What do you think? I would avoid being unable to return properties if something went from at some point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can maintain the original behavior since the probability of a race condition in our context is quite low.
Or we can add an additional check to improve the resiliency:
We can force the lock after a timeout only if the .lock file appears stale (for example, no activity in the last 30 seconds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if locking a file in Linux will change the last changed date in the file. So maybe we cannot know that we should check. I think it is ok to keep the last mechanism. Although we need to relock the file after deleting it, which isn't something we do.
if err := os.Remove(flock.Path()); err != nil {
slog.Error("failed to delete lock file", "path", flock.Path(), "error", err)
}
if err := flock.Lock(); err != nil {
slog.Error("failed to unlock file lock", "path", flock.Path(), "error", err)
}
locked = false
slog.Warn("lock file removed due to timeout", "path", flock.Path())There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked, and you are right: locking the file doesn't tell us whether the file is stale or not.
So yes, we can do something:
`func getLock(flock flock.Flock, lockFn lockFunc, errorMsg string) (UnlockFunc, error) {
ctx, cancel := context.WithTimeout(context.Background(), 3time.Second)
defer cancel()
locked, err := lockFn(ctx, 100*time.Millisecond)
if err != nil {
if errors.Is(err, context.DeadlineExceeded) {
if err := flock.Unlock(); err != nil {
slog.Error("failed to unlock file lock", "path", flock.Path(), "error", err)
}
if err := os.Remove(flock.Path()); err != nil {
slog.Error("failed to delete lock file", "path", flock.Path(), "error", err)
}
if err := flock.Lock(); err != nil {
slog.Error("failed to unlock file lock", "path", flock.Path(), "error", err)
}
locked = false
slog.Warn("lock file removed due to timeout", "path", flock.Path())
} else {
return emptyUnlockFunc, fmt.Errorf("failed trying to acquire %s for %s: %w", errorMsg, flock.Path(), err)
}
}
if !locked {
return emptyUnlockFunc, fmt.Errorf("unable to acquire %s for %s", errorMsg, flock.Path())
}
return func() error {
if err := flock.Unlock(); err != nil {
return fmt.Errorf("failed to unlock file lock for %s: %w", flock.Path(), err)
}
return nil
}, nil
}`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't what I wrote? More or less? I would also avoid to do flock.Unlock(), which should make us wait forever
Motivation
closes #109
The properties.msgpack.lock should protect properties.msgpack only during read/write operations; instead it seems that the lock file remains there indefinitely after the first access.
To reproduce
rm .local/share/arduino-app-cli/properties.msgpack.lock
curl http://localhost:8800/v1/properties/asd
file .local/share/arduino-app-cli/properties.msgpack.lock
Change description
Explicitly add file lock removal after checking if it exists
Additional Notes
Reviewer checklist
main.