Conversation
Signed-off-by: Arsen Gumin <gumin@live.ru>
Summary of ChangesHello @aagumin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the model CSI driver by introducing a file exclusion mechanism. Users can now specify patterns to prevent certain files or directories from being pulled and mounted with a model, providing greater control over storage and potentially optimizing resource usage. This feature is integrated into the CSI volume creation and node publishing workflows, ensuring that only relevant model assets are deployed. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a feature to filter files during model mounting using gitignore-style patterns provided via an exclude-files parameter. The implementation adds logic to parse, validate, and apply these patterns. The core pattern matching is handled in a new patterns.go file. While the feature is a good addition, my review identifies several areas for improvement, including code duplication in validation logic, potential bugs in directory pattern matching, and some inefficiencies. Addressing these points will enhance the robustness and maintainability of the new feature.
pkg/service/patterns.go
Outdated
| path = filepath.ToSlash(path) | ||
| pathParts := strings.Split(path, "/") | ||
|
|
||
| return m.matcher.Match(pathParts, false) |
There was a problem hiding this comment.
The isDir argument to m.matcher.Match is hardcoded to false. This will cause incorrect behavior for gitignore patterns that are meant to match directories (e.g., logs/). The FilePatternMatcher.Match function should accept a boolean isDir argument and pass it to the underlying go-git matcher.
This will also require changes in filterFilesByPatterns to pass info.IsDir() and to correctly handle directory matches (e.g., by using os.RemoveAll and returning filepath.SkipDir). A similar issue exists in puller.go where Match is called without directory information.
| return m.matcher.Match(pathParts, false) | |
| return m.matcher.Match(pathParts, isDir) |
| // Validate patterns for security | ||
| for _, p := range excludeFilePatterns { | ||
| if strings.HasPrefix(p, "/") && len(p) > 1 { | ||
| return nil, isStaticVolume, status.Errorf(codes.InvalidArgument, "invalid parameter:%s: absolute paths not allowed: %s", s.cfg.Get().ParameterKeyExcludeFiles(), p) | ||
| } | ||
| if strings.Contains(p, "..") { | ||
| return nil, isStaticVolume, status.Errorf(codes.InvalidArgument, "invalid parameter:%s: parent directory reference not allowed: %s", s.cfg.Get().ParameterKeyExcludeFiles(), p) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
This validation logic for excludeFilePatterns is duplicated in pkg/service/dynamic_server_handler.go, pkg/service/node.go, and pkg/service/patterns.go. To improve maintainability and ensure consistency, this logic should be extracted into a single, reusable function (e.g., ValidateExcludePatterns in the service package). This new function can then be called from all the places where validation is needed.
| } | ||
|
|
||
| if info.IsDir() && path != targetDir { | ||
| isEmpty, _ := isDirEmpty(path) |
There was a problem hiding this comment.
The error returned from isDirEmpty is being ignored. This function can return errors other than io.EOF (e.g., permission errors when opening the directory), which would be silently ignored. This could lead to incorrect behavior, such as not removing a directory that should be removed. The error should be checked and handled, for example by logging it and continuing the walk.
isEmpty, err := isDirEmpty(path)
if err != nil {
logger.Logger().WithError(err).Warnf("Failed to check if directory is empty: %s", path)
return nil
}
if isEmpty {| if err.Error() == "EOF" { | ||
| return true, nil // Directory is empty | ||
| } |
There was a problem hiding this comment.
Comparing the error string with "EOF" is brittle and not idiomatic Go. It's better to use errors.Is(err, io.EOF) or err == io.EOF for this check. This will make the code more robust against changes in the error message. Note that this will require importing the io package.
if err == io.EOF {
return true, nil // Directory is empty
}| matcher, err := NewFilePatternMatcher(excludeFilePatterns) | ||
| if err != nil { | ||
| return errors.Wrap(err, "create file pattern matcher") | ||
| } |
There was a problem hiding this comment.
Signed-off-by: Arsen Gumin <gumin@live.ru>
| } | ||
|
|
||
| func (cfg *RawConfig) ParameterKeyExcludeFiles() string { | ||
| return cfg.ServiceName + "/exclude-files" |
There was a problem hiding this comment.
Maybe return cfg.ServiceName + "/exclude-file-patterns".
| excludeFilePatternsParam := strings.TrimSpace(parameters[s.cfg.Get().ParameterKeyExcludeFiles()]) | ||
| var excludeFilePatterns []string | ||
| if excludeFilePatternsParam != "" { | ||
| if err := json.Unmarshal([]byte(excludeFilePatternsParam), &excludeFilePatterns); err != nil { |
There was a problem hiding this comment.
What about using a comma-separated format for the parameter value? For example: /foo,/bar.
| } | ||
|
|
||
| // Validate exclude_file_patterns | ||
| for _, p := range req.ExcludeFilePatterns { |
There was a problem hiding this comment.
I think this path security check is unnecessary because, after using gitignore to match the results of modelArtifact.GetPatterns, a secure list of absolute paths can be obtained.
| excludeFilePatternsParam := volumeAttributes[s.cfg.Get().ParameterKeyExcludeFiles()] | ||
| var excludeFilePatterns []string | ||
| if excludeFilePatternsParam != "" { | ||
| if err := json.Unmarshal([]byte(excludeFilePatternsParam), &excludeFilePatterns); err != nil { |
|
|
||
| // NewFilePatternMatcher creates a new pattern matcher from a list of gitignore-compatible patterns | ||
| func NewFilePatternMatcher(patterns []string) (*FilePatternMatcher, error) { | ||
| // Validate patterns for security |
|
|
||
| logger.WithContext(ctx).Infof("Applying file exclusion patterns: %v", excludeFilePatterns) | ||
|
|
||
| _, err = filterFilesByPatterns(targetDir, matcher) |
There was a problem hiding this comment.
The file system operation should not be introduced, after we set up fetchPatterns, the files pulled by b.Fetch should already be what the end user needs.
implement #17