Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Johnson-zs The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Reviewer's GuideRefines filesystem monitoring and text index maintenance by making move/delete semantics more accurate for files vs directories, switching directory-based index operations to use an ancestor_paths field for better performance and correctness, and simplifying resource limit handling in the filesystem watcher. Sequence diagram for updated file move handling in FSEventCollectorPrivatesequenceDiagram
participant FSMonitor
participant FSEventCollectorPrivate as FSEventCollectorPrivate
FSMonitor->>FSEventCollectorPrivate: handleFileMoved(fromPath, fromName, toPath, toName)
FSEventCollectorPrivate->>FSEventCollectorPrivate: fullFromPath = normalizePath(fromPath, fromName)
FSEventCollectorPrivate->>FSEventCollectorPrivate: fullToPath = normalizePath(toPath, toName)
FSEventCollectorPrivate->>FSEventCollectorPrivate: fromShouldIndex = shouldIndexFile(fullFromPath)
FSEventCollectorPrivate->>FSEventCollectorPrivate: toShouldIndex = shouldIndexFile(fullToPath)
alt fromShouldIndex && !toShouldIndex
FSEventCollectorPrivate->>FSEventCollectorPrivate: handleFileDeleted(fromPath, fromName)
FSEventCollectorPrivate-->>FSMonitor: return
else !fromShouldIndex && !toShouldIndex
FSEventCollectorPrivate-->>FSMonitor: return
else
FSEventCollectorPrivate->>FSEventCollectorPrivate: hasConflict = false
alt createdFilesList.contains(fullFromPath)
FSEventCollectorPrivate->>FSEventCollectorPrivate: createdFilesList.remove(fullFromPath)
alt toShouldIndex
FSEventCollectorPrivate->>FSEventCollectorPrivate: createdFilesList.insert(fullToPath)
end
end
alt deletedFilesList.contains(fullToPath)
FSEventCollectorPrivate->>FSEventCollectorPrivate: deletedFilesList.remove(fullToPath)
FSEventCollectorPrivate->>FSEventCollectorPrivate: hasConflict = true
end
alt modifiedFilesList.contains(fullFromPath)
FSEventCollectorPrivate->>FSEventCollectorPrivate: modifiedFilesList.remove(fullFromPath)
end
alt !hasConflict
FSEventCollectorPrivate->>FSEventCollectorPrivate: movedFilesList.insert(fullFromPath, fullToPath)
end
FSEventCollectorPrivate-->>FSMonitor: return
end
Updated class diagram for FSEventCollectorPrivate and FSMonitorPrivateclassDiagram
class FSEventCollectorPrivate {
%% Methods
+void handleFileMoved(QString fromPath, QString fromName, QString toPath, QString toName)
+void handleDirectoryCreated(QString path, QString name)
+void handleDirectoryDeleted(QString path, QString name)
+void handleDirectoryMoved(QString fromPath, QString fromName, QString toPath, QString toName)
+void flushCollectedEvents()
+void cleanupRedundantEntries()
+void removeEntriesCoveredByDirectories()
+bool shouldIndexFile(QString path) const
+bool isDirectory(QString path) const
+bool isMaxEventCountExceeded() const
%% Event lists
-QSet~QString~ createdFilesList
-QSet~QString~ deletedFilesList
-QSet~QString~ modifiedFilesList
-QHash~QString, QString~ movedFilesList
-QSet~QString~ deletedDirectoriesMarker
}
class FSMonitorPrivate {
+bool startMonitoring()
+void setupWorkerThread()
+void addDirectoryRecursively(QString path)
+bool addWatchForDirectory(QString path)
+void handleFileDeleted(QString path, QString name)
+void handleFileMoved(QString fromPath, QString fromName, QString toPath, QString toName)
+void handleDirectoriesBatch(QStringList paths)
%% Resource limits
-double maxUsagePercentage
-int maxWatches
%% State
-bool active
-QSet~QString~ watchedDirectories
}
class FSMonitorWorker {
+void processDirectory(QString path)
+signal directoryToWatch(QString path)
+signal subdirectoriesFound(QStringList directories)
+signal directoriesBatchToWatch(QStringList paths)
}
FSEventCollectorPrivate --> FSMonitorPrivate : used by
FSMonitorPrivate --> FSMonitorWorker : owns
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Dropping the
resourceLimitReachedflag inFSMonitorPrivatemeansresourceLimitReachedwill now be emitted and warnings logged every timeisWithinWatchLimit()fails (e.g., for each directory in a batch), which can cause log/notification spam; consider reintroducing a one-shot guard or internal throttling while still preventing further watch additions. removeEntriesCoveredByDirectories()iterates over every entry in all event sets for each deleted directory, which can become quadratic with many directories/files; consider restructuring to iterate the event sets once and skip any entry whose prefix matches a deleted directory (e.g., by precomputing a sorted list of deleted directories or using a trie-like structure).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Dropping the `resourceLimitReached` flag in `FSMonitorPrivate` means `resourceLimitReached` will now be emitted and warnings logged every time `isWithinWatchLimit()` fails (e.g., for each directory in a batch), which can cause log/notification spam; consider reintroducing a one-shot guard or internal throttling while still preventing further watch additions.
- `removeEntriesCoveredByDirectories()` iterates over every entry in all event sets for each deleted directory, which can become quadratic with many directories/files; consider restructuring to iterate the event sets once and skip any entry whose prefix matches a deleted directory (e.g., by precomputing a sorted list of deleted directories or using a trie-like structure).
## Individual Comments
### Comment 1
<location> `src/services/textindex/fsmonitor/fseventcollector.cpp:585-594` </location>
<code_context>
+void FSEventCollectorPrivate::removeEntriesCoveredByDirectories()
</code_context>
<issue_to_address>
**issue (bug_risk):** Moved files under deleted directories are not pruned from movedFilesList
This helper clears deleted/created/modified lists for paths under deleted directories, but does not touch movedFilesList. That means move events for files under a deleted directory can survive cleanupRedundantEntries() and still be processed, even though the directory (and its contents) were removed. Please either also prune movedFilesList when `fromPath` or `toPath` is under a directory in deletedDirectoriesMarker, or clarify in comments why retaining these moves is correct.
</issue_to_address>
### Comment 2
<location> `src/services/textindex/fsmonitor/fsmonitor.cpp:149-152` </location>
<code_context>
// Start monitoring
active = true;
watchedDirectories.clear();
- resourceLimitReached = false; // Reset resource limit flag
// Start worker thread
</code_context>
<issue_to_address>
**issue:** Removing the resourceLimitReached guard can cause repeated warnings and signals when the watch limit is exceeded
With `resourceLimitReached` removed, `addWatchForDirectory` is now always invoked and will emit the warning and `resourceLimitReached` signal every time `isWithinWatchLimit()` fails. With many directories beyond the limit, this can spam logs and repeatedly notify the UI. Please add a one-shot guard (or equivalent centralized check) so that once the limit is hit, we stop scheduling further `addWatch` calls and only emit the warning/signal once.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| void FSEventCollectorPrivate::removeEntriesCoveredByDirectories() | ||
| { | ||
| // Remove entries covered by deleted directories from all lists | ||
| for (const QString &dir : deletedDirectoriesMarker) { | ||
| // From deletedFilesList | ||
| QMutableSetIterator<QString> deletedIt(deletedFilesList); | ||
| while (deletedIt.hasNext()) { | ||
| const QString &path = deletedIt.next(); | ||
| if (path != dir && path.startsWith(dir + "/")) { | ||
| deletedIt.remove(); |
There was a problem hiding this comment.
issue (bug_risk): Moved files under deleted directories are not pruned from movedFilesList
This helper clears deleted/created/modified lists for paths under deleted directories, but does not touch movedFilesList. That means move events for files under a deleted directory can survive cleanupRedundantEntries() and still be processed, even though the directory (and its contents) were removed. Please either also prune movedFilesList when fromPath or toPath is under a directory in deletedDirectoriesMarker, or clarify in comments why retaining these moves is correct.
| resourceLimitReached = false; // Reset resource limit flag | ||
|
|
||
| // Start worker thread | ||
| if (!workerThread.isRunning()) { |
There was a problem hiding this comment.
issue: Removing the resourceLimitReached guard can cause repeated warnings and signals when the watch limit is exceeded
With resourceLimitReached removed, addWatchForDirectory is now always invoked and will emit the warning and resourceLimitReached signal every time isWithinWatchLimit() fails. With many directories beyond the limit, this can spam logs and repeatedly notify the UI. Please add a one-shot guard (or equivalent centralized check) so that once the limit is hit, we stop scheduling further addWatch calls and only emit the warning/signal once.
Significantly improved directory deletion performance by replacing inefficient PrefixQuery with TermQuery on ancestor_paths field. The changes include: 1. Removed redundant file extension check for deleted folders 2. Added deleted directories marker to track directory deletions 3. Implemented efficient entry cleanup using ancestor_paths field 4. Optimized move processor to use TermQuery instead of PrefixQuery 5. Simplified removeDirectoryIndex function with batch deletion The previous implementation used PrefixQuery which required scanning the entire dictionary trie, causing performance issues when deleting large directories. The new approach leverages the ancestor_paths field for exact matching, providing O(1) lookup performance. Log: Improved directory deletion performance significantly Influence: 1. Test deleting directories with various sizes and nested structures 2. Verify file deletion functionality remains unaffected 3. Test directory move operations between monitored areas 4. Validate index cleanup after directory deletions 5. Check performance with large directory structures 6. Verify blacklisted file handling still works correctly perf: 优化目录删除性能 通过使用 ancestor_paths 字段的 TermQuery 替换低效的 PrefixQuery,大幅提 升目录删除性能。主要改进包括: 1. 移除对已删除文件夹的冗余文件扩展名检查 2. 添加已删除目录标记来跟踪目录删除操作 3. 使用 ancestor_paths 字段实现高效的条目清理 4. 优化移动处理器,使用 TermQuery 替代 PrefixQuery 5. 简化 removeDirectoryIndex 函数,采用批量删除方式 之前的实现使用 PrefixQuery 需要扫描整个字典树,在删除大型目录时会导致 性能问题。新方法利用 ancestor_paths 字段进行精确匹配,提供 O(1) 的查找 性能。 Log: 显著提升目录删除性能 Influence: 1. 测试删除不同大小和嵌套结构的目录 2. 验证文件删除功能不受影响 3. 测试在监控区域之间的目录移动操作 4. 验证目录删除后的索引清理 5. 检查大型目录结构下的性能表现 6. 验证黑名单文件处理功能仍正常工作
Removed the resourceLimitReached flag that was preventing directory monitoring from resuming after hitting watch limits. The flag was causing permanent blocking of new directory watches even when directories were removed and capacity became available. Now the system properly checks available capacity for each new watch request. Key changes: 1. Eliminated resourceLimitReached flag and all conditional checks based on it 2. Simplified logic to only check current watch capacity using isWithinWatchLimit() 3. Maintained resource limit warnings and signals but removed the persistent blocking behavior 4. Improved directory removal handling by using removeWatchForDirectory method Log: Fixed issue where file system monitoring would not resume after hitting directory watch limits Influence: 1. Test monitoring large directory structures that exceed system watch limits 2. Verify that removing directories frees up capacity for new watches 3. Check that resource limit warnings are still properly emitted 4. Test directory moves and deletions to ensure watch cleanup works correctly 5. Verify monitoring resumes automatically when capacity becomes available fix: 移除资源限制达到标志以恢复监控 删除了resourceLimitReached标志,该标志在达到监视限制后阻止目录监控恢复。 该标志导致即使目录被移除且容量可用时,新的目录监视也会被永久阻塞。现在系 统会为每个新的监视请求正确检查可用容量。 主要变更: 1. 移除resourceLimitReached标志及所有基于该标志的条件检查 2. 简化逻辑,仅使用isWithinWatchLimit()检查当前监视容量 3. 保留资源限制警告和信号,但移除了永久阻塞行为 4. 通过使用removeWatchForDirectory方法改进目录移除处理 Log: 修复文件系统监控在达到目录监视限制后无法恢复的问题 Influence: 1. 测试监控超过系统监视限制的大型目录结构 2. 验证移除目录后是否释放容量用于新的监视 3. 检查资源限制警告是否仍正确发出 4. 测试目录移动和删除,确保监视清理正常工作 5. 验证容量可用时监控是否自动恢复
The fix addresses a specific scenario where file indexes were not being properly deleted when files were moved from indexed to non-indexed locations. Previously, the code only checked if either source or target files should be indexed, which could leave orphaned index entries when files were moved to locations with unsupported extensions or excluded paths. Key changes: 1. Added explicit checks for both source and target file indexing status 2. Implemented four distinct move scenarios with appropriate handling 3. Added proper debug logging for each scenario to aid troubleshooting 4. Scenario 1 (indexed → non-indexed) now correctly triggers index deletion 5. Scenario 3 (non-indexed → non-indexed) is properly ignored 6. Maintained existing behavior for other scenarios (non-indexed → indexed and indexed → indexed) Log: Fixed file index cleanup when moving files to non-indexed locations Influence: 1. Test moving files from supported extensions (e.g., .txt) to unsupported extensions (e.g., .abc) 2. Verify index entries are removed when files move to excluded directories 3. Test normal file renames between indexed locations (e.g., a.txt → b.txt) 4. Verify files moving from non-indexed to indexed locations are properly indexed 5. Check debug logs for correct scenario classification during file operations fix: 正确处理文件索引监控中的文件移动场景 此修复解决了一个特定场景下的问题:当文件从索引位置移动到非索引位置时,文 件索引未能正确删除。之前的代码只检查源文件或目标文件是否应该被索引,这可 能导致文件移动到不支持扩展名或排除路径时留下孤立的索引条目。 主要变更: 1. 添加了对源文件和目标文件索引状态的显式检查 2. 实现了四种不同的移动场景及其相应处理逻辑 3. 为每种场景添加了适当的调试日志以帮助故障排除 4. 场景1(索引 → 非索引)现在正确触发索引删除 5. 场景3(非索引 → 非索引)被正确忽略 6. 保持了其他场景(非索引 → 索引 和 索引 → 索引)的现有行为 Log: 修复文件移动到非索引位置时的索引清理问题 Influence: 1. 测试从支持扩展名(如.txt)移动到不支持扩展名(如.abc)的文件 2. 验证文件移动到排除目录时索引条目被移除 3. 测试索引位置之间的正常文件重命名(如a.txt → b.txt) 4. 验证从非索引位置移动到索引位置的文件被正确索引 5. 检查文件操作期间调试日志中的正确场景分类
deepin pr auto reviewGit Diff 代码审查报告总体评估这次代码提交主要改进了文件系统监控和索引处理逻辑,特别是在目录移动、删除和索引清理方面做了优化。代码整体质量良好,但有几个方面可以进一步改进。 详细审查1. 配置文件修改 (org.deepin.dde.file-manager.textindex.json)问题: "testdir",
"xwechat_files"改进建议:
2. FSEventCollector 修改 (fseventcollector.cpp)优点:
问题和改进建议:
void FSEventCollectorPrivate::removeEntriesCoveredByDirectories()
{
// Remove entries covered by deleted directories from all lists
for (const QString &dir : deletedDirectoriesMarker) {
// 从三个列表中迭代删除
QMutableSetIterator<QString> deletedIt(deletedFilesList);
while (deletedIt.hasNext()) {
// ...
}
// ...
}
}
if (path != dir && path.startsWith(dir + "/")) {
deletedIt.remove();
}
3. FSMonitor 修改 (fsmonitor.cpp)优点:
问题和改进建议:
if (!isWithinWatchLimit()) {
fmWarning() << "FSMonitor: Watch limit reached (" << watchedDirectories.size()
<< "/" << maxWatches << "), cannot add directory:" << path;
Q_EMIT q_ptr->resourceLimitReached(watchedDirectories.size(), maxWatches);
return false;
}
void FSMonitorPrivate::handleFileDeleted(const QString &path, const QString &name)
{
// ...
if (isDirectory(fullPath)) {
// ...
removeWatchForDirectory(fullPath);
}
// ...
}
4. 索引处理修改 (moveprocessor.cpp 和 taskhandler.cpp)优点:
问题和改进建议:
// 使用 TermQuery 在 ancestor_paths 字段上进行精确匹配
// ancestor_paths 存储的目录路径不带尾部斜杠
TermQueryPtr ancestorQuery = newLucene<TermQuery>(
newLucene<Term>(L"ancestor_paths", fromPath.toStdWString()));
TopDocsPtr allDocs = m_searcher->search(ancestorQuery, m_reader->maxDoc());
if (!allDocs || allDocs->totalHits == 0) {
fmDebug() << "[DirectoryMoveProcessor::processDirectoryMove] No documents found for directory move:" << fromPath;
return true; // Not an error, directory might be empty or not indexed
}
int32_t deleteCount = allDocs->totalHits;
writer->deleteDocuments(ancestorQuery);
5. RemoveFileListHandler 修改 (taskhandler.cpp)优点:
问题和改进建议:
for (const QString &itemPath : fileList) {
// ...
TermQueryPtr ancestorQuery = newLucene<TermQuery>(
newLucene<Term>(L"ancestor_paths", itemPath.toStdWString()));
TopDocsPtr result = searcher->search(ancestorQuery, 1);
// ...
}
安全性建议
总结这次提交在性能和代码组织方面有显著改进,特别是在索引处理方面。主要改进点包括:
主要需要改进的地方:
建议在合并前解决这些问题,以提高代码的健壮性和可维护性。 |
Added "xwechat_files" to the file manager's textindex ignore list to prevent WeChat file directories from being indexed. This improves indexing performance and avoids unnecessary processing of temporary WeChat files that don't require search functionality. Log: Added xwechat_files directory to textindex ignore list Influence: 1. Verify that files in xwechat_files directories are no longer indexed 2. Test search functionality to ensure xwechat_files content is excluded 3. Check indexing performance with WeChat directories present 4. Confirm other ignored directories still function correctly feat: 在文本索引忽略列表中添加 xwechat_files 将 "xwechat_files" 添加到文件管理器的文本索引忽略列表中,防止微信文件目 录被索引。 这提高了索引性能,避免对不需要搜索功能的临时微信文件进行不必要的处理。 Log: 在文本索引忽略列表中添加 xwechat_files 目录 Influence: 1. 验证 xwechat_files 目录中的文件不再被索引 2. 测试搜索功能以确保 xwechat_files 内容被排除 3. 检查存在微信目录时的索引性能 4. 确认其他被忽略的目录仍能正常工作
Summary by Sourcery
Improve file system monitoring and text index maintenance around directory and file moves/deletions, and optimize directory index removal using ancestor path queries.
Bug Fixes:
Enhancements: