
fix bugs #3584

Open
Johnson-zs wants to merge 4 commits into linuxdeepin:master from Johnson-zs:master

Conversation

@Johnson-zs
Contributor

@Johnson-zs Johnson-zs commented Feb 12, 2026

Summary by Sourcery

Improve file system monitoring and text index maintenance around directory and file moves/deletions, and optimize directory index removal using ancestor path queries.

Bug Fixes:

  • Correctly handle file move events where source and destination differ in indexability, ensuring index entries are created, updated, or deleted as appropriate.
  • Track deleted directories explicitly so that file events under those directories are deduplicated and cleaned up when flushing collected events.
  • Fix directory move/delete handling in the monitor to keep directory watches in sync and properly signal resource limit conditions without relying on a persistent limit flag.
  • Handle directory moves from/to outside monitored paths explicitly in the event collector to avoid misclassification of events.
  • Ensure directory deletions mark both the directory and its contents correctly in deletion lists for downstream processing.
  • Determine whether paths in removal tasks are files or directories using ancestor path information, preventing mis-removal scenarios.

Enhancements:

  • Optimize directory index removal and directory move processing by querying on an ancestor_paths field with TermQuery and performing batch deletions/updates instead of path prefix scans.

@deepin-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Johnson-zs

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sourcery-ai

sourcery-ai bot commented Feb 12, 2026

Reviewer's Guide

Refines filesystem monitoring and text index maintenance by making move/delete semantics more accurate for files vs directories, switching directory-based index operations to use an ancestor_paths field for better performance and correctness, and simplifying resource limit handling in the filesystem watcher.

Sequence diagram for updated file move handling in FSEventCollectorPrivate

sequenceDiagram
    participant FSMonitor
    participant FSEventCollectorPrivate as FSEventCollectorPrivate

    FSMonitor->>FSEventCollectorPrivate: handleFileMoved(fromPath, fromName, toPath, toName)
    FSEventCollectorPrivate->>FSEventCollectorPrivate: fullFromPath = normalizePath(fromPath, fromName)
    FSEventCollectorPrivate->>FSEventCollectorPrivate: fullToPath = normalizePath(toPath, toName)

    FSEventCollectorPrivate->>FSEventCollectorPrivate: fromShouldIndex = shouldIndexFile(fullFromPath)
    FSEventCollectorPrivate->>FSEventCollectorPrivate: toShouldIndex = shouldIndexFile(fullToPath)

    alt fromShouldIndex && !toShouldIndex
        FSEventCollectorPrivate->>FSEventCollectorPrivate: handleFileDeleted(fromPath, fromName)
        FSEventCollectorPrivate-->>FSMonitor: return
    else !fromShouldIndex && !toShouldIndex
        FSEventCollectorPrivate-->>FSMonitor: return
    else
        FSEventCollectorPrivate->>FSEventCollectorPrivate: hasConflict = false

        alt createdFilesList.contains(fullFromPath)
            FSEventCollectorPrivate->>FSEventCollectorPrivate: createdFilesList.remove(fullFromPath)
            alt toShouldIndex
                FSEventCollectorPrivate->>FSEventCollectorPrivate: createdFilesList.insert(fullToPath)
            end
        end

        alt deletedFilesList.contains(fullToPath)
            FSEventCollectorPrivate->>FSEventCollectorPrivate: deletedFilesList.remove(fullToPath)
            FSEventCollectorPrivate->>FSEventCollectorPrivate: hasConflict = true
        end

        alt modifiedFilesList.contains(fullFromPath)
            FSEventCollectorPrivate->>FSEventCollectorPrivate: modifiedFilesList.remove(fullFromPath)
        end

        alt !hasConflict
            FSEventCollectorPrivate->>FSEventCollectorPrivate: movedFilesList.insert(fullFromPath, fullToPath)
        end

        FSEventCollectorPrivate-->>FSMonitor: return
    end
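The branches in the sequence diagram can be sketched in standard C++ as follows. This is a minimal illustration, not the PR's code: `EventLists` is a hypothetical stand-in for the collector's Qt containers, the two boolean flags stand for the results of `shouldIndexFile()`, and the indexed → non-indexed branch is simplified to a plain insert into the deleted list instead of a full `handleFileDeleted()` call.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the collector's event lists (the PR uses QSet/QHash).
struct EventLists {
    std::set<std::string> created;
    std::set<std::string> deleted;
    std::set<std::string> modified;
    std::map<std::string, std::string> moved;   // from -> to
};

// Follows the diagram's branches. fromShouldIndex/toShouldIndex are assumed to be
// the results of shouldIndexFile() for the source and destination paths.
void handleFileMoved(EventLists &ev, const std::string &from, const std::string &to,
                     bool fromShouldIndex, bool toShouldIndex)
{
    if (!toShouldIndex) {
        // indexed -> non-indexed: the old entry has to leave the index.
        // non-indexed -> non-indexed: nothing to record.
        if (fromShouldIndex)
            ev.deleted.insert(from);   // simplification of handleFileDeleted()
        return;
    }

    bool hasConflict = false;

    // A file created and then moved within the same batch is simply created at the target.
    if (ev.created.erase(from) > 0)
        ev.created.insert(to);

    // The target was pending deletion: cancel that deletion and skip the move record.
    if (ev.deleted.erase(to) > 0)
        hasConflict = true;

    // A pending modification of the old path no longer applies.
    ev.modified.erase(from);

    if (!hasConflict)
        ev.moved[from] = to;
}
```

Passing the two flags in, rather than computing them inside, keeps the sketch independent of how indexability is actually decided.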

Updated class diagram for FSEventCollectorPrivate and FSMonitorPrivate

classDiagram
    class FSEventCollectorPrivate {
        %% Methods
        +void handleFileMoved(QString fromPath, QString fromName, QString toPath, QString toName)
        +void handleDirectoryCreated(QString path, QString name)
        +void handleDirectoryDeleted(QString path, QString name)
        +void handleDirectoryMoved(QString fromPath, QString fromName, QString toPath, QString toName)
        +void flushCollectedEvents()
        +void cleanupRedundantEntries()
        +void removeEntriesCoveredByDirectories()
        +bool shouldIndexFile(QString path) const
        +bool isDirectory(QString path) const
        +bool isMaxEventCountExceeded() const

        %% Event lists
        -QSet~QString~ createdFilesList
        -QSet~QString~ deletedFilesList
        -QSet~QString~ modifiedFilesList
        -QHash~QString, QString~ movedFilesList
        -QSet~QString~ deletedDirectoriesMarker
    }

    class FSMonitorPrivate {
        +bool startMonitoring()
        +void setupWorkerThread()
        +void addDirectoryRecursively(QString path)
        +bool addWatchForDirectory(QString path)
        +void handleFileDeleted(QString path, QString name)
        +void handleFileMoved(QString fromPath, QString fromName, QString toPath, QString toName)
        +void handleDirectoriesBatch(QStringList paths)

        %% Resource limits
        -double maxUsagePercentage
        -int maxWatches

        %% State
        -bool active
        -QSet~QString~ watchedDirectories
    }

    class FSMonitorWorker {
        +void processDirectory(QString path)
        +signal directoryToWatch(QString path)
        +signal subdirectoriesFound(QStringList directories)
        +signal directoriesBatchToWatch(QStringList paths)
    }

    FSEventCollectorPrivate --> FSMonitorPrivate : used by
    FSMonitorPrivate --> FSMonitorWorker : owns

File-Level Changes

Change Details Files
Adjust file move handling in FSEventCollector so index updates reflect whether source/target paths are indexable, and properly handle directory deletes/moves with directory-aware cleanup.
  • Remove special-case indexing for deleted folders in shouldIndexFile, so only supported extensions are considered indexable.
  • Refactor handleFileMoved to compute from/to indexability once, treat moves from indexed to non-indexed as deletions, ignore moves between non-indexed files, and reuse the target indexability when rewriting createdFilesList.
  • Change handleDirectoryDeleted to mark deleted directories in a dedicated set and insert them into deletedFilesList instead of delegating to handleFileDeleted.
  • Update handleDirectoryMoved to explicitly treat moves to/from outside the monitored tree via directoryCreated/directoryDeleted and otherwise delegate to handleFileMoved.
  • Ensure flushCollectedEvents and clearEvents also clear the new deletedDirectoriesMarker set.
  • Introduce removeEntriesCoveredByDirectories and call it from cleanupRedundantEntries so events under deleted directories are pruned from created/deleted/modified lists.
src/services/textindex/fsmonitor/fseventcollector.cpp
src/services/textindex/fsmonitor/fseventcollector_p.h
Optimize index cleanup for directories by using ancestor_paths-based TermQuery and by classifying paths as files or directories based on index content rather than heuristics.
  • In removeDirectoryIndex, replace path-prefix based PrefixQuery on the path field with a TermQuery on the ancestor_paths field, then delete all matching docs via writer->deleteDocuments(Query) and update progress in bulk.
  • Tighten error handling and logging in removeDirectoryIndex, including an early return when no hits are found and a single debug message summarizing deletions.
  • In RemoveFileListHandler, create a single IndexSearcher up front and, for each input path, query ancestor_paths to decide if it represents a directory (has descendants) or a file; call removeDirectoryIndex for directories and removeFile for files, updating counters and logs accordingly.
  • Fix a minor logging alignment issue in cleanupIndexs where a debug stream continuation was misindented.
src/services/textindex/task/taskhandler.cpp
Simplify FSMonitor resource limit handling by removing the persistent resourceLimitReached flag and always consulting current watch limits, while centralizing watch removal for deleted/moved directories.
  • Stop resetting resourceLimitReached in startMonitoring and remove checks that short-circuited worker callbacks when the flag was set, so behavior depends solely on isWithinWatchLimit and the current active flag.
  • In addWatchForDirectory and handleDirectoriesBatch, rely directly on isWithinWatchLimit and always emit resourceLimitReached when the limit is hit, logging which directory or batch is affected.
  • Update handleFileDeleted and handleFileMoved for directories to use removeWatchForDirectory instead of duplicating watcher removal logic.
  • Remove the resourceLimitReached member from FSMonitorPrivate since it is no longer used.
src/services/textindex/fsmonitor/fsmonitor.cpp
src/services/textindex/fsmonitor/fsmonitor_p.h
Align directory move processing with the new ancestor_paths-based indexing, ensuring documents under moved directories are found and updated efficiently.
  • In DirectoryMoveProcessor::processDirectoryMove, replace the PrefixQuery on the path field with a TermQuery on ancestor_paths using the raw fromPath (without trailing slash).
  • Reintroduce normalizedFromPath only for computing new paths when rewriting document paths after a directory move, keeping its use separate from the index query.
  • Maintain existing behavior for early exit when no docs are found or when the operation is cancelled via TaskState.
src/services/textindex/task/moveprocessor.cpp
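To illustrate why the move processor keeps two forms of the source path, here is a minimal sketch in standard C++. The helper names `withoutTrailingSlash`, `asPrefix`, and `rewritePathAfterMove` are hypothetical; the sketch only assumes, as the change description states, that `ancestor_paths` terms carry no trailing slash while the path rewrite needs a slash-terminated prefix.

```cpp
#include <string>

// Form used for the index query: no trailing '/' (the root "/" is left alone).
std::string withoutTrailingSlash(std::string path)
{
    while (path.size() > 1 && path.back() == '/')
        path.pop_back();
    return path;
}

// Form used for rewriting: always ends in exactly one '/'.
std::string asPrefix(const std::string &dir)
{
    const std::string base = withoutTrailingSlash(dir);
    return base == "/" ? base : base + "/";
}

// Rewrite a document path after its directory moved from fromDir to toDir.
// Paths that are not inside fromDir are returned unchanged.
std::string rewritePathAfterMove(const std::string &docPath,
                                 const std::string &fromDir,
                                 const std::string &toDir)
{
    const std::string fromPrefix = asPrefix(fromDir);
    if (docPath.compare(0, fromPrefix.size(), fromPrefix) != 0)
        return docPath;
    return asPrefix(toDir) + docPath.substr(fromPrefix.size());
}
```

Comparing against the slash-terminated prefix is what keeps a sibling such as `/home/u/docs2` from being treated as part of `/home/u/docs`.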


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • Dropping the resourceLimitReached flag in FSMonitorPrivate means resourceLimitReached will now be emitted and warnings logged every time isWithinWatchLimit() fails (e.g., for each directory in a batch), which can cause log/notification spam; consider reintroducing a one-shot guard or internal throttling while still preventing further watch additions.
  • removeEntriesCoveredByDirectories() iterates over every entry in all event sets for each deleted directory, which can become quadratic with many directories/files; consider restructuring to iterate the event sets once and skip any entry whose prefix matches a deleted directory (e.g., by precomputing a sorted list of deleted directories or using a trie-like structure).
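One way to get the single-pass behaviour suggested in the second point is to keep the deleted directories in a hash set and walk each event path's ancestors. The sketch below uses standard containers instead of the PR's Qt types, and the names `isCoveredByDeletedDir` and `pruneCovered` are hypothetical.

```cpp
#include <set>
#include <string>
#include <unordered_set>

// True if any proper ancestor directory of path is in deletedDirs.
// The cost depends on the depth of path, not on how many directories were deleted.
bool isCoveredByDeletedDir(const std::string &path,
                           const std::unordered_set<std::string> &deletedDirs)
{
    std::string::size_type pos = path.rfind('/');
    while (pos != std::string::npos && pos > 0) {
        if (deletedDirs.count(path.substr(0, pos)) > 0)
            return true;
        pos = path.rfind('/', pos - 1);
    }
    return false;
}

// Erase every entry that lies under a deleted directory, in one pass over the set.
// The deleted directory itself is kept, matching the `path != dir` check in the PR.
void pruneCovered(std::set<std::string> &events,
                  const std::unordered_set<std::string> &deletedDirs)
{
    for (auto it = events.begin(); it != events.end();) {
        if (isCoveredByDeletedDir(*it, deletedDirs))
            it = events.erase(it);
        else
            ++it;
    }
}
```

With n event paths of depth at most d, this does roughly n·d hash lookups regardless of how many directories were deleted. A trie or a sorted list would be alternatives with different trade-offs.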
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Dropping the `resourceLimitReached` flag in `FSMonitorPrivate` means `resourceLimitReached` will now be emitted and warnings logged every time `isWithinWatchLimit()` fails (e.g., for each directory in a batch), which can cause log/notification spam; consider reintroducing a one-shot guard or internal throttling while still preventing further watch additions.
- `removeEntriesCoveredByDirectories()` iterates over every entry in all event sets for each deleted directory, which can become quadratic with many directories/files; consider restructuring to iterate the event sets once and skip any entry whose prefix matches a deleted directory (e.g., by precomputing a sorted list of deleted directories or using a trie-like structure).

## Individual Comments

### Comment 1
<location> `src/services/textindex/fsmonitor/fseventcollector.cpp:585-594` </location>
<code_context>
+void FSEventCollectorPrivate::removeEntriesCoveredByDirectories()
</code_context>

<issue_to_address>
**issue (bug_risk):** Moved files under deleted directories are not pruned from movedFilesList

This helper clears deleted/created/modified lists for paths under deleted directories, but does not touch movedFilesList. That means move events for files under a deleted directory can survive cleanupRedundantEntries() and still be processed, even though the directory (and its contents) were removed. Please either also prune movedFilesList when `fromPath` or `toPath` is under a directory in deletedDirectoriesMarker, or clarify in comments why retaining these moves is correct.
</issue_to_address>

### Comment 2
<location> `src/services/textindex/fsmonitor/fsmonitor.cpp:149-152` </location>
<code_context>
     // Start monitoring
     active = true;
     watchedDirectories.clear();
-    resourceLimitReached = false;   // Reset resource limit flag

     // Start worker thread
</code_context>

<issue_to_address>
**issue:** Removing the resourceLimitReached guard can cause repeated warnings and signals when the watch limit is exceeded

With `resourceLimitReached` removed, `addWatchForDirectory` is now always invoked and will emit the warning and `resourceLimitReached` signal every time `isWithinWatchLimit()` fails. With many directories beyond the limit, this can spam logs and repeatedly notify the UI. Please add a one-shot guard (or equivalent centralized check) so that once the limit is hit, we stop scheduling further `addWatch` calls and only emit the warning/signal once.
</issue_to_address>


Comment on lines +585 to +594
void FSEventCollectorPrivate::removeEntriesCoveredByDirectories()
{
    // Remove entries covered by deleted directories from all lists
    for (const QString &dir : deletedDirectoriesMarker) {
        // From deletedFilesList
        QMutableSetIterator<QString> deletedIt(deletedFilesList);
        while (deletedIt.hasNext()) {
            const QString &path = deletedIt.next();
            if (path != dir && path.startsWith(dir + "/")) {
                deletedIt.remove();

issue (bug_risk): Moved files under deleted directories are not pruned from movedFilesList

This helper clears deleted/created/modified lists for paths under deleted directories, but does not touch movedFilesList. That means move events for files under a deleted directory can survive cleanupRedundantEntries() and still be processed, even though the directory (and its contents) were removed. Please either also prune movedFilesList when fromPath or toPath is under a directory in deletedDirectoriesMarker, or clarify in comments why retaining these moves is correct.
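A possible shape for pruning the moved list is sketched below in standard C++. The policy is an assumption for illustration and not something the PR or the review specifies: a move whose destination lies under a deleted directory is converted into a deletion of the source, a move whose source lies under a deleted directory is converted into a creation of the destination, and a move entirely inside a deleted directory is dropped. `EventLists`, `isUnderDeletedDir`, and `pruneMovesUnderDeletedDirs` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the collector's lists (the PR uses QSet/QHash).
struct EventLists {
    std::set<std::string> created;
    std::set<std::string> deleted;
    std::map<std::string, std::string> moved;   // from -> to
};

// True if any proper ancestor directory of path is in deletedDirs.
bool isUnderDeletedDir(const std::string &path,
                       const std::unordered_set<std::string> &deletedDirs)
{
    std::string::size_type pos = path.rfind('/');
    while (pos != std::string::npos && pos > 0) {
        if (deletedDirs.count(path.substr(0, pos)) > 0)
            return true;
        pos = path.rfind('/', pos - 1);
    }
    return false;
}

// Assumed policy: no move record may reference a path under a deleted directory.
void pruneMovesUnderDeletedDirs(EventLists &ev,
                                const std::unordered_set<std::string> &deletedDirs)
{
    for (auto it = ev.moved.begin(); it != ev.moved.end();) {
        const bool fromGone = isUnderDeletedDir(it->first, deletedDirs);
        const bool toGone = isUnderDeletedDir(it->second, deletedDirs);
        if (!fromGone && !toGone) {
            ++it;
            continue;
        }
        if (toGone && !fromGone)
            ev.deleted.insert(it->first);    // the file ended up in a deleted directory
        else if (fromGone && !toGone)
            ev.created.insert(it->second);   // the file left the directory before it was deleted
        it = ev.moved.erase(it);
    }
}
```

Simply erasing such moves would be shorter, but it could leave a stale entry for the source path or miss a file that was moved out before the directory was deleted. Whether that matters depends on event ordering guarantees that this page does not describe.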

Comment on lines -149 to 152
resourceLimitReached = false; // Reset resource limit flag

// Start worker thread
if (!workerThread.isRunning()) {

issue: Removing the resourceLimitReached guard can cause repeated warnings and signals when the watch limit is exceeded

With resourceLimitReached removed, addWatchForDirectory is now always invoked and will emit the warning and resourceLimitReached signal every time isWithinWatchLimit() fails. With many directories beyond the limit, this can spam logs and repeatedly notify the UI. Please add a one-shot guard (or equivalent centralized check) so that once the limit is hit, we stop scheduling further addWatch calls and only emit the warning/signal once.
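A guard that notifies only once per "limit reached" episode, yet re-arms itself when capacity returns, might look like the sketch below. It is an illustration in standard C++ under stated assumptions: `WatchLimitNotifier` and `tryAcquire` are hypothetical names, and the caller is expected to emit the warning and the `resourceLimitReached` signal only when `shouldNotify` is set.

```cpp
#include <cstddef>

// Decides whether a watch may be added and whether the caller should warn about it.
class WatchLimitNotifier {
public:
    explicit WatchLimitNotifier(std::size_t maxWatches) : m_max(maxWatches) {}

    // Returns true if another watch may be added.
    // shouldNotify is set only on the first refusal since capacity was last available.
    bool tryAcquire(std::size_t currentWatches, bool &shouldNotify)
    {
        shouldNotify = false;
        if (currentWatches < m_max) {
            m_notified = false;   // capacity is available again: re-arm the notification
            return true;
        }
        if (!m_notified) {
            m_notified = true;
            shouldNotify = true;
        }
        return false;
    }

private:
    std::size_t m_max;
    bool m_notified = false;
};
```

Unlike a persistent flag that is only reset in `startMonitoring`, this never blocks a request that would fit, so monitoring can resume once directories are removed, while repeated refusals stay silent.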

Significantly improved directory deletion performance by replacing
inefficient PrefixQuery with TermQuery on ancestor_paths field. The
changes include:
1. Removed redundant file extension check for deleted folders
2. Added deleted directories marker to track directory deletions
3. Implemented efficient entry cleanup using ancestor_paths field
4. Optimized move processor to use TermQuery instead of PrefixQuery
5. Simplified removeDirectoryIndex function with batch deletion

The previous implementation used PrefixQuery which required scanning the
entire dictionary trie, causing performance issues when deleting large
directories. The new approach leverages the ancestor_paths field for
exact matching, providing O(1) lookup performance.
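The idea behind an `ancestor_paths` field can be shown with a toy in-memory index. This is a sketch in standard C++, not Lucene++ code: `ancestorPaths` and `ToyIndex` are hypothetical, and it assumes each document stores every ancestor directory of its path, without trailing slashes and excluding the root. Finding a directory's documents is then one exact-term lookup, although the deletion itself still scales with the number of matching documents.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// All ancestor directories of path, nearest first, e.g.
// "/home/u/docs/a.txt" -> "/home/u/docs", "/home/u", "/home".
std::vector<std::string> ancestorPaths(const std::string &path)
{
    std::vector<std::string> out;
    std::string::size_type pos = path.rfind('/');
    while (pos != std::string::npos && pos > 0) {
        out.push_back(path.substr(0, pos));
        pos = path.rfind('/', pos - 1);
    }
    return out;
}

class ToyIndex {
public:
    void add(const std::string &docPath)
    {
        m_docs.insert(docPath);
        for (const std::string &dir : ancestorPaths(docPath))
            m_byAncestor[dir].insert(docPath);
    }

    bool contains(const std::string &docPath) const { return m_docs.count(docPath) > 0; }

    // A path with at least one descendant document is treated as a directory.
    bool hasDescendants(const std::string &path) const { return m_byAncestor.count(path) > 0; }

    // Remove every document under dir; returns how many were removed.
    std::size_t removeDirectory(const std::string &dir)
    {
        const auto found = m_byAncestor.find(dir);
        if (found == m_byAncestor.end())
            return 0;
        const std::set<std::string> victims = found->second;   // copy: the map is edited below
        for (const std::string &doc : victims) {
            m_docs.erase(doc);
            for (const std::string &ancestor : ancestorPaths(doc)) {
                const auto entry = m_byAncestor.find(ancestor);
                if (entry == m_byAncestor.end())
                    continue;
                entry->second.erase(doc);
                if (entry->second.empty())
                    m_byAncestor.erase(entry);
            }
        }
        return victims.size();
    }

private:
    std::set<std::string> m_docs;
    std::unordered_map<std::string, std::set<std::string>> m_byAncestor;
};
```

The same lookup doubles as a file-versus-directory test: a path that appears as an ancestor term has indexed descendants. The trade-off is a larger index, since each document carries one extra term per directory level.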

Log: Improved directory deletion performance significantly

Influence:
1. Test deleting directories with various sizes and nested structures
2. Verify file deletion functionality remains unaffected
3. Test directory move operations between monitored areas
4. Validate index cleanup after directory deletions
5. Check performance with large directory structures
6. Verify blacklisted file handling still works correctly

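perf: optimize directory deletion performance

Significantly improved directory deletion performance by replacing the
inefficient PrefixQuery with a TermQuery on the ancestor_paths field.
The main improvements are:
1. Removed the redundant file extension check for deleted folders
2. Added a deleted-directories marker to track directory deletions
3. Implemented efficient entry cleanup using the ancestor_paths field
4. Optimized the move processor to use TermQuery instead of PrefixQuery
5. Simplified removeDirectoryIndex by switching to batch deletion

The previous implementation used PrefixQuery, which had to scan the
entire dictionary trie and caused performance problems when deleting
large directories. The new approach uses the ancestor_paths field for
exact matching, providing O(1) lookup performance.

Log: Significantly improved directory deletion performance

Influence:
1. Test deleting directories of different sizes and nesting depths
2. Verify that file deletion is unaffected
3. Test directory moves between monitored areas
4. Validate index cleanup after directory deletion
5. Check performance with large directory structures
6. Verify that blacklisted file handling still works correctly
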
Removed the resourceLimitReached flag that was preventing directory
monitoring from resuming after hitting watch limits. The flag
was causing permanent blocking of new directory watches even when
directories were removed and capacity became available. Now the system
properly checks available capacity for each new watch request.

Key changes:
1. Eliminated resourceLimitReached flag and all conditional checks based
on it
2. Simplified logic to only check current watch capacity using
isWithinWatchLimit()
3. Maintained resource limit warnings and signals but removed the
persistent blocking behavior
4. Improved directory removal handling by using removeWatchForDirectory
method

Log: Fixed issue where file system monitoring would not resume after
hitting directory watch limits

Influence:
1. Test monitoring large directory structures that exceed system watch
limits
2. Verify that removing directories frees up capacity for new watches
3. Check that resource limit warnings are still properly emitted
4. Test directory moves and deletions to ensure watch cleanup works
correctly
5. Verify monitoring resumes automatically when capacity becomes
available

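fix: remove the resource-limit-reached flag so monitoring can resume

Removed the resourceLimitReached flag, which prevented directory
monitoring from resuming after the watch limit was hit. The flag caused
new directory watches to be blocked permanently, even after directories
were removed and capacity became available. The system now checks the
available capacity for each new watch request.

Key changes:
1. Removed the resourceLimitReached flag and all conditional checks
based on it
2. Simplified the logic to check only the current watch capacity via
isWithinWatchLimit()
3. Kept the resource limit warnings and signals but removed the
permanent blocking behavior
4. Improved directory removal handling by using the
removeWatchForDirectory method

Log: Fixed file system monitoring not resuming after hitting the
directory watch limit

Influence:
1. Test monitoring large directory structures that exceed the system
watch limit
2. Verify that removing directories frees capacity for new watches
3. Check that resource limit warnings are still emitted correctly
4. Test directory moves and deletions to ensure watch cleanup works
5. Verify that monitoring resumes automatically when capacity becomes
available
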
The fix addresses a specific scenario where file indexes were not being
properly deleted when files were moved from indexed to non-indexed
locations. Previously, the code only checked if either source or target
files should be indexed, which could leave orphaned index entries when
files were moved to locations with unsupported extensions or excluded
paths.

Key changes:
1. Added explicit checks for both source and target file indexing status
2. Implemented four distinct move scenarios with appropriate handling
3. Added proper debug logging for each scenario to aid troubleshooting
4. Scenario 1 (indexed → non-indexed) now correctly triggers index
deletion
5. Scenario 3 (non-indexed → non-indexed) is properly ignored
6. Maintained existing behavior for other scenarios (non-indexed →
indexed and indexed → indexed)

Log: Fixed file index cleanup when moving files to non-indexed locations

Influence:
1. Test moving files from supported extensions (e.g., .txt) to
unsupported extensions (e.g., .abc)
2. Verify index entries are removed when files move to excluded
directories
3. Test normal file renames between indexed locations (e.g., a.txt
→ b.txt)
4. Verify files moving from non-indexed to indexed locations are
properly indexed
5. Check debug logs for correct scenario classification during file
operations
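The four move scenarios listed above can be expressed as a small classifier. The sketch below is illustrative only: the extension whitelist in `shouldIndexFile` is made up for the example, excluded directories are not modelled, and `MoveScenario` and `classifyMove` are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

enum class MoveScenario {
    IndexedToNonIndexed,      // the old index entry must be deleted
    NonIndexedToIndexed,      // the file becomes indexable at its new path
    NonIndexedToNonIndexed,   // ignored
    IndexedToIndexed          // ordinary rename or move
};

// Hypothetical whitelist; the real set of supported extensions is not shown on this page.
bool shouldIndexFile(const std::string &path)
{
    static const std::set<std::string> kSupported{"txt", "md", "pdf"};
    const std::string::size_type slash = path.rfind('/');
    const std::string::size_type dot = path.rfind('.');
    // No extension at all, or the last dot belongs to a directory name.
    if (dot == std::string::npos || (slash != std::string::npos && dot < slash))
        return false;
    std::string ext = path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return kSupported.count(ext) > 0;
}

MoveScenario classifyMove(const std::string &from, const std::string &to)
{
    const bool fromIndexed = shouldIndexFile(from);
    const bool toIndexed = shouldIndexFile(to);
    if (fromIndexed && !toIndexed)
        return MoveScenario::IndexedToNonIndexed;
    if (!fromIndexed && toIndexed)
        return MoveScenario::NonIndexedToIndexed;
    if (!fromIndexed)
        return MoveScenario::NonIndexedToNonIndexed;
    return MoveScenario::IndexedToIndexed;
}
```

Evaluating both flags once up front, instead of testing whether either side is indexable, is what makes the indexed → non-indexed case distinguishable from an ordinary rename.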

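fix: handle file move scenarios correctly in file index monitoring

This fix addresses a specific scenario in which file indexes were not
deleted properly when files were moved from indexed to non-indexed
locations. Previously, the code only checked whether either the source
or the target file should be indexed, which could leave orphaned index
entries when files were moved to unsupported extensions or excluded
paths.

Key changes:
1. Added explicit checks of the indexing status of both source and
target files
2. Implemented four distinct move scenarios with appropriate handling
3. Added debug logging for each scenario to aid troubleshooting
4. Scenario 1 (indexed → non-indexed) now correctly triggers index
deletion
5. Scenario 3 (non-indexed → non-indexed) is properly ignored
6. Kept the existing behavior for the other scenarios (non-indexed →
indexed and indexed → indexed)

Log: Fixed index cleanup when files are moved to non-indexed locations

Influence:
1. Test moving files from supported extensions (e.g., .txt) to
unsupported extensions (e.g., .abc)
2. Verify that index entries are removed when files move to excluded
directories
3. Test normal file renames between indexed locations (e.g., a.txt
→ b.txt)
4. Verify that files moving from non-indexed to indexed locations are
indexed correctly
5. Check the debug logs for correct scenario classification during file
operations
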
@deepin-ci-robot
Contributor

deepin pr auto review

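Git Diff Code Review Report

Overall Assessment

This commit mainly improves the file system monitoring and index handling logic, with optimizations to directory moves, deletions, and index cleanup in particular. Overall code quality is good, but several areas could be improved further.

Detailed Review

1. Configuration file changes (org.deepin.dde.file-manager.textindex.json)

Issue

"testdir",
"xwechat_files"

Suggestions

  • Inconsistent indentation: "xwechat_files" is preceded by extra whitespace and should match the other entries
  • Consider adding a comment explaining why this directory is excluded, to ease future maintenance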

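2. FSEventCollector changes (fseventcollector.cpp)

Strengths

  • Improved the file move handling logic, with a clearer separation of the different scenarios
  • Added detailed comments explaining how each scenario is handled
  • Added deletedDirectoriesMarker to track deleted directories

Issues and suggestions

  1. Performance
void FSEventCollectorPrivate::removeEntriesCoveredByDirectories()
{
    // Remove entries covered by deleted directories from all lists
    for (const QString &dir : deletedDirectoriesMarker) {
        // Iterate over the three lists and remove covered entries
        QMutableSetIterator<QString> deletedIt(deletedFilesList);
        while (deletedIt.hasNext()) {
            // ...
        }
        // ...
    }
}
  • Issue: the time complexity is O(n*m), which performs poorly when many directories and files are deleted
  • Suggestion: consider storing paths in a prefix tree (trie), or sorting the paths first and processing them in batches
  2. Code duplication

    • The removal logic for the three lists is nearly identical and could be extracted into a template function to reduce duplication
  3. Potential security issue

if (path != dir && path.startsWith(dir + "/")) {
    deletedIt.remove();
}
  • The path concatenation is not normalized, which may lead to path traversal issues
  • Suggestion: normalize paths with QDir::cleanPath() or a similar function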
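A normalized containment check along these lines could be sketched in standard C++ with `std::filesystem`, where `lexically_normal()` plays a role similar to `QDir::cleanPath()`. The helpers `cleanPath` and `isUnderDirectory` are hypothetical and assume absolute POSIX-style paths; neither touches the disk.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Collapse ".", ".." and repeated separators lexically, and drop any trailing slash.
std::string cleanPath(const std::string &raw)
{
    fs::path p = fs::path(raw).lexically_normal();
    // "/a/b/" normalizes to "/a/b/" (empty last component); strip it unless p is the root.
    if (!p.has_filename() && p.has_parent_path() && p != p.root_path())
        p = p.parent_path();
    return p.generic_string();
}

// True if path lies strictly inside dir, comparing whole path components.
bool isUnderDirectory(const std::string &path, const std::string &dir)
{
    const std::string p = cleanPath(path);
    const std::string d = cleanPath(dir);
    if (d == "/")
        return p.size() > 1 && p[0] == '/';
    return p.size() > d.size()
        && p.compare(0, d.size(), d) == 0
        && p[d.size()] == '/';
}
```

Checking for the separator right after the directory prefix is what distinguishes `/data/old/a.txt` from `/data/older/a.txt`. Normalizing both sides first means a trailing slash or a `..` segment cannot change the result.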

3. FSMonitor changes (fsmonitor.cpp)

Strengths

  • Simplified the resource limit check by removing the resourceLimitReached flag
  • Unified the code path for adding watched directories

Issues and suggestions

  1. Resource limit handling
if (!isWithinWatchLimit()) {
    fmWarning() << "FSMonitor: Watch limit reached (" << watchedDirectories.size()
                << "/" << maxWatches << "), cannot add directory:" << path;
    Q_EMIT q_ptr->resourceLimitReached(watchedDirectories.size(), maxWatches);
    return false;
}
  • Issue: a warning and a signal are emitted every time the limit is reached, which may flood the logs and signal receivers
  • Suggestion: consider debouncing or rate-limiting the warning
  2. Removing directory watches
void FSMonitorPrivate::handleFileDeleted(const QString &path, const QString &name)
{
    // ...
    if (isDirectory(fullPath)) {
        // ...
        removeWatchForDirectory(fullPath);
    }
    // ...
}
  • Suggestion: make sure removeWatchForDirectory handles error cases correctly at every call site

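4. Index handling changes (moveprocessor.cpp and taskhandler.cpp)

Strengths

  • Uses the ancestor_paths field instead of a prefix query, improving performance
  • Simplified directory index removal by using batch deletion instead of deleting entries one by one

Issues and suggestions

  1. Inconsistent path handling
// Use a TermQuery for exact matching on the ancestor_paths field
// Directory paths stored in ancestor_paths have no trailing slash
TermQueryPtr ancestorQuery = newLucene<TermQuery>(
        newLucene<Term>(L"ancestor_paths", fromPath.toStdWString()));
  • Issue: the comment says the paths have no trailing slash, but no code ensures that fromPath meets this requirement
  • Suggestion: add path normalization so that queried and stored paths use the same format
  2. Error handling
TopDocsPtr allDocs = m_searcher->search(ancestorQuery, m_reader->maxDoc());
if (!allDocs || allDocs->totalHits == 0) {
    fmDebug() << "[DirectoryMoveProcessor::processDirectoryMove] No documents found for directory move:" << fromPath;
    return true;   // Not an error, directory might be empty or not indexed
}
  • Issue: a null allDocs and a totalHits of 0 may need different handling
  • Suggestion: consider handling the two cases separately, or add more detailed logging
  3. Atomicity of batch deletion
int32_t deleteCount = allDocs->totalHits;
writer->deleteDocuments(ancestorQuery);
  • Suggestion: add transaction handling or an error recovery mechanism so that a failed batch deletion can be recovered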

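5. RemoveFileListHandler changes (taskhandler.cpp)

Strengths

  • Optimized the logic that distinguishes files from directories, reducing unnecessary queries
  • Uses the ancestor_paths field to improve query efficiency

Issues and suggestions

  1. Query efficiency
for (const QString &itemPath : fileList) {
    // ...
    TermQueryPtr ancestorQuery = newLucene<TermQuery>(
            newLucene<Term>(L"ancestor_paths", itemPath.toStdWString()));
    TopDocsPtr result = searcher->search(ancestorQuery, 1);
    // ...
}
  • Issue: one query is executed per item, which is inefficient when the file list is large
  • Suggestion: consider batching the queries or processing them on multiple threads
  2. Error handling
    • Exceptions that searcher->search may throw are not handled
    • Suggestion: add a try-catch block to handle possible exceptions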

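Security recommendations

  1. Path traversal protection

    • Normalize paths before every path operation
    • Verify that paths fall within the allowed scope
  2. Resource limits

    • Limit the number of files processed to prevent resource exhaustion
    • Implement a timeout mechanism to prevent long-running operations
  3. Input validation

    • Validate and sanitize all external input
    • Pay particular attention to event data coming from the file system

Summary

This commit brings notable improvements in performance and code organization, especially in index handling. The main improvements are:

  1. Using the ancestor_paths field instead of prefix queries, which improves query efficiency
  2. Simplified resource limit checking
  3. Improved handling of directory moves and deletions

The main areas that need improvement are:

  1. Consistency and normalization of path handling
  2. Performance of batch operations
  3. Error handling and recovery mechanisms
  4. Elimination of duplicated code

These issues should be addressed before merging to improve the robustness and maintainability of the code.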

Added "xwechat_files" to the file manager's textindex ignore list to
prevent WeChat file directories from being indexed. This improves
indexing performance and avoids unnecessary processing of temporary
WeChat files that don't require search functionality.

Log: Added xwechat_files directory to textindex ignore list

Influence:
1. Verify that files in xwechat_files directories are no longer indexed
2. Test search functionality to ensure xwechat_files content is excluded
3. Check indexing performance with WeChat directories present
4. Confirm other ignored directories still function correctly

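feat: add xwechat_files to the text index ignore list

Added "xwechat_files" to the file manager's text index ignore list to
prevent WeChat file directories from being indexed. This improves
indexing performance and avoids unnecessary processing of temporary
WeChat files that do not need to be searchable.

Log: Added the xwechat_files directory to the text index ignore list

Influence:
1. Verify that files in xwechat_files directories are no longer indexed
2. Test search to ensure xwechat_files content is excluded
3. Check indexing performance when WeChat directories are present
4. Confirm that other ignored directories still work correctly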