Conversation
WalkthroughAdds DITA ditamap parsing and rewriting: new XML parser dependency, utilities to recursively collect referenced files, compute a common ancestor, copy and rewrite ditamaps to use relative hrefs, an import fallback for json-schema-ref-parser, and new unit/integration tests with DITA sample data. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Index as processDitaMap
participant Utils as DITA Utils
participant FS as Filesystem
participant DITA as DITA Processor
User->>Index: processDitaMap(ditamapPath)
Index->>Utils: parseDitamap(ditamapPath)
Utils->>Utils: recursively parse mapref/topicref\nresolve local hrefs, skip external, detect circulars
Utils-->>Index: referencedPaths[]
alt Parent/sibling traversal detected
Index->>Utils: findCommonAncestor(ditamapPath, referencedPaths)
Utils-->>Index: commonAncestor
Index->>Utils: copyAndRewriteDitamap(ditamapPath, commonAncestor)
Utils->>FS: read original ditamap
Utils->>Utils: rewrite href attributes to relative paths
Utils->>FS: write rewritten ditamap -> newDitamapPath
Utils-->>Index: newDitamapPath
Index->>DITA: run DITA on newDitamapPath
DITA-->>Index: generation result
Index->>FS: cleanup temporary rewritten ditamap
else No traversal needed
Index->>DITA: run DITA on original ditamapPath
DITA-->>Index: generation result
end
Index-->>User: result / exit code
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45-90 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/utils.js (3)
8-8: Remove unused importXMLBuilder.
XMLBuilderis imported but not used anywhere in this file. The code uses regex-based replacement for rewriting the ditamap instead.-const { XMLParser, XMLBuilder } = require("fast-xml-parser"); +const { XMLParser } = require("fast-xml-parser");
124-127: Use the built-in logging system instead ofconsole.warn.Per coding guidelines for
src/**/*.js, use the built-in logging system with available log levels: debug, info, warn, error. However,parseDitamapdoesn't have access toconfig. Consider either passingconfigas a parameter or accepting this as a limitation for this utility function.One option is to make warnings silent or accept a logger callback:
-function parseDitamap(ditamapPath, visitedMaps = new Set(), depth = 0, maxDepth = 10) { +function parseDitamap(ditamapPath, visitedMaps = new Set(), depth = 0, maxDepth = 10, onWarning = null) {Then at line 126:
- console.warn(`Warning: Failed to parse nested ditamap ${absolutePath}: ${error.message}`); + if (onWarning) onWarning(`Failed to parse nested ditamap ${absolutePath}: ${error.message}`);
514-521: Consider falling back to original processing instead of aborting on parse failure.Currently, if
parseDitamapfails, DITA processing is aborted entirely. The original behavior was to process the ditamap without path rewriting. Consider falling back to that behavior for better resilience.} catch (error) { log( config, "warning", - `Failed to parse ditamap for path analysis: ${error.message}. Aborting DITA processing.` + `Failed to parse ditamap for path analysis: ${error.message}. Proceeding without path rewriting.` ); - return null; + // Continue with original source - path rewriting won't occur }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (13)
package.json(1 hunks)src/index.test.js(1 hunks)src/openapi.js(1 hunks)src/utils.js(4 hunks)src/utils.test.js(1 hunks)test/data/dita/parent-sibling-refs/maps/circular-map-a.ditamap(1 hunks)test/data/dita/parent-sibling-refs/maps/circular-map-b.ditamap(1 hunks)test/data/dita/parent-sibling-refs/maps/main-map-with-mapref.ditamap(1 hunks)test/data/dita/parent-sibling-refs/maps/nested-map.ditamap(1 hunks)test/data/dita/parent-sibling-refs/maps/test-map.ditamap(1 hunks)test/data/dita/parent-sibling-refs/parent-topics/parent-topic.dita(1 hunks)test/data/dita/parent-sibling-refs/sibling-topics/nested/nested-topic.dita(1 hunks)test/data/dita/parent-sibling-refs/sibling-topics/sibling-topic.dita(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{js,ts}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts}: Use async/await for asynchronous operations
Prefer destructuring for function parameters
Use meaningful variable names that reflect Doc Detective terminology
Add JSDoc comments for complex functions
Files:
src/openapi.jssrc/utils.test.jssrc/index.test.jssrc/utils.js
src/**/*.js
📄 CodeRabbit inference engine (AGENTS.md)
Use the built-in logging system with available log levels: debug, info, warn, error
Files:
src/openapi.jssrc/utils.test.jssrc/index.test.jssrc/utils.js
**/*.test.js
📄 CodeRabbit inference engine (AGENTS.md)
**/*.test.js: When possible, directly import and run functions rather than use extensive mocking and stubbing in tests
Mock external dependencies (file system, HTTP requests) in tests
Test both successful and error scenarios
Validate configuration handling thoroughly in tests
Use realistic test data that matches actual usage patterns
Use Mocha for unit tests
Use Chai for assertions
Files:
src/utils.test.jssrc/index.test.js
🧠 Learnings (5)
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to **/*.test.js : Mock external dependencies (file system, HTTP requests) in tests
Applied to files:
src/openapi.jssrc/utils.test.js
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to **/*.test.js : When possible, directly import and run functions rather than use extensive mocking and stubbing in tests
Applied to files:
src/openapi.jssrc/utils.test.js
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to **/*.test.js : Use realistic test data that matches actual usage patterns
Applied to files:
src/utils.test.js
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to **/*.test.js : Validate configuration handling thoroughly in tests
Applied to files:
src/utils.test.js
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to **/*.test.js : Test both successful and error scenarios
Applied to files:
src/utils.test.jssrc/index.test.js
🧬 Code graph analysis (3)
src/utils.test.js (1)
src/utils.js (11)
fs(1-1)path(6-6)referencedFiles(92-92)referencedFiles(486-486)i(176-176)i(304-304)i(1352-1352)i(1360-1360)result(274-274)commonAncestor(504-504)newDitamapPath(220-220)
src/index.test.js (1)
src/utils.js (9)
path(6-6)require(7-7)require(8-8)require(9-14)referencedFiles(92-92)referencedFiles(486-486)needsRewrite(487-487)commonAncestor(504-504)newDitamapPath(220-220)
src/utils.js (4)
src/index.test.js (7)
require(5-5)require(891-891)ditamapPath(896-896)ditamapPath(941-941)referencedFiles(899-899)referencedFiles(944-944)referencedFiles(998-998)src/openapi.js (1)
require(1-1)dev/index.js (3)
require(1-1)require(2-2)require(3-3)src/utils.test.js (15)
ditamapPath(15-15)ditamapPath(34-34)ditamapPath(53-53)ditamapPath(163-163)ditamapPath(175-175)ditamapPath(189-189)ditamapPath(204-204)ditamapPath(218-218)ditamapPath(229-229)ditamapPath(240-240)referencedFiles(16-16)referencedFiles(35-35)referencedFiles(56-56)referencedFiles(99-99)i(123-123)
🪛 GitHub Actions: Test (& Publish)
src/utils.test.js
[error] 171-171: findCommonAncestor test failed. Expected POSIX path '/home/user/docs' but got Windows path 'D:\home\user\docs'.
[error] 183-183: findCommonAncestor test failed. Expected POSIX path '/home/user' but got Windows path 'D:\home\user'.
[error] 225-225: findCommonAncestor test failed. Expected POSIX path '/home/user/docs' but got Windows path 'D:\home\user\docs'.
[error] 235-235: findCommonAncestor test failed. Expected POSIX path '/home/user/docs/maps' but got Windows path 'D:\home\user\docs\maps'.
🔇 Additional comments (15)
test/data/dita/parent-sibling-refs/maps/circular-map-b.ditamap (1)
1-6: LGTM!This test file correctly establishes a circular reference with
circular-map-a.ditamapfor testing the circular reference detection inparseDitamap. The DITA map structure is valid.test/data/dita/parent-sibling-refs/maps/circular-map-a.ditamap (1)
1-6: LGTM!Valid DITA map structure that pairs with
circular-map-b.ditamapto test circular reference detection.src/utils.js (2)
151-199: LGTM!The common ancestor logic is well-implemented with proper handling for edge cases including the Unix root directory and fallback to
process.cwd()when paths have no common ancestor (e.g., different Windows drives).
222-242: Regex may misshrefwhen it's the first attribute.The regex pattern requires whitespace before
href=(\s+href=), which won't match whenhrefis the first attribute directly after the tag name:<topicref href="topic.dita"/> <!-- Won't match - no space before href --> <topicref format="dita" href="topic.dita"/> <!-- Will match -->Consider adjusting the regex to make the preceding content optional or use a more flexible pattern:
- const hrefRegex = /(<(?:topicref|mapref)[^>]*\s+href=")([^"]+)(")/g; + const hrefRegex = /(<(?:topicref|mapref)(?:\s+[^>]*)?\s+href="|<(?:topicref|mapref)\s+href=")([^"]+)(")/g;Or more simply:
- const hrefRegex = /(<(?:topicref|mapref)[^>]*\s+href=")([^"]+)(")/g; + const hrefRegex = /(<(?:topicref|mapref)\s[^>]*?href=")([^"]+)(")/g;⛔ Skipped due to learnings
Learnt from: CR Repo: doc-detective/resolver PR: 0 File: AGENTS.md:0-0 Timestamp: 2025-11-25T23:59:45.298Z Learning: Applies to src/resolve.js : Follow existing regex patterns for markup detection when adding new featurestest/data/dita/parent-sibling-refs/maps/nested-map.ditamap (1)
1-6: LGTM!Valid DITA map structure that tests parent directory traversal (
../sibling-topics/) for the path rewriting functionality. The relative path reference appropriately exercises theparseDitamapandcopyAndRewriteDitamaputilities.package.json (1)
32-32: No issues found withfast-xml-parser@^5.3.2dependency.Version 5.3.2 is the latest release on npm and has no known security vulnerabilities. The package is actively maintained and the caret version constraint allows for safe patch and minor updates.
test/data/dita/parent-sibling-refs/maps/main-map-with-mapref.ditamap (1)
1-7: DITA map fixture looks consistent with parent/nested structureThe topicref and mapref paths correctly reflect the parent-topics and maps directory layout and match how tests consume this fixture.
test/data/dita/parent-sibling-refs/sibling-topics/sibling-topic.dita (1)
1-14: Sibling topic fixture is well-formed and aligned with test conventionsXML, DITA DOCTYPE, test metadata comments, and inline step comment all follow the same pattern as existing DITA fixtures and will work for detection tests.
test/data/dita/parent-sibling-refs/parent-topics/parent-topic.dita (1)
1-14: Parent topic fixture is consistent and ready for path‑resolution testsThe topic structure, test metadata, and checkLink step are consistent with other DITA fixtures and suitable for exercising parent-directory resolution.
test/data/dita/parent-sibling-refs/maps/test-map.ditamap (1)
1-8: Test map correctly models parent and sibling referencesThe three topicref hrefs correctly point to parent-topics and sibling-topics (including nested) relative to the maps directory, matching what parse/rewrite tests assert.
test/data/dita/parent-sibling-refs/sibling-topics/nested/nested-topic.dita (1)
1-14: Nested sibling topic fixture is correctly structuredThe nested topic XML and embedded test metadata/comments are consistent with other DITA fixtures, and the path matches the expectations in the ditamap tests.
src/index.test.js (1)
889-1015: DITA parent/sibling integration tests are thorough and correctly scopedThe new integration suite exercises the full parse→analyze→rewrite flow (including nested maprefs and no‑rewrite scenarios), uses realistic fixture data, and reliably cleans up temporary outputs. This gives solid end‑to‑end coverage of the new DITA map behavior.
src/utils.test.js (2)
10-159: parseDitamap tests provide strong coverage of core behaviorsThe parseDitamap suite thoroughly exercises topicrefs, recursive maprefs (including circular), malformed XML handling, external HTTP reference skipping, recursion depth limits, and non-existent file paths. Use of real fixture files plus targeted temp files and consistent cleanup matches the testing guidelines and should keep regressions in check.
251-397: copyAndRewriteDitamap tests validate key edge cases and platform nuancesThe copyAndRewriteDitamap tests cover correct path rewriting, XML prolog/DOCTYPE preservation, mapref handling, skipping external HTTP URLs, missing files, malformed XML, and forward‑slash normalization in hrefs. Temp files are tracked and cleaned up in afterEach, reducing test pollution. This suite gives strong confidence in the new DITA rewrite behavior across platforms.
src/openapi.js (1)
5-15: I'll verify the race condition claim in the review comment by examining the actual code and usage patterns in src/openapi.js.
<function_calls>
cat -n src/openapi.js | head -50
</function_calls>
<function_calls>
wc -l src/openapi.js
</function_calls>
<function_calls>
grep -n "loadDescription|parser." src/openapi.js | head -30
</function_calls>
|
📝 Documentation updates detected! New suggestion: Document DITA map processing and parent/sibling path resolution |
|
This job failed primarily due to two issues: 1. Infinite Recursion in Ditamap ParsingThe warning shows: Solution:
2. Git Submodule Configuration ErrorSolution:
Next Steps
Would you like further details on where the recursion depth is set, or a code review for your |
|
@hawkeyexl I've opened a new pull request, #89, to work on those changes. Once the pull request is ready, I'll request review from you. |
…isons - Add normalizePath helper to convert backslashes to forward slashes and strip drive letters. - Use normalizePath when asserting expected ancestors in sibling, parent, single-reference and empty-reference tests so assertions are OS-independent.
- Change newDitamapPath from const to let so it can be modified - If the computed new path would be the same as the original, append "_rewritten" before the extension to avoid overwriting the source
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/utils.js (2)
44-141: parseDitamap recursion limit and logging could be more flexibleThe core logic (visited set for circular refs + depth guard + external-URL skipping) looks solid and should prevent infinite recursion on cyclic maps.
Two refinements to consider:
- The fixed
maxDepth = 10means valid but deeply nested hierarchies will deterministically fail with “Maximum recursion depth exceeded”, andprocessDitaMapwill abort DITA processing. Given the earlier CI failure on deep maps, it would help to either (a) raise the default (e.g., 20+) or (b) makemaxDepthconfigurable via a config option or an exported constant thatprocessDitaMapcan override.- The nested-map failure path uses
console.warn, while the rest of this module routes through theloghelper. For consistency with the logging guidelines forsrc/**/*.js, consider allowing a logger to be injected (e.g., optional callback parameter) or threading the existinglog/configdown fromprocessDitaMapso these warnings show up in the same logging pipeline.
151-199: Common-ancestor fallback toprocess.cwd()may be surprisingThe common-prefix computation is correct for typical cases and nicely handles Windows vs. POSIX paths. The special-case fallback:
if (commonComponents.length === 0 || (commonComponents.length === 1 && commonComponents[0] === '')) { return process.cwd(); }means that when paths only share the filesystem root (e.g.,
/foo/...and/bar/...or different Windows drives), you return the current working directory rather than a true ancestor of all inputs.This is workable (relative paths will still be computable), but semantically a bit odd; you might instead prefer returning the original ditamap’s directory or the actual root (
/on POSIX) so the “common ancestor” is always guaranteed to lie on all path prefixes.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utils.js(4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{js,ts}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts}: Use async/await for asynchronous operations
Prefer destructuring for function parameters
Use meaningful variable names that reflect Doc Detective terminology
Add JSDoc comments for complex functions
Files:
src/utils.js
src/**/*.js
📄 CodeRabbit inference engine (AGENTS.md)
Use the built-in logging system with available log levels: debug, info, warn, error
Files:
src/utils.js
🧬 Code graph analysis (1)
src/utils.js (2)
src/index.test.js (8)
require(5-5)require(891-891)ditamapPath(896-896)ditamapPath(941-941)path(890-890)referencedFiles(899-899)referencedFiles(944-944)referencedFiles(998-998)src/utils.test.js (17)
ditamapPath(15-15)ditamapPath(34-34)ditamapPath(53-53)ditamapPath(169-169)ditamapPath(181-181)ditamapPath(195-195)ditamapPath(210-210)ditamapPath(224-224)ditamapPath(235-235)ditamapPath(246-246)path(2-2)referencedFiles(16-16)referencedFiles(35-35)referencedFiles(56-56)referencedFiles(99-99)href(378-378)i(123-123)
🔇 Additional comments (2)
src/utils.js (2)
491-541: DITA map path-analysis and rewrite flow looks correctThe new pre-processing in
processDitaMap—parsing references, detecting parent/sibling traversals viapath.relative, computing a common ancestor, rewriting to a temporary ditamap under that ancestor, then cleaning up the temp file afterditacompletes—fits well with the existing flow and uses the centralloghelper consistently.The try/catch around
parseDitamapplus the warning+early-return on failure should also prevent the previous hard failure when recursion depth is exceeded or XML parsing blows up.
8-31: Dependency declaration verified as correct
fast-xml-parseris properly declared as a runtime dependency inpackage.jsonwith version^5.3.2, not indevDependencies. The concern raised in the review comment has been verified and confirmed—no issues exist with this dependency declaration.
| function copyAndRewriteDitamap(originalPath, commonAncestor) { | ||
| // Read the original ditamap | ||
| let xmlContent; | ||
| try { | ||
| xmlContent = fs.readFileSync(originalPath, "utf8"); | ||
| } catch (error) { | ||
| throw new Error(`Failed to read ditamap file ${originalPath}: ${error.message}`); | ||
| } | ||
|
|
||
| const originalDir = path.dirname(originalPath); | ||
| let newDitamapPath = path.join(commonAncestor, path.basename(originalPath)); | ||
|
|
||
| // Avoid overwriting the original file | ||
| if (path.resolve(newDitamapPath) === path.resolve(originalPath)) { | ||
| const baseName = path.basename(originalPath, path.extname(originalPath)); | ||
| const ext = path.extname(originalPath); | ||
| newDitamapPath = path.join(commonAncestor, `${baseName}_rewritten${ext}`); | ||
| } | ||
|
|
||
| // Use regex to find and replace href attributes while preserving formatting | ||
| // Match href attributes in topicref and mapref elements | ||
| const hrefRegex = /(<(?:topicref|mapref)[^>]*\s+href=")([^"]+)(")/g; | ||
|
|
||
| let newXmlContent = xmlContent.replace(hrefRegex, (match, prefix, href, suffix) => { | ||
| // Skip external references | ||
| if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { | ||
| return match; | ||
| } | ||
|
|
||
| // Resolve the old absolute path | ||
| const oldAbsolutePath = path.resolve(originalDir, href); | ||
|
|
||
| // Calculate new relative path from the new ditamap location | ||
| const newRelativePath = path.relative(commonAncestor, oldAbsolutePath); | ||
|
|
||
| // Normalize to forward slashes for consistency | ||
| const normalizedPath = newRelativePath.replace(/\\/g, "/"); | ||
|
|
||
| return prefix + normalizedPath + suffix; | ||
| }); | ||
|
|
||
| try { | ||
| // Write the rewritten ditamap to the new location | ||
| fs.writeFileSync(newDitamapPath, newXmlContent, "utf8"); | ||
| } catch (error) { | ||
| throw new Error(`Failed to write rewritten ditamap ${newDitamapPath}: ${error.message}`); | ||
| } | ||
|
|
||
| return newDitamapPath; | ||
| } |
There was a problem hiding this comment.
Href rewrite: broaden external-URL detection and consider uniqueness of temp filename
The overall approach—regex over <topicref|mapref> + relative rebasing from commonAncestor and avoiding overwrite of the original map—looks good and addresses the earlier overwrite concern.
Two minor points:
-
External URLs
Currently you only skiphttp://,https://, and protocol-relative//:if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { return match; }
DITA
hrefvalues can legitimately be other absolute-URI schemes (mailto:,ftp:,file:, etc.). Those will get treated as filesystem paths and rewritten, which can corrupt the link target. Since you already haveisRelativeUrlin this module, consider reusing it so only genuinely relative/file-ish hrefs are rewritten:- if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { - return match; - } + // Only rewrite relative/file-like hrefs; leave absolute URLs/schemes alone + if (!isRelativeUrl(href)) { + return match; + }
-
Temp filename reuse (non-blocking)
Appending_rewrittenavoids clobbering the original, but concurrent runs on the same map will still share the same temp filename incommonAncestor. If you expect parallel invocations, you might want to include a hash (e.g., MD5 oforiginalPath+ timestamp) in the temp filename to make it unique per run. Otherwise, this is acceptable as a baseline.
🤖 Prompt for AI Agents
In src/utils.js around lines 210 to 259, the href-rewrite logic only treats
http/https/“//” as external and may incorrectly rewrite other absolute URIs
(mailto:, ftp:, file:, tel:, data:, etc.) and the fallback temp filename
_rewritten can collide across concurrent runs; fix by reusing the module's
isRelativeUrl helper to detect genuinely relative hrefs and only rewrite those,
and make the fallback filename unique (e.g., append a short hash or timestamp
derived from originalPath and Date.now()) before writing the new ditamap.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/utils.js (1)
107-123: Avoid rewriting non-relative URI schemes; reuseisRelativeUrlfor href filteringBoth
parseDitamapandcopyAndRewriteDitamapcurrently special‑case onlyhttp://,https://, and protocol‑relative//hrefs:if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { // ... }DITA
hrefvalues may legitimately use other absolute URI schemes (mailto:,ftp:,file:,tel:,data:, etc.). Those will be treated as file paths and incopyAndRewriteDitamapwill be rewritten, corrupting external links in the rewritten map.Since you already have an
isRelativeUrlhelper, you can safely limit parsing/rewriting to genuinely relative or file-like hrefs and leave absolute URIs untouched:// In parseDitamap - // Skip external references (http://, https://, etc.) - if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { - continue; - } + // Only follow relative/file-like hrefs; skip absolute URLs/schemes + if (!isRelativeUrl(href)) { + continue; + } // In copyAndRewriteDitamap - // Skip external references - if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { - return match; - } + // Only rewrite relative/file-like hrefs; leave absolute URLs/schemes unchanged + if (!isRelativeUrl(href)) { + return match; + }This prevents accidental rewriting of non-HTTP(S) URIs while still handling relative DITA links as intended.
Also, if you expect concurrent runs that might target the same map under the same
commonAncestor, consider adding a small uniqueness component (timestamp or short hash) to the_rewrittenfilename to avoid collisions, though for most serial workflows the current naming is likely sufficient.Also applies to: 233-249
🧹 Nitpick comments (2)
src/utils.js (2)
55-141: Tighten recursion handling and make depth limit easier to tune inparseDitamapThe recursion guard and circular-reference tracking look good, but a couple of tweaks would make this more robust:
- Don’t count cycles toward
maxDepth
Because the depth check runs before thevisitedMapscheck, a circular reference first hit at or beyondmaxDepthwill throw instead of being short‑circuited by the visited set. Re‑ordering the checks avoids this:function parseDitamap(ditamapPath, visitedMaps = new Set(), depth = 0, maxDepth = 10) { - // Check recursion depth - if (depth >= maxDepth) { - throw new Error(`Maximum recursion depth (${maxDepth}) exceeded while parsing ditamap: ${ditamapPath}`); - } - - // Circular reference detection - const normalizedPath = path.resolve(ditamapPath); - if (visitedMaps.has(normalizedPath)) { - return []; // Skip already-visited maps - } + // Normalize first so both checks use the same key + const normalizedPath = path.resolve(ditamapPath); + + // Circular reference detection + if (visitedMaps.has(normalizedPath)) { + return []; // Skip already-visited maps + } + + // Enforce maximum recursion depth only for unseen maps + if (depth >= maxDepth) { + throw new Error( + `Maximum recursion depth (${maxDepth}) exceeded while parsing ditamap: ${normalizedPath}` + ); + }
Expose
maxDepthvia config (not just a positional parameter)
You already support amaxDepthargument; wiring this to a config option (and passing it fromprocessDitaMap) would let callers tune depth for deep but valid DITA hierarchies without code changes.Optional: use the shared logger instead of
console.warn
To keep logging consistent with the rest ofsrc/**/*.js, consider accepting an optional logger orconfigparameter and usinglog(config, "warning", ...)for nested-ditamap failures instead ofconsole.warn.
151-199: Root-only common ancestor is treated as “no common ancestor” and makes root-handling block unreachableWhen paths only share the filesystem root (e.g.,
/home/...and/opt/...),commonComponentsbecomes[''], so this condition fires:if (commonComponents.length === 0 || (commonComponents.length === 1 && commonComponents[0] === '')) { return process.cwd(); }That means:
- The actual common ancestor
/is ignored.- The later
if (commonPath === '') { commonPath = '/'; }branch is effectively unreachable.If you want the root directory to be a valid common ancestor, you can treat that explicitly and simplify the fallback:
- // If no common ancestor found (e.g., different drives on Windows), use current working directory - if (commonComponents.length === 0 || (commonComponents.length === 1 && commonComponents[0] === '')) { - return process.cwd(); - } - - // Reconstruct the common ancestor path - let commonPath = commonComponents.join(path.sep); - - // Handle the root directory case on Unix-like systems - if (commonPath === '') { - commonPath = '/'; - } + // Reconstruct the common ancestor path + let commonPath = commonComponents.join(path.sep); + + // If only the root component is shared, use the filesystem root + if (commonPath === '') { + return path.sep; + } + + // If no common ancestor components at all (e.g., different drives on Windows), use CWD + if (commonComponents.length === 0) { + return process.cwd(); + }Adjust the exact behavior if you intentionally prefer
process.cwd()over the root directory, but in that case you can drop the now-unreachable root-handling code.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utils.js(5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{js,ts}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts}: Use async/await for asynchronous operations
Prefer destructuring for function parameters
Use meaningful variable names that reflect Doc Detective terminology
Add JSDoc comments for complex functions
Files:
src/utils.js
src/**/*.js
📄 CodeRabbit inference engine (AGENTS.md)
Use the built-in logging system with available log levels: debug, info, warn, error
Files:
src/utils.js
🧠 Learnings (1)
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to src/resolve.js : Follow existing regex patterns for markup detection when adding new features
Applied to files:
src/utils.js
🧬 Code graph analysis (1)
src/utils.js (2)
src/index.test.js (16)
require(5-5)require(891-891)ditamapPath(896-896)ditamapPath(941-941)path(890-890)referencedFiles(899-899)referencedFiles(944-944)referencedFiles(998-998)commonAncestor(912-912)commonAncestor(956-956)newDitamapPath(916-916)newDitamapPath(957-957)sourceDir(903-903)sourceDir(1001-1001)relativePath(905-905)relativePath(1003-1003)src/utils.test.js (42)
ditamapPath(15-15)ditamapPath(34-34)ditamapPath(53-53)ditamapPath(169-169)ditamapPath(181-181)ditamapPath(195-195)ditamapPath(210-210)ditamapPath(224-224)ditamapPath(235-235)ditamapPath(246-246)path(2-2)referencedFiles(16-16)referencedFiles(35-35)referencedFiles(56-56)referencedFiles(99-99)referencedPaths(170-173)referencedPaths(182-185)referencedPaths(196-199)referencedPaths(211-214)referencedPaths(225-227)referencedPaths(236-236)referencedPaths(247-249)i(123-123)commonAncestor(175-175)commonAncestor(187-187)commonAncestor(201-201)commonAncestor(216-216)commonAncestor(229-229)commonAncestor(238-238)commonAncestor(251-251)commonAncestor(274-274)commonAncestor(293-293)commonAncestor(316-316)commonAncestor(344-344)commonAncestor(359-359)commonAncestor(366-366)commonAncestor(395-395)newDitamapPath(276-276)newDitamapPath(295-295)newDitamapPath(318-318)newDitamapPath(345-345)newDitamapPath(368-368)
🔇 Additional comments (2)
src/utils.js (2)
425-433: Behavior of new.ditamaphandling inqualifyFileslooks soundThe new branch:
if (isFile && path.extname(source) === ".ditamap" && config.processDitaMaps) { const ditaOutput = await processDitaMap({config, source}); if (ditaOutput) { const currentIndex = sequence.indexOf(source); sequence.splice(currentIndex + 1, 0, ditaOutput); } continue; }
- Gates DITA processing behind
config.processDitaMaps(opt‑in).- Inserts the generated output directory immediately after the map in
sequence.- Skips further qualification of the map itself when processing fails (returns
null), which is consistent with “abort DITA processing” semantics.This integration point looks correct and matches the rest of the file’s style.
490-545: DITA map preprocessing and temp cleanup inprocessDitaMaplooks goodThe enhanced
processDitaMapflow is coherent:
Uses
parseDitamap(source)plus a relative‑path check to decide when a rewrite is actually needed (needsRewrite).Computes a
commonAncestorand callscopyAndRewriteDitamaponly in that case, avoiding unnecessary copies.Sets
cwdfor theditainvocation to the directory of the (possibly rewritten) map so relative hrefs resolve as expected:const workingDir = copiedDitamapPath ? path.dirname(copiedDitamapPath) : path.dirname(source); const ditaOutputDir = await spawnCommand("dita", ["-i", source, "-f", "dita", "-o", outputDir], { cwd: workingDir, });Catches path-analysis failures from
parseDitamap, logs a warning, and aborts cleanly by returningnullinstead of throwing.Immediately deletes the temporary rewritten ditamap on success or failure, with a debug log and warning on cleanup failure.
Overall this is a solid baseline for relative DITA resolution and should behave predictably for both rewritten and non‑rewritten maps.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/utils.js (1)
239-246: UseisRelativeUrlfor consistency withparseDitamap.This addresses more schemes than before, but for consistency and maintainability, consider using the existing
isRelativeUrlhelper as suggested in a previous review comment.- // Skip external references (http, https, ftp, etc.) - if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//") || href.startsWith("ftp://")) { - return match; - } - - // Skip absolute file:// URIs - these should not be rewritten - if (href.startsWith("file:/")) { + // Skip external/absolute references - only rewrite relative/file-like hrefs + if (!isRelativeUrl(href)) { return match; }
🧹 Nitpick comments (5)
src/utils.js (5)
8-8: Remove unused importXMLBuilder.
XMLBuilderis imported but never used in this file.-const { XMLParser, XMLBuilder } = require("fast-xml-parser"); +const { XMLParser } = require("fast-xml-parser");
126-127: Use built-inlogfunction instead ofconsole.warn.Per coding guidelines,
src/**/*.jsfiles should use the built-in logging system. However,parseDitamapdoesn't have access toconfig. Consider either passingconfigas a parameter or accepting this limitation for utility functions.
111-113: Broaden external URL detection usingisRelativeUrl.This check misses other absolute URI schemes (
mailto:,ftp:,file:,tel:,data:, etc.) which would incorrectly be treated as file paths. TheisRelativeUrlhelper already exists in this module.- // Skip external references (http://, https://, etc.) - if (href.startsWith("http://") || href.startsWith("https://") || href.startsWith("//")) { + // Skip external/absolute references + if (!isRelativeUrl(href)) { continue; }
295-295: Use built-in logging system.Similar to line 126,
console.warnshould be replaced with thelogfunction per coding guidelines. Since this utility function doesn't have access toconfig, consider either:
- Passing
configas a parameter- Accepting
console.warnfor standalone utilities- Creating a simpler logging wrapper
538-576: Consider using a separate variable instead of reassigningsource.Reassigning the
sourceparameter (line 566) can make debugging harder. Using a distinct variable likeditamapToProcesswould clarify which path is being used.+ let ditamapToProcess = source; + try { referencedFiles = parseDitamap(source); // Check if any referenced files require parent traversal const sourceDir = path.dirname(path.resolve(source)); needsRewrite = referencedFiles.some(refPath => { const relativePath = path.relative(sourceDir, refPath); return relativePath.startsWith(".."); }); if (needsRewrite) { log(config, "info", `Ditamap references parent/sibling directories. Rewriting paths...`); // Find common ancestor directory const commonAncestor = findCommonAncestor(source, referencedFiles); log(config, "debug", `Common ancestor directory: ${commonAncestor}`); // Copy and rewrite the ditamap copiedDitamapPath = copyAndRewriteDitamap(source, commonAncestor); log(config, "debug", `Created temporary ditamap: ${copiedDitamapPath}`); // Use the copied ditamap for processing - source = copiedDitamapPath; + ditamapToProcess = copiedDitamapPath; } } catch (error) {Then use
ditamapToProcessin thespawnCommandcall.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utils.js(5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{js,ts}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts}: Use async/await for asynchronous operations
Prefer destructuring for function parameters
Use meaningful variable names that reflect Doc Detective terminology
Add JSDoc comments for complex functions
Files:
src/utils.js
src/**/*.js
📄 CodeRabbit inference engine (AGENTS.md)
Use the built-in logging system with available log levels: debug, info, warn, error
Files:
src/utils.js
🧠 Learnings (1)
📚 Learning: 2025-11-25T23:59:45.298Z
Learnt from: CR
Repo: doc-detective/resolver PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T23:59:45.298Z
Learning: Applies to src/resolve.js : Follow existing regex patterns for markup detection when adding new features
Applied to files:
src/utils.js
🔇 Additional comments (5)
src/utils.js (5)
151-199: LGTM!The common ancestor logic is well-implemented with proper handling of edge cases including Unix root directories and fallback to
process.cwd().
248-251: Clarify the purpose of this hex filename check.This regex matches filenames like
a1b2c3...d4e5f6.ditamap(32+ hex chars). The comment mentions "hashes/IDs" but it's unclear why a.ditamapwith a hex name should be skipped. If this is intentional, please add a more descriptive comment explaining the use case.
473-481: LGTM!The addition of
config.processDitaMapscheck provides configuration flexibility for controlling DITA map processing.
584-592: LGTM!The cleanup logic properly handles temporary file removal with appropriate error handling and logging.
269-280: Clarify the purpose of the second existence check.The second
fs.existsSync()check is intentional: it validates that the forward-slash normalized path resolves correctly and distinguishes two failure modes in error reporting (missing source file vs. problematic path rewrite). Update the comment to clarify this defensive validation purpose instead of relying on the assumption thatpath.relative()andpath.resolve()will always round-trip correctly across platforms and filesystem edge cases.
| function parseDitamap(ditamapPath, visitedMaps = new Set(), depth = 0, maxDepth = 10) { | ||
| // Check recursion depth | ||
| if (depth >= maxDepth) { | ||
| throw new Error(`Maximum recursion depth (${maxDepth}) exceeded while parsing ditamap: ${ditamapPath}`); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n "parseDitamap\(" --type=js -A2Repository: doc-detective/resolver
Length of output: 2484
🏁 Script executed:
# Check if maxDepth is passed as argument in any calls
rg -n "parseDitamap\(" --type=js -B1 -A3Repository: doc-detective/resolver
Length of output: 3993
🏁 Script executed:
# Check for config files or config patterns
fd -e "config\\.js" -e "config\\.json" -e "\\.env" | head -20Repository: doc-detective/resolver
Length of output: 48
🏁 Script executed:
# Look for configuration usage patterns in src directory
rg "config\\..*Depth|maxDepth|MAX_DEPTH" --type=js -B2 -A2Repository: doc-detective/resolver
Length of output: 1412
🏁 Script executed:
sed -n '130,150p' src/utils.test.jsRepository: doc-detective/resolver
Length of output: 749
🏁 Script executed:
# Check if there's any configuration system in the codebase at all
rg "config\." --type=js src/ | head -20Repository: doc-detective/resolver
Length of output: 1786
🏁 Script executed:
# Look for any environment variable usage pattern in src
rg "process\\.env|require.*config" --type=js src/Repository: doc-detective/resolver
Length of output: 1583
🏁 Script executed:
# Check if there's a logger/config initialization
head -50 src/utils.jsRepository: doc-detective/resolver
Length of output: 1675
🏁 Script executed:
# Check logger implementation
cat -n src/logger.js | head -30Repository: doc-detective/resolver
Length of output: 111
🏁 Script executed:
# Find logger
fd "logger" --type=fRepository: doc-detective/resolver
Length of output: 48
🏁 Script executed:
# Check logger implementation in lib
head -40 lib/logger.jsRepository: doc-detective/resolver
Length of output: 138
🏁 Script executed:
sed -n '530,560p' src/utils.jsRepository: doc-detective/resolver
Length of output: 1108
🏁 Script executed:
# Check the resolveReferences function signature
grep -n "function resolveReferences\|exports.resolveReferences" src/utils.js -A5Repository: doc-detective/resolver
Length of output: 48
🏁 Script executed:
sed -n '520,570p' src/utils.jsRepository: doc-detective/resolver
Length of output: 1930
🏁 Script executed:
# Check if there are any recent PRs or comments about this depth issue
git log --oneline -20 -- src/utils.jsRepository: doc-detective/resolver
Length of output: 154
Consider increasing the default maxDepth or making it configurable via config.
The default maxDepth of 10 is too low for realistic DITA structures. The test suite includes a case with 12 nested maps, exceeding this limit. While the parameter can be overridden by callers, none currently do so. Since resolveReferences already receives a config object, either increase the default to a safer value (e.g., 20-50) or expose config.ditamapMaxDepth to allow users to handle deeply nested maps without modifying source code.
🤖 Prompt for AI Agents
In src/utils.js around lines 55-59, the default maxDepth (10) for parseDitamap
is too low and causes failures for realistic DITA maps; change parseDitamap to
take its maxDepth from config (e.g., config.ditamapMaxDepth) with a higher sane
default (suggest 50) and update callers (such as resolveReferences) to pass
through config.ditamapMaxDepth; specifically, modify the function signature to
accept a config or options parameter, use config.ditamapMaxDepth || 50 as the
maxDepth value, and ensure all call sites forward the existing resolveReferences
config so deep maps can be handled without source edits.
Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.