Conversation
…nization - Added HerettoUploader class to handle file uploads to Heretto CMS. - Integrated uploader into the existing upload process for changed screenshots. - Enhanced screenshot saving logic to track changes and source integration metadata. - Created documentation for Heretto screenshot sync feature, detailing the upload process and architecture. - Updated resolver to build file mappings and handle Heretto-specific configurations. - Introduced utility functions to find Heretto integrations based on file paths. - Added error handling and logging for upload operations to improve reliability.
- Create integration tests for Heretto API interactions. - Implement job asset retrieval with pagination support. - Validate presence of .ditamap files in job assets. - Update package.json scripts for integration testing. - Modify Heretto functions to improve error handling and logging.
- Update `saveScreenshot.js` to mark new screenshots as changed for upload. - Add a new test file `screenshot.test.js` to verify sourceIntegration preservation and changed status for screenshots. - Create a new test file `upload.test.js` to validate the upload module, including collecting changed files and integration configuration. - Modify `package-lock.json` to link to local `doc-detective-common` instead of a versioned package. - Enhance `heretto.js` to retrieve resource dependencies from the Heretto REST API, including detailed logging and XML parsing.
… integration tests - Refactor folder resolution logic to ensure correct target folder is identified before file upload. - Implement integration tests for Heretto upload functionality, covering document retrieval and upload processes. - Update existing tests to reflect changes in folder resolution behavior.
WalkthroughAdds a new sourceIntegration JSON schema and injects source-integration metadata across v3 schemas; implements a Heretto upload pipeline (uploader class, registry, orchestration), integrates upload-on-change into test runs, extends resolver to fetch resource dependencies/file mappings, and adds tests, docs, and CI for the feature. Changes
Sequence Diagram(s)sequenceDiagram
participant Runner as Test Runner
participant Report as Report Builder
participant Collector as collectChangedFiles
participant Orchestrator as uploadChangedFiles
participant Uploader as HerettoUploader
participant Heretto as Heretto REST API
participant FS as Local FS
Runner->>Report: execute tests (capture screenshots)
Report->>Report: set outputs.changed / screenshotPath / sourceIntegration
Report->>Collector: provide final report
Collector-->>Orchestrator: list of changed files with sourceIntegration
rect rgb(235,245,255)
note right of Orchestrator: Best-effort upload loop
Orchestrator->>Uploader: select uploader by type
loop per changed file
Uploader->>Heretto: resolve fileId (resourceDependencies/search)
alt fileId found
Heretto-->>Uploader: return fileId
else not found
Uploader->>Heretto: createDocument / search fallback
Heretto-->>Uploader: return documentId
end
Uploader->>FS: read localFilePath
FS-->>Uploader: return file content
Uploader->>Heretto: uploadFile(documentId, content)
Heretto-->>Uploader: upload result
Uploader-->>Orchestrator: per-file result (PASS/FAIL/SKIPPED)
end
end
Orchestrator-->>Report: attach uploadResults
Report-->>Runner: return enriched report
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/tests/saveScreenshot.js (1)
292-296: Missing output metadata for overwrite="true" case.When
overwriteis"true", the screenshot is replaced but the function returns without populatingresult.outputs.screenshotPath,result.outputs.changed, orresult.outputs.sourceIntegration. This breaks the upload workflow, as downstream logic expects these fields to identify changed screenshots for upload to Heretto CMS.🔎 Proposed fix
if (step.screenshot.overwrite == "true") { // Replace old file with new file result.description += ` Overwrote existing file.`; fs.renameSync(filePath, existFilePath); + result.outputs.screenshotPath = existFilePath; + result.outputs.changed = true; + // Preserve sourceIntegration metadata + if (step.screenshot.sourceIntegration) { + result.outputs.sourceIntegration = step.screenshot.sourceIntegration; + } return result; }
♻️ Duplicate comments (6)
common/src/schemas/output_schemas/config_v3.schema.json (3)
5834-5882: DuplicatesourceIntegrationdefinition.This is identical to the definition at lines 5588-5636. See the earlier comment about using
$refto consolidate.
13691-13739: DuplicatesourceIntegrationdefinition.Same as lines 5588-5636 and 5834-5882.
13937-13985: DuplicatesourceIntegrationdefinition.Fourth copy of the same schema. This reinforces the recommendation to centralize the definition.
common/src/schemas/output_schemas/test_v3.schema.json (3)
5648-5697: DuplicatesourceIntegrationdefinition.Same nested
$schemaobservation applies here as noted in the earlier occurrence. No additional concerns—structure is consistent.
12979-13028: Consistent with prior occurrences.Same
sourceIntegrationschema; no new issues beyond what was noted earlier.
13225-13274: Consistent with prior occurrences.Same
sourceIntegrationschema; no new issues beyond what was noted earlier.
🧹 Nitpick comments (26)
docs/docs/references/integrations/heretto-screenshot-sync.md (3)
52-77: Add language specifier to fenced code block.Per markdownlint, fenced code blocks should have a language specified. For the ASCII diagram, use
textorplaintext:🔎 Proposed fix
-``` +```text ┌─────────────────────────────────────────────────────────────────────┐
85-94: Add language specifier for HTTP example.This code block shows an HTTP request. Use
httpas the language specifier:🔎 Proposed fix
-``` +```http POST /ezdnxtgen/api/search
100-106: Add language specifier for HTTP example.🔎 Proposed fix
-``` +```http PUT /rest/all-files/<document-id>/contentcore/src/integrations/heretto.js (2)
34-228: Consider adding request timeouts to prevent hanging.The HTTP requests in this class don't have timeouts configured. If the Heretto API becomes unresponsive, requests will hang indefinitely. This could cause the upload process to stall without feedback.
🔎 Example fix for one request (apply pattern to all)
const req = protocol.request(options, (res) => { // ... handler code }); + + // Timeout after 30 seconds + req.setTimeout(30000, () => { + req.destroy(new Error('Request timeout')); + }); req.on("error", (error) => {
200-207: Synchronous file read may block on large files.
fs.readFileSyncblocks the event loop. For large screenshot files this is likely acceptable, but for consistency with the async nature of the class, consider usingfs.promises.readFile.🔎 Proposed async alternative
+ const fsPromises = require("fs").promises; + // Read the local file if (!fs.existsSync(localFilePath)) { result.description = `Local file not found: ${localFilePath}`; return result; } - const fileContent = fs.readFileSync(localFilePath); + const fileContent = await fsPromises.readFile(localFilePath); const contentType = this.getContentType(localFilePath);core/src/integrations/index.js (1)
199-212: Consider supporting additional integration types in the future.The switch statement currently only handles
"heretto". While this matches the current schema enum, consider adding a debug log for unknown types to aid future debugging when new integrations are added.🔎 Proposed enhancement
switch (sourceIntegration.type) { case "heretto": return config?.integrations?.heretto?.find( (h) => h.name === sourceIntegration.integrationName ); default: + // Log for debugging when new integration types are added return null; }core/dev/index.js (1)
11-11: Consider removing the commented local path.The commented line contains a Windows user path with a username (
hawkeyexl). While harmless in a dev file, it's good practice to remove such personal paths before committing to avoid cluttering the codebase with developer-specific artifacts.core/src/tests.js (1)
801-817: Consider making the integration check more extensible.The upload logic currently checks specifically for
config?.integrations?.heretto?.length > 0. If additional integrations (e.g., GitHub, GitLab) that support uploads are added in the future, this condition will need to be updated each time.Consider refactoring to check for any configured integrations that support uploads, or delegate the check to the
uploadChangedFilesfunction itself.💡 Suggested approach
Option 1: Let
uploadChangedFileshandle the logic internally:- // Upload changed files back to source integrations (best-effort) - // This automatically syncs any changed screenshots back to their source CMS - if (config?.integrations?.heretto?.length > 0) { - try { - const uploadResults = await uploadChangedFiles({ config, report, log }); - report.uploadResults = uploadResults; - } catch (error) { - log(config, "warning", `Failed to upload changed files: ${error.message}`); - report.uploadResults = { - total: 0, - successful: 0, - failed: 0, - skipped: 0, - error: error.message, - }; - } - } + // Upload changed files back to source integrations (best-effort) + // This automatically syncs any changed screenshots back to their source CMS + try { + const uploadResults = await uploadChangedFiles({ config, report, log }); + report.uploadResults = uploadResults; + } catch (error) { + log(config, "warning", `Failed to upload changed files: ${error.message}`); + report.uploadResults = { + total: 0, + successful: 0, + failed: 0, + skipped: 0, + error: error.message, + }; + }Then let
uploadChangedFilesreturn early with zero counts if no integrations are configured.Option 2: Create a helper function:
function hasUploadableIntegrations(config) { return config?.integrations?.heretto?.length > 0 || config?.integrations?.github?.length > 0 || // ... other integrations }common/src/schemas/output_schemas/step_v3.schema.json (1)
4802-4851: Avoid duplicating thesourceIntegrationschema; prefer a single reusable definitionThe two
sourceIntegrationblocks underscreenshot(main properties andcomponents.schemas.object.properties) are effectively identical and also mirror the standalonesourceIntegration_v3schema introduced elsewhere. This works, but it creates three sources of truth that can drift.If your schema tooling allows it, consider one of these to reduce maintenance:
- Reference the shared
sourceIntegration_v3schema here via$ref, or- Define a single
components.schemas.sourceIntegrationsubschema in this file and$refit from bothscreenshotshapes instead of repeating the full object.That keeps step_v3 aligned with other v3 schemas and makes future integration types (beyond
heretto) easier to add in one place.Also applies to: 5048-5097
core/test/screenshot.test.js (2)
43-52: Consider usingfs.rmSyncfor more robust cleanup.
fs.rmdirSyncis deprecated in Node.js 16+ in favor offs.rmSyncwith therecursiveoption. The current cleanup also doesn't handle cases where files might still exist due to test failures.🔎 Proposed fix
afterEach(function () { // Cleanup temp files if (fs.existsSync(tempDir)) { - const files = fs.readdirSync(tempDir); - for (const file of files) { - fs.unlinkSync(path.join(tempDir, file)); - } - fs.rmdirSync(tempDir); + fs.rmSync(tempDir, { recursive: true, force: true }); } });
84-97: Hard-coded step index may be brittle.Accessing
result.specs[0].tests[0].contexts[0].steps[1]assumes a fixed structure. If the test framework adds implicit steps or the goTo step fails, this index could break silently. Consider adding validation or extracting a helper function.🔎 Proposed helper approach
// Helper to find screenshot step function findScreenshotStep(result) { const steps = result.specs[0]?.tests[0]?.contexts[0]?.steps || []; return steps.find(s => s.outputs?.path?.endsWith('.png') || s.screenshot); }common/src/schemas/output_schemas/screenshot_v3.schema.json (2)
190-240: Consider using$refto avoid schema duplication.The
sourceIntegrationschema is fully defined inline here and duplicated again incomponents.schemas.object(lines 437-485). According to the AI summary, a dedicatedsourceIntegration_v3.schema.jsonwas added. Using$refwould reduce duplication and ensure consistency.🔎 Proposed fix using $ref
}, - "sourceIntegration": { - "description": "Information about the source integration for this screenshot, enabling upload of changed files back to the source CMS. Set automatically during test resolution for files from integrations.", - "$schema": "http://json-schema.org/draft-07/schema#", - "title": "sourceIntegration", - "type": "object", - "additionalProperties": false, - "required": [ - "type", - "integrationName" - ], - "properties": { - ... - }, - "examples": [...] + "sourceIntegration": { + "$ref": "sourceIntegration_v3.schema.json#" }
436-486: Duplicate schema definition in components.schemas.object.This is the same
sourceIntegrationschema repeated from lines 191-239. Same recommendation applies: use$reftosourceIntegration_v3.schema.jsonfor consistency and maintainability.core/test/heretto-upload.test.js (2)
32-34: Shared mutable state creates test ordering dependency.
testDocumentIdis modified in the first test of the "createDocument and upload flow" describe block and used by subsequent tests. This creates an implicit ordering dependency where later tests will skip if the first test fails. Consider documenting this intentional coupling or restructuring to make each test independent.
368-376: Usefs.rmSyncfor cleanup.
fs.rmdirSyncis deprecated in Node.js 16+. Usingfs.rmSyncwithrecursive: trueis more robust and handles non-empty directories.🔎 Proposed fix
} finally { // Cleanup - if (fs.existsSync(localFilePath)) { - fs.unlinkSync(localFilePath); - } - if (fs.existsSync(tempDir)) { - fs.rmdirSync(tempDir); - } + if (fs.existsSync(tempDir)) { + fs.rmSync(tempDir, { recursive: true, force: true }); + } }common/src/schemas/output_schemas/spec_v3.schema.json (1)
6002-6052: Consider consolidating duplicate schema definitions using $ref.The
sourceIntegrationschema is defined identically in four locations. To improve maintainability and reduce the risk of inconsistencies during future updates, consider extracting this into a shared definition and referencing it using$ref.💡 Refactoring approach
- Add the schema once to a
definitionsor$defssection at the root level:"$defs": { "sourceIntegration": { "description": "Information about the source integration for this screenshot, enabling upload of changed files back to the source CMS. Set automatically during test resolution for files from integrations.", "$schema": "http://json-schema.org/draft-07/schema#", "title": "sourceIntegration", "type": "object", "additionalProperties": false, "required": ["type", "integrationName"], "properties": { "type": { "type": "string", "description": "The type of integration. Currently supported: 'heretto'. Additional types may be added in the future.", "enum": ["heretto"] }, "integrationName": { "type": "string", "description": "The name of the integration configuration in the config file. Used to look up authentication credentials." }, "fileId": { "type": "string", "description": "The unique identifier (UUID) of the file in the source CMS. If not provided, the file will be looked up by path." }, "filePath": { "type": "string", "description": "The path of the file in the source CMS. Used for lookup if fileId is not available." }, "contentPath": { "type": "string", "description": "The local path to the file that references this source. Used for resolving relative paths." } }, "examples": [...] } }
- Replace each inline definition with a reference:
"sourceIntegration": { "$ref": "#/$defs/sourceIntegration" }This reduces duplication and ensures all four locations stay synchronized automatically.
Also applies to: 6248-6298, 13579-13629, 13825-13875
common/src/schemas/output_schemas/report_v3.schema.json (1)
6021-6070:$schemakeyword should not appear in nested definitions.The
$schemakeyword (line 6023 and repeated in hunks 2-4) is meant to declare the meta-schema version at the root level of a schema document, not inside nested object definitions. Some validators may emit warnings or behave unexpectedly.If this file is generated by
dereferenceSchemas.js, consider updating the sourcesourceIntegration_v3.schema.jsonto omit$schema(since it will be bundled), or post-process the dereferenced output to remove$schemafrom non-root locations.🔎 Proposed fix (apply to all 4 occurrences)
"sourceIntegration": { "description": "Information about the source integration for this screenshot, enabling upload of changed files back to the source CMS. Set automatically during test resolution for files from integrations.", - "$schema": "http://json-schema.org/draft-07/schema#", "title": "sourceIntegration", "type": "object",common/src/schemas/output_schemas/config_v3.schema.json (1)
5588-5636: Consider using$refto deduplicate thesourceIntegrationschema.The
sourceIntegrationschema definition is duplicated identically across multiple sections in this file (lines 5588-5636, 5834-5882, 13691-13739, 13937-13985). This violates DRY and makes maintenance error-prone.Additionally, the nested
"$schema": "http://json-schema.org/draft-07/schema#"(line 5590) is non-standard—$schemashould only appear at the document root. Most validators will ignore it, but it may cause confusion.🔎 Suggested approach
- Define
sourceIntegrationonce underdefinitionsor$defs:"$defs": { "sourceIntegration": { "title": "sourceIntegration", "type": "object", "additionalProperties": false, "required": ["type", "integrationName"], "properties": { // ... properties here without nested $schema } } }
- Reference it wherever needed:
"sourceIntegration": { "description": "Information about the source integration...", "$ref": "#/$defs/sourceIntegration" }common/src/schemas/output_schemas/test_v3.schema.json (1)
5402-5451: Consider removing nested$schemain dereferenced output.The
$schemakeyword at line 5404 is non-standard when appearing inside a nested object definition rather than at the schema root. This appears to be an artifact from dereferencingsourceIntegration_v3.schema.json. While most validators ignore it, removing it from the source schema before dereferencing would produce cleaner output.The
sourceIntegrationstructure itself is well-designed with appropriate required fields and clear descriptions.resolver/src/heretto.js (7)
228-261: Consider adding a safety limit for pagination.The pagination loop relies on
totalPagesfrom the API response. If the API returns incorrect pagination metadata (e.g.,totalPageskeeps increasing), this could loop indefinitely.🔎 Proposed safety limit
async function getJobAssetDetails(client, fileId, jobId) { const allAssets = []; let page = 0; const pageSize = 100; + const maxPages = 1000; // Safety limit let hasMorePages = true; - while (hasMorePages) { + while (hasMorePages && page < maxPages) { const response = await client.get( `/files/${fileId}/publishes/${jobId}/assets`,
492-498: Metadata keys mixed with path data may cause collisions.Storing metadata keys like
_ditamapPath,_ditamapId, and_ditamapParentFolderIddirectly inpathToUuidMapcould conflict with actual file paths starting with underscores. Consider using a separate object structure.🔎 Proposed structure separation
+ // Store metadata separately from path mappings + pathToUuidMap._metadata = { + ditamapPath: relativePath, + ditamapId: ditamapId, + ditamapParentFolderId: ditamapParentFolder, + }; - // Store the ditamap info as reference points for creating new files - pathToUuidMap._ditamapPath = relativePath; - pathToUuidMap._ditamapId = ditamapId; - pathToUuidMap._ditamapParentFolderId = ditamapParentFolder;
515-515: Remove unusedparentPathparameter.The
parentPathparameter inextractDependenciesis defined but never used in the function body.🔎 Proposed fix
- const extractDependencies = (obj, parentPath = "") => { + const extractDependencies = (obj) => {
617-680: Side-effect mutation ofherettoConfig.The function mutates
herettoConfigby addingresourceDependenciesandfileMappingproperties. While this may be intentional for downstream usage, consider returning these values explicitly or documenting this behavior in the JSDoc.* @param {Object} config - Doc Detective config for logging -* @returns {Promise<string|null>} Path to extracted content or null on failure +* @returns {Promise<string|null>} Path to extracted content or null on failure. +* Side effect: When uploadOnChange is true, adds resourceDependencies and fileMapping to herettoConfig. */
784-786: Silent error swallowing hinders debugging.Errors from
readdirSync/statSyncare silently ignored. While skipping inaccessible directories is reasonable, consider logging at debug level for troubleshooting.🔎 Proposed enhancement
The function doesn't have access to
logandconfig. If debugging becomes necessary, consider passing these parameters or returning a structure that includes skipped paths.
841-841: Consider reusing the API client.A new API client is created on every call to
searchFileByName. If this function is called frequently during file resolution, consider accepting the client as a parameter or caching it.
907-936: Consider adding file size validation before upload.The upload uses
maxBodyLength: InfinityandmaxContentLength: Infinity, removing protections against oversized files. While this may be intentional for large images, consider adding a reasonable upper bound or at least logging the file size.🔎 Proposed enhancement
// Read file as binary const fileBuffer = fs.readFileSync(localFilePath); + + // Log file size for monitoring + const fileSizeKB = (fileBuffer.length / 1024).toFixed(2); + log(config, "debug", `File size: ${fileSizeKB} KB`); + + // Optional: Add size limit + const MAX_FILE_SIZE = 50 * 1024 * 1024; // 50 MB + if (fileBuffer.length > MAX_FILE_SIZE) { + log(config, "warning", `File exceeds size limit: ${localFilePath}`); + return { + status: "FAIL", + description: `File exceeds maximum size of ${MAX_FILE_SIZE / 1024 / 1024} MB`, + }; + }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (10)
common/dist/schemas/config_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/report_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/resolvedTests_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/screenshot_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/sourceIntegration_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/spec_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/step_v3.schema.jsonis excluded by!**/dist/**common/dist/schemas/test_v3.schema.jsonis excluded by!**/dist/**core/package-lock.jsonis excluded by!**/package-lock.jsonresolver/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (30)
common/src/schemas/dereferenceSchemas.jscommon/src/schemas/output_schemas/config_v3.schema.jsoncommon/src/schemas/output_schemas/report_v3.schema.jsoncommon/src/schemas/output_schemas/resolvedTests_v3.schema.jsoncommon/src/schemas/output_schemas/screenshot_v3.schema.jsoncommon/src/schemas/output_schemas/sourceIntegration_v3.schema.jsoncommon/src/schemas/output_schemas/spec_v3.schema.jsoncommon/src/schemas/output_schemas/step_v3.schema.jsoncommon/src/schemas/output_schemas/test_v3.schema.jsoncommon/src/schemas/schemas.jsoncommon/src/schemas/src_schemas/config_v3.schema.jsoncommon/src/schemas/src_schemas/screenshot_v3.schema.jsoncommon/src/schemas/src_schemas/sourceIntegration_v3.schema.jsoncore/dev/index.jscore/package.jsoncore/src/integrations/heretto.jscore/src/integrations/index.jscore/src/tests.jscore/src/tests/saveScreenshot.jscore/test/heretto-upload.test.jscore/test/screenshot.test.jscore/test/upload.test.jsdocs/docs/references/integrations/heretto-screenshot-sync.mdresolver/.github/workflows/integration-tests.ymlresolver/package.jsonresolver/src/config.jsresolver/src/heretto.integration.test.jsresolver/src/heretto.jsresolver/src/heretto.test.jsresolver/src/utils.js
🧰 Additional context used
🧬 Code graph analysis (8)
core/src/tests.js (1)
core/src/integrations/index.js (1)
require(10-10)
resolver/src/heretto.test.js (1)
resolver/src/heretto.js (2)
assets(303-303)completedJob(645-651)
core/test/upload.test.js (3)
core/src/integrations/heretto.js (1)
require(9-9)core/dev/index.js (2)
require(1-1)result(40-40)resolver/src/heretto.js (1)
require(7-7)
core/test/screenshot.test.js (5)
core/src/integrations/heretto.js (3)
fs(5-5)require(9-9)path(6-6)core/src/tests/saveScreenshot.js (5)
fs(5-5)require(1-1)require(2-2)require(3-3)path(4-4)resolver/src/heretto.js (3)
fs(2-2)require(7-7)path(3-3)core/test/upload.test.js (18)
require(2-6)assert(1-1)result(11-11)result(16-16)result(21-21)result(59-59)result(102-102)result(134-134)result(169-169)result(226-226)result(237-237)result(243-243)result(257-257)result(270-270)result(278-278)result(285-285)result(290-290)result(295-295)core/dev/index.js (2)
require(1-1)path(2-2)
core/src/integrations/index.js (1)
core/test/upload.test.js (1)
HerettoUploader(309-309)
core/test/heretto-upload.test.js (2)
core/src/integrations/heretto.js (3)
fs(5-5)require(9-9)path(6-6)core/test/upload.test.js (4)
require(2-6)HerettoUploader(309-309)mockLog(312-312)mockLog(431-431)
resolver/src/utils.js (2)
core/src/tests/saveScreenshot.js (2)
filePath(75-75)path(4-4)core/src/integrations/heretto.js (1)
path(6-6)
resolver/src/heretto.integration.test.js (1)
resolver/src/heretto.js (8)
client(597-597)client(841-841)client(908-908)job(290-290)job(636-640)completedJob(645-651)assets(303-303)outputPath(662-669)
🪛 LanguageTool
docs/docs/references/integrations/heretto-screenshot-sync.md
[style] ~17-~17: Opting for a less wordy alternative here may improve the clarity of your writing.
Context: ...ynchronization loop where running tests not only validates documentation accuracy but also updates it. ## Architecture The implementatio...
(NOT_ONLY_ALSO)
🪛 markdownlint-cli2 (0.18.1)
docs/docs/references/integrations/heretto-screenshot-sync.md
52-52: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
85-85: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
100-100: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (53)
core/package.json (1)
39-40: Local file references should be reverted before merging to main.These
file:../references work for local monorepo development but will break for consumers installing from npm, as the sibling directories won't exist. Before merging to main, these should be updated to published package versions or the monorepo should use a workspace manager (npm workspaces, pnpm, yarn workspaces) that handles this automatically.Is this intentional for development purposes only, or is there a plan to convert this to a proper workspace setup before release?
common/src/schemas/dereferenceSchemas.js (1)
47-47: LGTM!The new
sourceIntegration_v3.schema.jsonis correctly added to the v3 schema list, maintaining alphabetical ordering and ensuring it will be processed through the dereference and publish workflow.common/src/schemas/output_schemas/sourceIntegration_v3.schema.json (1)
1-50: LGTM!The schema is well-designed with appropriate required fields (
type,integrationName) for routing uploads and optional fields (fileId,filePath,contentPath) for flexible file resolution strategies. TheadditionalProperties: falseconstraint ensures strict validation.common/src/schemas/src_schemas/sourceIntegration_v3.schema.json (1)
1-45: LGTM!The source schema is well-defined with clear descriptions for each property. The examples demonstrate both use cases (with fileId vs. with contentPath for path resolution).
core/src/integrations/heretto.js (2)
557-564: LGTM!The
escapeXmlfunction correctly handles the five XML special characters that require escaping in content and attributes.
918-937: LGTM!The
getContentTypefunction provides sensible MIME type mappings for common image and document formats, with a safe fallback toapplication/octet-stream.core/src/integrations/index.js (4)
22-32: Return type clarification: returns instance, not class.Based on the relevant code snippet from
core/test/upload.test.js(line 308),getUploaderis called expecting an instance that has theuploadmethod. The current implementation correctly returns an instance from the registry. The implementation is correct.
39-68: LGTM!The
collectChangedFilesfunction correctly traverses the nested report structure (specs → tests → contexts → steps) and collects files that meet all criteria: has screenshot, is changed, and has sourceIntegration metadata.
79-191: LGTM!The
uploadChangedFilesfunction implements a robust best-effort upload pattern:
- Skips gracefully when no uploader or config is found
- Continues processing after individual failures
- Provides detailed results tracking for debugging
- Logs progress at appropriate levels
218-223: LGTM!The
registerUploaderfunction provides proper validation ensuring new uploaders implement the required interface before being added to the registry.core/dev/index.js (1)
34-34: LGTM!The
uploadOnChange: trueconfiguration enables the new Heretto upload feature for development testing, aligning with the broader PR changes.common/src/schemas/src_schemas/screenshot_v3.schema.json (1)
72-75: LGTM! Well-documented schema extension.The sourceIntegration property addition is clean and provides clear documentation about its automatic population during test resolution. The reference to the external schema is properly structured.
core/src/tests.js (1)
29-29: LGTM! Clean import.The import of
uploadChangedFilesis properly structured and aligns with the integration module pattern.resolver/src/heretto.test.js (2)
232-381: Excellent test coverage for asset handling!The new test suites for
getJobAssetDetailsandvalidateDitamapInAssetsare comprehensive and cover important edge cases:
- Pagination handling
- Empty/missing asset scenarios
- Path validation for ditamap files
- Filtering of invalid entries
This strengthens confidence in the asset retrieval workflow.
391-533: Well-structured polling and validation tests.The expanded
pollJobStatustests now properly validate:
- Asset presence after job completion
- Handling of missing ditamap files
- Error scenarios during asset fetch
- Integration between polling and asset validation
The use of fake timers to test polling behavior is a good pattern.
resolver/src/config.js (2)
91-91: LGTM! Enhanced DITA step detection.The new inline step pattern
<data name="step">...</data>expands step detection capabilities for DITA workflows, aligning with the enhanced source integration metadata tracking.
247-250: Improved screenshot detection patterns.The two additional regex patterns enhance screenshot matching in DITA content by supporting different attribute orderings of
outputclassandhref. This makes the detection more robust.resolver/package.json (2)
7-9: Good test script organization.Separating integration tests from unit tests is a best practice. The explicit
--ignorepattern ensures integration tests don't run in the default test suite, whiletest:integrationandtest:allprovide flexibility for different testing scenarios.
33-35: LGTM! Dependencies updated appropriately.
- The local file reference for
doc-detective-commonindicates a monorepo structure, which is appropriate for tightly coupled development.- The
fast-xml-parseraddition supports XML/REST processing for Heretto integration.resolver/src/heretto.integration.test.js (2)
1-52: Excellent integration test setup!The approach to CI-only integration tests is well-implemented:
- Clear documentation of required environment variables
- Conditional test execution with descriptive skip messages
- Proper guards to prevent accidental runs against production APIs
- Helpful console logging for developers to understand why tests are skipped
This pattern should be a model for other integration tests.
121-214: Comprehensive workflow coverage.The full publishing workflow tests cover all critical stages:
- Job triggering
- Status polling (with appropriate 6-minute timeout)
- Asset details retrieval
- Ditamap validation
- Output download and extraction
The timeout values are well-tuned for real API interactions.
resolver/src/utils.js (3)
37-50: LGTM! Clean utility function.The
findHerettoIntegrationfunction properly:
- Handles null/undefined config._herettoPathMapping
- Normalizes paths for comparison
- Uses startsWith for directory matching
- Returns null when no match found
256-287: Well-structured path mapping initialization.The initialization of
_herettoPathMappingduring qualification and its population when loading Heretto content creates a clean association between output paths and integration names for later sourceIntegration metadata attachment.
760-800: Thorough sourceIntegration metadata handling.The logic properly handles both simple string and object-based screenshot values, ensuring sourceIntegration metadata is attached consistently. The checks prevent unnecessary processing when Heretto integration isn't involved.
core/test/upload.test.js (4)
8-232: Excellent test coverage for collectChangedFiles!The test suite thoroughly covers:
- Null and empty input handling
- Changed vs unchanged file detection
- Missing sourceIntegration scenarios
- Multiple files across different tests
- Edge cases like missing screenshot property
This comprehensive coverage will help catch regressions early.
311-428: Thorough dependency resolution testing.The
resolveFromDependenciestests cover all matching strategies:
- Exact path matches
- Relative path handling
- Filename + folder matching
- Filename-only fallback
- Internal key filtering
- Windows path normalization
- No-match scenarios
This is exactly the kind of thorough testing complex path resolution logic needs.
430-524: Comprehensive parent folder resolution tests.The tests cover all critical scenarios:
- Sibling file lookups
- Direct path matching
- Ditamap parent folder fallback
- Preference ordering (sibling over fallback)
- Relative path normalization
Well done on covering these complex scenarios!
612-648: Good validation test coverage.The upload validation tests ensure proper error handling for missing or incomplete configuration, which is important for providing clear error messages to users.
core/src/tests/saveScreenshot.js (3)
24-26: LGTM! Clean initialization of the changed tracking flag.The
changedflag is properly initialized tofalseand will be set totrueonly when necessary (new screenshots or variation exceeded).
367-382: LGTM! Variation handling correctly tracks changes.The logic correctly differentiates between screenshots that exceed the variation threshold (marked as changed) and those within acceptable variation (not marked as changed), while consistently preserving the
screenshotPathandsourceIntegrationmetadata in both cases.
390-399: LGTM! New screenshot handling is correct.The fallback logic properly handles new screenshots by setting the output path, marking them as changed, and preserving the
sourceIntegrationmetadata. This ensures new screenshots are tracked for upload to Heretto CMS.resolver/.github/workflows/integration-tests.yml (4)
5-7: Consider removing theherettobranch filter before merging.The
herettobranch filter appears to be a development branch used for this PR. Once merged tomain, this filter may be unnecessary unless you intend to maintain theherettobranch long-term.
44-51: LGTM on the secrets handling and test execution.Good use of environment variables for credentials with appropriate fallback for
HERETTO_SCENARIO_NAME. The CI flag is correctly set.
63-119: Well-structured failure notification job.The notify-on-failure job correctly:
- Only triggers on scheduled runs when tests fail
- Checks for existing open issues before creating duplicates
- Adds comments to existing issues with run details
- Uses appropriate labels for tracking
36-38: No action needed — Node.js v24 is now available.Node.js v24.12.0 was released in December 2025 and is fully supported by
actions/setup-node@v4. The concern about v24 unavailability is no longer valid.Likely an incorrect or invalid review comment.
core/test/screenshot.test.js (4)
54-97: Test case for new screenshots is well-structured.The test correctly validates:
sourceIntegrationpreservation with all expected fieldschangedflag istruefor new screenshots- Uses appropriate assertions with descriptive messages
99-169: Good coverage for variation threshold scenarios.The test properly validates that
sourceIntegrationis preserved when variation exceeds threshold and the file is overwritten. The two-phase approach (initial screenshot, then variation test) is appropriate.
171-241: LGTM for within-variation test case.Good validation that
changedisfalsewhen the screenshot is within the variation threshold andsourceIntegrationis still preserved.
243-273: Correctly tests the negative case.Good boundary test ensuring
sourceIntegrationisundefinedwhen not provided in the input.common/src/schemas/src_schemas/config_v3.schema.json (1)
261-310: Well-designed schema additions for Heretto integration.The new properties are appropriately structured:
fileMappingandresourceDependenciesare correctly marked asreadOnlysince they're auto-populated during processinguploadOnChangedefaults tofalse, which is a safe opt-in approach- Property descriptions clearly explain their purpose
core/test/heretto-upload.test.js (2)
15-50: Good test setup with credential gating.The test properly:
- Sets an appropriate timeout for API calls (60s)
- Skips gracefully when credentials aren't available
- Provides clear instructions for running the tests
- Uses DEBUG_HERETTO flag for conditional logging
128-277: Comprehensive upload flow coverage.The tests cover the full lifecycle:
- Document creation in a known folder
- Content upload with PNG verification
- Document location and content verification
- Content update flow
The handling of both
createdandexistsInFoldercases is defensive and appropriate for integration tests that may run multiple times.common/src/schemas/output_schemas/report_v3.schema.json (1)
6031-6055: Well-structured schema with clear property definitions.The
sourceIntegrationschema is well-designed:
- Required fields (
type,integrationName) ensure minimum viable metadata- Optional fields (
fileId,filePath,contentPath) support flexible lookup strategiesadditionalProperties: falseenforces strict validation- Examples clearly demonstrate both ID-based and path-based resolution patterns
common/src/schemas/output_schemas/config_v3.schema.json (1)
8631-8678: Schema additions for Heretto integration look well-structured.The
fileMapping,uploadOnChange, andresourceDependenciesproperties are appropriately defined:
readOnly: truecorrectly marks auto-populated fieldsuploadOnChangedefaults tofalse, which is a safe default- Dynamic key mappings via
additionalPropertiesis the right pattern for path-to-metadata mappingsOne optional improvement: consider adding
requiredarrays to the nested object schemas if certain fields (e.g.,fileIdinfileMapping,uuidinresourceDependencies) are always expected to be present. This would help catch incomplete data during validation.common/src/schemas/output_schemas/resolvedTests_v3.schema.json (1)
8643-8692: LGTM! Well-structured Heretto configuration properties.The new properties
fileMapping,uploadOnChange, andresourceDependenciesare well-defined with clear descriptions and appropriate constraints. ThereadOnly: truemarkers correctly identify auto-populated fields, whileuploadOnChangeremains user-configurable.resolver/src/heretto.js (8)
58-76: LGTM!The REST API client factory follows the same pattern as the existing
createApiClientand properly configures the Accept header for XML responses.
269-273: LGTM!The validation logic correctly checks for
.ditamapfiles within the expected output directory.
293-332: LGTM!The post-completion validation ensures output integrity by verifying
.ditamappresence before returning success. The separation of asset validation errors from polling errors improves debuggability.
702-758: LGTM!The file mapping logic correctly parses DITA files and builds a mapping of local paths to metadata with appropriate error handling for parse failures.
797-823: LGTM!The recursive traversal correctly extracts image references. The
fast-xml-parserlibrary doesn't produce circular references, so stack overflow is not a concern for typical DITA documents.
978-1026: LGTM!The file ID resolution follows a sensible priority order (known ID → mapping → search) with caching to avoid repeated API calls.
404-426: Good security practice: zip-slip protection.The path traversal protection correctly validates that extracted entries stay within the target directory, preventing zip-slip vulnerabilities.
1028-1049: Verify iffindFilesWithExtensionsshould be exported.The AI summary mentions
findFilesWithExtensionsis "public via module export," but it's not included in the exports. If this is intentional (internal helper), no action needed. If it should be exported for testing or external use, add it to the exports.
… paths - Implement generateSpecId to derive safe spec IDs from absolute or relative file paths. - Update parseTests to use generateSpecId for generating spec IDs, reducing collisions from files with the same basename.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
resolver/src/heretto.js (1)
15-16: Consider exporting REST_API_PATH constant for consistency.The
REST_API_PATHconstant is defined but not exported in the module exports (lines 1028-1049), while other constants likePOLLING_INTERVAL_MSare exported. If external code (such as the HerettoUploader class or tests) needs to reference this path, it should be added to the exports for consistency.🔎 Proposed addition to exports
module.exports = { createAuthHeader, createApiClient, createRestApiClient, findScenario, triggerPublishingJob, getJobStatus, getJobAssetDetails, validateDitamapInAssets, pollJobStatus, downloadAndExtractOutput, loadHerettoContent, buildFileMapping, searchFileByName, uploadFile, resolveFileId, getResourceDependencies, // Export constants for testing POLLING_INTERVAL_MS, POLLING_TIMEOUT_MS, DEFAULT_SCENARIO_NAME, + REST_API_PATH, };
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
resolver/src/heretto.jsresolver/src/utils.js
🚧 Files skipped from review as they are similar to previous changes (1)
- resolver/src/utils.js
🔇 Additional comments (15)
resolver/src/heretto.js (15)
7-7: LGTM!The addition of
XMLParserfromfast-xml-parseris appropriate for parsing DITA XML content and resource dependencies.
58-76: LGTM!The
createRestApiClientfunction correctly configures an axios instance for Heretto's REST API with appropriate baseURL, authentication, and headers.
177-180: LGTM!The enhanced return value now includes both
scenarioIdandfileId, which properly supports the new resource dependency workflow.
220-273: LGTM!Both
getJobAssetDetailsandvalidateDitamapInAssetsare well-implemented:
- Pagination logic correctly handles multiple pages of assets
- Asset validation properly checks for
.ditamapfiles in the expected output directory
293-332: LGTM!The enhanced validation logic correctly verifies that a
.ditamapfile exists in the job output before considering the job successful. Error handling and logging are appropriate.
404-426: Excellent security improvement!The path traversal protection correctly prevents zip slip attacks by validating that each extracted file path remains within the intended output directory. The implementation properly handles both regular files and the output directory itself.
447-579: LGTM!The
getResourceDependenciesfunction is well-designed with:
- A two-phase approach that fetches ditamap info first for reliability
- Graceful fallback if the dependencies endpoint is unavailable
- Robust XML parsing and path extraction logic
- Comprehensive error handling and logging
598-598: LGTM!The
loadHerettoContentenhancements cleanly integrate resource dependency fetching and file mapping whenuploadOnChangeis enabled, storing the results appropriately in the config object for later use.Also applies to: 617-628, 671-680
693-758: LGTM!The
buildFileMappingfunction correctly:
- Recursively finds and parses DITA files to extract image references
- Resolves relative paths to absolute local paths for mapping
- Handles parse errors gracefully without failing the entire operation
760-823: LGTM!Both utility functions are well-implemented:
findFilesWithExtensionscorrectly traverses directories recursively with appropriate error handlingextractImageReferencesproperly extracts image hrefs from parsed DITA XML structure
863-870: The previous issue has been resolved.The no-op
.replace()call mentioned in the past review has been removed. The current implementation correctly posts to the/ezdnxtgen/api/searchendpoint with the appropriate baseURL override.
825-896: LGTM!The
searchFileByNamefunction correctly:
- Constructs an appropriate search request body
- Posts to the correct search API endpoint
- Finds exact filename matches from search results
- Handles errors gracefully
898-966: LGTM!The
uploadFilefunction is well-implemented with:
- Appropriate content-type detection for common image formats
- Correct use of the REST API client for binary uploads
- Good error handling and structured return values
968-1026: LGTM!The
resolveFileIdfunction implements an efficient three-tier resolution strategy:
- Uses known fileId from source integration metadata
- Checks the file mapping cache
- Falls back to searching by filename and caches the result
This approach minimizes API calls while ensuring files can be found.
1028-1049: LGTM!The module exports are comprehensive, including all new public functions and test constants. (Note: The optional
REST_API_PATHexport suggestion is covered in a separate comment.)
- Modify resolve function to return heretto: URIs unchanged. - Update comments to reflect new URI handling in path resolution.
…tails - Add screenshotPath to outputs when overwriting existing files - Set changed flag to true when a file is overwritten - Preserve sourceIntegration metadata in outputs if provided
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
core/src/tests/saveScreenshot.js (1)
299-301: Consider extracting metadata propagation to reduce repetition.The
sourceIntegrationpropagation pattern is repeated across four code paths. While clear, a small helper could consolidate this if more metadata fields are added in the future.🔎 Optional: Extract helper function
// Add near top of function or as module helper function setOutputMetadata(result, step, screenshotPath, changed) { result.outputs.screenshotPath = screenshotPath; result.outputs.changed = changed; if (step.screenshot.sourceIntegration) { result.outputs.sourceIntegration = step.screenshot.sourceIntegration; } } // Then use: setOutputMetadata(result, step, existFilePath, true);Also applies to: 376-378, 386-388, 402-404
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/tests/saveScreenshot.js
🧰 Additional context used
🧬 Code graph analysis (1)
core/src/tests/saveScreenshot.js (3)
resolver/src/utils.js (2)
result(120-120)fs(1-1)resolver/src/heretto.js (1)
fs(2-2)core/src/integrations/heretto.js (1)
fs(5-5)
🔇 Additional comments (5)
core/src/tests/saveScreenshot.js (5)
24-26: LGTM!Good initialization pattern for the
changedflag with a sensible default offalse.
296-301: LGTM!Correct handling for the explicit overwrite scenario: the screenshot path is set to the original location,
changedis marked true since the file was replaced, and source integration metadata is properly preserved.
373-378: LGTM!Correctly marks the screenshot as changed when variation exceeds the threshold. The path and metadata propagation are properly handled.
384-388: LGTM!Correctly leaves
changedasfalsewhen the screenshot is within the acceptable variation threshold, since no meaningful change occurred. The existing screenshot path and metadata are properly preserved.
396-405: LGTM!The final block correctly handles new screenshots by setting
changedtotrueand preserving metadata. The condition!result.outputs.screenshotPathcorrectly identifies scenarios where no existing file was processed.
Introduce a HerettoUploader class for automatic screenshot uploads to Heretto CMS, improving the upload process and error handling. Enhance screenshot saving logic to track changes and integrate metadata for uploads. Add integration tests for Heretto API interactions and job asset handling.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.