Add AST-based test detection for documentation formats #69
Draft
- Add AST parser adapters in resolver/src/ast/ directory:
  - parsers/markdown.js using unified + remark-parse
  - parsers/html.js using unified + rehype-parse
  - parsers/asciidoc.js using asciidoctor
  - parsers/rst.js using restructured
  - parsers/xml.js using @xmldom/xmldom
  - index.js exposing parseToAst(content, format)
- Add AST node matching engine in resolver/src/ast/matcher.js:
  - matchNodes(ast, astConfig) returns matched nodes with positions and extracted values
  - Supports attribute matching: exact string, regex, any-of, exists-check
  - Supports extract mapping for capture groups
- Add caching for parsed ASTs in resolver/src/ast/cache.js
- Extend markupDefinition schema in common/src/schemas/src_schemas/config_v3.schema.json:
  - Add ast property as astNodeMatch object
  - Add validation requiring at least one of ast or regex
- Modify markup processing in utils.js parseContent() function:
  - Add AST-based matching before regex patterns
  - Handle AST-only and AST+regex (AND) modes
- Add dependencies to resolver/package.json:
  - unified, remark-parse, rehype-parse, asciidoctor, restructured, @xmldom/xmldom
- Add tests for AST-based detection

Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
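The contents of cache.js are not shown on this page, so the following is only a minimal sketch of how a parsed-AST cache might work, assuming the key combines the file path with a hash of the content. The names `astCache`, `cacheKey`, `getOrParse`, and `stubParse` are hypothetical and used for illustration only.

```javascript
const crypto = require("node:crypto");

// Hypothetical in-memory cache. The key combines the file path with a hash of
// the content, so an edited file misses the cache and is parsed again.
const astCache = new Map();

function cacheKey(filePath, content) {
  const hash = crypto.createHash("sha256").update(content).digest("hex");
  return `${filePath}:${hash}`;
}

// `parse` stands in for a real parser such as parseToAst(content, format).
function getOrParse(filePath, content, parse) {
  const key = cacheKey(filePath, content);
  if (!astCache.has(key)) {
    astCache.set(key, parse(content));
  }
  return astCache.get(key);
}

// Usage with a stub parser that counts how often it is called.
let parseCalls = 0;
const stubParse = (text) => {
  parseCalls += 1;
  return { type: "root", length: text.length };
};

getOrParse("docs/a.md", "# Title", stubParse);
getOrParse("docs/a.md", "# Title", stubParse); // same path and content: cache hit
getOrParse("docs/a.md", "# Title v2", stubParse); // content changed: parsed again
console.log(parseCalls); // 2
```

Hashing the content rather than relying on the path alone means a stale AST is not returned after a file changes. The cost is one hash computation per lookup.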
- Fix format determination to check array length before accessing first element
- Improve regex pattern detection to require minimum length between delimiters

Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
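The two fixes in this commit can be illustrated roughly as follows. The diff is not shown on this page, so the helper names `isRegexPattern` and `firstExtension` are hypothetical, and the three-character minimum is an assumption about what "minimum length between delimiters" means.

```javascript
// Hypothetical helper: treat a string as a regex pattern only if it is wrapped
// in slashes AND has at least one character between them, so "/" and "//" are
// handled as plain strings rather than empty patterns.
function isRegexPattern(value) {
  return (
    typeof value === "string" &&
    value.length >= 3 &&
    value.startsWith("/") &&
    value.endsWith("/")
  );
}

// Hypothetical helper: check that the array exists and is non-empty before
// reading its first element.
function firstExtension(extensions, fallback = null) {
  return Array.isArray(extensions) && extensions.length > 0
    ? extensions[0]
    : fallback;
}

console.log(isRegexPattern("/^(ba)?sh$/")); // true
console.log(isRegexPattern("//")); // false
console.log(firstExtension([])); // null
```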
- Modify dereferenceSchemas.js to use bundle instead of dereference for config_v3
- Add WeakSet-based cycle detection to deleteDollarIds function
- This preserves $refs for recursive schemas like astNodeMatch.children

Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
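As a rough illustration of WeakSet-based cycle detection in an in-place traversal, the sketch below strips `$id` keys from a schema object that references itself. The real deleteDollarIds body is not shown on this page, so `deleteDollarIdsSketch` is a hypothetical stand-in and its exact behavior is an assumption.

```javascript
// Sketch: strip every "$id" key in place. The WeakSet records objects that
// were already visited, so a schema that references itself is walked once
// instead of recursing forever.
function deleteDollarIdsSketch(node, seen = new WeakSet()) {
  if (node === null || typeof node !== "object" || seen.has(node)) return node;
  seen.add(node);
  if (!Array.isArray(node)) delete node.$id;
  for (const child of Object.values(node)) {
    deleteDollarIdsSketch(child, seen);
  }
  return node;
}

const schema = { $id: "root", properties: { child: { $id: "child" } } };
schema.properties.child.self = schema; // cycle back to the root

deleteDollarIdsSketch(schema);
console.log("$id" in schema, "$id" in schema.properties.child); // false false
```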
- Use full dereferencing and handle circular refs with custom clone function
- The breakCircularRefs function preserves schema structure while breaking cycles
- Both config_v3 and resolvedTests_v3 schemas are now handled correctly
- All 306 common tests and 43 resolver tests pass

Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
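One way a clone function can break cycles while keeping the rest of the structure is to track the objects on the current path and substitute a placeholder when one of them is reached again. The sketch below assumes that approach. The actual breakCircularRefs implementation is not shown on this page, and the placeholder object is purely illustrative.

```javascript
// Sketch: deep-clone a value, replacing any reference back to an ancestor
// (a true cycle) with a placeholder. Objects that are shared but not cyclic
// are cloned normally, because the ancestor set is unwound on the way out.
function breakCircularRefsSketch(value, ancestors = new WeakSet()) {
  if (value === null || typeof value !== "object") return value;
  if (ancestors.has(value)) {
    // Placeholder content is an assumption made for this example.
    return { description: "circular reference removed" };
  }
  ancestors.add(value);
  const clone = Array.isArray(value)
    ? value.map((item) => breakCircularRefsSketch(item, ancestors))
    : Object.fromEntries(
        Object.entries(value).map(([key, child]) => [
          key,
          breakCircularRefsSketch(child, ancestors),
        ])
      );
  ancestors.delete(value);
  return clone;
}

// A fully dereferenced recursive schema: children.items points back at the
// node schema itself, which JSON.stringify cannot serialize.
const astNodeMatch = {
  type: "object",
  properties: { nodeType: { type: "string" } },
};
astNodeMatch.properties.children = { type: "array", items: astNodeMatch };

const safe = breakCircularRefsSketch(astNodeMatch);
console.log(JSON.stringify(safe).length > 0); // true, serializes without throwing
```

The trade-off in this sketch is that the recursive part of the schema is only preserved one level deep in the clone. Whether the real function keeps more of the recursion is not stated in the commit message.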
Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Extend markupDefinition schema to add AST node matching" to "Add AST-based test detection for documentation formats" on Dec 5, 2025
Extends the `markupDefinition` schema to support AST node matching alongside regex patterns. When both `ast` and `regex` are specified, AST identifies candidate nodes first, then regex filters matched content (AND operation). At least one of `ast` or `regex` must be specified.

### Schema Changes

- Added `ast` property to `markupDefinition` with `astNodeMatch` type
- `astNodeMatch` supports: `nodeType`, `attributes`, `content`, `children` (recursive), `extract` (capture group mapping)
- Attribute matching: exact string, regex (`/pattern/`), any-of (array), exists-check (boolean)

### AST Parsers (`resolver/src/ast/`)

- `markdown.js`: unified + remark-parse → MDAST
- `html.js`: unified + rehype-parse → HAST
- `asciidoc.js`: asciidoctor → Asciidoctor AST
- `rst.js`: restructured → RST tree
- `xml.js`: @xmldom/xmldom → DOM-like AST for DITA
- `cache.js`: AST cache by file path + content hash
- `matcher.js`: `matchNodes(ast, astConfig)` returns matched nodes with positions and extracted values

### Integration

- Modified `parseContent()` in utils.js to handle AST-based matching before regex
- Schema dereferencing handles the recursive `astNodeMatch.children` self-reference

### Example

    {
      "name": "bashCodeBlock",
      "ast": {
        "nodeType": "code",
        "attributes": { "lang": ["bash", "sh"] },
        "extract": { "$1": "lang", "$2": "value" }
      },
      "actions": [{ "runShell": { "command": "$2" } }]
    }

Combined AST + regex (filters bash blocks containing `# IMPORTANT:`):

    {
      "ast": { "nodeType": "code", "attributes": { "lang": "bash" } },
      "regex": ["# IMPORTANT:([\\s\\S]*)"],
      "actions": [{ "runShell": { "command": "$1" } }]
    }
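To make the matching semantics in the description concrete, here is a minimal sketch of how the four attribute matchers, the `extract` mapping, and the AST + regex AND mode could fit together. It is not the code in matcher.js: `matchesAttribute`, `matchNodesSketch`, and `applyRegexFilter` are hypothetical names, the sample tree is hand-built rather than produced by remark-parse, and attributes are assumed to be direct fields on the node (as `lang` and `value` are on MDAST code nodes).

```javascript
// One attribute check, covering the four modes listed in the description.
function matchesAttribute(actual, expected) {
  if (typeof expected === "boolean") {
    // exists-check: true requires the attribute, false requires its absence
    return (actual !== undefined && actual !== null) === expected;
  }
  if (Array.isArray(expected)) {
    // any-of: at least one option must match
    return expected.some((option) => matchesAttribute(actual, option));
  }
  const isRegex =
    typeof expected === "string" &&
    expected.length >= 3 &&
    expected.startsWith("/") &&
    expected.endsWith("/");
  if (isRegex) {
    return (
      typeof actual === "string" &&
      new RegExp(expected.slice(1, -1)).test(actual)
    );
  }
  return actual === expected; // exact string
}

// Walk the tree depth-first and collect nodes that satisfy astConfig.
function matchNodesSketch(ast, astConfig) {
  const matches = [];
  const visit = (node) => {
    const isMatch =
      node.type === astConfig.nodeType &&
      Object.entries(astConfig.attributes ?? {}).every(([name, expected]) =>
        matchesAttribute(node[name], expected)
      );
    if (isMatch) {
      // extract maps a placeholder such as "$1" to a field on the node
      const extracted = {};
      for (const [placeholder, field] of Object.entries(astConfig.extract ?? {})) {
        extracted[placeholder] = node[field];
      }
      matches.push({ node, position: node.position, extracted });
    }
    (node.children ?? []).forEach(visit);
  };
  visit(ast);
  return matches;
}

// AND mode: keep only AST matches whose text also matches one of the patterns.
function applyRegexFilter(matches, patterns) {
  return matches.flatMap((match) => {
    for (const pattern of patterns) {
      const result = new RegExp(pattern).exec(match.node.value ?? "");
      if (result) return [{ ...match, captures: result.slice(1) }];
    }
    return [];
  });
}

// Hand-built tree shaped like MDAST output (positions omitted for brevity).
const ast = {
  type: "root",
  children: [
    { type: "code", lang: "bash", value: "echo hello" },
    { type: "code", lang: "python", value: "print('hi')" },
    { type: "code", lang: "sh", value: "# IMPORTANT: run me\nls" },
  ],
};

const found = matchNodesSketch(ast, {
  nodeType: "code",
  attributes: { lang: ["bash", "sh"] },
  extract: { $1: "lang", $2: "value" },
});
const filtered = applyRegexFilter(found, ["# IMPORTANT:([\\s\\S]*)"]);

console.log(found.length); // 2
console.log(filtered.length); // 1
```

In this sketch the AST pass narrows the document to shell code blocks, and the regex pass then keeps only the block containing the marker comment, which mirrors the AND behavior described above.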
Created from VS Code.