-
Notifications
You must be signed in to change notification settings - Fork 38
sink: support batch dml with virtual column #3787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Skipping CI for Draft Pull Request. |
Summary of ChangesHello @wk989898, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces crucial enhancements to support batch DML operations on tables containing virtual columns. The changes primarily involve adjusting the internal logic that determines when batch DML can be utilized, specifically by removing the restriction imposed by virtual columns. Furthermore, the handling of generated columns during the construction of SQL statements has been refined to correctly exclude them, ensuring compatibility and preventing errors, thereby enabling more efficient data synchronization. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
/test pull-cdc-mysql-integration-heavy |
|
/test pull-cdc-mysql-integration-light |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request aims to add support for batch DML with virtual columns. The main changes involve removing the restriction on batching for tables with virtual columns, and ensuring that generated columns are skipped when preparing DML arguments. My review has found a couple of critical issues in pkg/sink/sqlmodel/row_change.go that could lead to panics. There's an inconsistency in how generated columns are handled between ColumnCount and getArgsWithGeneratedColumn, and a bug in whereColumnsAndValues where column names are filtered but their corresponding values are not. I've also pointed out an outdated comment and a misleading test case name that should be addressed for clarity. Overall, the direction is good, but the implementation needs to be more careful about consistency and correctness when handling generated columns.
pkg/sink/sqlmodel/row_change.go
Outdated
| c := 0 | ||
| for _, col := range r.sourceTableInfo.GetColumns() { | ||
| if !col.Hidden { | ||
| if !col.Hidden && !col.IsVirtualGenerated() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method IsVirtualGenerated does not seem to exist on *model.ColumnInfo. This will likely cause a compilation error.
Also, there's a potential inconsistency. This function uses IsVirtualGenerated (assuming it checks for virtual generated columns), but getArgsWithGeneratedColumn in pkg/sink/mysql/sql_builder.go uses IsGenerated(), which checks for both virtual and stored generated columns. This can lead to a mismatch between the number of columns counted here and the number of arguments prepared, causing a panic in NewRowChange if the table has stored generated columns.
To handle all generated columns consistently, you should probably use IsGenerated() here.
| if !col.Hidden && !col.IsVirtualGenerated() { | |
| if !col.Hidden && !col.IsGenerated() { |
pkg/sink/sqlmodel/row_change.go
Outdated
| if !column.IsGenerated() { | ||
| columnNames = append(columnNames, column.Name.O) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a bug here. You are filtering out generated columns when building columnNames, but the corresponding values in the values slice are not filtered. If columns contains any generated columns (for example, if a unique index used for the WHERE clause contains a generated column), len(columnNames) will be different from len(values). This will trigger the panic you've added on line 248. You should filter values in sync with columnNames.
|
/test all |
|
/retest |
| @@ -230,8 +230,12 @@ func (r *RowChange) whereColumnsAndValues() ([]string, []interface{}) { | |||
| } | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function seems only for Update statment, how about insert statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The insert statement has filtered the generated column by generatedColumnsNameSet
|
Do we have enough tests with virtual column scenarios? Do we need to add some related integration tests to ensure correctness? |
Signed-off-by: wk989898 <nhsmwk@gmail.com>
There exists a generated column test. https://github.com/pingcap/ticdc/blob/master/tests/integration_tests/generate_column/data/virtual.sql |
pkg/sink/mysql/mysql_writer_dml.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change the comment here
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hongyunyan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughPer-table batch-SQL gating now uses TableInfo and inspects handle-key columns; virtual/generated columns are excluded from WHERE/index construction. Nil handling for column values was tightened. Tests and integration fixtures for generated columns were updated. Changes
Sequence Diagram(s)sequenceDiagram
participant Writer as Writer (w)
participant TableInfo as TableInfo
participant SQLBuilder as SQLBuilder
participant DB as Database
Writer->>TableInfo: Inspect columns & flags (isVirtual, isNotNull, has PK/UK)
TableInfo-->>Writer: Return column metadata
alt Not eligible for batch (no PK/UK, all handle-keys virtual, single-event, or safe-mode mismatch)
Writer->>SQLBuilder: generateNormalSQLs(events)
else Eligible for batch
Writer->>SQLBuilder: generateBatchSQL(events)
end
SQLBuilder->>DB: Execute SQL(s)
DB-->>Writer: Execution result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/sink/mysql/mysql_writer_dml_test.go (2)
34-37:⚠️ Potential issue | 🔴 CriticalTest struct fields
hasPKandhasVirtualColsare stale after the signature change.
hasVirtualColsis declared (line 37) but never used in the test invocations.hasPKis used incorrectly as noted above. Both fields should be replaced with atableInfo *common.TableInfofield that the test constructs per case.
553-560:⚠️ Potential issue | 🟡 Minor
newDMLEventcreates events with nilTableInfo, which will cause nil-pointer dereference in the updatedshouldGenBatchSQL.The production code at line 139 accesses
tableInfo.HasPKOrNotNullUKand at line 147 callstableInfo.GetOrderedHandleKeyColumnIDs(). SincenewDMLEventdoesn't setTableInfo, passing these events toshouldGenBatchSQL(even with a correct*common.TableInfofirst arg) would work — but the mock function should be updated ifTestShouldGenBatchSQLis fixed to pass*common.TableInfo.
🤖 Fix all issues with AI agents
In `@pkg/sink/mysql/mysql_writer_dml_test.go`:
- Around line 108-113: The test is passing a bool to shouldGenBatchSQL but the
function expects a *common.TableInfo; update the test table-driven cases to
build a *common.TableInfo for each tt (use the test's tt.hasPK to set
TableInfo.HasPKOrNotNullUK accordingly) and pass that pointer as the first arg
to shouldGenBatchSQL while leaving tt.events as the second arg; ensure the
constructed TableInfo is created inside the t.Run so parallel tests get their
own instance.
In `@pkg/sink/mysql/mysql_writer_dml.go`:
- Around line 146-159: The code sets allVirtualGenerated to true and checks
columns from tableInfo.GetOrderedHandleKeyColumnIDs(), but if that call returns
an empty slice the variable remains true and wrongly disables batch SQL; modify
the logic in mysql_writer_dml.go to explicitly handle an empty colIDs result:
after calling GetOrderedHandleKeyColumnIDs(), if len(colIDs) == 0 then treat as
"not all virtual-generated" (i.e., do not flip allVirtualGenerated to true) or
return early based on the intended semantics; update the block around
allVirtualGenerated, colIDs, and the loop over tableInfo.GetColumnInfo(colID) so
empty slices don't cause batch SQL to be disabled.
- Around line 139-160: The loop over tableInfo.GetOrderedHandleKeyColumnIDs()
currently silently continues when tableInfo.GetColumnInfo(colID) returns !exist,
which can mask bugs; change the behavior so that when a handle-key column is
missing you immediately return false (and optionally emit a warning including
the missing colID and table identifier) instead of continuing. Update the loop
in the block that sets allVirtualGenerated to check existence: if !exist ->
return false (or log + return false), otherwise proceed to check
info.IsVirtualGenerated(); keep the existing allVirtualGenerated logic unchanged
for present columns.
In `@pkg/sink/mysql/sql_builder.go`:
- Around line 283-291: The fallback loop over tableInfo.GetColumns() (in the
block that builds colNames/args) calls col.IsVirtualGenerated() without ensuring
col is non-nil, causing a potential nil-pointer panic; update the loop that
iterates tableInfo.GetColumns() to skip nil entries (e.g., if col == nil {
continue }) before calling col.IsVirtualGenerated(), and ensure
ExtractColVal(row, col, i) and col.Name.O are only used after that nil check so
col is always valid when referenced.
🧹 Nitpick comments (3)
pkg/sink/sqlmodel/row_change.go (1)
233-254: Double-filtering of virtual columns whenuniqueIndex != nil.When
uniqueIndexis not nil,getColsAndValuesOfIdx(inutils.go) already filters out virtual generated columns. The subsequent filter at lines 235-238 is therefore redundant for that code path — it only has effect whenuniqueIndex == niland the full column set is used. This works correctly but is worth noting for clarity.The consistency check at lines 250-253 is always true by construction since both slices are built in lockstep, but it's a reasonable defensive guard.
pkg/sink/mysql/mysql_writer_dml.go (2)
128-134: Doc comment is missing condition 3 (all-virtual-generated handle key).The numbered list in the doc comment now only mentions conditions 1–4 but doesn't describe the new "all handle-key columns are virtual generated → no batch" rule. This is the key behavioral change of the PR and should be documented.
141-141: Unnecessaryelseafter early return.The
elseblock afterreturn falseon line 140 is redundant. The code inside can be de-nested for readability.Suggested improvement
if !tableInfo.HasPKOrNotNullUK { return false - } else { - // if the table has pk or uk, - // but all the columns in the pk or uk are virtual generated columns, - // we also don't generate batch sql, - // because the virtual generated column can't be used to detect the row identity. - allVirtualGenerated := true - colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() - for _, colID := range colIDs { - info, exist := tableInfo.GetColumnInfo(colID) - if !exist { - continue - } - if !info.IsVirtualGenerated() { - allVirtualGenerated = false - } - } - if allVirtualGenerated { - return false - } } + + // if the table has pk or uk, + // but all the columns in the pk or uk are virtual generated columns, + // we also don't generate batch sql, + // because the virtual generated column can't be used to detect the row identity. + allVirtualGenerated := true + colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() + for _, colID := range colIDs { + info, exist := tableInfo.GetColumnInfo(colID) + if !exist { + continue + } + if !info.IsVirtualGenerated() { + allVirtualGenerated = false + break + } + } + if allVirtualGenerated { + return false + }Note: also added
breakafter settingallVirtualGenerated = false— once we find a non-virtual column, there's no need to continue iterating.
| if !tableInfo.HasPKOrNotNullUK { | ||
| return false | ||
| } else { | ||
| // if the table has pk or uk, | ||
| // but all the columns in the pk or uk are virtual generated columns, | ||
| // we also don't generate batch sql, | ||
| // because the virtual generated column can't be used to detect the row identity. | ||
| allVirtualGenerated := true | ||
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | ||
| for _, colID := range colIDs { | ||
| info, exist := tableInfo.GetColumnInfo(colID) | ||
| if !exist { | ||
| continue | ||
| } | ||
| if !info.IsVirtualGenerated() { | ||
| allVirtualGenerated = false | ||
| } | ||
| } | ||
| if allVirtualGenerated { | ||
| return false | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent continue on missing column info may mask bugs or produce incorrect results.
On Line 151, when GetColumnInfo(colID) returns !exist, the loop continues without updating allVirtualGenerated. This treats a missing column the same as a virtual generated column — if all handle-key columns are missing from column info, batch SQL would be incorrectly disabled. More importantly, a missing handle-key column likely indicates a bug elsewhere, and silently skipping it could mask the root cause.
Consider logging a warning or returning false directly when a handle-key column is not found:
Suggested improvement
for _, colID := range colIDs {
info, exist := tableInfo.GetColumnInfo(colID)
if !exist {
- continue
+ log.Warn("handle key column not found in table info",
+ zap.Int64("colID", colID))
+ continue
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if !tableInfo.HasPKOrNotNullUK { | |
| return false | |
| } else { | |
| // if the table has pk or uk, | |
| // but all the columns in the pk or uk are virtual generated columns, | |
| // we also don't generate batch sql, | |
| // because the virtual generated column can't be used to detect the row identity. | |
| allVirtualGenerated := true | |
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | |
| for _, colID := range colIDs { | |
| info, exist := tableInfo.GetColumnInfo(colID) | |
| if !exist { | |
| continue | |
| } | |
| if !info.IsVirtualGenerated() { | |
| allVirtualGenerated = false | |
| } | |
| } | |
| if allVirtualGenerated { | |
| return false | |
| } | |
| } | |
| if !tableInfo.HasPKOrNotNullUK { | |
| return false | |
| } else { | |
| // if the table has pk or uk, | |
| // but all the columns in the pk or uk are virtual generated columns, | |
| // we also don't generate batch sql, | |
| // because the virtual generated column can't be used to detect the row identity. | |
| allVirtualGenerated := true | |
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | |
| for _, colID := range colIDs { | |
| info, exist := tableInfo.GetColumnInfo(colID) | |
| if !exist { | |
| log.Warn("handle key column not found in table info", | |
| zap.Int64("colID", colID)) | |
| continue | |
| } | |
| if !info.IsVirtualGenerated() { | |
| allVirtualGenerated = false | |
| } | |
| } | |
| if allVirtualGenerated { | |
| return false | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@pkg/sink/mysql/mysql_writer_dml.go` around lines 139 - 160, The loop over
tableInfo.GetOrderedHandleKeyColumnIDs() currently silently continues when
tableInfo.GetColumnInfo(colID) returns !exist, which can mask bugs; change the
behavior so that when a handle-key column is missing you immediately return
false (and optionally emit a warning including the missing colID and table
identifier) instead of continuing. Update the loop in the block that sets
allVirtualGenerated to check existence: if !exist -> return false (or log +
return false), otherwise proceed to check info.IsVirtualGenerated(); keep the
existing allVirtualGenerated logic unchanged for present columns.
pkg/sink/mysql/mysql_writer_dml.go
Outdated
| allVirtualGenerated := true | ||
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | ||
| for _, colID := range colIDs { | ||
| info, exist := tableInfo.GetColumnInfo(colID) | ||
| if !exist { | ||
| continue | ||
| } | ||
| if !info.IsVirtualGenerated() { | ||
| allVirtualGenerated = false | ||
| } | ||
| } | ||
| if allVirtualGenerated { | ||
| return false | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edge case: empty colIDs silently disables batch SQL.
If GetOrderedHandleKeyColumnIDs() returns an empty slice (even though HasPKOrNotNullUK is true), allVirtualGenerated stays true and batch SQL is incorrectly disabled. An explicit guard would make the intent clearer:
Suggested improvement
allVirtualGenerated := true
colIDs := tableInfo.GetOrderedHandleKeyColumnIDs()
+ if len(colIDs) == 0 {
+ return false
+ }
for _, colID := range colIDs {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| allVirtualGenerated := true | |
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | |
| for _, colID := range colIDs { | |
| info, exist := tableInfo.GetColumnInfo(colID) | |
| if !exist { | |
| continue | |
| } | |
| if !info.IsVirtualGenerated() { | |
| allVirtualGenerated = false | |
| } | |
| } | |
| if allVirtualGenerated { | |
| return false | |
| } | |
| allVirtualGenerated := true | |
| colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() | |
| if len(colIDs) == 0 { | |
| return false | |
| } | |
| for _, colID := range colIDs { | |
| info, exist := tableInfo.GetColumnInfo(colID) | |
| if !exist { | |
| continue | |
| } | |
| if !info.IsVirtualGenerated() { | |
| allVirtualGenerated = false | |
| } | |
| } | |
| if allVirtualGenerated { | |
| return false | |
| } |
🤖 Prompt for AI Agents
In `@pkg/sink/mysql/mysql_writer_dml.go` around lines 146 - 159, The code sets
allVirtualGenerated to true and checks columns from
tableInfo.GetOrderedHandleKeyColumnIDs(), but if that call returns an empty
slice the variable remains true and wrongly disables batch SQL; modify the logic
in mysql_writer_dml.go to explicitly handle an empty colIDs result: after
calling GetOrderedHandleKeyColumnIDs(), if len(colIDs) == 0 then treat as "not
all virtual-generated" (i.e., do not flip allVirtualGenerated to true) or return
early based on the intended semantics; update the block around
allVirtualGenerated, colIDs, and the loop over tableInfo.GetColumnInfo(colID) so
empty slices don't cause batch SQL to be disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/sink/mysql/mysql_writer_dml.go (1)
115-120:⚠️ Potential issue | 🔴 CriticalBuild failure:
genActiveActiveSQLstill callsshouldGenBatchSQLwith the old 3-argument signature.The pipeline confirms this:
too many arguments in call to w.shouldGenBatchSQL. Line 116 must be updated to match the new signature.🐛 Proposed fix
func (w *Writer) genActiveActiveSQL(tableInfo *common.TableInfo, eventsInGroup []*commonEvent.DMLEvent) ([]string, [][]interface{}) { - if !w.shouldGenBatchSQL(tableInfo.HasPKOrNotNullUK, tableInfo.HasVirtualColumns(), eventsInGroup) { + if !w.shouldGenBatchSQL(tableInfo, eventsInGroup) { return w.generateActiveActiveNormalSQLs(eventsInGroup) } return w.generateActiveActiveBatchSQL(eventsInGroup) }
🧹 Nitpick comments (1)
pkg/sink/mysql/mysql_writer_dml.go (1)
133-154: Unnecessaryelseafter early return — simplify control flow.The
elseblock on line 135 is redundant since theifbranch on line 133 returns early. Removing theelsereduces nesting and improves readability.♻️ Suggested simplification
if !tableInfo.HasPKOrNotNullUK { return false - } else { - // if the table has pk or uk, - // but all the columns in the pk or uk are virtual generated columns, - // we also don't generate batch sql, - // because the virtual generated column can't be used to detect the row identity. - allVirtualGenerated := true - colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() - for _, colID := range colIDs { - info, exist := tableInfo.GetColumnInfo(colID) - if !exist { - continue - } - if !info.IsVirtualGenerated() { - allVirtualGenerated = false - } - } - if allVirtualGenerated { - return false + } + // if all handle-key columns are virtual generated, + // we can't use them for row identity in batch SQL. + allVirtualGenerated := true + colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() + for _, colID := range colIDs { + info, exist := tableInfo.GetColumnInfo(colID) + if !exist { + continue + } + if !info.IsVirtualGenerated() { + allVirtualGenerated = false + break } } + if allVirtualGenerated { + return false + }Note: also added
breakonce a non-virtual column is found — no need to continue iterating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/sink/mysql/mysql_writer_dml.go (1)
60-69:⚠️ Potential issue | 🟡 MinorStale comment: still references "have virtual column" as a batching disqualifier.
Line 66 says "if the table doesn't have a handle key or have virtual column, we just generate the sqls for each event row". After this PR, virtual columns no longer unconditionally prevent batch SQL. The comment should be updated to reflect the new behavior (only all-virtual-generated handle keys disable batch).
📝 Suggested comment update
-// if the table does't have a handle key or have virtual column, we just generate the sqls for each event row. +// if the table doesn't have a handle key, or all handle key columns are virtual generated, we just generate the sqls for each event row.
🤖 Fix all issues with AI agents
In `@pkg/sink/mysql/mysql_writer_dml_test.go`:
- Around line 72-163: Add unit tests for the virtual generated-column logic in
shouldGenBatchSQL: extend or add variants of buildTableInfo to create columns
with GeneratedExprString set and GeneratedStored=false (virtual generated
columns), then add two test cases in mysql_writer_dml_test.go that exercise
shouldGenBatchSQL — (1) table with a non-virtual PK plus a virtual generated
column expecting want=true, and (2) table whose handle-key columns are all
virtual generated expecting want=false; ensure tests use the existing
newDMLEvent and adjustWriter patterns so the tests hit the code path in
mysql_writer_dml.go that checks GeneratedExprString/GeneratedStored.
🧹 Nitpick comments (2)
pkg/sink/mysql/mysql_writer_dml.go (1)
133-154: Unnecessaryelseafter early return; also has stale comment on Line 155.The
elseblock afterreturn falseon Line 134 adds unnecessary nesting. The logic can be flattened. Also, Line 155 contains a leftover comment (// if tableInfo.HasVirtualColumns()) that appears to be a remnant of the old implementation — it should be removed.♻️ Flatten and clean up
if !tableInfo.HasPKOrNotNullUK { return false - } else { - // if the table has pk or uk, - // but all the columns in the pk or uk are virtual generated columns, - // we also don't generate batch sql, - // because the virtual generated column can't be used to detect the row identity. - allVirtualGenerated := true - colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() - for _, colID := range colIDs { - info, exist := tableInfo.GetColumnInfo(colID) - if !exist { - continue - } - if !info.IsVirtualGenerated() { - allVirtualGenerated = false - } - } - if allVirtualGenerated { - return false + } + + // If all handle-key columns are virtual generated, + // they cannot identify a row, so skip batch SQL. + allVirtualGenerated := true + colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() + for _, colID := range colIDs { + info, exist := tableInfo.GetColumnInfo(colID) + if !exist { + continue + } + if !info.IsVirtualGenerated() { + allVirtualGenerated = false + break } } - // if tableInfo.HasVirtualColumns() + if allVirtualGenerated { + return false + } + if len(events) == 1 && events[0].Len() == 1 {Note: also added an early
breakonce a non-virtual column is found, since there's no need to keep iterating.pkg/sink/mysql/mysql_writer_dml_test.go (1)
36-67:buildTableInfois a useful helper, but could be extended to cover virtual column scenarios.The helper currently only builds tables with a plain
id(int) andname(varchar). For testing the virtual-generated-column gating logic, consider adding a parameter (or a sibling helper) that can create a column withGeneratedExprStringandGeneratedStored = falseto represent a virtual generated column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@pkg/sink/mysql/mysql_writer_dml_test.go`:
- Around line 226-235: Add a new unit test variant that mirrors the existing
"handle key contains virtual generated column should not use batch SQL" case but
uses a table schema whose primary key or not-null unique key is composed
entirely of virtual generated columns (so HasPKOrNotNullUK returns true and
allVirtualGenerated is exercised); create a new tableInfo (e.g.,
virtualOnlyHandleKeyTableInfo) with only virtual generated columns in the
PK/unique key, reuse newDMLEvent for events and the same adjustWriter to set
SafeMode=false and BatchDMLEnable=true, and assert want:false so the code path
in Writer that checks allVirtualGenerated (the branch that returns false for
batch SQL) is covered.
- Around line 72-133: The test currently misasserts shouldGenBatchSQL for
virtualHandleKeyTableInfo: update the existing expectation to want:true (the
current table has a NOT NULL normal column 'a' so batch SQL should be allowed),
and add a new test case that uses a builder (e.g., extend or add
buildTableInfoWithAllVirtualHandleKey) that makes all handle-index columns (both
'a' and 'b') virtual/generated (set GeneratedExprString and GeneratedStored
appropriately) and assert want:false to exercise the allVirtualGenerated code
path in shouldGenBatchSQL.
🧹 Nitpick comments (2)
pkg/sink/mysql/mysql_writer_dml.go (2)
136-155: Flatten the unnecessaryelseafter early return.Since
!tableInfo.HasPKOrNotNullUKalready returns, theelseblock at Line 136 can be un-indented. This improves readability and aligns with idiomatic Go style (avoidelseafterreturn).Suggested refactor
if !tableInfo.HasPKOrNotNullUK { return false - } else { - // if the table has pk or uk, - // but all the columns in the pk or uk are virtual generated columns, - // we also don't generate batch sql, - // because the virtual generated column can't be used to detect the row identity. - allVirtualGenerated := true - colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() - for _, colID := range colIDs { - info, exist := tableInfo.GetColumnInfo(colID) - if !exist { - continue - } - if !info.IsVirtualGenerated() { - allVirtualGenerated = false - } - } - if allVirtualGenerated { - return false - } } + + // If all handle-key columns are virtual generated, + // batch SQL cannot be used because virtual columns can't identify rows. + allVirtualGenerated := true + colIDs := tableInfo.GetOrderedHandleKeyColumnIDs() + for _, colID := range colIDs { + info, exist := tableInfo.GetColumnInfo(colID) + if !exist { + continue + } + if !info.IsVirtualGenerated() { + allVirtualGenerated = false + break + } + } + if allVirtualGenerated { + return false + }Also added
breakon Line 149 once a non-virtual column is found — no need to keep iterating.
156-156: Remove stale comment.
// if tableInfo.HasVirtualColumns()appears to be a leftover from removed code. It doesn't serve as documentation and could confuse future readers. Remove it.
| buildTableInfoWithVirtualGeneratedHandleKey := func(tableName string) *common.TableInfo { | ||
| aFieldType := types.NewFieldType(tidbmysql.TypeLong) | ||
| aFieldType.AddFlag(tidbmysql.NotNullFlag) | ||
| bFieldType := types.NewFieldType(tidbmysql.TypeLong) | ||
| bFieldType.AddFlag(tidbmysql.NotNullFlag) | ||
| cFieldType := types.NewFieldType(tidbmysql.TypeLong) | ||
| cFieldType.AddFlag(tidbmysql.NotNullFlag) | ||
|
|
||
| ti := &tidbmodel.TableInfo{ | ||
| ID: 101, | ||
| Name: ast.NewCIStr(tableName), | ||
| Columns: []*tidbmodel.ColumnInfo{ | ||
| { | ||
| ID: 1, | ||
| Name: ast.NewCIStr("a"), | ||
| Offset: 0, | ||
| FieldType: *aFieldType, | ||
| State: tidbmodel.StatePublic, | ||
| }, | ||
| { | ||
| ID: 2, | ||
| Name: ast.NewCIStr("b"), | ||
| Offset: 1, | ||
| FieldType: *bFieldType, | ||
| State: tidbmodel.StatePublic, | ||
| GeneratedExprString: "a + c", | ||
| GeneratedStored: false, | ||
| }, | ||
| { | ||
| ID: 3, | ||
| Name: ast.NewCIStr("c"), | ||
| Offset: 2, | ||
| FieldType: *cFieldType, | ||
| State: tidbmodel.StatePublic, | ||
| }, | ||
| }, | ||
| Indices: []*tidbmodel.IndexInfo{ | ||
| { | ||
| ID: 1, | ||
| Name: ast.NewCIStr("idx1"), | ||
| Unique: true, | ||
| State: tidbmodel.StatePublic, | ||
| Columns: []*tidbmodel.IndexColumn{ | ||
| { | ||
| Name: ast.NewCIStr("a"), | ||
| Offset: 0, | ||
| Length: types.UnspecifiedLength, | ||
| }, | ||
| { | ||
| Name: ast.NewCIStr("b"), | ||
| Offset: 1, | ||
| Length: types.UnspecifiedLength, | ||
| }, | ||
| }, | ||
| }, | ||
| }, | ||
| } | ||
| info := common.NewTableInfo4Decoder("test", ti) | ||
| require.NotNil(t, info) | ||
| return info | ||
| } | ||
| virtualHandleKeyTableInfo := buildTableInfoWithVirtualGeneratedHandleKey("t_virtual_uk") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
echo "=== Check HasPKOrNotNullUK computation in common/table_info.go ==="
rg -n "HasPKOrNotNullUK\s*=" pkg/common/table_info.go -B 5 -A 5
echo ""
echo "=== Check GetOrderedHandleKeyColumnIDs implementation ==="
rg -n "func.*GetOrderedHandleKeyColumnIDs" pkg/common/ -A 25
echo ""
echo "=== Check NewTableInfo4Decoder for virtual column handling ==="
rg -n "func NewTableInfo4Decoder" pkg/common/ -A 30Repository: pingcap/ticdc
Length of output: 4332
🏁 Script executed:
#!/bin/bash
echo "=== Search for newColumnSchema4Decoder implementation ==="
rg -n "func newColumnSchema4Decoder" pkg/common/ -A 40
echo ""
echo "=== Search for HandleKeyIDList assignment ==="
rg -n "HandleKeyIDList\s*=" pkg/common/ -B 3 -A 3
echo ""
echo "=== Check mysql_writer_dml.go lines 130-160 for the actual logic ==="
sed -n '130,160p' pkg/sink/mysql/mysql_writer_dml.go
echo ""
echo "=== Check if virtual columns are excluded from HandleKeyIDList ==="
rg -n "IsVirtualGenerated" pkg/common/table_info_helper.go -B 5 -A 5Repository: pingcap/ticdc
Length of output: 6833
🏁 Script executed:
#!/bin/bash
echo "=== Search for HasPKOrNotNullUK definition and computation ==="
rg -n "HasPKOrNotNullUK" pkg/common/ -B 3 -A 3
echo ""
echo "=== Get full newColumnSchema function ==="
sed -n '489,750p' pkg/common/table_info_helper.go | head -150Repository: pingcap/ticdc
Length of output: 6952
🏁 Script executed:
#!/bin/bash
echo "=== Get full OriginalHasPKOrNotNullUK function ==="
sed -n '406,450p' pkg/common/table_info.go
echo ""
echo "=== Check if GetPrimaryKey is used and what happens with virtual columns ==="
rg -n "GetPrimaryKey" pkg/common/table_info.go -B 2 -A 5
echo ""
echo "=== Specifically check the implicit PK logic for virtual columns ==="
sed -n '559,620p' pkg/common/table_info_helper.goRepository: pingcap/ticdc
Length of output: 3957
🏁 Script executed:
#!/bin/bash
echo "=== Get the full shouldGenBatchSQL function ==="
sed -n '126,225p' pkg/sink/mysql/mysql_writer_dml.goRepository: pingcap/ticdc
Length of output: 2229
🏁 Script executed:
#!/bin/bash
echo "=== Check the test case expectations for virtual column ==="
sed -n '220,245p' pkg/sink/mysql/mysql_writer_dml_test.go
echo ""
echo "=== Check how newDMLEvent creates events and what Len() returns ==="
rg -n "func newDMLEvent" pkg/sink/mysql/ -A 15
echo ""
echo "=== Verify the test parameters for the virtual column case ==="
sed -n '225,240p' pkg/sink/mysql/mysql_writer_dml_test.goRepository: pingcap/ticdc
Length of output: 2523
🏁 Script executed:
#!/bin/bash
echo "=== Check IsVirtualGenerated implementation ==="
rg -n "func.*IsVirtualGenerated" -A 5
echo ""
echo "=== Verify GetColumnInfo works correctly ==="
rg -n "func.*GetColumnInfo" pkg/common/table_info.go -A 5Repository: pingcap/ticdc
Length of output: 878
🏁 Script executed:
#!/bin/bash
echo "=== Search for IsVirtualGenerated in all files ==="
rg -n "IsVirtualGenerated" . --type go 2>/dev/null | head -20
echo ""
echo "=== Check model.ColumnInfo structure (from TiDB import) ==="
grep -r "type ColumnInfo" . --include="*.go" | grep -v "TableColumnInfo\|RowColumnInfo" | head -5
echo ""
echo "=== Verify test table structure - check if column b is correctly marked virtual ==="
sed -n '72,105p' pkg/sink/mysql/mysql_writer_dml_test.goRepository: pingcap/ticdc
Length of output: 3301
Test expectation is incorrect; test case does not exercise the all-virtual-columns code path.
The test at line 234 expects want: false, but the production logic should return true for this table. The test table has a unique index on columns (a, b) where column a is normal (not virtual) and column b is virtual generated. When shouldGenBatchSQL evaluates this:
HasPKOrNotNullUKistrue(unique index on NOT NULL columns)- The loop at lines 141–150 sets
allVirtualGenerated = falsebecause columnais not virtual - The condition at line 152 is not triggered
- With 2 events and both having the same safe mode (
CommitTs > ReplicatingTs), the function returnstrue
The test name suggests it validates behavior when a handle key contains virtual columns, but the code only disables batch SQL when all handle-key columns are virtual. To properly test lines 141–154, add a case with a unique index where all columns are virtual generated.
🤖 Prompt for AI Agents
In `@pkg/sink/mysql/mysql_writer_dml_test.go` around lines 72 - 133, The test
currently misasserts shouldGenBatchSQL for virtualHandleKeyTableInfo: update the
existing expectation to want:true (the current table has a NOT NULL normal
column 'a' so batch SQL should be allowed), and add a new test case that uses a
builder (e.g., extend or add buildTableInfoWithAllVirtualHandleKey) that makes
all handle-index columns (both 'a' and 'b') virtual/generated (set
GeneratedExprString and GeneratedStored appropriately) and assert want:false to
exercise the allVirtualGenerated code path in shouldGenBatchSQL.
| { | ||
| name: "batch dml is disabled", | ||
| hasPK: true, | ||
| hasVirtualCols: false, | ||
| events: []*commonEvent.DMLEvent{newDMLEvent(t, 1, 1, 1)}, | ||
| config: &Config{SafeMode: false, BatchDMLEnable: false}, | ||
| want: false, | ||
| name: "handle key contains virtual generated column should not use batch SQL", | ||
| tableInfo: virtualHandleKeyTableInfo, | ||
| events: []*commonEvent.DMLEvent{newDMLEvent(t, 2, 1, 1), newDMLEvent(t, 3, 1, 1)}, | ||
| adjustWriter: func(w *Writer) { | ||
| w.cfg.SafeMode = false | ||
| w.cfg.BatchDMLEnable = true | ||
| }, | ||
| want: false, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test coverage gap: no case exercises the all-virtual-generated handle-key path.
The virtual column test case here has a mix of virtual and non-virtual columns in the unique index. If HasPKOrNotNullUK ends up false for this table (see comment above), then Lines 141–154 of the production code — the core new logic — are never tested.
Consider adding a test where the table has a primary key or not-null unique key composed entirely of virtual generated columns, so the allVirtualGenerated → return false path at Line 152 is exercised.
🤖 Prompt for AI Agents
In `@pkg/sink/mysql/mysql_writer_dml_test.go` around lines 226 - 235, Add a new
unit test variant that mirrors the existing "handle key contains virtual
generated column should not use batch SQL" case but uses a table schema whose
primary key or not-null unique key is composed entirely of virtual generated
columns (so HasPKOrNotNullUK returns true and allVirtualGenerated is exercised);
create a new tableInfo (e.g., virtualOnlyHandleKeyTableInfo) with only virtual
generated columns in the PK/unique key, reuse newDMLEvent for events and the
same adjustWriter to set SafeMode=false and BatchDMLEnable=true, and assert
want:false so the code path in Writer that checks allVirtualGenerated (the
branch that returns false for batch SQL) is covered.
|
/test mysql |
|
@wk989898: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #3294 close #4184
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
Bug Fixes
New Features
Refactor
Tests