-
Notifications
You must be signed in to change notification settings - Fork 108
[MU][Option 2] Email content not displayed due to sanitize html #4188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: maintenance-v0.19.x
Are you sure you want to change the base?
Conversation
|
This PR has been deployed to https://linagora.github.io/tmail-flutter/4188. |
8880a57 to
6df4cd6
Compare
|
So we are considering switching to a whitekist strategy to a black list one? Sorry I do not buy it. I cannot trust us to know every possible unsafe html tag or attribute. What's wrong with option 1 Cc @Crash-- |
We will continue using the whitelist, which includes the originally allowed tags. However, for tags that are not on the whitelist, we will no longer remove their entire content or child nodes as in the previous logic. Instead, we will remove only the disallowed tag itself and continue sanitizing its inner elements.
That is obvious, but we always maintain a broader set of safe tags (based on HTML standards and Google’s recommendations). We also do not need to enforce overly strict constraints on this matter.
Option 1 is not incorrect, but its limitation is that whenever a new tag appears that is not included in the whitelist, it will be removed entirely (both the tag itself and all elements it contains). This affects how emails are rendered and makes maintenance more difficult and time-consuming. |
|
Thanks for the clarification |
|
See the document at linagora/dart-neats@c02d138#diff-608d87b363f587deb3bb659db286a66fc82b0ea1ead0d33e22a9ca9230b4ad44 to better understand this improvement. |
6df4cd6 to
935f55c
Compare
WalkthroughThis PR centralizes HTML sanitization by replacing public allowlists with a private SanitizeHtml instance, updates the sanitize_html dependency ref to a new engine, strengthens and reorganizes sanitization tests, adjusts HTML generation CSS for tables, narrows anchor processing to only anchors with href, and integrates the standardized sanitizer into the email transformation pipeline and calendar-event transformer configuration. Tests and controller code were updated to reflect attribute-order and output differences from the new sanitizer behavior. Changes
Estimated code review effort3 (Moderate) — ~25 minutes
PoemA rabbit hops through HTML streams, Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
contact/pubspec.lockis excluded by!**/*.lockcore/pubspec.lockis excluded by!**/*.lockmodel/pubspec.lockis excluded by!**/*.lockpubspec.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
core/lib/presentation/utils/html_transformer/text/standardize_html_sanitizing_transformers.dart(1 hunks)core/lib/utils/html/html_utils.dart(1 hunks)core/pubspec.yaml(1 hunks)core/test/utils/standardize_html_sanitizing_transformers_test.dart(3 hunks)lib/features/email/data/local/html_analyzer.dart(2 hunks)lib/features/email/presentation/controller/single_email_controller.dart(1 hunks)test/features/email/presentation/controller/single_email_controller_test.dart(0 hunks)
💤 Files with no reviewable changes (1)
- test/features/email/presentation/controller/single_email_controller_test.dart
🔇 Additional comments (9)
core/lib/utils/html/html_utils.dart (1)
154-156: LGTM! CSS enhancement for table word-breaking.The added CSS rule
word-break: normal !importantfor table elements complements the existingwhite-space: normal !importantrule and ensures consistent text handling in tables across different email clients.core/lib/presentation/utils/html_transformer/text/standardize_html_sanitizing_transformers.dart (2)
6-6: Verify thread-safety of the centralized sanitizer instance.The static final
_sanitizerinstance will be shared across all invocations ofStandardizeHtmlSanitizingTransformers.process(). This is generally safe for stateless sanitizers, but please confirm that theSanitizeHtmlclass is thread-safe and doesn't maintain mutable state that could lead to race conditions when processing multiple emails concurrently.
11-14: Verify the default sanitization configuration matches previous behavior.The refactored implementation now relies entirely on the default configuration of
SanitizeHtml()instead of explicit allowlists. This is cleaner but changes the sanitization behavior.Based on the PR discussion, the new engine preserves the whitelist but changes behavior for non-whitelisted tags (unwrapping instead of removing). Please verify:
- The default allowlist in the new engine matches the previously used
mailAllowedHtmlTagsandmailAllowedHtmlAttributes- The unwrapping behavior for unknown tags doesn't introduce security risks
- The change has been tested with real-world email content, especially content with non-standard tags
core/test/utils/standardize_html_sanitizing_transformers_test.dart (4)
82-83: LGTM! Good test helper abstraction.The
sanitize()helper function reduces duplication and makes tests more readable by encapsulating the common pattern of callingtransformer.process()and trimming the result.
86-130: LGTM! Comprehensive XSS protection tests.The test group thoroughly validates removal of event handlers across all DOM events and element types. The systematic approach testing every event attribute on every allowed tag ensures robust XSS protection.
226-246: LGTM! Thorough validation of nested dangerous content removal.This test effectively verifies that the sanitizer removes all dangerous vectors (onclick, script tags, onerror, javascript: URLs) while preserving safe structural elements and content. The assertions checking for both presence and absence of specific content are well-chosen.
249-329: LGTM! Excellent coverage of form-based attack vectors.These test groups comprehensively validate protection against:
- CSRF token stealing via hidden inputs
- Credential phishing via form submissions
- File upload exploitation
The tests correctly verify that forms are unwrapped (preserving safe nested content) while dangerous attributes and form inputs are removed.
lib/features/email/presentation/controller/single_email_controller.dart (1)
1743-1748: The transformer reordering is already validated by comprehensive tests.The tests in
single_email_controller_test.dart(lines 339-340 and 389-391) explicitly verify that calendar event parsing works correctly with the current transformer order:
- Test with auto-linking: Validates URLs convert to proper
<a>tags and newlines become<br>tags- Test with XSS removal: Validates that script tags and javascript: URLs are removed before autolink processing, then safe URLs are properly linkified
The expected output in both tests confirms the transformer order
StandardizeHtmlSanitizingTransformers()→SanitizeAutolinkUnescapeHtmlTransformer()is correct, as:
- Dangerous content is sanitized first (scripts, javascript: URLs stripped)
- Safe URLs are then converted to anchor tags
- HTML entities and newlines are properly transformed
No action needed; existing tests cover all three concerns raised.
lib/features/email/data/local/html_analyzer.dart (1)
7-7: The transformer ordering is correct as implemented.The pipeline for text/plain content—StandardizeHtmlSanitizingTransformers → SanitizeAutolinkHtmlTransformers → PersistPreformattedTextTransformer—produces the expected output:
StandardizeHtmlSanitizingTransformers (first): Defensively sanitizes input using the
sanitize_htmlpackage. For legitimate plain text emails, this is a no-op; for text containing stray/malicious HTML (edge case), it removes dangerous content before linkification.SanitizeAutolinkHtmlTransformers (second): Detects URLs and email addresses using the linkify library, converts them to HTML anchor tags, and escapes remaining plain text portions. The internal
escapeHtml=trueflag ensures plain text is properly escaped.PersistPreformattedTextTransformer (third): Preserves ASCII art and tables by wrapping them in styled divs; otherwise passes content through unchanged.
The code correctly handles:
- Email addresses and URLs (auto-linked to
<a>tags)- Preformatted content (code blocks, ASCII art preserved with CSS styling)
- Stray HTML tags (sanitized away in first transformer)
|
@chibenwa please also config rabbitai to review this PR linagora/dart-neats#1 |
|
I enabled code rabbit on dart-neats repo |
935f55c to
b8231d1
Compare
Fixed Screen.Recording.2025-12-10.at.16.04.19.mov |
b8231d1 to
7d037aa
Compare
|
@coderabbitai please review |
|
@chibenwa: I'll review the changes in this PR. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
contact/pubspec.lockis excluded by!**/*.lockcore/pubspec.lockis excluded by!**/*.lockmodel/pubspec.lockis excluded by!**/*.lockpubspec.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
core/lib/presentation/utils/html_transformer/dom/sanitize_hyper_link_tag_in_html_transformers.dart(1 hunks)core/lib/presentation/utils/html_transformer/text/standardize_html_sanitizing_transformers.dart(1 hunks)core/pubspec.yaml(1 hunks)core/test/utils/standardize_html_sanitizing_transformers_test.dart(3 hunks)lib/features/email/data/local/html_analyzer.dart(2 hunks)lib/features/email/presentation/controller/single_email_controller.dart(1 hunks)test/features/email/presentation/controller/single_email_controller_test.dart(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- core/pubspec.yaml
- core/lib/presentation/utils/html_transformer/text/standardize_html_sanitizing_transformers.dart
- lib/features/email/data/local/html_analyzer.dart
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-09T09:36:45.349Z
Learnt from: dab246
Repo: linagora/tmail-flutter PR: 4194
File: lib/features/manage_account/presentation/manage_account_dashboard_controller.dart:174-176
Timestamp: 2025-12-09T09:36:45.349Z
Learning: In Dart/Flutter projects using GetX, do not wrap getBinding<T>() calls in try/catch since they return null when not found. Only wrap Get.find<T>() calls in try/catch because they throw if a dependency is unavailable. When a binding is optional, consider checking Get.isRegistered<T>() or handling the null/exception path gracefully instead of blindly catching, and document the expectation for failure modes where a dependency may not be registered.
Applied to files:
core/lib/presentation/utils/html_transformer/dom/sanitize_hyper_link_tag_in_html_transformers.darttest/features/email/presentation/controller/single_email_controller_test.dartlib/features/email/presentation/controller/single_email_controller.dartcore/test/utils/standardize_html_sanitizing_transformers_test.dart
📚 Learning: 2025-12-09T12:47:45.861Z
Learnt from: zatteo
Repo: linagora/tmail-flutter PR: 4196
File: scribe/lib/scribe/ai/data/datasource_impl/ai_datasource_impl.dart:34-35
Timestamp: 2025-12-09T12:47:45.861Z
Learning: In the tmail-flutter repository, avoid suggesting replacing DioError with DioException unless there is a codebase-wide migration. This pattern applies to all Dart files; when reviewing, only propose a DioError-safe change (e.g., compatible error handling or conversion) unless a global migration is in place. Ensure consistency of DioError usage across modules and flag any deviations for a repo-wide decision.
Applied to files:
core/lib/presentation/utils/html_transformer/dom/sanitize_hyper_link_tag_in_html_transformers.darttest/features/email/presentation/controller/single_email_controller_test.dartlib/features/email/presentation/controller/single_email_controller.dartcore/test/utils/standardize_html_sanitizing_transformers_test.dart
📚 Learning: 2025-12-12T04:54:05.432Z
Learnt from: dab246
Repo: linagora/tmail-flutter PR: 4191
File: lib/features/email/presentation/extensions/handle_email_action_extension.dart:37-80
Timestamp: 2025-12-12T04:54:05.432Z
Learning: In lib/features/email/presentation/extensions/handle_email_action_extension.dart, the mailboxDashBoardController.selectedEmail should only be synchronized when isMobileThreadDisabled is true. This is intentional behavior and should not be changed to update selectedEmail in non-mobile or thread-enabled contexts.
Applied to files:
test/features/email/presentation/controller/single_email_controller_test.dartlib/features/email/presentation/controller/single_email_controller.dart
🔇 Additional comments (10)
core/test/utils/standardize_html_sanitizing_transformers_test.dart (7)
86-100: Excellent comprehensive event attribute sanitization coverage.The nested loops testing all 57 event attributes across all allowed tags provide robust coverage against event-handler-based XSS attacks. This systematic approach ensures no event handler slips through.
133-141: Verify javascript: href removal is complete.The test expects javascript: hrefs to be removed, but only checks that the href attribute is gone. Consider verifying that the link content remains intact after sanitization (the test checks for
<a id="id1">test</a>which looks correct).
164-178: Validate unknown tag unwrapping behavior.The tests verify that unknown tags like
<supress_time_adjustment>and<google-sheets-html-origin>are unwrapped while preserving their children. This aligns with Option 2's approach described in the PR objectives.
249-272: Strong form attribute sanitization coverage.The comprehensive list of 14 dangerous form attributes and systematic testing ensures protection against CSRF/phishing attacks via form submission vectors.
274-291: Form child element removal is thorough.Testing removal of input, button, textarea, select, and option elements prevents credential phishing and data exfiltration attacks.
368-440: Outstanding real-world attack vector coverage.The tests cover credential phishing, CSRF token stealing, javascript/data URI forms, file upload exploitation, and mixed XSS vectors. This provides strong evidence of the sanitizer's security posture.
10-21: TheallowedTagsconstant in the test is a test fixture used for iterating through tags in test cases, not a reference implementation list. TheStandardizeHtmlSanitizingTransformersdelegates to the externalsanitize_htmlpackage from pub.dev, which defines its own tag allowlist. No tag allowlist is maintained in the application's implementation, so there is nothing to verify against.Likely an incorrect or invalid review comment.
core/lib/presentation/utils/html_transformer/dom/sanitize_hyper_link_tag_in_html_transformers.dart (1)
21-21: Optimize selector to target only anchors with href.The change from
querySelectorAll('a')toquerySelectorAll('a[href]')is a sensible optimization. Anchors without href don't represent hyperlinks and don't require the URL sanitization, target, and rel modifications this transformer applies.lib/features/email/presentation/controller/single_email_controller.dart (1)
1743-1752: Standardize transformation pipeline for calendar events.The migration from
TransformConfiguration.fromTextTransformerstoTransformConfiguration.createenables both text and DOM transformations for calendar event descriptions. This ensures consistent sanitization across all email content processing paths.The configuration now:
- Preserves the existing text transformers (autolink, sanitize, newline)
- Adds DOM-level hyperlink sanitization with tooltip support for web
Based on learnings, this composite transformer approach provides defense-in-depth sanitization.
test/features/email/presentation/controller/single_email_controller_test.dart (1)
344-348: Test expectations updated to reflect new transformation order.The updated expectations correctly reflect the new attribute order produced by the enhanced HTML transformation pipeline:
rel="noreferrer"now precedes thestyleattributetarget="_blank"follows thestyleattributeThe sanitization behavior remains correct:
- Safe URLs are preserved and decorated with security attributes
- Dangerous javascript: hrefs are removed (line 404 shows the href attribute stripped while preserving link text)
Also applies to: 398-405
| test('SHOULD remove dangerous JS text nodes inside HTML', () { | ||
| const html = 'hello <p>document.cookie</p> world'; | ||
| expect(sanitize(html), equals('hello <p></p> world')); | ||
| }); | ||
|
|
||
| test('SHOULD remove HTML comments entirely', () { | ||
| const html = '<div><!--test--><p>A</p><!--x--></div>'; | ||
| expect(sanitize(html), equals('<div><p>A</p></div>')); | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify "dangerous JS text nodes" test expectation.
Line 190 tests that text content document.cookie inside a <p> tag is removed, but this seems overly aggressive. Plain text that happens to contain JavaScript keywords shouldn't be stripped if it's not in a dangerous context.
Could you clarify the intended behavior? If the sanitizer removes any text matching JavaScript patterns, this might affect legitimate email content that references these terms.


Descriptions
This is the second proposed solution to address the issue where email content fails to display after applying the HTML sanitization process. This solution focuses on strengthening the
sanitize_htmllibrary, ensuring that the sanitizer removes or ignores only unsafe tags and attributes without stripping valid email content.Dependency
Related
#4184
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.