Optimize PDF signature encoding: use HEX only when needed for Unicode characters #292

Copilot · 2025-08-15T14:34:42Z

This PR fixes an issue where Japanese and other multi-byte characters were appearing garbled when viewing PDF signature information in PDF software or browsers, while optimizing the encoding to only use HEX when necessary.

Problem

When Japanese characters were used in signature information fields (reason, contactInfo, name, location), they appeared corrupted or garbled in the signature details shown by PDF viewers. This happened because the code used PDFString.of(), which doesn't properly handle multi-byte Unicode characters.
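
To make the failure mode concrete, here is a minimal sketch. It assumes a literal string is written out one byte per UTF-16 code unit; the serializer behind PDFString.of() is not shown in this PR, so `literalStringBytes` is a hypothetical stand-in rather than the library's code.

```javascript
// Hypothetical stand-in for a literal-string serializer: one byte per
// UTF-16 code unit. A Uint8Array keeps only the low 8 bits of each value.
function literalStringBytes(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i += 1) {
    bytes[i] = text.charCodeAt(i);
  }
  return bytes;
}

const ascii = literalStringBytes('Tokyo');
const japanese = literalStringBytes('東京');

// ASCII code units fit in one byte and survive unchanged.
console.log(Array.from(ascii));
// Each Japanese code unit (U+6771, U+4EAC) loses its high byte.
console.log(Array.from(japanese));
```

Under that assumption, a viewer reading the bytes back sees unrelated single-byte characters, which matches the garbling described above.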

Solution

Implemented conditional encoding that chooses the method based on the content:

- ASCII-only text: uses PDFString.of() for efficiency and cleaner output
- Unicode/multi-byte text: uses PDFHexString.fromText() to preserve character integrity

The implementation includes:

- needsHexEncoding() helper function that detects non-ASCII characters using /[^\x00-\x7F]/
- createPDFString() helper that selects the appropriate encoding method
- Conditional encoding for all four signature information fields: Reason, ContactInfo, Name, and Location
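
The two helpers listed above can be sketched roughly as follows. The helper names and the regular expression come from this description; the bodies are a guess at the shape, not the merged code. `literalString` and `hexString` are hypothetical stand-ins for PDFString.of() and PDFHexString.fromText() so the sketch runs without pdf-lib.

```javascript
// Stand-ins for PDFString.of() and PDFHexString.fromText(). They only
// record which encoding path was chosen; they do not produce PDF objects.
const literalString = (text) => ({ kind: 'literal', text });
const hexString = (text) => ({ kind: 'hex', text });

// True when the value contains any character outside 7-bit ASCII.
function needsHexEncoding(text) {
  return /[^\x00-\x7F]/.test(text);
}

// Picks the hex path only when the value actually needs it.
function createPDFString(text) {
  return needsHexEncoding(text) ? hexString(text) : literalString(text);
}

// Apply the same rule to each signature information field.
const fields = { Reason: 'Simple ASCII reason', Name: '田中太郎' };
const encoded = Object.fromEntries(
  Object.entries(fields).map(([key, value]) => [key, createPDFString(value)]),
);

console.log(encoded.Reason.kind); // literal
console.log(encoded.Name.kind); // hex
```

Note that a single non-ASCII character anywhere in a value, as in 連絡先@example.com, sends the whole value down the hex path.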

Testing

- Added comprehensive test cases for both ASCII and Unicode text scenarios
- Verified ASCII text uses PDFString (more efficient)
- Verified Unicode text uses PDFHexString (preserves characters)
- Updated existing tests to expect correct encoding based on content
- Japanese test case validates proper encoding and decoding: 日本語の理由, 連絡先@example.com, 田中太郎, 東京、日本
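
The decoding half of such a check could look like the sketch below. It assumes the hex payload is UTF-16BE prefixed with the FEFF byte order mark; `decodeUtf16BeHex` is a hypothetical helper written for illustration, not a function from the test suite or from pdf-lib.

```javascript
// Decodes the payload of a PDF hex string (without the angle brackets),
// assuming UTF-16BE with a leading FEFF byte order mark.
function decodeUtf16BeHex(hex) {
  const upper = hex.toUpperCase();
  if (!upper.startsWith('FEFF')) {
    throw new Error('expected a UTF-16BE byte order mark');
  }
  const body = upper.slice(4);
  if (body.length % 4 !== 0) {
    throw new Error('expected whole 16-bit code units');
  }
  let text = '';
  for (let i = 0; i < body.length; i += 4) {
    text += String.fromCharCode(parseInt(body.slice(i, i + 4), 16));
  }
  return text;
}

console.log(decodeUtf16BeHex('FEFF65E5672C8A9E')); // 日本語
```

The length check rejects payloads that end in the middle of a code unit, which is a cheap way to catch truncated values.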

Example

ASCII text (efficient encoding):

Reason: (Simple ASCII reason)  // Uses PDFString.of()

Unicode text (preserves characters):

Reason: <FEFF65E5672C8A9E306E74067531>  // Uses PDFHexString.fromText() for 日本語の理由
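
For readers wondering where such a hex value comes from, the sketch below builds one by hand. It assumes the UTF-16BE-with-BOM form that the PDF specification defines for Unicode text strings; `toUtf16BeHex` is a hypothetical helper for illustration and is not claimed to match the internals of PDFHexString.fromText().

```javascript
// Builds a PDF hex string: FEFF byte order mark, then each UTF-16 code
// unit as four uppercase hex digits, wrapped in angle brackets.
function toUtf16BeHex(text) {
  let hex = 'FEFF';
  for (let i = 0; i < text.length; i += 1) {
    hex += text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
  }
  return `<${hex}>`;
}

console.log(toUtf16BeHex('A')); // <FEFF0041>
console.log(toUtf16BeHex('日本語')); // <FEFF65E5672C8A9E>
```

Every character costs four hex digits in this form, which is why the literal form is the more compact choice for ASCII-only values.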

This approach keeps the output compact and readable for the common ASCII case while ensuring Unicode characters display correctly in PDF viewers. ASCII-only values are still written with PDFString.of(), so the change is backward compatible, and both string forms are compliant with the PDF standard.

Fixes #267.


- Changed PDFString.of() to PDFHexString.fromText() for reason, contactInfo, name, and location fields
- Added comprehensive test case for Japanese character encoding
- Updated existing tests to expect PDFHexString instead of PDFString
- All tests passing and linting clean

Co-authored-by: vbuch <5102057+vbuch@users.noreply.github.com>

vbuch · 2025-08-15T14:54:51Z

@copilot could we only HEX when that is needed (utf chars in the value)? Is that fine with the PDF standard?

Copilot · 2025-08-15T15:12:12Z

> @copilot could we only HEX when that is needed (utf chars in the value)? Is that fine with the PDF standard?

Yes, absolutely! I've implemented conditional encoding that only uses HEX when needed. The code now:

- Uses PDFString.of() for ASCII-only text (more efficient, cleaner output)
- Uses PDFHexString.fromText() only for Unicode/multi-byte characters

A helper function needsHexEncoding() detects if the string contains non-ASCII characters using /[^\x00-\x7F]/. This is fully compliant with the PDF standard and provides the best of both worlds.

Added comprehensive tests to verify both encoding paths work correctly. Commit: ed57f99

coveralls · 2025-08-15T15:20:59Z

coverage: 100.0%. remained the same
when pulling 269e26d on copilot/fix-267
into f54ca67 on develop.

Initial plan

855a3d2

Copilot AI assigned Copilot and vbuch Aug 15, 2025

Copilot started work on behalf of vbuch August 15, 2025 14:34

Copilot AI changed the title ~~[WIP] Garbled characters when Japanese is used for signature information~~ Fix garbled Japanese characters in PDF signature information fields Aug 15, 2025

Copilot AI requested a review from vbuch August 15, 2025 14:47

Copilot finished work on behalf of vbuch August 15, 2025 14:47

Copilot started work on behalf of vbuch August 15, 2025 14:54

Copilot AI changed the title ~~Fix garbled Japanese characters in PDF signature information fields~~ Optimize PDF signature encoding: use HEX only when needed for Unicode characters Aug 15, 2025

Copilot finished work on behalf of vbuch August 15, 2025 15:13
