Commit e4f8b75
Add full DOCX and enhanced PDF support to RAG system
- Installed mammoth.js for DOCX file processing
- Enhanced PDF extraction with metadata (title, author, page count)
- Added structure-preserving chunking for both PDF and DOCX
- DOCX files preserve headings, lists, and paragraph structure
- Smart chunking maintains document context with overlap
- Added visual processing indicators showing file type icons
- Comprehensive test coverage for new document processing features
- Updated documentation to reflect new capabilities
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 1e47ce7 commit e4f8b75
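The diff contents are not shown on this page, so the following is only a minimal sketch of what "chunking with overlap" can look like, not the code from this commit. The names `Chunk` and `chunkWithOverlap`, the character-based sizes, and the choice to break on blank lines are all assumptions made for illustration.

```typescript
// Hypothetical sketch: these names and defaults are NOT taken from the commit.
interface Chunk {
  index: number;
  text: string;
}

/**
 * Split `text` into chunks of at most `maxChars` characters.
 * Each chunk after the first starts `overlap` characters before the end of
 * the previous one, so neighbouring chunks share some context.
 * When a paragraph break ("\n\n") is available inside the window, the chunk
 * ends there instead of cutting mid-paragraph.
 */
function chunkWithOverlap(text: string, maxChars = 1000, overlap = 200): Chunk[] {
  if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
    throw new Error("require maxChars > 0 and 0 <= overlap < maxChars");
  }
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    if (end < text.length) {
      // Prefer a paragraph boundary, but only if it lies past the overlap
      // region; otherwise the next window would not move forward.
      const breakAt = text.lastIndexOf("\n\n", end);
      if (breakAt > start + overlap) end = breakAt;
    }
    chunks.push({ index: chunks.length, text: text.slice(start, end) });
    if (end === text.length) break;
    start = end - overlap; // step back so the next chunk repeats the tail
  }
  return chunks;
}

// Usage: 10 characters, windows of 4, 1 character of overlap.
const demo = chunkWithOverlap("abcdefghij", 4, 1);
console.log(demo.map((c) => c.text)); // [ 'abcd', 'defg', 'ghij' ]
```

A character-based window is the simplest choice for a sketch. A real pipeline might measure chunk size in tokens and treat headings or list items as preferred break points, in line with the structure-preserving behaviour the commit message describes.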
File tree
7 files changed, +718 -52 lines changed
- src/lib
- components
- types
- utils
- tests/unit