feat: add image input support for vision models #596

gy-mate · 2025-08-25T10:50:54Z

Describe your changes

Add --image/-i flag to include image files in prompts
Support common formats: PNG, JPEG, GIF, WebP (max 5MB, max 10 images)
OpenAI: use content arrays with text and image_url parts
Ollama: use native Images field for vision models like LLaVA
Error gracefully for non-vision APIs (Anthropic, Google, Cohere)
Validate file existence, format, and size limits
Works with any OpenAI-compatible endpoint in config
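
For reviewers, the validation rules listed above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the names `maxImageBytes`, `maxImageCount`, `checkImage`, `dataURL`, and `loadImages` are hypothetical, and reading "5MB" as 5 × 1024 × 1024 bytes is an assumption. It sniffs the format from the file contents with the standard library's `http.DetectContentType` and encodes each image as a base64 data URL, which is one form an OpenAI-style `image_url` part accepts.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
	"os"
)

// Limits from the PR description; the names and the exact byte count are assumptions.
const (
	maxImageBytes = 5 * 1024 * 1024 // "max 5MB" per image
	maxImageCount = 10              // "max 10 images" per prompt
)

// allowedTypes holds the MIME types for PNG, JPEG, GIF and WebP.
var allowedTypes = map[string]bool{
	"image/png":  true,
	"image/jpeg": true,
	"image/gif":  true,
	"image/webp": true,
}

// checkImage enforces the size limit and returns the MIME type sniffed
// from the content (not from the file extension).
func checkImage(data []byte) (string, error) {
	if len(data) > maxImageBytes {
		return "", fmt.Errorf("image is %d bytes, limit is %d", len(data), maxImageBytes)
	}
	mime := http.DetectContentType(data) // looks at no more than the first 512 bytes
	if !allowedTypes[mime] {
		return "", fmt.Errorf("unsupported image type %q", mime)
	}
	return mime, nil
}

// dataURL encodes image bytes as a base64 data URL.
func dataURL(mime string, data []byte) string {
	return "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(data)
}

// loadImages checks the image count, then reads and validates each file,
// returning one data URL per image.
func loadImages(paths []string) ([]string, error) {
	if len(paths) > maxImageCount {
		return nil, fmt.Errorf("got %d images, limit is %d", len(paths), maxImageCount)
	}
	urls := make([]string, 0, len(paths))
	for _, p := range paths {
		data, err := os.ReadFile(p) // fails if the file does not exist
		if err != nil {
			return nil, fmt.Errorf("image %s: %w", p, err)
		}
		mime, err := checkImage(data)
		if err != nil {
			return nil, fmt.Errorf("image %s: %w", p, err)
		}
		urls = append(urls, dataURL(mime, data))
	}
	return urls, nil
}

func main() {
	urls, err := loadImages(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("validated %d image(s)\n", len(urls))
}
```

Sniffing the content rather than trusting the extension means a mislabeled file is rejected before any request is sent. For Ollama, the same validated bytes could presumably be base64-encoded without the `data:` prefix, but that detail is not confirmed by this description.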

Authored-By: @claude, @anuramat, @gy-mate

Related issue

Resolves #364.

Checklist before requesting a review

I have read CONTRIBUTING.md
I have performed a self-review of my code. It works!

If this is a feature

I have created a discussion
A project maintainer has approved this feature request. Link to comment:

@anuramat


gy-mate · 2025-08-25T20:07:50Z

The linter error refers to a line that is left unchanged by this PR.

gy-mate · 2025-11-07T16:09:14Z

@caarlos0 Could you please review this PR?

Many thanks in advance! :)

caarlos0 · 2026-01-05T14:26:31Z

i wonder how much of this is really needed, and how much of it is needed here.

afaik fantasy and crush already support passing image attachments, and we already have code handling mime types and stuff like that.

shouldn't we maybe have a new API for image models in fantasy? and probably eventually another one for audio etc? maybe @kujtimiihoxha and @andreynering have more thoughts on this

gy-mate · 2026-01-05T20:56:37Z

I'd like to use a CLI for this purpose. As far as I understand, crush only has a basic CLI with no piping or follow-up options and fantasy doesn't have one at all. That's why I'd love to see this feature in mods.

andreynering · 2026-01-06T13:23:09Z

Hey @gy-mate,

Can you let us know what you miss from Crush that doesn't fit your use case? What do you mean by "follow-up options"?

We basically plan to retire Mods in favor of crush run, but we know we still have work to do before it has all the meaningful features.

gy-mate · 2026-01-08T13:52:01Z

Hi @andreynering! :)

> Can you let us know what you miss from Crush that doesn't fit your use case? What do you mean by "follow-up options"?

I meant mods -c and mods -C, which continue a / the previous conversation. (Although I would switch their syntax if they get reimplemented in crush, because -C is probably more commonly used than -c.)

gy-mate · 2026-01-21T11:53:30Z

@caarlos0 @andreynering Could you please review my PR in light of the above? Many thanks in advance! :)

andreynering · 2026-01-22T13:21:18Z

Hi @gy-mate,

We plan to sunset Mods really soon and archive this repository.

From now on, Crush is our focus, and we do acknowledge how important non-interactive mode is! In fact, yesterday we pushed a release with --model flag support and we want to continue to make progress in that area.

If you want to contribute to this feature on Crush, that would be wonderful. Otherwise, we'll eventually do that ourselves.

In the meantime, if your implementation on Mods works well, you can use your fork.

gy-mate · 2026-01-23T17:07:51Z

> We plan to sunset Mods really soon and archive this repository.

Oh, I see. Thanks for the info! :)

> From now on, Crush is our focus, and we do acknowledge how important non-interactive mode is! In fact, yesterday we pushed a release with --model flag support and we want to continue to make progress in that area.

Awesome, thank you! :)

> If you want to contribute to this feature on Crush, that would be wonderful. Otherwise, we'll eventually do that ourselves.

Shall I open relevant issues in the Crush repo? Or do they already exist?

andreynering · 2026-01-23T17:11:23Z

> Shall I open relevant issues in the Crush repo? Or do they already exist?

I'm not sure. Worth searching if they exist already. Otherwise, feel free to open new issues.

gy-mate · 2026-01-26T07:28:01Z

Great, thanks! I've opened charmbracelet/crush#1982 and charmbracelet/crush#1983.

andreynering · 2026-01-26T14:23:22Z

Awesome, thank you!

gy-mate requested a review from caarlos0 as a code owner August 25, 2025 10:50

gy-mate marked this pull request as draft August 25, 2025 10:53

fix: reformat comments

e7dd872

gy-mate force-pushed the image-input branch 2 times, most recently from 71bb937 to 5a9d0d1 Compare August 25, 2025 19:39

gy-mate added 5 commits August 25, 2025 21:47

fix: remove unused functions

5e71687

fix: remove unused import

6a541f7

fix: catch file closure exception

208d16b

fix: pre-allocate slices

3407f0b

fix: change error printing function

5157f51

gy-mate force-pushed the image-input branch from 5a9d0d1 to fdc232a Compare August 25, 2025 19:48

gy-mate added 3 commits August 25, 2025 21:56

fix: change variable declaration

b66ac64

fix: replace space indentation with tabs

b5c1337

fix: align variable declarations

a1c16d3

gy-mate force-pushed the image-input branch from fdc232a to a1c16d3 Compare August 25, 2025 19:57

gy-mate marked this pull request as ready for review August 25, 2025 20:01

gy-mate mentioned this pull request Jan 26, 2026

crush run sessions charmbracelet/crush#1982

Open
