
Conversation

Contributor
@gvzdv gvzdv commented Nov 17, 2025

Resolves #1174.

Tried to prevent Qwen from getting stuck while working with complex graphics.

Changes made:

  • requested simpler data from Qwen (IDs are assigned later, the confidence score is hard-coded). Suboptimal, but it ensures that the model focuses only on the most important task: labelling objects and determining their bounding boxes (see the post-processing sketch after this list);
  • asked the model to work step by step;
  • changed/added generation hyperparameters to ensure that the model stops producing tokens at some point (i.e., doesn't get stuck in an endless loop).
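
For illustration, a minimal sketch of the post-processing implied by the first bullet; the function name, dictionary keys, and the placeholder confidence value are assumptions, not the PR's actual code:

def to_detected_objects(qwen_output):
    """Assign IDs and a fixed confidence score to Qwen's minimal detections."""
    DEFAULT_CONFIDENCE = 1.0  # assumed placeholder; not produced by the model
    return [
        {
            "id": idx,                         # assigned here, not by the model
            "label": det["label"],             # from Qwen
            "bbox_2d": det["bbox_2d"],         # pixel coordinates from Qwen
            "confidence": DEFAULT_CONFIDENCE,  # hard-coded, per the description above
        }
        for idx, det in enumerate(qwen_output or [])
    ]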

Limited testing on Unicorn. Needs more testing, but the first results are encouraging.


Required Information

  • I referenced the issue addressed in this PR.
  • I described the changes made and how these address the issue.
  • I described how I tested these changes.

Coding/Commit Requirements

  • I followed applicable coding standards where appropriate (e.g., PEP8)
  • I have not committed any models or other large files.

New Component Checklist (mandatory for new microservices)

  • I added an entry to docker-compose.yml and build.yml.
  • I created a CI workflow under .github/workflows.
  • I have created a README.md file that describes what the component does and what it depends on (other microservices, ML models, etc.).

OR

  • I have not added a new component in this PR.

Member
@jeffbl jeffbl left a comment

No red flags. Couple minor nits/questions. Go ahead and merge when ready.

Args:
objects (list): List of detected objects with confidence scores
qwen_output (list): Qwen detection output with bbox_2d and label
width (int): Image width for normalization
Member

in pixels?

Contributor Author

In pixels, yes. Thank you!
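
For clarity, a short sketch of what "Image width for normalization" means given pixel-space bbox_2d coordinates; the function name and the height parameter are assumptions:

def normalize_bbox(bbox_2d, width, height):
    """Convert a pixel-space [x1, y1, x2, y2] box to 0-1 relative coordinates."""
    x1, y1, x2, y2 = bbox_2d
    return [x1 / width, y1 / height, x2 / width, y2 / height]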

  json_schema=BBOX_RESPONSE_SCHEMA,
- temperature=0.0,
  parse_json=True
+ temperature=0.5,
Member

curious why the move back to non-zero temperature?

Contributor Author

It is one of the theories about why this is happening: the same problem has been reported on GitHub. With the default temperature it only happens sometimes, but with the temperature set to 0 it happens every time.

I am not 100% sure it affects performance in our specific case (too many variables: model, quantization, engine...), but I'm willing to try it, since it doesn't seem to affect the accuracy of the outputs.
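
For context, a sketch of what such a generation call might look like; only json_schema, temperature, and parse_json appear in the diff, while the client wrapper and the other parameters are assumptions about typical anti-loop settings (a token cap, a mild repetition penalty):

def request_bboxes(llm_client, prompt, schema):
    """Sketch of a generation call with anti-loop settings (names assumed)."""
    return llm_client.generate(
        prompt=prompt,
        json_schema=schema,        # BBOX_RESPONSE_SCHEMA in the diff
        temperature=0.5,           # non-zero sampling, see the discussion above
        repetition_penalty=1.05,   # assumed: discourages token repetition loops
        max_tokens=2048,           # assumed: hard cap so generation always stops
        parse_json=True,
    )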

- if object_json is None or len(object_json.get("objects", [])) == 0:
+ logging.debug(f"Qwen output received: {qwen_output}")
+
+ if qwen_output is None or len(qwen_output) == 0:
Member

No content (204) isn't really an error if there is legitimately nothing to extract. Ideally we should distinguish between an actual error (something went wrong and there is nothing to report) and the case where everything worked but there were no objects to extract.
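
Picking up on this, a sketch of one way to separate the two cases, assuming a Flask-style handler (the framework and names are assumptions, not the PR's actual code):

from flask import jsonify
import logging

def build_response(qwen_output):
    """Return 500 for a real failure, 204 for a legitimately empty result."""
    if qwen_output is None:
        # The model call or JSON parsing failed outright: report an error.
        logging.error("Qwen returned no parsable output")
        return jsonify({"error": "object detection failed"}), 500
    if len(qwen_output) == 0:
        # Everything worked; there simply were no objects to extract.
        return "", 204
    return jsonify({"objects": qwen_output}), 200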

@jeffbl jeffbl merged commit a9c3562 into main Nov 25, 2025
2 checks passed


Development

Successfully merging this pull request may close these issues.

object-detection-llm times out on specific photo from specific page
