
Add support for infinite output model fallback #2631

Open · wants to merge 7 commits into main
Conversation


@IsaacBreen commented Dec 14, 2024

When a response exceeds its length limit and the model doesn't support assistant prefill, we currently throw an error. This PR adds support for falling back to a dedicated "infinite output" model in such cases.

Changes

  • Added --infinite-output-model CLI argument (wiring sketched after this list)
  • Added infinite_output_model support to Model class
  • Modified response handling to check for and use infinite output model before giving up
  • Updated status display to show infinite output model when configured
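
For illustration, here is a minimal sketch of how the option and the Model field might be wired together. Only the `--infinite-output-model` flag name comes from this PR; the parser setup, the `Model` constructor, and attribute names are assumptions, not aider's actual code.

```python
# Hypothetical sketch: only --infinite-output-model comes from this PR;
# the parser wiring and Model attributes below are illustrative guesses.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--infinite-output-model",
    metavar="MODEL",
    default=None,
    help="Model to fall back to when the main model hits its output"
         " limit and does not support assistant prefill",
)


class Model:
    def __init__(self, name, infinite_output_model=None):
        self.name = name
        # Build the fallback as its own Model so the normal completion
        # machinery can use it unchanged when a fallback is triggered.
        self.infinite_output_model = (
            Model(infinite_output_model) if infinite_output_model else None
        )
```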

Impact

This is particularly valuable for users of models with lower output token limits that don't support prefill:

  • Gemini users benefit most: Gemini has an 8k output token limit and no prefill support, but generous free usage tiers
  • OpenAI users might benefit for extremely long edits (though the 16k limit is usually sufficient)
  • Claude users are unaffected (Claude already supports prefill)

Implementation Notes

The flow is now (sketched in code after the list):

  1. If the main model hits its length limit, check whether it supports prefill
  2. If not, check for a configured infinite output model
  3. If one is found and it supports prefill, switch to it
  4. Otherwise throw an error as before
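
A rough sketch of that decision logic, under assumed names: `FinishReasonLength`, `supports_prefill`, and `infinite_output_model` stand in for whatever identifiers aider actually uses, and `resume_completion` is a hypothetical callable that retries the response with the given model.

```python
# Hedged sketch of the fallback decision above. FinishReasonLength,
# supports_prefill, and infinite_output_model are assumed names, not
# necessarily aider's real identifiers.
class FinishReasonLength(Exception):
    """Raised when a response was cut off at the output token limit."""


def handle_length_limit(model, resume_completion):
    # 1. The main model supports prefill: resume the response in place.
    if model.supports_prefill:
        return resume_completion(model)

    # 2. Otherwise look for a configured infinite output model.
    fallback = model.infinite_output_model

    # 3. Switch to it only if it actually supports prefill.
    if fallback is not None and fallback.supports_prefill:
        return resume_completion(fallback)

    # 4. No usable fallback: fail as before.
    raise FinishReasonLength(
        f"{model.name} hit its output limit and no prefill-capable "
        "infinite output model is configured"
    )
```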

I haven't added any default infinite output model configurations. The current convention is that the default models (main/weak/editor) come from the same provider, and since the whole point of an infinite output model is to fall back to a different provider when the main one doesn't support prefill, a cross-provider default would break that convention.

We could add defaults (e.g. falling back to Claude for Gemini users), but I kept this PR focused on just the core mechanism.

@CLAassistant commented Dec 14, 2024

CLA assistant check
All committers have signed the CLA.

@paul-gauthier (Collaborator)

Thanks for your interest in aider, and for filing this PR.

I'm finding it a bit hard to understand the need here. Why not just work with a main model that supports prefill?
