Support OpenAI reasoning models #1841
Conversation
Shouldn't this generate a new settings.yaml with "max_completion_tokens"? I cloned the repo and checked out this branch, and I don't see any differences in the settings.yaml file. I still get the same two errors as when I try to run o3-mini on the main branch. I fixed this by uncommenting the encoding line in the config; I thought that's what this branch was trying to fix.
It's a bit more complicated than just swapping out the params. There is an accompanying docs PR that explains the new model differences: https://github.com./microsoft/graphrag/pull/1842/files. You generally don't want to use those params at all with the new prompts.
* Update tiktoken
* Add max_completion_tokens to model config
* Update/remove outdated comments
* Remove max_tokens from report generation
* Remove max_tokens from entity summarization
* Remove logit_bias from graph extraction
* Remove logit_bias from claim extraction
* Swap params if reasoning model
* Add reasoning model support to basic search
* Add reasoning model support for local and global search
* Support reasoning models with dynamic community selection
* Support reasoning models in DRIFT search
* Remove unused num_threads entry
* Semver
* Update openai
* Add reasoning_effort param
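For illustration, here is a minimal sketch of the kind of model config shape these commits imply. `ModelConfig`, `REASONING_MODEL_PREFIXES`, and the prefix-based detection are assumptions for the example, not the actual GraphRAG classes; only the parameter names (max_tokens, max_completion_tokens, reasoning_effort, temperature, top_p) come from the change list above.

```python
# Hypothetical sketch, not GraphRAG's real config classes: a model config that
# carries both token limits plus reasoning_effort, and flags reasoning models.
from dataclasses import dataclass

# Assumed prefixes used to identify OpenAI reasoning models (o1/o3/o4 families).
REASONING_MODEL_PREFIXES = ("o1", "o3", "o4")


@dataclass
class ModelConfig:
    model: str = "gpt-4o"
    max_tokens: int | None = 4000              # used by standard chat models
    max_completion_tokens: int | None = None   # used by reasoning models
    reasoning_effort: str | None = None        # e.g. "low" | "medium" | "high"
    temperature: float = 0.0
    top_p: float = 1.0

    @property
    def is_reasoning_model(self) -> bool:
        # Reasoning models take max_completion_tokens/reasoning_effort and
        # reject max_tokens, temperature, top_p, and logit_bias.
        return self.model.startswith(REASONING_MODEL_PREFIXES)
```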
This updates GraphRAG config and prompting to support the new o* models and the alternate parameters that they require.
In addition to some basic logical switching to make sure the wrong params aren't sent, the primary work here was to rethink some of the prompting (especially for gleanings) to request response lengths without resorting to hard parameter control via `max_tokens`, which is not possible when using `max_completion_tokens` with a reasoning model. The prompting is updated to request specific lengths from the model reasoning process directly.

On the search side, this also resolves an issue where the search configs carried individual model params that should be taken from the root model configs, so things like temperature and top_p are removed from those config objects.