Add support for chat_completion task in Azure OpenAI integration #5796
base: main
Changes from all commits
65b591d
cbd8338
f09a547
    @@ -19,7 +19,7 @@
       "task_type": {
         "type": "enum",
         "description": "The task type",
    -    "options": ["completion", "text_embedding"]
    +    "options": ["completion", "chat_completion", "text_embedding"]

Contributor
Nitpick, but could these be in alphabetical order?

       },
       "azureopenai_inference_id": {
         "type": "string",
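The ordering requested in the review comment can be checked mechanically. The following is only a minimal sketch: `isAlphabetical` is a hypothetical helper that is not part of this PR or the repository, and it assumes plain string comparison is an acceptable definition of "alphabetical" for these lowercase identifiers.

```typescript
// Hypothetical helper (not part of the PR): returns true when the options
// are already in ascending order under default string comparison.
function isAlphabetical(options: readonly string[]): boolean {
  const sorted = options.slice().sort()
  return options.every((option, index) => option === sorted[index])
}

// Order as submitted in the diff.
const submitted = ['completion', 'chat_completion', 'text_embedding']

// Order the review comment asks for.
const requested = submitted.slice().sort()

const submittedIsSorted = isAlphabetical(submitted) // false
const requestedIsSorted = isAlphabetical(requested) // true
```

Sorting places `chat_completion` first, which matches how the other providers in the service list are written.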
    @@ -802,7 +802,7 @@ export class AzureOpenAIServiceSettings {
        * This setting helps to minimize the number of rate limit errors returned from Azure.
        * The `azureopenai` service sets a default number of requests allowed per minute depending on the task type.
        * For `text_embedding`, it is set to `1440`.
    -   * For `completion`, it is set to `120`.
    +   * For `completion` and `chat_completion`, it is set to `120`.
        * @ext_doc_id azureopenai-quota-limits
        */
       rate_limit?: RateLimitSetting
    @@ -824,6 +824,7 @@ export class AzureOpenAITaskSettings {

Contributor
On line 819 above this, "For a `completion` or `text_embedding` task" should be "For a `completion`, `chat_completion` or `text_embedding` task"

     export enum AzureOpenAITaskType {
       completion,
    +  chat_completion,
       text_embedding
     }
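To make the documented defaults concrete, here is a small sketch of how a per-task default could be resolved. The numbers come from the doc comment in the diff above. The `RateLimitSetting` shape and the `effectiveRequestsPerMinute` helper are illustrative assumptions, not code from the specification or from Elasticsearch.

```typescript
// Task types accepted by the `azureopenai` service once this PR is applied.
type AzureOpenAITaskType = 'chat_completion' | 'completion' | 'text_embedding'

// Assumed stand-in for the `rate_limit` object; the field name matches the
// response examples in this PR.
interface RateLimitSetting {
  requests_per_minute?: number
}

// Defaults as stated in the updated doc comment.
const DEFAULT_REQUESTS_PER_MINUTE: Record<AzureOpenAITaskType, number> = {
  chat_completion: 120,
  completion: 120,
  text_embedding: 1440
}

// Hypothetical helper: an explicit setting wins, otherwise the default for
// the task type applies.
function effectiveRequestsPerMinute(
  taskType: AzureOpenAITaskType,
  rateLimit?: RateLimitSetting
): number {
  return rateLimit?.requests_per_minute ?? DEFAULT_REQUESTS_PER_MINUTE[taskType]
}

const chatDefault = effectiveRequestsPerMinute('chat_completion')
const overridden = effectiveRequestsPerMinute('completion', { requests_per_minute: 60 })
```

Keeping the defaults in one `Record` keyed by task type means a newly added task type fails to compile until a default is chosen for it.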
    @@ -37,7 +37,7 @@ import { TaskType } from '@inference/_types/TaskType'
      * * Amazon SageMaker (`chat_completion`, `completion`, `rerank`, `sparse_embedding`, `text_embedding`)
      * * Anthropic (`completion`)
      * * Azure AI Studio (`completion`, `rerank`, `text_embedding`)
    - * * Azure OpenAI (`completion`, `text_embedding`)
    + * * Azure OpenAI (`completion`, `chat_completion`, `text_embedding`)

Contributor
Nitpick, but could these be in alphabetical order like for the other providers?

      * * Cohere (`completion`, `rerank`, `text_embedding`)
      * * DeepSeek (`chat_completion`, `completion`)
      * * Elasticsearch (`rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)
    @@ -0,0 +1,14 @@
    +summary: A chat completion task
    +description: Run `PUT _inference/chat_completion/azure_openai_chat_completion` to create an inference endpoint that performs a `chat_completion` task.
    +method_request: 'PUT _inference/chat_completion/azure_openai_chat_completion'
    +# type: "request"
    +value: |-
    +  {
    +    "service": "azureopenai",
    +    "service_settings": {
    +      "api_key": "Api-Key",
    +      "resource_name": "Resource-name",
    +      "deployment_id": "Deployment-id",
    +      "api_version": "2024-02-01"
    +    }
    +  }
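For readers who want to see the request example assembled in code, here is a rough sketch. `buildCreateEndpointRequest` is a hypothetical helper, not an Elasticsearch client API, and the settings interface only mirrors the four fields shown in the example. The values are the placeholders from the YAML file.

```typescript
// Mirrors the `service_settings` fields used in the request example.
interface AzureOpenAIServiceSettings {
  api_key: string
  resource_name: string
  deployment_id: string
  api_version: string
}

// Hypothetical helper: assembles the method, path, and JSON body for the
// create-endpoint request without sending anything.
function buildCreateEndpointRequest(
  taskType: string,
  inferenceId: string,
  serviceSettings: AzureOpenAIServiceSettings
): { method: string; path: string; body: string } {
  return {
    method: 'PUT',
    path: `_inference/${taskType}/${encodeURIComponent(inferenceId)}`,
    body: JSON.stringify({ service: 'azureopenai', service_settings: serviceSettings })
  }
}

// Placeholder values copied from the example above.
const createRequest = buildCreateEndpointRequest(
  'chat_completion',
  'azure_openai_chat_completion',
  {
    api_key: 'Api-Key',
    resource_name: 'Resource-name',
    deployment_id: 'Deployment-id',
    api_version: '2024-02-01'
  }
)
```

Here `createRequest.path` is `_inference/chat_completion/azure_openai_chat_completion`, matching the `method_request` line in the example.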
    @@ -0,0 +1,25 @@
    +summary: A text embedding task
    +description: A successful response when creating an Azure OpenAI `text_embedding` inference endpoint.
    +# type: response
    +# response_code:
    +value: |-
    +  {
    +    "inference_id": "azure_openai_embeddings",
    +    "task_type": "text_embedding",
    +    "service": "azureopenai",
    +    "service_settings": {
    +      "resource_name": "Resource-name",
    +      "deployment_id": "Deployment-id",
    +      "api_version": "2024-02-01",
    +      "rate_limit": {
    +        "requests_per_minute": 1140
    +      },
    +      "dimensions": 1536,
    +      "similarity": "dot_product"
    +    },
    +    "chunking_settings": {
    +      "strategy": "sentence",
    +      "max_chunk_size": 250,
    +      "sentence_overlap": 1
    +    }
    +  }
    @@ -0,0 +1,18 @@
    +summary: A completion task
    +description: A successful response when creating an Azure OpenAI `completion` inference endpoint.
    +# type: response
    +# response_code:
    +value: |-
    +  {
    +    "inference_id": "azure_openai_completion",
    +    "task_type": "completion",
    +    "service": "azureopenai",
    +    "service_settings": {
    +      "resource_name": "Resource-name",
    +      "deployment_id": "Deployment-id",
    +      "api_version": "2024-02-01",
    +      "rate_limit": {
    +        "requests_per_minute": 120
    +      }
    +    }
    +  }
    @@ -0,0 +1,18 @@
    +summary: A chat completion task
    +description: A successful response when creating an Azure OpenAI `chat_completion` inference endpoint.
    +# type: response
    +# response_code:
    +value: |-
    +  {
    +    "inference_id": "azure_openai_chat_completion",
    +    "task_type": "chat_completion",
    +    "service": "azureopenai",
    +    "service_settings": {
    +      "resource_name": "Resource-name",
    +      "deployment_id": "Deployment-id",
    +      "api_version": "2024-02-01",
    +      "rate_limit": {
    +        "requests_per_minute": 120
    +      }
    +    }
    +  }
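As a rough illustration of how a caller might read the response examples above, the sketch below parses the `chat_completion` response and summarizes it. The `CreateEndpointResponse` interface and `summarizeEndpoint` function are assumptions made for this example. They cover only the fields shown in the examples and are not types from the specification.

```typescript
// Assumed shape, limited to the fields visible in the response examples.
interface CreateEndpointResponse {
  inference_id: string
  task_type: string
  service: string
  service_settings: {
    resource_name: string
    deployment_id: string
    api_version: string
    rate_limit?: { requests_per_minute: number }
  }
}

// Hypothetical helper: turns a raw JSON response into a one-line summary.
function summarizeEndpoint(json: string): string {
  const response = JSON.parse(json) as CreateEndpointResponse
  const rpm = response.service_settings.rate_limit?.requests_per_minute
  const base = `${response.inference_id} (${response.task_type}, ${response.service})`
  return rpm === undefined ? base : `${base} at ${rpm} requests/minute`
}

// The chat_completion response example, rebuilt as a JSON string.
const sampleResponse = JSON.stringify({
  inference_id: 'azure_openai_chat_completion',
  task_type: 'chat_completion',
  service: 'azureopenai',
  service_settings: {
    resource_name: 'Resource-name',
    deployment_id: 'Deployment-id',
    api_version: '2024-02-01',
    rate_limit: { requests_per_minute: 120 }
  }
})

const summary = summarizeEndpoint(sampleResponse)
// summary: "azure_openai_chat_completion (chat_completion, azureopenai) at 120 requests/minute"
```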
I assume this change wasn't intentional, but we probably shouldn't be updating dependencies as part of adding docs for the inference API.