Skip to content

[ML] Adding docs for the unified inference API #118696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jan 13, 2025

Conversation

jonathan-buttner
Copy link
Contributor

@jonathan-buttner jonathan-buttner commented Dec 13, 2024

This PR adds docs for the new unified inference API. I tried to cover the more complicated objects with examples.

This PR is waiting on the addition of the chat_completion task type in this PR: #119982

https://elasticsearch_bk_118696.docs-preview.app.elstc.co/guide/en/elasticsearch/reference/master/chat-completion-inference-api.html

@jonathan-buttner jonathan-buttner added >docs General docs changes :ml Machine learning Team:ML Meta label for the ML team v9.0.0 v8.18.0 labels Dec 13, 2024
Copy link
Contributor

Documentation preview:

@@ -63,4 +63,44 @@ Specifies the chunking strategy.
It could be either `sentence` or `word`.
end::chunking-settings-strategy[]

tag::unified-schema-content-with-examples[]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I import this a couple times for each of the messages since they all have the same format (string or an array of objects).

The text content.
+
Object representation:::
`text`::::
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get lucky that each of the messages has content:: so the colons work out here to nest it correctly.

(Optional, array of objects)
A list of tools that the model can call.
+
.Structure
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to other names here? Maybe Format?

(Required unless `tool_calls` is specified, string or array of objects)
The contents of the message.
+
include::inference-shared.asciidoc[tag=unified-schema-content-with-examples]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally find it helpful if examples are sprinkled throughout the docs as we're reading instead of just include complete ones at the bottom. I'm open to other suggestions here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a larger issue than this PR, but theres a bug with the example expansion where the arrow always points down Screenshot 2024-12-16 at 2 06 54 PM, but only for the inner expansion box, the outer one is correc.t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. I made the docs team aware.

------------------------------------------------------------
// TEST[skip:TBD]

<1> Each tool call needs a corresponding Tool message.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is too OpenAI specific to include?

@jonathan-buttner
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/docs

@jonathan-buttner jonathan-buttner marked this pull request as ready for review December 13, 2024 21:22
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine elasticsearchmachine added the Team:Docs Meta label for docs team label Dec 13, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@jonathan-buttner
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/docs

@jonathan-buttner jonathan-buttner added the auto-backport Automatically create backport pull requests when merged label Dec 13, 2024
Copy link
Contributor

@maxhniebergall maxhniebergall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for writing this Jonathan! I think these docs look great. I just left a couple small comments which could be improvements.

==== {api-request-body-title}

`messages`::
(Required, array of objects) A list of objects representing the conversation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding some information about how these messages lists should be generated? Something like

"Requests should generally only add new messages from the user. The other messages ("assistent", "system", or "tool") should generally only be copy-pasted from the response to a previous completion request, such that the messages array is built up over the course of a conversation."


`model`::
(Optional, string)
The ID of the model to use.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The ID of the model to use.
The ID of the model to use. By default, the model ID set in the inference endpoint is used.

Copy link
Contributor

@szabosteve szabosteve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few small comments, please take or leave them. LGTM!

==== {api-description-title}

The chat completion {infer} API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation.
It only works with the `chat_completion` task type for OpenAI and Elastic Inference Service.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It only works with the `chat_completion` task type for OpenAI and Elastic Inference Service.
It only works with the `chat_completion` task type for the `openai` and `elasticsearch` {infer} services.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, one heads up the Elastic Inference Service (EIS) is actually different from the Elasticsearch service. EIS is called elastic and Elasticsearch is called elasticsearch. Also we don't have docs for EIS yet but we need them.

@@ -67,10 +67,10 @@ Click the links to review the configuration details of the services:
* <<infer-service-elasticsearch,Elasticsearch>> (`rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* <<infer-service-elasticsearch,Elasticsearch>> (`rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)
* <<infer-service-elasticsearch,Elasticsearch>> (`chat_completion`, `rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we chatted about offline, the elasticsearch service is different from the elastic service. So this code will actually stay the same. We'll need to add a new entry for the elastic service.

@jonathan-buttner jonathan-buttner merged commit 838a41a into elastic:main Jan 13, 2025
5 checks passed
@jonathan-buttner jonathan-buttner deleted the ml-unified-api-docs branch January 13, 2025 14:48
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Jan 13, 2025
* Including examples

* Using js instead of json

* Adding unified docs to main page

* Adding missing description text

* Refactoring to remove unified route

* Addign back references to the _unified route

* Update docs/reference/inference/chat-completion-inference.asciidoc

Co-authored-by: István Zoltán Szabó <[email protected]>

* Address feedback

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

elasticsearchmachine pushed a commit that referenced this pull request Jan 13, 2025
* Including examples

* Using js instead of json

* Adding unified docs to main page

* Adding missing description text

* Refactoring to remove unified route

* Addign back references to the _unified route

* Update docs/reference/inference/chat-completion-inference.asciidoc



* Address feedback

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
martijnvg pushed a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
* Including examples

* Using js instead of json

* Adding unified docs to main page

* Adding missing description text

* Refactoring to remove unified route

* Addign back references to the _unified route

* Update docs/reference/inference/chat-completion-inference.asciidoc

Co-authored-by: István Zoltán Szabó <[email protected]>

* Address feedback

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >docs General docs changes :ml Machine learning Team:Docs Meta label for docs team Team:ML Meta label for the ML team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants