...
Inference requests can also be submitted to the LLM service from a notebook: run the Python-based API client in the notebook environment, then provide the request and model details. An available LLM server processes the request and returns the result.
API
The WebUI and notebook application both use a job submission API to interact with the LLM service. Other such interfaces can also be developed to integrate the LLM service with other applications and platforms.
Submit a job
Inference jobs are submitted to the LLM service through the jobs API endpoint:
...
```text
app          // Name of the application; in this case, "gridrepublic:text-inference"
commandLine  // JSON string of the prompt in an array named "input"
hours        // Runtime limit for the job; by default, use: 1
tag          // Name of the model to use for inference
```
...
```json
{
  "app": "gridrepublic:text-inference",
  "commandLine": "{\"input\": [\"When does 1+1=10?\"]}",
  "tag": "gemma:2b"
}
```
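A request body like the one above can also be assembled programmatically. The following is a minimal Python sketch, not the service's own client: `build_job_payload` is a hypothetical helper name, and the sketch assumes that `commandLine` holds the JSON-encoded prompt object, as the parameter list describes. Encoding the inner object with `json.dumps` avoids hand-written quoting mistakes.

```python
import json


def build_job_payload(prompt, model, hours=1,
                      app="gridrepublic:text-inference"):
    """Build the request body for a job submission (hypothetical helper).

    The field names come from the parameter list above; the helper name and
    its signature are assumptions made for illustration.
    """
    # commandLine is itself a JSON string, so encode the inner object first.
    command_line = json.dumps({"input": [prompt]})
    return {
        "app": app,
        "commandLine": command_line,
        "hours": hours,
        "tag": model,
    }


payload = build_job_payload("When does 1+1=10?", "gemma:2b")
body = json.dumps(payload)  # text to send as the request body
```

The resulting `body` string is what would be sent to the jobs endpoint; the HTTP call itself is left out here because it depends on the deployment's URL and authentication.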
TODO:
...
When the job is submitted, the API returns a "success" indicator together with either an array of job "ids" (when success is true) or an "error" string (when success is false). For example:
```json
{
  "success": true,
  "ids": ["a9ab011455bb6aeb0161b5fc08766b42"]
}
```
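A client typically needs to branch on the "success" flag before using the reply. The sketch below shows one way to do that in Python; `parse_submit_response` and `JobSubmissionError` are hypothetical names chosen for illustration, and only the "success", "ids", and "error" fields described above are assumed to exist.

```python
import json


class JobSubmissionError(Exception):
    """Raised when the service reports success == false (hypothetical name)."""


def parse_submit_response(text):
    """Return the list of job IDs, or raise with the reported error string."""
    reply = json.loads(text)
    if reply.get("success"):
        return reply["ids"]
    # On failure the reply carries an "error" string instead of "ids".
    raise JobSubmissionError(reply.get("error", "unknown error"))


ids = parse_submit_response(
    '{"success": true, "ids": ["a9ab011455bb6aeb0161b5fc08766b42"]}'
)
```

Raising an exception on failure keeps the calling code from accidentally treating an error reply as an empty list of jobs.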
Get job status
To retrieve the current status of a job submitted to the LLM service, send a GET request to the jobs API endpoint with a comma-separated list of one or more job IDs as a path parameter:
|
In the following example, {ids} was replaced with 9f22472031ef57c3fd517061d116ad68. The output of the inference process is contained in the "log" property, which is updated as the process runs:
```json
{
  "success": true,
  "jobs": {
    "9f22472031ef57c3fd517061d116ad68": {
      "vmStatus": "running",
      "states": {
        "default": {
          "status": "running",
          "outputFiles": [],
          "log": "1+1=2. When you add numbers, the result",
          "commandLine": "{\"inputs\":[\"How can 1+1=10?\"]}",
          "app": "gridrepublic:text-inference"
        }
      },
      "created": "2024-05-06T21:26:00+00:00",
      "copy": 0,
      "tag": "llama2-uncensored",
      "runtime": 23
    }
  }
}
```
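To make the shape of the status reply concrete, the following Python sketch joins job IDs into the comma-separated path parameter and pulls the status and partial output out of a decoded reply. `ids_path_segment` and `summarize_jobs` are hypothetical helper names; the sketch assumes each job has a "default" entry under "states", as in the example above, and that a failed status request reports an "error" string in the same way as job submission.

```python
def ids_path_segment(job_ids):
    """Join job IDs into the comma-separated form used as the path parameter."""
    return ",".join(job_ids)


def summarize_jobs(reply):
    """Map each job ID to (vmStatus, status, log) from a decoded status reply.

    Assumes every job has a "default" entry under "states"; other state
    names are not covered by this sketch.
    """
    if not reply.get("success"):
        # Assumption: failures carry an "error" string, as for submission.
        raise RuntimeError(reply.get("error", "status request failed"))
    summary = {}
    for job_id, job in reply["jobs"].items():
        state = job["states"]["default"]
        summary[job_id] = (job["vmStatus"], state["status"], state["log"])
    return summary


# Trimmed copy of the example reply shown above.
reply = {
    "success": True,
    "jobs": {
        "9f22472031ef57c3fd517061d116ad68": {
            "vmStatus": "running",
            "states": {
                "default": {
                    "status": "running",
                    "outputFiles": [],
                    "log": "1+1=2. When you add numbers, the result",
                }
            },
            "tag": "llama2-uncensored",
            "runtime": 23,
        }
    },
}

summary = summarize_jobs(reply)
```

Because "log" is updated while the job runs, a client would call the status endpoint repeatedly and re-read the log until "status" is no longer "running".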