
The GridRepublic LLM Service is accessible via API, WebUI, and a notebook application.

The WebUI and notebook application both allow interaction with connected LLM servers, which power the inference service. These interfaces accept text as input for inference and support a range of models.

WebUI

When using the WebUI, up to three models can be selected to run inference on a given prompt. Check the box for each model to use, enter the prompt, and click the Process button to submit a job to the LLM servers.

Notebook application

Inference requests can be submitted to the LLM service from a notebook by running the Python-based API client in the notebook environment and providing the request and model details. An available LLM server will process the request and return the result.

API

Submitting inference jobs to the LLM service is accomplished through the jobs API endpoint:

https://api.charityengine.services/remotejobs/v2/jobs

This endpoint accepts a JSON payload via POST with the following parameters:

app // Name of the application; in this case, "charityengine:text-inference"
commandLine // JSON string of the prompt in an array named "input"
tag // Name of the model to use for inference

The format of the JSON data is as follows, with the "input" and "tag" strings populated with a prompt and specific model name:

{
  "app": "charityengine:text-inference",
  "commandLine": "{\"input\": [\"When does 1+1=10?\"]}",
  "tag": "gemma:2b"
}
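As a minimal sketch, the payload above can be built and submitted with the Python standard library. Note that the construction of "commandLine" as a nested JSON string is taken from the format shown here; any authentication headers the service may require are not shown, as they are not documented in this section.

```python
import json
import urllib.request

API_URL = "https://api.charityengine.services/remotejobs/v2/jobs"

def build_inference_job(prompt: str, model: str) -> dict:
    """Assemble the POST payload for the text-inference app."""
    return {
        "app": "charityengine:text-inference",
        # "commandLine" is itself a JSON string holding the prompt array
        "commandLine": json.dumps({"input": [prompt]}),
        # "tag" selects the model that will run the inference
        "tag": model,
    }

payload = build_inference_job("When does 1+1=10?", "gemma:2b")

# To actually submit the job (requires network access and whatever
# credentials the service expects):
# req = urllib.request.Request(
#     API_URL,
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```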


TODO: Specify the response format from GET jobs/{id}

The WebUI and notebook application both use this API.
