The Distributed LLM Service is accessible via WebUI, a notebook application, and an API. All interfaces support a range of models.
Use of each of these interfaces is documented below.
The WebUI allows the user to select up to three supported models, enter a prompt, and run inference.
Inference requests can also be submitted to the LLM service from a notebook by running the Python-based API client in the notebook environment and providing the request and model details. An available LLM server will process the request and return the result.
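A notebook call might look roughly like the sketch below. The module, class, and method names (llm_client, LLMClient, infer) are hypothetical stand-ins for the Python-based API client provided in the notebook environment; the actual interface may differ.

# Minimal sketch of submitting an inference request from a notebook cell.
# "llm_client", "LLMClient", and "infer" are hypothetical names standing in
# for the Python-based API client available in the notebook environment.
from llm_client import LLMClient

client = LLMClient()

# Provide the model tag and the prompt; an available LLM server processes
# the request and returns the result to the notebook.
result = client.infer(tag="gemma:2b", prompt="When does 1+1=10?")
print(result)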
Submitting inference jobs to the LLM service is accomplished through the jobs API endpoint.
This endpoint accepts a JSON payload via POST with the following parameters:
app          // Name of the application; in this case, "charityengine:text-inference"
commandLine  // JSON string of the prompt in an array named "input"
tag          // Name of the model to use for inference
The format of the JSON data is as follows, with "input" populated with the prompt and "tag" with the name of the model to use:
{ "app": "charityengine:text-inference", "commandLine": "{'input': ['When does 1+1=10?'] }", "tag": "gemma:2b" } |
The WebUI and notebook application both use this API.