This endpoint generates completions from a Large Language Model. It is a simple proxy that forwards your requests to the desired model. Every LightOn model is deployed on a vLLM-based image.

Response Types:
stream=false (default): Returns a complete JSON response with all completion choices
stream=true: Returns Server-Sent Events (SSE) with incremental completion chunks
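For the default non-streaming case, a request looks like the minimal sketch below. The base URL, model name, and environment variable are placeholders, and the /v1/completions path assumes the OpenAI-compatible routing that vLLM exposes; substitute your deployment's values.

```python
import os

import requests

# Placeholder URL and model name; replace with your deployment's values.
API_URL = "https://api.lighton.ai/v1/completions"
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}

payload = {
    "model": "my-model",          # must exist and be configured in the admin
    "prompt": "Once upon a time",
    "max_tokens": 64,
    "stream": False,              # default: one complete JSON response
}

resp = requests.post(API_URL, json=payload, headers=headers, timeout=30)
resp.raise_for_status()
data = resp.json()
print(data["choices"][0]["text"])
```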
Streaming Format:
Each SSE event contains a JSON object with incremental text. The stream ends with data: [DONE].

Authentication: Bearer token
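Consuming the stream then reduces to reading SSE lines until the [DONE] sentinel. A minimal sketch using requests, with the same placeholder URL, key, and model as above:

```python
import json
import os

import requests

API_URL = "https://api.lighton.ai/v1/completions"  # assumed endpoint path
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}
payload = {"model": "my-model", "prompt": "Once upon a time", "stream": True}

with requests.post(API_URL, json=payload, headers=headers, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip keep-alives and non-data lines
        data = line[len("data: "):]
        if data == "[DONE]":      # sentinel marking the end of the stream
            break
        chunk = json.loads(data)  # one JSON object with incremental text
        print(chunk["choices"][0]["text"], end="", flush=True)
```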
Request serializer for the completions endpoint. Parameters (OpenAI-compatible names):

model: Model to use for generating completions; must exist and be configured in the admin
prompt: The prompt to generate completions for
max_tokens: Maximum number of tokens to generate
temperature: Sampling temperature between 0 and 2
top_p: Nucleus sampling parameter
n: Number of completions to generate
stream: Whether to stream back partial progress
logprobs: Include the log probabilities on the logprobs most likely tokens
echo: Echo back the prompt in addition to the completion
stop: Up to 4 sequences where the API will stop generating further tokens
presence_penalty: Penalty for new tokens based on whether they appear in the text so far
frequency_penalty: Penalty for new tokens based on their existing frequency in the text
best_of: Generates multiple completions server-side and returns the best
logit_bias: Modify the likelihood of specified tokens appearing in the completion
user: A unique identifier representing your end-user
suffix: The suffix that comes after a completion of inserted text
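To make the sampling controls concrete, here is an illustrative payload exercising most of them. The values are arbitrary, and the token ID in logit_bias is only an example; field names assume the OpenAI-compatible schema noted above.

```python
# Illustrative payload; adjust values for your model and use case.
payload = {
    "model": "my-model",
    "prompt": "Write a haiku about the sea",
    "max_tokens": 32,
    "temperature": 0.7,             # 0 to 2; higher means more random
    "top_p": 0.9,                   # nucleus sampling cutoff
    "n": 2,                         # return two completions
    "best_of": 4,                   # sample four server-side, keep the best
    "stop": ["\n\n"],               # up to 4 stop sequences
    "presence_penalty": 0.5,        # penalize tokens already present
    "frequency_penalty": 0.5,       # penalize tokens by their frequency
    "logit_bias": {"50256": -100},  # token ID -> bias; -100 suppresses it
    "user": "user-1234",            # end-user identifier
}
```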
Successful response
Response serializer for completions endpoint results. Fields (OpenAI-compatible names):

id: Unique identifier for the completion
object: Object type, always 'text_completion'
created: Unix timestamp of when the completion was created
model: The model used for generating the completion
choices: List of completion choices generated by the model
usage: Usage statistics for the completion request
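Put together, a non-streaming response has roughly the following shape. The sub-fields of choices and usage are assumptions based on the OpenAI-compatible schema rather than fields documented above.

```python
# Sketch of a non-streaming response body, reconstructed from the fields
# above; choice and usage sub-fields are assumed, not documented here.
example_response = {
    "id": "cmpl-abc123",
    "object": "text_completion",
    "created": 1700000000,  # Unix timestamp
    "model": "my-model",
    "choices": [
        {
            "index": 0,
            "text": " and they lived happily ever after.",
            "logprobs": None,
            "finish_reason": "stop",
        },
    ],
    "usage": {"prompt_tokens": 4, "completion_tokens": 9, "total_tokens": 13},
}
```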