API reference
- class gseai.GSEAIServer(api_token, host='gseai.gse.buffalo.edu', port=11434, timeout=None)
Bases:
objectClient for the GSE AI LocalAI server.
- Parameters:
api_token (
str) – Bearer token for authentication.host (
str) – Hostname of the server.port (
int) – Port the server listens on.timeout (
float|None) – Request timeout in seconds.None(default) means no timeout, which is recommended for slow models.
- close()
- Return type:
None
- list_models()
GET /v1/models — list available models.
- Return type:
dict
- chat(model, prompt, *, system_prompt=None, temperature=None, max_tokens=None, stream=False)
Convenience wrapper for single-turn chat.
- Parameters:
model (
str) – Model identifier.prompt (
str) – User message as a plain string.system_prompt (
str|None) – Optional system message.temperature (
float|None) – Sampling temperature (0–2).max_tokens (
int|None) – Maximum tokens to generate.stream (
bool) – If True, return a generator of SSE event dicts.
- Return type:
dict|Generator[dict,None,None]- Returns:
Response dict, or a generator of SSE event dicts when
stream=True.
- chat_completions(model, messages, *, temperature=None, max_tokens=None, stream=False, top_p=None, top_k=None, stop=None, presence_penalty=None, frequency_penalty=None, repeat_penalty=None, logit_bias=None, seed=None, response_format=None, tools=None, tool_choice=None)
POST /v1/chat/completions — OpenAI-compatible chat completions.
- Parameters:
model (
str) – Model identifier.messages (
list[dict]) – List of message dicts withroleandcontent.temperature (
float|None) – Sampling temperature (0–2).max_tokens (
int|None) – Maximum tokens to generate.stream (
bool) – If True, return a generator of SSE event dicts.top_p (
float|None) – Nucleus sampling (0–1).top_k (
int|None) – Top-k sampling limit.stop (
str|list[str] |None) – Stop sequence(s).presence_penalty (
float|None) – Presence penalty (-2 to 2).frequency_penalty (
float|None) – Frequency penalty (-2 to 2).repeat_penalty (
float|None) – Repetition penalty.logit_bias (
dict|None) – Token probability bias adjustments.seed (
int|None) – Random seed for reproducibility.response_format (
dict|None) – JSON schema for structured output.tools (
list[dict] |None) – Function definitions for tool/function calling.tool_choice (
str|None) – Tool selection mode — “auto”, “none”, or “required”.
- Return type:
dict|Generator[dict,None,None]
- completions(model, prompt, *, max_tokens=None, temperature=None, top_p=None, top_k=None, stop=None, frequency_penalty=None, presence_penalty=None, stream=False, seed=None)
POST /v1/completions — legacy text completions.
- Parameters:
model (
str) – Model identifier.prompt (
str|list) – Input text or list of texts.max_tokens (
int|None) – Maximum tokens to generate.temperature (
float|None) – Sampling temperature (0–2).top_p (
float|None) – Nucleus sampling (0–1).top_k (
int|None) – Top-k sampling limit.stop (
str|list[str] |None) – Stop sequence(s).frequency_penalty (
float|None) – Frequency penalty (-2 to 2).presence_penalty (
float|None) – Presence penalty (-2 to 2).stream (
bool) – If True, return a generator of SSE event dicts.seed (
int|None) – Random seed.
- Return type:
dict|Generator[dict,None,None]
- embeddings(model, input, *, encoding_format=None, dimensions=None)
POST /v1/embeddings — generate text embeddings.
- Parameters:
model (
str) – Model identifier.input (
str|list[str]) – Text or list of texts to embed.encoding_format (
str|None) – Output format — “float” or “base64”.dimensions (
int|None) – Target embedding dimensionality.
- Return type:
dict
- responses(model, messages, **kwargs)
POST /v1/responses — stateful chat responses (OpenAI-compatible).
- Parameters:
model (
str) – Model identifier.messages (
list[dict]) – List of message dicts withroleandcontent.**kwargs (
Any) – Additional parameters forwarded to the endpoint.
- Return type:
dict
- messages(model, messages, max_tokens, *, system=None, temperature=None, top_p=None, top_k=None)
POST /v1/messages — Anthropic-compatible messages API.
- Parameters:
model (
str) – Model identifier.messages (
list[dict]) – List of message dicts withroleandcontent.max_tokens (
int) – Maximum tokens to generate (required by the API).system (
str|None) – System prompt.temperature (
float|None) – Sampling temperature.top_p (
float|None) – Nucleus sampling (0–1).top_k (
int|None) – Top-k sampling limit.
- Return type:
dict
- transcribe(model, file_path, *, language=None, prompt=None, response_format='json')
POST /v1/audio/transcriptions — transcribe audio to text.
- Parameters:
model (
str) – Whisper model identifier.file_path (
str) – Path to the audio file.language (
str|None) – Source language code (e.g."en"); auto-detected if omitted.prompt (
str|None) – Optional context hint passed to the model.response_format (
str) – One of"json","verbose_json","text","srt", or"vtt"(default"json").
- Return type:
dict|str- Returns:
Parsed dict for JSON formats, plain text string otherwise.
- translate(model, file_path, *, prompt=None, response_format='json')
POST /v1/audio/translations — transcribe audio and translate to English.
- Parameters:
model (
str) – Whisper model identifier.file_path (
str) – Path to the audio file.prompt (
str|None) – Optional context hint passed to the model.response_format (
str) – One of"json","verbose_json","text","srt", or"vtt"(default"json").
- Return type:
dict|str- Returns:
Parsed dict for JSON formats, plain text string otherwise.
- speech(model, input, *, voice=None, speed=None)
POST /v1/audio/speech — synthesize speech from text.
- Parameters:
model (
str) – TTS model identifier.input (
str) – Text to synthesize.voice (
str|None) – Voice identifier.speed (
float|None) – Playback speed multiplier (default1.0).
- Return type:
bytes- Returns:
Raw audio bytes.
- generate_image(model, prompt, *, n=None, size=None, steps=None, seed=None)
POST /v1/images/generations — generate images from a text prompt.
- Parameters:
model (
str) – Image generation model identifier.prompt (
str) – Text description of the desired image.n (
int|None) – Number of images to generate.size (
str|None) – Output dimensions, e.g."512x512".steps (
int|None) – Diffusion steps.seed (
int|None) – Random seed for reproducibility.
- Return type:
dict
- edit_image(model, image_path, prompt, *, mask_path=None, n=None, size=None)
POST /v1/images/edits — edit an image guided by a text prompt.
- Parameters:
model (
str) – Image model identifier.image_path (
str) – Path to the source image.prompt (
str) – Edit instruction.mask_path (
str|None) – Optional greyscale mask (white = region to edit).n (
int|None) – Number of variants to generate.size (
str|None) – Output dimensions, e.g."512x512".
- Return type:
dict
- image_variation(model, image_path, *, n=None, size=None)
POST /v1/images/variations — generate variations of an existing image.
- Parameters:
model (
str) – Image model identifier.image_path (
str) – Path to the source image.n (
int|None) – Number of variations to generate.size (
str|None) – Output dimensions, e.g."512x512".
- Return type:
dict