CLI reference

The gseai command-line tool lets you interact with the server without writing any Python.

Authentication

Every command requires a bearer token. Set it once as an environment variable so you don’t have to type it on every call:

export GSEAI_API_TOKEN=your-api-token

You can also pass it inline with -t / --token, placed before the subcommand like any other global option.

Global options

These options apply to every command and must come before the subcommand.

gseai [OPTIONS] COMMAND [ARGS]...

Flag	Env var	Description
`-t`, `--token`	`GSEAI_API_TOKEN`	Bearer auth token (required)
`-H`, `--host`	`GSEAI_HOST`	Server hostname (default: `gseai.gse.buffalo.edu`)
`-p`, `--port`	`GSEAI_PORT`	Server port (default: `11434`)
`-T`, `--timeout`	`GSEAI_TIMEOUT`	Request timeout in seconds (default: no timeout)
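
Because global options must come before the subcommand, a small shell wrapper can pin them for one server and forward everything else. This is only a sketch: the function name gseai_local and the host localhost are illustrative assumptions, not part of the tool.

```shell
# Hypothetical wrapper: pins the global options, then forwards the rest.
# The token is still read from GSEAI_API_TOKEN in the environment.
gseai_local() {
    # Global options first, then the subcommand and its own arguments.
    gseai --host localhost --port 11434 --timeout 120 "$@"
}
```

With this in place, gseai_local models -j expands to gseai --host localhost --port 11434 --timeout 120 models -j.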

Models

gseai models          # one model ID per line
gseai models -j       # raw JSON

Chat

Single-turn chat with a plain-text prompt:

gseai chat gemma-4-e2b-it "What is machine learning?"

# Read the prompt from a file
gseai chat gemma-4-e2b-it -f prompt.txt

# Interactive session (multi-turn, retains conversation history)
gseai chat gemma-4-e2b-it -i

# Interactive with a system prompt and streaming
gseai chat gemma-4-e2b-it -i -s "You are a concise tutor." -S

Type exit or press Ctrl+C to end an interactive session.

Flag	Description
`-f`, `--file`	Read the prompt from a file
`-i`, `--interactive`	Start a multi-turn chat session, retaining conversation history
`-s`, `--system`	System prompt
`-t`, `--temperature`	Sampling temperature (0–2)
`-m`, `--max-tokens`	Maximum tokens to generate per turn
`-S`, `--stream`	Stream tokens as they arrive
`-j`, `--json`	Print full JSON response (single-turn only)
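
Note that -t appears twice in this reference: as the global token flag and as the chat temperature flag. Going by the tables above, position appears to tell them apart: before the subcommand it is the token, after it the temperature. The helper below is a sketch of that reading; the name ask_file and the values 0.2 and 256 are arbitrary choices for illustration.

```shell
# Hypothetical helper: answer a prompt stored in a file at low temperature.
ask_file() {
    model=$1
    prompt_file=$2
    # The -t before "chat" is the global token flag;
    # the -t after "chat" is the sampling temperature.
    gseai -t "$GSEAI_API_TOKEN" chat "$model" -f "$prompt_file" -t 0.2 -m 256
}
```

Since the token is normally picked up from GSEAI_API_TOKEN anyway, the explicit global -t is shown only to make the distinction visible.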

Completions

Legacy text completion (no chat structure):

gseai completions gemma-4-e2b-it "Once upon a time"

Accepts the same -t (temperature), -m, -S, and -j flags as chat.

Embeddings

gseai embeddings nomic-embed-text "Hello world"
# model:      nomic-embed-text
# dimensions: 768
# values:     [0.0142, -0.0317, ...]

gseai embeddings nomic-embed-text "Hello world" -j   # full JSON

Audio

transcribe

Transcribe an audio file to text using Whisper:

gseai audio transcribe whisper recording.mp3
gseai audio transcribe whisper lecture.mp3 -l fr   # French source
gseai audio transcribe whisper lecture.mp3 -f srt  # SRT subtitles

Flag	Description
`-l`, `--language`	Source language code (e.g. `en`, `fr`); auto-detected if omitted
`-p`, `--prompt`	Context hint passed to the model
`-f`, `--format`	Response format: `text` (default), `json`, `verbose_json`, `srt`, `vtt`
`-j`, `--json`	Print full JSON (equivalent to `--format verbose_json`)

translate

Transcribe and translate to English in one step:

gseai audio translate whisper french_lecture.mp3

Accepts the same -p, -f, -j flags as transcribe.

speech

Synthesize speech from text (requires a TTS model):

gseai audio speech tts-model "Hello, world" -o hello.mp3

Flag	Description
`-o`, `--output`	Output file path (default: `speech.mp3`)
`-v`, `--voice`	Voice identifier
`-s`, `--speed`	Playback speed (default: `1.0`)

Images

generate

Generate an image from a text prompt:

gseai images generate stable-diffusion "a red barn in a snowy field"
gseai images generate stable-diffusion "a red barn" -o barn.png -n 3

When --n is greater than 1, the output filenames are numbered: barn_1.png, barn_2.png, and so on.

Flag	Description
`-o`, `--output`	Output file path (default: `image.png`)
`-n`, `--n`	Number of images to generate (default: `1`)
`-s`, `--size`	Image dimensions, e.g. `512x512`
`-S`, `--steps`	Diffusion steps
`-r`, `--seed`	Random seed for reproducibility
`-j`, `--json`	Print raw JSON response
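
A script that post-processes generated images needs to know which files to expect. The sketch below lists them, assuming the numbering pattern from the barn example (stem, underscore, index, extension) holds for every index and that the output path has an extension. The function name expected_outputs is hypothetical.

```shell
# Hypothetical helper: print the filenames expected for a given -o and -n.
expected_outputs() {
    out=$1
    n=$2
    if [ "$n" -le 1 ]; then
        # A single image keeps the name passed to -o unchanged.
        printf '%s\n' "$out"
        return 0
    fi
    stem=${out%.*}    # "barn.png" -> "barn"
    ext=${out##*.}    # "barn.png" -> "png"
    i=1
    while [ "$i" -le "$n" ]; do
        printf '%s_%s.%s\n' "$stem" "$i" "$ext"
        i=$((i + 1))
    done
}
```

For example, expected_outputs barn.png 3 prints barn_1.png, barn_2.png, and barn_3.png on separate lines.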

edit

Edit an existing image guided by a text prompt:

gseai images edit stable-diffusion photo.png "replace the sky with a sunset"
gseai images edit stable-diffusion photo.png "repaint the sky" -m mask.png

Flag	Description
`-m`, `--mask`	Greyscale mask image (white = area to edit)
`-o`, `--output`	Output file path (default: `edited.png`)
`-n`, `--n`	Number of variants
`-s`, `--size`	Output dimensions
`-j`, `--json`	Print raw JSON response

variation

Generate variations of an existing image:

gseai images variation stable-diffusion photo.png -n 3

Flag	Description
`-o`, `--output`	Output file path (default: `variation.png`)
`-n`, `--n`	Number of variations (default: `1`)
`-s`, `--size`	Output dimensions
`-j`, `--json`	Print raw JSON response

Queue

The queue lets you submit long-running inference jobs and collect results later, without keeping a connection open. Each user sees only their own jobs.

There are two submission commands, depending on whether the job takes a text prompt or a file as input:

Text-in (chat, embeddings, speech, image_generate): use queue submit
File-in (transcribe, translate, image_edit, image_variation): use queue upload

submit

Submit a text-in job and print its ID:

gseai queue submit qwen2.5-coder "Summarise this paper: ..." -n my-job
# a3f1c7d2-4b5e-...

# Generate speech
gseai queue submit kokoros "Hello, world." --job-type speech -n tts

# Generate an image
gseai queue submit stable-diffusion "A red barn." --job-type image_generate -n barn

# Read the prompt from a file
gseai queue submit qwen2.5-coder -f prompt.txt -n my-job

Flag	Description
`-f`, `--file`	Read the prompt from a file
`-n`, `--name`	Human-readable job name (default: `job`)
`--job-type`	`chat` (default), `embeddings`, `speech`, or `image_generate`
`-s`, `--system`	System prompt (`chat` only)
`-t`, `--temperature`	Sampling temperature (`chat` only, 0–2, default: `0.0`)
`-m`, `--max-tokens`	Maximum tokens to generate (`chat` only, default: `8192`)

upload

Submit a file-in job and print its ID:

# Transcribe an audio file
gseai queue upload whisper lecture.mp3 --job-type transcribe -n lecture

# Translate audio to English
gseai queue upload whisper talk.mp3 --job-type translate -n talk-en

# Generate a variation of an image
gseai queue upload stable-diffusion photo.png --job-type image_variation -n var

# Edit an image with a text instruction
gseai queue upload stable-diffusion photo.png --job-type image_edit \
    --prompt "replace the sky with a sunset" -n edit

Flag	Description
`--job-type`	Required. One of `transcribe`, `translate`, `image_edit`, `image_variation`
`-n`, `--name`	Human-readable job name (default: `job`)
`-p`, `--prompt`	Edit instruction (`image_edit` only)
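
Since upload prints the job ID, a loop can queue one job per file and record the IDs for later. This is a minimal sketch; the function name upload_all and the choice to name each job after its file are assumptions made for illustration.

```shell
# Hypothetical helper: queue one transcription job per audio file and
# print "job-id<TAB>file" so the IDs can be saved for later.
upload_all() {
    for f in "$@"; do
        base=${f##*/}     # strip any directory part
        name=${base%.*}   # strip the extension to get a job name
        id=$(gseai queue upload whisper "$f" --job-type transcribe -n "$name") || return 1
        printf '%s\t%s\n' "$id" "$f"
    done
}
```

A call such as upload_all *.mp3 > jobs.tsv leaves a tab-separated record of which job belongs to which file.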

status

Show the current status of a job:

gseai queue status a3f1c7d2-...
gseai queue status a3f1c7d2-... -j   # raw JSON

Human-readable output shows job ID, name, type, model, status, token count, timestamps, and the result or error once the job has finished.

Flag	Description
`-j`, `--json`	Print full JSON record

list

List your jobs with optional filtering:

gseai queue list
gseai queue list --status pending
gseai queue list --job-type speech
gseai queue list --model qwen2.5-coder --limit 50

Output is a table of job ID (first 8 characters), name, type, model, status, and creation time.

Flag	Description
`--status`	Filter by status: `pending`, `running`, `done`, `error`, `cancelled`
`--model`	Filter by model identifier
`--job-type`	Filter by job type (`chat`, `speech`, `transcribe`, etc.)
`--limit`	Maximum results (default: `200`)
`-j`, `--json`	Print raw JSON list

cancel

Cancel a pending job, or all pending jobs:

gseai queue cancel a3f1c7d2-...

# Cancel every pending job at once
gseai queue cancel --all

Only pending jobs can be cancelled. When given a specific job ID, the command exits non-zero with an error message if that job is running, already finished, or not found.

Flag	Description
`--all`	Cancel all pending jobs; prints a line per job and a total count

wait

Block until a job finishes and print the result:

gseai queue wait a3f1c7d2-...
gseai queue wait a3f1c7d2-... --interval 30 --timeout 3600

# Save a binary result (speech or image) to a file
gseai queue wait a3f1c7d2-... -o output.mp3

Progress lines are written to stderr; text results go to stdout. Binary results (speech, images) are saved to the path given by -o, or to {job_id[:8]}.mp3 / .png by default, and the saved path is printed to stdout.

[14:00:32] pending (0 tokens) ...
[14:01:32] running (142 tokens) ...
[14:18:44] done
Transformers are a type of neural network architecture ...

Exits non-zero if the job finishes with status error or cancelled.

Flag	Description
`-i`, `--interval`	Seconds between polls (default: `60`)
`-T`, `--timeout`	Maximum seconds to wait before giving up (default: no limit)
`-o`, `--output`	Output file path for binary results; defaults to `{job_id[:8]}.mp3` or `.png`
`-j`, `--json`	Print full JSON record instead of result text
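
Because progress goes to stderr and text results go to stdout, the two streams can be redirected separately, and the exit status shows whether the job succeeded. The following is a sketch for text-output jobs; the function name wait_to_file and its three-argument form are hypothetical.

```shell
# Hypothetical helper: save a text result to one file and the progress
# lines to another, reporting failure through the exit status.
wait_to_file() {
    job_id=$1
    result_file=$2
    log_file=$3
    # stdout carries the result text; stderr carries the progress lines.
    if gseai queue wait "$job_id" --interval 30 > "$result_file" 2>> "$log_file"; then
        echo "saved $result_file"
    else
        echo "job $job_id did not finish successfully" >&2
        return 1
    fi
}
```

For binary results, stdout carries the saved path rather than the result itself, so pass -o to queue wait instead of redirecting.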

fetch

Download the binary result of an already-done job without polling:

gseai queue fetch a3f1c7d2-... -o speech.mp3

Exits non-zero if the job is not done, has no binary result (text-output job types), or belongs to a different user.

Flag	Description
`-o`, `--output`	Output file path (default: `{job_id[:8]}.mp3` or `.png`)
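
The default filename can be predicted from the job ID and job type, which helps when a script has to find the file afterwards. The sketch below assumes speech jobs produce .mp3 and the three image job types produce .png; that mapping is inferred from the descriptions above rather than stated outright. The function name default_output and the job ID in the usage note are made up.

```shell
# Hypothetical helper: predict the default output filename for a job.
default_output() {
    job_id=$1
    job_type=$2
    case "$job_type" in
        speech) ext=mp3 ;;
        image_generate|image_edit|image_variation) ext=png ;;
        *)
            # Text-output job types have no binary result to fetch.
            echo "no binary result for job type: $job_type" >&2
            return 1
            ;;
    esac
    # First 8 characters of the job ID, as in {job_id[:8]}.
    short=$(printf '%s' "$job_id" | cut -c1-8)
    printf '%s.%s\n' "$short" "$ext"
}
```

For instance, default_output 0123abcd-ffff-ffff speech prints 0123abcd.mp3.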

run

Submit a text-in job and wait for the result in one command (combines submit + wait):

gseai queue run qwen2.5-coder "Explain quantum entanglement." -n my-job

# Generate speech and save immediately
gseai queue run kokoros "Hello, world." --job-type speech -o hello.mp3

The submitted job ID is printed to stderr so it can be recorded even if you interrupt the wait. For file-in jobs, use queue upload then queue wait instead.

Flag	Description
`-f`, `--file`	Read the prompt from a file
`-n`, `--name`	Human-readable job name (default: `job`)
`--job-type`	`chat` (default), `embeddings`, `speech`, or `image_generate`
`-s`, `--system`	System prompt (`chat` only)
`-t`, `--temperature`	Sampling temperature (`chat` only, 0–2, default: `0.0`)
`-m`, `--max-tokens`	Maximum tokens to generate (`chat` only, default: `8192`)
`-i`, `--interval`	Seconds between polls (default: `60`)
`-T`, `--timeout`	Maximum seconds to wait (default: no limit)
`-o`, `--output`	Output file path for binary results; defaults to `{job_id[:8]}.mp3` or `.png`
`-j`, `--json`	Print full JSON record instead of result text
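
For file-in jobs, the same one-step behaviour can be approximated by chaining queue upload and queue wait. This is a minimal sketch; the function name transcribe_and_wait and the job name transcript are illustrative assumptions.

```shell
# Hypothetical helper: upload an audio file for transcription, then wait.
transcribe_and_wait() {
    audio=$1
    job_id=$(gseai queue upload whisper "$audio" --job-type transcribe -n transcript) || return 1
    # Mirror queue run: report the ID on stderr so it survives an interrupt.
    echo "job $job_id" >&2
    gseai queue wait "$job_id"
}
```

The function's exit status is that of queue wait, so it is non-zero when the job ends with status error or cancelled.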