diff --git a/AGENTS.md b/AGENTS.md
index eb07e04..ac7604a 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -19,20 +19,29 @@ generation endpoint and serves a small vanilla HTML/JS playground.
   `noEmit: true`, so plain `bunx tsc` works too).
 - Tests / lint / formatter: none configured. If adding tests, use `bun test`.
 
-The server binds `0.0.0.0` (see `index.ts:6`), so it is reachable from other
+The server binds `0.0.0.0` (see `index.ts:61`), so it is reachable from other
 hosts on the network when running locally — be mindful when entering API keys.
 
 ## Architecture
 
 - `index.ts` — the entire backend. One `Bun.serve` instance with:
   - `/` serves `index.html` via Bun's HTML import (`import index from "./index.html"`).
-  - `POST /api/generate` accepts `{ baseURL, apiKey, model, prompt, size }`,
-    builds an OpenAI-compatible provider with `@ai-sdk/openai-compatible`, and
-    calls `generateImage` from `ai`. Images come back as base64 and are
-    returned as `data:` URLs in `{ images: string[] }`.
+  - `POST /api/generate` accepts
+    `{ baseURL, apiKey, model, prompt, size, referenceImages? }`. It returns
+    `{ images: string[] }` where each entry is a `data:` URL (base64).
+  - Two code paths inside the handler:
+    1. No `referenceImages` → uses `@ai-sdk/openai-compatible` + `generateImage`
+       from `ai`.
+    2. `referenceImages` present → hand-rolled `multipart/form-data` POST to
+       `${baseURL}/images/edits` (see `generateWithReference`). The AI SDK
+       does not currently expose image edits for OpenAI-compatible providers,
+       so this path bypasses it on purpose. The edits endpoint is gpt-image
+       series only (see UI hint in `index.html`).
 - `index.html` — self-contained UI: inline CSS, plain DOM JS, no build step.
-  Settings (baseURL, apiKey, model, size, prompt) persist in `localStorage`
-  under the `aip:` prefix. There is no React code despite
+  Text fields (`baseURL`, `apiKey`, `model`, `size`, `prompt`) persist in
+  `localStorage` under the `aip:` prefix. Reference images are kept
+  in an in-memory `refImages` array as base64 data URLs and are **not**
+  persisted — refreshing the page drops them. There is no React code despite
   `react` / `react-dom` / `@types/react*` being in `package.json` — treat those
   deps as latent. Do not invent a React frontend unless asked.
 - No router, no DB, no auth. API key is supplied per-request by the browser
@@ -59,3 +68,7 @@ hosts on the network when running locally — be mindful when entering API keys.
 - The AI SDK image type is loose; the current handler casts to
   `{ mediaType?: string; base64?: string }`. Mirror that pattern rather than
   trusting field presence.
+- For anything the AI SDK does not cover (e.g. image edits, masks, variations),
+  follow `generateWithReference`: build `FormData` with `Blob`s decoded from
+  the incoming data URLs and `fetch` the upstream endpoint directly with the
+  caller's `Authorization: Bearer `.
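
Review note: the last added bullet describes the `generateWithReference` pattern only in prose, so here is a minimal sketch of what it could look like. It is not the code in `index.ts`. The helper names (`dataUrlToBlob`, `buildEditForm`, `postImageEdit`), the `EditRequest` type, the `image[]` field name, the `reference-N.png` file names, and the use of `apiKey` as the bearer token are all assumptions made for illustration.

```typescript
// Hypothetical request shape, mirroring the fields listed for POST /api/generate.
interface EditRequest {
  baseURL: string;
  apiKey: string;
  model: string;
  prompt: string;
  size: string;
  referenceImages: string[]; // base64 `data:` URLs sent by the browser
}

/** Decode a base64 `data:` URL into a Blob that keeps its media type. */
function dataUrlToBlob(dataUrl: string): Blob {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) throw new Error("expected a base64 data: URL");
  const mediaType = match[1] ?? "application/octet-stream";
  // atob yields a binary string; map each char code to one byte.
  const bytes = Uint8Array.from(atob(match[2] ?? ""), (c) => c.charCodeAt(0));
  return new Blob([bytes], { type: mediaType });
}

/** Build the multipart body. Field and file names are assumptions. */
function buildEditForm(req: EditRequest): FormData {
  const form = new FormData();
  form.append("model", req.model);
  form.append("prompt", req.prompt);
  form.append("size", req.size);
  req.referenceImages.forEach((url, i) => {
    form.append("image[]", dataUrlToBlob(url), `reference-${i}.png`);
  });
  return form;
}

/** POST straight to the upstream edits endpoint, bypassing the AI SDK. */
async function postImageEdit(req: EditRequest): Promise<unknown> {
  const res = await fetch(`${req.baseURL}/images/edits`, {
    method: "POST",
    // No Content-Type header: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${req.apiKey}` },
    body: buildEditForm(req),
  });
  if (!res.ok) {
    throw new Error(`upstream ${res.status}: ${await res.text()}`);
  }
  return res.json();
}
```

Keeping the decoding and form-building steps separate from the network call makes them testable with `bun test` without contacting an upstream provider. The response is left as `unknown` on purpose, in line with the guidance above about not trusting field presence.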