imagen/AGENTS.md
imbytecat 5af05b2141 feat: stream gpt-image generation via SSE with keepalive
- /api/generate now responds with text/event-stream end-to-end
- forwards upstream image_generation.* / image_edit.* partial+completed events
- 20s keepalive comments survive Cloudflare's 120s proxy-read timeout
- falls back to non-streaming when upstream rejects stream/partial_images
- drops @ai-sdk/openai-compatible, @ai-sdk/react, ai (unused)
- frontend consumes SSE via fetch+ReadableStream, shows progressive preview
2026-05-18 22:44:31 +08:00


# AGENTS.md
Bun + TypeScript single-file server that proxies an OpenAI-compatible image
endpoint and serves a small vanilla HTML/JS playground. The whole pipeline is
SSE end-to-end so it survives Cloudflare's 120s proxy-read timeout.
## Runtime
- Bun, not Node. See `CLAUDE.md` for the full Bun-vs-Node cheatsheet
(prefer `Bun.serve`, `Bun.file`, `bun:test`, `Bun.sql`, etc.). Do not add
`dotenv` — Bun loads `.env` automatically.
- Bun version baseline: `1.3.13` (per `README.md`).
## Commands
- Install: `bun install`
- Dev (HMR): `bun run dev` → `bun --hot ./index.ts`
- Start: `bun run start` → `bun ./index.ts`
- Typecheck: no script defined. Use `bunx tsc --noEmit` (tsconfig already sets
`noEmit: true`, so plain `bunx tsc` works too).
- Tests / lint / formatter: none configured. If adding tests, use `bun test`.

The server binds `0.0.0.0` (see `index.ts:175`), so it is reachable from other
hosts on the network when running locally. Be mindful when entering API keys.
## Architecture
- `index.ts` — the entire backend. One `Bun.serve` instance with:
  - `/` serves `index.html` via Bun's HTML import (`import index from "./index.html"`).
  - `POST /api/generate` accepts
    `{ baseURL, apiKey, model, prompt, size, referenceImages? }` and **always
    responds with `text/event-stream`**. Emitted events:
    - `event: partial` → `{ image: dataUrl, index }` for each `partial_image`
    - `event: final` → `{ image: dataUrl }` for the completed image
    - `event: done` → empty payload, sent right before close
    - `event: error` → `{ message }` for any failure
    - SSE comments `: keepalive` every 20s while waiting for upstream, so
      Cloudflare's 120s proxy-read timeout never fires.
  - Upstream dispatch:
    - `referenceImages` present → `POST {baseURL}/images/edits` as
      `multipart/form-data` (image blobs decoded from data URLs).
    - Otherwise → `POST {baseURL}/images/generations` as JSON.
    - Both calls send `stream: true, partial_images: 2` first. If upstream
      returns a 400 mentioning `stream` or `partial_images`,
      `isStreamingUnsupportedError` triggers a single retry with
      `stream: false` and the response is replayed as one `final` event via
      `forwardUpstreamJSON`. Any other 4xx/5xx propagates as `error`.
  - Targets the **gpt-image series only** (gpt-image-2 is the default). Do
    not reintroduce DALL·E-only fields like `response_format`; gpt-image
    always returns `b64_json`.
- `index.html` — self-contained UI: inline CSS, plain DOM JS, no build step.
Reads the SSE response via `fetch` + `ReadableStream` (not `EventSource`,
because the API is `POST`). Partials overwrite a single `<img>` so the
preview animates in place. Text fields (`baseURL`, `apiKey`, `model`,
`size`, `prompt`) persist in `localStorage` under the `aip:<field>` prefix.
Reference images are kept in an in-memory `refImages` array as base64 data
URLs and are **not** persisted. There is no React code despite
`react` / `react-dom` / `@types/react*` being in `package.json` — treat
those deps as latent. Do not invent a React frontend unless asked.
- No router, no DB, no auth, no AI SDK. API key is supplied per-request by
the browser and never stored server-side.
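
For orientation, here is a minimal sketch of how a client might read the event stream described above with `fetch` + `ReadableStream`. It is not the code in `index.html`: the names `SseEvent`, `parseSseBuffer`, and `consumeSse` are hypothetical, and it assumes the server separates frames with `\n\n` and uses `\n` line endings.

```typescript
// Hypothetical event shape; `event` mirrors the names listed above
// (partial, final, done, error) and `data` is the raw payload string.
type SseEvent = { event: string; data: string };

// Split a text buffer into complete SSE frames and return the unparsed tail.
// Lines starting with ":" are comments (e.g. ": keepalive") and are skipped.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  const frames = buffer.split("\n\n");
  // The last element is "" or an incomplete frame; keep it for the next chunk.
  const rest = frames.pop() ?? "";
  for (const frame of frames) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue;
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trimStart());
    }
    // Comment-only frames carry neither an event name nor data; drop them.
    if (dataLines.length > 0 || event !== "message") {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}

// Read a streaming response incrementally and hand each event to a callback.
async function consumeSse(
  res: Response,
  onEvent: (e: SseEvent) => void,
): Promise<void> {
  if (!res.body) throw new Error("response has no body");
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest;
    parsed.events.forEach(onEvent);
  }
}
```

A caller would pass the `Response` from a `POST /api/generate` request to `consumeSse` and, on each `partial` or `final` event, `JSON.parse` the data and assign `image` to the preview `<img>`.
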
## TypeScript conventions
`tsconfig.json` is strict with bundler-mode resolution:
- `strict`, `noUncheckedIndexedAccess`, `noImplicitOverride`,
`noFallthroughCasesInSwitch` are on — array/object index access is
`T | undefined` and must be narrowed.
- `verbatimModuleSyntax` + `moduleDetection: "force"` — use `import type` for
type-only imports; every file is a module.
- `allowImportingTsExtensions` is on; `.ts` extensions in imports are fine.
- `jsx: "react-jsx"` is set but unused (see frontend note above).
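
As an illustration of what these flags mean in practice, the sketch below narrows indexed access before use. The helpers `mimeFromDataUrl` and `firstOrDefault` and the `./types.ts` path are hypothetical and do not exist in the repo.

```typescript
// Under verbatimModuleSyntax, type-only imports must use `import type`.
// Shown as a comment because `./types.ts` is a hypothetical file:
// import type { GenerateRequest } from "./types.ts";

// With noUncheckedIndexedAccess, `match[1]` is `string | undefined`,
// so it has to be narrowed before it can be returned as `string`.
function mimeFromDataUrl(dataUrl: string): string {
  const match = /^data:([^;,]+)[;,]/.exec(dataUrl);
  const mime = match?.[1];
  if (mime === undefined) throw new Error("not a data URL");
  return mime;
}

// Array index access is `T | undefined` even when the array looks non-empty.
function firstOrDefault<T>(items: readonly T[], fallback: T): T {
  const first = items[0];
  return first === undefined ? fallback : first;
}
```
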
## When extending the API
- Add routes inside the `routes` object in `index.ts`; keep the
`{ POST: async (req) => … }` shape used by `/api/generate`.
- For any long-running upstream call, mirror the SSE-with-keepalive pattern:
build a `ReadableStream<Uint8Array>`, start a 20s `: keepalive` comment
timer in `start()`, do work inside `try`, always `clearInterval` and
`controller.close()` in `finally`. Helpers `sseEvent` / `sseComment`
already exist.
- Stay defensive about upstream capabilities: many OpenAI-compatible
providers reject unknown params. Send the optimistic request first, then
detect the specific 400 (see `isStreamingUnsupportedError`) and retry with
a degraded body rather than feature-detecting up front.
- Decode incoming data URLs with `decodeDataUrl` (returns `Buffer` + mime)
and pass them as `Blob` parts to `FormData` — same pattern as the edits
path.
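
The SSE-with-keepalive pattern above could be sketched roughly as follows. This is an assumption-laden illustration, not the code in `index.ts`: `sseStream` is a hypothetical wrapper, the `sseEvent` / `sseComment` bodies are stand-ins whose real signatures may differ, and emitting `done` with a `{}` payload only on success is a guess at the existing behaviour.

```typescript
const encoder = new TextEncoder();

// Hypothetical stand-ins for the existing helpers in index.ts.
function sseEvent(event: string, data: unknown): Uint8Array {
  return encoder.encode(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
}
function sseComment(text: string): Uint8Array {
  return encoder.encode(`: ${text}\n\n`);
}

// Wrap a long-running task in an SSE stream that emits keepalive comments
// until the task settles. `keepaliveMs` defaults to the 20s used above.
function sseStream(
  work: (send: (event: string, data: unknown) => void) => Promise<void>,
  keepaliveMs = 20_000,
): ReadableStream<Uint8Array> {
  let open = true; // flipped when the stream is closed or the client cancels
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      // Enqueue only while the stream is open, so late writes become no-ops.
      const push = (chunk: Uint8Array): void => {
        if (open) controller.enqueue(chunk);
      };
      const timer = setInterval(() => push(sseComment("keepalive")), keepaliveMs);
      try {
        await work((event, data) => push(sseEvent(event, data)));
        push(sseEvent("done", {}));
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        push(sseEvent("error", { message }));
      } finally {
        clearInterval(timer);
        if (open) {
          open = false;
          controller.close();
        }
      }
    },
    cancel() {
      open = false;
    },
  });
}
```

A route handler would then return something like `new Response(sseStream(work), { headers: { "Content-Type": "text/event-stream" } })`. Chunks enqueued inside `start()` reach the reader while the task is still running, which is what lets the keepalive comments flow before the first real event.
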