imbytecat 5af05b2141 feat: stream gpt-image generation via SSE with keepalive
- /api/generate now responds with text/event-stream end-to-end
- forwards upstream image_generation.* / image_edit.* partial+completed events
- 20s keepalive comments survive Cloudflare's 120s proxy-read timeout
- falls back to non-streaming when upstream rejects stream/partial_images
- drops @ai-sdk/openai-compatible, @ai-sdk/react, ai (unused)
- frontend consumes SSE via fetch+ReadableStream, shows progressive preview
2026-05-18 22:44:31 +08:00


AGENTS.md

Bun + TypeScript single-file server that proxies an OpenAI-compatible image endpoint and serves a small vanilla HTML/JS playground. The whole pipeline is SSE end-to-end so it survives Cloudflare's 120s proxy-read timeout.

Runtime

  • Bun, not Node. See CLAUDE.md for the full Bun-vs-Node cheatsheet (prefer Bun.serve, Bun.file, bun:test, Bun.sql, etc.). Do not add dotenv — Bun loads .env automatically.
  • Bun version baseline: 1.3.13 (per README.md).

Commands

  • Install: bun install
  • Dev (HMR): bun run dev (runs bun --hot ./index.ts)
  • Start: bun run start (runs bun ./index.ts)
  • Typecheck: no script defined. Use bunx tsc --noEmit (tsconfig already sets noEmit: true, so plain bunx tsc works too).
  • Tests / lint / formatter: none configured. If adding tests, use bun test.

The server binds 0.0.0.0 (see index.ts:175), so it is reachable from other hosts on the network when running locally — be mindful when entering API keys.

Architecture

  • index.ts — the entire backend. One Bun.serve instance with:
    • / serves index.html via Bun's HTML import (import index from "./index.html").
    • POST /api/generate accepts { baseURL, apiKey, model, prompt, size, referenceImages? } and always responds with text/event-stream. Emitted events:
      • event: partial with data { image: dataUrl, index } for each partial_image
      • event: final with data { image: dataUrl } for the completed image
      • event: done — empty payload, sent right before close
      • event: error with data { message } for any failure
      • SSE comment lines (: keepalive) every 20s while waiting for upstream, so Cloudflare's 120s proxy-read timeout never fires.
    • Upstream dispatch:
      • referenceImages present → POST {baseURL}/images/edits as multipart/form-data (image blobs decoded from data URLs).
      • Otherwise → POST {baseURL}/images/generations as JSON.
      • Both calls are first sent with stream: true and partial_images: 2. If upstream returns a 400 whose body mentions stream or partial_images, isStreamingUnsupportedError triggers a single retry with stream: false, and the response is replayed as one final event via forwardUpstreamJSON. Any other 4xx/5xx propagates as an error event.
    • Targets the gpt-image series only (gpt-image-2 is the default). Do not reintroduce DALL·E-only fields like response_format — gpt-image always returns b64_json.
  • index.html — self-contained UI: inline CSS, plain DOM JS, no build step. Reads the SSE response via fetch + ReadableStream (not EventSource, because the API is POST). Partials overwrite a single <img> so the preview animates in place. Text fields (baseURL, apiKey, model, size, prompt) persist in localStorage under the aip:<field> prefix. Reference images are kept in an in-memory refImages array as base64 data URLs and are not persisted. There is no React code despite react / react-dom / @types/react* being in package.json — treat those deps as latent. Do not invent a React frontend unless asked.
  • No router, no DB, no auth, no AI SDK. API key is supplied per-request by the browser and never stored server-side.
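
The event protocol listed above can be sketched from the consumer side. This is a minimal illustration, not the parser in index.html: it assumes "\n" line endings and a single JSON data: line per event, and the names SseMessage and parseSseBuffer are hypothetical.

```typescript
// Hypothetical parser for the event stream described above.
// Assumptions: "\n" line endings, one JSON `data:` line per event.
type SseMessage = { event: string; data: unknown };

// Splits a text buffer into complete SSE frames (each terminated by a blank
// line) and returns the parsed messages plus any trailing incomplete text,
// which the caller should prepend to the next chunk it reads.
function parseSseBuffer(buffer: string): { messages: SseMessage[]; rest: string } {
  const messages: SseMessage[] = [];
  const frames = buffer.split("\n\n");
  // The last element is either "" or an unfinished frame; hand it back.
  const rest = frames.pop() ?? "";
  for (const frame of frames) {
    let event = "message"; // SSE default when no `event:` line is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment line, e.g. ": keepalive"
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    // A frame holding only comments carries no event; skip it.
    if (dataLines.length === 0 && event === "message") continue;
    const raw = dataLines.join("\n");
    messages.push({ event, data: raw === "" ? null : JSON.parse(raw) });
  }
  return { messages, rest };
}
```

A fetch + ReadableStream reader would decode each chunk to text, append it to the previous rest, and call parseSseBuffer again, so a frame split across two network chunks is only parsed once it is complete.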

TypeScript conventions

tsconfig.json is strict with bundler-mode resolution:

  • strict, noUncheckedIndexedAccess, noImplicitOverride, noFallthroughCasesInSwitch are on — array/object index access is T | undefined and must be narrowed.
  • verbatimModuleSyntax + moduleDetection: "force" — use import type for type-only imports; every file is a module.
  • allowImportingTsExtensions is on; .ts extensions in imports are fine.
  • jsx: "react-jsx" is set but unused (see frontend note above).
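
As a small illustration of the noUncheckedIndexedAccess rule above, the sketch below narrows destructured array elements before using them. The WIDTHxHEIGHT size format and the parseSize name are assumptions made for the example; they are not taken from index.ts.

```typescript
// Hypothetical helper: parses a size string such as "1024x1536".
// With noUncheckedIndexedAccess on, `w` and `h` are `string | undefined`,
// so both must be narrowed before they can be passed to parseInt.
function parseSize(size: string): { width: number; height: number } {
  const [w, h] = size.split("x");
  if (w === undefined || h === undefined) {
    throw new Error(`invalid size: ${size}`);
  }
  const width = Number.parseInt(w, 10);
  const height = Number.parseInt(h, 10);
  if (Number.isNaN(width) || Number.isNaN(height)) {
    throw new Error(`invalid size: ${size}`);
  }
  return { width, height };
}
```

Without the undefined check, the compiler rejects the parseInt calls, which is the behavior the strict settings are meant to enforce.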

When extending the API

  • Add routes inside the routes object in index.ts; keep the { POST: async (req) => … } shape used by /api/generate.
  • For any long-running upstream call, mirror the SSE-with-keepalive pattern: build a ReadableStream<Uint8Array>, start a 20s : keepalive comment timer in start(), do work inside try, always clearInterval and controller.close() in finally. Helpers sseEvent / sseComment already exist.
  • Stay defensive about upstream capabilities: many OpenAI-compatible providers reject unknown params. Send the optimistic request first, then detect the specific 400 (see isStreamingUnsupportedError) and retry with a degraded body rather than feature-detecting up front.
  • Decode incoming data URLs with decodeDataUrl (returns Buffer + mime) and pass them as Blob parts to FormData — same pattern as the edits path.
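
The SSE-with-keepalive pattern described in the list above could look roughly like the following. It is a sketch under assumptions: sseEvent and sseComment here are stand-ins whose real signatures in index.ts may differ, sseStream is a hypothetical wrapper name, the done payload is shown as {}, and handling of client disconnects is left out.

```typescript
const encoder = new TextEncoder();

// Stand-ins for the existing helpers; the real signatures may differ.
function sseEvent(event: string, data: unknown): Uint8Array {
  return encoder.encode(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
}
function sseComment(text: string): Uint8Array {
  return encoder.encode(`: ${text}\n\n`);
}

// Hypothetical wrapper: runs `work` while emitting keepalive comments, then
// always stops the timer and closes the stream.
function sseStream(
  work: (send: (event: string, data: unknown) => void) => Promise<void>,
  keepaliveMs = 20_000,
): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      const timer = setInterval(
        () => controller.enqueue(sseComment("keepalive")),
        keepaliveMs,
      );
      try {
        await work((event, data) => controller.enqueue(sseEvent(event, data)));
        controller.enqueue(sseEvent("done", {}));
      } catch (err) {
        // Any failure is reported to the client as an `error` event.
        const message = err instanceof Error ? err.message : String(err);
        controller.enqueue(sseEvent("error", { message }));
      } finally {
        clearInterval(timer);
        controller.close();
      }
    },
  });
}
```

A route handler would return the stream in a Response with a Content-Type of text/event-stream. Making the interval a parameter keeps the 20s default while allowing a short interval in tests.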
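
The "optimistic request first, then degrade" advice above can be sketched with the transport injected, so the retry logic stays independent of fetch. All names here (PostJson, looksLikeStreamingUnsupported, postWithStreamingFallback) are hypothetical; the actual check lives in isStreamingUnsupportedError, whose exact matching rules are not shown in this document. Dropping partial_images on the retry is also an assumption.

```typescript
// Hypothetical transport: sends a JSON body, resolves with status and body text.
type PostJson = (
  body: Record<string, unknown>,
) => Promise<{ status: number; text: string }>;

// Assumed detection rule: a 400 whose body mentions a streaming parameter.
function looksLikeStreamingUnsupported(status: number, bodyText: string): boolean {
  return status === 400 && /stream|partial_images/i.test(bodyText);
}

// Tries the streaming request first and retries exactly once without the
// streaming parameters when the upstream rejects them.
async function postWithStreamingFallback(
  post: PostJson,
  base: Record<string, unknown>,
): Promise<{ response: { status: number; text: string }; usedFallback: boolean }> {
  const first = await post({ ...base, stream: true, partial_images: 2 });
  if (!looksLikeStreamingUnsupported(first.status, first.text)) {
    // Success or an unrelated error: pass it through untouched.
    return { response: first, usedFallback: false };
  }
  const second = await post({ ...base, stream: false });
  return { response: second, usedFallback: true };
}
```

Retrying only on the specific 400 keeps unrelated failures (bad API key, rate limits, 5xx) visible to the caller instead of masking them behind a second request.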