Edge Compute
Run your application next to 8080 inference—ASIC-backed LLMs and general compute on the same fabric for minimal end-to-end latency.
Edge Compute is 8080’s way of running your code on the same network and infrastructure that serves inference. Models run on ASIC accelerators tuned for LLM workloads; your handlers, gateways, and tools run on general-purpose compute provisioned alongside that stack—not in a distant region reached only over the public internet.
8080 datacenters are sited in close physical proximity to major AWS and GCP cloud regions. That keeps the path between applications you run in those clouds and 8080 Edge and inference as short as possible: fewer miles and fewer intermediate hops than if inference lived far from your existing regional footprint, so hybrid setups still see minimal distance for traffic to and from your hyperscaler workloads.
Why it matters
When a user request hits your app and your app calls the inference API, every hop across the open internet adds milliseconds (often tens or hundreds) of delay you cannot fully control. On Edge Compute, orchestration keeps your application logic and the accelerator-backed inference path on a co-located fabric, so the round trip from your code to the model and back is as short as the platform allows. That makes end-to-end latencies (browser or client through your logic, into the model, and out again) hard to reproduce if you host the app elsewhere and only call api.8080.io over the public internet.
Use Edge when you care about agents, multi-step flows, RAG, custom routing, or any pattern where many small model calls or tight coupling between business logic and inference would otherwise dominate latency.
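To see how that compounds, here is a back-of-the-envelope sketch; the round-trip and model-time figures below are illustrative assumptions, not measured numbers:

```python
# Rough latency budget for a multi-step agent flow.
# All figures are illustrative assumptions, not measurements.

MODEL_TIME_MS = 300      # time the model spends on one call
PUBLIC_RTT_MS = 80       # assumed round trip over the public internet
COLOCATED_RTT_MS = 1     # assumed round trip on a co-located fabric
N_CALLS = 20             # model/tool calls in one agent run

def total_ms(rtt_ms: float, n_calls: int = N_CALLS) -> float:
    """End-to-end time when every call pays the network round trip."""
    return n_calls * (rtt_ms + MODEL_TIME_MS)

public = total_ms(PUBLIC_RTT_MS)     # 20 * 380 ms = 7600 ms
edge = total_ms(COLOCATED_RTT_MS)    # 20 * 301 ms = 6020 ms
print(f"public: {public / 1000:.1f}s  edge: {edge / 1000:.1f}s  "
      f"saved: {public - edge:.0f} ms per run")
```

The model time is identical in both cases; only the network term changes, and in chatty flows that term is paid on every call.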
- Python — The primary SDK today. It packages the FastAPI-style app builder, tools, sandboxes, and helpers for calling completions from the same environment you deploy to. See Custom endpoints to get started, and the sketch after this list for the general shape.
- TypeScript — A TypeScript/JavaScript SDK is coming soon. It will follow the same idea (define endpoints and deploy them next to inference) once runtime support lands; see Deploying code for the current manifest and Python-oriented workflow.
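As a rough sketch of the shape an Edge endpoint takes (the `call_completion` helper below is a placeholder, not the SDK's real API; Custom endpoints documents the actual names), a handler is an ordinary FastAPI-style service that calls completions from inside the fabric:

```python
# Minimal sketch of an Edge endpoint. call_completion is a placeholder
# for the SDK's completion helper, not its real name.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    text: str

def call_completion(prompt: str) -> str:
    """Placeholder: swap in the 8080 SDK's completion helper here.
    On Edge, that call stays on the co-located fabric instead of
    crossing the public internet to api.8080.io."""
    raise NotImplementedError("replace with the SDK completion call")

@app.post("/answer")
def answer(q: Question) -> dict:
    return {"answer": call_completion(q.text)}
```

Run it locally with any ASGI server (for example, `uvicorn main:app`), then deploy with the CLI so the handler and the model share the same fabric.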
Next steps
- Deploying code — 8080.yaml, entrypoints, resources, and how deployments are configured.
- Sandboxes — Isolated environments for code execution and tooling next to your Edge app.
- Custom endpoints — Build and run a minimal Python service locally, then deploy with the CLI.