Status: closed beta

Use specialized open models like any other API.

Skip the overhead of self-hosting. Access community-hosted models through an OpenAI-compatible API, with flexible model choice and better economics.


bash — curl-inference.sh

curl https://infer.ram4.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "community/llama3.1-70b-legal-es",
    "messages": [
      {"role": "user", "content": "Hola, necesito ayuda con un contrato"}
    ]
  }'

Closed APIs are expensive. Self-hosting takes time. Renting raw GPUs still leaves the serving and operations work to you.

infer gives you a simpler path to open inference: configured models, one API, less lock-in.

Closed beta: 30 / 50 spots left

How it works

Use models

Pick a model

Browse community-hosted models by specialty, latency, and price.

Get your API key

One key, access to the entire network.

Drop-in replacement

Change one line: your base URL. Works with any OpenAI SDK.
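
A minimal sketch of what "one line" can look like in practice, using only the Python standard library. It builds the same request as the curl example above, with the base URL as the single value to swap. The helper names `build_chat_request` and `send` are illustrative and not part of any infer client. With the official OpenAI Python SDK, the equivalent change would be passing `base_url` when constructing the client.

```python
import json
import urllib.request

# Base URL taken from the curl example on this page.
# This is the "one line" to change when switching providers.
INFER_BASE_URL = "https://infer.ram4.dev/v1"


def build_chat_request(api_key, model, prompt, base_url=INFER_BASE_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    req = build_chat_request(
        api_key="YOUR_API_KEY",
        model="community/llama3.1-70b-legal-es",
        prompt="Hola, necesito ayuda con un contrato",
    )
    # Inspect the request without touching the network.
    print(req.get_method(), req.full_url)
```

Call `send(req)` with a real API key to run the request. Keeping the base URL in one constant means the rest of the calling code stays the same if you move between OpenAI-compatible endpoints.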

Host models

Configure locally

Run any model with Ollama, vLLM, or LocalAI.

Connect to infer

Point your endpoint to our gateway. We handle routing.

Earn per token

Get paid for every request your model serves.

Why infer

Works with your current stack

Keep your SDKs. Change your base URL and start.

Models for real use cases

Use specialized, fine-tuned, or augmented models beyond generic cloud defaults.

More control, less lock-in

Choose models and providers without rewriting your app around one vendor.
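
One way to keep provider choice out of your application code is to resolve the base URL from configuration. The sketch below is an assumption about how you might structure this, not an infer feature: the provider labels and the `LLM_PROVIDER` environment variable are hypothetical. The infer URL comes from the curl example on this page, and the other two entries are the commonly used defaults for OpenAI's API and a local Ollama server's OpenAI-compatible endpoint.

```python
import os

# Base URLs keyed by a hypothetical provider label.
PROVIDERS = {
    "infer": "https://infer.ram4.dev/v1",
    "openai": "https://api.openai.com/v1",
    "local-ollama": "http://localhost:11434/v1",
}


def resolve_base_url(provider=None):
    """Return the base URL for a provider.

    Falls back to the LLM_PROVIDER environment variable (hypothetical name),
    and to "infer" when neither is set.
    """
    name = provider or os.environ.get("LLM_PROVIDER", "infer")
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown provider: {name!r}") from None
```

The application then asks for `resolve_base_url()` once at startup and passes the result to whichever OpenAI-compatible client it already uses.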

Better economics for open workloads

Use open inference at lower cost for many production and prototyping scenarios.

Monetize the models you already run.

Connect Ollama, vLLM, or LocalAI to infer. Keep control of your setup and get paid when your model serves traffic.

terminal

$ infer connect --endpoint http://localhost:11434 --model llama3.1-70b
✓ Connected. Your model is now live on infer.ram4.dev

Join the beta

We're onboarding the first API users and model hosts now. Tell us your role and use case to get invited.