Serverless inference for open models, focused on low latency. Supports function calling, JSON mode, and LoRA hosting.