InferX (Beta): Serverless GPU Inference Platform, Built for Agent-Native Workloads

Browse published catalog models for agent-native workloads and serverless GPU inference. Log in when you want to customize a model or deploy it into your own tenant.
| Model | Intro | Tags | Action |
| --- | --- | --- | --- |
| gemma-4-E2B-it | An efficient Gemma 4 model optimized for strong performance with lower resource usage | | View on Hugging Face |
| Qwen2.5-Coder-0.5B | Lightweight coder | LLM, Code, OpenClaw | View on Hugging Face |
| Qwen2.5-Coder-1.5B-Instruct | Lightweight coder | coding, low-latency | View on Hugging Face |