SuperMarioYL

Leo — AI systems, made to run in production.

I build the infra that makes LLM agents reliable in production — inference serving, MCP tool layers, multi-agent orchestration, and eval/observability.

How my agents run

User → Orchestrator → Tools/MCP + Memory/RAG → Inference, on a Cloud Native AI substrate, instrumented by Eval & Observability

Every tool call and LLM span is traced, guardrails gate actions, and eval feedback closes the loop, so each agent runs as an observable, cost-bounded system.
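As a rough illustration of the tracing, guardrail, and cost-bound ideas above, here is a minimal in-memory sketch. The names `Span`, `Tracer`, `call`, `guard`, and `budget_usd` are hypothetical and chosen for this example only; they are not taken from any specific tracing library or from the projects on this profile.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Span:
    """One traced unit of work (a tool call or an LLM call)."""
    name: str
    kind: str          # e.g. "tool" or "llm"
    duration_s: float
    cost_usd: float
    status: str        # "ok", "blocked", or "error"


@dataclass
class Tracer:
    """Collects spans in memory; a real system would export them somewhere."""
    budget_usd: float
    spans: list = field(default_factory=list)

    @property
    def spent_usd(self) -> float:
        # Blocked calls are recorded with zero cost, so they do not count here.
        return sum(s.cost_usd for s in self.spans)

    def call(self, name: str, kind: str, fn: Callable[..., Any], *args: Any,
             cost_usd: float = 0.0,
             guard: Optional[Callable[..., bool]] = None) -> Any:
        """Run fn behind a guardrail and a cost budget, recording a span either way."""
        if guard is not None and not guard(*args):
            self.spans.append(Span(name, kind, 0.0, 0.0, "blocked"))
            raise PermissionError(f"guardrail blocked {name}")
        if self.spent_usd + cost_usd > self.budget_usd:
            self.spans.append(Span(name, kind, 0.0, 0.0, "blocked"))
            raise RuntimeError(f"budget exceeded before {name}")
        start = time.perf_counter()
        try:
            result = fn(*args)
        except Exception:
            elapsed = time.perf_counter() - start
            self.spans.append(Span(name, kind, elapsed, cost_usd, "error"))
            raise
        elapsed = time.perf_counter() - start
        self.spans.append(Span(name, kind, elapsed, cost_usd, "ok"))
        return result
```

In this sketch a call such as `tracer.call("shell", "tool", run, "ls", guard=is_safe)` either returns the tool's result or raises, and in both cases leaves a span behind, which is the property the eval loop would rely on.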

Capabilities

AI Agent: plan→act→reflect loop · Cloud Native: scheduler + pods · Inference: lower latency, higher throughput
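The plan→act→reflect loop can be sketched in a few lines. This is a toy under stated assumptions: `run_agent`, `plan`, `act`, and `reflect` are hypothetical names for this example, and a real agent would put an LLM behind `plan` and `reflect` and real tools behind `act`.

```python
from typing import Any, Callable, List, Tuple

History = List[Tuple[Any, Any]]  # (step, observation) pairs


def run_agent(goal: Any,
              plan: Callable[[Any, History], Any],
              act: Callable[[Any], Any],
              reflect: Callable[[Any, History], bool],
              max_steps: int = 5) -> History:
    """Repeat plan -> act -> reflect until reflect reports success or steps run out."""
    history: History = []
    for _ in range(max_steps):
        step = plan(goal, history)          # decide the next action
        observation = act(step)             # execute it
        history.append((step, observation))
        if reflect(goal, history):          # check progress against the goal
            break
    return history


# Toy instance: count upward until the goal number is reached.
def plan_next(goal: int, history: History) -> int:
    current = history[-1][1] if history else 0
    return current + 1


def act_identity(step: int) -> int:
    return step


def reached(goal: int, history: History) -> bool:
    return history[-1][1] >= goal
```

The `max_steps` cap is the design choice worth noting: it bounds the loop even when `reflect` never reports success.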

Tech stack

Grouped by pillar: AI Agent · Cloud Native AI · Inference

Journey

From infrastructure to agents: Cloud Native AI → Inference Acceleration → AI Agent

Selected work


Let's build reliable AI systems together · blog.lei6393.com

Blog · Email · GitHub