Series B fintech · Q4 2025
Problem
Inference costs tripled month-over-month. Engineering couldn't trace which model calls were driving the spike. The board was asking the CFO questions the team couldn't answer.
What shipped
Cost routing layer, delivered in 6 weeks: structured model-selection logic plus a full audit trail mapping every dollar of inference cost to a feature call.
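
As a rough illustration of what a cost routing layer with an audit trail can look like, here is a minimal Python sketch. It is not the shipped implementation. The class names, the capability tiers, the "cheapest eligible model" rule, and the prices are all hypothetical, chosen only to show how each routed call can be tied back to a feature and a dollar amount.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    name: str                 # hypothetical model identifier
    capability: int           # hypothetical tier: higher handles harder requests
    usd_per_1k_tokens: float  # illustrative price, not a real rate


@dataclass
class CostRouter:
    models: list[Model]
    audit_log: list[dict] = field(default_factory=list)

    def route(self, feature: str, required_capability: int, tokens: int) -> Model:
        """Pick the cheapest model meeting the capability floor and log its cost."""
        eligible = [m for m in self.models if m.capability >= required_capability]
        if not eligible:
            raise ValueError(f"no model meets capability {required_capability}")
        chosen = min(eligible, key=lambda m: m.usd_per_1k_tokens)
        cost = tokens / 1000 * chosen.usd_per_1k_tokens
        # One audit entry per call, so every dollar maps to a feature.
        self.audit_log.append(
            {"feature": feature, "model": chosen.name, "tokens": tokens, "cost_usd": cost}
        )
        return chosen

    def cost_by_feature(self) -> dict[str, float]:
        """Sum logged cost per feature, e.g. for cost-per-transaction reporting."""
        totals: dict[str, float] = defaultdict(float)
        for entry in self.audit_log:
            totals[entry["feature"]] += entry["cost_usd"]
        return dict(totals)


if __name__ == "__main__":
    router = CostRouter([Model("small", 1, 0.5), Model("large", 3, 10.0)])
    router.route("fraud_check", required_capability=3, tokens=2000)
    router.route("receipt_summary", required_capability=1, tokens=4000)
    print(router.cost_by_feature())
```

In this sketch the audit log is an in-memory list. A production system would presumably write entries to durable storage and use actual billed token counts rather than estimates.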
Outcome
Inference cost fell 58% while throughput increased. For the first time, the team could answer a board question about cost per transaction.