When you charge for AI, you need a unit that makes sense to users and to your product.
Raw token counts are useful infrastructure details, but they are a poor product-facing billing model. Users care about the action they took, the value they got, and what it cost — not input tokens, output tokens, or model-specific rates. That is why AI products should meter events, not tokens.
Tokens are implementation details
Input and output tokens matter to providers and to backend cost calculations. They make a poor pricing language for end users, and exposing them directly creates friction:
- pricing becomes harder to explain
- usage feels unpredictable
- changing models affects customer-facing billing
- your product experience mirrors provider complexity
Users do not ask "How many output tokens did that summary consume?" They ask "How much did that summary cost?" That difference matters.
The unit should match the product
Product teams typically want to charge for actions users understand:
- a chat reply
- an image generation
- a document summary
- an agent run
- a workflow step
Those are the moments where value is created. So those are the moments your billing system should be built around. Define product-native units instead of exposing provider-native cost units:
```
chat.reply     = 8 credits
image.generate = 40 credits
doc.summarize  = 12 credits
agent.run      = 25 credits
```
Meter events, not token math
The event-based model
- users buy credits
- your app triggers AI actions
- you meter those actions as events
- credits are deducted from the wallet
- balances update in real time
You send an event name and a credit amount. Chargly records the event, deducts the balance, and keeps the wallet state in sync. Stripe handles top-ups when users need more. Your code operates at the level of product events instead of raw provider usage.
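The flow above can be sketched in code. This is a minimal in-memory illustration, not Chargly's real API: the `CREDIT_COSTS` map, the `Wallet` shape, and the `meterEvent` helper are all assumed names for the purpose of the example.

```typescript
// Hypothetical sketch of event-based metering. Names and numbers are
// illustrative; a real integration would call the billing service instead
// of mutating local state.
const CREDIT_COSTS: Record<string, number> = {
  "chat.reply": 8,
  "image.generate": 40,
  "doc.summarize": 12,
  "agent.run": 25,
};

interface Wallet {
  userId: string;
  balance: number; // credits
}

// Meter one event: look up its credit cost and deduct it from the wallet.
// Throws if the event is unknown or the balance is insufficient.
function meterEvent(wallet: Wallet, event: string): Wallet {
  const cost = CREDIT_COSTS[event];
  if (cost === undefined) throw new Error(`Unknown event: ${event}`);
  if (wallet.balance < cost) throw new Error("Insufficient credits");
  return { ...wallet, balance: wallet.balance - cost };
}

// Example: a user with 100 credits sends a chat reply, then generates an image.
let wallet: Wallet = { userId: "user_123", balance: 100 };
wallet = meterEvent(wallet, "chat.reply");     // 100 - 8 = 92
wallet = meterEvent(wallet, "image.generate"); // 92 - 40 = 52
```

Note that the application code never mentions tokens: the only units in play are product events and credits.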
Choosing event granularity: a decision framework
There is no single correct answer. The right level depends on your product. Here is a practical framework:
| Approach | When to use | Example |
|---|---|---|
| Per-action | Clear feature boundaries; users interact with discrete features | chat.reply = 8 credits, image.generate = 40 credits |
| Per-model | Users explicitly choose between models in the product | GPT-4 actions cost more than GPT-3.5; premium image models cost more |
| Flat-rate | Simplicity matters more than fine-grained accuracy | Every AI call = 1 credit; every workflow step = 3 credits |
Practical rule: Start with per-action pricing unless users explicitly choose between models in the product.
Per-action sits at the best level of abstraction: above raw provider cost math, below vague bundle pricing, and close to the features users actually interact with. It also gives you room to evolve — you can change providers, adjust internal cost structures, or optimize model choices without forcing users to relearn the pricing model.
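When users do choose models explicitly, per-model pricing can layer on top of per-action events rather than replace them. A hedged sketch, with assumed event names, tier names, and multipliers:

```typescript
// Hypothetical per-model multipliers on top of a base per-action cost.
// Event names, tier names, and numbers are illustrative assumptions.
const BASE_COST: Record<string, number> = {
  "chat.reply": 8,
  "image.generate": 40,
};

const MODEL_MULTIPLIER: Record<string, number> = {
  standard: 1, // default model tier
  premium: 3,  // user explicitly selected a premium model
};

// Credit cost for an action, scaled by the model tier the user picked.
function creditCost(event: string, modelTier: string = "standard"): number {
  const base = BASE_COST[event];
  const mult = MODEL_MULTIPLIER[modelTier];
  if (base === undefined || mult === undefined) {
    throw new Error(`Unknown event or model tier: ${event}, ${modelTier}`);
  }
  return base * mult;
}

creditCost("chat.reply");            // 8
creditCost("chat.reply", "premium"); // 24
```

The pricing unit is still the product event; the model choice only adjusts the number attached to it.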
A simple mental model
- Providers bill in tokens
- Products bill in actions
- Users buy credits
Those layers should not all be the same thing. When they are collapsed into one, pricing gets harder to explain and manage. When they are separated cleanly, the system becomes easier to operate.
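The separation can be made concrete: token math lives in a backend margin check, while the user-facing side only ever sees a fixed credit price. All rates and numbers below are illustrative assumptions, not real provider or Chargly pricing.

```typescript
// Backend layer: providers bill in tokens. Compute the raw provider cost
// for one action (per-token rates are assumed for illustration).
function providerCostUsd(inputTokens: number, outputTokens: number): number {
  const INPUT_RATE = 0.00001;  // $ per input token (assumed)
  const OUTPUT_RATE = 0.00003; // $ per output token (assumed)
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

// Product layer: the user-facing price is a fixed credit amount per action,
// independent of the token math above.
const CHAT_REPLY_CREDITS = 8;
const USD_PER_CREDIT = 0.01; // what users effectively paid per credit (assumed)

// Internal check: verify the credit price covers the token cost,
// without tokens ever appearing in customer-facing billing.
function marginUsd(inputTokens: number, outputTokens: number): number {
  return CHAT_REPLY_CREDITS * USD_PER_CREDIT
    - providerCostUsd(inputTokens, outputTokens);
}
```

If a model change shifts the token cost, only `providerCostUsd` and the margin analysis change; the 8-credit price users see stays put.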
How Chargly fits in
Chargly is built around the event-based model: wallets hold balances, events represent billable actions, deductions happen in real time, and developers define credit costs at the level their users actually understand. The goal is not to pretend model costs do not exist — it is to keep them in your backend logic and pricing decisions, not in your customer-facing UX.
If you are defining billable actions, start by deciding which user-visible events in your product deserve a credit cost. Then make the model consistent, understandable, and easy to buy into.
Next steps:
- Events & Metering docs — how Chargly handles event-based billing
- Live demo — see the flow in practice
- Introducing Chargly — why we built a credit-first billing layer
- Why AI monetization needs pricing intelligence — pricing intelligence and Pricing Advisor