When you charge for AI, you need a unit that makes sense to users and to your product.
Raw token counts are useful infrastructure details, but they are a poor product-facing billing model. Users care about the action they took, the value they got, and what it cost — not input tokens, output tokens, or model-specific rates. That is why AI products should meter events, not tokens.
Tokens are implementation details
Input and output tokens matter to providers and to backend cost calculations. They make a poor pricing language for end users, and exposing them directly creates friction:
- pricing becomes harder to explain
- usage feels unpredictable
- changing models affects customer-facing billing
- your product experience mirrors provider complexity
Users do not ask "How many output tokens did that summary consume?" They ask "How much did that summary cost?" That difference matters.
The unit should match the product
Product teams typically want to charge for actions users understand:
- a chat reply
- an image generation
- a document summary
- an agent run
- a workflow step
Those are the moments where value is created. So those are the moments your billing system should be built around. Define product-native units instead of exposing provider-native cost units:
```
chat.reply     = 8 credits
image.generate = 40 credits
doc.summarize  = 12 credits
agent.run      = 25 credits
```
Meter events, not token math
The event-based model
- users buy credits
- your app triggers AI actions
- you meter those actions as events
- credits are deducted from the wallet
- balances update in real time
You send an event name and a credit amount. Chargly records the event, deducts the balance, and keeps the wallet state in sync. Stripe handles top-ups when users need more. Your code operates at the level of product events instead of raw provider usage.
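The flow above can be sketched in code. This is a minimal in-memory illustration, not Chargly's real API: the `CREDIT_COSTS` map, the `Wallet` shape, and the `meterEvent` helper are all assumed names for the purpose of the example.

```typescript
// Hypothetical sketch of event-based metering. Names and numbers are
// illustrative; a real integration would call the billing service instead
// of mutating local state.
const CREDIT_COSTS: Record<string, number> = {
  "chat.reply": 8,
  "image.generate": 40,
  "doc.summarize": 12,
  "agent.run": 25,
};

interface Wallet {
  userId: string;
  balance: number; // credits
}

// Meter one event: look up its credit cost and deduct it from the wallet.
// Throws if the event is unknown or the balance is insufficient.
function meterEvent(wallet: Wallet, event: string): Wallet {
  const cost = CREDIT_COSTS[event];
  if (cost === undefined) throw new Error(`Unknown event: ${event}`);
  if (wallet.balance < cost) throw new Error("Insufficient credits");
  return { ...wallet, balance: wallet.balance - cost };
}

// Example: a user with 100 credits sends a chat reply, then generates an image.
let wallet: Wallet = { userId: "user_123", balance: 100 };
wallet = meterEvent(wallet, "chat.reply");     // 100 - 8 = 92
wallet = meterEvent(wallet, "image.generate"); // 92 - 40 = 52
```

Note that the application code never mentions tokens: the only units in play are product events and credits.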
Choosing event granularity: a decision framework
There is no single correct answer. The right level depends on your product. Here is a practical framework:
| Approach | When to use | Example |
|---|---|---|
| Per-action | Clear feature boundaries; users interact with discrete features | chat.reply = 8 credits, image.generate = 40 credits |
| Per-model | Users explicitly choose between models in the product | GPT-4 actions cost more than GPT-3.5; premium image models cost more |
| Flat-rate | Simplicity matters more than fine-grained accuracy | Every AI call = 1 credit; every workflow step = 3 credits |
Practical rule: Start with per-action pricing unless users explicitly choose between models in the product.
Per-action sits at the best level of abstraction: above raw provider cost math, below vague bundle pricing, and close to the features users actually interact with. It also gives you room to evolve — you can change providers, adjust internal cost structures, or optimize model choices without forcing users to relearn the pricing model.
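When users do choose models explicitly, per-model pricing can layer on top of per-action events rather than replace them. A hedged sketch, with assumed event names, tier names, and multipliers:

```typescript
// Hypothetical per-model multipliers on top of a base per-action cost.
// Event names, tier names, and numbers are illustrative assumptions.
const BASE_COST: Record<string, number> = {
  "chat.reply": 8,
  "image.generate": 40,
};

const MODEL_MULTIPLIER: Record<string, number> = {
  standard: 1, // default model tier
  premium: 3,  // user explicitly selected a premium model
};

// Credit cost for an action, scaled by the model tier the user picked.
function creditCost(event: string, modelTier: string = "standard"): number {
  const base = BASE_COST[event];
  const mult = MODEL_MULTIPLIER[modelTier];
  if (base === undefined || mult === undefined) {
    throw new Error(`Unknown event or model tier: ${event}, ${modelTier}`);
  }
  return base * mult;
}

creditCost("chat.reply");            // 8
creditCost("chat.reply", "premium"); // 24
```

The pricing unit is still the product event; the model choice only adjusts the number attached to it.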
A simple mental model
- Providers bill in tokens
- Products bill in actions
- Users buy credits
Those layers should not all be the same thing. When they are collapsed into one, pricing gets harder to explain and manage. When they are separated cleanly, the system becomes easier to operate.
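The separation can be made concrete: token math lives in a backend margin check, while the user-facing side only ever sees a fixed credit price. All rates and numbers below are illustrative assumptions, not real provider or Chargly pricing.

```typescript
// Backend layer: providers bill in tokens. Compute the raw provider cost
// for one action (per-token rates are assumed for illustration).
function providerCostUsd(inputTokens: number, outputTokens: number): number {
  const INPUT_RATE = 0.00001;  // $ per input token (assumed)
  const OUTPUT_RATE = 0.00003; // $ per output token (assumed)
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

// Product layer: the user-facing price is a fixed credit amount per action,
// independent of the token math above.
const CHAT_REPLY_CREDITS = 8;
const USD_PER_CREDIT = 0.01; // what users effectively paid per credit (assumed)

// Internal check: verify the credit price covers the token cost,
// without tokens ever appearing in customer-facing billing.
function marginUsd(inputTokens: number, outputTokens: number): number {
  return CHAT_REPLY_CREDITS * USD_PER_CREDIT
    - providerCostUsd(inputTokens, outputTokens);
}
```

If a model change shifts the token cost, only `providerCostUsd` and the margin analysis change; the 8-credit price users see stays put.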
How Chargly fits in
Chargly is built around the event-based model: wallets hold balances, events represent billable actions, deductions happen in real time, and developers define credit costs at the level their users actually understand. The goal is not to pretend model costs do not exist — it is to keep them in your backend logic and pricing decisions, not in your customer-facing UX.
If you are defining billable actions, start by deciding which user-visible events in your product deserve a credit cost. Then make the model consistent, understandable, and easy to buy into.
Next steps:
- Events & Metering docs — how Chargly handles event-based billing
- Live demo — see the flow in practice
- Introducing Chargly — why we built a credit-first billing layer
- Why AI monetization needs pricing intelligence — pricing intelligence and Pricing Advisor