PMAI PM Playbook

AI cost model

Core template

Use this to make unit economics visible before launch. Update it monthly after launch. If you can't fill this out, you don't understand your product's economics.

Upstream: model requirements and cost constraints from the AI PRD. Downstream: cost per task feeds into the launch gate checklist and observability plan.

Inputs

ParameterValueSource/assumption
Model / providere.g., Claude Sonnet via Anthropic API
Input coste.g., $3.00 / 1M input tokens
Output coste.g., $15.00 / 1M output tokens
Avg input tokens per taske.g., 2,000 tokensmeasured or estimated
Avg output tokens per taske.g., 500 tokensmeasured or estimated
Retrieval cost per taske.g., $0.002 for embedding lookup
Cache hit ratee.g., 30% of requests use cached prompt
Cache savings ratee.g., 90% saved on cached input tokens, if cache reads cost 10% of base input price
Tasks per user per daye.g., 8
Active users per customere.g., 5
Working days per monthe.g., 22
Gross margin targete.g., 70%
Pricing assumptione.g., $50/user/month

Calculated costs

Cost per task

Input cost:   [avg input tokens] x [input price per token]   = $___
Output cost:  [avg output tokens] x [output price per token] = $___
Retrieval:    [retrieval cost per task]                       = $___
Cache saving: [input cost] x [cache hit rate] x [cache savings rate] = -$___

Cost per task = $___

Cost per user per month

[cost per task] x [tasks per user per day] x [working days] = $___

Cost per customer per month

[cost per user per month] x [active users per customer] = $___

Break-even usage

[pricing assumption] x (1 - [gross margin target]) = max total COGS per user per month
max total COGS / [cost per task] = max tasks per user per month (if AI is the only COGS)
max tasks / [working days] = max tasks per user per day

Sensitivity analysis

ScenarioChanged assumptionCost per taskCost per customer/monthStill viable?
BaselineAs above$___$___Yes
Heavy user (3x tasks)Tasks per user = 3x baseline
Longer outputs (2x)Avg output tokens = 2x baseline
No cachingCache hit rate = 0%
Model price increaseInput/output cost +50%

Agentic cost multiplier

FactorMultiplier
Tool calls per taske.g., 3-5 calls average
Retries on failuree.g., 1.2x
Context accumulatione.g., 2x by step 5
Total agentic multipliere.g., 8x single-turn baseline

Multi-model routing

Task typeModelCost per task% of traffic
e.g., classification, simple extractione.g., Haiku
e.g., generation, complex reasoninge.g., Sonnet
Blended cost per task

Worked example (illustrative)

Numbers below are illustrative and will vary by provider and model. Run 20 representative tasks through your model to get real token counts before using this for decisions.

Scenario: support copilot that drafts responses from knowledge base articles.

ParameterValue
ModelClaude Sonnet at $3/1M input, $15/1M output
Avg input tokens per task2,000 (system prompt + KB context + ticket)
Avg output tokens per task400 (draft response)
Retrieval cost per task$0.002 (embedding lookup)
Cache hit rate30%
Cache savings rate90% saved on cached input tokens
Tasks per user per day25
Active users per customer10
Working days per month22
Pricing$50/user/month
Gross margin target70%

Cost per task:

Input cost:    2,000 / 1M x $3.00                              = $0.0060
Cache saving:  $0.0060 x 0.30 x 0.90                           = -$0.0016
Output cost:   400 / 1M x $15.00                                = $0.0060
Retrieval:                                                        $0.0020

Cost per task = $0.0124

Cost per user per month: $0.0124 x 25 x 22 = $6.82

Cost per customer per month: $6.82 x 10 = $68.20

Break-even: at a 70% gross margin target, max total COGS per user = $50 x 0.30 = $15/month. AI cost is $6.82, leaving $8.18 for other COGS (infrastructure, support, etc.). AI cost alone is 14% of revenue.

Sensitivity: if heavy users run 75 tasks/day (3x), AI cost per user rises to $20.46 — exceeds the $15 COGS ceiling on its own, requiring a pricing tier for high-volume users or model routing to a cheaper tier for simple tasks.

Agentic multiplier: if this were an agent making 4 tool calls per task with 1.3x context growth, the per-task cost would be roughly $0.0124 x 4 x 1.3 = $0.065. The total agentic multiplier in practice varies by workflow — the table above uses 8x as a placeholder because retries and growing context across multi-step chains compound beyond the per-call estimate.

Notes