Readiness assessment (YAML)
Sales Call CRM Assistant
readiness-assessment.yamlYAML
product:
name: "Sales Call CRM Assistant"
stage: "prototype"
owner: "Sarah Chen, VP Product"
target_users:
- "Enterprise Account Executives"
- "SDRs using Gong/Chorus with Salesforce"
use_case:
problem: "Sales reps spend 45 min/day on post-call CRM updates. 30% of fields are empty or outdated. Pipeline forecasting suffers."
ai_job: "Extract structured CRM fields from sales call transcripts and present them for rep approval before writing to Salesforce."
non_ai_alternative: "Reps manually update Salesforce from memory and notes after each call."
expected_outcome: "Reduce time spent updating the five target fields by at least 50% during pilot. Increase field completion from 70% to 95% on those fields."
dimensions:
problem_fit:
score: 4
evidence:
- "[T1] Time study across 23 reps confirmed 45 min/day average on CRM entry"
- "[T1] CRM audit of 1,200 opportunities showed 30% empty rate on key fields"
- "[T1] Forecast accuracy at 62%, below 80% target, linked to data quality"
risks:
- "Reps may not adopt if review UX adds too much friction."
owner: "Product"
next_action: "Monitor pilot adoption. If review takes >3 min per call, redesign UX."
workflow_fit:
score: 4
evidence:
- "[T3] Integrates with existing Gong and Salesforce stack. No new tools required."
- "[T3] Rep review step designed with accept/edit/reject per field."
- "[T2] Fits existing post-call workflow: reps already open Salesforce after each call."
risks:
- "If reviewing is slower than manual entry, reps will skip the tool."
owner: "Product + Design"
next_action: "Usability test review flow with 5 reps before pilot."
ai_job_definition:
score: 4
evidence:
- "[T3] AI job is specific: extract 5 fields, present with evidence, require approval."
- "[T3] Input and output contracts defined with schemas."
- "[T3] Autonomy level is suggest-only for all fields."
risks:
- "Scope pressure to add auto-write or more fields will come fast."
owner: "Product"
next_action: "Hold scope through pilot. Expand only after eval coverage catches up."
data_readiness:
score: 2
evidence:
- "[T1] Gong API provides transcripts. Salesforce API access confirmed."
- "[T1] Consent flag exists in Gong metadata."
risks:
- "18% of calls lack consent flag. Cannot process those calls."
- "Transcript quality varies with audio quality and cross-talk. No quality filtering in place."
- "No automated consent verification. Manual verification only."
owner: "Engineering + Legal"
next_action: "Audit consent flag coverage for pilot cohort. Build automated consent check. Implement transcript quality filtering."
eval_readiness:
score: 3
evidence:
- "[T3] 200-example golden set designed. 50 examples labeled by two senior AEs."
- "[T3] Quality rubric defined with pass/fail criteria."
- "[T1] Automated evidence verification built."
- "[T1] Prototype eval on 50 examples shows 87% field accuracy."
risks:
- "Only 25% of eval set complete. Not enough coverage of poor transcripts."
owner: "ML Lead"
next_action: "Complete remaining 150 examples. Ensure stratification across transcript quality levels."
system_behavior:
score: 2
evidence:
- "[T1] Prompt pipeline built and tested on 50 examples."
- "[T1] Confidence scoring implemented but not calibrated."
- "[T1] Transcript quality detection prototyped on 12 transcripts."
risks:
- "Confidence scores not validated against actual accuracy. May mislead reps."
- "Quality scoring tested on only 12 transcripts. Needs 50+ to calibrate."
- "Latency not validated at scale. No load testing done."
owner: "Engineering"
next_action: "Calibrate confidence scoring against ground truth. Calibrate quality scoring on 50 manually-rated transcripts. Load test pipeline at 2x expected usage."
risk_and_safety:
score: 3
evidence:
- "[T3] Consent flow designed. Fabrication detection automated."
- "[T3] Privacy controls and data retention policy specified."
- "[T3] Legal review of consent requirements completed."
risks:
- "Consent flow not built yet. Manual verification only."
- "Over-trust risk if reps auto-approve without reading."
owner: "Engineering + Legal"
next_action: "Build consent flow. Add review time tracking to detect auto-approval."
regulatory_readiness:
score: 3
evidence:
- "[T3] Legal review of recording consent requirements completed."
- "[T3] Privacy controls and data retention policy specified."
- "[T1] Consent flag exists in call metadata and can gate processing."
risks:
- "Automated consent verification flow not built."
- "Retention and export controls for pilot transcripts not verified in staging."
- "Consent flag coverage is incomplete for 18% of calls."
owner: "Engineering + Legal"
next_action: "Build automated consent verification and verify transcript retention/export controls before production."
cost_and_business_case:
score: 4
evidence:
- "[T1] $0.03 per call in testing, under $0.05 target."
- "[T3] 12-rep pilot at ~$32/month. Full 38-rep team at ~$100/month. 200-rep production at ~$528/month."
- "[T3] API cost is low enough to test whether extracting five target fields measurably reduces CRM update effort."
risks:
- "Total time savings are unproven because v1 excludes stage, amount, and close-date updates."
owner: "Product + Finance"
next_action: "Monitor cost per call during pilot. Set alert at $0.08."
observability:
score: 2
evidence:
- "[T3] Logging spec complete covering extraction, rep actions, and quality metrics."
- "[T3] Dashboard mockups reviewed by sales ops."
- "[T3] Alerting rules defined for acceptance rate, fabrication, and latency."
risks:
- "Dashboards not built. Cannot observe pilot without them."
- "Logging in dev only, not staging or production."
- "No alerting configured. Fabrication events would go undetected."
owner: "Engineering"
next_action: "Build dashboards and deploy logging to staging before pilot start. Configure alerting for fabrication and acceptance rate."
launch_and_operations:
score: 3
evidence:
- "[T4] Pilot cohort identified: 12 reps from Enterprise team."
- "[T3] Rollback plan documented (disable per rep or globally in <5 min)."
- "[T3] Pilot success criteria defined."
risks:
- "Rep training not done. Training materials drafted but not reviewed."
owner: "Product + Sales Enablement"
next_action: "Complete training materials. Schedule 30-min training session before pilot."
recommendation:
level: "pilot_candidate"
weighted_score: 3.07
rationale: "The weighted score of 3.07 falls in the pilot candidate range. The problem fit, AI job definition, and suggest-only workflow justify preparing a controlled 12-rep pilot, but launch-critical gaps in data readiness (2/5), system behavior (2/5), and observability (2/5) must close before the pilot starts."
blockers_before_pilot:
- "Define and execute consent verification for the pilot cohort."
- "Deploy observability dashboards and alerting before pilot start."
- "Calibrate confidence scoring and transcript quality scoring on 50+ examples each."
- "Train pilot cohort on review workflow."
- "Verify transcript retention and export controls for pilot data."
blockers_before_production:
- "Automated consent verification flow not built."
- "Eval set only 25% complete."
- "Confidence scoring not calibrated against ground truth."
- "Rep training not conducted."
- "Observability dashboards and alerting not deployed."
- "Transcript retention and export controls not verified in staging."
- "Load testing not completed."
alternative_recommendation: "If consent flag coverage cannot reach 95%+, consider restricting to calls from the Gong-native dialer where consent is automatically flagged, rather than processing all call sources."