/scorecard
The AI-Native Team Scorecard.
AI adoption is not tool adoption. It is workflow redesign.
A 5-layer diagnostic for CTOs and engineering leaders evaluating whether AI is creating leverage or hidden review debt. Score each layer 1, 3, or 5. The total tells you whether to scale AI, fix a layer first, or compound advantage.
/five layers · score 1 · 3 · 5
layer 01
whether work is clear enough for ai-assisted execution
Task Clarity
diagnostic: Can an engineer or agent tell what 'done' means before writing code?
1· weak
Tasks are vague, context is scattered, and AI output needs heavy interpretation.
3· developing
Some templates exist, but acceptance criteria and ownership vary by team.
5· strong
Work has crisp context, constraints, acceptance criteria, and an owner before AI touches it.
red flag: AI produces plausible work that solves the wrong problem.
next move: Create task and spec templates for AI-assisted workflows.
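One way to make 'done' checkable before any code is written is a small structured spec. A minimal sketch in Python; the field names are illustrative assumptions, and in practice the template might live in an issue form or a YAML file instead:

```python
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    """Minimal spec an engineer or agent reads before writing code.
    Field names are illustrative, not a standard."""
    context: str                    # why this work exists, links to prior decisions
    constraints: list[str]          # e.g. "no new dependencies", "p95 < 200ms"
    acceptance_criteria: list[str]  # observable checks that define 'done'
    owner: str                      # one accountable human reviewer
    out_of_scope: list[str] = field(default_factory=list)

    def ready_for_ai(self) -> bool:
        # Gate: don't hand work to an agent until 'done' is defined and owned.
        return bool(self.context and self.acceptance_criteria and self.owner)
```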
layer 02
whether the team can absorb increased ai-generated output
Review Capacity
diagnostic: Does AI increase throughput faster than your review system can absorb it?
1· weak
Senior reviewers are overloaded and AI increases review debt.
3· developing
Review norms exist, but risk-based routing is inconsistent.
5· strong
Review is tiered by risk, ownership, architecture impact, and test confidence.
red flag: More code merges while senior judgment becomes the bottleneck.
next move: Design review lanes for low-risk, medium-risk, and architecture-risk changes.
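A sketch of what risk-based routing can look like, assuming hypothetical inputs (changed paths, a test-confidence signal) and illustrative thresholds; in a real setup this logic often lives in CODEOWNERS rules, PR labels, or CI:

```python
def review_lane(changed_paths: list[str], touches_public_api: bool,
                test_confidence: float) -> str:
    """Route a change to a review lane by risk.
    Path patterns and thresholds are illustrative assumptions."""
    ARCHITECTURE_PATHS = ("migrations/", "auth/", "billing/")  # hypothetical
    if touches_public_api or any(p.startswith(ARCHITECTURE_PATHS) for p in changed_paths):
        return "architecture-risk"  # senior / design review required
    if test_confidence < 0.8:
        return "medium-risk"        # standard peer review
    return "low-risk"               # lightweight or automated review
```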
layer 03
whether tests catch weak assumptions, not only syntax
Test Quality
diagnostic: Would your test suite catch a confident but wrong AI-generated change?
1· weak
Tests are thin, flaky, or mostly happy-path.
3· developing
Core tests exist, but AI-generated edge cases are not systematically checked.
5· strong
Tests, evals, and fixtures are designed around likely AI failure modes.
red flag: AI output passes checks but fails real usage or edge cases.
next move: Add failure-mode tests and evals for AI-assisted changes.
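What a failure-mode test can look like, as a pytest-style sketch. `parse_duration` is a hypothetical AI-generated helper, and the inputs target confident-but-wrong behavior the happy path would never exercise:

```python
import pytest

from myapp.utils import parse_duration  # hypothetical AI-generated helper

# Happy-path tests would pass even if these cases silently misbehave.
@pytest.mark.parametrize("raw", ["", "  ", "-5m", "999999999h", "5 minutes"])
def test_rejects_ambiguous_or_hostile_input(raw):
    with pytest.raises(ValueError):
        parse_duration(raw)

def test_does_not_guess_units():
    # A confident model often assumes seconds; the assumed spec says
    # units are required, so a bare number must be rejected.
    with pytest.raises(ValueError):
        parse_duration("30")
```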
layer 04
whether production feedback improves ai-assisted delivery
Incident Learning
diagnostic: When AI-assisted work fails, does the system get smarter?
1· weak
Incidents are handled case-by-case with little process learning.
3· developing
Retros happen, but learning rarely updates prompts, specs, tests, or review rules.
5· strong
Incidents update specs, tests, playbooks, ownership, and AI usage guidance.
red flag: The same failure mode repeats across AI-assisted work.
next move: Turn incidents into reusable workflow rules.
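One lightweight way to make an incident update the workflow rather than a memory: record it as a rule that changes the artifacts the next AI-assisted task will touch. A sketch with hypothetical fields and an invented example incident:

```python
# A reusable workflow rule distilled from an incident review.
# Keys are illustrative; the point is that every incident must update
# at least one of: specs, tests, review rules, or AI usage guidance.
incident_rule = {
    "incident": "timeout retries double-charged customers",
    "failure_mode": "AI-generated retry lacked an idempotency check",
    "updates": {
        "spec_template": "add 'idempotency' to the constraints checklist",
        "tests": "add a duplicate-delivery failure-mode test",
        "review_rules": "payment retries route to the architecture-risk lane",
        "ai_guidance": "prompts for retry logic must state the idempotency requirement",
    },
}
```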
layer 05
whether leaders can see real adoption quality, not just tool activity
Leadership Visibility
diagnostic: Can a CTO see whether AI is improving the engineering system or hiding work?
1· weak
Leadership tracks seats, usage, or anecdotes.
3· developing
Some output metrics exist, but quality and buyer impact are unclear.
5· strong
Leaders see adoption, review debt, trust, cycle time, and business impact together.
red flag: AI adoption looks good in dashboards but bad in team load.
next move: Create an AI-native adoption scorecard for leadership review.
/interpreting your total · out of 25
- 5–10 · Tool ahead of workflow
AI tool adoption is ahead of workflow readiness. The team is creating leverage on paper and review debt in practice.
next move: Pause scale-up. Fix the weakest layer first.
- 11–17 · Mixed signal
Pockets of leverage and pockets of hidden rework. Some teams are compounding; others are silently slowing down.
next move: Run targeted workflow redesign around the two lowest-scoring layers.
- 18–22 · Credible foundation
The team has a credible AI-native foundation. The system absorbs AI output without losing trust.
next move: Scale with stronger measurement and role clarity.
- 23–25 · Compounding
The team is ready to compound AI-native advantage. AI is improving the engineering system, not papering over it.
next move: Turn practices into playbooks, assets, and a leadership operating rhythm.
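The arithmetic, spelled out: five layers scored 1, 3, or 5 give a total between 5 and 25, mapped to the bands above. A minimal sketch, with illustrative layer keys:

```python
def interpret(scores: dict[str, int]) -> str:
    """scores: one of {1, 3, 5} per layer, e.g.
    {"task_clarity": 3, "review_capacity": 1, "test_quality": 3,
     "incident_learning": 3, "leadership_visibility": 1}  -> total 11"""
    assert len(scores) == 5 and all(s in (1, 3, 5) for s in scores.values())
    total = sum(scores.values())
    if total <= 10:
        return "tool ahead of workflow: pause scale-up, fix the weakest layer"
    if total <= 17:
        return "mixed signal: redesign around the two lowest-scoring layers"
    if total <= 22:
        return "credible foundation: scale with measurement and role clarity"
    return "compounding: turn practices into playbooks and an operating rhythm"
```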
/what to do with this · audit · sprint · message
→ If your team scores 5–17
I run an AI Engineering Productivity Audit that maps where AI is creating leverage versus review debt across exactly these 5 layers, and recommends the operating changes that unblock the next layer.
Email about the Audit →
→ If your team scores 18–22
The 30-Day Agentic Coding Rollout Sprint operationalizes the next layer for you (review lanes, evals, leadership visibility) without slowing delivery.
Email about the Sprint →
→ If you want the carousel + PDF version
Comment 'scorecard' on the launch post on LinkedIn or DM me directly. I'll send the carousel, the 1-page PDF, and a shorter version you can run in a leadership meeting.
Message on LinkedIn →
Which layer is AI amplifying in your team right now: task clarity, review capacity, tests, incident learning, or leadership visibility?