Delivery Flow Metrics

The Delivery Flow tracks work items (features, bugs, chores, incidents) from creation through to production deployment. Use these metrics to understand team velocity, code review efficiency, and delivery pipeline performance.

Grain: One row per feature/bug/chore/incident issue
Refresh: Every 60 minutes
Primary Use Case: Team velocity, cycle time analysis, and delivery efficiency

The Delivery Flow provides comprehensive metrics for:

  • Issue lifecycle from creation to closure
  • PR cycle time and code review efficiency
  • Time from merge to production deployment
  • Work type breakdown (planned vs unplanned)

All duration measures track specific phases of the delivery lifecycle.

Issue Duration

What it measures: Total time an issue was open.

Aspect       Value
Start Point  Issue created
End Point    Issue closed
Unit         Days
NULL when    Issue not yet closed

Available aggregations:

  • avgIssueDurationDays - Average time issues are open
  • medianIssueDurationDays - Median time (recommended)

Interpretation:

  • Measures overall cycle time from work identification to completion
  • Includes all time: waiting, development, review, and deployment
  • High values indicate blocked work or scope creep

Typical ranges:

  • Quick fixes: 1-3 days
  • Standard features: 1-2 weeks
  • Complex features: 2-4 weeks
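The two aggregations can be sketched in plain Python. The records and field names (`created_at`, `closed_at`) below are illustrative, not the warehouse schema:

```python
from datetime import date
from statistics import mean, median

# Hypothetical issue records; closed_at is None while an issue is open.
issues = [
    {"created_at": date(2024, 1, 1), "closed_at": date(2024, 1, 3)},
    {"created_at": date(2024, 1, 2), "closed_at": date(2024, 1, 16)},
    {"created_at": date(2024, 1, 5), "closed_at": None},
]

# Open issues contribute NULL, i.e. they are excluded from the aggregates.
durations = [
    (i["closed_at"] - i["created_at"]).days
    for i in issues
    if i["closed_at"] is not None
]

avg_issue_duration_days = mean(durations)       # mean of [2, 14] -> 8
median_issue_duration_days = median(durations)  # median of [2, 14] -> 8.0
```

The same NULL-skipping shape applies to every duration measure on this page.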

Time to First Response

What it measures: How quickly issues receive initial attention.

Aspect       Value
Start Point  Issue created
End Point    First response (comment, assignment, etc.)
Unit         Days
NULL when    No response recorded

Available aggregations:

  • avgTimeToFirstResponseDays - Average response time

Interpretation:

  • Team responsiveness indicator
  • Low values indicate good triage processes
  • High values may indicate understaffing or poor notifications

Issue to PR

What it measures: Time to start coding after issue creation.

Aspect       Value
Start Point  Issue created
End Point    First PR created
Unit         Days
NULL when    No PR created yet

Available aggregations:

  • avgIssueToPrDays - Average time to start coding
  • medianIssueToPrDays - Median time (recommended)

Interpretation:

  • Measures pickup time - how quickly work gets started
  • Includes any waiting time before development begins
  • Low values indicate good work prioritization

Typical ranges:

  • Fast teams: 0-1 days
  • Standard: 2-5 days
  • Backlog heavy: 1+ weeks

PR Cycle Time

What it measures: Time from PR creation to merge.

Aspect       Value
Start Point  First PR created
End Point    First PR merged
Unit         Days
NULL when    No PR merged yet

Available aggregations:

  • avgPrCycleTimeDays - Average PR duration
  • medianPrCycleTimeDays - Median (recommended)

Interpretation:

  • Code review efficiency metric
  • Includes review time, feedback cycles, and CI/CD runs
  • High values indicate review bottlenecks

Typical ranges:

  • High-performing: < 1 day
  • Standard: 1-3 days
  • Needs improvement: > 1 week

PR to Review

What it measures: Time waiting for first review.

Aspect       Value
Start Point  PR created
End Point    First review submitted
Unit         Days
NULL when    No review yet

Available aggregations:

  • avgPrToReviewDays - Average wait time for review

Interpretation:

  • Review queue indicator
  • High values indicate reviewer bottleneck
  • Target: Same day or next business day

PR to Approval

What it measures: Time from PR creation to first approval.

Aspect       Value
Start Point  PR created
End Point    First approval received
Unit         Days
NULL when    No approval yet

Available aggregations:

  • avgPrToApprovalDays - Average time to approval

Interpretation:

  • Total review cycle including feedback iterations
  • Gap between “to review” and “to approval” = feedback cycle time
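That gap can be read straight off the two measures; a toy example with assumed values:

```python
# Hypothetical averages, in days; variable names mirror the measures.
avg_pr_to_review_days = 0.5     # wait for the first review
avg_pr_to_approval_days = 1.75  # wait for the first approval

# The difference is the time spent iterating on review feedback.
feedback_cycle_days = avg_pr_to_approval_days - avg_pr_to_review_days  # 1.25
```

Here reviewers pick PRs up within half a day, but feedback rounds add another day and a quarter before approval.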

Merge to Deploy

What it measures: Time from code merge to production deployment.

Aspect       Value
Start Point  First PR merged
End Point    First successful production deployment
Unit         Days
NULL when    No successful production deployment

Available aggregations:

  • avgMergeToDeployDays - Average deployment pipeline time
  • medianMergeToDeployDays - Median (recommended)

Interpretation:

  • Deployment pipeline efficiency
  • Low values indicate good CI/CD practices
  • High values may indicate manual deployment gates

Typical ranges:

  • Continuous deployment: < 1 hour (shown as < 0.04 days)
  • Daily deployments: < 1 day
  • Weekly releases: 3-7 days

Total Lead Time

What it measures: Complete time from issue creation to successful production deployment.

Aspect       Value
Start Point  Issue created
End Point    First successful production deployment
Unit         Days
NULL when    No successful production deployment

Available aggregations:

  • avgTotalLeadTimeDays - Average end-to-end lead time
  • medianTotalLeadTimeDays - Median (recommended)
  • p90TotalLeadTimeDays - 90th percentile (for planning)

Interpretation:

  • Most comprehensive delivery metric
  • Combines all phases: waiting + development + review + deployment
  • P90 useful for setting customer expectations

Typical ranges:

  • Elite teams: 1-3 days
  • High-performing: 1-2 weeks
  • Standard: 2-4 weeks
  • Needs improvement: 1+ months
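The three aggregations answer different questions, which a small sketch with made-up lead times illustrates (note that `statistics.quantiles` with `n=10` returns deciles, the last of which is the p90):

```python
from statistics import mean, median, quantiles

# Hypothetical end-to-end lead times for ten issues, in days.
lead_times = [1, 2, 2, 3, 3, 4, 5, 7, 14, 30]

avg_lead = mean(lead_times)                 # 7.1 (pulled up by the outliers)
med_lead = median(lead_times)               # 3.5 (the typical issue)
p90_lead = quantiles(lead_times, n=10)[-1]  # 28.4 (bound for expectations)
```

The mean nearly doubles the median here, which is why the median is recommended for tracking and the p90 for promising dates to customers.
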

Issue counts by type:

Measure        Description
count          Total issues
featureCount   Feature issues
bugCount       Bug issues
choreCount     Chore/maintenance issues
incidentCount  Incident issues

Planned vs unplanned work:

Measure                Description
plannedWorkCount       Features + Chores (planned work)
unplannedWorkCount     Bugs + Incidents (reactive work)
plannedWorkPercentage  % of work that was planned

Interpretation:

  • Healthy teams: 70-80% planned work
  • High unplanned work: May indicate quality issues or understaffing
  • 100% planned: May indicate ignoring bugs/incidents
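The percentage is a straightforward ratio; a sketch with hypothetical monthly counts:

```python
# Hypothetical monthly issue counts by type.
feature_count, chore_count = 18, 6  # planned work
bug_count, incident_count = 7, 1    # unplanned work

planned_work_count = feature_count + chore_count   # 24
unplanned_work_count = bug_count + incident_count  # 8
planned_work_percentage = 100 * planned_work_count / (
    planned_work_count + unplanned_work_count
)  # 75.0, inside the healthy 70-80% band
```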

Track code change size to identify review complexity:

Measure               Description
avgPrLinesChanged     Average lines (additions + deletions) per PR
medianPrLinesChanged  Median lines changed
avgPrChangedFiles     Average files modified per PR

PR Size Categories (dimension prSize):

  • xs - Extra small (< 50 lines)
  • s - Small (50-200 lines)
  • m - Medium (200-500 lines)
  • l - Large (500-1000 lines)
  • xl - Extra large (> 1000 lines)

Best practice: Aim for smaller PRs (xs, s, m). Large PRs have longer review cycles and higher defect rates.
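A minimal sketch of the bucketing, assuming additions + deletions as the line count; whether each boundary (50, 200, 500, 1000) falls in the lower or upper bucket is an assumption:

```python
def pr_size(lines_changed: int) -> str:
    """Bucket a PR by lines changed, mirroring the prSize dimension.

    Boundary handling (e.g. whether exactly 200 lines is "s" or "m")
    is an assumption, not documented behavior.
    """
    if lines_changed < 50:
        return "xs"
    if lines_changed < 200:
        return "s"
    if lines_changed < 500:
        return "m"
    if lines_changed < 1000:
        return "l"
    return "xl"

print([pr_size(n) for n in (10, 120, 450, 800, 2500)])
# ['xs', 's', 'm', 'l', 'xl']
```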

Track work that originated from discovery research:

Measure                  Description
fromDiscoveryCount       Issues created from validated discoveries
fromDiscoveryPercentage  % of work from discovery process

Interpretation:

  • Measures how much work flows through the discovery process
  • Higher percentages indicate more research-driven development

Track delivery pipeline completion rates:

Measure                        Description
withPrCount                    Issues with at least one PR
withMergedPrCount              Issues with merged PR
withProductionDeploymentCount  Issues deployed to production
withSuccessfulDeploymentCount  Issues with successful deployment

Interpretation:

  • Drop-off between stages indicates bottlenecks
  • Example: High withPrCount but low withMergedPrCount = review bottleneck
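Drop-off between stages can be computed from the counts directly; a sketch with invented funnel numbers:

```python
# Hypothetical pipeline-completion counts for one team over one quarter.
funnel = {
    "withPrCount": 40,
    "withMergedPrCount": 22,
    "withSuccessfulDeploymentCount": 20,
}

# Report the drop-off between consecutive stages.
stages = list(funnel.items())
for (prev_name, prev), (name, count) in zip(stages, stages[1:]):
    lost = prev - count
    print(f"{prev_name} -> {name}: lost {lost} ({100 * lost / prev:.0f}%)")
```

Here the 45% drop from `withPrCount` to `withMergedPrCount` points at a review bottleneck, while merged work almost always reaches production.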

Filter delivery metrics by:

Dimension                Description
provider                 Issue tracking provider
type                     feature, bug, chore, incident
state                    Current issue state
prSize                   xs, s, m, l, xl
isPlannedWork            Features and chores
isUnplannedWork          Bugs and incidents
originatedFromDiscovery  Came from discovery process

Status flags:

  • isIssueComplete - Issue is closed
  • hasPr - Has at least one PR
  • hasMergedPr - Has merged PR
  • hasProductionDeployment - Deployed to production
  • hasSuccessfulProductionDeployment - Successfully deployed

Via Joins:

  • Projects.id / Projects.name - Filter by project
  • Teams.id / Teams.name - Filter by team
  • Users.id / Users.name - Filter by issue author

Team velocity:
- count (throughput)
- medianTotalLeadTimeDays (cycle time)
- featureCount vs bugCount (work mix)

Code review efficiency:
- medianPrCycleTimeDays (overall review time)
- avgPrToReviewDays (time to first review)
- avgPrToApprovalDays (time to approval)
- medianPrLinesChanged (PR size)

Delivery pipeline performance:
- medianMergeToDeployDays (pipeline speed)
- withMergedPrCount vs withSuccessfulDeploymentCount (completion rate)

Work mix and quality:
- plannedWorkPercentage (target: 70-80%)
- bugCount trend over time
- incidentCount (should be low)

Q: What’s the difference between “Issue Duration” and “Total Lead Time”?

Issue Duration measures how long the issue stayed open (created → closed). Total Lead Time measures time to production (created → deployed). An issue can close before deployment or deploy before closing.
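A quick illustration of how the two metrics diverge on the same (hypothetical) issue:

```python
from datetime import date

# Hypothetical issue that was deployed to production before it was closed.
created = date(2024, 3, 1)
deployed = date(2024, 3, 8)
closed = date(2024, 3, 10)

issue_duration_days = (closed - created).days     # 9: created -> closed
total_lead_time_days = (deployed - created).days  # 7: created -> deployed
```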

Q: Why measure PR Cycle Time separately from Total Lead Time?

PR Cycle Time isolates the code review phase. If Total Lead Time is high but PR Cycle Time is low, the bottleneck is elsewhere (pickup time or deployment pipeline).

Q: What’s a good target for PR Cycle Time?

Industry benchmarks suggest < 24 hours for high-performing teams. However, context matters - security-critical code may require longer reviews.

Q: Why split planned vs unplanned work?

This split reveals team health. High unplanned work (bugs, incidents) indicates quality issues. Teams should track this ratio over time.

Q: Some issues show 0 Total Lead Time - is that accurate?

If a PR deploys on the same day the issue was created, the lead time rounds to 0 days. This indicates very fast delivery.