DevOps Insights with Team Comparatives
This article summarizes our analysis of the parameters that govern a team's DevOps efficiency. It is based on an experiment with real, publicly available data from a popular DevOps service. The comparison is not intended for individual-to-individual evaluation; it highlights team-level patterns to foster a healthy culture.
Scenario
The focus on digitization has made code the center of the universe. As a consequence, every team aspires to do more with what is available. However, to make any improvement, we first need to baseline our current performance. One way to baseline and improve is to look at your immediate environment (groups or teams) and make a relative comparison. The ability to compare teams based on parameters captured implicitly through their activities can provide rich insights that address the following questions.
What is the smallest change we can make to gain maximum efficiency?
Data can help identify teams that are operationally very similar to ours but far more successful in terms of outcomes. It can also help identify the changes we should make to attain the same efficiency.
What changes do we need to make to get closer to the team we aspire to be?
Every team aspires to perform better than it currently does, and almost all teams have role models, either within their organization or outside it. Data can help teams identify how far they are from their role models and what changes they should make to operate at the desired state.
Data
- Publicly available GitHub data
- 371 repositories selected as active and good candidates for observation
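The article does not spell out the selection criteria for the 371 repositories. A minimal sketch of what an "active repository" filter might look like follows; the thresholds, field names, and the definition of "active" are all assumptions for illustration, not the actual criteria used:

```python
from datetime import datetime, timedelta

def is_candidate(repo, now, max_idle_days=90, min_contributors=3):
    """Return True if a repo looks active enough to observe.

    `repo` is assumed to carry a GitHub-style ISO-8601 `pushed_at`
    timestamp, a contributor count, and an `archived` flag. All
    thresholds here are hypothetical.
    """
    pushed_at = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    recently_active = (now - pushed_at) <= timedelta(days=max_idle_days)
    collaborative = repo["contributors"] >= min_contributors
    return recently_active and collaborative and not repo["archived"]
```

In practice, such fields would come from the GitHub REST API's repository objects; the filter above only shows the shape of the decision.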
Analysis
- We selected 12 parameters for each team (e.g. PR response time, issue response time, PR inflow, PR outflow, new contributors added, new activity in the repo, etc.)
- We then mapped these 12 parameters into a two-dimensional space to see whether teams cluster into High, Medium, and Low score groups. Teams were categorized based on popularity parameters, some explicit (e.g. forks, stars) and some derived
- We found 3 distinct clusters in the data corresponding to these categories (High | Medium | Low), as shown below. The Red (High) cluster includes teams that are very popular in the GitHub community
- A deeper drill-down into the clusters based on Popularity, Collaboration, and Code metrics showed that while some teams performed better in one area, they lagged in another, and as a result their overall score remained in the Medium or Low category
Score Based on Code Related Metrics
Score Based on Collaboration Related Metrics
Score Based on Popularity/Social Metrics
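The mapping and clustering step described above can be sketched roughly as follows. The article does not name the exact techniques used, so the standardization, PCA projection, and k-means with k=3 below are illustrative assumptions, and the metric values are random placeholders rather than the real team data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data: one row per team, 12 metrics each (the real study
# used 371 teams with metrics like PR response time, PR inflow, etc.).
rng = np.random.default_rng(0)
X = rng.normal(size=(371, 12))

# Standardize so metrics on different scales contribute comparably,
# project 12-D -> 2-D for visualization, then look for three groups.
X_std = StandardScaler().fit_transform(X)
coords = PCA(n_components=2).fit_transform(X_std)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(coords)
```

With real metrics, `coords` would be the 2-D scatter and `labels` the High / Medium / Low cluster assignment for each team.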
- A further drill-down into PR-based metrics for two selected teams (Left and Right) showed that one team regulates its debt periodically, while for the other the debt increases continuously
Conclusion
Our goal was a proof of concept: to see whether we could baseline and differentiate teams based on certain parameters and provide actionable insights. The above analysis supports our hypothesis. So far we have relied only on derived parameters; however, there is huge scope for bringing in implicit parameters (parameters that are not directly evident but emerge from behavior). We plan to build on this in the future.
Contributors
Divya Vaishnavi | Harish Kumar Agarwal | Vinod Joshi