I think the agile community needs to change how it measures success for agile teams. The ways we gather metrics, and the information we seek from those metrics, are actually getting in the way of what’s most important: making working software.

By forcing individual metrics we sometimes discourage team collaboration. By focusing too intently on other metrics we can actually skew the very thing we’re trying to measure, thus defeating the purpose.

The way I see it, there are two major problems:

The observer effect: The observer effect states that observing a process can impact its output. For instance, telling a team that you’ll be keeping a close eye on their velocity might cause that team to overestimate their items in order to increase their velocity. This is especially dangerous when working with story points since there’s no way to compare the validity of an estimate.

This image was taken from here.

While the above comic has probably happened at some point, it’s not my favorite example of the observer effect at work. Let’s talk about a support person I knew a long time ago. We’ll call him “Jason” since that was his name. Now Jason was a great tech: he helped others on particularly difficult calls, he solved problems correctly, generally on the first call, and he got great feedback from customers. The problem was that Jason’s call times were too long, and this particular metric was very important to management. A few meetings and a review later, it was made clear that Jason HAD to get his times down or look for another job. Fast forward a few weeks and Jason was in the top 5 of the entire support group for call times. How did he do it? He wouldn’t tell anyone for the longest time, until one day I came in early and there was Jason, an hour before his shift, picking up calls and immediately hanging up.

Here’s the interesting thing: Jason wouldn’t have done something like that if his call times hadn’t been treated as more important than his actual performance. Measuring his call times negatively affected his output. Moreover, this was a bad metric to begin with. Even without extreme examples like Jason, we’ve all been on a call with a tech support agent who just wants to get us off the line. The question is, what calls are your teams hanging up on to make their numbers?

The streetlight effect: The streetlight effect is our human tendency to look for answers where it’s easy to look rather than where the actual information is. For instance, counting the lines of code produced is easy but doesn’t tell us anything about the quality of the application, the functionality it provides or even the effectiveness.

This image was taken from here.

I recall some time ago I was working on a team that made multiple products, each with different quality standards. The thing was that “Product A” had much more difficult quality standards than “Product B”, “Product C” or “Product D”, which wouldn’t have been too big a problem except that management had decided that quality would be a big deal when the next review came around.

The thing is, something like “Quality” is a bit of a nebulous concept and it’s not really easy to measure. Error rate, however, is much easier to measure, so error rate is what got measured. Thus anybody who found themselves working on “Product A”, with its higher quality standards, would be at a bigger disadvantage come review. So who ended up doing that work? Interns mostly, temps and contractors when they were around, and anybody else.

As it turns out, even though error rate was easy to measure, it didn’t tell us anything valuable, since the number of errors produced depended more on the product than on the employee. Instead we drove away several good new hires, lost a customer, and lowered morale for the whole team, since their job became less about building and more about avoiding errors.

Now both of these examples take place outside of software development, so let’s apply these concepts to some common “Agile” metrics you might be familiar with. What’s easy to measure?

Unit Tests written: Most agile developers write a lot of unit tests; test-driven development creates even more tests (both of which create better quality code). So measuring a developer’s productivity by the number of tests they create must be good! Actually, the observer effect kills this one dead. Telling a developer that they’ll be measured on the number of tests they write ensures they’ll create many tests with no respect to the quality of those tests. Our goal is not to ship tests; our goal is to ship working code. I’ll take fewer better tests than more crappy tests any day.

Individual Velocity: Once again the observer effect makes this a bad metric. If a developer knows he’s being individually graded on his performance and also knows that he only gets credit for the things he specifically works on then he’s actively discouraged from contributing to the group. He’s placed in the very un-agile situation of competing with his team rather than contributing to it.

In a perfect world an agile team is collaborating, interacting, discussing and reviewing almost everything they do. This is a good thing for building quality software and solving problems fast, but this level of interaction makes it nigh impossible to separate a person’s individual productivity from the group’s. So don’t try. You’ll simply hurt your team’s ability to make good software.

Team Velocity: This is one of the most misunderstood metrics in all of Scrum. A team’s velocity is unique to them. It simply can’t be compared to another team. Let’s say that team A estimates a certain amount of work at 50 pts. for a sprint and team B estimates that same work at 150 pts. for the same sprint. Now if both teams finish their sprint successfully then team A has a velocity of 50 pts. and team B has a velocity of 150 pts. Which team is more productive? Neither. They both did the same amount of work.

This metric is particularly evil because it encourages teams to fudge the numbers on their estimates, which can affect the team’s ability to plan their next sprint. If the team can’t properly plan a sprint then that puts your entire release in danger of shipping late.

For more about your Scrum team’s velocity, you can check out an earlier blog post I wrote.
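To make the team A and team B comparison concrete, here is a minimal Python sketch of the arithmetic. The function name share_of_work_completed is my own and purely illustrative; the point values are the ones from the example above.

```python
def share_of_work_completed(velocity: float, estimate: float) -> float:
    """Fraction of the sprint's estimated work that was actually finished."""
    return velocity / estimate


# Same work, same sprint, two different point scales (numbers from the text).
team_a = share_of_work_completed(velocity=50, estimate=50)
team_b = share_of_work_completed(velocity=150, estimate=150)

# The raw velocities differ by a factor of three...
velocity_ratio = 150 / 50

# ...but each team finished everything it forecast, so the ratio only
# reflects how each team sizes its points, not how much it produced.
print(team_a, team_b, velocity_ratio)
```

Both teams come out at the same share of work completed, which is why the raw velocity numbers can’t be used to rank them.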

Okay smart guy, what metrics should we use?
Glad you asked. We measure productivity by the working software we deliver. We measure actual output rather than contributing factors. This approach is more Agile because it frees the team to build software in whatever way best contributes to their success rather than whatever way creates better metric scores. It’s also much more logical, since working software is something that we can literally take to the bank (after it’s been sold, of course).

So what are the actual new metrics?

Value Delivered: You’ll need your product owner for this. Ask him to give each user story a value that represents its impact on his stakeholders. You can express this as an actual dollar amount or as an arbitrary number of some kind. At the end of each sprint you’ll have a number that tells you how much value you’ve delivered to your customers through the eyes of the product owner.

This metric does not measure performance. Instead, it measures impact. Ideally your product owner will prioritize higher value items towards the top of the backlog and thus each sprint will deliver the maximum value possible. If you’re working on a finite project with a definite end in sight, your sprints will start out very high value and gradually trend towards delivering less and less value as you get deeper into the backlog. At some point, the cost of development will eclipse the potential value of running another sprint. That’s typically a good time for the team to switch to a new product.
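As a rough illustration of how this could be tallied, here is a small Python sketch. Everything in it is an assumption made for the example: the Story class, the function names, the story values, and the per-sprint cost are all hypothetical, and the stopping check only makes sense if value and cost are expressed in the same unit (dollars, say).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    name: str
    value: float  # assigned by the product owner (hypothetical units)
    done: bool    # True if the story was delivered in the sprint


def value_delivered(stories):
    """Sum the product owner's values for the stories actually delivered."""
    return sum(s.value for s in stories if s.done)


def first_sprint_not_worth_running(sprint_values, sprint_cost):
    """Return the 1-based number of the first sprint whose value falls
    below the cost of running it, or None if every sprint pays for itself."""
    for number, value in enumerate(sprint_values, start=1):
        if value < sprint_cost:
            return number
    return None


# One sprint: two stories delivered, one not (made-up values).
sprint = [
    Story("checkout", 80, True),
    Story("search", 40, True),
    Story("export", 15, False),
]
delivered = value_delivered(sprint)

# Value per sprint trending down as the team gets deeper into the backlog.
trend = [120, 90, 60, 35, 20]
stop_at = first_sprint_not_worth_running(trend, sprint_cost=40)
print(delivered, stop_at)
```

With these made-up numbers the sprint delivers 120 units of value, and the fourth sprint is the first one that costs more than it returns.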

On Time Delivery: People sometimes tell me that agile adoption failed at their company because they couldn’t give definite delivery dates to their clients. I don’t buy this. One thing that an agile team should definitely be able to do is deliver software by a certain date. It’s possible that a few stories may not be implemented, but those are typically the lowest value stories that would have the least impact on the client. That being said, a team’s velocity should be reasonably steady; if it goes up or down it should do so gradually. Wild swings in velocity from sprint to sprint make long term planning harder to do.

Here’s the metric: if a team forecasts 5 stories for an upcoming sprint and they deliver 5 stories then they earn 2 points toward this metric. If they deliver 4 stories or they deliver less than 2 days early (pick your own number here) then they earn one point. If they deliver more than 2 days early or they only deliver 3 (out of 5) stories they earn no points. At the end of a quarter or the end of a release or the end of the year the team will be judged by how accurately they can forecast their sprints.
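The scoring rule above could be sketched in Python roughly as follows. The names SprintResult and forecast_points are hypothetical, and the text leaves a few cases open that I have had to guess at: here a sprint finished exactly on the threshold (2 days early) earns one point, any shortfall of two or more stories earns zero, and delivering more stories than forecast is not penalized. Treat those choices as assumptions to tune, not as part of the metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintResult:
    forecast: int        # stories the team forecast for the sprint
    delivered: int       # stories actually delivered
    days_early: int = 0  # days before the sprint end that the team finished


def forecast_points(sprint, early_threshold_days=2):
    """Score one sprint: 2 = forecast met, 1 = slightly off, 0 = well off."""
    # Assumption: only a shortfall counts as a miss, not over-delivery.
    missed = max(0, sprint.forecast - sprint.delivered)
    if missed >= 2 or sprint.days_early > early_threshold_days:
        return 0
    if missed == 1 or sprint.days_early > 0:
        return 1
    return 2


# A hypothetical quarter of six sprints, each forecast at 5 stories.
quarter = [
    SprintResult(5, 5),
    SprintResult(5, 4),
    SprintResult(5, 5, days_early=1),
    SprintResult(5, 3),
    SprintResult(5, 5, days_early=3),
    SprintResult(5, 5),
]
earned = sum(forecast_points(s) for s in quarter)
possible = 2 * len(quarter)
print(f"{earned} of {possible} points")
```

Comparing the points earned against the points possible over a quarter or a release gives a single number for how accurately the team forecasts its sprints.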

So what we’re measuring is value delivered to the customer and on-time delivery of that software, which are the only two real metrics you can literally cash checks with.

Source : https://www.infoq.com/articles/not-destroy-team-metrics