
Lessons Learned in Defect Triage


By Michael Kelly
Date: Feb 23, 2009

Michael Kelly shares experiences from a project team that made some sweeping changes to its development process, with significant improvements (and a few missteps) along the way.

On a recent project, I joined a team that was struggling to deliver reliable software on time. The team consisted of around 20 developers and testers, mostly collocated, and was going through a lot of changes at the time I came on board. The biggest change was a switch to Scrum as the team's development process. In the new process, the programmers and testers would work from two sources: new features would be developed from the product backlog, and defects and issues would be worked from the defect triage process.

Changing the development process meant changing many of the processes around it, and we knew immediately that we needed to tackle the way in which we managed defects. We had a mix of different project databases, different ticket workflows, and different release schedules, with no formal way to track it all. In short, we had a mess.

Major, sweeping process changes, while ideal in a situation like this, can be painful and time-consuming. We didn't have the time to stop, restructure everything, and train everyone on a new process. We had a large number of commitments, some of which were already behind schedule, and we had to keep everyone focused on delivery. Therefore, we decided to start planning small incremental changes that we could roll out over time. While that was going on, we would implement defect triage meetings on a project-by-project basis. This article describes how we made those defect triage meetings effective.

Getting the Right People in the Room


Two primary factors drove our team's triage process:

- Ticket priority (How important is it?)
- Ticket severity (How much pain does it cause?)

From time to time, estimates for how long a fix might take could play a role. Other factors also came into play on occasion: a deadline commitment, a dependency on another issue, the availability of a resource with specialized knowledge, work already taking place on similar or related issues, and so on.

To define priorities and severities efficiently, we needed to get the right people in the room. For this team, those people consisted of the following:

- The various project managers for each of our clients
- The technical leadership for the development team
- Select representatives from the quality assurance team

One area of struggle was in finding representation for some of our "faceless" stakeholders: internal end users and the technical operations department. Both of those teams had a hard time getting regular attendees at the meetings. From time to time, we would ask quality assurance team members to act as representatives for those missing stakeholders.

The goal in selecting the audience is to keep it small (no more than 5-10 people), but with enough people to make the right decisions. We also wanted to build a team of people who understood the deadlines and commitments, knew how to assess severity and impact, and could estimate at a high level the work needed to handle an issue. The team we assembled would determine which issues were fixed first, and it had to have enough credibility and authority to ensure that decisions made at the meetings were carried out.

Once we knew who the right people were, we set a meeting schedule that worked for everyone. If you hold meetings too often, people won't attend. If meetings are too infrequent, the team won't be effective. We decided to meet four days a week, alternating our meetings between client-facing issues and platform-facing issues. Because each of these sets of issues pulled a slightly different audience, some people needed to meet four times a week, and others needed to attend only two meetings a week. We also found that scheduling the meeting at the same time each day reduced confusion.
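If you track these two factors in a ticket system or even a spreadsheet export, it helps to keep them as separate fields so the triage ordering stays explicit. Here is a minimal Python sketch of that idea; the level names, ticket keys, and ranking rule are illustrative assumptions, not our actual ticket schema:

from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    # How important is it? Lower values are triaged first.
    BLOCKER = 1
    CRITICAL = 2
    MAJOR = 3
    MINOR = 4
    TRIVIAL = 5

class Severity(IntEnum):
    # How much pain does it cause?
    HIGH = 1
    MEDIUM = 2
    LOW = 3

@dataclass
class Ticket:
    key: str
    summary: str
    priority: Priority
    severity: Severity

# Order a pile of tickets for discussion: priority first, then severity.
tickets = [
    Ticket("CLIENT-101", "Login fails for SSO users", Priority.CRITICAL, Severity.HIGH),
    Ticket("PLAT-17", "Typo on the settings page", Priority.TRIVIAL, Severity.LOW),
    Ticket("CLIENT-77", "Nightly export times out", Priority.CRITICAL, Severity.MEDIUM),
]
for t in sorted(tickets, key=lambda t: (t.priority, t.severity)):
    print(t.key, t.priority.name, t.severity.name, "-", t.summary)

Sorting on (priority, severity) only gives a first-cut discussion order; the people in the room still decide the final ranking.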

Managing the Process


Once we had the right people in the room, the next step was to figure out how we were all going to work together, finding a way to balance the client-facing issues and the core platform issues.

The first thing we did was work to get consistency across project databases. We needed all the projects to use the same workflows, track releases in the same way, and use automated "top ten" lists that were populated from the different projects. Those top-ten lists became the primary view of relative priority across projects. For example, we created a client top ten and a platform top ten. The client top ten looked across all the client-facing projects, and the platform top ten looked across all the technical-facing projects that might affect internal operations or multiple clients.

With the lists set up, the next step was to clear out the work in the developers' personal queues. Because a developer might have a number of tickets assigned without regard to relative priority, we asked everyone to pool any tickets not actively being worked. Then, when each developer was ready for the next ticket, he or she would simply pull from a top-ten list, regardless of which individual project would normally get his or her focus.

Once a ticket is in a top-ten list, you have to think about where it goes next. The same team that populated the top-ten lists also planned how those tickets got out to production. One technique we found helpful was to tie the priority of a ticket to the release schedule. We started with some simple heuristics: a blocker issue equated to a hotfix, a critical issue would go in the next scheduled release, and everything else (major, minor, trivial, and so on) would be worked and slated for release as resources allowed.

The team that reviewed the tickets on a regular basis was also the team that managed any issues resulting from investigation. They coordinated getting work done across teams. They also determined which tickets qualified for rejection or should be returned for more information. By escalating issues as needed, this group helped resolve blocking issues for the people who were working the tickets.
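As a rough illustration of the pooling idea and the release heuristics (not the actual tooling we used; the project names, priority labels, and ranking rule below are all made up), a minimal Python sketch might look like this:

# Pool unassigned tickets from several project databases into one top-ten list,
# then map each ticket's priority to a release target using simple heuristics.
PRIORITY_RANK = {"blocker": 0, "critical": 1, "major": 2, "minor": 3, "trivial": 4}

def release_target(priority):
    # Blockers go out as hotfixes, criticals ride the next scheduled release,
    # everything else is slated as resources allow.
    if priority == "blocker":
        return "hotfix"
    if priority == "critical":
        return "next scheduled release"
    return "as resources allow"

def build_top_ten(project_queues, limit=10):
    # project_queues maps a project name to its pooled, unassigned tickets.
    pooled = [t for queue in project_queues.values() for t in queue]
    pooled.sort(key=lambda t: PRIORITY_RANK[t["priority"]])
    return pooled[:limit]

client_projects = {
    "client-alpha": [{"key": "ALPHA-12", "priority": "critical"},
                     {"key": "ALPHA-30", "priority": "minor"}],
    "client-beta":  [{"key": "BETA-4", "priority": "blocker"}],
}

for ticket in build_top_ten(client_projects):
    print(ticket["key"], ticket["priority"], "->", release_target(ticket["priority"]))

The point of the sketch is the shape of the data flow: each project database feeds one pooled, priority-ordered list, and the list, rather than individual assignments, is what developers pull from.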

Getting the Message Out


Regardless of how you decide to manage your issues (top-ten lists, assigning tickets out to individual developers, brute-force spreadsheets, and so on), everyone needs to know how you're going to communicate what the issues are and what progress is being made. You need to deliver information that's simple to understand and difficult to ignore. I got in the habit of sending daily email messages to the product development team, summarizing the top-ten lists and ticket progress.

Another important aspect to communicate is the upcoming releases. We had a number of small production releases each week, and a large release every couple of months. The development staff found it helpful to know the upcoming key dates. On a whiteboard central to all the product development team locations, we kept a calendar showing the next two months. On the calendar, we posted key dates such as initial release candidates, code freeze, customer user-acceptance testing (UAT) windows, and release dates. At a glance, a developer would know not only what the issues were, but when they needed to be completed.
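For the daily summary itself, any plain-text report short enough to read in an inbox will do. A minimal Python sketch of that kind of formatter follows; the ticket fields, statuses, and date labels are illustrative, not the format of the messages I actually sent:

from datetime import date

def daily_summary(list_name, tickets, key_dates):
    # Build a plain-text summary of a top-ten list plus upcoming key dates.
    lines = [f"Defect triage summary for {date.today():%Y-%m-%d} - {list_name}"]
    for i, t in enumerate(tickets, start=1):
        lines.append(f"{i:2}. {t['key']:<10} {t['status']:<12} {t['summary']}")
    lines.append("Upcoming dates:")
    for label, when in key_dates:
        lines.append(f"  {when}  {label}")
    return "\n".join(lines)

print(daily_summary(
    "Client top ten",
    [{"key": "ALPHA-12", "status": "in progress", "summary": "Nightly export times out"},
     {"key": "BETA-4", "status": "needs info", "summary": "Login fails for SSO users"}],
    [("code freeze", "2009-03-02"), ("client UAT window opens", "2009-03-09")],
))

Whether the result goes out by email, on a wiki page, or taped next to the whiteboard calendar matters less than sending it every day in the same shape.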

Using Metrics to Improve the Process


Once we began the triage process, we were able to collect data on how the process was working. We started by tracking where development time was spent. Each developer would log his or her time by ticket, which made it easy for us to see time spent by project, by priority, by ticket type, and by release. This feedback mechanism let us know which projects were getting the most attention and which were being slighted. We looked at the number of tickets created or resolved in a project, how many commits were done for a project, and the number of developers working tickets in a project. By examining where the activity was taking place, we were better able to understand where we might need to focus refactoring efforts, spend more attention on testing, or be more careful when pulling together final release information.

We also examined what we were not doing. We looked at the data to understand where the least amount of churn was. We looked at ticket aging reports. We looked at projects that didn't get many hours logged. In the short term, when the focus is on triage, neglecting some areas may make sense. But we didn't want to get trapped into short-term thinking, and we wanted to be sure that ignoring certain areas of the code or certain projects was a conscious decision, not an accident.

Since our development team was using Scrum, at the end of each sprint (well, most sprints) we also tried to provide details about which issues were resolved in that sprint. At the same time we showed off features, we tried to show how much work was done to resolve issues from production, or issues that got pulled into the sprint via the triage process. We provided process metrics to make sure that we were focused on the right problems.
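None of this analysis needs heavy tooling. As a sketch of the simplest version, assuming a time-tracking export with project, priority, and hours per entry (the records below are made up), a few lines of Python are enough to show where attention is and isn't going:

from collections import defaultdict

# Each entry is one developer's logged time against one ticket.
time_log = [
    {"project": "client-alpha", "priority": "critical", "hours": 6.0},
    {"project": "client-alpha", "priority": "minor",    "hours": 1.5},
    {"project": "platform",     "priority": "critical", "hours": 4.0},
    {"project": "client-beta",  "priority": "major",    "hours": 0.0},  # little attention this week
]

hours_by_project = defaultdict(float)
hours_by_priority = defaultdict(float)
for entry in time_log:
    hours_by_project[entry["project"]] += entry["hours"]
    hours_by_priority[entry["priority"]] += entry["hours"]

print("Hours by project:", dict(hours_by_project))
print("Hours by priority:", dict(hours_by_priority))

# Projects with little or no logged time deserve a conscious look,
# so that neglect is a decision rather than an accident.
neglected = [p for p, h in hours_by_project.items() if h < 2.0]
print("Possibly neglected:", neglected)

The same grouping works for ticket counts, commits, or aging; the value is in reviewing the numbers regularly, not in the script.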

Where to Go from Here


Our team and our process still aren't perfect, but we've learned a lot, and we'll continue to improve as we learn more. Sometimes we don't get the meeting attendance we want, but we've found that problem to be cyclical. People attend for a while and then stop coming. When their tickets are no longer getting attention (surprise, surprise), they attend again. As we work this plan, we continue to make small changes in the overall processes of software development and defect management. We try to use the information we gather to inform our decisions. As you think about your own defect-management process, I hope that some of these ideas help.

Daily Defect Meetings

Our daily defect meetings had the following format:

1. Review the current items in the top-ten list(s):
   a. Are any issues not making progress? Are the appropriate people involved in working the issue?
   b. Are there any blockers around working the issues that need to be escalated?
   c. Have circumstances changed in any way that would cause us to drop any of these issues from the list?
2. Review new items created in any of the projects that feed the list(s):
   a. Review the tickets created since the last meeting from each project database feeding the lists you're reviewing.
   b. If appropriate, update priority and severity as a team. Be sure to capture key comments about priority and severity in the ticket.
   c. Pull any new tickets into the list as appropriate.
3. Review the most critical outstanding existing items from the projects that feed the list(s):
   a. When there are no new issues to pull in, and the list has room for new items, look at the highest-priority tickets from each feeding project database.
   b. Ask attendees whether any tickets within their projects need focus but carry a priority that may not appropriately represent the urgency of the issue.
   c. Pull tickets into the list as appropriate, based on feedback and discussion by the team.
4. Discuss any other issues related to the projects that the team may need to know about.
5. If a top-ten list has more than 10 tickets, ask the team whether that's okay. Sometimes it is (related tickets, requirements for an upcoming release, and so on), but make sure that the team knows why the buffer was extended and is clear on what that means. Try not to let it happen too often. (A small sketch of automating the checks in items 1 and 5 follows this list.)
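Items 1 and 5 are easy to pre-compute before the meeting if your ticket system can export the list. Here is a minimal Python sketch, assuming each item carries a last-update date; the field names, dates, and three-day threshold are illustrative:

from datetime import date, timedelta

def triage_flags(top_ten, today, stale_after_days=3, limit=10):
    # Flag items with no recent update (agenda item 1) and note whether the
    # list has grown past its limit (agenda item 5).
    stale = [t["key"] for t in top_ten
             if today - t["last_update"] > timedelta(days=stale_after_days)]
    over_limit = len(top_ten) > limit
    return stale, over_limit

top_ten = [
    {"key": "ALPHA-12", "last_update": date(2009, 2, 20)},
    {"key": "BETA-4",   "last_update": date(2009, 2, 23)},
]
stale, over_limit = triage_flags(top_ten, today=date(2009, 2, 24))
print("Not making progress:", stale)      # -> ['ALPHA-12']
print("List over its limit:", over_limit)  # -> False

Flagging these before the meeting keeps the discussion on why an item is stuck rather than on discovering that it is.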
© 2009 Pearson Education, Inc. All rights reserved.
