Fog Creek

Refactoring to a Happier Development Team – Interview with Coraline Ada Ehmke

Looking for audio only? Listen on and subscribe


Up Next
Becoming the Leader Your Engineers Need

In this interview with Coraline Ada Ehmke, Lead Software Engineer at Instructure, we discuss data-driven refactoring and developer happiness teams. Coraline gives some great advice on the kinds of tests we should write for refactoring, tools to use and metrics to monitor, to make sure our refactoring is effective. We also learn about the role of refactoring in the Developer Happiness team at Instructure. You can read more from Coraline on her site.

Content and Timings

  • Introduction (0:00)
  • Useful Tests for Refactoring (0:57)
  • Data-driven Refactoring Metrics (4:13)
  • Winning Management Over to Refactoring (6:34)
  • The Developer Happiness Team (8:36)
  • Recommended Resources (13:58)

Transcript

Introduction

Derrick:
Coraline Ada Ehmke is Lead Engineer at Instructure. She is the creator of the Contributor Covenant, a code of conduct for open source projects, and Founder of OS4W, an organization encouraging greater diversity in open source projects. She speaks regularly at conferences about software development, including the talk ‘Data-Driven Refactoring’. Coraline, thank you so much for joining us today. Do you have any more to share about yourself?

Coraline:
I want to say I’ve been doing web development for about 20 years now, which is forever. I actually wrote my first website before there was a graphical browser to view it in. I started out in Perl, went into ASP for a little while, did some Java, and then discovered Ruby in 2007, and I haven’t looked back. All that experience has led me to be very opinionated about what makes good software. I have some very strong opinions that I’m always flexible about changing, but I have a sense of what good software looks like, and that sort of drives me in my daily work.


“A lot of problems come with code that’s just good enough”


Useful Tests for Refactoring

Derrick:
I want to dig into your experience with refactoring a bit. You say that without tests, you shouldn’t refactor. Why are tests so important to refactoring?

Coraline:
I think it was Michael Feathers who once said that ‘if you’re refactoring without testing, you’re just changing shit’. Basically, the importance of testing is you want to challenge and document what your assumptions are about the way that a piece of code works currently, which can be really difficult, because you might look at a method or a class name and assume its role based on its name, but that name could have drifted over time. You need to ensure that you’re documenting what you think it does and then testing to see that it does what you think it does. Testing can also help you stay within the guardrails through refactoring efforts. It’s hard to know where to stop sometimes, so tests can help you identify when you’re doing too much. You want to make sure you’re doing the minimum rework with the maximum results.

Derrick:
Yeah, that sounds great. It’s so easy with refactoring to just continue down that rabbit hole. What kind of tests are useful when refactoring?

Coraline:
It’s good to have good test coverage to start with, if it’s possible. If you don’t have good test coverage at the start, it’s good to build up some unit tests and integration tests just to make sure you’re staying on the path. There are a couple kinds of tests that I’m particularly interested in when it comes to refactoring. The first is boundary testing, which is basically using test cases to test the extremes of input. The maximum values, the minimum values, things that are just inside or outside of boundaries, typical values, and error values as well.

I like to do generative testing for this. If I have a method that takes an argument, I will create an enumerable with lots of different kinds of arguments in it, and see what passes and what fails. Those failures are really important because when you’re refactoring, you can’t know for sure that somewhere in the code base something is waiting for that method to fail or raise an exception. Your refactoring should not change the signature of that method such that a given piece of input stops causing a failure. You want to be really careful with that. Generative testing can help a lot with that, so there’s that with boundary testing.

Also, I like attribute testing, which is an outside-in way of testing. You check that after your code runs the object that’s being tested possesses or does not possess certain qualities. This can be really useful for understanding when you changed how an object is serialized or the attributes of an object without getting into the implementation details.

The most important tests, though, for refactoring are tests that you throw away. We’re hesitant. We love deleting code, but we hate deleting tests, because it gives us this feeling of insecurity. The sort of test that you do during refactoring doesn’t lend itself well to maintaining code over time. You’re going to be a lot more granular in your tests. You’re going to be testing things that are outside of the purview of normal integration or unit tests, and it’s important that you feel comfortable throwing those away when you’re done with a refactoring effort.

Derrick:
Yeah. The importance of being able to throw away stuff really gives you that perspective on what you’re trying to do.

Coraline:
And it lets you work incrementally, too. You can start by challenging your assumption about how a piece of code works. Once you feel confident, you can set up some boundary tests to see how it responds to strange conditions. Once you feel confident with that, then you’re probably confident enough to do some refactoring and make sure that you’re not introducing regressions.

Data-driven Refactoring Metrics

Derrick:
An important aspect of your approach to refactoring is data and measuring the impact of refactoring efforts over time. What are some useful metrics to track?

Coraline:
I want to emphasize that over time is really, really important, because if you’re not collecting data on your starting point and checkpoints along the way, you don’t know if you’re actually making your code base better or worse. It is possible through refactoring to make your code base worse. Some of the things I like to look at at the start are test execution times. If your test suite takes 45 minutes to run in CI, you’re not doing TDD. It’s impossible, because the feedback loop for your test has to be as short as possible. Refactoring tests is actually a great way to get started in trying to adjust that test execution time.

Another one that I’ll probably talk about later is the feature-to-bug ratio. If you look at your sprint planning, figure out how much time you’re spending on new features versus how much time you’re spending on bugs. You can actually use some tools to track down what areas of your code base are generating the most bugs, and combine that with churn to see what’s changing the most. Those are interesting.

I like to look at code quality metrics. A lot of people use Code Climate for tracking code quality, which I don’t really like that much, because it assigns letter grades to code. If a class has an ‘F’, you might say, “Oh, it needs to be rewritten,” or if a class has an ‘A’, you might think, “Oh, it’s perfect. No changes are necessary.” Code Climate doesn’t really give you the change over time. It’ll give you notification if a given class changes it’s grade. That’s great, but you can’t track that over time. I like some other code quality tools. Looking at complexity, there are tools that do assignment branch conditional algorithms to see how complex a piece of code is. I also like to look at coupling, because coupling is often a symptom of a need for refactoring, especially in a legacy application.

Derrick:
What mistakes do you see people making with refactoring?

Coraline:
Basically trying to do too much at once. It’s really easy to look at a piece of code and immediately make a value judgment, saying, “This is bad code,” which is really unfair to the programmers who came before, and saying, “I’m just going to rewrite the whole thing.” We have this instinct to burn it all down. Trying to change too much at one time is really a recipe for disaster. You want to keep the surface area of your changes as small as possible and work as incrementally as possible. Those things take discipline, and I think it’s hard for a lot of people to achieve that without actively policing themselves.


“The most important tests, though, for refactoring are tests that you throw away”


Winning Management Over to Refactoring

Derrick:
Often just getting the time to actually spend on refactoring can be a challenge. What are some benefits of refactoring that can win management over?

Coraline:
That’s a big, important question. I think in a healthy development organization, the development team, the engineers themselves, are stakeholders. They have some input into your backlog. If there’s technical debt that needs to be addressed, that can be prioritized. Not all of us live in a world where we have that kind of influence over the backlog. Some of the ways you can convince management that refactoring is a good idea… I talked before about the bug-to-feature ratio. Stakeholders want new features. New features keep products healthy and appealing to end users and to investors. Anything that stands in the way of a new feature getting out is necessarily a business problem.

If you look at your feature-to-bug-fix ratio and see that it’s poor, one way you can sell it is to say, “through this refactoring, we’re going to improve this ratio and be able to do more feature work.” You can also look at how long it takes to implement a new feature and talk about how refactoring a piece of code, especially if it’s code that changes very often, can make it easier to add new features to a particular area of the code base faster and easier.

In the end, if none of that works, just cheat. Just lie and pad your estimates, so you automatically build in some time for reducing technical debt with all of your estimates, which unfortunately I’ve had to use more often than I would have liked to. It is a valuable tactic to keep in your toolbox.

Derrick:
Yeah, definitely. I still really like… You mentioned this earlier, but the feature-to-bug-fix ratio is really important.

Coraline:
It really is a great health check for how the team overall is doing, what the code looks like and the health of the team, as well. If people are doing a lot of bug fixes, you’re probably going to have some unhappy engineers, because they want to work on new and shiny things. That can have an impact on how happy your developers are, which is really critical to how much work they’re getting done and how good your product is.

The Developer Happiness Team

Derrick:
Definitely. Let’s talk about happiness. You previously led what was known as the Developer Happiness or Refactoring team at Instructure. Tell us a little bit about the role and mission of that team.

Coraline:
It was one of my favorite jobs, honestly. We were charged with increasing the happiness and productivity of the engineering team. Within our charter was identifying processes or parts of the code base or even people who were standing in the way of developers being as productive as they could possibly be. Being very data driven in my approach, the first thing I wanted to do was get a sense of just how gnarly the code base was.

We ended up writing a bunch of code analysis tools. Again, we were using Code Climate, but I wasn’t really satisfied with its tracking ability over time. So we wrote a few tools. I wrote one called Fukuzatsu, which is an ABC complexity measurement tool. There was already a tool out there that was very similar called Flog that was written by Ryan Davis, but it’s an opinionated tool. It punishes you for things like metaprogramming, and it also favors frameworks like active record and punishes you for using alternative ORMs, so just in the nature of the bias that was built in. I wanted something that… I don’t want opinions with my data. I’m an engineer. I’m a professional. I’m capable of forming my own opinions. I just wanted raw data to work with, and that’s what Fukuzatsu gave me.

Another tool that we ended up building as part of this process at the very beginning was something called Society, which basically makes a social graph of the relationships between your classes. You get this nice circular diagram with afferent and efferent coupling displayed in different colors, and you see the links between different classes. That can help you identify service boundaries, for example, because if you have one class that’s a trunk of a lot of inbound or outbound connections, you might say, “That’s a good place for a service.”

A lot of the work we did early on was building tools. We built a mega-tool that collected data from all these other tools and presented them in a dashboard format with change shown over time. You could drill into complexity. You could drill into coupling. You could drill into code quality or code smells, or various other metrics, and see how they were changing at the commit level. That being done, we identified which areas were causing a lot of bugs, which areas of the code are really complex or really changing a lot.

The next step was creating a team of refactoring ambassadors. Rather than taking on the refactoring work ourselves and handing it over to the team that owned the code, our goal was to send in someone who’s really good at refactoring to work with that team to refactor the problematic code, which I think was pretty valuable in terms of ownership and continued success, and also training and teaching and helping those teams level up.

I think a lot of problems come with code that’s just good enough. I think we’ve all seen pull request bombs where someone had a bug to fix or had a feature to implement, and they went down a path that maybe was not optimal. By the time you actually saw it to do a code review, it was too late to suggest an alternative approach, because so much work had been put into it. You knew that you would just crush the spirit of whoever sent the PR. We end up with this lowest-common-denominator code. People also tend to copy and paste the approach that they’ve seen taken elsewhere in a class or elsewhere in a code base. Demonstrating to them that there’s a better way and there’s a better pattern you can follow is really important to maintaining code quality. That was the overall mission of the team.

Derrick:
Is that kind of focussed team something you’d recommend others try? What kind of impact did it have on factors like code quality and the developer happiness?

Coraline:
I think people appreciated that we were paying attention to engineering’s needs. We’d send out surveys to gauge how satisfied people were with the code that they were writing, with CI, with the test suite, with the overall process. We tracked it over time. I think engineers want to be listened to outside of the scope of just, “What are you building right now?”, or “What are you fixing right now?” They feel empowered when their needs are being addressed and when people are asking them questions about how they’re feeling and the work that they’re doing.

In terms of it being a full-time team, I think that really depends on the size of your engineering organization. Instructure has about 100 engineers, so it made sense to have a dedicated team doing that for a while. I think you could have an individual who’s charged with doing that as their primary job, and that would probably help, or have a team that’s a virtual team where you’re rotating people through on a regular basis.

There are different ways to approach it. I think the important thing is just gauging the health of your development organization and gauging the happiness of your engineers. That is going to have a significant impact on the quality of your code in the end.

Derrick:
You talked about all those tools for code quality, and they sound really great. How did you measure developer happiness? Was it talking to people, or…

Coraline:
Talking to people. We had a survey that we created that asked a bunch of questions about, “What do you think are the impediments to getting your job done? How good do you feel about the feature work that you’ve been doing? Are we spending too much time fixing bugs?” We published the anonymized results of those surveys to our internal wiki so people could go back and reference them, and we could use that to track how well we were doing as a team, as well. Again, just listening to people makes them feel better about things and gives them some hope that maybe change is coming.


“I think engineers want to be listened to outside of the scope of just, ‘What are you building right now?'”


Recommended Resources

Derrick:
Can you recommend any other resources for those wanting to learn more about effective refactoring?

Coraline:
I would suggest really getting to know what your testing tools are as a first step, and looking at some alternative ways of testing. Generative testing, for example, gets a bad rap, but I think it’s really a helpful technique to use when you’re doing refactoring. Looking at what people in other languages are doing in terms of their philosophical approach to testing is really important, and making sure that you’re really up to speed on what those different tools are so that you can be more effective in your work.

In terms of tooling, I would recommend a tool for Ruby called Reek, which is a code smell identifier that can help you isolate what areas you want to focus on. You can create an encyclopedia of code smells for your code base and refer back to that, and organize your work such that, “Oh, we’re going to address all the code smells that are of this kind,” and do that across the board. Really look at the tools that are available for your language in terms of code quality.

There are a few books that I recommend. The first is ‘Working Effectively with Legacy Code’ by Michael Feathers. The examples are written in Java, but the lessons that it teaches are applicable to any language. That’s a really good place to start. I work in Ruby, so I would also recommend ‘Refactoring: Ruby Edition’ by Jay Fields and Shane Harvie, and another book called ‘Rails Anti-Patterns’ by Chad Pytel and Tammer Saleh.

Derrick:
Those are fantastic resources. Coraline, thank you so much for joining us today.

Coraline:
Great, it was a lot of fun. Thank you for inviting me.