Building trello.com for multiple devices

January 24th, 2012 by Bobby Grace

Trello at various sizes on multiple devices

We built Trello from the ground up to work on just about any device. It’s not a simplified version with limited features, either. Trello responds to your device’s screen size and capabilities. It’s the same exact site and the same exact code; a consistent experience that looks, feels, and works the same everywhere.

But we also have an iPhone app. It’s pretty great if I may say so. If you are a Trello user with an iPhone, you should download it now. It’s free. Now that there is a great app, why would we still focus on a mobile web app? One reason is that focusing on mobile makes Trello better as a whole.

Everything we do for mobile translates back to a better desktop experience. This wasn’t the first reason we wanted to implement a responsive design, but it turned out to be the most important. Keeping mobile in mind focuses us on creating a fast and easy-to-use interface. Trello gets data super fast, but the page must also render quickly. That stops us from using complex interactions and some of the more whiz-bang CSS features. This translates back to rendering speed on your desktop, which isn’t likely to be an overclocked gaming rig with 32GB of RAM.

All the interface elements are mobile-friendly, which means they are also more desktop-friendly. Buttons and menus have big, friendly hit targets. Interactions are straightforward. We don’t rely on hidden hover effects, and if we’ve got a complex interaction, it will have a fallback for touch devices.

There are no redirects. Let’s say you’re on your phone and somebody links to trello.com from Twitter. You’re able to watch the video, read the pitch, sign up, and try it out. You are left with a good impression, no nonsense. No screaming “DOWNLOAD THE APP!” page, no switching out of whatever app you’re using, and you won’t be redirected five times and land on the m.trello.com homepage. You just get the information and it works.

Scaling, zooming, and resizing work seamlessly. Sometimes you want to use a small browser window on a desktop. Maybe you’ve got a side monitor with a Trello board and your mail app open while your text editor or photo editor is up on your main screen. The horizontal board view won’t work at smaller window sizes, but Trello will adapt to a view that does. If you need bigger or smaller text, you can zoom in (or out) as much as you need and Trello will use a view that works. Got a huge monitor and want to project your Trello board for all to see? Trello will work for that as well. Is it way far back in the office? Just zoom in and text will be readable.

We can deliver updates to all devices seamlessly. We have a single codebase and one place to deploy, which means updates are easy. We don’t have to retro-fit new features to a separate codebase, worry about new workflows or interactions, or think about how some URL is going to work. This lets us develop and ship features faster.

So how does it all work?

Here are some of the tools and tricks that made developing a responsive interface much easier.

Use a limited library of mobile-optimized, reusable components. Each component can be collapsed into a single column or otherwise adjusts for smaller screen sizes. The same layout elements used for the back of cards are used in the organization profile and a bunch of other pages. Our context menus (those small pop-ups used to do things like assign members and select labels) are narrow enough that they will fit on any screen. We have some one-offs like the landing page and the board, but they are few and far between. For the most part, we don’t have to ask how a new feature is going to work on mobile because any component we use will already be mobile-optimized.

Card menu on card detail

Think twice about navigation. The card sidebar typically has navigation and buttons for voting, assigning, and the like. It collapses below the main column with a smaller window, which means you would have to scroll and scroll to get to that vote button you were looking for. We added a card menu to the header on the back of cards that lets you easily vote, add due dates, assign members, and everything else without wearing out your thumb.

Use a vector-based image editor like Illustrator to produce icons. We wanted to make sure icons looked sharp on devices with a higher pixel ratio like the iPhone 4. We use a CSS sprite sheet that has every icon used on the site. We can easily export a higher resolution version using Illustrator, and serve it to capable devices using some simple CSS media queries.

Trello icon sprite sheet

That being said…

We’re developing for the browsers and devices of today and tomorrow. We don’t have spare development time, so we can’t spend any on browsers and devices that won’t be around in two to three years. Trello won’t work on the RAZR, for instance. It also doesn’t work on every ‘smart’ device. Internet Explorer 8 doesn’t have the technical capabilities to run Trello. The Windows Phone 7.0 browser is based off of IE8, so it won’t work either. But it will work on Android 2.3+, iOS 4.0+, Windows Phone 7.5 (Mango), and others. Are we neglecting would-be happy users? Perhaps. But we can provide more value to more people by shipping features faster and that’s better in the long run.

And we still love native apps! Apps provide things we can’t get out of the web: better speed, offline support, smooth animations, push notifications, and a native look and feel. Native apps will provide a better experience to a broader reach. We just don’t think the mobile web should be ignored because of them.

Designing for multiple devices has deeply influenced our design decisions and made for a better, faster, and easier-to-use product. There are plenty of improvements to be made, but we’re happy with the foundation we’ve set. Sign up now and check it out for yourself.



The Trello Tech Stack

January 19th, 2012 by Brett Kiefer

Trello started as an HTML mockup that Justin and Bobby, the Trello design team, put together in a week. I was floored by how cool it looked and felt. Since Daniel and I joined the project to prototype and build Trello, the challenge for the team has been to keep the snappy feeling of the initial mockups while creating a solid server and a maintainable client.

The Initial Trello Mockup

That led us toward a single-page app that would generate its UI on the client and accept data updates from a push channel. This is pretty far from any of the work we’ve done before at Fog Creek, so from a technical perspective Trello has been an adventure.

Initially, we were wondering how interesting and far-out the stack could be before management got nervous, but our concerns were addressed in an early meeting with Joel, when he said “Use things that are going to work great in two years.”

So we did. We have consistently opted for promising (and often troublesome) new technologies that would deliver an awesome experience over more mature alternatives. We’re about a year in, and it’s been a lot of fun.

CoffeeScript

Trello started out as a pure JavaScript project on both client and server, and stayed that way until May, when we experimentally ported a couple of files to CoffeeScript to see how we liked it. We loved it, and soon converted the rest of the code over and started coding CoffeeScript exclusively.

CoffeeScript is a language that compiles to readable JavaScript. It existed when we started Trello, but I was worried about the added complexity of having to debug compiled code rather than directly debug the source. When we tried it, though, the conversion was so clean that mapping the target code to the source when debugging in Chrome required little mental effort, and the gains in code brevity and readability from CoffeeScript were obvious and compelling.

JavaScript is a really cool language. Well-written CoffeeScript smooths out and shortens JavaScript, while maintaining the same semantics, and does not introduce a substantial debugging indirection problem.

The Client

The Trello servers serve virtually no HTML. In fact, they don’t serve much client-side code at all. A Trello page is a thin (2k) shell that pulls down the Trello client-side app in the form of a single minified and compressed JS file (including our third-party libraries and our compiled CoffeeScript and Mustache templates) and a CSS file (compiled from our LESS source and including inlined images). All of that comes in under 250k, and we serve it from Amazon’s CloudFront CDN, so we get very low-latency loads in most locations. In reasonably high-bandwidth cases, we have the app up and running in the browser window in about half a second. After that, we have the benefit of caching, so subsequent visits to Trello can skip that part.

In parallel, we kick off an AJAX data load for the first page’s data content and try to establish a WebSocket connection to the server.

BACKBONE.JS

When the data request returns, Backbone.js gets busy. The idea with Backbone is that we render each Model that comes down from the server with a View, and then Backbone provides an easy way to:

  1. Watch for DOM events within the HTML generated by the View and tie those to methods on the corresponding Model, which re-syncs with the server
  2. Watch the model for changes, and re-render the model’s HTML block to reflect them

Neat! Using that general approach, we get a fairly regular, comprehensible, and maintainable client. We custom-built a client-side Model cache to handle updates and simplify client-side Model reuse.
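
The two-way wiring described above can be sketched without Backbone itself. What follows is a minimal illustration in TypeScript, not Trello's code and not the Backbone API: `SketchModel`, `CardView`, the `title` attribute, and the `sync` callback (standing in for a request to the server) are all hypothetical names chosen for the example.

```typescript
// Observer-style sketch of the Model/View idea (NOT the Backbone API).
type ChangeListener = () => void;

class SketchModel<T> {
  private attrs: T;
  private sync: (attrs: T) => void;
  private listeners: ChangeListener[] = [];

  constructor(attrs: T, sync: (attrs: T) => void) {
    this.attrs = attrs;
    this.sync = sync;
  }

  get(): T {
    return this.attrs;
  }

  // Update the attributes, notify watchers, then re-sync with the "server".
  set(next: T): void {
    this.attrs = next;
    for (const listener of this.listeners) listener();
    this.sync(next);
  }

  onChange(listener: ChangeListener): void {
    this.listeners.push(listener);
  }
}

interface CardAttrs { title: string; }

class CardView {
  html = "";
  private model: SketchModel<CardAttrs>;

  constructor(model: SketchModel<CardAttrs>) {
    this.model = model;
    // (2) Watch the model and re-render its HTML block when it changes.
    model.onChange(() => this.render());
    this.render();
  }

  // For brevity the title is not HTML-escaped here.
  render(): void {
    this.html = "<div class=\"card\">" + this.model.get().title + "</div>";
  }

  // (1) A DOM event handler would call this; it forwards to the model.
  handleRename(newTitle: string): void {
    this.model.set({ title: newTitle });
  }
}

// Usage: `syncedTitles` records what would have been sent to the server.
const syncedTitles: string[] = [];
const cardModel = new SketchModel<CardAttrs>(
  { title: "Write post" },
  (attrs) => { syncedTitles.push(attrs.title); }
);
const cardView = new CardView(cardModel);
cardView.handleRename("Publish post");
```

After the rename, `cardView.html` reflects the new title and `syncedTitles` holds the one update that would go to the server. The view never touches the server directly, which is the property that keeps this style of client regular.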

PUSHSTATE

Now that we have the entire client app loaded in the browser window, we don’t want to waste any time with page transitions. We use HTML5 pushState for moving between pages; that way we can give proper and consistent links in the location bar, and just load data and hand off to the appropriate Backbone-based controller on transition.

MUSTACHE

We use Mustache, a logic-less templating language, to represent our models as HTML. While ‘harnessing the full power of [INSERT YOUR FAVORITE LANGUAGE HERE] in your templates’ sounds like a good idea, it seems that in practice it requires a lot of developer discipline to maintain comprehensible code. We’ve been very happy with the ‘less is more’ approach of Mustache, which allows us to re-use template code without encouraging us to mingle it with our client logic and make a mess of things.
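
To make "logic-less" concrete, here is a toy renderer that supports only `{{key}}` substitution with HTML escaping. It is a sketch, not the Mustache library (which also has sections, partials, and more); `renderTemplate`, `escapeHtml`, and the sample template are hypothetical.

```typescript
// Toy logic-less renderer: only {{key}} substitution, HTML-escaped.
// This is NOT the Mustache library.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderTemplate(template: string, view: { [key: string]: string }): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, key: string) => {
    // Unknown keys render as an empty string; the template cannot branch or loop.
    const raw = Object.prototype.hasOwnProperty.call(view, key) ? view[key] : undefined;
    return escapeHtml(raw === undefined ? "" : raw);
  });
}

// Usage with a hypothetical card template.
const cardHtml = renderTemplate(
  "<li class=\"card\">{{title}} ({{owner}})</li>",
  { title: "Fix <b> bug", owner: "alice" }
);
// cardHtml is: <li class="card">Fix &lt;b&gt; bug (alice)</li>
```

Because the template can only name values, any decision about what to show has to be made in client code before rendering, which is the discipline the paragraph above describes.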

Pushing and Polling

Realtime updates are not a new thing, but they’re an important part of making a collaborative tool, so we have spent some time on that layer of Trello.

SOCKET.IO AND WEBSOCKETS

Where we have browser support (recent Chrome, Firefox, and Safari), we make a WebSocket connection so that the server can push changes made by other people down to browsers listening on the appropriate channels. We use a modified version* of the Socket.io client and server libraries that allows us to keep many thousands of open WebSockets on each of our servers at very little cost in terms of CPU or memory usage. So when anything happens to a board you’re watching, that action is published to our server processes and propagated to your watching browser with very minimal latency, usually well under a second.

AJAX POLLING

It ain’t fancy, but it works.

Early Architecture Drawing

When the client browser doesn’t support WebSockets (I’m lookin’ at you, Internet Explorer), we just make tiny AJAX requests for updates every couple of seconds while a user is active, and back off to polling every ten seconds when the user goes idle. Because our server setup allows us to serve HTTPS requests with very little overhead and keep TCP connections open, we can afford to provide a decent experience over plain polling when necessary.
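
The active/idle back-off can be sketched as a single function that picks the next polling delay. The two-second and ten-second values roughly follow the text ("every couple of seconds" and "every ten seconds"); the one-minute idle threshold and all names are assumptions made for this example.

```typescript
// Sketch of active/idle poll scheduling.
const ACTIVE_POLL_MS = 2000;  // "every couple of seconds" (approximation)
const IDLE_POLL_MS = 10000;   // "every ten seconds"
const IDLE_AFTER_MS = 60000;  // hypothetical: a minute without input counts as idle

function nextPollDelayMs(lastActivityAtMs: number, nowMs: number): number {
  const idleForMs = nowMs - lastActivityAtMs;
  return idleForMs >= IDLE_AFTER_MS ? IDLE_POLL_MS : ACTIVE_POLL_MS;
}
```

In a browser, the polling loop would reschedule itself with something like `setTimeout(poll, nextPollDelayMs(lastActivityAt, Date.now()))`. Keeping the intervals as plain constants is what makes the kind of tuning described in the following paragraphs cheap.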

We tried Comet, via the downlevel transports for Socket.io, and all of them were (at the time) shaky in one way or another. Also, Comet and WebSockets seemed to be a risky basis for a major feature of the app, and we wanted to be able to fall back on the most simple and well-established technologies if we hit a problem.

We hit a problem right after launch. Our WebSocket server implementation started behaving very strangely under the sudden and heavy real-world usage of launching at TechCrunch Disrupt, and we were glad to be able to revert to plain polling and tune server performance by adjusting the active and idle polling intervals. It allowed us to degrade gracefully as we increased from 300 to 50,000 users in under a week. We’re back on WebSockets now, but having a working short-polling system still seems like a very prudent fallback.

The Server

  • Node.js
  • HAProxy
  • Redis
  • MongoDB

NODE.JS

The server side of Trello is built in Node.js. We knew we wanted instant propagation of updates, which meant that we needed to be able to hold a lot of open connections, so an event-driven, non-blocking server seemed like a good choice. Node also turned out to be an amazing prototyping tool for a single-page app. The prototype version of the Trello server was really just a library of functions that operated on arrays of Models in the memory of a single Node.js process, and the client simply invoked those functions through a very thin wrapper over a WebSocket. This was a very fast way for us to get started trying things out with Trello and making sure that the design was headed in the right direction. We used the prototype version to manage the development of Trello and other internal projects at Fog Creek.
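
A rough sketch of that prototype shape: plain functions over in-memory arrays, invoked by name from a JSON message. This is an illustration only. The message format, the function names, and `handleMessage` (which stands in for a WebSocket message handler) are hypothetical, not the prototype's actual code.

```typescript
// Sketch: a library of functions over in-memory arrays, called by name.
interface CardRecord { id: number; title: string; }
interface RpcMessage { fn: string; arg: string; }

const cardRecords: CardRecord[] = [];

const serverFunctions: { [fn: string]: (arg: string) => unknown } = {
  addCard: (title) => {
    const card: CardRecord = { id: cardRecords.length + 1, title: title };
    cardRecords.push(card);
    return card;
  },
  findCards: (text) => cardRecords.filter((c) => c.title.indexOf(text) !== -1),
};

// The "thin wrapper": parse the message, look up the function, reply as JSON.
function handleMessage(rawJson: string): string {
  const message = JSON.parse(rawJson) as RpcMessage;
  const fn = Object.prototype.hasOwnProperty.call(serverFunctions, message.fn)
    ? serverFunctions[message.fn]
    : undefined;
  if (fn === undefined) {
    return JSON.stringify({ error: "unknown function: " + message.fn });
  }
  return JSON.stringify({ result: fn(message.arg) });
}

// Usage
const addReply = handleMessage(JSON.stringify({ fn: "addCard", arg: "Ship it" }));
const findReply = handleMessage(JSON.stringify({ fn: "findCards", arg: "Ship" }));
```

All state lives in one process's memory, so nothing here survives a restart or scales past a single server. That limitation is what makes it a prototyping tool rather than a production design.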

By the time we had finished the prototype, we were good and comfortable in Node and excited about its capabilities and performance, so we stuck with it and made our Pinocchio proto-Trello a real boy; we gave it:

Node is great, and getting better all of the time as its active developer community churns out new and useful libraries. The huge amount of continuation passing that you have to do is an issue at first, and it takes a couple of weeks to get used to it. We use a really excellent async library (and the increased code brevity of CoffeeScript) to keep our code under control. There are more sophisticated approaches that add features to JavaScript to automate continuations, but we’re more comfortable with just using an async library whose behavior we understand thoroughly.
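
To show what an async helper buys you with continuation passing, here is a hand-rolled "run these callback-style tasks in order" function. It is a sketch of the general idea only and is not the API of the async library mentioned above; `runSeries` and the task names are hypothetical.

```typescript
// Hand-rolled series helper for Node-style callbacks (illustration only).
type Callback<T> = (err: Error | null, result?: T) => void;
type Task<T> = (done: Callback<T>) => void;

function runSeries<T>(
  tasks: Task<T>[],
  finished: (err: Error | null, results: T[]) => void
): void {
  const queue = tasks.slice();
  const results: T[] = [];

  function step(): void {
    const task = queue.shift();
    if (task === undefined) {
      finished(null, results); // every task succeeded
      return;
    }
    task((err, result) => {
      if (err !== null) {
        finished(err, results); // stop at the first error
        return;
      }
      results.push(result as T);
      step();
    });
  }

  step();
}

// Usage: the tasks call back immediately here; real ones would do I/O first.
const trace: string[] = [];
runSeries<string>(
  [
    (done) => done(null, "load board"),
    (done) => done(null, "load cards"),
  ],
  (err, results) => {
    trace.push(err === null ? results.join(" -> ") : "failed: " + err.message);
  }
);
```

Without a helper like this, each step would be nested inside the previous step's callback, with the error check repeated at every level.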

HAPROXY

We use HAProxy to load balance between our webservers. It balances TCP between the machines round robin and leaves everything else to Node.js, leaving the connections open with a reasonably long time to live to support WebSockets and re-use of a TCP connection for AJAX polling.

REDIS

Trello uses Redis for ephemeral data that needs to be shared between server processes but not persisted to disk. Things like the activity level of a session or a temporary OpenID key are stored in Redis, and the application is built to recover gracefully if any of these (or all of them) are lost. We run with allkeys-lru enabled and about five times as much space as its actual working set needs, so Redis automatically discards data that hasn’t been accessed lately, and reconstructs it when necessary.
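
The "discard, then reconstruct" pattern can be sketched with a small in-memory store. This is an illustration only: Redis approximates LRU rather than tracking exact recency, and `LruStore`, `sessionActivity`, and the `"idle"` default are hypothetical.

```typescript
// Sketch of least-recently-used eviction plus graceful recovery.
class LruStore {
  private order: string[] = []; // least recently used key first
  private values: { [key: string]: string } = {};
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): string | undefined {
    if (!Object.prototype.hasOwnProperty.call(this.values, key)) return undefined;
    this.touch(key);
    return this.values[key];
  }

  set(key: string, value: string): void {
    this.values[key] = value;
    this.touch(key);
    while (this.order.length > this.capacity) {
      const evicted = this.order.shift();
      if (evicted !== undefined) delete this.values[evicted];
    }
  }

  // Move a key to the most-recently-used end of the list.
  private touch(key: string): void {
    const at = this.order.indexOf(key);
    if (at !== -1) this.order.splice(at, 1);
    this.order.push(key);
  }
}

// The application treats a miss as normal and rebuilds the value.
function sessionActivity(store: LruStore, sessionId: string): string {
  const cached = store.get(sessionId);
  if (cached !== undefined) return cached;
  const rebuilt = "idle"; // hypothetical default for an evicted entry
  store.set(sessionId, rebuilt);
  return rebuilt;
}

// Usage
const lruStore = new LruStore(2);
lruStore.set("s1", "active");
lruStore.set("s2", "active");
lruStore.get("s1");           // s1 is now the most recently used
lruStore.set("s3", "active"); // capacity exceeded: s2 is evicted
const evictedLookup = lruStore.get("s2");                 // undefined
const recoveredValue = sessionActivity(lruStore, "s2");   // "idle"
```

The important part is `sessionActivity`, not the store: every read path has to be written so that a missing key is an ordinary case rather than an error.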

Our most interesting use of Redis is in our short-polling fallback for sending changes to Models down to browser clients. When an object is changed on the server, we send a JSON message down all of the appropriate WebSockets to notify those clients, and store the same message in a fixed-length list for the affected model, noting how many messages have been added to that list over all time. Then, when a client that is on AJAX polling pings the server to see if any changes have been made to an object since its last poll, we can get the entire server-side response down to a permissions check and a check of a single Redis value in most situations. Redis is so crazy-fast that it can handle thousands of these checks per second without making a substantial dent in a single CPU.
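
Here is one way the fixed-length list and all-time counter could fit together, using an in-memory object as a stand-in for Redis. It is a sketch under stated assumptions: the list length of 3, the function names, and the choice to return `null` (meaning "reload the model") when a client has missed more messages than the list holds are all hypothetical, and the permissions check is omitted.

```typescript
// In-memory stand-in for the per-model Redis structures described above.
const MAX_MESSAGES = 3; // hypothetical fixed list length

interface ModelChanges {
  total: number;    // messages ever added for this model
  recent: string[]; // at most MAX_MESSAGES JSON messages, oldest first
}

const changesByModel: { [modelId: string]: ModelChanges } = {};

function getChanges(modelId: string): ModelChanges {
  let entry: ModelChanges | undefined =
    Object.prototype.hasOwnProperty.call(changesByModel, modelId)
      ? changesByModel[modelId]
      : undefined;
  if (entry === undefined) {
    entry = { total: 0, recent: [] };
    changesByModel[modelId] = entry;
  }
  return entry;
}

function recordChange(modelId: string, messageJson: string): void {
  const entry = getChanges(modelId);
  entry.total += 1;
  entry.recent.push(messageJson);
  if (entry.recent.length > MAX_MESSAGES) entry.recent.shift();
}

// `seenTotal` is the counter value the client saw on its previous poll.
function changesSince(modelId: string, seenTotal: number): string[] | null {
  const entry = getChanges(modelId);
  const missed = entry.total - seenTotal;
  if (missed <= 0) return [];                    // common case: one counter check
  if (missed > entry.recent.length) return null; // too far behind: reload
  return entry.recent.slice(entry.recent.length - missed);
}

// Usage
recordChange("card1", '{"name":"A"}');
recordChange("card1", '{"name":"B"}');
const upToDate = changesSince("card1", 2);  // nothing new
const oneBehind = changesSince("card1", 1); // only the second message
```

In the common case where nothing has changed, the poll is answered by comparing two integers, which matches the "check of a single Redis value" described above.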

Redis is also our pub/sub server, and we use it to propagate object change messages from the server process making the initiating request to all of the other server processes. Once you have a Redis server in place, you start using it for all sorts of things.
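
The fan-out between server processes can be pictured with an in-process stand-in for Redis pub/sub. This sketch is illustrative only: `ChangeBus`, the `board:42` channel name, and the two "inboxes" (standing in for two server processes that would forward the message to their own WebSockets) are hypothetical.

```typescript
// In-process stand-in for a pub/sub server (illustration only).
type ChangeHandler = (messageJson: string) => void;

class ChangeBus {
  private subscribers: { [channel: string]: ChangeHandler[] } = {};

  subscribe(channel: string, handler: ChangeHandler): void {
    const handlers = this.handlersFor(channel);
    handlers.push(handler);
    this.subscribers[channel] = handlers;
  }

  // Deliver to every subscriber and report how many were notified.
  publish(channel: string, messageJson: string): number {
    const handlers = this.handlersFor(channel);
    for (const handler of handlers) handler(messageJson);
    return handlers.length;
  }

  private handlersFor(channel: string): ChangeHandler[] {
    const found = Object.prototype.hasOwnProperty.call(this.subscribers, channel)
      ? this.subscribers[channel]
      : undefined;
    return found === undefined ? [] : found;
  }
}

// Usage: two "server processes" listen for changes to the same board.
const bus = new ChangeBus();
const inboxA: string[] = [];
const inboxB: string[] = [];
bus.subscribe("board:42", (m) => { inboxA.push(m); });
bus.subscribe("board:42", (m) => { inboxB.push(m); });
const notified = bus.publish("board:42", '{"card":"moved"}');
```

The process that handled the original request only publishes once; it does not need to know which other processes hold connections to interested browsers.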

MONGODB

MongoDB fills our more traditional database needs. We knew we wanted Trello to be blisteringly fast. One of the coolest and most performance-obsessed teams we know is our next-door neighbor and sister company StackExchange. Talking to their dev lead David at lunch one day, I learned that even though they use SQL Server for data storage, they actually primarily store a lot of their data in a denormalized format for performance, and normalize only when they need to.

Trello Today

In MongoDB, we give up relational DB features (e.g. arbitrary joins) for very fast writes, generally faster reads, and better denormalization support — we can store a card’s data in a single document in the database and still have the ability to query into (and index) subfields of the document. As we’ve grown quickly, having a database that can take a fair amount of abuse in terms of read and write capacity has been a very good thing. Also, MongoDB is really easy to replicate, back up, and restore (the Foursquare debacle notwithstanding).

Another neat side benefit of using a loose document store is how easy it is to run different versions of the Trello code against the same database without fooling around with DB schema migrations. This has a lot of benefits when we push a new version of Trello; there is seldom (if ever) a need to stop access to the app while we do a DB update or backfill.

This is also really cool for development: when you’re using hg (or git) bisect and a relational test DB to search for the source of a bug, the additional step of up- or downgrading the test DB (or creating a new one with the properties you need) can really slow things down. With a document store, that step mostly goes away.
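
One way code can tolerate documents written by older versions is to treat newer fields as optional and apply defaults on read. The sketch below assumes a card document where `labels` and `dueDate` were added later; those fields, their defaults, and the function name are hypothetical, and no MongoDB driver is involved.

```typescript
// Sketch: newer code reading documents written by older code.
interface StoredCard {
  title: string;
  labels?: string[]; // added in a later (hypothetical) version
  dueDate?: string;  // added in a later (hypothetical) version
}

interface CardForDisplay {
  title: string;
  labels: string[];
  dueDate: string | null;
}

// Fill in defaults so the rest of the code never sees a missing field.
function readCard(doc: StoredCard): CardForDisplay {
  return {
    title: doc.title,
    labels: doc.labels === undefined ? [] : doc.labels,
    dueDate: doc.dueDate === undefined ? null : doc.dueDate,
  };
}

// Usage: an old-style and a new-style document side by side.
const oldStyleCard = readCard({ title: "Legacy card" });
const newStyleCard = readCard({ title: "New card", labels: ["bug"], dueDate: "2012-02-01" });
```

Older code reading a newer document simply ignores the fields it does not know about, so both versions can run against the same collection during a deploy or a bisect.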

So we like it?

We like our tech stack. As Joel observes, we’ve bled all over it, but I’ve never seen a team make an interesting app without tool- and component-related bloodshed, and not everyone can say that they really like what they’ve ended up with. As is true of most applications, no component or implementation detail is necessary to its nature; however, we think that this excellent set of open-source projects has sped up our development, left us with a solid and maintainable code base that we’re eager to move forward with, and made Trello a more responsive and beautiful app. Thanks to everyone who has contributed to them; it’s a great time to be a programmer.

Sound neat? Try Trello! It’s free.

Just can’t get enough tech stack talk? Here’s a Prezi I made for a recent talk on Trello.

* The Socket.io server currently has some problems with scaling up to more than 10K simultaneous client connections when using multiple processes and the Redis store, and the client has some issues that can cause it to open multiple connections to the same server, or not know that its connection has been severed. There are some issues with submitting our fixes (hacks!) back to the project – in many cases they only work with WebSockets (the only Socket.io transport we use). We are working to get those changes which are fit for general consumption ready to submit back to the project.



The State of the Kiln

January 9th, 2012 by Benjamin Pollack

As the new year opens, I thought it was high time to look back at the last year of Kiln and see what we’ve accomplished, then turn forward and discuss what we’re working on for the next twelve months.

Hold onto your hats. There’s a lot to like.

The Year in Review

The Kiln Dodo enjoying a party

We had a tough act to follow last year. In 2010, we launched Kiln On Demand and Kiln for Your Server, shipped piles of new features, and found ourselves home to tens of thousands of users’ source code. But Kiln was new at the time; most of you who came to us were already familiar with distributed source control in one way or another. Kiln’s mission is to bring distributed version control to as many people as possible, and that means appealing to more than just early adopters.

To make that happen, we made a giant list of what was keeping people from jumping to Kiln, and we spent nine months fixing every issue one by one. We rewrote the review system to be more user-friendly (by overhauling the UI and making it easier to make cases and turning reviews into multiparty discussions instead of one-on-one chats). We helped Kiln integrate tightly with lots of other services so you could keep your existing infrastructure. We allowed configurable diffs, we taught Kiln how to display images in your repositories, we beefed up the API, we contributed largefiles (based on kbfiles, our extension that makes it trivial to have large binary files in Mercurial) back to the main Mercurial project, and we added a full-blown groups permission system so you can have large Kiln installs without getting lost in a quagmire of user management.

The result? Kiln 2.7, which we think gets even those who don’t have a lot of experience with DVCS easily using distributed version control, rather than getting lost in a muddle of tool arcana.

What’s to Come

“First nine months,” I hear you say. “Fascinating. That might be an impressive list. But the year is twelve months long, so what have you done for us lately?”

We have a strong rule against publishing road-maps for products, for a simple reason: we don’t like to disappoint you. We hate promising you’ll have some awesome new feature in Kiln 45.7, like telekinetic abilities, and then smashing your hopes most excellently by announcing we’ve had to delay them until 45.8. Instead, we prefer to surprise you each release with piles of new goodies. A kind of monthly Christmas for Kiln, if you will.

Of course, the flip-side is that we know you can feel abandoned when we go dark for a while, even if we’ve got a really good reason for doing that. And I think we’ve been too quiet lately, so some of you are wondering if we’ve been run over by a bus.

Good news: we’re still very much here! And we’re still working hard. Unfortunately, our current projects are harder than our old ones, so we haven’t had much to show for it for a few months.

To fix that, I want to welcome you behind the curtain, and introduce you to three features coming your way in the next few months: the Kiln Client for Mac OS X; SSH support; and vastly improved search and general performance across the entire app.

Kiln Client for OS X

I admit it: Kiln may run on Windows, and I may love writing code for .NET, but at home, everything I own runs OS X. So it frustrates me that the Kiln Client is only available for Windows.

That’ll be changing shortly. Coming soon to a Mac near you: Kiln Client for OS X.

A screenshot of TortoiseHg for OS X

While it looks and works similarly to the Windows client, since both are based on the excellent TortoiseHg, we’ve taken the time to make it work like a real Mac app. You install it by drag-and-drop, and it takes care of the rest. It’ll automatically put Mercurial on your path, it’ll automatically update itself when there are changes, and it’ll otherwise behave just like a good Mac citizen.

SSH Support

While we love Mercurial over HTTP, the simple truth is that it’s not perfect for everyone. HTTP-based pushes put you into HTTP timeout hell, where you have to make sure your timeouts are correct at every level of your network (load balancer, web server, web site in IIS, and so on). For Kiln On Demand, that’s okay, because we control the full infrastructure. For our licensed customers, this can be a serious problem.

Good news: SSH suffers none of these issues, and Kiln is adding SSH support. And not just for On Demand; it’ll be available for licensed customers as well, as a single, easy-to-use, turnkey solution.

Improved General Search

Search in Kiln is a big deal, and we love it and we know you love it, but we want to be even faster, and we want it to be better with non-Windows-compatible file names.

So we’re rebuilding Kiln’s search story around Elastic Search, an excellent NoSQL search solution. We’re currently still heavily in the development stage, but already, our search times are down from a second or two for very large Kiln installations to just a couple of milliseconds. We think this will be a big deal for our licensed customers, and a huge deal for our On Demand customers.

We’ll have this work on Kiln On Demand in a couple of months, and for Kiln licensed customers about a month later.

So we believe there’s a lot to love in the coming months for Kiln. And we hope that this roadmap of what we’re doing, and the rough timeline we plan to do it in, will get you as excited as we are.

In the meantime, happy Kilning.



Why do we pay sales commissions?

January 4th, 2012 by Dan Ostlund

Among our many cherished verities and assumed assumptions is the widespread belief—nearly universal practice actually—that salespeople are to be paid commissions. It’s the way things are done. Stop signs are red. Salespeople get commissions.

But why?

This is a practice so deeply ingrained that almost everyone assumes that commissions are an unalloyed good, and that salespeople won’t work without them. I’ll return to that notion about work shortly, but it’s somewhat amazing that commissions are so widely lauded when they come laden with so many recurring problems. These issues pop up with distressing regularity.

There are all kinds of problems with commissions. One is high turnover, as salespeople shop jobs to get a slightly more lucrative commission system. Another is the constant attempt to maximize personal benefit, which results in system gaming like making fake phone calls to hit call numbers, sandbagging deals into the next quarter, sniping new leads, and so on (the list here is actually endless).

The problems include infighting over who gets credit for accounts and sales. They include constantly comparing territories and account value to determine fairness between salespeople. They include an enormous amount of overhead as each salesperson sedulously tracks every transaction no matter how minute to make sure they get paid on it (by the way, they hate having to do this, and it’s a staggering waste of time. It’s also a place where weak salespeople like to hide out).

All of this is organizational dysfunction, and it’s a recipe for resentment and distrust among your team.

Management then tries to correct for these problems. They constantly drop or add ballast. They have to carefully structure the pay plan, the territories, the lead assignments. They have to referee disputes, tweak the various systems, and try to keep everyone happy. It’s like a spinning top and every time it starts to wobble, management has to try to nudge it back. It’s a large amount of effort spent propping up a system that we have all just assumed is necessary.

But it gets even worse.

Research by Dan Ariely and others suggests that higher incentives actually reduce performance. That’s a perverse and counter-intuitive result, but in several different kinds of experiments, groups that were promised the largest amount of money as a reward for doing a task performed that task more slowly, and completed the tasks less often.

And yet the paladins defending commissions are everywhere.

They say that commission-based pay maximizes autonomy. Assume there are ten “units” of pay which, of course, can be divided up in all kinds of ways; eight units of salary and two units of commission, or one unit of salary and nine units of commission. How best to do this is the source of several different kinds of holy war.

This means, they say, that for every unit of base salary that gets added to the pay mixture in place of commission there is an increase in employer control, or put the other way round, a reduction in autonomy. More salary in the mix destroys the space for independence and kills the morale of the worker. It turns motivated, independent, free-agent salespeople into wage slaves.

But it gets even worse, according to the defenders of commissions, and here is the crux of their argument.

They say that without the potential to make extra money for your efforts (that is, without a one-for-one relationship between a piece of work and the resultant money), there is no real motivation to work, or, god forbid, to do any extra work. How do you keep salespeople hungry if you don’t pay them commissions? This is the great inscrutable puzzle in sales pay. Too much salary and not only do salespeople lose their independence, they become slugs on top of it.

But hang on a second. Isn’t all this deeply insulting to salespeople? Doesn’t it presuppose that salespeople are lazy and greedy and unethical and that it is only the sweet smell of more lucre that gets them to shed their pajamas each morning?

According to this view, salespeople are only motivated by a specific type of pay (the commission), and the magic is in finding, like Goldilocks, that place that’s just so perfectly just right. Because of this we regard all the pathologies commissions create as just part of the price.

But could it be possible that we’ve got cause and effect reversed here? Is it possible that the stereotype of the slimy salesperson, and the derangement of the culture they so often get blamed for, are actually the result of the way they are paid instead of the kind of people they are? Could commissions, rather than the people, be the primary cause of dysfunctional sales cultures?

Think for a second: we don’t insist that other kinds of workers be paid on commissions. Only an amazing idiot pays a programmer by lines of code. We don’t assume that programmers will dog it if they are paid only salary. Actually, we think just the opposite; they will work hard because they care about the work they’re doing, and they won’t need lashings, or thumbscrews, or other popular forms of motivation. We assume they are internally motivated. This is one of the hallmarks of the enlightened software industry. Development isn’t (usually) conveyor belt work. It doesn’t (usually) suck the soul out of you. It’s creative, it’s mind work and it operates on an entirely different rhythm from traditional kinds of labor, so heavy external control tends to be counter-productive.

And this is true of other workers today—designers and writers and interaction experts, for example. They work on this different rhythm, and are trusted to do so, not coerced by the style of pay they receive, and in good work places, not too often by a manager either. A writer may impose a rigid schedule on herself, but any manager who attempts to impose a similar sort of external authoritarian control is suffering from a kind of insanity given how out of synch such management is with the demands of a lot of today’s work. And so it is with attempts to control with pay.

The different pay systems, then, leave us with two kinds of workers: the lazy salesperson who needs to be coerced with direct rewards, and programmers who can be trusted to go about their day and get good work done. It’s just weird when you think about it.

But these are actually two different views of workers, not necessarily two kinds of workers.

The tension between these views of workers was described in the 1960s by Douglas McGregor in his book The Human Side of Enterprise. He suggested that managers had two views of motivation, and that a manager’s theory of motivation determined company culture. The first view he called Theory X, which assumes that people are lazy, want to avoid work, and need to be controlled, coerced, punished, and lavishly rewarded in order to perform. Sounds like some sort of S&M dungeon to me. Theory X demands a lot of managerial control and tends to demotivate, generate hostility, and generally make people into sourpusses.

The second he called Theory Y, which assumes that people are self-motivated, derive satisfaction from their work, are creative, and thrive when given autonomy.

With these two views in hand, we now have a way to describe why sales commissions create so much dysfunction. Commissions assume a Theory X world. They assume salespeople are lazy and need external motivation to shake off the moss. If you pay them on salary, they just won’t get things done. The commissions view then makes the mistake of presuming that this can be fixed with either larger overall commissions or a greater percentage of total pay coming from commissions.

But if Theory X doesn’t fit and is a degrading way to treat employees, then doing more of it gets you nothing but more degradation and misery, right?

The Theory X way of doing sales compensation, I now think, has habituated us into accepting deranged and dysfunctional sales behavior as if it’s just the cost of doing business.

We Get Rid of Commissions

So we got rid of commissions.

We thought about it for a long time before we did it, but we finally switched about a year ago.

We did it because we were having a lot of the problems with commissions described above even though all of our salespeople are ethical and decent. Commissions just encourage certain kinds of behavior; dysfunction is built into the logic of the system.

The different kinds of pay created divisions at Fog Creek. It was the fundamental thing that separated sales from everyone else. Divisions like that can be exceptionally corrosive to morale, and that’s something Joel and Michael (the founders) specifically set out to avoid when they designed the Fog Creek compensation system. Fairness was one of their main concerns, just as it was in the newer StackExchange comp system, and they felt that commissions just didn’t fit.

I’ll admit that I was skeptical about ending commissions mainly because commissions are such an established way of paying salespeople. I worried that we wouldn’t be able to hire anyone. I worried that some of our sales people might quit. I worried, good Theory Y votary that I was notwithstanding, that the salespeople would coast. I worried when my friends in sales management said this would never work.

To my great surprise, our salespeople really liked the idea. They especially liked being on the same salary plan as the rest of the company.

So we did it, and no catastrophes struck us. No earthquakes. No plagues, and no one quit. In the year since we dropped the commission system our sales have gone up. In fact, four of the last five months have been record months. We can’t reasonably say that our record sales were caused by this change, but we can reasonably say it didn’t hurt, and that’s worth having a hard think about in your own company. There is no guarantee that this will work for everyone, but it’s unlikely to be a disaster either.

Our salespeople all estimated that they were spending about 20% of their time just keeping track of what money was due them. There was constant horse trading. And, most worrying, we created a heavy disincentive to do all the service stuff that makes customer service shine. Why would you want a system that sets up after-sales service as competition against new sales, especially if you have a small sales team? Reputation and retention, after all, are both paths to revenue.

Removing commissions has changed the sales team. It has taken their focus off their compensation. They have all that administration time back for more useful things. They take a longer view of the value of a prospect, and are less worried about who is going to buy right now. They feel less stress about taking vacation. They don’t quibble among themselves over accounts. And best of all, they feel more integrated with the company.

As John, one of our salespeople, said, “It’s made the team better. It’s removed the ‘me, me, me’ mentality. Now I want to share information with everyone on the team, and everyone is willing to pitch in because it doesn’t hurt me to help my colleagues.”

Getting rid of commissions lets us forget about policing the wobble. Now, it is not necessarily the case that commissions are always bad, or always fail, or are wrong for you. It’s just that they come with real problems, and you need to carefully weigh both your desired company culture, and the costs of policing your commissions system against the expected increase in performance, which is a very hard calculation to make.

For us, it’s been a great success, and at least from that perspective it might be time we punch the Theory X, commissions-based sales culture right in the nose. Real redemption might lie in removing the source of the derangement and treating sales people like we treat programmers and other workers that we implicitly trust.



FogBugz gets Fresh

November 18th, 2011 by Dan Ostlund

We think the FogBugz interface is getting a little long in the tooth. The problem is that we’ve been working on features and speed and, and, and…

There is always something that feels more pressing. Always some technical debt to pay down. Always a bug in some super wonky edge case, But It’s Happening To A Very Important Customer, and needs to get fixed. In short, we always can find a very reasonable and important reason to put off the design changes that pretty much everyone here wants to make.

But as everyone who has been near the internet or heard of Steve Jobs will tell you, design is just as important as features—possibly more so in some cases. In the same way that we’ve come to regard our website as a product, we also know that design is a feature that deserves equal billing with more pedestrian features. It’s not that we didn’t know this—we’ve always cared a great deal about it, and in fact, Joel wrote a book about it—it’s just that we haven’t had the time or the resources to devote to it in some time.

Tastes change. Technologies change. Sometimes even established practices and affordances change. It’s entirely possible to build up what we might call design debt in the same way that software can accumulate technical debt. And FogBugz has some design debt to pay down at this point. It’s time to scrape some of the mold off the bread.

But design changes are tricky. They alienate power users. If things are mysteriously moved, you have to be sent to a Microsoft re-education camp (to learn where all of your commonly used administration tools absconded to…again). You’re just as likely to make things worse as you are to make them better, especially if you’re doing a lot of guessing about what people want and how they behave with your software.

So this weekend we’re rolling out some design changes to FogBugz. We’ve tried to balance updating the look without requiring the cumbersome re-training camps. There are some very subtle changes on the gridview page but these are mainly making menu items a bit bigger, making fonts more consistent, and some color changes. You probably won’t even notice these changes.

The more interesting changes are on the case view page.

Here is a pic of the current (soon to be old) case view.

 

Here is a pic of the new case view.

 

There were a couple of things we tried to do. First, we just wanted to make it look a bit fresher and more modern, so we changed the icons, using some of the icons from the very beautiful Fugue Icon Pack.

Next, we felt it was important to make certain kinds of information stand out better. You might occasionally want to know when a case was assigned to someone, but more often the content of the case is much more important. Given that, we de-emphasized the information around administrative minutiae and tried to make the comments of a case more apparent.

If you send an email or have a case that comes in via email (in other words, the case has a correspondent), you’ll notice a nice little air mail bar along the top of that portion of the case. That makes it clear that you are dealing with or sending an email and distinguishes it from some other normal case edit. You’ll never accidentally send an email from FogBugz again. This one deserves a little more comment. At the risk of indiscreetly tooting our own trumpet, I’ll say that this was one of those occasional moments of perfect design inspiration. I didn’t even know it was going to be there, but the first time I saw it I understood instantly what it meant. Without the need for a text label, or an explanation, or some other heavy-handed means the function is totally clear. It’s an “I could have had a V-8” moment. Seems so obvious after the fact.

Cases have always contained status information, but you had to read text to know what it was. That’s OK, but we wondered if it would be better to give a visual cue so that this information could be absorbed at a glance. Now the status of a case is given a color in addition to the text. Green is active, blue is resolved, and gray is closed.

We’ve also done some odds and ends with color and more consistent font choices and the like.

Overall we’re pretty pleased with the balance we struck between freshness on one hand and consistency on the other.

We were able to do this because we moved a talented support tech over to the design team, and he made these changes over the course of a couple of weeks. But, alas, the commute was making life with his new child hard to manage and he left Fog Creek. Everyone on the FogBugz team was thrilled to have him, and we’re committed to making more changes to the FogBugz interface to make it more modern and responsive and generally more pleasing.

So this post ends with a request. If you’re a good designer, or know one who wants to work at a great software company in New York City, please send them our way. Check the Fog Creek careers page for the job posting.



Let Them Have Cake … And Ice Cream Too!

November 17th, 2011 by Rock Hymas

Consider a cake shop owner in a small town in Siberia. He sells the best cake around. Whenever someone wants a treat after dinner they stroll down to the cake shop and get a nice big slice of cake. But along comes a new treat: ice cream. Our cake shop owner ridicules his neighbor for starting an ice cream stand, saying no one will buy it because it’s cold. This is Siberia, after all. But rather than running an oven all the time, she can just keep her ice cream outside in the freezing cold, so it sells for less. Suddenly a whole bunch of people who could never afford cake are stopping at the ice cream stand after both lunch and dinner. It’s cheap, it tastes good, and the teenagers are having contests to see how much they can eat before passing out from brain freeze. It seems the whole town is there, and some of the cake shop customers start getting ice cream just because they want to hang out with their friends at the ice cream stand. All of a sudden the neighbor is rolling in the dough and she wants to buy out the cake shop since they have a great location.

Now suppose the owner of the cake shop recognizes that lots of his customers will like the ice cream (even in Siberia) because it’s so much cheaper. So he sets up an ice cream stand and hires his neighbor to start selling it as fast as possible. He notices that some of his regular cake shop customers are going to the ice cream stand instead, because it’s cheaper. So cake sales are down a little. Rather than shutting down the ice cream stand though, he brings it into the cake shop, adds a special “cake” flavored ice cream, and becomes the town hero. Not only has he saved his business, but he’s also been able to meet the dessert needs of more of his fellow townspeople. Cake is selling even better than before, ice cream is selling even better than that, and the cake-fanatics and ice cream groupies all get to hang out together.

Clayton Christensen’s talk at this year’s Business of Software was all about how companies disrupt, and are disrupted by, other companies. In building a product, you (or those who came before you) made decisions that are really hard to change after the fact. That’s fine; those “stakes in the ground” are what made the product successful. But it also limits the viable lifetime of the product. At some point disruption happens.

Disruption = Larger Market + Lower Price + Different Measure

 

Professor Christensen pointed out that disruption occurs when a company solves a given problem for a larger audience at a lower price point with a different measuring stick for comparing value. Companies that successfully avoid being disrupted are usually able to do so only by disrupting themselves. Few companies are able to pull that trick off, and it typically involves having a different team or business unit set up in order to avoid all the baggage that the last-generation products carry with them.

What does this have to do with Fog Creek? Well, I’ll admit that I was a little skeptical when Fog Creek launched Trello. My worries pretty much lined up with those outlined in this post from a FogBugz customer. Though it’s taken some time, I’m starting to see how important it is for Fog Creek to prepare for the disruption of its flagship product, FogBugz. By doing so, we can make sure FogBugz will keep solving our customers’ problems and keep making us money.

Trello fits the model for disrupting FogBugz. It solves the problem of planning the work on a project for a larger audience than just software companies, at a lower price point than a FogBugz or a Jira, with a different measuring stick: putting Trello and FogBugz next to each other on a feature comparison chart doesn’t make any sense.

Trello’s target market is much larger than that of FogBugz. Where FogBugz targeted software development teams with features like bug tracking, automated crash reporting, and evidence-based scheduling, Trello can provide value to any group of two or more people working together on something that can be broken down into steps. Even though any group of people could use FogBugz to do the same thing, using FogBugz doesn’t really make sense when you’re in HR doing hiring, or in law working through cases, or in a studio vetting country bands. Using Trello does.

Additionally, Trello has a lower price point: free. Everything currently offered by Trello is free, and will remain so going forward. Yes, there will probably be value-added features and services that Fog Creek will charge for at some point. Compared to free, though, FogBugz is expensive at $25 per user per month. When you just want something simple to plan your wedding, that doesn’t make any sense. But if you’re managing software projects for reasonably complex products, then FogBugz easily adds that much value, and our customers make that clear by coming back again and again.

Trello also has a different measuring stick. And that’s the real reason it will eventually overtake FogBugz, even for software projects. Or rather, it will take over some of the roles that FogBugz fills in a software development company. We already use it here at Fog Creek to manage our work at a coarse-grained level. It provides a potentially public view into product development, which is cool to see.

The real key, though, is that Trello, like FogBugz, is opinionated, but it has very different opinions. Rather than seeing work on a project as a large set of small items that need to be tracked individually, it sees project work as a small set of somewhat larger tasks that fit into a bigger whole, a workflow defined by the team. If FogBugz tried to create some kind of dashboard view of your bugs to compete with Trello, you’d be so overwhelmed by minutiae that you’d give up and walk away. No one wants that kind of view when they’re dealing with hundreds or thousands of individual cases. But Trello redefines the way we see our project work and that fundamentally changes the game.

Wait a sec! I’m on the FogBugz team. What am I saying?! Am I just making a Steve Yegge “TMI” mistake by posting this publicly? Am I talking myself out of a job?

 

TMI?

 

I don’t think so. And here’s why.

The same legacy that prevents FogBugz from winning in the larger Trello market gives it a competitive advantage in the market for software development teams. FogBugz will continue to make money, and it is still growing at a nice pace. It needs investment, but not the kind of investment that would attempt to turn it into Trello. Rather, it needs the kind of investment that will take advantage of disrupting products like Trello while preserving its usefulness in the niche of software development teams.

And our customers, for the most part, aren’t going to jump ship for Trello anytime soon. Currently, Trello can replace only a very small part of what FogBugz provides. One customer pointed out that it cannot handle bulk editing, screenshot captures are painful, and categorization and search aren’t designed for the situation where you have lots of items. Additionally, FogBugz supports incoming and outgoing email, automated crash reports, and deep hierarchies of work. You can install it within your network and use it completely internally. It’s also got awesome source control integration with Kiln. These are all things that software development teams care about, often passionately. Trello, and other products like it, may eventually meet some of these needs, or integrate with other tools that do, but that will take time. Time that will allow FogBugz to further differentiate itself.

Besides, learning new ways of working is hard. Anecdote time! We spent some time over the last few months rethinking the UI for FogBugz. As the team lead, I made the call to focus on that, and did what I could to protect the team through the process. Unfortunately, when it was done, Joel pointed out that it wouldn’t fly with our customers. Why? Because we “moved the cheese” in too many ways. It would have required our customers to learn new ways of working, and human nature doesn’t like to do that. Though that does limit what we can do with FogBugz going forward, it also strengthens the ties that our current customers have with the product. They know where to click, which keyboard shortcuts to use, and they know what to expect. That helps them work faster, get in the flow, and keep the boss happy.

Finally, and related to the last point, FogBugz has an ecosystem of supporting technology around it. When our customers start using FogBugz and Kiln, they often integrate with a set of tools and technology. This is one reason it’s harder to innovate on those products, because our customers don’t just rely on not having to learn new ways of working, they also rely on their custom plugins not breaking, on their random Python scripts still working, and on third-party tools continuing to integrate well with FogBugz. Each of those is a barrier to entry – they make it harder for a competing product to win our customers away from us (yep, even if the competitor is made by us as well) because they also add value for the customer.

In short, for almost all of FogBugz’s existing customers, and most of the larger market of customers using a competing bug tracker, using Trello instead is not the right business decision. And within that market FogBugz can make sure that business decision doesn’t change by making the right kinds of investment. Investment that will keep FogBugz relevant and growing into the future is core-competency investment, not more features or chasing faster competitors in a race decided by a radically different “finish line”.

And there are tons of things we can do in the core of FogBugz: reduce the complexity of the product both for our users and for our developers; make it faster; phase out old features that aren’t being used by anyone; fix the backlog of bugs; improve our up-time; integrate with Trello and other apps. That last point is probably the most salient. As Trello grows and becomes a better fit for certain kinds of project management work, FogBugz can increasingly integrate with and offload those areas to Trello. At the same time, it can target specific problems that software development teams face, and provide value by solving them well.

In the end our customers will be able to have their cake … and their ice cream too.



Longer, More Formal Closings Invite Fewer Thank-you Replies

November 10th, 2011 by Rich Armstrong

Since we are entering the season of giving thanks, I thought I’d drop a little note about those useless little thank-you emails you sometimes get when you help a customer through FogBugz (or, really, any other system).

A thank you note

We get the occasional customer request asking us to implement some sort of auto-ignore feature that won’t re-open a case when the person just responds with “Thanks!” Now, as the person responsible for customer happiness at Fog Creek, the very idea of something that “auto-ignores” any customer communication sends up all sorts of red flags.

We’ve looked at this from all different angles and have just never come up with an automated method that doesn’t suffer from both false positives and false negatives. In the end, the false positives (closing cases that did need more work) are far more destructive than the small time savings garnered by closing these cases automatically.
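To make the failure modes concrete, here is a minimal sketch of the kind of filter people usually have in mind. Everything in it is a hypothetical illustration, not anything FogBugz does: the function name `looks_like_thanks_only`, the phrase list, and the six-word cutoff are all assumptions picked for the example.

```python
import re

# Hypothetical phrase list; real replies come in far more variants than this.
THANKS_PATTERN = re.compile(r"\b(thanks|thank you|thx|cheers)\b", re.IGNORECASE)


def looks_like_thanks_only(body: str, max_words: int = 6) -> bool:
    """Naive guess: a short reply containing a thank-you phrase needs no action."""
    words = body.split()
    return len(words) <= max_words and bool(THANKS_PATTERN.search(body))


replies = [
    "Thanks!",                           # a plain thank-you
    "Thanks, but it still crashes.",     # short and polite, yet needs more work
    "Much appreciated, that fixed it.",  # a thank-you with no listed phrase
]

for reply in replies:
    print(looks_like_thanks_only(reply), reply)
```

The second reply is the dangerous one: it is short, it says thanks, and the filter would quietly ignore a customer who still has a problem. The third is the cheap mistake, a harmless note that re-opens the case anyway. Tightening the rule to fix one kind of error tends to make the other worse.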

The problem really is not the person responding, but your response to them. If you send a response like this, you’re basically leaving something unconfirmed and basic human decency will prompt some portion of the population to provide you with closure on your interaction.

Hi Bill,

Please reboot your computer.

Rich

Since this is basically a social issue, not a technical one, it calls for a social hack, not a technical hack. Through experimentation, we find that the longer and more formal your closing, the fewer thank-you emails you get in response.

You can use snippets to craft a closing that does not invite response. Here’s a good example that has worked very well for us:

Hi Bill,

This can easily be solved by rebooting your computer.

Let me know if I can be of further assistance, if indeed this was helpful, or if this raises any other questions.

All the best,

Rich

This closing doesn’t have any hubris in it. We don’t assume that we solved your issue. In fact, we don’t assume that we’ve helped at all, or closed the conversation at all. Paradoxically, by inviting dissent or further conversation, we invite only the constructive responses.

Thanks!



Friday Linkparty

November 4th, 2011 by Dan Ostlund

Read on!

A Starcraft AI that’s more like a human player: This implementation faces many of the obstacles that human players face, like only having a single viewport, and being able to issue one command at a time. Still, 2000 APM is crazy!

Tuning TCP for more vroom vroom: Adjusting initcwnd for better performance.

Operations efficiency is one of your possible competitive advantages: And Apple, of course, excels at it. “The decision to focus on a few product lines, and to do little in the way of customization, is a huge advantage.”

Crowd sourcing algorithms: Big prizes

Some musings on patents. On Andrew Sullivan’s blog. There is also the always excellent Lewis Hyde, who has a fantastic book on patents and copyright.

 

 

 



FogBugz and Kiln Build and Release Report Card

November 3rd, 2011 by Rock Hymas

My wife and I are a couple months into our first year of homeschooling, and we’ve been discussing how the kids are doing with it, and the challenges we’ve faced as their teachers and parents. Those discussions have led us to make some important changes, in the same way that a school report card can get parents and children to make changes that lead to more learning and happier kids. In that spirit, I offer here a report card on the build and release management work we’ve been doing on FogBugz and Kiln [1] over the last 12-18 months.

About a year ago, I was in the middle of writing a series of posts about the practices of build and release management. Many of those practices came from looking at what we were doing with FogBugz and Kiln and identifying what worked, and what didn’t. Others came from what I saw and experienced while working within the Microsoft Office organization, and still more came from discussions within our team at the Creek. Though in some areas we were doing things right, most of the posts were a description of what our existing build and release (non-)system was doing utterly wrong and what I hoped we could change to fix it. So the series was often written in a spirit of hope that we would one day live up to the ideal.

In fact, you could take many of the posts I wrote last fall and read them as the opposite of how things existed at Fog Creek in the summer of 2010. So, in the spirit of Jason Cohen’s recent BoS talk about honesty, I’m going to be brutally honest about how we were doing and how we are now doing.

First, how we were doing. Here’s the report card for June of 2010, divided into subjects, of course.

Report Card for June 2010

Fast builds/deploys (Grade: D)
  • Deploying upgrades to www.fogcreek.com takes over an hour
  • Other builds and deployments often take much longer than they need to
  • Upgrades to FogBugz or Kiln often involve downtime of a few seconds to half an hour or more

No hard dependencies (Grade: F)
  • Checked out source code is tied to absolute paths in some cases
  • The build and deployment scripts are full of hard coded paths and other hard dependencies

Single commands (automate, automate, automate) (Grade: D)
  • We can’t run most builds with a single command
  • Many of the build and deployment scripts are combined
  • Deploying a new version takes multiple steps
  • Different scripts are used for dev builds vs official builds
  • Different configurations of the build use different build scripts
  • There is no tooling to make it easy to set up your dev system
  • There are no scripts to configure a new server for our hosted environment, or a new dev box for development

Staging Environment (Grade: F)
  • We have no staging environment for our hosted FogBugz and Kiln products

Continuous Integration (Grade: F)
  • There are no continuous integration builds or tests running

Scripts @ production quality (Grade: D)
  • The scripts are old, unmaintained, with lots of cruft
  • Many built binaries are stored directly in the VCS

Onboarding new developers (Grade: D)
  • There is no easy way for devs to build installers
  • There are no build scripts that do incremental builds

Insight into builds and releases (Grade: F)
  • Old builds are not saved for most official configurations
  • We aren’t tracking build statistics like failures, build quality, etc.
  • Logs of builds are not systematically kept and stored anywhere
  • Build failures do not automatically notify anyone, nor do successful builds
  • There are three different types of version numbers for our three different deployment targets

All in all, things were a mess. “Good grief!” I hear you cry. “So what have you done about these issues? Anything? What are you planning to do?” I’m glad you asked.

My initial goal in the role of build and release manager was to eliminate the position. I wanted to get things running smoothly, and spread the knowledge and ownership of the problems through the team, so that I could go back to doing product development. As such, a decent amount of the progress described here was done by other great developers here at the Creek. Back in January (when I joined FogBugz as the team lead), I thought I’d gotten about halfway through the work required for that to happen. But, of course, new ideas come up constantly, so maybe it never ends.

Since January, I haven’t been able to do as much. In addition to my small efforts, other great Creekers have picked up some of the slack. Here’s our report card for October 2011.

Report Card for October 2011

Fast builds/deploys (Grade: C+)
  • A new, fast build machine
  • Reduced web site deployments from taking ~80 minutes to taking ~8 minutes
  • FogBugz and Kiln deployments now faster, because they don’t also rebuild the product
  • Upgrades to FogBugz or Kiln still involve downtime of a few seconds to half an hour or more

No hard dependencies (Grade: B)
  • Removed all the absolute paths
  • Removed many other hard dependencies
  • Removed some remaining hard dependencies that occasionally caused build failures

Single commands (automate, automate, automate) (Grade: C+)
  • Mortar, an internal website for kicking off builds and deployments
      • Lots of help on this one from Benjamin at bitquabit
  • Separated build and deploy scripts for hosted builds
  • Some of the build and deployment scripts are still combined
  • Deploying a new version takes multiple steps
  • Different scripts are still used for dev builds vs official builds
  • Different configurations of the build still use different build scripts, but some combining work has been done
  • There is still no tooling to make it easy to set up your dev system
  • There are still no scripts to configure a new server for our hosted environment, or a new dev box for development

Staging Environment (Grade: F)
  • Still no progress on the staging environment

Continuous Integration (Grade: B)
  • Continuous integration builds exist, mostly just building the products
  • Basic tests have been added to some continuous integration builds

Scripts @ production quality (Grade: B-)
  • Removed many checked in binaries, building them instead
  • Many scripts converted from FinalBuilder to Python, but a few still remain

Onboarding new developers (Grade: D)
  • Branch builds, for any branch a dev wants to set up
  • bf, a better tool for building FogBugz incrementally on dev machines
      • Aaron kick-started this and continues to be the driving force
  • There is still no easy way for devs to build installers

Insight into builds and releases (Grade: B-)
  • Saving old hosted builds
  • Moved to a single type of version number no matter the config
  • Build logs for all official builds
  • Cases filed on build failure
  • Notifying build owners when builds complete, successfully or not
  • We aren’t tracking build statistics like failures, build quality, etc.

As you can see, there is still plenty of work to do. The glaring failure in both report cards is our lack of a staging environment. We’re now working on making that a reality, with high hopes that it will help us start catching a new class of bugs before they make it into our customers’ hands. As part of that work, we’ll also be able to work on:

  • Scripts for configuring our servers, by type
  • No-downtime upgrades
  • Moving our internal dogfooding to the staging environment, for better consolidation and testing of deployment scripts

Other possibilities for further work should be pretty clear from the report card, but the staging environment is the key focus when it comes to improving our builds and releases.

One note about the culture for both FogBugz and Kiln. We’ve learned a ton in the last couple years by seeing at close range what can be done in greenfield development on products like Trello, or teams that actively maintain and improve their processes for builds and releases, like StackOverflow. Obviously, when working with a legacy product like FogBugz, the costs of implementing certain practices may vary a lot, and we have to take that into consideration, but the inspiration that comes from seeing what is possible keeps us working to make things better.

When I look at this report card, I feel like my oldest son a little bit. He’s our perfectionist, always expecting to get everything right the first time through, and pretty disappointed when that doesn’t happen. We’re hoping that some of the changes we make in our homeschooling will help him relax a bit so he can learn more freely. And yes, it’s somewhat humbling to look at our build and release report card and admit how far we have to go. But I have high hopes that in another six months we’ll have improved our grades again.

 

[1] What about Trello?

Trello has the awesome advantage of being greenfield development. As such, they’ve really started from the ground up by doing the right things when it comes to builds and releases. They frequently release midday with no downtime, they have a nice staging environment, and their releases take almost no time at all. They have run into some interesting load challenges with updating their clients after an upgrade, but I’ll let the Trello team talk about that in their own sweet time.



DNA Changes Ecosystems: Our support team changed the company, and now must change itself.

October 25th, 2011 by Rich Armstrong

If you take a look at the Support Engineer job posting for Fog Creek, what you’re actually looking at is a finely crafted trap. It’s a trap to get coders to come talk to our customers. And it’s worked very well. We’ve managed to get five very smart, very capable people to come do a job they probably would never have considered at another company. The difference was in the DNA. They knew we were doing something different because the job posting laid out our core principles.

Years ago, Joel laid out some basic precepts about how he wanted to approach customer service. It’s the first thing I read after getting hired away from Google to grow out Fog Creek’s support team. I was the entire support team for 20 months. (Two guys shared the job before.) I buckled down and I squeezed every improvement I could out of the support workflow, which was easy when it was only me. There was no team communication overhead, and I could tweak the processes without getting buy-in from others.  I also was able to affect the development of the software such that the raw support load decreased, even though the customer base more than doubled. I took Joel’s precepts and built them into the DNA of the support team we ended up hiring, and that allowed us to get a long way.
 

 

The one precept I’ve built into our DNA more than any other is “Fix everything two ways”. Everyone says they fix the hard problems, but they don’t. I wanted to make it a reality.  We call it “Fix It Twice” and we use it in a lot of ways.
 

First, if we think a customer issue can be solved by a deeper change in the software, we always spin that idea off into a secondary queue. The worst time to think about how to do a deep fix is right after you’ve solved the initial problem. The job demands empathy with the person having the problem, and that leads to a kind of recency bias. You’re always worked-up about the thing that just happened. What you do about that worked-up feeling makes all the difference. (Even recognizing this is helpful for fending off frustration in the support role.) If you just let it go, you end up with the feeling that “stuff never gets fixed.” If, however, you spawn off that potential change into a secondary queue, where it can be evaluated at another time when you’ve regained your distance and objectivity, you get a better sense of where resources might be best dedicated to reduce load overall. This, in itself, makes fix-it-twice worth all the trouble. You’ve taken frustration and turned it into business intelligence.

 

After that, we realized that the amount of stuff waiting for us in the fix-it-twice queue was a very good proxy for job satisfaction. If the numbers stayed low, that meant we were getting the slack to go and think about how to make our job easier and the software better.  If the numbers stayed low regularly, it meant we could code. That has led to real improvements in the software, like the Do Later plug-in. The support team has stayed happy and productive for a long time…. And then, the roof fell in. We lost one of our support reps to the design team. Well, actually, one of our support reps is a gifted designer and got poached (with our blessing).
 

When the team sat down to figure out what we wanted from our newest support rep, we realized something pretty momentous: The job had changed. It’s only now that I realize that building Fix It Twice into the DNA of our company has changed the company such that our needs are totally different. A responsive support team and an effective engineering team mean that the easy problems get solved forever and you’re left with mostly tough ones. We knew it’d been getting harder for years, felt it viscerally. The problems just kept getting more complex, less common, harder to diagnose. But now for the first time, that meant that the team had to change because our ecosystem had changed.
 

If your DNA says that you prey on slow, lumbering, delicious prey, eventually you’re going to find that those walking buffets are getting scarce. After that, you might find that you have to run a little faster to eat. You might also find that the plants your now-scarce prey fed on have changed and been supplanted by others. We find ourselves in a different world, and we had a hand in making it.
 

So, let’s call this the Kilneolithic epoch. The job posting still lays out our DNA. It’s still who we are. But it’s no longer the finely crafted trap it was before. Now, we’re looking for a different animal. The person we’re looking for is personable and curious and techie, just like always, but we need someone a bit more hardcore. We’re good at troubleshooting .NET and IIS.  But Kiln runs on OpenGrok, Apache, Redis, Mercurial. We need someone who’ll thrive in our new ecosystem, maybe even more than we do.
 

Recognizing that we’d changed our world was both gratifying and terrifying. We know we’re good at the base job, but it’s taken us to a place where we’re challenged. The team’s up for it, but we need one more body. If you’re up for it, email us at jobs@fogcreek.com. (Yes, this blog post was, itself, a finely crafted trap.)