Archive for the ‘Fog Creek’ Category

Protecting the Maker’s Schedule From Chat

March 12th, 2015 by Gareth Wilson

OK, we’ve all done it. You spotted a little issue, or couldn’t remember something. So you opened up Chat and quickly pinged a colleague about it. It wasn’t that important to you, but it was right there in the front of your mind, so you asked anyway. Harmless, right?

Well, maybe not. At least not for creatives. Developers, designers, and anyone else who creates or builds things work in a different way from most people. They work on a ‘Maker’s Schedule’, as Paul Graham puts it, based around units of time of half a day at least. Unlike managers, salespeople, and other team members, who have fractured days of short, hourly blocks scheduled around meetings, makers work best in long blocks of uninterrupted time. For a maker, even a single meeting or other interruption during this creative time can prove disruptive. What may seem like a minor interruption – a single question pinged on chat – distracts them and takes them out of their flow.


Our solution is to file a case. Shocker – the creators of an issue tracker think cases are great. But along with private offices for developers and a healthy absence of meetings, it’s been a critical part of our culture for a long time. And honestly, we just think it’s the solution that’s most respectful of a colleague’s time. By simply creating a case with the relevant details, the issue is documented and assigned to the maker in a manner that minimizes disruption. This allows them to prioritize and act on the issue at a time that fits their workflow.

As a recent Creeker, I must admit this took some getting used to. I had cases for the smallest of tasks, and even just questions. Cases didn’t seem as gentle as chat – somehow harsher, maybe even passive-aggressive. I think that’s because creating a case makes it a Thing® rather than just a quick question. Cases are permanent, whereas chat feels temporary. But in fact, if you raise a random question on chat, you’re doing the receiver a disservice. It suggests that what you want to discuss is more important than whatever they’re working on.

Chat can also be a limbo for information. Whoever you pinged may have read your message, but that doesn’t mean they were really paying attention. The onus is on the recipient to handle the information received. So often, you just get back a lacklustre response and the problem gets forgotten, lost amongst the cat GIFs and integration notifications.

What’s more, a problem raised in a 1-to-1 chat prevents others from working on it too, so it can increase the communication cost if it has to be passed on to someone else. Chat also seems to demand an immediate response. Sure, you can set your status to ‘busy’, but we don’t always remember to do that – and how often is it noticed or respected anyway?

This isn’t to say don’t use chat. We use it all the time across the company. It’s more to do with using the right form of communication for the task or question. Do you really need an immediate response, and from that one person? The simplest thing for you to do at that moment isn’t without its consequences for others. So, don’t ping me. File a case.

Knowing When to Stop – Tech Talk

March 6th, 2015 by Gareth Wilson

 

There comes a point in every instance of creation when the creator steps back and says, “Done.” But how do you know when a thing is complete? And what happens when you continue past that point?

In this short Tech Talk, Matt, a System Administrator here at Fog Creek, using examples from Computer Science, Finance, and Art, explores different perspectives on this question. It acts as a cautionary tale for anyone involved in software development about the dangers of feature creep and not knowing what done looks like.

 

About Fog Creek Tech Talks

At Fog Creek, we have weekly Tech Talks from our own staff and invited guests. These are short, informal presentations on something of interest to those involved in software development. We try to share these with you whenever we can.

 

Content and Timings

  • Computing (0:19)
  • Gambling and Finance (0:48)
  • Art (1:30)
  • Software (3:22)
  • Examples of Feature Creep (4:50)

 

Transcript

We often lose track of what done looks like and when we reach it and can say, “Enough. Working on this more is not going to help anything.” When you start thinking about this… well, we’ll start with computing.

Computing

Computing has a well-defined halting problem. In computability theory, it’s the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running or continue to run forever. This is basic, way-down Comp Sci stuff: where does this input, fed to this algorithm, actually lead – does it meet a condition, or does it loop forever? When you give this more of a human aspect, you wind up with a human version of the mathematical halting problem: the gambler’s dilemma.
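No general decider for halting can exist, but a step-bounded checker is easy to sketch. The Python below (our own illustration, not from the talk) runs a program expressed as a step function for a limited number of steps and answers “halted” or “don’t know” – which is the best any practical tool can do:

```python
# A true halting decider is impossible in general, but we can approximate
# one by running a program for a bounded number of steps.
# (Illustrative sketch; `collatz_step` and the step bound are just examples.)

def halts_within(step, state, max_steps):
    """Run a step function from `state`; return True if it reaches a
    halting state (None) within max_steps, else None ("don't know")."""
    for _ in range(max_steps):
        state = step(state)
        if state is None:          # our convention for "the program halted"
            return True
    return None                    # inconclusive - NOT "runs forever"

def collatz_step(n):
    """One step of the Collatz iteration; halts when we reach 1."""
    if n == 1:
        return None
    return n // 2 if n % 2 == 0 else 3 * n + 1

print(halts_within(collatz_step, 27, 1000))   # halts well within the bound
print(halts_within(collatz_step, 27, 10))     # inconclusive: None
```

Note the asymmetry: a `True` answer is definitive, but running out of steps tells you nothing – exactly the gap the halting problem says you can never close.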

Gambling and Finance

The gambler is trying to maximize their profit. They’re a min-maxer; they want to invest the least possible amount and gain the most possible amount. I’ll include finance in here, which is a kind of sophisticated gambling with a little bit more information. The notion is that you have tools, mathematical tools, that you can choose to apply to this problem to figure out what the optimal solution can be. There are tools for this. You can actually sit down with a spreadsheet or some piece of software or a pencil and paper and figure out the proper and optimal solution for this problem. For the creative, you stop when the expression of the idea is complete. When the hell is that?
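Those “mathematical tools” can be as simple as an expected-value calculation – sit down with a spreadsheet, or three lines of Python. The numbers below are made up for illustration:

```python
# The gambler's (and financier's) basic tool: expected value.
# Weight each outcome's payoff by its probability and sum.

def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# A $10 bet on a fair coin flip that pays $19 on a win (house keeps $1):
bet = [(0.5, 19 - 10), (0.5, -10)]
print(expected_value(bet))   # -0.5: a losing proposition on average
```

For the gambler, “when to stop” has a computable answer: stop when the expected value goes negative. It’s the creative who has no such formula.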

Art

Well, there are very few mathematical models you can apply to figure out when your watercolor of a beautiful waterfall is finished. There are some boundaries, which are established by the medium. If you’re doing sculpture and you’re chipping away, eventually there’s not going to be any stone left to chip away, and you are going to have to stop because there’s nothing left, literally. If you are a painter, or a sculptor in clay, you can continue to add clay, but eventually gravity is going to take over and say, “All right, you can’t really put that newfangled nose on this statue. It just doesn’t have the support.” There are realities that do apply to certain media. The Statue of Liberty can’t necessarily hold her arms straight out, or have forearms, because the structural, architectural realities of that beautiful thing out in the harbor just don’t support it. Like Michelangelo up there: “I saw the angel in the marble and carved until I set him free. Every block of stone has a statue inside it and it is the task of the sculptor to discover it.” Michelangelo has this concept of the finished artifact in his head; he’s trying to liberate it from the medium. Ultimately, he knows what he’s looking for. Maybe he doesn’t know the specifics, but he’s going to keep trying to pull it out until he winds up with something like this. There’s a more contemporary quote: “Knowing just when to declare a work of art finished is an eternal struggle for many artists. The issue is that if you don’t work on a piece enough, the work can come across as incomplete. On the other hand, overworking a piece can cause the work to appear tired and tedious. The most compelling works of art throughout history are able to establish a strong balance of gesture and spontaneity while simultaneously appearing to be substantial and fully resolved.” Much more fuzzy – no maths going into this. I’m done when I think it’s done.

Software

Then we get to software. We’ve kind of come full circle. We started with the computing halting problem, we came all the way through art, which is one of the most open, creative processes, and now we’re here at software, which is even more open than art. Yes, the machine implies certain things for you, based on the way it acts and the way you can use it, but ultimately you can create literally fantastic worlds inside a machine. When you don’t know when to stop with software, you start suffering what’s called feature creep. Jamie Zawinski was a programmer at Netscape, and he came up with Zawinski’s Law, which is: “Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.” There’s this notion that applications are pushed well beyond their requirements and intentions. You have a program, you’re trying to solve a problem, and then you fall in love with the thing in the process of making it. Then you start thinking, “Well, what else can I add? What else can I do? Where else can I go? Oh, well, this function isn’t very good; it doesn’t have that je ne sais quoi I was looking for when I was writing it. I’m going to go back and rewrite it. Oh, there’s a new library; I could re-implement this in another language. I could do this, that, and the other.” You can just fall into this hole and get stuck there and never know when done is done, because you’ve lost sight of what you were originally even intending and what the finished state looked like – if you ever knew what it was in the beginning.

Examples of Feature Creep

This is what feature creep looks like in one image. This is Microsoft Word 2010 with every toolbar enabled; what’s left over is all you get to type in. Yes, some people might use these things. Yes, that is an interface to let people use those things. But you have long since gone past the notion of laying out documents. Here’s another example, and it’s kind of a case study, I think, in not knowing when to stop because you don’t know what it is you’re trying to come up with.

This was Google Wave. Google had a team in Australia, isolated from the rest of the Google environment, and they spent two years working on this thing, effectively in isolation. That’s what it looked like. The paradigm they said they wanted to change was e-mail: they wanted to re-implement e-mail for the 21st century. Nobody knew what Wave was or where it fit in, but it was supposed to… they got so attached to this thing, and kept adding more and more crap to it, that it ceased to be e-mail for the 21st century. It turned into this communications hub. They ate so much dog food that it was poisoning them. They spent their entire lives in Wave. All of their internal communications, all of their internal documentation, all of the rest of their stuff was in Wave, and they just expected everybody else to do the same. This was the central focus of their working life. Whenever they encountered a thought like, “Oh, I wish I could send a tweet from within Wave,” or “I wish I could read this RSS feed from within Wave, because I’m always in Wave and I want to be able to do these things,” they kept making it more and more complex. Two years later, when they finally cracked open the box and joined the rest of the world, they had this monstrosity that could only be used successfully by basically having it take over your life and doing all of these things within the context of Wave – which is not what people want.

The question is, ultimately, when do we stop? The answer is when it’s done, which is kind of a cop-out. Because if you don’t figure out what done looks like when you start, you’ll never figure it out along the way. We stop when it’s done. Figuring out what done is is the problem.

A Developer’s Guide to Growth Hacking – Tech Talk

February 27th, 2015 by Gareth Wilson

 

Given the media hype that surrounds the term ‘Growth Hacking’, you can be forgiven for dismissing the whole thing as another marketing buzzword. But what can get lost in the hubbub are some useful, development-inspired, working practices that can help a team focus on maximizing growth.

In this Tech Talk, Rob Sobers, Director of Inbound Marketing at Varonis, tells you all you need to know about Growth Hacking. Rob explains what Growth Hacking is and describes the processes key to making it effective – from setting goals, to working through an experimentation cycle, to how it works in practice.

Rob was formerly a Support Engineer here at Fog Creek, and is the creator of his own product, Munchkin Report. He writes on his blog about bootstrapping and startup marketing.

 

About Fog Creek Tech Talks

At Fog Creek, we have weekly Tech Talks from our own staff and invited guests. These are short, informal presentations on something of interest to those involved in software development. We try to share these with you whenever we can.

 

Content and Timings

  • What is Growth Hacking (0:00)
  • People (2:34)
  • Process (3:22)
  • Setting Goals (5:25)
  • Experimentation Cycle (6:12)
  • How It Works In Practice (12:03)

 

Transcript

What is Growth Hacking

I started out my career as a developer, kind of moved into the design space, then did customer support here, and now I’m doing marketing. I’ve been doing marketing for the past, I don’t know, two and a half, almost three years. Then this phrase ‘growth hacker’ kind of cropped up. I kind of let the phrase pass me by. I just didn’t discuss it. I didn’t call myself a growth hacker. I stayed completely out of it, mainly because of stuff like this.

It’s just overwhelming. Google ‘growth hacking’ and you’ll want to throw up. What it really comes down to is that growth hacking is not at all about tactics. It’s not about tricks. It’s not about fooling your customers into buying your software, or finding some secret hidden lever to pull that’s going to unlock massive growth for your company. It’s really about science. It’s about the process. It’s about discipline. It’s about experimentation. Tactics are inputs to a greater system.

If someone came up to you, a StarCraft player, and said, “What tactic should I use?”, you would have a million questions: “Well, what race do you play? Who are you playing against? Who’s your opponent? What does he like to do? What race is he playing? Is it two vs. two or three vs. three?” There are so many different questions. So if someone comes up to me and says, “What tactics? What marketing channels should I use for my business?”, you can’t answer it. The answer is not in the tactics.

So this is how Sean Ellis defines growth hacking: “Growth hacking is experiment-driven marketing.” You walk into most marketing departments, and they’ve done a budget, and they sit in a room and decide how to divvy up that money across different channels. “Okay, we’ll buy some display ads. We’ll do some Google AdWords. We’ll invest in analyst relations.” But they’re doing it blind. Year after year, they’re not looking at the results, not looking at the data, and not running experiments. So this is really the difference.

I took it one step further. I said growth hacking is experiment-driven marketing executed by people who don’t need permission or help to get things done, because I think growth hacking is a lot about the process. And it’s about culture, and embracing the idea of doing a whole bunch of marketing experiments week over week. But if you have a team that is only idea-driven and tactic-driven, and they have to farm out all of the production to multiple other stakeholders in the business, like teams of devs or designers, then you’re not able to iterate. So to simplify it I just said, “Growth hacking equals people – people who have the requisite skills to get things done from start to finish – and process.”

People

So let’s talk about people. You don’t just wake up in the morning and say, “Let’s do some marketing.” You have to know what your goals are, break it down into little pieces, and then attack based on that. So this is a system that was devised by Brian Balfour at HubSpot – I call it the Balfour method. A good way to measure a person you’re hiring to be a growth hacker and run growth experiments is to show them this chart and say, “Well, how far around the wheel can you get before you need to put something on somebody else’s to-do list?” Now, granted, you’re not always going to be able to hire people who can do everything. I’ve seen it work where people can do bits and pieces, but it sure is nice to have people who can do design and development on a growth team.

Process

So before you begin implementing a process at your company, what you want to do is establish a method for testing. And then you need analytics and reporting. I’ve seen a lot of companies really miss the boat with their analytics: they’ve got it too fragmented across multiple systems, and the analytics for their website is too far detached from the analytics within their products. You don’t want to stop at the front-facing marketing site. It’s great to run A/B tests and experiment on your home page, trying to get more people to click through to your product page and your sign-up page, but there are also these deep product levers you can experiment with: your onboarding process, your activation, and your referral flow.

So what you’re really looking for, and the reason why you establish a system and a method, is, number one, to establish a rhythm. At my company we were in a funk where we were just running an A/B test every now and then, when we had spare time. It’s really one of the highest-value things we could be doing, yet we were neglecting to do it. We were working on other projects. The biggest thing we did was implement this process, which forces us to meet every Monday morning to discuss and lay out our experiments, really define what our goals are, and establish that rhythm.

Number two is learning, and that basically means that all the results of your experiments should be cataloged so that you can feed them back into the loop. So if you learned a certain thing about putting, say, a customer testimonial on a sign-up page, and it increases your conversion by 2%, maybe you take a testimonial and put it somewhere else where it might have the same sort of impact. So you take those learnings and you reincorporate them, or you double down.
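Before you catalog a 2% lift as a learning, it’s worth checking whether it’s real or just noise. One standard tool for that is a two-proportion z-test – a minimal Python sketch with invented sign-up numbers, not figures from the talk:

```python
# Two-proportion z-test: is variant B's conversion rate genuinely
# different from control A's, or within the range of chance?
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """conv_* = number converted, n_* = number exposed. Returns (z, p)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-tailed p-value via the normal CDF, Phi(x) = (1 + erf(x/sqrt(2)))/2
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Control: 500/5000 sign-ups; variant with testimonial: 600/5000.
z, p = two_proportion_z(500, 5000, 600, 5000)
print(round(z, 2), round(p, 4))
```

With samples that large the lift is comfortably significant; with a few hundred visitors per arm the same 2% difference usually isn’t, which is why “establish a rhythm” matters – significance takes traffic and time.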

Autonomy, that goes back to teams. You really want your growth team to be able to autonomously make changes and run their experiments without a lot of overhead. And then accountability, you’re not going to succeed the majority of the time. In fact you’re going to fail most of the time with these experiments. But the important thing is that you keep learning and you’re looking at your batting average and you’re improving things.

Setting Goals

So Brian’s system has a macro level and a micro level. You set three levels of goals: one that you’re most likely to hit, so 90% of the time you’ll hit it; another goal that you’ll hit probably 50% of the time; and then a real reach goal, which you’ll hit about 10% of the time. An example would be: let’s improve our activation rate by X%. This is our stated goal. Now, for 30 to 60 days, let’s go heads-down and run experiments until the 60 days are up, and we’ll look and see if we hit our OKRs, with checkpoints along the way. Then you zoom in and you experiment. This is the week-by-week basis: every week you’re going through this cycle.

Experimentation Cycle

So there are really four key documents as part of this experimentation cycle. The first is the backlog. That’s where you catalog all your different ideas. Then you have a pipeline, which tells you what you’re going to run next, as well as what you’ve run in the past, so that somebody new on the team can come in, take a look, and see what you’ve done to get where you are today. Then there’s your experiment doc, which serves as a sort of specification.

So when you’re getting ready to do a big test – let’s say you’re going to re-engineer your referral flow – you’re going to outline all the different variations. You’re going to estimate your probability of success, and how you’re going to move that metric. It’s a lot like software development: you’re estimating how long something’s going to take, and you’re also estimating the impact. And then there are your playbooks, good for people to refer to.

So with Trello it actually works out really well. So the brainstorm column here, the list here, is basically where anybody on the team can just dump links to different ideas, or write a card up saying, “Oh, we should try this.” It’s just totally off the cuff, just clear out whatever ideas are in your head and you dump them there. So you can discuss them during your meeting where you decide which experiments are coming up this week.

The idea is that you regularly go into the backlog. The pipeline holds the ones I’m actually going to do soon: I’ll make a card and put it in the pipeline. Then, when I’m ready to design the experiment, I move it into the design phase and create the experiment doc. And then I set my hypothesis: “I’m going to do this. I think it’s going to have this impact. Here are the different pages on the site I’m going to change, or things within the product I’m going to change.” And then, later in the doc, it has all of the learnings and the results.

So one key tip that Brian talks about: when you’re trying to improve a certain metric, rather than saying, “Okay, how can we improve conversion rate?”, you think about the different steps in the process. It just sort of helps you break the problem into multiple chunks, and then you start thinking a little bit more appropriately. And this is actually where the tactics come into play when you’re brainstorming, because this is where you’d want to look to others for inspiration. If you’re interested in improving your referral flow, maybe use a couple of different products, or think about the products you use where you really thought the referral flow worked well, and then use that as inspiration to improve yours. You don’t take it as prescription. You don’t try to apply it one-to-one, but you think about how it worked with their audience, and then you try to transfer it over to how it would work with yours.
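The step-by-step framing can be made concrete: overall conversion is just the product of each step’s rate, so improving any single step moves the whole funnel. A tiny Python sketch, with all rates invented for illustration:

```python
# A conversion funnel: each step's rate multiplies into the overall rate,
# so "improve conversion" decomposes into per-step levers.
visit_to_signup    = 0.20   # 20% of visitors sign up
signup_to_activate = 0.40   # 40% of sign-ups activate
activate_to_pay    = 0.25   # 25% of activated users pay

overall = visit_to_signup * signup_to_activate * activate_to_pay
print(f"{overall:.1%}")     # 2.0% of visitors become paying customers
```

Seen this way, a 10% relative lift at any one step is a 10% relative lift overall, and the weakest step is usually the cheapest place to find it.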

Prioritization: there are really three factors here. You want to look at the potential impact. You don’t necessarily know it, but you want to gauge the potential impact should this experiment succeed. Then there’s the probability of success, and this can be based on previous experiments that were very close to this one. Like I mentioned earlier with the customer testimonial: you had a certain level of success with that in one part of your product or website, and you’re just going to reapply it elsewhere. You can probably set the probability to high, because you’ve seen it in action with your product before.

But if you’re venturing into a new space – let’s say Facebook ads, and you’ve never run them for your product before – you don’t know what parameters to target. You don’t know anything about how the system works, the dayparting and all that, so you probably want to set the probability to low. And then, obviously, the resources: do I need a marketer? Do I need a designer or a developer, and how many hours of their time?
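One way to roll those three factors into a single ordering is an expected-impact-per-hour score. The formula and numbers below are our own illustration of the idea, not something prescribed by the Balfour method:

```python
# Rank experiments by impact x probability / cost - highest expected
# payoff per hour of effort goes first. (Toy scoring, invented numbers.)

def priority(impact, probability, cost_hours):
    """impact: 1-10 guess; probability: 0-1; cost_hours: estimated effort."""
    return impact * probability / cost_hours

experiments = [
    ("testimonial on pricing page", priority(impact=3, probability=0.8, cost_hours=4)),
    ("first Facebook ads campaign", priority(impact=8, probability=0.2, cost_hours=20)),
    ("rework referral flow",        priority(impact=9, probability=0.5, cost_hours=30)),
]
experiments.sort(key=lambda e: e[1], reverse=True)
print([name for name, _ in experiments])
```

Note how the cheap, proven testimonial test outranks the big referral-flow rework even though its impact is smaller – which matches the talk’s advice to reapply known wins before venturing into unknown channels.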

So once you move something into the pipeline, I like to have my card look like this: I have my category, my label. So this is something to do with activation – trying to increase our activation rate. And then I say, “If successful, this variable will increase by this amount because of these assumptions.” Then you talk with your team about those assumptions and try to explain why. The experiment doc, as I mentioned before, is sort of like your spec. Rather than implementing the real thing upfront, if you can get away with just putting up a landing page and worrying about the behind-the-scenes process later, do that. Like if you’re thinking about changing your pricing: maybe change the pricing on the pricing page, and don’t do all the accounting and billing-code modifications just yet.

Implement – there’s really not much to say about that. The second-to-last step is to analyze. You want to check yourself on that impact. Did you hit your target? Was it more successful than you thought, or less successful? And then, most importantly, why? Really understand why the experiment worked. Was it because you did something that specifically keyed in on one of the emotions your audience has? Then maybe you carry that through to later experiments.

And then systemize. A good example of systemizing comes from HubSpot: the inbound marketing assessment. It’s actually their number-one lead-gen channel: for any company that wants one, they’ll sit down one-on-one and do a full assessment of its website, its marketing program, et cetera. When they were doing these one-on-one discussions, those became their best leads, the ones most likely to convert.

So they made something called Website Grader, which you can find online. It’s sort of like the top of the funnel for that marketing assessment. Someone wonders, “Ah, I don’t know if my website’s good at SEO – am I getting a lot of links?” So they plug it into the grader, it goes through and gives them a grade, they get a nice report, and then a sales rep in the territory that person lives in has a perfect lead-in to an inbound marketing assessment – which they know is a high-converting activity, should someone actually get on the phone with their consultant. So it’s a good example of productizing.

How It Works In Practice

So this is just sort of how the system works. Monday morning we have our only meeting. It’s about an hour and a half, and we go through what we learned last week. We look at our goals and make sure we’re on track for our OKR, which is our Objective and Key Result. Then we look at what experiments we’re going to run this week, and the rest of the week is all about going through that rapid iteration of the cycle: brainstorming, implementing, testing, analyzing, et cetera.

So you kind of go through these periods of 30 to 90 days of pure heads-down focus, and then afterwards you zoom out and ask, “How good am I at predicting the success of these experiments? Are we predicting that we’re going to make big impacts, or small impacts? Are our resource-allocation predictions accurate?” And then you want to always be improving on throughput. So if you were able to run 50 experiments during a 90-day period, in your next 90 days you want to be able to run 55 or 60. You always want to be improving.

4 Steps to Onboarding a New Support Team Hire

February 26th, 2015 by Derrick Miller

At Fog Creek, every now and then the Support team has the pleasure of onboarding a new member. We know what most of you are thinking: “Wait! Did they just say ‘pleasure’?” Yes, yes we did. Team onboarding does not have to be an irksome obstacle in your day-to-day work – it’s a key milestone in your new hire’s long-term success, and the process should be repeatable and reusable.

If you’ve ever been in support, you know there can be a lot of cached knowledge representing the status quo, and there is usually, and sometimes exclusively, a “show and tell” style of training. This is the fire drill of knowledge transfer, and it’s an arduous process for all concerned. Not only does it take longer for the new hire to get up to speed, but you also have at least one person no longer helping your customers. For anyone who has ever worked in a queue, we’re sure you can agree that when someone steps out, the team feels the impact immediately.

Our Support team mitigates this by using a well-documented onboarding process. And we do it without a giant paperweight… err training manual. Similar to how we onboard new hires to the company, we also leverage Trello to onboard our new Support team hires. The items on the board are organized and the new hire just works down the list.

The board separates out individual items and team-oriented items. This keeps the new person accountable for their tasks, and it keeps the team involved so that they don’t accidentally abandon them.


1. Read Up on the Essentials

The first item on the board is titled “What to read on your first day”. This card links to a wiki page that talks about the things the new person needs to know before they can do any real work.

Next, is the “Support Glossary”. This is essential as they’re going to hear words, phrases, and acronyms galore. So scanning through this card helps them start to get a feel for the “lingo”.

With this done, it’s time to join the company chat and get a few nice “hellos” and introductions from other folks in the company. Primarily, this stage helps them to start assimilating the knowledge they’ll need to be successful in the role.

The assimilation process starts with briefly describing the Support team’s role and responsibilities within the organization. This covers our two main workflows: interrupts and queue-based. Then we move on to our guiding customer service principles.

After reading several more cards, which each link off to wiki pages, the new person moves them over to the ever-so-rewarding “Done” column. Starting to feel accomplished, they can start to get their hands dirty.

You may be wondering “couldn’t they just have one card and link to a wiki page with a list of articles?” Sure. But, that process tends to be more of a rabbit hole, and we want our Support team hires to have just the right amount of information in phases, and not dumped on them all at once.

2. Dogfood Until it Hurts

After reading for what probably feels like weeks (not really – a day, maybe), the new person starts using our products. Since we dogfood our own products, this is a great way to discover and learn about them. They can later use this experience to relate to, and help, new customers. They create production accounts and staging accounts, and start a series of configurations. This helps them get into the Support workflow.

3. Go Under the Hood

Configuring web application sites isn’t all that hard, so we up the challenge. The new hire starts creating any necessary virtual machines (VMs), each identified on a separate card on the board, naturally. These VMs aid the new Support team member in troubleshooting customer environments by replicating them as closely as possible.

Since Kiln and FogBugz sit on top of databases, the new person also starts to configure those systems and get familiar with the database schemas. This helps build an understanding of our products’ foundations.

Once they have what we call the “basics”, they can start tricking out their dev machine. This card links to another board with all the juicy details maintained by all devs in the company.

4. Get Immersed in the Workflow

There are several more cards which discuss processes and procedures. These include when to use external resources, where they are located, and how to use them.

A key part of Support is a robust workflow. The team helps the new person get immersed into the workflow by adding them to scripts, giving the repository permissions in Kiln, adding them to recurring team calendar events, and so on. Most importantly, they start to see how the Support team shares knowledge and work on some real customer cases where they will be helping our customers be amazingly happy!

We’ve found that using a lightweight, but clearly defined process, to onboard a new hire to our Support team is key to their efficiency and long-term success. It helps the new hire become self-sufficient, as well as know where they can go for help as they gain experience.

Intro to Electronics – Tech Talk

February 20th, 2015 by Gareth Wilson

 

In this Tech Talk, Lou, a Support Engineer here at Fog Creek who runs our internal weekly Maker Space meet-up, gives an introduction to Electronics. He explains how electronic components like resistors, capacitors, diodes and switches work, along with key laws like Ohm’s Law and Kirchhoff’s Voltage and Current Laws, and how series and parallel circuits differ.

 

About Fog Creek Tech Talks

At Fog Creek, we have weekly Tech Talks from our own staff and invited guests. These are short, informal presentations on something of interest to those involved in software development. We try to share these with you whenever we can.

 

Content and Timings

  • What is Electricity (0:00)
  • Electric Circuit Water Analogy (1:47)
  • Resistors (3:10)
  • Ohm’s Law (3:43)
  • Kirchhoff’s Voltage Law (6:00)
  • Kirchhoff’s Current Law (6:40)
  • Capacitors (8:40)
  • Recommended Resources (12:20)

 

Transcript

What is Electricity

Electricity is a fundamental form of energy observable in positive, yadda, yadda, yadda, alright, we’ll just focus on a little bit at a time. Electricity is a fundamental form of energy according to our current models of how we see the world around us. This is as low as it gets. It will probably change over time, but that’s how we’re representing it for the purposes of this talk. It’s observable in positive and negative forms, OK, there’s plus and minus. It occurs naturally, like lightning, or is produced, as in a generator. We all know this, this is great. And it’s expressed in the movement and interaction of electrons, and I think that’s really the key point: what we’re doing is pushing around electrons. They’re doing all of the heavy lifting. Simple atomic model – let’s go with hydrogen right here. A quick review: we have a proton in the center carrying the positive unit charge, and an electron out in the orbitals carrying the negative charge. So what does that mean, how do we use that knowledge? When we look at our conductors, our metals – copper’s a great conductor – and start looking at how the electrons are organized, they’re organized into shells. Shells are filled from the lowest energy level to the highest, and every time you fill a shell and step up to the next one, there’s a jump in energy levels. We don’t really need to know about the energy levels to do electronics, but the neat thing is that as atoms grow larger, with more protons and electrons, there’s more attraction being put on all of the electrons. So when you jump an orbital and start building a new one, that’s where you get a weak interaction with the outermost electrons, so they’re the easiest to bully. And they’re the ones we’re going to pick on.

So we want to make them work. We talk in terms of voltage and current. Voltage is the pressure that you put on the electrons, the amount of motive force you’re pushing behind them, whereas current is a reflection of how many electrons you’re actually putting to work. So the number of electrons that you’re pushing past a point is the current, and the amount of pressure on those electrons, or for lack of a better term the speed they’re travelling at, is the voltage. That’s not accurate, but it’s a useful analogy.

The water analogy is probably something you’ve seen before if you’ve ever looked at electronics. Voltage is like water pressure in this analogy, and current is like water volume. So the amount of water you have flowing past a point is your current; we talk about the amount of water that flows through a point, or the current through an electronic component. For pressure, we talk about the amount of pressure across a component, or the pressure gradient of the water from the start to the finish of the component.

Just to give a picture of some of the things we’ll be looking at – batteries, that’s your voltage or your pressure source. We talk about ideal voltage sources because it’s really easy to pretend that the battery always puts out 9 volts. In practice, that’s not true, but we’re going to pretend that it always puts out a fixed 9 volts, as it makes things a lot easier to understand as you’re learning. And that’s going to be like our water pump; that’s how you give that motive force to the water.

Resistors

Wires are like pipes: they carry our electrons; that’s how we guide them and send them where we want them to go. Resistors are components that restrict pipe size, so with a resistor we can determine how many electrons can get past a point. This is how we limit current. Everything has resistance, but resistors are specially designed to provide a larger amount of resistance than normal, at fixed levels, so that we can control that flow. We treat everything else as ideal, as not having resistance, because it makes things easier. So we pretend that the only things that have resistance are these resistors.

Ohm’s Law

Ohm’s Law tells us that voltage is equal to current times resistance. If you increase your current, which is confusingly named I, you have to reduce your resistance to get the same amount of voltage. If you increase your current and keep the same resistance, then you’re going to need a bigger voltage supply to compensate. And that’s really helpful, because again, resistors are how we control current, and this is why. For a fixed voltage supply, the resistance is the thing you’ll be adjusting, because you’ll pick your resistors accordingly. And likewise, if you pick a bigger resistor then you’ll get less current.

How do we know how much work is being done across the circuit? How do we know that the LED is going to light up? We have to account for all of the voltage. In a complete circuit, the pressure from your power supply, your voltage, is going to drop across all of the components. So we have to figure out how much voltage is dropping across each component. What we really want to find out is how much voltage is dropping across this resistor, because then we can use the voltage and the resistance to determine the current flowing through the circuit. Why is that important? If you throw too many electrons through the LED, it blows up. Well, it doesn’t blow up, but it smokes and doesn’t work ever again. So I did the arithmetic for us, because no-one likes to do arithmetic. LEDs have a forward voltage; it’s like a pressure gate on a water valve. You need to put a certain amount of pressure across the LED or else it doesn’t turn on. Forward voltages vary: as you go from red to blue to white, the forward voltage tends to increase. I picked 2.2 volts because that’s pretty common for a red LED. The remaining voltage is 6.8 volts: since we have 2.2 volts dropping across the LED, all 6.8 volts have to drop across the resistor to make a total of 9 volts. So what does that mean? If you do the arithmetic – and we don’t have to concern ourselves with it, because computers can do arithmetic, they’re great at that – what you get is 0.021 amps, so roughly 20 milliamps. 20 milliamps happens to be a great value for getting a lot of brightness out of an LED without overdriving it. It’s safe for the LED, and the 330 Ohm resistor – if you’ve ever bought an Arduino kit or anything similar, you’ll see a lot of 330 Ohm resistors – is not uncommon when you’re dealing with 5-9 volt power supplies. The math works out so that the current stays in a pretty safe place.
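
The arithmetic from the talk can be checked in a few lines of Python, using the values as given: a 9 V supply, a 2.2 V forward voltage for the red LED, and a 330 Ohm resistor.

```python
# Ohm's law (V = I * R) applied to the LED circuit from the talk.
SUPPLY_V = 9.0        # ideal battery voltage
LED_FORWARD_V = 2.2   # typical red LED forward voltage
RESISTOR_OHMS = 330.0 # current-limiting resistor

# Whatever voltage the LED doesn't drop must drop across the resistor.
resistor_v = SUPPLY_V - LED_FORWARD_V   # 6.8 V
current_a = resistor_v / RESISTOR_OHMS  # I = V / R

print(f"{current_a * 1000:.1f} mA")  # ~20.6 mA, safe for most LEDs
```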

Kirchhoff’s Voltage Law

Kirchhoff’s Voltage Law, in plain English, says that the sum of all voltages in a loop is 0. Resistance in a series circuit sums. If we have a 330 Ohm resistor and a 3.3K Ohm resistor, we’re going to have 3,630 Ohms. So now we know that we could model this as one resistor if we wanted to make things simpler. You could take those two resistors out and put in a 3,630 Ohm resistor. You’re not going to find one of those, which is why the standard resistor values are all weird; they’re designed that way so you can build pretty much any resistor value you can imagine. Current is reduced by the increased resistance, because remember, resistors restrict the flow of electrons.
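
The series rule is simple enough to sketch directly, using the two resistor values from the talk:

```python
# Resistances in series simply add.
def series(*resistors):
    return sum(resistors)

# The 330 Ohm and 3.3K Ohm resistors behave like one 3,630 Ohm resistor.
r_total = series(330, 3300)
print(r_total)  # 3630
```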

Kirchhoff’s Current Law

So what we have here is two loops. You can examine each loop independently. Kirchhoff’s Current Law tells us that the sum of all currents at a node is 0. Or: you don’t get nothing for free. So looking at each of these individual circuits – we can just wipe out the bottom loop here and deal with just the LED and a 330 Ohm resistor – we know that it’s going to get 20 milliamps. The other circuit is exactly the same. Same resistance, same LED. They’re both going to draw 20 milliamps. If you want 20 milliamps for one circuit and 20 milliamps for the other circuit, you’re going to need 40 milliamps out of your power supply. You don’t get nothing for free.

In parallel circuits, resistances sum inversely. This might be intuitive, because think about what we just said: the whole circuit is now drawing twice as much current, while the same amount of current as before flows through each individual branch. If you reduce this down to a single equivalent circuit, you’d be looking at a total resistance of 165 Ohms. Current has increased circuit-wide. Voltage drops are independent across each loop. The pressure doesn’t drop – we’re going to treat it like there’s no internal resistance at the nodes, things like that. Again, in practice things can be a little different, but in ideal analysis this is how we treat everything. And that’s pretty much what you’ll see if you start playing around with meters and looking at this in deeper detail.
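
The inverse sum can be checked the same way. Two 330 Ohm branches in parallel give the 165 Ohms mentioned above:

```python
# Parallel resistances sum inversely: 1/R_total = sum(1/R_i).
def parallel(*resistors):
    return 1.0 / sum(1.0 / r for r in resistors)

r_total = parallel(330, 330)
print(r_total)  # 165.0 - half the resistance, twice the total current
```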

Ok, you’re now an expert on making things light up. You know how to make things light up by themselves, in loops, and in parallel. If you want, you can throw more LEDs in series, just keep in mind that forward voltage. Let’s say you have a 2.2 volt LED. You put two in series and it’s going to be a 4.4 volt drop. You put 4 in, it’s going to be an 8.8 volt drop. You put that fifth one in, it’s going to go over 9 volts, and if you can’t put enough pressure across them then they don’t open at all – the LEDs don’t light up, it’s like you cut the circuit. Nothing will light up.
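
That forward-voltage budget is just division, using the talk's values of a 9 V supply and 2.2 V per LED:

```python
# Each series LED needs its full forward voltage, or none of them light.
def max_leds_in_series(supply_v, forward_v):
    return int(supply_v // forward_v)

print(max_leds_in_series(9.0, 2.2))  # 4 -- a fifth LED would need 11 V
```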

Capacitors

Capacitors are a curve ball. They operate a little bit differently. Capacitors are like a storage device; in the water analogy they would be one of these nice big water towers. The size of the capacitor determines how many electrons it can hold. Capacitors are measured in farads, and a farad is one coulomb per volt. So if you have electrons coming out from the upper left – again we have parallel circuits – the voltage is going to push electrons onto the capacitor until it’s full. Then it’s going to stop pushing. Current is also going to flow through our LED and our 330 Ohm resistor – the resistor, again, so we don’t blow the LED up. Then we’re going to come back to the battery. In the upper right corner is our switch, a push button. By default that push button is off, so it breaks the circuit. No electrons are going to flow until we’ve pushed that button. That’s how you control things. Once you push the button, electrons are getting through and getting dumped onto the capacitor. When you release the button, electrons flow from the capacitor’s negative plate through the LED and through the resistor, because now there’s no pressure from the battery holding the electrons in the capacitor. So they’re going to flow around to the other side of the capacitor. The capacitor in a way works like a battery, though the storage tank analogy isn’t great, because water doesn’t refill its own storage tank. You have a complete circuit, with the capacitor acting as your voltage source. It provides the pressure, and the complete circuit is the LED and the resistor. Everything up top goes away, because remember, when we switch that switch off, the battery can’t take those electrons, so it’s not a circuit. You push that button and electrons race out as fast as they can at 9 volts of pressure, filling the capacitor very quickly and keeping the light lit.
When you let go of that button, there’s no pressure holding those electrons back in the capacitor any more; it’s like you opened the valve on a water tank and out comes all the water. They have to flow through the light, they have to flow through the resistor, but there are only a limited number of electrons, a limited capacity on that capacitor. Once that charge is expended, the light goes out. And because the capacitor only has a certain amount of pressure on it, as it expels electrons it loses voltage; it loses the ability to maintain that pressure. The same thing happens in batteries, but it’s a little bit different. This happens very quickly in smaller capacitors, and as the voltage drops, remember your circuit analysis: you have to have a certain amount of voltage to keep that LED open. As the voltage on the capacitor drops, your pressure drops, the voltage across the resistor changes, which changes the current, which changes the speed at which your capacitor drains. And when you change the current, you’re going to see dimming in the bulb, the LED.

So the higher the farad rating of the capacitor, the longer it takes to charge, because it can hold more electrons. The time constant is pretty interesting. I find that it never really works as planned, but in general it holds up pretty well. You take the resistance in Ohms and the capacitance in farads; if you multiply them together, that gives you roughly the two-thirds (really about 63%) mark of the discharge of the capacitor. If you fully discharge a capacitor, that’s what that number is going to look like. Again, with LEDs you have that forward voltage – it cuts off at 2.2 volts, so it gets a little wonky – but you get some idea of how much charge that capacitor has.

Recommended Resources

It all gets more complicated from here, but it’s a lot more fun. A few places to get stuff from: SparkFun is great. Fritzing is awesome; it lets you build all of the schematic and breadboard diagrams. Adafruit is local to New York City, also a great place to get stuff, and they have amazing tutorials. So I recommend checking those out as well.

How Our Support Team Share Knowledge Using FogBugz

February 17th, 2015 by Derrick Miller

Support, or Customer Service, at Fog Creek is different. The team is empowered to help our customers and do whatever it takes to solve a problem. We take great pride in the customer service we make available to our customers and often write about it – from seven steps to remarkable customer service, using it as a competitive advantage, dealing with angry people, to our 5-part series about how we do customer service, and how Trello (a product we created and spun off into its own company) uses FogBugz to support 4 million members with just one person.

Something we haven’t written much about is how the Support team shares knowledge among its members. The nature of Support is fast-paced. There is a constant flow of knowledge – from archival documentation to streaming updates on issues and features.

Staying abreast of this information is difficult, but our Support team stays on top of it all with FogBugz and its search, wikis and cases, auto-subscriptions, shared filters, and subcases. We’ve broken down how we use each of these features to help share knowledge here at Fog Creek.

Write Down Your Internal Processes

The team uses wikis for longer-lived documentation. As a Support team member, you know to look in the “Customer Service” wiki before interrupting someone else. This wiki contains archival information for newer hires (FogBugz has nearly 14 years of history) and documentation of internal processes, among other gems. The great thing about the information in the wiki is that it is searchable. There’s also a clear outline and hierarchy to help organize the content and assist someone trying to find a piece of information. For example, if a new Support member is looking for general information to get started, they can see the table of contents and click the “General Customer Service” link.

wiki_main

If a few terms in the article are known, run a search. For example, here a Support member is looking for an article about To-Do items not completing:

wiki_search2

Document Your Thoughts

In addition to the Wiki, the Support team leaves artifacts for fellow team members in cases. They know that someone will leverage the power of search to help them with future cases. When working on a case from a customer, they write out their hypotheses or thoughts on how to approach the problem in the case comments. They make sure to include error messages too. If the case changes hands because a team member is out of the office, the information for troubleshooting it is already in the case, saving the new team member from starting over.

case_hypotheses

In the Support world, log messages are gold. We include them in case comments to improve searchability. Searching helps them develop additional hypotheses or determine ones to avoid. Perhaps another Support member looking at the case above would say “Why wouldn’t we check the Chrome console logs first for any errors?” and add that to their troubleshooting steps.

For the occasional “FYI” situation, the Support team uses the “Notify More Users” field on a case to explicitly let a team member know of new information. This organically grows knowledge throughout the team. Here we have Mary editing an inbox case and notifying Erin:

case_notify

Stay Updated Automatically with Auto-subscriptions

The auto-subscribe feature is used by the support team to keep up with key Project Areas in FogBugz as well as cases they create.

These key areas are “Known Issues” and “Fix It Twice”. The “Known Issues” area contains cases that quickly document a recently uncovered issue with our production (i.e. live) On Demand service and the communication needed both internally and externally. This could be an unfortunate service interruption, a regression bug, or a new feature not working as expected. These issues can be reported either internally or externally. Here, we have an issue reported externally that Mary is creating a Known Issue case about (you can also do this for “Fix It Twice” cases):

case_knownissue

Once the case above is created, Mary will be subscribed to the case so that she can see when Jamie adds the related subcases. Jamie has the same auto-subscription preferences, so he’ll be able to automatically see updates on the case that Mary makes. This is visible under the “Subscribed To This Case” field on the left-hand side of a case if you have this feature enabled.

list_subscribers

Essentially, any time a case is updated or created by someone, the rest of the Support team instantly knows about it. They can use this information to react to any cases they are currently handling.

Bonus tip: You can auto-subscribe to Wikis too!

Don’t Repeat Yourself with Shared Filters

To further take advantage of FogBugz’s search capability, the Support team uses three primary shared filters: one for customer cases due today, one for the “Known Issues” area, and one for the “Fix It Twice” area. This is in addition to their own personal filters for cases assigned to them.

sharedfilters

The filter “Support: Next Due” is very important because of our promise to our customers: we answer all email within one business day. It shows all cases due today. Having these shared filters available to every team member without any extra effort (read: without creating their own identical filters) saves everyone time so they can focus on their work. New team members briefly look at the list of cases that are due today (and not assigned to them), and subscribe to anything new that they might learn something from. They don’t subscribe to everything, because that is quite simply too much information, but cherry-picking some cases to “eavesdrop” on helps fill knowledge gaps quickly. The added bonus of eavesdropping on cases is that a new team member will see the culture and tone of the company and use that in his or her own cases.

Keep Things Organized with Subcases

“Who is currently affected by that new Known Issue?” is a question the Support team will ask for every new Known Issue that they get a notification on. The answer is in the subcases. When a case is added to the “Known Issues” area, this case becomes the parent case, and any and all customer cases become subcases. The Support team uses this hierarchical relationship to get a top-level view of who is affected and needs communication about the issue. See “Subcases” on the left-hand side of a case, or click “Case Outline” if there are several.

FogBugz’s search feature is tremendously powerful. The Support team uses the available search axes to narrow down cases and/or wikis that contain relevant information. Has a similar issue happened before? How did someone else approach solving the problem? Didn’t someone already report this problem? And countless other questions. For example, a full-text search for “todo item not completing” can be run to see what wikis and cases show up:

todoitemnotcompleting

The key search axes for the Support team to sift through full-text results are:

  • orderby:lastedited
  • edited:”today”
  • edited:”-1w..now”
  • type:case
  • type:wiki
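
These axes can be combined with free text in a single query. As a hypothetical example (assuming the axes compose as listed above), a Support member chasing the To-Do problem from earlier might search:

```
todo item not completing type:case edited:"-1w..now" orderby:lastedited
```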

The key takeaway here is that the information exists in cases and wikis because someone took the time to write it down. Don’t worry, the first time you write something down, it doesn’t have to be perfect. You can always go back and edit what you wrote.

Creating an auto-subscription, a shared filter, and writing things down in wikis and cases will put you well on your way to getting the most out of sharing information with your team. The powerful search facility provides one way of accessing the increasingly valuable knowledge base growing inside of your FogBugz.

Go and Artificial Intelligence – Tech Talk

February 13th, 2015 by Gareth Wilson

 

In this Tech Talk, Tim, a Software Engineer here at Fog Creek and 2-dan amateur Go player, talks about Go (the board game), Artificial Intelligence and attempts to create computer programs that can beat human players. He gives an overview of Go, explains how to play it and why Go AI is hard. He finishes by describing the progress so far with Go AI programs and what the future is likely to hold.

 


 

Content and Timings

  • Introduction (0:00)
  • Overview of Go (0:32)
  • How to Play Go (4:37)
  • Go Artificial Intelligence (12:22)
  • Progress with Go AI (20:04)

 

Transcript

Introduction

Alright, so I’m going to talk a little bit about the board game Go. I’m sorry to all of you who were hoping for a discussion of the Go programming language. And also about some of the progress that has been made with Artificial Intelligence in trying to get computers to be as good as humans at Go.

So first I’m going to tell you a little about the game, then I’m going to teach you how to play, then we’ll talk about a few things that come up in the game that make Artificial Intelligence for Go difficult and then take a look at the state of the art and progress that has been made.

Overview of Go

So Go has three names, and you probably haven’t heard the others. Wei-chi is the Chinese name and Baduk is the Korean name. Go came to us from Japan, so most people in the US call it Go. It’s around 3 or 4 thousand years old, and it’s the oldest game that is still played in its original form. It was one of the ‘Four Accomplishments’ that were required of Chinese gentlemen back in the day. So you needed to be able to do calligraphy and painting, and you needed to be able to play the lute and play Go. So after this talk you’ll be a quarter of the way there, at least.

In Japan, the Shogun Tokugawa established four Go schools around the 1600s. Each of those schools was supposed to work on the game, try to perfect it, and then once a year there would be a large tournament called the Castle Games. It wasn’t large in terms of people – each school was allowed to send one representative – but it was a big deal. The winner of the Castle Games got a cabinet-level position in the Government. So there were a lot of incentives to improve at Go, and this is where Go skill really started to blossom, I guess you could say. Then in the 1900s it became possible to make a living playing Go. In Japan, a professional system was created in the 20s, and newspapers started offering large prizes for tournaments. Currently the best you can do at Go, if you win a large tournament, is win half a million dollars. So, not too shabby. Professional systems were established later on in Korea and China, and Go is more popular there now than any game is in the US.

This is a recent street fair in Korea, and you can see there are a few people who know how to play Go on the street there. I think there are about 500 players, and the people in the middle wearing the yellow sleeves are the professionals who were there. They are each playing something like 8 or 10 simultaneous games against passers-by who just stopped off to play.

So, what is Go like? It is a two-player strategy game, something like Chess. There’s no hidden information or randomness. Usually we play on a 19×19 board, but there are two smaller sizes that are used to teach beginners, or just for a shorter game. And the basic goal of the game is to surround territory. Also within Go there are rankings and a handicap system. So you start out as a beginner, you start out around 25 kyu, somewhere at the bottom and then you progress upwards. So you can go 24 kyu, 23 kyu, not all of the ranks are shown on this chart. And once you get up to 1 kyu as an amateur, you switch to 1 dan, which is basically like a Black belt, and then you work your way up to 6 or 7 dan. So you can see on the left here the EGF rankings, the European Go Federation uses numbers instead of the rankings. And those correspond roughly to Elo ratings, so if you’re familiar with Chess ratings they’re something like that.

Once you get to the strongest amateur levels, then you get professional levels above that. And those are not the same distance apart: you can see the third row from the top here is 2700, and that’s a 1 dan professional, and then the top professionals in the world are 9 dan, and the difference in rating points is only 240 there. The difference in ranks is determined by the number of handicap stones that a player would need to take playing against someone stronger or weaker. So if you’re three ranks apart, then the weaker player would start with three stones on the board and then you’d be able to have a good game. That’s actually one of the really cool things about Go: you can play someone who is quite a bit stronger or weaker than you and still have a good game, and it doesn’t really change the feel of the game very much. In Chess, it’s kind of hard to do that, right. If you’re 400 points weaker than somebody in Chess, they’re going to beat you every time. And you can try something like spotting them a piece or the Queen, but it sort of changes the game. You’re not really playing the same game anymore. So in Go you can start with a handicap and it still feels the same.

So this is what a finished game looks like. Your pieces are actually played on the intersections of the board. That’s a common beginner thing: almost every other game we know, you play in the spaces, but in Go you play on the intersections. And the idea is to surround space. The way it works is that players put a stone on an intersection on their turn. So you add a stone to the board, and that’s it. The stones don’t move around. They can be captured – if they get completely surrounded then you can remove them from the board – but they stay in place once they are added.

How to Play Go

Alright, so now I’m going to teach you the game. There are only three rules. One is the rule of capture. Any stone on the board has breathing spaces next to it, or liberties, and you can see the liberties marked on the diagram. The stone in the corner has two, the stone in the middle has four, and stones that are connected horizontally or vertically share their liberties. So this group of two stones on the right here has six liberties, and the one at the top has four. Those stones will live or die as a group.

If you fill the liberties around the stones, they are captured. So here are some stones that are almost captured. This is called being ‘in atari.’ It’s kind of like check in Chess. So if it’s black’s move, black could capture any one of these four groups on the next move – and that actually is where the name of the company Atari came from. Nolan Bushnell is a big Go fan. So if it’s black’s move, black could play on the last liberty for any of these white groups, and then whichever one he played on would disappear. He obviously couldn’t play all four of those moves at once. So those stones would be captured and removed from the board, and each one of those stones is worth a point at the end of the game. If it was white’s turn when white got into atari, then white could play another stone connected to the ones that are in danger and gain more liberties for them. So it could extend out of atari and try to get away. You can see the stones on the top edge here each now have two liberties – one to the right and one below the last stone. And the ones in the middle that have extended each have three liberties now. So it’s going to be harder to capture those. Alright, so that’s rule one.
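
The capture rule is easy to sketch in code. This is a minimal illustration, not from any Go library: it assumes a 19×19 board stored as a dict mapping coordinates to 'b' or 'w' (empty points absent), and flood-fills a group to count its shared liberties.

```python
# Count the liberties of the group containing the stone at (x, y).
# A group whose liberty count reaches zero is captured.
def liberties(board, x, y):
    """board: dict mapping (x, y) -> 'b' or 'w'; empty points are absent."""
    color = board[(x, y)]
    group, frontier, libs = set(), [(x, y)], set()
    while frontier:
        px, py = frontier.pop()
        if (px, py) in group:
            continue
        group.add((px, py))
        for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
            if not (0 <= nx < 19 and 0 <= ny < 19):
                continue  # off the board
            stone = board.get((nx, ny))
            if stone is None:
                libs.add((nx, ny))        # empty neighbor: a liberty
            elif stone == color:
                frontier.append((nx, ny)) # same color: part of the group
    return len(libs)

# Two connected stones in the open share six liberties, as in the diagram;
# a lone corner stone has only two.
print(liberties({(5, 5): 'b', (6, 5): 'b'}, 5, 5))  # 6
print(liberties({(0, 0): 'w'}, 0, 0))               # 2
```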

Rule two, very simple. You are not allowed to commit suicide. So white cannot play any of these four points because it would cause the stone just played to die.

The third rule is the rule of Ko. Ko in Japanese means eternity, and this rule exists to prevent infinite loops in the game. So if you have a shape like this in the game, black can capture a white stone. And if he captures that white stone, now you’re in the symmetrical situation, and white can capture and take us right back to where we started. That is not allowed. We can’t have any infinite loops. The simplest way to say this is that the whole board position cannot be repeated. So after black captures the white stone, white cannot recapture immediately. White has to go change something else on the board, and then on his next turn he can come back and capture the black stone. And that actually leads to some complexity that we will see later on. That’s it. Three rules: you can capture stones by surrounding them, you can’t commit suicide, and you can’t keep repeating the same position over and over. So you’re all ready to go play Go now.

The scoring is based on the territory that you have surrounded. The way the game ends is that both players pass. You’ll get to a position at some point where playing in your own territory would be bad, because you’d be filling up your own points that you have surrounded, and playing in the opponent’s territory would be bad, because the stones you play on their territory would end up getting captured. So there’s nothing productive to do, and at that point you pass, and if the other player then passes, the game is over.

So you count the spaces that you have surrounded; they’re each worth 1 point, so the empty points here with the white and black squares on them are a point each. The black group and white group that have squares on them are dead – this is actually one of my games, and at the end we agreed that the black group in the upper left and the white group in the lower right were dead. So at the end of the game, here we’ve got black territory in the lower left, in the upper right, and in the lower right. Then we have some white territories. We count the empty points, and we also take the agreed dead stones off the board; each of those is worth a point to the player that captured them. Then we add that all up, and white gets 6.5 extra points, which are called komi, because black has the advantage of going first, so we’re compensating for that advantage. And then whoever has the most points after that wins the game.

OK, so the rules are extremely simple, but there are some interesting things that come out of them. One of which is called a ladder. Here we’ve got a situation where white has some stones that are almost surrounded. He’s got one liberty left, and if it’s white’s move, white could try to run away. So white could extend the stones; now he has two liberties. But in this kind of shape black can play atari again, and then white can try to run away again, and black can play atari again. And you can see where this is heading, right: white is going to run into the wall, and then he’s not going to have anywhere else to run. So these white stones will eventually be captured. A few moves from now it will look like this. White has one liberty, at d19 up here, and if white plays on that point, he will still only have one liberty, so the stones will get captured on the next move by black. So if you have a ladder like this, the stones can run away for a while, but then they’ll run into the edge of the board and they will die.

Things get a little more interesting if there’s a white stone in the way. In this case we have a white stone along the path of the ladder, and that’s called a ladder breaker. So this ladder isn’t going to work for black. If white starts running away, black can make him zig-zag back and forth a few times, and then white will connect to the ladder breaker. At this point it’s actually black’s move – white has just connected to the ladder breaker – and now black has some serious problems. The two points marked with circles are big weaknesses for black, and black only gets to play one stone, so black can only fix one problem or the other. Then white’s going to play the other point. Both of these circle-marked points threaten two black stones at once, and black will again only be able to save one of the two. So white is actually going to bust out of this ladder, and then all of black’s stones are going to be in trouble. So you really only want to play a ladder if it works. It’s very, very bad to play a ladder that doesn’t. And we’ll see why that’s important a little bit later on.

Here we have a white group that is completely surrounded by black and has one liberty in the middle of the group. If it’s black’s turn, black is allowed to play in the center of this group. When we play the black stone, the black stone that was just played and the white stones will not have any liberties – but we first remove the opponent’s stones before we check to see if our move was suicide. So we’re allowed to go ahead and play here, because the white stones are going to die, and that will give the black stone breathing space. So this is possible. But if white has a group that looks like this, there’s nothing that black can do, because white has two liberties and either of those moves would be suicide for black. There’s no way to ever capture this white group. OK, so this is where life comes from: if you can get two eyes for your group, then your group will be alive. And so there are situations like this, where white has a group that is almost alive, and it depends whose turn it is. In this case, if white plays in the middle, white will have two eyes and white will live. If black plays in the middle, white can no longer make two separate eyes, and so this group is going to die.
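The capture-before-suicide ordering described here can be made concrete. This is a rough sketch of my own, not a full Go engine: place the stone, remove any opponent groups left with no liberties, and only then check whether the move was suicide:

```go
package main

import "fmt"

// Board: '.' = empty, 'B' = black, 'W' = white.
type Board [][]byte

func boardFrom(rows []string) Board {
	b := make(Board, len(rows))
	for i, r := range rows {
		b[i] = []byte(r)
	}
	return b
}

// group flood-fills the group at (x, y), returning its stones and liberty count.
func (b Board) group(x, y int) (stones [][2]int, libs int) {
	color := b[y][x]
	seen := map[[2]int]bool{}
	var walk func(x, y int)
	walk = func(x, y int) {
		if y < 0 || y >= len(b) || x < 0 || x >= len(b[0]) || seen[[2]int{x, y}] {
			return
		}
		seen[[2]int{x, y}] = true
		switch b[y][x] {
		case '.':
			libs++ // an adjacent empty point is a liberty
		case color:
			stones = append(stones, [2]int{x, y})
			walk(x+1, y)
			walk(x-1, y)
			walk(x, y+1)
			walk(x, y-1)
		}
	}
	walk(x, y)
	return
}

// play attempts a move and reports whether it was legal.
func play(b Board, x, y int, color byte) bool {
	opp := byte('W')
	if color == 'W' {
		opp = 'B'
	}
	b[y][x] = color
	// Step 1: remove opponent groups that the new stone leaves with no liberties.
	for _, d := range [][2]int{{1, 0}, {-1, 0}, {0, 1}, {0, -1}} {
		nx, ny := x+d[0], y+d[1]
		if ny >= 0 && ny < len(b) && nx >= 0 && nx < len(b[0]) && b[ny][nx] == opp {
			if stones, libs := b.group(nx, ny); libs == 0 {
				for _, p := range stones {
					b[p[1]][p[0]] = '.'
				}
			}
		}
	}
	// Step 2: only now check suicide; the captures may have opened up liberties.
	if _, libs := b.group(x, y); libs == 0 {
		b[y][x] = '.' // illegal move: undo
		return false
	}
	return true
}

func main() {
	// The situation from the talk: a white ring with a single interior liberty.
	b := boardFrom([]string{
		"BBBBB",
		"BWWWB",
		"BW.WB",
		"BWWWB",
		"BBBBB",
	})
	fmt.Println(play(b, 2, 2, 'B')) // true: the white stones die first, so it isn't suicide
}
```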

And then with larger groups, you’ll have situations where you don’t actually play it out. This white group actually has two points in the middle that white could play on to make the group alive, and if black takes one point, white will take the other. This is actually the most common situation: you’ll have a group that’s big enough that it can make two eyes no matter what black does. And so, experienced players won’t try to kill this group – they’ll just know that it is alive and leave it alone.

Go Artificial Intelligence

Alright, so let’s talk about Artificial Intelligence a little bit. Here are a few games and how we’re doing at creating AIs to beat people at them. Tic-Tac-Toe is obviously trivial. Checkers has 10^20 positions, and AI strength there is now perfect: in the last few years, a Checkers program called Chinook has been shown to play perfect Checkers, and they did that by evaluating the entire game tree. So they know that Chinook can’t be beaten – the best you can hope for is a draw. Othello has more positions, and AI is super-human at Othello. And then we have Chess and two types of Go. 9×9 Go is Go on a really small board, the kind you might start out on as a beginner, or just play to have a quick game that lasts maybe 10 minutes. It has 10^38 positions, and computers are now competing with the best people. Chess has more positions than that, and as we all know, Chess computers are better than humans. And then 19×19 Go has an enormous number of positions – 10^172 – and the best computers right now are at the strong amateur level.

The way that Chess AI was approached is through tree search, an evaluation function and alpha-beta pruning. And that’s how the first Go AIs worked as well. We can look at any Chess game as a tree of possible board states, right. We have the initial state at the top, and then we have a branch down from that for each legal move. From each of those positions we have more branches for the legal moves from there, and we build some gigantic tree that contains every possible position. We can search within that tree to find good moves – moves that lead us to good situations and avoid bad situations. So what we do is look at all of the positions that moves will lead to and try to evaluate them. We look at a position and try to decide whether it’s good for white or good for black: basically you get a positive number if it’s good for white and a negative number if it’s good for black. So this happens to be good for white, maybe. I may have that backwards. Anyway, Chess AIs could evaluate the board, and then you take your tree and you basically chop off branches that lead to bad positions for the computer – you make moves that avoid those branches. So you can prune the tree, and then analyse the parts of the tree that you’re actually going to allow to happen by making good moves, and try to find a path to a winning position.
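The tree-search-plus-evaluation-plus-pruning recipe can be shown in miniature. A schematic sketch over a toy game tree (not a chess engine; the tree and scores are made up): leaves carry the evaluation function’s verdict, and alpha-beta bounds cut off branches that cannot affect the result:

```go
package main

import "fmt"

// Node is a position in the game tree; leaves carry an evaluation score
// (positive = good for the maximizing player).
type Node struct {
	score    int
	children []*Node
}

func alphabeta(n *Node, alpha, beta int, maximizing bool) int {
	if len(n.children) == 0 {
		return n.score // leaf: the evaluation function's verdict
	}
	if maximizing {
		best := -1 << 30
		for _, c := range n.children {
			if v := alphabeta(c, alpha, beta, false); v > best {
				best = v
			}
			if best > alpha {
				alpha = best
			}
			if alpha >= beta {
				break // prune: the opponent will never allow this line
			}
		}
		return best
	}
	best := 1 << 30
	for _, c := range n.children {
		if v := alphabeta(c, alpha, beta, true); v < best {
			best = v
		}
		if best < beta {
			beta = best
		}
		if alpha >= beta {
			break
		}
	}
	return best
}

func leaf(s int) *Node { return &Node{score: s} }

func main() {
	// The maximizer picks a branch; the minimizer replies.
	// Minimax value: max(min(3, 5), min(2, 9)) = 3.
	root := &Node{children: []*Node{
		{children: []*Node{leaf(3), leaf(5)}},
		{children: []*Node{leaf(2), leaf(9)}},
	}}
	fmt.Println(alphabeta(root, -1<<30, 1<<30, true)) // 3
}
```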

So Go AI started off doing this. There were basically two approaches at the beginning. There was sort of an opening book, where standard openings were programmed into the computers. But in Go that doesn’t work very well, because the standardized openings might only last 15 or 20 moves out of a 200 or 300 move game, and that’s just not a big advantage. And then there was some pattern recognition stuff done for small positions. But beyond that they basically just attacked the problem using tree search. One of the big problems with that is that the branching factor in Go is much larger than in Chess. In Chess there are on average something like 45 legal moves and a game lasts about 50 moves, so the tree isn’t that deep. In Go there are on average 250 legal moves and the tree is 300 levels high, instead of 50. So there’s a much larger number of positions to search if you want to make tree search productive.

But that’s not the only problem. Actually in Go the board gets more complicated as the game goes on. So one of the things that’s nice about Chess is that when you get to an end-game position, depending on what pieces are left, it can just be a solved problem. So you can have an end-game database that just says, you know, this huge class of positions all result in a win for white and if you can reach one of those in your tree search, then you don’t have to do any more work. But in Go actually, as you get closer to the end of the game, the game gets more complicated. So that is not helpful. Another thing is that it’s hard to prune branches. You can’t even do things like, say if a move leads to an opponent capturing a large group of stones, then we’re going to ignore that branch because even sacrificing large groups of stones can be a good strategy. So throwing those options out is not really a good way to get a strong program.

And then the biggest problem is that evaluating a position is difficult. In Chess it turns out it’s not that hard – you have a simple rule that does a pretty good job of telling you who is winning the game. A big piece of that is just which pieces have been captured, right – which player still has more pieces on the board. In Go it’s much, much harder to look at a board position and decide whether it’s even good for white or black, and I’m going to show you a couple of the reasons for that. One of the big reasons is that Go has a large board – there are 361 points – and you would like to be able to break that down into pieces and analyze the pieces independently, because then the computer can play out all of the possible moves in an area that make sense and decide who is going to win a fight in that small area. But it turns out you can’t do that, because of some non-local effects in the game. We’ve already seen one of these – ladders. And a ladder can result from a complicated fight. Here in the lower-right corner we have a very simple ladder, but you can imagine a fight where a bunch of different groups of stones are fighting for liberties and trying to capture each other, and at some point that fight results in some group of stones running away and getting caught in a ladder. That ladder can go all the way across the board, so the fight depends on the position in a totally different part of the board. If the computer decides that this fight in the lower right depends on the ladder, well, now any move that white makes in the upper-left corner of the board affects that fight. So you really can’t take pieces of the board and analyze them independently when you have something like that happen.

There’s another problem that makes this even more difficult, and that’s the Ko rule. Here we have a black group which is on the edge of being alive. If it’s black’s turn, black can fill in this point, and then black has two separate eyes, so black is alive. If it’s white’s turn, white can capture that single black stone, and if white can get another move in this area, white will then capture the three black stones on the right, because they are now in atari. Because of the Ko rule, black cannot recapture white’s stone immediately – black can’t take us back to that previous position. So what black has to do is find a threat somewhere else on the board that white will answer. And then after white answers it, black is allowed to come back here and capture the stone. So this fight, deciding the life of this group, can rely on threats throughout the board. This group living or dying is worth maybe 25 points, so black needs to find a threat elsewhere that’s worth 25 points, and then if white answers the threat, black can capture the Ko. So we’ll go back to this position. And now white will try to find a threat worth 25 points somewhere, and if black answers that, white will re-capture, and so this can go back and forth for quite a long time, using threats from all over the board. Then finally either black will win and connect and make the group alive, or if white wins, white will play another move here and capture the three stones, and then black’s group is dead because it’s only got one eye left. So Ko is another issue. Ko fights can arise, and then the computer, in order to figure out what’s going to happen with a single group in one particular area, has to actually look at the entire board. I’m getting depressed, this is making me not want to write a Go AI…

Progress with Go AI

Alright, so, what is the state of the art? We started off with tree search, trying to make the evaluation functions better and prune the tree – basically taking the same approach as Chess. That worked OK up to a point; it got computers to somewhere around the ten kyu level. Then there was a breakthrough, and Go AI started to use Monte Carlo Tree Search. What we do with Monte Carlo Tree Search is take a candidate move – these moves near the top here are candidate moves, the leaf nodes that are marked – and play games out from that candidate move. But you don’t use a smart AI to play the rest of the game, because that’s way too slow. You basically just play the moves out randomly from that point until the end of the game. Then you see who won, and you do that over and over, and you keep stats back in the candidate move node about how many of those games were wins for black and how many were wins for white. It sounds kind of counter-intuitive that this would work at all – it seems like it would just be playing lots of stupid moves, and the results wouldn’t really be affected much by the move you’re trying to evaluate. But it turns out this works really well. You can play a whole bunch of stupid moves and see who wins, do that thousands of times, and that will actually give you a pretty good idea of how good the move you’re considering is. So that turned out to help quite a bit. And then there’s another modification that has been applied to this recently, called UCT. What UCT does is, rather than evaluate each candidate move evenly, it tries to work out which moves are likely to be better and which are likely to be worse.
One of the problems with the original Monte Carlo Tree Search is that it would spend a lot of time on moves that it thought were good, but there might be a kind of surprising move that turns out to be really good, and it would not evaluate that move enough to find out. So UCT tries to balance that out and give a decent amount of analysis even to moves that initially look bad. That was a further refinement. And then just a few weeks ago there was a paper published on a new Neural Network approach to Go, and this looks extremely promising. They took a Neural Network and trained it on 160,000 professional games. The way this Neural Network works is that it just looks at the current board – it doesn’t look at the history of the game or anything like that. It just looks at the current state and predicts where the next move will be, and just doing that actually gave a workable AI. It beat a version of GnuGo that is about 7 kyu, and it beat it 90% of the time. That’s surprisingly good, because this approach is not using any of the other work that’s already been done. So once this is merged into the existing techniques, it seems like it will probably give another jump in strength.
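The balancing act UCT performs can be sketched with the UCB1 formula: each candidate’s score is its observed win rate plus an exploration bonus that grows when the move has been visited rarely. This is a sketch of my own; the exploration constant 1.4 is a tunable assumption:

```go
package main

import (
	"fmt"
	"math"
)

// Candidate holds the rollout statistics kept at a candidate-move node.
type Candidate struct {
	wins, visits float64
}

// uctScore is win rate (exploitation) plus an uncertainty bonus (exploration).
func uctScore(c Candidate, totalVisits, explore float64) float64 {
	if c.visits == 0 {
		return math.Inf(1) // always try unvisited moves first
	}
	return c.wins/c.visits + explore*math.Sqrt(math.Log(totalVisits)/c.visits)
}

// pick returns the index of the candidate that should get the next playout.
func pick(cands []Candidate) int {
	total := 0.0
	for _, c := range cands {
		total += c.visits
	}
	best, bestScore := 0, math.Inf(-1)
	for i, c := range cands {
		if s := uctScore(c, total, 1.4); s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	cands := []Candidate{
		{wins: 60, visits: 100}, // strong-looking move, heavily explored
		{wins: 2, visits: 4},    // barely explored, so its win rate is uncertain
	}
	fmt.Println(pick(cands)) // 1: the under-explored move gets the next playout
}
```

This is exactly the behavior described above: the surprising, under-sampled move keeps receiving rollouts until the search is confident it really is bad (or discovers it is good).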

Alright, so here’s the history of progress. The Chess programs definitely got off to a better start. Go was pretty bad until about 1990 or so, and then there’s sort of a hockey stick effect over here on the right. That is where Monte Carlo Tree Search started being used. So we have a much steeper slope after that; there’s been much more progress in the last 10 or 15 years. If you look at the list here, in 1998 pros could beat the best programs while giving them a 25 stone handicap. That is a truly absurd handicap. Even those of you sitting there who just heard the rules for the first time – I don’t think I could beat any of you with a 25 stone handicap. And if I could, I’d do it one time, and the second time you would annihilate me. 25 stones is a huge handicap, so there was a gigantic gap between the best programs and the pros. Then around 2008 a program beat a pro for the first time: MoGo, running on 256 cores, beat Catalin Taranu on a small board – that’s a 9×9 board, which is a much easier task than the large board. But still, impressive. That same year MoGo, running with 800 cores, beat a 9 dan professional with a 9 stone handicap, this time on a full-size board. In the years since then the handicaps have been getting smaller and smaller. You can see down here at the bottom that in 2013 Crazy Stone beat a 9 dan pro on a full board with only 4 stones. That’s an impressive accomplishment. And then this year, Crazy Stone beat another 9 dan pro with 4 stones, but won by a wide margin. So it’s looking like maybe the difference is 3 ranks now between Crazy Stone and some of these top professionals. It’s getting closer – starting to breathe down our necks. It looks like we’ll probably see a computer that can compete with the best professionals in the next 10 or 15 years.

There are Gophers with Gifts at Fog Creek

February 12th, 2015 by Stephen Asbury

That’s right, several Kiwis were bitten by Gophers and are now Kiwi-Gopher hybrids. In other words, Fog Creek has started using Go. As part of that initiative we have some gifts for you – two libraries that we are sharing with the community. The first library, Mini, is a small package for reading .ini style configuration files. The second package is a bit larger and implements a tagged style of logging.

Why We Wrote the Packages

Creating Mini

So, you might ask, why did we write these two packages in the first place? Well, for the ini config reader it’s easy – there wasn’t a package with the features we needed when we started the project. Maybe we could have limped along, but creating our own package was also a good way to get some experience with Go and go test. Speaking of go test, the Mini package has 100% test coverage, a pretty cool milestone.

It’s available on GitHub, and documented on GoDoc.

Tagged Logging

Writing the logging package required a bit more motivation. Go already had a logging package and there were a few others in the wild too. But our motivation sprang from a desire to implement a concept I call tagged logging. Tagged logging is the idea that you can set log levels based on tags, not just as a default value or specific named logger. With tagged logging, each call to the logging package can have multiple mechanisms for determining if the message should be output.

A great example of this is in a project we are working on that wraps our search needs in an HTTP service. When we receive an HTTP request, we note the account associated with the request. Then we use that account as a tag for each log message generated while we process it. If a particular account is having problems, we can activate debug logging for that account by enabling debug logging for a tag whose value is that account. This allows us to debug specific accounts in production without restarting the service.

In simpler cases, we also tag HTTP code as ‘http’ and Elasticsearch code as ‘elastic’, so we can turn on debug logging orthogonally from the location of the code. The same function can contain log messages tagged as ‘http’ or ‘elastic’, but with the same ‘account’.

So in the code below, we can turn on debug for ‘http’, ‘elastic’ or ‘account:1’ to pick which log messages we want to see:

httpTags := []string{"http", "account:1"}
elasticTags := []string{"elastic", "account:1"}
...
logger.DebugWithTags(httpTags, "http verbosity")
...
logger.DebugWithTags(elasticTags, "elastic verbosity")
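To make the idea concrete, here is a toy sketch of how a tag-based level check might work. This is an illustration only – the type and method names here are mine, not the actual API of the package: a debug message is emitted if any of its tags has debug enabled.

```go
package main

import "fmt"

// Logger holds the set of tags for which debug output is enabled.
// (Hypothetical illustration of tagged logging, not the real package.)
type Logger struct {
	debugTags map[string]bool
}

func NewLogger() *Logger { return &Logger{debugTags: map[string]bool{}} }

// EnableDebug turns on debug output for one tag, e.g. "http" or "account:1".
func (l *Logger) EnableDebug(tag string) { l.debugTags[tag] = true }

// DebugWithTags emits the message if any of its tags has debug enabled.
func (l *Logger) DebugWithTags(tags []string, msg string) {
	for _, t := range tags {
		if l.debugTags[t] {
			fmt.Printf("DEBUG %v: %s\n", tags, msg)
			return
		}
	}
}

func main() {
	logger := NewLogger()
	logger.EnableDebug("account:1") // debug one account in production

	logger.DebugWithTags([]string{"http", "account:1"}, "http verbosity")      // printed
	logger.DebugWithTags([]string{"elastic", "account:2"}, "another account") // suppressed
}
```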

A few other useful features in the logging library are rolled log files, background logging with channels and support for custom appenders. Oh, and in case you are wondering, the logging package is at 88% coverage. We are still learning how to get tests to poke into some of those hard to reach places, like Syslog.

We are just getting a handle on tagged logging, and we hope you give it a try and let us know how it goes for you too. The logging package is also available on GitHub and GoDoc.

Now, no-one ever said to beware of Gophers bearing gifts, so all’s good. Happy coding!

Maintaining Company Culture in a Distributed World – Part 2

February 10th, 2015 by Allison Schwartz

In last week’s installment of this series on maintaining company culture in a distributed world, I wrote about how Fog Creek began and how we’ve had to change now that we’ve built a remote team (which makes up half of our workforce). In particular it went into how we’re working to ensure Fog Creek remains a supportive, communal environment – two touchstones of our founding principles. In this week’s blog, I talk about the 3rd cornerstone of Fog Creek culture, something a bit more… FUN!!!

Going Remote Whilst Retaining the Fun

Have you seen Aardvark’d? It’s a documentary from 2005 about the year Fog Creek went from a team of 2 to 10, comprising 6 full-time employees and 4 interns. For us Creekers, it’s a hilarious, if somewhat cringeworthy, piece of nostalgia (which you can watch in full). But it is a glimpse into how Fog Creek’s day-to-day has always included a healthy dose of good times. For example, 19 minutes into the movie, a discussion about the possibility of jumping out the office window onto the roof of the building next door begins, and it doesn’t end until minute 24. You see the Fog Creek interns measure the distance, practice jumping, and take the “can we or can’t we” argument to the street. Literally.

It may seem unimportant (and slightly ridiculous) but that couldn’t be further from the truth. It’s a great example of how Fog Creek has always encouraged our employees to play and form relationships based on more than just work. It’s a part of our secret sauce for employee retention and happiness. To wit, 10 years later, 4 of those 6 full-time employees in the movie still work at Fog Creek (ok, 1 is at our sister company, Trello) and 2 of the 4 interns became full-time Fog Creek employees who were with us until 2013. The proof is in the pudding – fun matters!

When our entire team was located at our office at 55 Broadway, fun was extremely easy to come by. You’d see it in casual situations, chatting over coffee around the espresso machine and over long lunches. Or during afternoon snack breaks in the kitchen, lovingly referred to as “Cheese O’clock”. Conversations would range from astronomy to what happened on Lost (yes, we’ve been around that long). There were larger, organized events too, such as our beer bashes on Friday afternoons, with themes ranging from The Olympics to British Pub. Or all-company parties at bars and restaurants to celebrate product launches and major milestones. Not forgetting the annual catered picnic each spring to mark the end of the brutal New York winter.

When we opened our doors to remote employees, we knew we’d have to figure out ways to bring both the casual and the planned fun to Creekers across the globe. But guess what – it’s not easy! Just how do you get people who are generally introverted and dispersed all over the world to get to know their co-workers, especially those they don’t work with directly?

For nearly 2 years, we’ve been tackling this question. We’ve tried a lot of things – some have worked, others haven’t. Below are 3 of our most successful strategies so far, and how we implemented them:


1. Make Communication Easy

First, we supply our team with super-convenient communication channels. Like many companies with remote workers, we use Google Hangouts and a dedicated chat client, Slack. Within our Slack instance, we have an all-company channel, team channels, and many more niche channels where employees, remote or not, engage in conversations about non-work-related topics. Similar, but more unique to Fog Creek, is CoffeeTime. Written by one of our devs, CoffeeTime is a program that schedules a casual weekly meeting between 2 random employees. Check it out, and/or dig into the code to create your own!
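The heart of a CoffeeTime-style scheduler is just random pairing. As a rough sketch (the real program is linked above; this code and its names are my own guess at the idea): shuffle the employee list and pair off adjacent names for the week’s coffee chats.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pairUp shuffles the employee list deterministically from a seed and pairs
// adjacent names. With an odd head count, the last person sits this week out.
func pairUp(names []string, seed int64) [][2]string {
	r := rand.New(rand.NewSource(seed))
	shuffled := append([]string(nil), names...)
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	var pairs [][2]string
	for i := 0; i+1 < len(shuffled); i += 2 {
		pairs = append(pairs, [2]string{shuffled[i], shuffled[i+1]})
	}
	return pairs
}

func main() {
	for _, p := range pairUp([]string{"Alice", "Bob", "Carol", "Dave"}, 42) {
		fmt.Printf("%s meets %s for coffee\n", p[0], p[1])
	}
}
```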

2. Run Remote-friendly Events

Second, we plan activities which remote employees can easily participate in. In the past, we’ve had remote beer bashes, where remote employees meet up on a Google Hangout to have a beer while the employees in HQ do the same. We also translated two long-standing annual Fog Creek competitions to our remote workforce. The first is our mix-tape competition, which we hold every year during the summer. Employees who want to participate anonymously submit a digital mix-tape. We spend all summer listening to the tapes, discussing them in Slack, trying to figure out who created each one. We announce a winner at our end-of-summer party. Our annual Halloween costume contest is the second. In the years before remote employees, everyone participating showed up to work in costume. The only difference now is that our remote employees send us pictures and keep their costumes on all day during video chats.

Both are great opportunities to learn something new about our co-workers (Who’s creative? Who can create a costume? Who knows every song in Neil Young’s catalogue?). It also strengthens those bonds built by fun, the importance of which I highlighted above.

3. Don’t Forget Face-to-face Activities

Third, it’s great to get everyone together every now and then. To that end, our remote employees go on off-sites as teams, and, twice a year, the whole company comes together at HQ for Remote Week. We fly all our remote employees to New York, put them up for 6 nights at a hotel we’ve vetted, and then stock the week with social events like meals out, trivia and karaoke. There are also all-company and individual team meetings for long-term planning and discussing big-picture issues. We end with 1 of our 2 annual, extravagant all-company parties – Welcome to the Summer or the Holiday Party. Based on employee feedback, we’ve found that remote weeks are a great time for Creekers to refresh relationships, build new ones, and see their jobs and teams in the wider company context.

At the end of the day, having a company with a distributed team isn’t easy. It takes a lot of work. If you’re lucky enough to be a part of an organization that cares about its employees’ happiness and quality of life, then packing up your company’s culture and sending it across the world can be an opportunity to get even better at it. And that’s good for everyone, regardless of location.


Eight Fallacies of Distributed Computing – Tech Talk

February 6th, 2015 by Gareth Wilson

 

In this Tech Talk, Stephen, a Software Engineer here at Fog Creek, explains the Eight Fallacies of Distributed Computing. He does so by providing recent or personal experiences that help to expound each of the fallacies, showing how life, physics, and even sharks can conspire against us.

 

About Fog Creek Tech Talks

At Fog Creek, we have weekly Tech Talks from our own staff and invited guests. These are short, informal presentations on something of interest to those involved in software development. We try to share these with you whenever we can.

 

Content and Timings

  • Introduction (0:00)
  • The Network is Reliable (0:37)
  • Latency is Zero (2:17)
  • Bandwidth is Infinite (4:24)
  • The Network is Secure (6:10)
  • Topology Doesn’t Change (7:24)
  • There is One Administrator (9:25)
  • Transport Cost is Zero (10:29)
  • The Network is Homogeneous (12:53)

 

Transcript

Introduction

So the talk is The Eight Fallacies of Distributed Computing. The Eight Fallacies are something that I heard about at a JavaOne conference a long time ago, in a talk by a guy named James Gosling. He attributed them to someone named Peter Deutsch; basically a bunch of guys at Sun had come up with the list. But what the fallacies are is the opposite of rules – a set of false assumptions about distributed computing that people often forget are false.

The Network is Reliable

So, first fallacy: the Network is Reliable. When I first started at NeXT Computer, one of my first jobs was to go out and teach people how to program in Objective-C and NeXTSTEP. One day I was in Michigan teaching the class, and everything was good. Then everything goes off. We look around, and the emergency generators come on – because back then they had mainframes, so they had to turn on this generator. And we look outside, and there’s a blackout right outside the building. The power was out. So number one, blackouts are bad for computing. Bad for networking.

Now, there’s something else that is bad: people. In April of 2009, someone crawled down a manhole and chopped through the fibre optic cables that fed San Jose, California, and several other areas. A few years later, also in April – whether it was someone related to tech, I’m not sure – boom! They did it again. Right, people are bad.

But there’s something else you have to worry about for your network, and that’s sharks. Google and others wrap their cables in Kevlar, because most undersea cables carry both power and fiber. The power gives off electromagnetic radiation, the sharks think it’s a fish that’s freaking out, and they come to eat the fish – which destroys the cable, which then becomes very expensive to fix.

Erm, so, the network is not reliable.

Latency is Zero

What else?

The network is not instant. Latency is Zero is fallacy number 2. There’s a company – I think they’re actually in Missouri – called Spread Networks, that bought the rights to lay fibre optic cable from New York to Chicago to shave a few milliseconds off the time that it takes for the signal to get back and forth. And they did that so they could make more money selling the rights to use their cable to other companies. Now, sad thing for Spread Networks: another company is coming in and trying to use microwaves and millimeter waves to shave another few milliseconds off, so that they can be even faster and sell that to traders. No-one else cares so much at that level. But latency is zero? No!

One of the sysadmins at a college was sitting in his office, and a department professor comes in – it turns out to be from the Statistics department in this story; that may be a lie, but I hope the story is true – and says ‘we can’t send e-mail more than 500 miles.’ So the guy’s like ‘I don’t believe you’ – that’s his first reaction, right, this can’t be true. So he starts sending emails, and sure enough, he can’t send emails more than 500 miles. Then he’s thinking: is it really geographic? Is it 500 miles, or is it that the person is 500 miles away? So he tries sending email to someone who is local but whose email server was in Seattle – he, I think, was in South Carolina. No good. It turns out some consultant had very smartly upgraded them to a Solaris server, and in the process downgraded the mail software on the server, which didn’t understand the config file. So now it didn’t have a timeout big enough to talk to a server more than 500 miles away, because the timeout was set to 3ms, and the time it takes for the signal to travel 500 miles and back is more than 3ms. Latency is zero? No.
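For anyone who wants to check the arithmetic behind that story, here’s a quick back-of-the-envelope calculation. Light travels about 186,282 miles per second, so 3ms corresponds to roughly 559 miles of one-way signal travel – right in the ballpark of the 500-mile limit the professor reported:

```go
package main

import "fmt"

// milesInMs returns how far light travels in the given number of milliseconds.
func milesInMs(ms float64) float64 {
	const lightMilesPerSec = 186282.0 // speed of light in miles per second
	return lightMilesPerSec * ms / 1000
}

func main() {
	fmt.Printf("light travels about %.0f miles in 3ms\n", milesInMs(3))
}
```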

Bandwidth is Infinite

Fallacy number 3: Bandwidth is Infinite. OK, so think of something like a message broker. There’s a problem with bandwidth, and that is that bandwidth is not just how much you can stick on your network, but how much the different parts of your network can handle. So, if you have a message broker, and everything goes through the message broker, you can’t go any faster than the message broker. Data can’t flow any faster than the message broker can process it – the thing in the middle is stopping everything else. Bandwidth is infinite? No. It’s the same with a database: if you have one big database, your world is constrained by that one big database. But if you can have multiple database shards, and they are independent – that’s very important – then your system isn’t dependent on any one thing.

So, bandwidth is not infinite. But! There's good news. You can add bandwidth. Latency isn't zero, and there's nothing you can do about that: even if you were somehow a massively great programmer and got your signals travelling at the speed of light, you couldn't do any better than that. Your only choice there is to move things closer together. But with bandwidth, you can at least add more, and you can design systems differently. So, there are ways to add bandwidth to a system.
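One common way to "add bandwidth", as with the independent database shards mentioned above, is to route each key deterministically to one of several independent backends. A minimal sketch – the shard names are invented for illustration:

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key: str) -> str:
    """Deterministically map a key to one shard, spreading load across all of them."""
    digest = hashlib.sha256(key.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(SHARDS)
    return SHARDS[index]
```

One caveat with this naive modulo scheme: adding a shard changes the mapping for most keys. Consistent hashing addresses that, but the point here is just that independent shards mean no single component caps the whole system.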

So, what have we got – the network is not reliable, it's slow, it's very limited, [sarcasm] but at least it's secure. So that's good, we have that going for us. [/sarcasm]

The Network is Secure

The biggest example recently – well, I'm going to call it the second biggest now – of the network not being secure is the Heartbleed bug. People were able to connect to servers running OpenSSL and get back random bits of the server's memory, even on a failed connection. And because those bits of memory were often near the code that was authenticating users, sometimes they were getting user data just by hitting the server. Even though the server was doing the appropriate thing and just saying 'sorry, you can't connect because you're not authenticated.' So, we can't assume the network is secure; the network is not secure.

Moreover, there are bad people out there. In the last ten years, over 30 incidents have been reported in which over 100,000 user records each were lost. But that's minor compared to eBay's recent breach, where over 145 million user records were lost.

OK, so, the network isn't secure, it's not reliable, it's slow, it has limited bandwidth, but at least [sarcasm] it always stays the same. That's the good thing. [/sarcasm]

Topology Doesn’t Change

Topology Doesn't Change. So, with the CAP theorem, the part we want to think about is called partitioning. In the old days, back when I was learning, there was a thing called a mainframe, and there was basically a wire from the mainframe to the client, and that was it. So there really wasn't a concept of partitioning; the network was either up or it was down. But in the new world, where we start to think of things like database sharding, or just distributing servers around the place, we run into these problems where maybe some of the servers can still see their buddies and others can't, or maybe they're all up but the network connecting them isn't. So if, for example, you have two data centers and they both have multiple copies of the same thing running, those guys all think that everything is happy, but if the line between the data centers goes down, you can get into a situation where both sides are doing the wrong thing. And that's the whole point of the CAP theorem: trying to understand what you can and can't accept under partitioning.
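A toy illustration of that two-data-center scenario (the record and values are made up): during a partition, both sides keep accepting writes, and when the link comes back, the replicas have diverged.

```python
# Two replicas of the same record, one in each data center.
replica_a = {"user:1": "alice@old.example"}
replica_b = dict(replica_a)

# The line between the data centers goes down: each side keeps accepting writes.
replica_a["user:1"] = "alice@new.example"   # write handled by data center A
replica_b["user:1"] = "alice@work.example"  # concurrent write handled by B

# When the partition heals, the replicas have diverged. Without extra
# metadata (versions, vector clocks), neither value is obviously "right".
conflict = replica_a["user:1"] != replica_b["user:1"]
print(conflict)  # True
```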

So, sometimes topology change is good, right? Sometimes we upgrade servers and we get new things, and the topology changes in a good way. But even then, we can have problems. For example, when I started on Kiln and was learning about deployment, we did a deployment after somebody had taken one of the servers out of the data center. And that was OK – I mean, it was intentional; we didn't use that server, we meant to take it out. Unfortunately, all of our tools thought it was still there. So the deployment kept failing, because it couldn't reach a server we weren't even using. That wasn't a mistake; it's that the tools weren't set up to deal with the topology changing.

So, topology can change. So what have we got? The network is not reliable, it's slow, it won't carry everything I want, it's not secure, it keeps changing, [sarcasm] but at least, here's the good news people, we know who to call if there's a problem. [/sarcasm]

There is One Administrator

The other thing that changed when I started on Kiln was that I would run this script, and it would get a little way, and then I wouldn't have permission. And then I'd get permission, and I'd get a little further, and then I wouldn't have permission again. Because there were different systems, with different permissions. People do have ways of dealing with this, like Single Sign-On, or one version of that called Kerberos, but even that has issues, because, you know, ultimately there are lots of administrators.

Luckily, all administrators do what you want. Right? So in May 2013, Edward Snowden – whether you like it or not – as an administrator at the NSA, decided to take a lot of data and give it away. I'm not saying anything good or bad, I'm just saying, like, there are lots of administrators. If there had been only one administrator, it's unlikely that would have happened. But then the NSA couldn't do what it does.

The network is not secure, there are all these people in charge and we don't know who they all are, it's slow, it's unreliable, it won't carry everything we want, and it's constantly changing, but [sarcasm] at least it's free.

Transport Cost is Zero

Transport cost is zero for the network. And that's why, in early 2014, Netflix paid Comcast to get preferential access to its network for its customers. And then they paid Verizon to do the same thing. And then they paid AT&T to do the same thing. In none of those cases was it to get preferential treatment, of course; it was just a deal where they thought it was important to pay the phone companies and the networks, because why not, we're almost profitable, we should share the wealth, right? [/sarcasm]

So, transport cost is not zero. Now, latency, remember, is the time it takes for a signal to travel from one computer to the other. One of the projects I worked on at Tibco was a product called FTL. Why is it named FTL? Because marketing people think you can go faster than light. When you're trying to make something go fast, you name it Faster Than Light – even though the whole point of 'latency is not zero' is that you can't go faster than light. But OK, they named it FTL. And FTL was actually a really cool product. We were able to send messages over multiple transports – RDMA, TCP/IP, multicast, and shared memory, which is all inside one box – with the same API. And you just change the transport through configuration; you can actually change it live. With shared memory, the goal was under a microsecond, and we achieved that. Under a microsecond – in fact, under 600 nanoseconds – to send a message from one program to another, with the same API you could use to send it over RDMA in the data center in a few microseconds, like 2 or 3. Admittedly slower, but still. 600 nanoseconds is not very long; at the time it was like 20 slots' worth of time in Linux. So the entire amount of work to pack a message up, put it on the transport, send it in a generic way, and get a response back – well, you don't get the response back, but you time the round trip so that you can, right? Under 600 nanoseconds. So, transport cost isn't zero.
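The round-trip trick at the end is easy to sketch: time many ping-pongs over some transport, then halve the per-message round trip to estimate one-way latency. Here a local socket pair stands in for the real transport, and it will be far slower than FTL's shared-memory numbers:

```python
import socket
import time

# An in-process "transport" standing in for a real messaging layer.
a, b = socket.socketpair()
N = 10_000

start = time.perf_counter()
for _ in range(N):
    a.sendall(b"x")   # ping
    b.recv(1)
    b.sendall(b"x")   # pong
    a.recv(1)
elapsed = time.perf_counter() - start

# One-way latency estimate: total time / messages / 2, the
# "time the round trip and halve it" trick from the talk.
one_way_us = elapsed / N / 2 * 1e6
print(f"~{one_way_us:.1f} us one-way")
```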

Now, not everyone cares about ten nanoseconds, but, you know, it all adds up. And when you're doing things like sending data across the ocean, that's 150ms; that's what it is. Einstein said it had to be. So, nobody can go faster.
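A back-of-the-envelope check on that: assuming light in optical fiber travels at roughly two-thirds of c, and a rough 10,000 km trans-ocean route (both assumed figures), the physics alone puts a floor of about 100ms on the round trip – real routes add switching and detours on top of that.

```python
C_VACUUM_KM_PER_S = 299_792.458
fiber_speed = C_VACUUM_KM_PER_S * 2 / 3   # signal speed in fiber, roughly 2/3 c
route_km = 10_000                          # assumed trans-ocean cable route length

round_trip_ms = 2 * route_km / fiber_speed * 1000
print(f"~{round_trip_ms:.0f} ms minimum round trip")  # ~100 ms
```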

The Network is Homogeneous

So, the last one. It's hard to believe that any programmer today has this fallacy, because with the advent of mobile and everything else, people know that there are all kinds of networks out there. I mean, WiFi, wired connections, your phone, your tablet – people know. But this is something you have to keep in mind, right? Unless someone has a really solid fibre connection at their house, they are going to see vagaries in their network just from being at home. And the interesting thing about the network being homogeneous: Facebook recently spent several billion dollars buying a company called WhatsApp. And one of the things WhatsApp does that is maybe worth buying is that they didn't care about cool stuff. They actually wrote a messaging program that works on old stuff, and slow stuff. Anyway, the network is not the same everywhere, and the nice thing is that people like Google and others are building tools that let you try that out. So there are tools that let you pretend the network isn't the same.
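You can fake a slow, lossy link in a few lines for testing, too. A toy sketch – the delay and drop figures are arbitrary, and `send_over_bad_link` is an invented helper, not any particular tool's API:

```python
import random
import time

def send_over_bad_link(send_fn, data, delay_s=0.01, drop_rate=0.1, rng=random):
    """Wrap a send function to simulate a slow, lossy network link."""
    if rng.random() < drop_rate:
        return False            # packet silently "lost"
    time.sleep(delay_s)         # artificial latency
    send_fn(data)
    return True
```

Wrapping your real send call (say, a socket's `sendall`) in something like this is a quick way to see how your code behaves on old, slow networks before your users do.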

So, those are the Eight Fallacies.

