What we learned open sourcing a major part of Mailgun
A few weeks ago, we open sourced Flanker, our MIME parsing and email validation library. We've been very happy about the release and the level of interest the Python community has shown.
Open sourcing software may seem easy, but it's more complex than just throwing your software on GitHub and calling it a day. In fact, we were surprised by some of the challenges we ran into, and we wanted to share some insights into the process to help other developers who are planning on open sourcing their own code. To that end, here's how we did it and what we learned.
1. Open sourcing more than we originally intended
Originally, we committed to open sourcing Guardpost, our email address validator. When we started down this road, we quickly realized that it would be much more valuable packaged with our MIME parsing library and released together as Flanker. This led to additional complications, as we were now open sourcing a pretty big portion of Mailgun's core code base.
2. Making Flanker more modular
Since Flanker wasn't originally designed to be used by others, it was a challenge to make sure all of the dependencies on the proprietary code base were removed. We didn't want to force our choice of specific technologies on other users, so we took the parts of Flanker that relied on specific databases or Python modules, like Redis and dnspython, and made them more modular. Now you can use any database (or no database) to cache your results.
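To illustrate the idea of a pluggable cache, here is a minimal sketch. The class and function names (`NoCache`, `DictCache`, `CachingResolver`, `fake_mx_lookup`) are hypothetical and made up for this example; this is not Flanker's actual API, just one way such decoupling can look.

```python
class NoCache:
    """Cache that stores nothing, for users who want no database at all."""

    def get(self, key):
        return None

    def set(self, key, value):
        pass


class DictCache:
    """In-process cache backed by a plain dict (could be swapped for Redis, etc.)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class CachingResolver:
    """Wraps a lookup function and consults the injected cache first."""

    def __init__(self, lookup, cache=None):
        self._lookup = lookup
        self._cache = cache if cache is not None else NoCache()

    def resolve(self, domain):
        cached = self._cache.get(domain)
        if cached is not None:
            return cached
        result = self._lookup(domain)
        self._cache.set(domain, result)
        return result


calls = []  # records every real lookup so we can see the cache working


def fake_mx_lookup(domain):
    """Stand-in for a real DNS query; hypothetical, for illustration only."""
    calls.append(domain)
    return ["mx1." + domain]


resolver = CachingResolver(fake_mx_lookup, cache=DictCache())
resolver.resolve("example.com")
resolver.resolve("example.com")  # served from the cache; the lookup is not called again
```

Anything that offers `get` and `set` can be passed in as the cache, and leaving the argument out falls back to no caching at all.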
This meant we first had to excise the code and make sure it was consumable by others, and then we had to integrate Flanker back into Mailgun in its new form. We also took this opportunity to refactor large portions of the address parser (see 5. below), which made things more difficult.
We could have just forked our internal Flanker code base, made a best effort to remove the dependencies, and gone back to using what was already working internally. But then we would have been either neglecting the open sourced code or maintaining two different code bases. Instead, we decided to commit to using the open sourced code in production.
This means when we merge a pull request (we've done 8 since open sourcing Flanker a few weeks ago and have 4 more pending), not only do we have to make sure that it's cleanly written, maintainable, and performance oriented, we also have to make sure all the changes play well with the rest of Mailgun.
3. Integrating the open source code back into Mailgun
As you can expect, address parsing as well as MIME parsing are at the heart of Mailgun. Since we had made major modifications in the process of open sourcing the code, we had to be very careful about integrating the open sourced code back into Mailgun without breaking customers' applications. Maybe not quite as impressive, but it felt a little like doing the splits between two semis going in reverse.
We accomplished this by rolling out Flanker in phases. We started by rolling Flanker out to a small slice of our traffic. This let us see situations where the new address parser's behavior was stricter than before. Some things, like control characters in display names, we decided to disallow since they are a security risk. For other things, like Unicode support, we decided to be as flexible as we could in what we would accept. Yes, that means you can now send the poop character as a display name via Mailgun and we will encode it correctly and pass it along.
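As a rough illustration of that policy (reject control characters, accept other Unicode and encode it for the header), here is a small sketch using only the Python 3 standard library. The function names `check_display_name` and `format_sender` are hypothetical, and this is not how Flanker itself implements the check.

```python
import unicodedata
from email.utils import formataddr


def check_display_name(name):
    """Reject control characters; accept any other Unicode text."""
    for ch in name:
        # "Cc" is the Unicode general category for control characters.
        if unicodedata.category(ch) == "Cc":
            raise ValueError("control character in display name: %r" % ch)
    return name


def format_sender(name, address):
    """Return a header-safe 'Name <address>' string.

    On Python 3, formataddr applies RFC 2047 encoding when the
    display name is not pure ASCII.
    """
    return formataddr((check_display_name(name), address))


print(format_sender("Alice", "alice@example.com"))
print(format_sender("\U0001F4A9", "sender@example.com"))  # non-ASCII name gets encoded
```

A name containing something like a NUL byte or a bare newline raises `ValueError` here instead of being passed along into a header.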
4. Legal considerations
We also thought it would be wise to see if we could open source it without getting fired. Flanker is one of the core pieces of Mailgun, so it's pretty valuable to Rackspace. While we believe it's sometimes a good strategy to ask for forgiveness rather than permission in order to GTD, this probably wasn't one of those times. So we ran it by the Rackspace legal and product teams to make sure that they were on board. Fortunately, Rackspace's tagline is the "open cloud company" and it's dedicated to open source software, so it wasn't hard to convince Rackers that open sourcing Flanker was the right thing to do.
5. "Beta testing" before open-sourcing
We have been running our MIME parser in production for a long time but the email validator portion (Guardpost) was relatively new. We launched Guardpost as an API before open-sourcing Flanker. This allowed us to effectively test under real conditions before open sourcing.
As Guardpost became popular, it had trouble handling the traffic that was thrown at it, even though we were running it on a pretty powerful dedicated box. We spent some time profiling Guardpost to find out what was slowing it down, and the culprit turned out to be the spelling corrector. We researched how we could improve it and ended up rewriting it altogether. You can see our blog post about how we were able to make the spelling corrector 135x faster than before.
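For readers who have not profiled Python code before, here is a minimal sketch of that kind of investigation using the standard library's `cProfile`. We are assuming `cProfile` purely for illustration; `suggest_domain`, `validate_many`, and `KNOWN_DOMAINS` are hypothetical stand-ins and not Guardpost's code.

```python
import cProfile
import difflib
import io
import pstats

KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com"]  # illustrative list only


def suggest_domain(domain):
    """Hypothetical corrector: return the closest known domain, or None."""
    matches = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1)
    return matches[0] if matches else None


def validate_many(addresses):
    """Hypothetical hot path: run the corrector on every address's domain."""
    return [suggest_domain(a.rsplit("@", 1)[-1]) for a in addresses]


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(validate_many, ["user@gmial.com"] * 1000)
print(report)
```

In a report like this, the function that dominates the cumulative time column is the first place to look.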
We were also able to fix a fair number of bugs that people found. These ranged from adding Internet Explorer support for our validator demo (none of us actually had a Windows machine, so we had to get a Windows license to fix that bug!) to sending mail directly to a top level domain (TLD).
Running the code in production before open sourcing allowed us to be confident and proud about the code we were releasing into the wild.
6. Documentation
This is big. When you have an internal code base that you are hacking on, you can walk over to the developer who wrote it and ask them a quick question. That's not the case with open source software, so we wanted to make sure Flanker was well documented and easy for new people to start hacking on. That means Flanker is documented within the code with comments, as well as through external documentation like a Quickstart Guide, User Manual, and API reference.
So that's what we've learned open sourcing Flanker. While it ended up being a lot of work, it was a cathartic process and we look forward to reaping the benefits in the future.
We would love to hear about your experiences in the comments on this post or over on Hacker News.