Product

Free email validation API for web forms

Email validations are a powerful way to ensure you collect accurate data on web forms. See how Mailgun can help.

PUBLISHED ON

PUBLISHED ON

Disclaimer: We’ve added new features and adjusted the pricing of our email validations. For the latest on validations, check out this post.

Email validation is a hard problem that every single developer, building anything on the web, has to deal with. We actually have been working on email validation ourselves for years (more on that below). We looked at the validation services available and weren’t satisfied with any of them, either for performance, accuracy or completeness reasons. So today, we’re releasing a project we affectionately called Guardpost, as our newest API, and are also pulling back the curtain to show exactly how we built it.

We’re launching this as a free service that anyone collecting email through web forms can (and should!) use. You’ll need a Mailgun account to use the service, but you don’t have to send your emails through Mailgun. If you want to get started right away, check out the API documentation, or a little sample jquery app, as an example of how to use Guardpost in a signup form.

We suggest using Guardpost as part of your email collection form to reduce typos and false address submission in conjunction with a link emailed to the address to confirm the email is valid (double opt-in). Of course, you can use Mailgun to send the double opt-in email, as well. This is not intended to be a bulk mailing list cleaning service and we reserve the right to disable your account if we see it being used as such.

To call the Guardpost API, just use the publishable API Key in the My Account tab of your Mailgun account (the one with the “pub-key” prefix).

Now, on to to the technical details:

Why is email validation so hard?

Address validation is hard for multiple reasons:

  1. Email address syntax is fairly complex, complex enough that it is difficult to express with a regular expression while being readable and maintainable for developers.

  2. There is no single definition of what is valid syntax, for an email address, and what is not. The definitions that do exist frequently conflict.

  3. The Internet runs on the Robustness principle, and because of that mail servers will accept addresses that do not conform standards, but are otherwise understandable.

Why did we create an email validation API?

There are 3 main reasons we feel like we needed to build our own service.

  1. Our goal is not to make a perfect address validator that can validate every single address that has ever been created. Our goal is to build a realistic address validator for the types of addresses we see every day.

  2. We’ve sent billions of emails and collected a lot of ESP data. We know that gmail.com is a valid MX host while gmali.com is not.

  3. Furthermore, the validator is ESP specific, so we can go way beyond valid syntax checks, bring in specific requirement for Gmail vs. Yahoo vs. Hotmail.

What does the Validation service do?

Our validator service actually consists of five micro-services:

1. A recursive descent parser for syntax validation

Email address syntax is fairly complex, enough to make a pure regular expression based approach cumbersome and unmaintainable (check out RFC 2822 and RFC 5322 about proper email format then this discussion on Stackoverflow if you need some convincing). That why we wrote a parser that analyzes addresses, and determines if they are valid or not, based on a formal grammar.

What is a formal grammar? Formal grammars (and specifically in our case a context-free grammar) are a set of rules that define the structure of a string. For example, it allows us to transform something we intuitively understand, like an address list, into something formal that a computer can parse.

So what would the context-free grammar for an address list look like? Something like this:

address-list -> address ( delimiter address )*

What we have defined here is an address list, and we are saying it is composed of a single address, followed by zero or more delimiter and single address pairs. For example, the following would be a valid address list:

john@example.com, smith@example.com

While the following would not be:

john@example.com smith@example.com

What’s really nice about recursive descent parsers is that we can take the grammar rules and turn them into code in a fairly straightforward manner. Here is pseudo-code for the above address list example:

Just like that, one by one, we slowly built grammar for every part of an email address. We spent hours pouring over RFCs, looking at bounces, looking at what mail servers actually accept (which is different sometimes from what RFC says), reading how other people were solving this problem to eventually define our own context free grammar for email addresses:

We built our parser around the above grammar for what we think is a realistic email address syntax. Again, this is not just based on RFC, but what we see ESPs accepting from sending billions of emails.

2. Domain Name Service (DNS) lookups

Just because an email address is syntactically valid, doesn’t mean that anyone will receive mail at that address. To receive mail, you have to have a server that will listen for incoming messages, this server is called a Mail Exchanger (MX) and is usually defined in your DNS records. That’s why, in addition to syntax checks, we look up the domains DNS records to see if a MX server has been defined to accept mail.

3. Mail Exchanger existence checks

Again, due to the robustness principle, just because a host does not define MX records does not mean they can’t accept mail. Mail servers will often fall-back to A records to try and deliver mail. That’s why we go one step further than just a DNS query, we ping the Mail Exchanger to make sure that it actually exists.

4. Custom Email Service Provider (ESP) grammar

Being liberal in what you accept is just one part of the robustness principle, the second is be conservative in what you send. Because of that, most ESPs actually have a fairly stringent rules for the local-part (before the @ symbol) you can actually create. For example, Yahoo Mail addresses can only contain letters, numbers, underscores, and one period. So while an address like, “John Smith”@yahoo.com is completely syntactically valid, it does not actually exist at Yahoo Mail and will bounce. That’s why if we know the Mail Exchanger the mail is going to, and we know the big ones like Yahoo, Google, Microsoft (including Hotmail), AOL, and Apple we validate against their more stringent rules.

5. Suggestion Service

Email addresses are frequently mistyped. Instead of @yahoo.com, you might type @yaho.com, that’s why, as part of our validation service, if we detect a misspelled word, we offer suggestions so you don’t miss mail due to a typo. Here’s what that looks like in the jquery demo app [source] we mentioned above.

What we (now do) provide

We’ve talked a lot about what we provide, and for quite a long time we could not provide these features:

  1. Checking if a mailbox exists on a server

  2. Mailing list clean up

However, what is exciting is that since then, we can do both of these things now! Our latest iteration of validations now checks if a mailbox exists, while also providing a risk assessment of each address so you know which ones do and do not belong in your mailing list. For more information, check out this post.

So that’s it. We hope you enjoy the service and it makes your life easier. If you have any questions or comments, let us know.

Happy sending!

The Mailgunners

Sign Up

It's easy to get started. And it's free.

See what you can accomplish with the world’s best email delivery platform.

Related readings

Forrester TEI study reveals Sinch Mailgun ROI and outcomes

When an enterprise starts working with Sinch Mailgun, what’s the financial impact that choice has on the business? It’s a good question – but not exactly an easy one to answer...

Read more

Here’s how to track email opens in Gmail with email tracking

Sending email campaigns doesn’t have to feel like you’re throwing darts into a black hole. Email analytics are a great way to determine the health of your ecommerce campaign and...

Read more

Get your email in check for the holidays

As the holidays approach, senders shift into high gear, determined not to let any opportunities slip between Black Friday and the new year. From subject lines to...

Read more

Popular posts

Email inbox.

Build Laravel 10 email authentication with Mailgun and Digital Ocean

When it was first released, Laravel version 5.7 added a new capability to verify user’s emails. If you’ve ever run php artisan make:auth within a Laravel app you’ll know the...

Read more

Mailgun statistics.

Sending email using the Mailgun PHP API

It’s been a while since the Mailgun PHP SDK came around, and we’ve seen lots of changes: new functionalities, new integrations built on top, new API endpoints…yet the core of PHP...

Read more

Statistics on deliverability.

Here’s everything you need to know about DNS blocklists

The word “blocklist” can almost seem like something out of a movie – a little dramatic, silly, and a little unreal. Unfortunately, in the real world, blocklists are definitely something you...

Read more

See what you can accomplish with the world's best email delivery platform. It's easy to get started.Let's get sending
CTA icon