Regexp-based RFC822 email address validation

saurik · on Nov 16, 2012

This concept has been covered on Hacker News so many times before. :(

During a similar conversation 70 days ago, I left a detailed comment that is also relevant now regarding how RFC822 is actually totally irrelevant for the concept of e-mail addresses: what it specifies is how to escape the field values in MIME headers, and thereby has a bunch of rules for how to format an e-mail address that are really "how to embed an e-mail address in a MIME document".

RFC821, the SMTP specification for how you actually send e-mail, is closer, but has different rules about what is allowed because SMTP isn't MIME. A couple things aren't allowed, and some other things now are allowed and don't need to be escaped. Why people think users should type e-mail addresses in RFC822 escaping and not RFC821 escaping makes no sense to me.

However, the real punchline is: why are you asking users to enter e-mail addresses escaped at all? If you have an HTML form, for example, you don't need to escape them, as there is no higher-level protocol in which they are being embedded: the box can contain any characters that are needed, and there are no concepts like MIME comments, etc.

Asking a user to escape their e-mail address in that box is as silly as asking them to escape their username or password according to HTML or URL or some other escaping rules. Or, imagine if they had to enter their full name, but escaped using MIME encoded words... =?iso-8859-1?Q?=A1Hola,_se=F1or!?= makes about as much sense as escaping your e-mail address.
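To make the encoded-word comparison concrete, here is a small Python sketch that unescapes a header value like the one above using the standard library's `email.header` module. The function name `decode_encoded_word` is just an illustrative choice, not something from any of the RFCs.

```python
from email.header import decode_header, make_header


def decode_encoded_word(raw):
    """Decode MIME encoded words in a header value into plain text."""
    # decode_header() splits the value into (bytes, charset) chunks;
    # make_header() reassembles them into a Header whose str() is Unicode.
    return str(make_header(decode_header(raw)))
```

Calling it on `=?iso-8859-1?Q?=A1Hola,_se=F1or!?=` gives back the readable greeting, which is the form a person would actually want to type into a name box.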

My original comment, which contains many more details about which specific RFCs are involved and what they mean, along with specific examples where things can get different, and a discussion of the context, here:

http://news.ycombinator.com/item?id=4486872

andrewvc · on Nov 16, 2012

So, your reply 70 days ago was in reply to me regarding the ruby library Evan and I wrote.

Funny enough, this perl regex was one of the inspirations for the ruby library Evan and I wrote (though that uses a PEG for parsing)

Weird how things go around...

Jabbles · on Nov 16, 2012

As others have said, the only way to know if an email address is valid is to try and send an email. This doesn't mean that this is useless, as you may want to get users to double-check their input if it doesn't pass this.

Some test cases to think about: http://isemail.info/_system/is_email/test/?all

xyzzy123 · on Nov 16, 2012

Note that the source for that is_email test (I hadn't seen the library before) is here: https://github.com/dominicsayers/isemail

That library is very good for telling you exactly what's wrong with an email address, but looking through the logic, you can see that handling all the edge cases involves significant effort.

It seems to me that this monomaniacal focus on RFC 822 / RFC 5??? compliance is missing the point a bit. That stuff is important if you're writing a mail client or a server, not so much for a signup form or web site use.

I am going to stick my neck out and say that for the purposes of a signup form, emails should be server-side validated as: anyCharButNullOrAt@valid-looking-domain.com

1. You split the email address into a local part and a domain, at the @.

2. The local part is allowed to contain anything but null, or @. This will keep all the people who go john.doe+furrylist@gmail.com happy.

3. The domain section is validated against length and dns charset (letters, numbers, hyphens, dots) and then checked against e.g. the mozilla public suffix list for a valid TLD. This prevents admin@[10.0.0.2] from receiving signup emails.

4. After that you can punt to your mailer, as long as you do not give any feedback to the users (aside from receipt or non-receipt of email) on the success of delivery or verification (to prevent spam / bruting). Just say "an email has been sent", or somesuch.

Anyone with two @s in their email address ("foo@bar"@bar.com) or (foo\@bar@bar.com) is evil, and should not be encouraged. You have to draw the line somewhere, and that's where I draw it. One reason these people are not worth spending time on is because many common clients such as gmail won't let you send to such addresses in any case. Try it...

Let's thank our stars we don't have to handle Unicode addresses yet...
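The four steps above could be sketched roughly like this in Python. `KNOWN_SUFFIXES` is a tiny hypothetical stand-in for a real public suffix list lookup, and the 253-character limit and the label pattern are my assumptions about what "length and dns charset" should mean in practice.

```python
import re

# Hypothetical stand-in for a real public suffix list lookup.
KNOWN_SUFFIXES = {"com", "org", "net", "co.uk"}

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def looks_like_signup_email(address, suffixes=KNOWN_SUFFIXES):
    # Step 1: split at the @; more than one @ is rejected outright.
    if address.count("@") != 1:
        return False
    local, domain = address.split("@")
    # Step 2: the local part may contain anything except NUL (and @).
    if not local or "\x00" in local:
        return False
    # Step 3: domain length and DNS charset.
    if not domain or len(domain) > 253:
        return False
    labels = domain.lower().split(".")
    if len(labels) < 2 or not all(_LABEL.fullmatch(label) for label in labels):
        return False
    # Step 3, continued: some trailing run of labels must be a known suffix,
    # with at least one label left in front of it.
    return any(".".join(labels[i:]) in suffixes for i in range(1, len(labels)))
```

Anything that passes would then be handed to the mailer (step 4), with no feedback to the user beyond "an email has been sent".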

oinksoft · on Nov 16, 2012

There is a middle ground ... I have had success using an approach like the one found in this Python library: http://pypi.python.org/pypi/validate_email

It performs regex validation, and if that passes, tries to get the SMTP server to validate the user's presence. Quite useful for when a user might fat-finger their username in the address.

So, for compliant mail servers, sending an email without verifying receipt via some confirmation token is no more reliable than this method (if it will falsely validate the user, it will probably falsely digest the message as well).

bjourne · on Nov 16, 2012

Be careful as validating email addresses like that will quickly get you grey or blacklisted by various spam cops. The method works well for a few addresses, but you can't use it to validate thousands of addresses at the same time. I learnt that the hard way. :)

sickpig · on Nov 16, 2012

@work we have done a lot of this kind of validation using a custom Tcl script, and luckily we have avoided being blacklisted so far. Maybe it's due to the fact that validation requests come from an IP address marked as MX for the domain of the sender email address.

Anyway, not all SMTP servers are RFC compliant: some respond with a 2XX code to a "RCPT TO" even if the user/mailbox doesn't exist. The same applies if the SMTP server is acting as a relay and has no immediate access to the delivering system.

oinksoft · on Nov 16, 2012

That makes a lot of sense ... a reliable way to determine if a user is real could easily be abused.

mosselman · on Nov 16, 2012

Could you expand on this? Why will it get you black listed? What kind of situation are we talking about? It would be pretty annoying to get black listed.

natep · on Nov 16, 2012

I imagine a spammer could generate plausible emails and then check them against SMTP servers to discover the valid ones if they didn't get blacklisted.

pvsnp · on Nov 16, 2012

I check if there's indeed a mail server in DNS to avoid the hassle of waiting for potentially slow SMTP servers to respond - for most cases, it works and I end up catching bogus email addresses from the domain names themselves.

emillon · on Nov 16, 2012

An MX record is not necessary for a host to be able to receive email.

oinksoft · on Nov 16, 2012

Note that that library can also check for an MX server. It goes Valid string -> MX exists -> User exists (you can choose to only check for MX, or only for valid string).

michaelmior · on Nov 16, 2012

Really? How is mail routed then?

emillon · on Nov 16, 2012

It should fall back to an A record. That's not exactly best practice, but if you're compliant that is something that you should be able to handle.

synapseseven · on Nov 16, 2012

Sending an email is the only way to know whether an address is working for a user. Validating whether it meets the standards that define how email addresses should look is a different problem, which is what the regex is going after.

xentronium · on Nov 16, 2012

Please note that RFC822 also covers some unusual forms of writing an address; all of the following are correct addresses:

* John Doe <john.doe@example.com>

* foo:a@b.example,c@d.example,e@f.example;

* john.doe@example.com (John Doe)

See also the excellent explanation by Jukka Korpela for more details: http://www.cs.tut.fi/~jkorpela/rfc/822addr.html
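If what you receive is one of these header-style forms and you only want the bare address out of it, Python's standard library can do the unwrapping. This is only a sketch: `bare_address` is a hypothetical helper name, and `parseaddr` handles a single address, so the group form in the second bullet is out of scope here.

```python
from email.utils import parseaddr


def bare_address(header_value):
    """Return only the addr-spec from a single-address header value."""
    # parseaddr() returns a (display name, address) pair; the display name
    # or trailing comment is discarded here.
    _display_name, addr_spec = parseaddr(header_value)
    return addr_spec
```

Both `John Doe <john.doe@example.com>` and `john.doe@example.com (John Doe)` come back as `john.doe@example.com`.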

deweerdt · on Nov 16, 2012

Yes, absolutely. When people want to validate an email address, it's more likely they're referring to the SMTP envelope address: http://tools.ietf.org/html/rfc5321#section-4.1.2

RFC 822 (RFC 5322 in its more recent incarnation) refers to the From: header in the email, RFC 5321 refers to the address used in the 'MAIL FROM:' (and RCPT) SMTP command.

dfox · on Nov 16, 2012

and if i remember correctly, this regexp should also accept foovax!example.com!john.doe, which is certainly something you do not want to accept

rplnt · on Nov 16, 2012

It was posted a few days ago in this thread http://news.ycombinator.com/item?id=4774426 with a great comment:

> If seeing this doesn't make you second guess using a RegExp when a parser is more appropriate, well...you might be a Perl programmer?

kokey · on Nov 16, 2012

I am one. The regexp performs better than the parser, and works in my existing perl code, and has a lot of visibility now, so it's a no brainer for me to use it.

fungi · on Nov 16, 2012

i'm building a simple offline webapp for collecting email addresses at an event on an ipad with no internet connection.

i'm using a simple regex (may use this one instead) to validate email before sticking in localStorage for later retrieval... if not with regex, how should i validate the email addresses?

jlarocco · on Nov 16, 2012

As somebody else said, don't.

If you really feel you need some kind of validation, have two email fields so the user can enter it twice and double check it themselves. No matter how much effort you put into it, and how complicated your validation code is, if the users want to mess with you, they can always just enter a valid fake address, so there's no point wasting a lot of energy on it because it's easy to defeat anyway.

And since it's impossible to validate addresses with a regular expression, there's a small chance you'll reject a valid address and look dumb.

michaelhoffman · on Nov 16, 2012

I see the two fields thing a lot. It's pointless since I can copy and paste into the second field. Unless they use JavaScript to disable pasting, which is obnoxious.

mikeash · on Nov 16, 2012

I'd wager that being likely to mistype one's own e-mail address correlates highly with not realizing you can copy/paste between the fields.

mikeash · on Nov 16, 2012

What do you expect to achieve by validating the address at all?

fungi · on Nov 19, 2012

ask the user to check what they entered and try again

lucian1900 · on Nov 16, 2012

A parser would likely be shorter and easier to understand.

tjgq · on Nov 16, 2012

They hint that they use this huge regex instead of a parser for performance reasons. At any rate, the regex was not written by hand; it is a concatenation of simpler, easier to understand regexes.

potatolicious · on Nov 16, 2012

> " collecting email addresses at an event on an ipad with no internet connection."

Parsers require an internet connection to work now?

schiffern · on Nov 16, 2012

He's preempting the "the only way to validate email is to send them an email" responses.

VMG · on Nov 16, 2012

don't

benihana · on Nov 16, 2012

Why not save everything they enter and then validate later? I'd rather let everything in and get some bad email addresses than block bad input and lose valid ones.

meaty · on Nov 16, 2012

I only check for an @ and at least one character either side. Anything else is the user's problem.

sgt · on Nov 16, 2012

Same here, and personally I don't see the justification for spending all those CPU cycles going through a massive regular expression such as this one.

I'd rather put this on the client side (javascript), as a validation to make sure the user doesn't supply an invalid e-mail address by accident (i.e. for his own convenience and nothing else).

blibble · on Nov 16, 2012

compared to pushing the response back out to the client, the cost of matching against that regex is going to be insignificant, even with it being as monstrous as it is.

(note that I'm not saying using that regex is a good idea!)

jnazario · on Nov 16, 2012

actually you may want to make sure they have at least four characters separated by a dot, e.g. .\@\\.[..]+ ... and i think this is how the regex begins ...

my point though is that you can't send mail to a TLD, you need a domain name. and i don't think we have any one character TLDs.

this is quickly turning into an exercise where you see how such a regex starts to happen. "well, then you have to consider this case ... and handle these exceptions ... and then enforce this ..."

Jabbles · on Nov 16, 2012

> my point though is that you can't send mail to a TLD

You can: http://serverfault.com/questions/154991/why-do-some-tld-have...

For instance the pope could get pope@va - if he wanted...

xyzzy123 · on Nov 16, 2012

Try connecting to those on port 25, see if any accept mail... they don't tend to.

macspoofing · on Nov 16, 2012

But they can. In this case, you probably won't alienate any of your potential users but as you add more and more arbitrary rules, you will.

xyzzy123 · on Nov 16, 2012

Fair call, I'm all for fewer arbitrary rules. Especially if it's less code.

I still consider the "oh, but it's valid to have dotless on RHS!" to be one of those facts which is true, but irrelevant.

Those three hypothetical users can't receive email sent from most major web providers (e.g. gmail, who don't allow dotless To:), can't sign up to most web sites (who get their validation wrong), and are at the mercy of pitiless local dns resolver rules (pope@va will go to pope@va.com for US users, a lot of the time).

VMG · on Nov 16, 2012

> Try connecting to those on port 25, see if any accept mail... they don't tend to.

That's not a test for validating an email host either - looking up MX records would be more appropriate here.

xyzzy123 · on Nov 25, 2012

I actually meant the MXs, sort of thought that went without saying.

wooster · on Nov 26, 2012

Try `dig mx va` instead.

mootothemax · on Nov 16, 2012

> you can't send mail to a TLD

Not only is it possible, when I used to work for a company that administered a TLD, I did just that, sending and receiving email with the address t@TLD.

anonymouz · on Nov 16, 2012

Working for a TLD administrator suddenly became much more desirable to me.

meaty · on Nov 16, 2012

I really don't care. We also, in automated test environments, send email to user@host so it doesn't escape the internal network.

I don't have to use a regex if I use the methodology I specified.

Simple Java implementation off the top of my head. Very fast, no imports or expression compilation required:

    boolean isValidEmailAddress(String emailAddress) {
        // Require an '@' with at least one character on each side.
        int at = emailAddress.indexOf('@');
        if (at < 1 || at == emailAddress.length() - 1)
            return false;
        // The characters adjacent to the '@' must not be whitespace.
        return !Character.isWhitespace(emailAddress.charAt(at - 1)) &&
               !Character.isWhitespace(emailAddress.charAt(at + 1));
    }

Improvements welcome. Should be portable to any other language trivially.

meaty · on Nov 21, 2012

C version because I was bored:

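   #include <ctype.h>
   #include <string.h>

   int is_valid_email(const char *email) {
           const char *at = strchr(email, '@');
           /* Reject: no '@', '@' as first character, or '@' as last character. */
           if (at == NULL || at == email || at[1] == '\0')
                   return 0;
           /* The characters adjacent to the '@' must not be whitespace. */
           return !isspace((unsigned char)at[-1]) && !isspace((unsigned char)at[1]);
   }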

Test cases:

   assert(0 == is_valid_email(""));
   assert(0 == is_valid_email("@b"));
   assert(0 == is_valid_email("b@"));
   assert(0 == is_valid_email("d@ "));
   assert(0 == is_valid_email(" @d"));
   assert(0 == is_valid_email("   "));
   assert(1 == is_valid_email("a@b"));
   assert(1 == is_valid_email("John Smith <x.y@z.com>"));

readme · on Nov 16, 2012

boolean isValid = email != null && email.contains("@");

the goal of client-side validation is to ensure that you can actually make that network call to do a real validation. the rfc is so complicated it's not even worth getting into this business, as evidenced by op's regex.

would love to see some unit tests for that thing.

derefr · on Nov 16, 2012

But you can send mail to, say, a machine listed as "a" in your hosts file.

Adirael · on Nov 16, 2012

And with the new personalized TLDs, wouldn't you be able to have something like ceo@nike? I just check for an @ and at least a character after and before it.

xyzzy123 · on Nov 16, 2012

No. See: http://domainincite.com/10254-why-domain-names-need-punctuat...

meaty · on Nov 16, 2012

Personally that pisses me off as it requires that I fully qualify all my local email addresses as what happens if I have the hostname 'nike' on my local net?

Adirael · on Nov 16, 2012

It's going to get messy. I use a lot of hostnames which may end up being TLDs.

jnazario · on Nov 17, 2012

wow, thanks for the edumacation :) obviously didn't know some of those things, and completely ignored the local domain bits.

culshaw · on Nov 16, 2012

A more modern thought.

Do you really need to test that strictly for an email address?

If the user is trying to give you a fake email address, chances are they don't want to be part of your service/offering anyway.

I test for an @ and characters either side, that's most flexible bases covered.

I know this doesn't apply to all scenarios but it's one worth considering.

nathan_long · on Nov 16, 2012

You probably want to ensure that there's a dot somewhere to the right of the @, also, but yes, that sounds sane to me.

"something@something.something"

Start of line, at least one non-@, @, at least one character, dot, at least one character, end of line.

^[^@]+@.+\..+$

Test: http://rubular.com/r/G69q1k6fP2

If it fits that, try emailing them.
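For anyone who wants to try the same pattern outside Rubular, here is a minimal Python sketch of that first-pass filter. The names `SIMPLE_EMAIL` and `passes_first_filter` are only illustrative.

```python
import re

# Same shape as the pattern above: non-@ characters, an @, then
# something, a dot, and something. fullmatch() supplies the ^ and $.
SIMPLE_EMAIL = re.compile(r"[^@]+@.+\..+")


def passes_first_filter(address):
    """Cheap sanity check; the real validation is the email you send."""
    return SIMPLE_EMAIL.fullmatch(address) is not None
```

Note that the part after the @ is deliberately loose, so a second @ on the right-hand side still gets through; that is left for the mailer to sort out.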

perokreco · on Nov 16, 2012

There doesn't need to be a dot on the right side.

xyzzy123 · on Nov 16, 2012

I found your comment fairly cryptic. I had a fun twenty minutes trying to work out what you meant, and under what circumstances dotless RHS in email addresses might be legal.

I suppose from the RFC, sure the spec doesn't require dots.

For example, I can use http://mythic-beasts.com/~pdw/cgi-bin/emailvalidate and verify that sure, '1@2!3!4' is a valid RFC822 email address. But I think e.g. UUCP-style addresses are a pathological case, and we don't _really_ want users signing up with them.

Another option would be intranets, e.g. 'baker@internal', but again I think that's being a bit pedantic, since most people on HN are writing webapps for the public Internet, not mail clients.

So can we get an email with foo@<some-dotless-string> routed across the public Internet? Even a bounce would do :)

You might be able to do a riff on xyzzy123@[23.55.211.36] (e.g. xyzzy123@[389534500] or xyzzy123@[1737D324]). However, do you _really_ want your users to specify these?

There are mx records for existing TLDs (e.g. com, org, au, mx) - but all the mx records I tried refused connections on port 25. So no mail for 'xyzzy123@com' :(

So gTLDs are another option, and there was a time when it looked like xyzzy123@xyzzycorp might route (as long as it didn't collide with anything on the local resolver's search list). But it seems that dotless use of gTLDs is seriously deprecated at this point, and that ICANN will treat it as a TOS violation: http://domainincite.com/10254-why-domain-names-need-punctuat...

Basically, ICANN's conclusion was that dotless TLDs are a terrible idea for many technical reasons.

I looked into IDNs too, but of course due to the way DNS works, you can't really get around the dots.

So the conclusion of all this is that:

1) Using an RFC822 regex is a terrible way to check emails. The things it thinks are valid are MUCH wider than what you actually want.

2) You should probably check the RHS against a public suffix list if you are e.g. accepting a user email address on a signup page. If you accept dotless TLDs or other constructions (e.g. ips on RHS) there is some (low, but nonzero) risk that a malicious user could cause your systems to route mail to your other systems internally.

ceejayoz · on Nov 16, 2012

Theoretically, no. Realistically, for the average web developer's purposes, yes.

fuzzix · on Nov 16, 2012

Email validation is indeed a complex and occasionally surprising beast.

Clearly this regex is impractical, but any validation you invent yourself is likely incorrect. The best way to validate email addresses remains sending an email to them.

regularfry · on Nov 16, 2012

Is it impractical, though? It's already been written, and I've not seen any suggestion that it's not correct (other than the problem with comments, but that's intrinsic to regexes in general). As long as you're going to follow up by actually sending an email, I don't see a problem with this as a first-pass filter.

rmccue · on Nov 16, 2012

+1, seeing as you're probably going to be sending an activation email anyway. You can do some practical checks, like checking that there's a '@' in the email, and probably trimming spaces (I think leading/trailing whitespace isn't allowed, from memory).

HyprMusic · on Nov 16, 2012

In RFC822 spaces are actually allowed, I think they just have to be in a quoted string.

3ds · on Nov 16, 2012

As per html5 spec the recommended regex is:

/^[a-zA-Z0-9.!#$%&'+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)$/

http://www.w3.org/TR/html5/states-of-the-type-attribute.html...

stratoukos · on Nov 16, 2012

This will not match addresses with non latin characters.

mitchty · on Nov 16, 2012

Yep, shouldn't the right answer be: accept whatever the user gave you, validate it with a link in the email you send that they've received it.

Should be even more important if you ever intend to do business with say someone from China. Good luck dealing with validating hanzi in an email.

tedunangst · on Nov 16, 2012

Addresses do not have non Latin characters.

Aissen · on Nov 16, 2012

Which is false:

http://tools.ietf.org/html/rfc6530

http://tools.ietf.org/html/rfc6531

http://tools.ietf.org/html/rfc6532

http://tools.ietf.org/html/rfc6533

tedunangst · on Nov 16, 2012

Relying on standards that new for something like email would be a mistake IMO. Ymmv.

ambiguity · on Nov 16, 2012

You are missing a * just before the $.

roel_v · on Nov 16, 2012

This is from "Mastering Regular Expressions", by Jeffrey Friedl, O'Reilly 1997. The book presents it as a 'fun' example of how to write huge regex'es that are still understandable and maintainable (the version posted here is without all the comments that are in the book).

bryanlarsen · on Nov 16, 2012

This is a generated regexp, it's not hand crafted. Making fun of it is like pasting up machine code compiled from C and saying "machine code sucks".

mosselman · on Nov 16, 2012

E-mail validation can be useful, but I would stay away from this thing. Look at what you are trying to do from a higher level.

Most likely the user wants something from you as well as you from them. If a user gives you a bad e-mail, despite a very basic e-mail regex, whatever, they won't get an e-mail, not my problem.

If it is to register on your website, just let them, send them a confirmation e-mail to their 'email', meanwhile allowing them to use the system (or not). Then if after x-time they haven't confirmed, just delete the user again. This will save you a lot of trouble.
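A rough Python sketch of that register, confirm, or delete flow is below. Everything here is an assumption for illustration: the in-memory `pending` dict stands in for a database table, `CONFIRM_WINDOW` is an arbitrary choice for "x-time", and actually sending the mail is left out.

```python
import secrets
from datetime import datetime, timedelta

CONFIRM_WINDOW = timedelta(days=7)  # assumed value for "x-time"
pending = {}  # token -> (email, deadline); stand-in for real storage


def register(email, now):
    """Store the address unvalidated and return the token to mail out."""
    token = secrets.token_urlsafe(32)
    pending[token] = (email, now + CONFIRM_WINDOW)
    return token


def confirm(token, now):
    """Return the confirmed email, or None if unknown or too late."""
    entry = pending.pop(token, None)
    if entry is None:
        return None
    email, deadline = entry
    return email if now <= deadline else None


def purge_expired(now):
    """Delete registrations that were never confirmed in time."""
    expired = [t for t, (_, deadline) in pending.items() if deadline < now]
    for token in expired:
        del pending[token]
    return len(expired)
```

The point of the design is that no format check is needed at all: an address that cannot receive the token simply never gets confirmed and is purged.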

If you want something more high-tech like checking a huge list of e-mails in a system you could go with a solution suggested below, just send them an e-mail.

Regex is evil!

billyjobob · on Nov 16, 2012

This regex tests for RFC 822 compliance, but what if you get a user who has an email address that itself doesn't comply with the RFC?

jacques_chester · on Nov 16, 2012

Actually, it doesn't.

Email addresses can't, strictly speaking, be tested for with regexes. This one "only" tests various nestings to I think about 3 levels deep.

ygra · on Nov 16, 2012

Then they have a hard time receiving mail from anyone, I guess.

gvalkov · on Nov 16, 2012

I'm surprised there isn't a Perl6 version of Mail::RFC822. This is exactly the kind of thing that Perl6 rules[1] are supposed to excel at. It would be good publicity, especially now that rakudo has usable releases.

[1]: http://en.wikipedia.org/wiki/Perl_6_rules

jmedwards · on Nov 16, 2012

I had a quick glance through the expression, looks good from here.

jmedwards · on Nov 16, 2012

(where here = an asylum for the insane.)

chris_wot · on Nov 16, 2012

The only thing I've ever seen that is worse than this is the sendmail configuration file.

habosa · on Nov 16, 2012

I find it pretty entertaining that this RegEx is so big that visual patterns emerged. In my browser there are clear diagonal lines of "@" symbols across the RegEx. If it looks like ASCII art, your RegEx is probably too big.

JimWestergren · on Nov 16, 2012

With PHP the following simple code works great for me:

  function validate_email($email) {
    // filter_var() returns the filtered string on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
  }

smackfu · on Nov 16, 2012

Validation of emails is pretty pointless since most errors will be typos that pass the regex anyway. You're better off trying to give warning messages based on common typos.
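As a sketch of what a typo warning could look like, here is a small Python example. The `COMMON_DOMAIN_TYPOS` table is made up for illustration; a real one would be built from the misspellings you actually see in your own signups.

```python
# Hypothetical table of misspelled domains and their likely intent.
COMMON_DOMAIN_TYPOS = {
    "gmial.com": "gmail.com",
    "gmai.com": "gmail.com",
    "hotmial.com": "hotmail.com",
    "yaho.com": "yahoo.com",
}


def suggest_correction(address):
    """Return a 'did you mean' address, or None if nothing looks off."""
    # Split on the last @ so the domain is whatever follows it.
    local, sep, domain = address.rpartition("@")
    if not sep:
        return None
    fixed = COMMON_DOMAIN_TYPOS.get(domain.lower())
    return f"{local}@{fixed}" if fixed else None
```

The suggestion would be shown as a warning the user can ignore, not as a rejection, since the "typo" might be a real domain.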

bdg · on Nov 16, 2012

Great, now I have something to strike fear into the hearts of new devs who ask me about email validation.

I'm not sure I'm comfortable using a regex like this in production. Sure, we can write lots of tests and ensure it performs correctly, and the rfc is unlikely to change so once proven solid it won't change... but using this just feels wrong. Like I'm using the dark side of the force.

_wwz4 · on Nov 16, 2012

I've been watching people deal with this problem for years and years... why? I can parse a CSV file far more easily.

You'd think there would be an RFC that specifies a simple email address format that everyone can follow. If you don't conform to that format, your email gets dropped on the floor until you get a better client.

DanBC · on Nov 16, 2012

This is one of the things that people really want for email2.

Unfortunately email works well enough, and has such a massive install base, that email2 is never going to happen.

That's why you see so many "Email but not email" startups.

Smrchy · on Nov 16, 2012

For JS validation on the client i use this:

http://blog.tcs.de/javascript-near-perfect-email-validation-...

A lot shorter and has only 3 cases that would not be detected. Enough for 99.x% of all entered emails.

dutchbrit · on Nov 16, 2012

There's a very nice function in PHP that validates emails.

filter_var('foo@bar.com', FILTER_VALIDATE_EMAIL);

The actual beast: https://github.com/php/php-src/blob/master/ext/filter/logica...

CalvinCopyright · on Nov 16, 2012

Makes me think of the first reply to this StackOverflow question:

http://stackoverflow.com/questions/1732348/regex-match-open-...

michaelhoffman · on Nov 16, 2012

But, unlike XHTML, it is possible to validate an e-mail address with a regular expression (assuming comments have been removed).

_delirium · on Nov 16, 2012

It's also, unlike XHTML, not particularly easy to do it with a parser: most of the complexity of the regex is due to the litany of edge cases for what constitutes a valid email address, not due to it being a regex.

topbanana · on Nov 16, 2012

The point of regex is to be human readable. This might as well be a binary blob

1nvader · on Nov 16, 2012

This regex deserves a downvote! If you don't understand it (and i guess you don't if it's not written by hand) - don't ever use it!

The only way to validate an email address is to send a validation mail/link.

dfox · on Nov 16, 2012

email address validation should not be motivated by what is a valid address according to some RFC, but by what you feel comfortable passing to your MTA, because you have an exact understanding of what will happen. On the application side you probably don't want to store addresses with comment and real name fields and other such human-readable-only data. My rules are: contains exactly one @, contains zero or more +, does not contain any other characters that are special cased by this (notably ',', ';' and '!').
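Those rules are short enough to sketch in a few lines of Python. The comma, semicolon and bang come straight from the rules above; the angle brackets, parentheses and double quote in `FORBIDDEN` are my assumption about which other characters count as "special cased", since they introduce real-name, comment and quoted-string syntax.

```python
# ',', ';' and '!' are named in the rules above; the rest are assumed.
FORBIDDEN = set(',;!<>()"')


def safe_for_mta(address):
    """True if the address has exactly one @ and no special characters."""
    # '+' needs no check: zero or more occurrences are all acceptable.
    return address.count("@") == 1 and not (FORBIDDEN & set(address))
```

This rejects list, group, bang-path and display-name forms while leaving plus-addressing alone.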

turshija · on Nov 16, 2012

'@'.' is a valid email address.

seriously ? :)

jrajav · on Nov 16, 2012

I can't find an online tester that doesn't choke on this, but if you're curious to try it out, here's the token Perl one-liner:

    echo "x@y.com" | perl -lne 'print "$_ is valid!" if /(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ 
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*))*)?;\s*)/'

ByronFortescue · on Nov 16, 2012

For some reason this is valid as well?

    echo "blaat@blaat" | perl -lne 'print "$_ is valid!" if /(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ 
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*))*)?;\s*)/'

nicktelford · on Nov 16, 2012

Because the domain name doesn't need to be fully-qualified; it can just be a machine name on the local network.

To illustrate this: "user@localhost" is a valid email address.

All these overly complex regular expressions miss a major point: even if the e-mail address is valid according to the RFC, that doesn't guarantee that:

  * The domain name exists.
  * The user exists at the specified domain.
  * All of the SMTP servers between you and the recipient adhere exactly to the RFC.
  * The user actually owns or has access to the e-mail account in question.

Whenever I need to validate an e-mail address, I just use something simple like ".+@.+" to ensure sanity and move on to more pressing matters. As a friend once pointed out to me: it's usually far more damaging to reject valid e-mail addresses than to accept invalid ones; be liberal in what you accept and verify the e-mail address by sending them a confirmation mail.
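A minimal sketch of that approach in Python. The helper name `looks_like_email` is made up for illustration, and the pattern is just the ".+@.+" mentioned above. It only rules out obviously broken input; the real check is still the confirmation mail, which is not shown here.

```python
import re

# Deliberately permissive: at least one character, an "@",
# and at least one more character. Nothing else is enforced.
_SANITY = re.compile(r".+@.+")

def looks_like_email(address: str) -> bool:
    """Loose sanity check only; says nothing about deliverability."""
    # fullmatch requires the whole (whitespace-trimmed) input to fit the pattern.
    return _SANITY.fullmatch(address.strip()) is not None

if __name__ == "__main__":
    for candidate in ("user@localhost", "blaat@blaat", "no-at-sign", "user@"):
        print(candidate, looks_like_email(candidate))
```

Anything that passes would then get a confirmation mail, since only a delivered and acknowledged message shows that the address exists and belongs to the user.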

qznc · on Nov 16, 2012

Yes, websites especially should accept more than [a-zA-Z0-9] for the user part. This would allow filtering emails. E.g. Gmail users can tag emails this way: john.doe+spam@gmail.com

jrajav · on Nov 16, 2012

That is, in fact, a valid email address (in the sense that it will pass all compliant validators).

wilhil · on Nov 16, 2012

If the second blaat is resolvable via DNS, it will work fine.

A company I consulted at had a mail server that was internal only, and via their DNS server, they had resolvable names for department1, department2, etc.

They used to send messages to addresses like user@department1, user@department2, etc., and since each name resolved fine, it worked very well.

alexchamberlain · on Nov 16, 2012

That's fine if you know a machine called blaat.

adv0r · on Nov 16, 2012

This validation ignores one exception:

Gmail allows users to enter an arbitrary number of dots.

Therefore these are valid email addresses:

your......name@gmail.com y.o.u.r...name....@gmail.com

and all resolve to yourname@gmail.com

http://support.google.com/mail/bin/answer.py?hl=en&ctx=m...
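As a rough sketch of what that implies for anyone comparing addresses, here is a Python helper that collapses the dots. It takes the Gmail behavior described above as given, the name `canonical_gmail` is hypothetical, and it deliberately leaves every other domain untouched, since other providers may treat dots as significant.

```python
def canonical_gmail(address: str) -> str:
    """Drop dots from the local part of gmail.com addresses only.

    Assumption: Gmail ignores dots in the local part, as described above.
    Addresses at any other domain are returned unchanged.
    """
    # Split on the last "@" so the domain is everything after it.
    local, _, domain = address.rpartition("@")
    if domain.lower() != "gmail.com":
        return address
    return local.replace(".", "") + "@" + domain

if __name__ == "__main__":
    print(canonical_gmail("your......name@gmail.com"))
    print(canonical_gmail("y.o.u.r...name....@gmail.com"))
```

Something like this is only useful for spotting duplicate sign-ups; the address the user typed is still the one to store and send to.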

Lockyy · on Nov 16, 2012

And this is a problem I constantly run into with web services: some throw an error if I use a ., others throw an error if I try something like example+note@example.com. I use one or the other to help sort emails. It's even worse when a sign-up form accepts an email in the latter format but the login form does not. Then I have an account with a note added that I cannot log in to. I had this problem with the Odeon website for a while, and eventually had to phone them up and ask them to change my account's email address to one without a note.

message · on Nov 16, 2012

Old as hell

ta12121 · on Nov 16, 2012

It boggles my mind too. This is new to people here?