验证字符串是否是有效的电子邮件地址的最优雅的代码是什么?


当前回答

这是我的答案——Phil的解决方案不适用于“someone@q.com”这样的单字母域名。信不信由你,这是used =)(例如,转到centurylink)。

菲尔的答案也只适用于PCRE标准…所以c#将会接受它,但javascript将会爆炸。这对于javascript来说太复杂了。所以你不能使用Phil的mvc验证属性的解决方案。

这是正则表达式。它将很好地与MVC验证属性一起工作。 @之前的所有内容都被简化了,这样至少javascript可以工作。我可以在这里放松验证,只要交换服务器不给我5.1.3。 @后面的都是Phil针对单字母域修改的解决方案。

public const string EmailPattern =
        @"^\s*[\w\-\+_']+(\.[\w\-\+_']+)*\@[A-Za-z0-9]([\w\.-]*[A-Za-z0-9])?\.[A-Za-z][A-Za-z\.]*[A-Za-z]$";

对于那些建议使用system.net.mailmessage()的人来说,这个东西太灵活了。当然,c#将接受电子邮件,但交换服务器将爆炸5.1.3运行时错误一旦你试图发送电子邮件。

其他回答

.net 4.5增加了System.ComponentModel.DataAnnotations.EmailAddressAttribute

你可以浏览EmailAddressAttribute的源代码,这是它内部使用的正则表达式:

const string pattern = @"^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?$";

Personally, I would say that you should just make sure there is an @ symbol in there, with possibly a . character. There's many regexes you could use of varying correctness, but I think most of these leave out valid email addresses, or let invalid ones through. If people want to put in a fake email address, they will put in a fake one. If you need to verify that the email address is legit, and that the person is in control of that email address, then you will need to send them an email with a special coded link so they can verify that it indeed is a real address.

我只是想指出,最近在. net文档中增加了关于电子邮件验证的内容,也使用了Regex操作。 关于它们实现的详细解释可以在那里找到。

https://learn.microsoft.com/en-us/dotnet/standard/base-types/how-to-verify-that-strings-are-in-valid-email-format

为了方便起见,下面是他们的测试结果:

//       Valid: david.jones@proseware.com
//       Valid: d.j@server1.proseware.com
//       Valid: jones@ms1.proseware.com
//       Invalid: j.@server1.proseware.com
//       Valid: j@proseware.com9
//       Valid: js#internal@proseware.com
//       Valid: j_9@[129.126.118.1]
//       Invalid: j..s@proseware.com
//       Invalid: js*@proseware.com
//       Invalid: js@proseware..com
//       Valid: js@proseware.com9
//       Valid: j.s@server1.proseware.com
//       Valid: "j\"s\""@proseware.com
//       Valid: js@contoso.中国

基于@Cogwheel的回答,我想分享一个修改后的解决方案,适用于SSIS和“脚本组件”:

Place the "Script Component" into your Data Flow connect and then open it. In the section "Input Columns" set the field that contains the E-Mail Adresses to "ReadWrite" (in the example 'fieldName'). Switch back to the section "Script" and click on "Edit Script". Then you need to wait after the code opens. Place this code in the right method: public override void Input0_ProcessInputRow(Input0Buffer Row) { string email = Row.fieldName; try { System.Net.Mail.MailAddress addr = new System.Net.Mail.MailAddress(email); Row.fieldName= addr.Address.ToString(); } catch { Row.fieldName = "WRONGADDRESS"; } }

然后,您可以使用条件分割过滤掉所有无效记录或任何您想做的事情。

我最终使用了这个正则表达式,因为它成功地验证了逗号、注释、Unicode字符和IP(v4)域地址。

有效地址为:

“@example。org (评论)test@example。org тест@example。org ტესტი@example。org test@[192.168 . 1章1段]

 public const string REGEX_EMAIL = @"^(((\([\w!#$%&'*+\/=?^_`{|}~-]*\))?[^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))(\([\w!#$%&'*+\/=?^_`{|}~-]*\))?@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$";