Use regex (preg_match_all) to extract e-mail addresses from a page
Submitted by winter on Tue, 10/09/2007 - 15:06.
When parsing a site to harvest e-mail addresses, we use a regex with the preg_match_all() function to get all of them. The problem is that in some HTML pages, users type their address with spaces around the @ sign, like this: abc @ gmail.com
To solve this issue, use this regex pattern:
$pattern = "/([\s]*)([_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*([ ]+|)@([ ]+|)([a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))([\s]*)/i";
preg_match_all($pattern, $html_page, $matches);
This pattern matches e-mail addresses written with or without spaces around the @ sign; each address is captured in $matches[2].
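
Because the matched text keeps whatever spaces the user typed around the @ sign, the results usually need a small clean-up step before use. The following is a minimal sketch of that step, not a definitive implementation: the sample $html_page string and the $emails variable are made up for illustration, and it assumes that simply removing spaces and dropping duplicates is enough for your data.

```php
<?php
// Hypothetical sample input; in practice $html_page holds the fetched page.
$html_page = '<p>Contact: abc @ gmail.com or john.doe@example.org</p>';

$pattern = "/([\s]*)([_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*([ ]+|)@([ ]+|)([a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))([\s]*)/i";

preg_match_all($pattern, $html_page, $matches);

// Group 2 holds each address without the surrounding whitespace,
// but still with any spaces typed around the @ sign.
$emails = array();
foreach ($matches[2] as $raw) {
    // Remove the spaces so "abc @ gmail.com" becomes "abc@gmail.com".
    $emails[] = str_replace(' ', '', $raw);
}

// Drop duplicates and reindex the array from 0.
$emails = array_values(array_unique($emails));

print_r($emails);
```

For the sample string above, $emails ends up holding abc@gmail.com and john.doe@example.org. Note that the pattern will also pick up any word that happens to sit before a spaced "@ domain.tld", so the results may still need a manual check.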
