Python - finding email addresses in a web page using a regular expression
I'm a beginner-level student of Python. Here is the code I have to find instances of email addresses on a web page.
    page = urllib.request.urlopen("http://website/category")
    reg_ex = re.compile(r'[-a-z0-9._]+@([-a-z0-9]+)(\.[-a-z0-9]+)+', re.IGNORECASE
    m = reg_ex.search_all(page)
    m.group()
When I ran it, the Python module said there was invalid syntax, on this line:
    m = reg_ex.search_all(page)
Could someone tell me why it is invalid?
Consider an alternative:
    import re

    ## Suppose we have a text with many email addresses
    str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'

    ## Here re.findall() returns a list of all the found email strings
    emails = re.findall(r'[\w\.-]+@[\w\.-]+', str)  ## ['alice@google.com', 'bob@abc.com']
    for email in emails:
        # do something with each found email string
        print(email)
Source: https://developers.google.com/edu/python/regular-expressions
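As for why the original code is invalid: the `re.compile(...)` call is never closed with a `)`, so Python keeps reading onto the next line and reports the syntax error there, at `m = reg_ex.search_all(page)`. Two more problems would show up once that is fixed. Compiled patterns have no `search_all` method (the available ones include `findall` and `finditer`), and `urlopen` returns a response object whose body has to be read and decoded before a string pattern can be applied to it. Below is a minimal sketch of a corrected version that keeps the asker's pattern. The names `EMAIL_RE`, `find_emails` and `fetch_page` are made up for illustration, and decoding the page as UTF-8 is an assumption about the site.

```python
import re
import urllib.request

# The closing parenthesis after re.IGNORECASE is what the original line lacked.
EMAIL_RE = re.compile(r'[-a-z0-9._]+@([-a-z0-9]+)(\.[-a-z0-9]+)+', re.IGNORECASE)


def find_emails(text):
    """Return every substring of text that matches EMAIL_RE, in order."""
    # finditer() yields match objects; group(0) is the whole match.
    # With this pattern, findall() would return tuples of the two
    # capture groups instead of the full addresses.
    return [m.group(0) for m in EMAIL_RE.finditer(text)]


def fetch_page(url):
    """Download url and return its body as text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = "purple alice@google.com, blah monkey bob@abc.com blah dishwasher"
    for email in find_emails(sample):
        print(email)
```

For a real page, something like `find_emails(fetch_page("http://website/category"))` should work, with the placeholder URL from the question swapped for an actual address.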