Python Crawler Network Request Timeouts - Writing a Python Web Crawler, Part 5: Using Proxies, Handling Exceptions and Timeouts

Date: 2018-07-10 23:36:15

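# coding=utf-8
'''
If too many HTTP requests are sent too frequently from the same address, the website may block that address.
One way around this is to keep switching proxies.
If an error occurs while accessing a URL with urllib2,
Python's exception handling mechanism can be used to retrieve the error details.
Finally, urlopen accepts a timeout parameter that sets the timeout in seconds.
'''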

from bs4 import BeautifulSoup  # import BeautifulSoup
from lxml import etree
# from urllib2 import Request, urlopen, URLError, HTTPError
import urllib2

# Set the proxy server used for requests.
# urllib2.install_opener() sets urllib2's global opener.
proxy_handler = urllib2.ProxyHandler({"http": 'http://1.183.125.230:24470/'})
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)

url = ''  # URL of the page you want to crawl
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/0101 Firefox/33.0')

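# Handle access errors
try:
    response = urllib2.urlopen(req, timeout=5)
except urllib2.URLError as e:
    # e.reason may be an exception object rather than a string, so format it
    # instead of concatenating it
    print("Failed to access url due to %s" % e.reason)
else:
    the_page = response.read()
    print(the_page)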
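
The script above installs a single proxy globally, while the opening note recommends switching proxies continually. The following is a minimal sketch of that idea, written for Python 3 on the assumption that the reader has moved off Python 2 (there, urllib2 was split into urllib.request and urllib.error). The names build_proxy_opener, fetch_with_proxies and the make_opener parameter are hypothetical helpers introduced for illustration and do not come from the original script. It builds one opener per proxy instead of calling install_opener, and also catches socket.timeout, since a timeout during read() is not necessarily wrapped in URLError.

```python
# Python 3 sketch: urllib2 was split into urllib.request and urllib.error.
import socket
import urllib.error
import urllib.request

# Same User-Agent string as the script above.
USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/0101 Firefox/33.0"


def build_proxy_opener(proxy_url):
    """Return an opener that routes http requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_with_proxies(url, proxies, timeout=5, make_opener=build_proxy_opener):
    """Try each proxy in order and return (body, errors).

    body is the response bytes from the first proxy that works, or None
    if every proxy failed. errors lists (proxy, message) for each failure.
    make_opener is a hypothetical hook so the opener can be swapped out.
    """
    errors = []
    for proxy in proxies:
        opener = make_opener(proxy)
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        try:
            response = opener.open(request, timeout=timeout)
            try:
                return response.read(), errors
            finally:
                response.close()
        except urllib.error.HTTPError as e:
            # The server answered, but with an error status (404, 503, ...).
            # HTTPError subclasses URLError, so it must be caught first.
            errors.append((proxy, "HTTP %d" % e.code))
        except urllib.error.URLError as e:
            # Connection-level failure; e.reason may be an exception object.
            errors.append((proxy, "URL error: %s" % e.reason))
        except socket.timeout:
            # A timeout while reading the body is not wrapped in URLError.
            errors.append((proxy, "timed out"))
    return None, errors
```

A call such as fetch_with_proxies(url, ['http://1.183.125.230:24470/']) would try the proxy from the script above; passing several proxy URLs makes the function fall through to the next one whenever a request fails or times out. Using a per-proxy opener rather than install_opener keeps the proxy choice local to each request.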
