Python crawlers: handling request headers, network timeouts, and proxy services

2021-01-16 07:12

Tags: search, ble, user, headers, opening web pages, sending requests, requests, printing headers

1. Handling request headers

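Sometimes a request to a server, whether GET or POST, comes back with a 403 error because the server has refused access. In that case we can send header information that imitates a browser, which gets past this kind of simple anti-crawler check.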

import requests

# URL of the page to fetch
url = 'https://www.baidu.com/'
# Request headers that mimic a browser
headers = {'User-Agent': 'OW64; rv:59.0) Gecko/20100101 Firefox/59.0'}
# Send the GET request with the custom headers
response = requests.get(url, headers=headers)
# Print the page source as raw bytes
print(response.content)

Result:

b'\n\n\n    \n    \n
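When many requests need the same browser-like headers, one possible approach is to set them once on a requests.Session and inspect what would be sent before touching the network. The sketch below is only an illustration: the helper names build_session, preview_headers and fetch are hypothetical, and the User-Agent string is an example value chosen for this sketch, not one taken from the article.

```python
import requests

# Illustrative User-Agent value (an assumption made for this sketch).
EXAMPLE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) "
    "Gecko/20100101 Firefox/59.0"
)


def build_session(user_agent=EXAMPLE_UA):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def preview_headers(session, url):
    """Prepare a GET request without sending it and return its headers."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return dict(prepared.headers)


def fetch(session, url):
    """Send the request; raises requests.HTTPError on 4xx/5xx such as 403."""
    response = session.get(url, timeout=5)
    response.raise_for_status()
    return response.text


if __name__ == "__main__":
    s = build_session()
    # Shows the merged headers; no network traffic happens here.
    print(preview_headers(s, "https://www.baidu.com/"))
```

Calling fetch(s, url) would then send the request with those headers. Using raise_for_status() turns a 403 into an exception instead of letting it pass silently as a normal response.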

2. Network timeouts

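When you visit a web page and it does not respond for a long time, the system treats the request as timed out and the page cannot be opened. The code below simulates a network timeout.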

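import requests

# Send 50 requests in a loop
for a in range(50):
    try:
        # Set the timeout to 0.5 seconds
        response = requests.get('https://www.baidu.com/', timeout=0.5)
        # Print the status code
        print(response.status_code)
    # Catch any request error, including timeouts
    except requests.exceptions.RequestException as e:
        # Print the exception message
        print('Exception: ' + str(e))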

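The code above sends 50 requests in a loop with the timeout set to 0.5 seconds. If the server does not respond within 0.5 seconds, the request is treated as timed out and the program prints the timeout information to the console.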
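Rather than only printing the error, a crawler often wants to retry a request that timed out. The following is a minimal sketch of that idea, under a few assumptions: the helper name get_with_retries, its default values, and the injectable get parameter are all hypothetical and were introduced so the logic can be tried without a network. It relies on requests.exceptions.Timeout, which covers both connect and read timeouts, and on the (connect, read) tuple form of the timeout argument.

```python
import time

import requests


def get_with_retries(url, retries=3, timeout=(3.05, 0.5), backoff=0.1,
                     get=requests.get):
    """Call get(url, timeout=...) up to `retries` times, retrying on timeouts.

    timeout is a (connect, read) tuple in seconds, as accepted by requests.
    `get` is injectable (a hypothetical convenience) for offline testing.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return get(url, timeout=timeout)
        except requests.exceptions.Timeout as exc:
            last_error = exc
            print(f"attempt {attempt + 1} timed out: {exc}")
            if attempt < retries - 1:
                # Simple exponential backoff before the next attempt
                time.sleep(backoff * (2 ** attempt))
    # Every attempt timed out: re-raise the last timeout error
    raise last_error
```

A call such as get_with_retries('https://www.baidu.com/') would return the response from the first attempt that succeeds. Errors other than timeouts, such as a refused connection, are deliberately not retried here and propagate to the caller.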

Original article: https://www.cnblogs.com/xiao02fang/p/12927267.html
