Python Programming Tips (《python编程锦囊》): Web Crawlers
2021-02-11 21:16
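1. Crawling the China Weather website

The script requests the current weather conditions for Beijing (city code 101010100) and prints selected fields; the result is shown after the code.

Note: to obtain the User-Agent and Referer values, open Google Chrome, go to the China Weather website at http://www.weather.com.cn/weather1d/101010100.shtml, then right-click the page and choose Inspect to open the developer tools. Select Network > Headers and copy the User-Agent and Referer values from the request headers.

Original article: https://www.cnblogs.com/xiao02fang/p/12732921.html

#!/usr/bin/env python3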
# Import the HTTP request module
import requests
# Import the JSON module
import json

# Request headers: set the "User-Agent" and "Referer" values copied from the browser's network tools
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36',
    'Referer': 'http://www.weather.com.cn/weather1d/101010100.shtml',
}
# URL to request
url = 'http://d1.weather.com.cn/sk_2d/101010100.html?_=1555986533582'
# Send the request
response = requests.get(url, headers=headers)
# Check whether the request succeeded
if response.status_code == 200:
    # Set the encoding used to decode the response body
    response.encoding = 'utf-8'
    # Print the raw response text
    print('发送网页请求,返回的网页内容:', response.text)
    # Strip the "var dataSK =" prefix so that only the JSON string remains
    json_str = response.text.replace('var dataSK =', '')
    print('解码前,返回的json类型:', json_str)
    # Convert the JSON string to a dict
    json_info = json.loads(json_str)
    # Print the resulting dict
    print('解码后,返回的字典类型:', json_info)
    # Print the city
    print('城市:', json_info['cityname'])
    # Print the temperature
    print('当前温度:', json_info['temp'])
    # Print the relative humidity
    print('相对湿度:', json_info['SD'])
    # Print the wind direction and wind level
    print('风向等级:', json_info['WD'], json_info['WS'])
    # Print the air quality (PM2.5)
    print('空气质量pm2.5:', json_info['aqi_pm25'])
    # Print the vehicle plate-number restriction
    print('车辆限号为:', json_info['limitnumber'])
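The script hard-codes the `_` query parameter and removes the `var dataSK =` prefix with a plain string replace. The sketch below shows one possible way to make both steps a little more tolerant. The helper names `build_url` and `parse_payload` are hypothetical, and treating `_` as a millisecond timestamp is an assumption based only on how the value in the original URL looks; the site may not require it at all.

```python
import json
import time

# URL pattern copied from the script; only the city code varies.
BASE_URL = "http://d1.weather.com.cn/sk_2d/{city}.html"


def build_url(city_code: str) -> str:
    """Hypothetical helper: build the request URL for a city code.

    Assumption: the "_" parameter is a millisecond timestamp.
    """
    millis = int(time.time() * 1000)
    return f"{BASE_URL.format(city=city_code)}?_={millis}"


def parse_payload(text: str) -> dict:
    """Hypothetical helper: extract the JSON object from 'var dataSK = {...}'.

    Slicing from the first "{" to the last "}" also tolerates a trailing
    semicolon or extra whitespace around the assignment.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response text")
    return json.loads(text[start:end + 1])


# Offline check with a shortened copy of the response shown in this article.
sample = 'var dataSK = {"cityname":"北京","temp":"18","SD":"53%"}'
info = parse_payload(sample)
print(info["cityname"], info["temp"], info["SD"])
```

With these helpers, the request line in the script could become `requests.get(build_url('101010100'), headers=headers)`, and `parse_payload(response.text)` would stand in for the replace and `json.loads` steps.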
Result of running the script:

发送网页请求,返回的网页内容: var dataSK = {"nameen":"beijing","cityname":"北京","city":"101010100","temp":"18","tempf":"64","WD":"西风","wde":"W","WS":"3级","wse":"<12km/h","SD":"53%","time":"18:22","weather":"多云","weathere":"Cloudy","weathercode":"d01","qy":"1002","njd":"8.93km","sd":"53%","rain":"0.0","rain24h":"0","aqi":"67","limitnumber":"不限行","aqi_pm25":"67","date":"04月19日(星期日)"}
解码前,返回的json类型: {"nameen":"beijing","cityname":"北京","city":"101010100","temp":"18","tempf":"64","WD":"西风","wde":"W","WS":"3级","wse":"<12km/h","SD":"53%","time":"18:22","weather":"多云","weathere":"Cloudy","weathercode":"d01","qy":"1002","njd":"8.93km","sd":"53%","rain":"0.0","rain24h":"0","aqi":"67","limitnumber":"不限行","aqi_pm25":"67","date":"04月19日(星期日)"}
解码后,返回的字典类型: {'nameen': 'beijing', 'cityname': '北京', 'city': '101010100', 'temp': '18', 'tempf': '64', 'WD': '西风', 'wde': 'W', 'WS': '3级', 'wse': '<12km/h', 'SD': '53%', 'time': '18:22', 'weather': '多云', 'weathere': 'Cloudy', 'weathercode': 'd01', 'qy': '1002', 'njd': '8.93km', 'sd': '53%', 'rain': '0.0', 'rain24h': '0', 'aqi': '67', 'limitnumber': '不限行', 'aqi_pm25': '67', 'date': '04月19日(星期日)'}
城市: 北京
当前温度: 18
相对湿度: 53%
风向等级: 西风 3级
空气质量pm2.5: 67
车辆限号为: 不限行
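Every value in the returned dict is a string, including the temperature ("18") and the humidity ("53%"). If the numbers are needed for comparison or arithmetic, they have to be converted first. The following is a minimal sketch of such a conversion; `to_number` is a hypothetical helper, and the sample values are copied from the output above.

```python
from typing import Optional


def to_number(value: str, suffix: str = "") -> Optional[float]:
    """Hypothetical helper: convert a string such as '18' or '53%' to a float.

    Returns None when the field is empty or not numeric.
    """
    cleaned = value.strip()
    if suffix and cleaned.endswith(suffix):
        cleaned = cleaned[: -len(suffix)]
    try:
        return float(cleaned)
    except ValueError:
        return None


# Values copied from the sample output in this article.
json_info = {"temp": "18", "SD": "53%", "aqi_pm25": "67", "WS": "3级"}

temperature = to_number(json_info["temp"])
humidity = to_number(json_info["SD"], suffix="%")
pm25 = to_number(json_info["aqi_pm25"])
wind_level = to_number(json_info["WS"], suffix="级")

print(temperature, humidity, pm25, wind_level)
```

Returning `None` instead of raising keeps the caller simple when a field is missing a value, at the cost of having to check for `None` before doing arithmetic.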