How to crawl a web page with urllib while carrying a cookie (Python 3)
The code is shown below:
import urllib.request
import urllib.parse

url = 'https://weibo.cn/5273088553/info'

# Normal access (no cookie):
# headers = {
#     'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'
# }

# Access while carrying a cookie.
# Note: the request line copied from DevTools
# ('GET https://weibo.cn/5273088553/info HTTP/1.1') is not an HTTP
# header and must not be placed in this dict.
headers = {
    'Host': 'weibo.cn',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    # 'Referer': 'https://weibo.cn/',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cookie': '_T_WM=c1913301844388de10cba9d0bb7bbf1e; SUB=_2A253Wy_dDeRhGeNM7FER-CbJzj-IHXVUp7GVrDV6PUJbkdANLXPdkW1NSesPJZ6v1GA5MyW2HEUb9ytQW3NYy19U; SUHB=0bt8SpepeGz439; SCF=Aua-HpSw5-z78-02NmUv8CTwXZCMN4XJ91qYSHkDXH4W9W0fCBpEI6Hy5E6vObeDqTXtfqobcD2D32r0O_5jSRk.; SSOLoginState=1516199821',
}

request = urllib.request.Request(url=url, headers=headers)
response = urllib.request.urlopen(request)

# Print the whole response:
# print(response.read().decode('gbk'))

# Write the content to a file:
with open('weibo.html', 'wb') as fp:
    fp.write(response.read())
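Pasting a `Cookie` header by hand works, but it goes stale as soon as the session expires. A more maintainable alternative is to let urllib manage cookies automatically via `http.cookiejar`: any `Set-Cookie` headers in a response are stored in the jar and sent back on later requests through the same opener. This is a minimal sketch; the URL is the same profile page used above, and in practice the jar would be populated by a real logged-in session.

```python
import urllib.request
import http.cookiejar

# A jar that collects cookies from responses and replays them on
# subsequent requests made through this opener.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar)
)

# Default headers sent with every request from this opener.
opener.addheaders = [
    ('User-Agent',
     'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
     '(KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'),
]

# Uncomment to fetch the page; cookies set by the server are then
# stored in cookie_jar and reused automatically:
# response = opener.open('https://weibo.cn/5273088553/info')
# with open('weibo.html', 'wb') as fp:
#     fp.write(response.read())
```

With this approach there is no hard-coded `Cookie` string to refresh; the jar tracks whatever the server issues during the session.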
That is the whole of this article on crawling a web page with urllib while carrying a cookie in Python 3. I hope it serves as a useful reference, and I hope you will continue to support 毛票票.