Python 3 crawling: fetching a web page with urllib while carrying a cookie
As shown below:
import urllib.request
import urllib.parse

url = 'https://weibo.cn/5273088553/info'

# Plain access, without a cookie:
# headers = {
#     'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'
# }

# Access carrying the cookie. Note: the request line
# "GET https://weibo.cn/5273088553/info HTTP/1.1" copied from the browser's
# developer tools is not a header, so it must not go into this dict.
headers = {
    'Host': 'weibo.cn',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    # 'Referer': 'https://weibo.cn/',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cookie': '_T_WM=c1913301844388de10cba9d0bb7bbf1e;SUB=_2A253Wy_dDeRhGeNM7FER-CbJzj-IHXVUp7GVrDV6PUJbkdANLXPdkW1NSesPJZ6v1GA5MyW2HEUb9ytQW3NYy19U;SUHB=0bt8SpepeGz439;SCF=Aua-HpSw5-z78-02NmUv8CTwXZCMN4XJ91qYSHkDXH4W9W0fCBpEI6Hy5E6vObeDqTXtfqobcD2D32r0O_5jSRk.;SSOLoginState=1516199821',
}

request = urllib.request.Request(url=url, headers=headers)
response = urllib.request.urlopen(request)

# Print the whole response
# print(response.read().decode('gbk'))

# Write the content to a file
with open('weibo.html', 'wb') as fp:
    fp.write(response.read())
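Pasting a cookie string into the headers works, but the standard library also offers a hands-off alternative: `http.cookiejar.CookieJar` combined with `urllib.request.HTTPCookieProcessor` captures any `Set-Cookie` response headers and re-sends them on later requests automatically. Below is a minimal sketch of that approach; the commented-out URLs are only placeholders, not a real login flow.

```python
import urllib.request
import http.cookiejar

# A CookieJar stores cookies received in Set-Cookie response headers.
jar = http.cookiejar.CookieJar()

# An opener built with HTTPCookieProcessor attaches the jar's cookies
# to every request made through it, and updates the jar from responses.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# First request captures the server's cookies into `jar`; later requests
# through the same opener send them back automatically (URLs are placeholders):
# opener.open('https://weibo.cn/...')
# opener.open('https://weibo.cn/5273088553/info')
```

This avoids copying a session cookie out of the browser by hand, at the cost of needing a request (such as a login) that actually sets the cookie first.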
That concludes this walkthrough of fetching a web page with urllib while carrying a cookie in Python 3. I hope it serves as a useful reference, and thank you for supporting 毛票票.