

      Passing URL parameters in requests GET requests, and parameters that have no effect

      2019-07-18 197 likes 老董笔记

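        Many sites use URLs that carry parameters (http://www.xxx.com/get?key1=val1&key2=val2). For example, if you search Baidu for www.bdd33.com, the URL of the results page is a very long string, yet a results page can still be reached with only some of the parameters, e.g. https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com. How does requests send a request to a URL with parameters like this?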

        1. How to pass URL parameters

      Official documentation: You often want to send some sort of data in the URL's query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument. As an example, if you wanted to pass key1=value1 and key2=value2 to httpbin.org/get, you would use the following code:

      # -*- coding: utf-8 -*-
      import requests
      
      payload = {'key1': 'value1', 'key2': 'value2'}
      r = requests.get("http://httpbin.org/get", params=payload)
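
To see what the params dictionary turns into without sending anything over the network, one option is to build the request and prepare it locally. The sketch below assumes the `requests.Request(...).prepare()` API; the list-valued `key2` follows the example in the official documentation, and the variable names are illustrative only.

```python
import requests

# Build the request locally; prepare() encodes params into the URL
# without opening a network connection.
payload = {'key1': 'value1', 'key2': 'value2'}
prepared = requests.Request('GET', 'http://httpbin.org/get', params=payload).prepare()
print(prepared.url)

# A list value repeats the key once per item.
multi = {'key1': 'value1', 'key2': ['value2', 'value3']}
prepared_multi = requests.Request('GET', 'http://httpbin.org/get', params=multi).prepare()
print(prepared_multi.url)
```

Preparing a request this way is a convenient check while debugging a crawler, because the target site is never contacted.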
      

        In the same way, the Baidu search results page for python66.com can be requested like this:

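      import requests

      headers = {
          'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
      }
      # tn and word are the two query parameters of the Baidu URL shown earlier
      payload = {'tn': '50000021_hao_pg', 'word': 'python66.com'}

      # Search results are served from /s, so that path has to be part of the URL
      r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)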
      

        2. What passing URL parameters really does

      # -*- coding: utf-8 -*-
      import requests

      headers = {
          'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
      }
      payload = {'tn': '50000021_hao_pg', 'word': 'python66.com'}

      r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)
      # r.url holds the URL of the response, query string included
      print(r.url)

      https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com
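
The query string itself can be reproduced with the standard library, which may help when checking what a params dictionary should look like once encoded. This is a rough sketch using `urllib.parse.urlencode`, not a copy of what requests does internally; `build_url` and `base_url` are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append an encoded query string to base_url (illustrative helper)."""
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


payload = {'tn': '50000021_hao_pg', 'word': 'python66.com'}
url = build_url("https://www.baidu.com/s", payload)
print(url)

# Characters that are not URL-safe get encoded; urlencode turns a space into '+'.
print(build_url("https://www.baidu.com/s", {'word': 'python requests'}))
```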
      

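        As the output above shows, passing parameters through params is really no different from requesting the fully assembled URL; requests simply builds the query string internally. The Baidu search example can therefore be written directly as follows: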

      # -*- coding: utf-8 -*-
      import requests
      
      headers = {
      'user-agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
      }
      # The query string is already written into the URL, so no params dict is needed
      
      r = requests.get("https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com", headers=headers)
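
The two styles can also be mixed: when the URL already carries a query string, requests appends the params entries to it. A minimal sketch, preparing the request locally rather than contacting Baidu (the variable names are illustrative):

```python
import requests

# tn is written into the URL by hand, word is supplied through params.
req = requests.Request(
    'GET',
    'https://www.baidu.com/s?tn=50000021_hao_pg',
    params={'word': 'python66.com'},
)
mixed_url = req.prepare().url
print(mixed_url)
```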
      

        3. Parameters that have no effect

      Official documentation: Note that any dictionary key whose value is None will not be added to the URL's query string.

        In other words, giving a key the value None has the same effect as leaving that parameter out of the URL altogether.

      # -*- coding: utf-8 -*-
      import requests

      headers = {
          'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
      }
      # tn is None, so it is dropped from the query string
      payload = {'word': 'python66.com', 'tn': None}

      r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)
      print(r.url)

      https://www.baidu.com/s?word=python66.com
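
The None rule is easy to confuse with passing an empty string, which behaves differently. The sketch below compares the two without sending a request; `prepared_url` is a hypothetical helper built on `requests.Request(...).prepare()`.

```python
import requests


def prepared_url(params):
    """Return the URL requests would build for params, without sending it."""
    return requests.Request('GET', 'https://www.baidu.com/s', params=params).prepare().url


# None: the key is left out of the query string entirely.
print(prepared_url({'word': 'python66.com', 'tn': None}))

# Empty string: the key is kept, with an empty value.
print(prepared_url({'word': 'python66.com', 'tn': ''}))
```

This makes None a convenient placeholder for optional parameters: the dictionary can always list every key, and only the ones that were actually set end up in the URL.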
      
