
      Python multithreaded one-to-one keyword and URL rank checker for Baidu mobile

      2020-08-24 179 likes python中国网

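        An earlier article covered rank checking for Baidu's PC results. Checking rankings on Baidu mobile (mo) is no harder, since the principle is the same. The script in this article checks rankings one-to-one: each keyword is paired with one specific URL. The thread count defaults to 1. Baidu's anti-scraping measures are stricter than they used to be, so it is best to stay at 1 thread. [Note: when several threads write to the same file, the writes need a lock, otherwise the output may be corrupted.]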
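The note about locking can be illustrated with a minimal sketch. It is not the article's script: `write_rows` and `run_writers` are hypothetical helper names, and an in-memory `io.StringIO` stands in for the result file so the example runs without touching disk. The idea is that each thread builds its line first and holds a shared `threading.Lock` only while writing it.

```python
import io
import threading


def write_rows(out, lock, rows):
    """Write (keyword, url, rank) rows to `out`, one tab-separated line each."""
    for kwd, url, rank in rows:
        line = "{}\t{}\t{}\n".format(kwd, url, rank)
        # Hold the lock only for the write itself, so other threads wait briefly.
        with lock:
            out.write(line)


def run_writers(out, batches):
    """Start one thread per batch and wait until all of them have finished."""
    lock = threading.Lock()
    threads = [
        threading.Thread(target=write_rows, args=(out, lock, batch))
        for batch in batches
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Hypothetical rows; the order of the lines depends on thread scheduling.
buf = io.StringIO()
run_writers(buf, [
    [("kwd-a", "https://example.com/a/", 1)],
    [("kwd-b", "https://example.com/b/", None)],
])
print(buf.getvalue(), end="")
```

With a single thread the lock is never contended, so adding it costs almost nothing and keeps the script safe if the thread count is raised later.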

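        1. kwd_url.txt: each line holds one keyword and one URL, separated by a tab (copy the two columns straight from Excel); the URL must include http or https

        2. http and https are treated as different URLs

        3. http://aaa/bbb/ and http://aaa/bbb are treated as different URLs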
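The three rules above can be sketched as a small parser plus an exact string comparison. This is an illustrative sketch, not part of the original script: `parse_line` and `same_url` are hypothetical names, and rejecting malformed lines by returning `None` is an assumption about how one might want to handle bad input.

```python
def parse_line(line):
    """Split one kwd_url.txt line into (keyword, url); return None if malformed."""
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) != 2:
        return None
    kwd, url = parts[0].strip(), parts[1].strip()
    # Rule 1: the URL must carry an explicit http or https scheme.
    if not kwd or not url.startswith(("http://", "https://")):
        return None
    return kwd, url


def same_url(a, b):
    """Rules 2 and 3: compare URLs as exact strings, with no normalization."""
    return a == b


print(parse_line("北京二手车出售\thttps://m.renrenche.com/bj/ershouche/\n"))
print(same_url("http://aaa/bbb/", "http://aaa/bbb"))
print(same_url("http://aaa/bbb", "https://aaa/bbb"))
```

Because the comparison is exact, a URL listed without its trailing slash will be reported as unranked even when the slashed version ranks, so the input file should use the URL exactly as Baidu displays it.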

      # -*- coding: utf-8 -*-
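      """
      One-to-one keyword/URL rank check; only the top ten results are checked.
      kwd_url.txt: one keyword and one URL per line, separated by a tab
      (copy straight from Excel); the URL must include http or https.
      http and https are treated as different.
      http://aaa/bbb/ and http://aaa/bbb are treated as different.
      """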
      
      import requests
      from pyquery import PyQuery as pq
      import threading
      import queue
      import gc
      import json
      
      
      class BdmoRank(threading.Thread):
      
          def __init__(self):
              threading.Thread.__init__(self)
      
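          # Read the txt file and queue every (keyword, url) pair to check
          @staticmethod
          def read_txt(filepath):
              q = queue.Queue()
              # Use a context manager so the input file is always closed
              with open(filepath, encoding='utf-8') as fh:
                  for line in fh:
                      kwd_url = line.strip().split('\t')
                      # Skip blank or malformed lines (not exactly two columns)
                      if len(kwd_url) == 2:
                          q.put(kwd_url)
              return q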
      
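          # Fetch the SERP source for one keyword
          def get_html(self, url, retry=2):
              try:
                  r = requests.get(url=url, headers=user_agent, timeout=5)
              except Exception as e:
                  print('Failed to fetch page source', url, e)
                  if retry > 0:
                      # Return the retried result; without `return` it was discarded
                      return self.get_html(url, retry - 1)
                  return None
              else:
                  return r.text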
      
          # Collect the data-log attribute of each result block (div) in the SERP
          def get_data_logs(self, html):
              data_logs = []
              # '百度' ("Baidu") must appear in the page; presumably a sanity
              # check that a real results page came back
              if html and '百度' in html:
                  doc = pq(html)
                  try:
                      div_list = doc('.c-result').items()
                  except Exception as e:
                      print('Failed to extract result blocks', e)
                  else:
                      for div in div_list:
                          data_log = div.attr('data-log')
                          if data_log is not None:
                              data_logs.append(data_log)
              return data_logs
      
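          # Check whether the URL ranks on the first results page
          def check_include(self, url, data_logs=None):
              rank = None
              # Avoid a mutable default argument; treat None as "no results"
              for data_log in data_logs or []:
                  # JSON strings must use double quotes
                  data_log = json.loads(data_log.replace("'", '"'))
                  if url == data_log['mu']:
                      rank = data_log['order']
                      break
              return url, rank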
      
          # Thread body: take pairs off the queue until the program exits
          def run(self):
              while True:
                  kwd_url = q.get()
                  try:
                      kwd = kwd_url[0]
                      url_check = kwd_url[1]
                      url = "https://m.baidu.com/s?ie=utf-8&word={0}".format(kwd)
                      html = self.get_html(url)
                      data_logs = self.get_data_logs(html)
                      url, rank = self.check_include(url_check, data_logs)
                      print(kwd, url, rank)
                      # Serialize writes so lines from different threads cannot mix;
                      # a tab now separates the keyword from the URL as well
                      with write_lock:
                          f.write(kwd + '\t' + url + '\t' + str(rank) + '\n')
                  except Exception as e:
                      print(e)
                  finally:
                      q.task_done()
      
      
      if __name__ == "__main__":
      
          user_agent = {
              'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Mobile Safari/537.36'}
          q = BdmoRank.read_txt('kwd_url.txt')
          f = open('bdmo_rank1.txt', 'w', encoding='utf-8')
          # Shared lock guarding writes to the result file
          write_lock = threading.Lock()
          # Number of worker threads (1 is recommended given Baidu's anti-scraping)
          for i in range(1):
              t = BdmoRank()
              t.daemon = True
              t.start()
          q.join()
          f.flush()
          f.close()
      
      
      北京二手车出售 https://m.renrenche.com/bj/ershouche/ 2
      鞍山二手宝沃 https://m.renrenche.com/cn/baowo_baowoBX7/ None
      鞍山二手北汽新能源 https://m.renrenche.com/cn/beiqixinnengyuan/jishou/ None
      鞍山二手北汽威旺 https://m.renrenche.com/as/ None
      鞍山华泰新能源二手车报价 https://m.renrenche.com/cn/huataixinnengyuan/ 5
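To make the matching step in check_include easier to follow, here is a minimal offline sketch that runs on hand-written strings instead of a live results page. The sample `data-log` values are hypothetical and only mimic the two fields the script reads (`mu` for the result URL, `order` for its position); the real attribute may carry other fields or a different layout. `find_rank` is a hypothetical name, and unlike the original method it skips entries that fail to parse instead of raising.

```python
import json


def find_rank(target_url, data_logs):
    """Return the 'order' of the first entry whose 'mu' equals target_url, else None."""
    for raw in data_logs:
        try:
            # The attribute uses single quotes, which JSON does not accept.
            entry = json.loads(raw.replace("'", '"'))
        except ValueError:
            # Skip entries that are not valid JSON after the quote swap.
            continue
        if entry.get("mu") == target_url:
            return entry.get("order")
    return None


# Hypothetical data-log strings, for illustration only.
sample_logs = [
    "{'order':'1','mu':'https://example.com/a/'}",
    "{'order':'2','mu':'https://m.renrenche.com/bj/ershouche/'}",
    "not json",
]
print(find_rank("https://m.renrenche.com/bj/ershouche/", sample_logs))
print(find_rank("https://example.com/a", sample_logs))
```

The second lookup finds nothing because the URL lacks the trailing slash, which is the same behavior that produces the None rows in the output above when the exact URL is not on the first page.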
      


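        That is the complete code for the multithreaded one-to-one keyword and URL rank checker for Baidu mobile. If you run into problems, please let me know.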
