道客巴巴 (Doc88) Crawler

Time: 2024-06-21 14:29:42

Uses the XPath Helper plugin (xpathhelp).
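An XPath expression found this way can be checked offline before it is run against live pages. The following is a minimal sketch, assuming a made-up HTML fragment that imitates the listing markup; `SAMPLE_HTML` and `extract_hrefs` are hypothetical names, and the real page structure may differ. Only the expression itself comes from the script.

```python
from lxml import etree

# Hypothetical fragment imitating a listing page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
  <h3 class="sd-type-title"><a href="/p-9139147359378.html">Doc A</a></h3>
  <h3 class="sd-type-title"><a href="/p-7367816610215.html">Doc B</a></h3>
  <h3 class="other"><a href="/about.html">About</a></h3>
</body></html>
"""


def extract_hrefs(html):
    """Return the href of every link inside <h3 class="sd-type-title">."""
    tree = etree.HTML(html)
    # Same expression as in the script: select the href attribute directly.
    return [str(h) for h in tree.xpath(".//h3[@class='sd-type-title']/a/@href")]


hrefs = extract_hrefs(SAMPLE_HTML)
print(hrefs)
```

The link under `<h3 class="other">` is skipped because the predicate `[@class='sd-type-title']` requires an exact match on the class attribute.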

import time

from lxml import etree
from selenium import webdriver  # Selenium 2.48.0 supports PhantomJS

# Listing page path:  /list-8308-0-1.html
# Document page path: /p-9139147359378.html
# The paths are site-relative, as in the source; driver.get() needs a full
# URL, so prepend the site's scheme and host before running.
driver = webdriver.PhantomJS(
    executable_path=r'C:\Users\wang\Desktop\phantomjs-2.1.1-windows (1)\bin\phantomjs.exe'
)

file_urls_list = []
for page in range(1, 30):
    time.sleep(3)  # pause between page loads
    url = "/list-8308-0-" + str(page) + ".html"
    driver.get(url)
    tree = etree.HTML(driver.page_source)
    # Document links sit in <h3 class="sd-type-title"><a href=...>
    file_urls = tree.xpath(".//h3[@class='sd-type-title']/a/@href")
    file_urls = ["/" + str(href) for href in file_urls]
    file_urls_list.extend(file_urls)
    print(file_urls)
driver.quit()

# Keep only links whose length matches a document path such as //p-7367816610215.html
with open("url.txt", "w", encoding="utf-8") as f:
    for file_url in file_urls_list:
        if len(file_url) == len("//p-7367816610215.html"):
            f.write(file_url)
            f.write("\n")
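The script keeps a link only when its length equals that of a sample document path. That breaks if document IDs vary in length. As an alternative, here is a minimal sketch that filters by pattern and also drops duplicates. The pattern (one or more slashes, `p-`, digits, `.html`) is an assumption inferred from the two sample paths in the script, and `keep_document_paths` is a hypothetical helper name.

```python
import re

# Assumed shape of a document path, inferred from samples such as
# "//p-7367816610215.html"; adjust if the site uses other forms.
DOC_PATH = re.compile(r"/+p-\d+\.html")


def keep_document_paths(paths):
    """Return the paths that look like document pages, in order, without duplicates."""
    seen = set()
    kept = []
    for path in paths:
        if DOC_PATH.fullmatch(path) and path not in seen:
            seen.add(path)
            kept.append(path)
    return kept


candidates = [
    "//p-7367816610215.html",
    "/list-8308-0-2.html",      # listing page, not a document
    "//p-7367816610215.html",   # duplicate
    "//p-123.html",             # shorter ID, rejected by a length check
]
filtered = keep_document_paths(candidates)
print(filtered)
```

Using `fullmatch` rather than `match` ensures that a path with trailing text after `.html` is rejected.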
