
[Python] Scraping the ranked videos of a video site

Python | 蚂蚁 | 41℃ | 0 comments

https://www.52pojie.cn/thread-1631910-1-1.html

import requests
from lxml import etree
import time

url = 'https://www.pearvideo.com/popular'
raw_url = 'https://www.pearvideo.com/'

respon = requests.get(url)
main_page = etree.HTML(respon.text)
name_list = main_page.xpath("/html/body/div[2]/div/div[1]/ul")

pop_name, pop_num = [], []
for i in name_list:  # collect each chart's name and relative link
    pop_name += i.xpath("./li/a/text()")
    pop_num += i.xpath("./li/a/@href")
print('Available charts:', pop_name)

input_name = input('Enter a chart name: ')
if input_name in pop_name:
    pop_page_url = raw_url + pop_num[pop_name.index(input_name)]
else:
    pop_page_url = url
    print('Chart not found; the default overall chart will be used!')

respon = requests.get(pop_page_url)
top_page = etree.HTML(respon.text)
top_list = top_page.xpath('//*[@id="popularList"]')
vedio_id, file_name = [], []
for j in top_list:  # video ids and titles on the chosen chart
    vedio_id += j.xpath("./li/div/div/span/@data-id")
    file_name += j.xpath('./li/div/a/h2/text()')

for k, output_name in zip(vedio_id, file_name):
    url = f'https://www.pearvideo.com/video_{k}'
    # JSON endpoint that reports the video address for this id
    XHR_url = f'https://www.pearvideo.com/videoStatus.jsp?contId={k}&mrd=0.12841592667885604'

    # Send a browser User-Agent and the video page as the Referer
    head = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0',
            'Referer': url}
    respon = requests.get(XHR_url, headers=head)
    data = respon.json()  # parse the response once
    systemTime = data['systemTime']
    srcUrl = data['videoInfo']['videos']['srcUrl']
    # Replace the systemTime part of srcUrl with cont-<id> to get the download address
    srcUrl = srcUrl.replace(systemTime, f'cont-{k}')
    # The with block closes the file even if the download fails
    with open(f'd:/{output_name}.mp4', mode='wb') as f:
        f.write(requests.get(srcUrl).content)
    print(output_name, 'downloaded!')
    time.sleep(2)  # pause between downloads

respon.close()
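
The download loop above rewrites srcUrl inline and uses each video title directly as a file name, which can fail on Windows if a title contains characters such as ? or :. A minimal sketch of those two steps as standalone helpers follows. The function names real_video_url and safe_filename are hypothetical (they are not in the original script), and the sample URL is invented for illustration, not real site data.

```python
import re


def real_video_url(src_url: str, system_time: str, cont_id: str) -> str:
    """Swap the systemTime part of srcUrl for 'cont-<id>', as the script does."""
    return src_url.replace(system_time, f'cont-{cont_id}')


def safe_filename(title: str, fallback: str = 'video') -> str:
    """Replace characters that Windows does not allow in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', '_', title).strip(' .')
    # Fall back to a fixed name if nothing usable is left
    return cleaned or fallback


# Invented values, only to show how the helpers are called
demo_url = 'https://example.com/mp4/1651400000000-12345678-hd.mp4'
print(real_video_url(demo_url, '1651400000000', '1760000'))
print(safe_filename('a/b:c?'))
```

In the script, this would mean opening f'd:/{safe_filename(output_name)}.mp4' instead of using the raw title.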

Please credit the source when reposting: 有爱前端 » [Python] Scraping the ranked videos of a video site
