Selenium 4.3: Full-Page Long Screenshots
有勇气的牛排
Python
2023-05-18 21:05:30
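<h2><a id="1__0"></a>1 Full-page long screenshot</h2>
<p>Mode: headless<br />
Drawback: the browser window is not visible, so a person cannot step in to help when the page needs manual intervention.</p>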
<pre><div class="hljs"><code class="lang-python"><span class="hljs-comment"># -*- coding: utf-8 -*-</span>
<span class="hljs-keyword">from</span> selenium <span class="hljs-keyword">import</span> webdriver
<span class="hljs-keyword">from</span> selenium.webdriver.chrome.options <span class="hljs-keyword">import</span> Options
<span class="hljs-keyword">import</span> time


<span class="hljs-keyword">def</span> <span class="hljs-title function_">get_image</span>(<span class="hljs-params">url, pic_name</span>):
    <span class="hljs-comment"># Start Chrome in headless mode, i.e. without a visible window.</span>
    <span class="hljs-comment"># This mode is required: otherwise the full page cannot be captured, only up to the height of your screen.</span>
    chrome_options = Options()
    chrome_options.add_argument(<span class="hljs-string">'--headless'</span>)
    driver = webdriver.Chrome(options=chrome_options)
    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Navigate the browser to the URL and give the page a moment to load.</span>
        driver.get(url)
        time.sleep(<span class="hljs-number">1</span>)
        <span class="hljs-comment"># The key step: use JavaScript to read the full page width and height.</span>
        <span class="hljs-comment"># execute_script works the same way for any other JavaScript you need to run.</span>
        width = driver.execute_script(<span class="hljs-string">"return document.documentElement.scrollWidth"</span>)
        height = driver.execute_script(<span class="hljs-string">"return document.documentElement.scrollHeight"</span>)
        <span class="hljs-built_in">print</span>(width, height)
        <span class="hljs-comment"># Resize the browser window to the width and height just measured.</span>
        driver.set_window_size(width, height)
        time.sleep(<span class="hljs-number">1</span>)
        driver.save_screenshot(pic_name)
    <span class="hljs-keyword">finally</span>:
        <span class="hljs-comment"># Always close the browser, even if a step above fails.</span>
        driver.quit()


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    url = <span class="hljs-string">"https://www.baidu.com/s?wd=人工智能"</span>
    pic_name = <span class="hljs-string">"image3.png"</span>
    get_image(url, pic_name)
</code></div></pre>
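<p>The fixed <code>time.sleep(1)</code> calls and the single <code>scrollHeight</code> read can come up short on pages that keep loading content. The sketch below is one possible way to make the measurement step a little more tolerant; it is not part of the original script. The helper names <code>page_size</code> and <code>wait_for_stable_height</code>, the minimum sizes, and the timeout values are all hypothetical and chosen only for illustration. Both helpers accept any object with an <code>execute_script</code> method, such as the <code>driver</code> above.</p>

```python
import time


def page_size(driver, min_width=800, min_height=600):
    """Return (width, height) of the whole document.

    `driver` is any object exposing execute_script(js), e.g. a Selenium
    WebDriver. The minimum sizes are illustrative defaults (an assumption,
    not taken from the article).
    """
    # Take the larger of the <html> and <body> scroll sizes, since some
    # pages report the full extent on only one of the two elements.
    js = (
        "return [Math.max(document.documentElement.scrollWidth,"
        " document.body ? document.body.scrollWidth : 0),"
        " Math.max(document.documentElement.scrollHeight,"
        " document.body ? document.body.scrollHeight : 0)];"
    )
    width, height = driver.execute_script(js)
    return max(int(width), min_width), max(int(height), min_height)


def wait_for_stable_height(driver, timeout=10.0, interval=0.5):
    """Poll scrollHeight until two consecutive reads match, or until timeout.

    Returns the last height that was read.
    """
    js = "return document.documentElement.scrollHeight"
    deadline = time.monotonic() + timeout
    last = driver.execute_script(js)
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = driver.execute_script(js)
        if current == last:
            return current
        last = current
    return last
```

<p>Under these assumptions, <code>get_image</code> could call <code>wait_for_stable_height(driver)</code> in place of the first <code>time.sleep(1)</code> and then pass the result of <code>page_size(driver)</code> to <code>driver.set_window_size</code>. Very tall pages may still exceed what the browser can render in one image, so treat this as a starting point rather than a guarantee.</p>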