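I'm currently using Selenium WebDriver to parse a Facebook user's friends page and extract all the IDs from the AJAX script. But I need to scroll down to load all of the friends. How can I scroll down in Selenium? I'm using Python.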


Current answer

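This code scrolls to the bottom without a fixed wait on every pass. It keeps scrolling and stops once it reaches the bottom (or times out).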

from selenium import webdriver
import time

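# Selenium 4.6+ finds a matching chromedriver on its own; the executable_path
# argument used in older versions was removed in Selenium 4.10.
driver = webdriver.Chrome()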
driver.get('https://example.com')

pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
run_time, max_run_time = 0, 1
while True:
    iteration_start = time.time()
    # Scroll webpage, the 100 allows for a more 'aggressive' scroll
    driver.execute_script('window.scrollTo(0, 100*document.body.scrollHeight);')

    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')

    scrolled = post_scroll_height != pre_scroll_height
    timed_out = run_time >= max_run_time

    if scrolled:
        run_time = 0
        pre_scroll_height = post_scroll_height
    elif not scrolled and not timed_out:
        run_time += time.time() - iteration_start
    elif not scrolled and timed_out:
        break

# closing the driver is optional 
driver.close()

This is much faster than waiting 0.5 to 3 seconds for each response, since a response may take only about 0.1 seconds.
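The same idea can be wrapped in a reusable function. The following is only a minimal sketch: the name `scroll_until_idle` and the parameters `idle_timeout` and `poll_interval` are made up for illustration, and `driver` is assumed to be an already created WebDriver (anything exposing `execute_script` will do). It uses `time.monotonic()` for the idle clock and adds a short sleep between polls, which the original loop does not have.

```python
import time


def scroll_until_idle(driver, idle_timeout=1.0, poll_interval=0.05):
    """Scroll down until the page height stops growing for idle_timeout seconds.

    `driver` is assumed to be a Selenium WebDriver (anything exposing
    execute_script works). Returns the final scroll height.
    """
    height = driver.execute_script("return document.body.scrollHeight;")
    idle_since = time.monotonic()  # moment the height last changed
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        new_height = driver.execute_script("return document.body.scrollHeight;")
        if new_height != height:
            # New content arrived: remember the height and restart the idle clock.
            height = new_height
            idle_since = time.monotonic()
        elif time.monotonic() - idle_since >= idle_timeout:
            # No growth for the whole idle window: assume the bottom was reached.
            return height
        time.sleep(poll_interval)  # brief pause so the browser is not hammered
```

Calling `scroll_until_idle(driver)` after `driver.get(...)` would then replace the inline loop; a larger `idle_timeout` is probably needed on slow connections.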

Other answers

Scrolling pages that load more content as you scroll, for example Medium, Quora, etc.

import time

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll to 1000px above the bottom of the page.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight - 1000);")
    # Pause so new content can load. implicitly_wait() only sets the
    # element-lookup timeout and does not pause, so time.sleep() is used here.
    time.sleep(2)  # seconds; an assumed value, tune it for your connection
    new_height = driver.execute_script("return document.body.scrollHeight")

    if new_height == last_height:
        break
    last_height = new_height
driver.quit()

If you want to scroll to the bottom of an infinite-scroll page (such as linkedin.com), you can use the code below:

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

Reference: https://stackoverflow.com/a/28928684/1316860
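On a feed that never runs out of content, the loop above would never hit its `break`. One possible safeguard, shown here only as a sketch, is to cap the number of passes. The function name `scroll_with_limit` and the `max_scrolls` default are assumptions for illustration, not part of the referenced answer.

```python
import time


def scroll_with_limit(driver, pause=0.5, max_scrolls=50):
    """Scroll to the bottom repeatedly, giving up after max_scrolls passes.

    Returns True if the height stopped changing (bottom reached) and
    False if the cap was hit first. max_scrolls is an illustrative default.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load more content
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return True
        last_height = new_height
    return False
```

The boolean return value lets the caller tell "reached the bottom" apart from "stopped early", which may matter when every item on the page has to be collected.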

A loop that scrolls the page using the send_keys method:

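import time

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
while True:
    # find_element_by_tag_name was removed in Selenium 4.3; use find_element(By...).
    driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.END)
    time.sleep(5)  # wait for new content to load
    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')

    print(pre_scroll_height, post_scroll_height)
    if pre_scroll_height == post_scroll_height:
        break
    pre_scroll_height = post_scroll_height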

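If you want to scroll inside a specific view/frame (a WebElement), you only need to replace "body" with the specific element you want to scroll. In the example below, I get that element with "getElementById":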

self.driver.execute_script('window.scrollTo(0, document.getElementById("page-manager").scrollHeight);')

This is the case on YouTube, for example...
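Note that the line above still scrolls the window, using the element's height as the target. If the element is itself the scrollable container, another option is to set its `scrollTop` directly. This is a sketch only: the helper name `scroll_element_to_bottom` is made up here, and `element` is assumed to be a WebElement you have already located.

```python
def scroll_element_to_bottom(driver, element):
    """Scroll a scrollable container to its bottom and return its scrollHeight.

    `element` is assumed to be a WebElement located beforehand, for example
    with driver.find_element(By.ID, "page-manager").
    """
    return driver.execute_script(
        # arguments[0] is the element passed in from Python.
        "arguments[0].scrollTop = arguments[0].scrollHeight;"
        "return arguments[0].scrollHeight;",
        element,
    )
```

The returned height can be compared between passes in the same way the loops in this thread compare `document.body.scrollHeight`.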

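On YouTube, the floating elements make the scroll height come back as "0". Instead of "return document.body.scrollHeight", try "return document.documentElement.scrollHeight". Adjust the scroll pause time to your internet speed; otherwise the loop will run only once and then break.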

SCROLL_PAUSE_TIME = 1

# Get scroll height.
# last_height = driver.execute_script("return document.body.scrollHeight")
# The line above doesn't work because of floating web elements on YouTube.

last_height = driver.execute_script("return document.documentElement.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0,document.documentElement.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        print("break")
        break
    last_height = new_height
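If you don't know in advance which of the two properties a site reports correctly, one option is to ask for the larger of the two. This is a small sketch under that assumption; `HEIGHT_SCRIPT` and `page_height` are hypothetical names, and it has not been checked against YouTube specifically.

```python
# JavaScript that returns whichever scroll height is larger.
HEIGHT_SCRIPT = (
    "return Math.max("
    "document.body ? document.body.scrollHeight : 0, "
    "document.documentElement.scrollHeight);"
)


def page_height(driver):
    """Return the larger of the body and documentElement scroll heights."""
    return driver.execute_script(HEIGHT_SCRIPT)
```

In the loop above, `page_height(driver)` could stand in for both `execute_script("return document.documentElement.scrollHeight")` calls.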