Selenium Study Notes

Introduction

Selenium can be used from Python for browser automation and web scraping. The examples in these notes are based on Google Chrome.

Download Drivers

Selenium drives Chrome through ChromeDriver. Download a ChromeDriver build that matches your installed Chrome version and make sure it is on your PATH; newer Selenium releases (4.6+) can also fetch a matching driver automatically via Selenium Manager.

Download Dependencies

pip3 install selenium

Import Dependencies

from selenium import webdriver

Create Driver Object

  • Creating the driver object automatically opens a new browser window, even if another browser window is already open.
driver = webdriver.Chrome()
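If a visible browser window is not needed, Chrome can also be started headless through its options object. A minimal sketch; note that the "--headless=new" flag is an assumption that depends on your Chrome version (older builds use plain "--headless"):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")  # run Chrome without a visible window
driver = webdriver.Chrome(options=options)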

Wait for the Browser to Fully Open

import time

driver = webdriver.Chrome()
time.sleep(5)
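Instead of a fixed sleep, Selenium also provides explicit waits that block until a condition is met. A minimal sketch, with the URL and locator as placeholders:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com")  # placeholder URL
# wait up to 10 seconds for the <body> tag to be present instead of sleeping blindly
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.TAG_NAME, "body")))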

Automatic Destruction After Use

with webdriver.Chrome() as driver:
    ...
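For example, a complete session inside the with block might look like this (the URL is only a placeholder):

from selenium import webdriver

with webdriver.Chrome() as driver:
    driver.get("https://example.com")  # placeholder URL
    print(driver.title)
# the browser and the driver session are cleaned up automatically when the block exits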

Destroy the Driver Object

  • Destroying the driver object closes the browser window.
driver.quit()
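For comparison, close() only closes the current window, while quit() ends the whole WebDriver session:

driver.close()  # closes only the currently focused browser window
driver.quit()   # ends the session and closes all windows opened by the driver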

Access URL

<url>: URL link to be accessed by the browser.

driver.get("<url>")

Find Elements

Import Dependencies

from selenium.webdriver.common.by import By

ById

<id>: ID of the HTML tag.

res = driver.find_element(by=By.ID, value="<id>")
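A usage sketch, where "main-title" is just a hypothetical id:

from selenium.webdriver.common.by import By

res = driver.find_element(by=By.ID, value="main-title")  # "main-title" is a made-up id
print(res.text)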

Get Text Data

  • Gets the visible text of the element (similar to innerText), including text from its child tags.
res.text

Get Attribute Value

  • Gets the attribute value of the HTML tag.

<key>: Attribute name.

value = res.get_attribute("<key>")
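For example, reading the href of the first link on the page (the selector and the "href" attribute name are just illustrative choices):

from selenium.webdriver.common.by import By

link = driver.find_element(by=By.CSS_SELECTOR, value="a")  # first <a> tag on the page
print(link.get_attribute("href"))                          # "href" is an example attribute name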

Simulate Click

res.click()

ByCSSSelector

.father .son: Content selected by the CSS selector.

res_list = driver.find_elements(by=By.CSS_SELECTOR, value=".father .son")

Get Data

  • find_elements returns a list of elements; iterate over it to operate on each one.
for res in res_list:
    res.text
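Putting it together, a sketch that prints the text of every matched element (".father .son" is the example selector from above):

from selenium.webdriver.common.by import By

res_list = driver.find_elements(by=By.CSS_SELECTOR, value=".father .son")  # note: find_elements (plural)
for res in res_list:
    print(res.text)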

Execute JS

<javascript>: JS code.

driver.execute_script("<javascript>")
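Two common uses: scrolling the page, and returning a value from the script back to Python:

# scroll to the bottom of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# a script with a return statement hands its value back to Python
title = driver.execute_script("return document.title;")
print(title)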

Completion
