Getting started - Library
Installation
Pip
pip install --upgrade pip
pip install playwright
playwright install
Conda
conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install
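These commands download the Playwright package and install the browser binaries for Chromium, Firefox, and WebKit. To modify this behavior, see installation parameters.
Usage
Once installed, you can import Playwright in a Python script and launch any of the 3 browsers (chromium, firefox, and webkit).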
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("http://playwright.dev")
    print(page.title())
    browser.close()
Playwright supports two variants of the API: synchronous and asynchronous. If your project uses asyncio, you should use the async API:
import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        print(await page.title())
        await browser.close()

asyncio.run(main())
First script
In our first script, we will navigate to https://playwright.dev/ and take a screenshot in WebKit.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()
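By default, Playwright runs the browsers in headless mode. To see the browser UI, set the headless option to False. You can also use slow_mo to slow down execution. Learn more in the debugging tools section.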
firefox.launch(headless=False, slow_mo=50)
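One way to avoid editing the launch() call every time you want to watch the browser is to derive its keyword arguments from the environment. The sketch below is only an illustration: the helper launch_options and the SHOW_BROWSER variable are hypothetical names and are not part of Playwright. Only headless and slow_mo are taken from the text above.

```python
import os


def launch_options(env=os.environ):
    """Build keyword arguments for launch() from a hypothetical SHOW_BROWSER variable."""
    if env.get("SHOW_BROWSER") == "1":
        # Headed mode, with operations slowed down by 50 ms so they can be followed by eye.
        return {"headless": False, "slow_mo": 50}
    # An empty dict keeps Playwright's defaults (headless, no slowdown).
    return {}


# Usage (requires Playwright and its browsers to be installed):
# from playwright.sync_api import sync_playwright
# with sync_playwright() as p:
#     browser = p.firefox.launch(**launch_options())
#     browser.close()
```

Running the script with SHOW_BROWSER=1 set would then open a visible Firefox window, while leaving the variable unset keeps the default headless behavior.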
Interactive mode (REPL)
You can launch the interactive Python REPL:
python
and then launch Playwright within it for quick experimentation:
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
browser = playwright.chromium.launch()
page = browser.new_page()
page.goto("https://playwright.dev/")
page.screenshot(path="example.png")
browser.close()
playwright.stop()
An async REPL, such as the asyncio REPL, also works:
python -m asyncio
from playwright.async_api import async_playwright
playwright = await async_playwright().start()
browser = await playwright.chromium.launch()
page = await browser.new_page()
await page.goto("https://playwright.dev/")
await page.screenshot(path="example.png")
await browser.close()
await playwright.stop()
Pyinstaller
You can use Playwright with Pyinstaller to create standalone executables. For example, with the following script saved as main.py:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()
If you want to bundle the browsers with the executable, run the commands for your shell:
Bash
PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
pyinstaller -F main.py
PowerShell
$env:PLAYWRIGHT_BROWSERS_PATH="0"
playwright install chromium
pyinstaller -F main.py
Batch
set PLAYWRIGHT_BROWSERS_PATH=0
playwright install chromium
pyinstaller -F main.py
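Bundling the browsers with the executable produces a larger binary. It is recommended to bundle only the browsers you use.
Known issues
time.sleep() leads to outdated state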
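Most of the time you do not need to wait manually, because Playwright has auto-waiting. It is best to avoid waiting for a fixed timeout at all, although doing so is sometimes useful for debugging. If you still rely on a manual wait, use page.wait_for_timeout(5000) instead of time.sleep(5), that is, Playwright's own wait method rather than the time module. Playwright relies on asynchronous operations internally, and these cannot be processed correctly while time.sleep(5) is blocking.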
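The difference in units is easy to miss: time.sleep() takes seconds, while page.wait_for_timeout() takes milliseconds. A minimal sketch of a drop-in replacement, assuming the sync API; debug_pause is a hypothetical helper name, not something Playwright provides.

```python
def debug_pause(page, seconds):
    """Pause for `seconds` using Playwright's own wait instead of time.sleep()."""
    # wait_for_timeout() expects milliseconds, so convert from seconds.
    page.wait_for_timeout(seconds * 1000)
```

With a page obtained from browser.new_page(), calling debug_pause(page, 5) takes the place of time.sleep(5).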
Incompatible with SelectorEventLoop of asyncio on Windows
Playwright runs its driver in a subprocess, so on Windows it requires asyncio's ProactorEventLoop, because SelectorEventLoop does not support async subprocesses.
On Windows with Python 3.7, Playwright sets the default event loop to ProactorEventLoop, since that is the default on Python 3.8+.
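If another library or framework has changed the event loop in your application, it can help to check which loop is actually running before starting Playwright. The following is a rough sketch using only the standard library; loop_supports_subprocesses is a hypothetical helper, and treating every non-Windows loop as compatible is a simplifying assumption.

```python
import asyncio
import sys


def loop_supports_subprocesses(loop):
    """Return True if `loop` is expected to be able to spawn subprocesses."""
    if sys.platform != "win32":
        # Assumption: the restriction described above only applies on Windows.
        return True
    # asyncio.ProactorEventLoop only exists on Windows, so it is referenced only here.
    return isinstance(loop, asyncio.ProactorEventLoop)


async def main():
    # Inspect the loop that is currently running this coroutine.
    return loop_supports_subprocesses(asyncio.get_running_loop())


print(asyncio.run(main()))
```

If the check reports False on Windows, the application is running on a SelectorEventLoop and Playwright would not be able to start its driver.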
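Threading
Playwright's API is not thread-safe. If you use Playwright in a multi-threaded environment, you should create a playwright instance per thread. See the threading issue for more details.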
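The per-thread rule can be sketched as follows, assuming the sync API. Each thread enters its own sync_playwright() context and never shares a browser or page with another thread. The names fetch_title and run_in_threads are hypothetical helpers introduced for illustration.

```python
import threading


def fetch_title(url, results):
    """Open `url` in a Playwright instance owned by the current thread."""
    # Imported here so the module can be loaded even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:  # one playwright instance per thread
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        results[url] = page.title()
        browser.close()


def run_in_threads(worker, urls):
    """Run `worker(url, results)` in one thread per URL and return the results dict."""
    results = {}
    threads = [threading.Thread(target=worker, args=(url, results)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Usage (requires Playwright and its browsers to be installed):
# titles = run_in_threads(fetch_title, ["https://playwright.dev/"])
```

Each thread writes to its own key in the shared dictionary, so no Playwright object crosses a thread boundary.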