Functions
click(x, y, click_type='left')
- x, y (int): The screen coordinates to move the mouse to.
- click_type (str): The mouse button to click ("left", "middle", or "right").
- Returns: None
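The three accepted click_type values can be checked before any mouse event is sent. The sketch below is a hypothetical helper, not part of the documented API. It assumes the backend uses X11-style button numbers (1 = left, 2 = middle, 3 = right), which would be consistent with the buttons 4 to 7 listed for scroll.

```python
# Hypothetical helper: map a click_type string to an X11-style button number.
# The numbering (left=1, middle=2, right=3) is an assumption about the backend.
BUTTONS = {"left": 1, "middle": 2, "right": 3}


def resolve_button(click_type: str = "left") -> int:
    """Return the button number for click_type, or raise ValueError."""
    try:
        return BUTTONS[click_type]
    except KeyError:
        allowed = ", ".join(sorted(BUTTONS))
        raise ValueError(
            f"unknown click_type {click_type!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown values early keeps a typo such as "rigth" from silently turning into a click with the wrong button.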
scroll(x, y, scroll_x=0, scroll_y=0)
- x, y (int): The screen coordinates to move the mouse to.
- scroll_x (int): Horizontal scroll offset.
  - Negative => Scroll left (button 6)
  - Positive => Scroll right (button 7)
- scroll_y (int): Vertical scroll offset.
  - Negative => Scroll up (button 4)
  - Positive => Scroll down (button 5)
- Moves the mouse to (x, y).
- Scrolls by scroll_x and scroll_y.
- Returns: None
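The sign-to-button mapping above can be written down as a small function. This is a sketch only: scroll_buttons is a hypothetical name, and how the magnitude of an offset translates into repeated button presses is not specified here, so the sketch leaves it out.

```python
# Hypothetical helper: pick the scroll buttons implied by the sign of each offset.
# Button numbers follow the list above: 4 = up, 5 = down, 6 = left, 7 = right.
def scroll_buttons(scroll_x: int = 0, scroll_y: int = 0) -> list[int]:
    """Return the button numbers to press; a zero offset selects no button."""
    buttons = []
    if scroll_x:
        buttons.append(6 if scroll_x < 0 else 7)
    if scroll_y:
        buttons.append(4 if scroll_y < 0 else 5)
    return buttons
```

For example, scroll_buttons(0, -3) selects button 4 (scroll up), and scroll_buttons(2, 5) selects buttons 7 and 5.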
keypress(keys: list[str])
- keys (list[str]): A list of keys to press. Example: ["CTRL", "f"] for Ctrl+F.
- Executes a keypress.
- Supports shortcuts like Ctrl+F.
- Returns: None
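One plausible way to turn a keys list into a single shortcut is to join the names with "+", the combo syntax used by tools such as xdotool. Whether keypress does this internally is an assumption; to_combo and MODIFIERS below are hypothetical names for illustration.

```python
# Hypothetical helper: join a list of key names into a "ctrl+f"-style combo.
# Only the modifier names below are normalized; all other keys pass through
# unchanged, because key names can be case-sensitive.
MODIFIERS = {"CTRL": "ctrl", "ALT": "alt", "SHIFT": "shift"}


def to_combo(keys: list[str]) -> str:
    """Build a combo string such as 'ctrl+f' from ['CTRL', 'f']."""
    if not keys:
        raise ValueError("keys must not be empty")
    return "+".join(MODIFIERS.get(key.upper(), key) for key in keys)
```

With this sketch, ["CTRL", "f"] becomes "ctrl+f", matching the Ctrl+F example above.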
type_text(text: str)
- text (str): The string of text to type.
- Types a string of text at the current cursor location.
- Returns: None
get_screenshot()
- Takes a screenshot as a PNG.
- Captures the base64-encoded result.
- Returns that base64 string.
- Returns (str): A base64-encoded PNG screenshot.
- Raises: RuntimeError if the screenshot command fails.
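Because the return value is a base64 string rather than raw bytes, a caller usually has to decode it before saving or inspecting the image. The following is a minimal sketch of that step. decode_screenshot is a hypothetical helper; it only assumes that the string is standard base64 and that the decoded data starts with the PNG file signature.

```python
import base64

# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(b64_png: str) -> bytes:
    """Decode a base64 string and check that the result looks like a PNG."""
    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(b64_png, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data
```

The decoded bytes can then be written to a file opened in binary mode, for example with open("screen.png", "wb").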
goto(url)
- url (str): The URL to navigate to (e.g., https://example.com).
- Opens Firefox in the container and navigates to the specified URL in a new tab.
- Returns: None
get_page_html(query)
- query (str): JavaScript query to retrieve the HTML content. Defaults to 'return document.documentElement.outerHTML;', which returns the full DOM HTML.
- Uses the underlying agent to execute a JavaScript query and retrieve the current webpage’s HTML content.
- Returns (str): The HTML content of the current page, or an error message if retrieval fails.
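Since query is plain JavaScript, a narrower query can return one element instead of the full DOM. The sketch below builds such a query string for a CSS selector. selector_query is a hypothetical helper, and it assumes the query is evaluated like the default one, as a function body whose return value is handed back.

```python
import json

# The default query quoted in the parameter description above.
DEFAULT_QUERY = "return document.documentElement.outerHTML;"


def selector_query(css_selector: str) -> str:
    """Build a JavaScript query returning the outer HTML of the first match."""
    # json.dumps produces a double-quoted, escaped string literal that is
    # also valid JavaScript, so quotes inside the selector are safe.
    literal = json.dumps(css_selector)
    return f"return document.querySelector({literal}).outerHTML;"
```

If the selector matches nothing, document.querySelector returns null and the query throws inside the page; presumably that surfaces as the error message mentioned above, though this is not confirmed here.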