action()

Check out the guide for using this function for more details

Purpose

The action function lets you control the desktop environment via an agent, managing commands, user inputs, security checks, and tool calls in a streamlined way.

Arguments

  • input_text (str, optional):
    New commands or responses to agent prompts.

  • acknowledged_safety_checks (bool, optional):
    Set True after the user confirms pending security checks.

  • ignore_safety_and_input (bool, optional):
    Automatically approves security checks and inputs. Use cautiously.

  • Handlers (callables, optional):
    Customize how different statuses are handled:

    • complete_handler(data): Final results.
    • needs_input_handler(messages): Collects user input.
    • needs_safety_check_handler(safety_checks, pending_call): Approves security checks.
    • error_handler(error_message): Manages errors.
  • tools (list, optional):
    List of available tools for the agent to use when executing actions.

  • function_map (dict, optional):
    Dictionary mapping tool call identifiers to actual callable functions.

How it works

The action function returns one of four statuses:

Status Handling

  • COMPLETE:
    Task finished successfully. Handle final output.

  • ERROR:
    Review the returned error message and handle accordingly.

  • NEEDS_INPUT:
    Provide additional user input and call action() again with this input.

  • NEEDS_SECURITY_CHECK:
    Review security warnings and confirm with acknowledged_safety_checks=True.

Example workflow:

status, data = agent.action(input_text="Open Chrome")

if status == AgentStatus.COMPLETE:
    print("Done:", data)
elif status == AgentStatus.ERROR:
    print("Error:", data)
elif status == AgentStatus.NEEDS_INPUT:
    user_reply = input(f"Input needed: {data}")
    agent.action(input_text=user_reply)
elif status == AgentStatus.NEEDS_SECURITY_CHECK:
    confirm = input(f"Security checks: {data['safety_checks']} Proceed? (y/N): ")
    if confirm.lower() == "y":
        agent.action(acknowledged_safety_checks=True)
    else:
        print("Action cancelled.")

Auto Mode

Set ignore_safety_and_input=True for automatic handling of inputs and security checks. Use carefully as this bypasses user prompts and approvals.

Using Handlers

Provide handler functions to automate status management, simplifying your code:

agent.action(
    input_text="Open Chrome",
    complete_handler=lambda data: print("Done:", data),
    error_handler=lambda error: print("Error:", error),
    needs_input_handler=lambda msgs: input(f"Agent asks: {msgs}"),
    needs_safety_check_handler=lambda checks, call: input(f"Approve {checks}? (y/N): ").lower() == "y"
)

Using Tools and Function Map

When providing tools, ensure each tool has a corresponding callable in the function_map. The agent will invoke these callables based on the tool call identifiers:

tools = ["web_search", "calculator"]

function_map = {
    "web_search": web_search_function,
    "calculator": calculator_function
}

agent.action(
    input_text="What is 42 divided by 7?",
    tools=tools,
    function_map=function_map
)

🚀 Guide: Using the action Command

The action function lets your agent execute tasks in the desktop environment. It handles:

  • Starting a new conversation with a command.
  • Continuing a conversation by supplying user input.
  • Acknowledging safety checks for a pending call.
  • Auto-handling safety checks and input if ignore_safety_and_input=True.
  • Custom handler delegation for each status.

Internally, action manages state and either returns a (status, data) tuple for you to process or calls the appropriate handler if provided.


📌 Quick Overview

def action(
    input_text=None,
    acknowledged_safety_checks=False,
    ignore_safety_and_input=False,
    complete_handler=None,
    needs_input_handler=None,
    needs_safety_check_handler=None,
    error_handler=None
):
    # ...
  • input_text (str, optional):

    • A new command to start a conversation.
    • A user’s response if the agent has asked for more input.
    • None if you’re just confirming safety checks.
  • acknowledged_safety_checks (bool, optional):

    • Indicates that the user has confirmed pending checks.
    • Only relevant if a NEEDS_SECURITY_CHECK status was returned previously.
  • ignore_safety_and_input (bool, optional):

    • If True, the function automatically handles safety checks and input requests, requiring no user interaction.
  • Handlers (callables, optional):

    • complete_handler(data): Handles COMPLETE.
    • needs_input_handler(messages): Handles NEEDS_INPUT.
    • needs_safety_check_handler(checks, pending_call): Handles NEEDS_SECURITY_CHECK.
    • error_handler(error_message): Handles ERROR.

Return Value:

  • A tuple (status, data), where:
    • status is one of:
      • COMPLETE: The agent finished successfully.
      • ERROR: An error occurred.
      • NEEDS_INPUT: The agent needs more user input.
      • NEEDS_SECURITY_CHECK: The agent needs confirmation for a risky action.
    • data:
      • For COMPLETE: The final response object.
      • For ERROR: An error message.
      • For NEEDS_INPUT: A list of messages asking for input.
      • For NEEDS_SECURITY_CHECK: A list of safety checks and the pending call.

When handlers are provided, action may not return a status in the usual way—it delegates behavior to those handlers.


🌀 Handling the Workflow (Interactive Example)

action covers multiple scenarios:

  1. Starting a conversation with input_text (e.g., a command: “Open Chrome”).
  2. Continuing a conversation by providing user input if the agent is waiting for it.
  3. Acknowledging safety checks with acknowledged_safety_checks=True if the agent flagged a security concern.
  4. Auto-handling if ignore_safety_and_input=True, which bypasses user checks.

When you call action, you get (status, data) back (unless you use handlers). Use status to decide your next move:

  • COMPLETE:

    • The task is done. data contains the final response.
    • You can display or log it.
  • ERROR:

    • data holds an error message explaining what went wrong.
    • You can retry, log, or show it to the user.
  • NEEDS_INPUT:

    • The agent requires more information. Use data (often a list of prompts) to know what it wants.
    • Get input from the user, then call action again, supplying that text in input_text.
  • NEEDS_SECURITY_CHECK:

    • The agent found a risky action. You must confirm it’s safe.
    • Call action again with acknowledged_safety_checks=True to proceed.
    • No extra input_text is required unless the agent specifically requests it.

Here’s a straightforward example:

status, data = agent.action(input_text="Open Firefox")

if status == AgentStatus.COMPLETE:
    print("Done:", data)
elif status == AgentStatus.ERROR:
    print("Error:", data)
elif status == AgentStatus.NEEDS_INPUT:
    user_reply = input(f"Input needed: {data}")
    agent.action(input_text=user_reply)
elif status == AgentStatus.NEEDS_SECURITY_CHECK:
    confirm = input(f"Security checks: {data['safety_checks']} Proceed? (y/N): ")
    if confirm.lower() == "y":
        agent.action(acknowledged_safety_checks=True)
    else:
        print("Action cancelled.")

For a more robust loop-based approach:

status, data = agent.action(input_text="Open a file")

while status in [AgentStatus.NEEDS_INPUT, AgentStatus.NEEDS_SECURITY_CHECK]:
    if status == AgentStatus.NEEDS_INPUT:
        user_reply = input(f"Agent needs more info: {data}")
        status, data = agent.action(input_text=user_reply)
    elif status == AgentStatus.NEEDS_SECURITY_CHECK:
        confirm = input(f"Security checks: {data['safety_checks']} Proceed? (y/N): ")
        if confirm.lower() == "y":
            status, data = agent.action(acknowledged_safety_checks=True)
        else:
            print("Action cancelled.")
            break

if status == AgentStatus.COMPLETE:
    print("Final result:", data)
elif status == AgentStatus.ERROR:
    print("Error:", data)

🤖 Automated (Non-Interactive) Mode

Set ignore_safety_and_input=True to:

  • Automatically approve safety checks.
  • Automatically generate responses to agent questions to continue with the prompt

This is useful for:

  • Automated actions that must run without user interaction.
  • Headless or server-based scenarios.

CAUTION: It is inherently risky because you skip all manual confirmations and user input. Ensure your applications and use cases are safe before using auto mode.

Example:

status, data = agent.action(
    input_text="Open Chrome",
    ignore_safety_and_input=True
)

if status == AgentStatus.COMPLETE:
    print("Completed:", data)
elif status == AgentStatus.ERROR:
    print("Error:", data)

Examples Using Handlers

You can avoid manual if/else checks by supplying handlers for each status. action will call them automatically:

  • complete_handler(data): Called when the agent finishes.
  • needs_input_handler(messages): Called if the agent wants more input.
  • needs_safety_check_handler(safety_checks, pending_call): Called if the agent flags a safety check.
  • error_handler(error_message): Called if something goes wrong.

Why Handlers?

  • They allow complex logic in a more organized, modular style.
  • They help you integrate with other tools or services, since each status is handled by a dedicated function.
  • They reduce repeated conditional code in your main flow.

Example:

result = [None]  # a mutable container to store final output or None

def complete_handler(data):
    """COMPLETE -- handle final results"""
    print("\n✅ Task completed successfully!")
    result[0] = data

def needs_input_handler(messages):
    """NEEDS_INPUT -- prompt the user and return the response"""
    for msg in messages:
        if hasattr(msg, "content"):
            text_parts = [part.text for part in msg.content if hasattr(part, "text")]
            print(f"\n💬 Agent asks: {' '.join(text_parts)}")

    user_says = input("Enter your response (or 'exit'/'quit'): ").strip()
    if user_says.lower() in ("exit", "quit"):
        print("Exiting as per user request.")
        result[0] = None
        return None
    return user_says

def needs_safety_check_handler(safety_checks, pending_call):
    """NEEDS_SAFETY_CHECK -- confirm or deny safety checks"""
    for check in safety_checks:
        if hasattr(check, "message"):
            print(f"☢️  Pending Safety Check: {check.message}")

    ack = input("Type 'ack' to confirm, or 'exit'/'quit': ").strip().lower()
    if ack in ("exit", "quit"):
        print("Exiting as per user request.")
        result[0] = None
        return False
    if ack == "ack":
        print("Acknowledged. Proceeding with the computer call...")
        return True
    return False

def error_handler(error_message):
    """ERROR -- print error and store None"""
    print(f"😱 ERROR: {error_message}")
    result[0] = None

# Provide handlers to `action`:
status, data = desktop.action(
    input_text="Open Chrome",
    complete_handler=complete_handler,
    needs_input_handler=needs_input_handler,
    needs_safety_check_handler=needs_safety_check_handler,
    error_handler=error_handler
)

When handlers are specified, action manages each status internally and continues until it hits COMPLETE or ERROR (unless you stop it prematurely).


4. Key Takeaways

  1. Scenarios: Start new tasks, resume with user input, or confirm safety checks.
  2. Statuses: Always handle COMPLETE, ERROR, NEEDS_INPUT, NEEDS_SECURITY_CHECK.
  3. Resuming: Pass new input (input_text) or confirm checks (acknowledged_safety_checks=True) to continue.
  4. Auto-mode: ignore_safety_and_input=True is convenient but risky.
  5. Handlers: Offer a cleaner, more modular way to manage status-based logic.

For any additional questions, contact founders@passage-team.com