[Beta] Computer Use
Teaching models to navigate computers, responsibly.
Anthropic has introduced a capability that lets models control a user's device, and you can now use Computer Use through OpenAI-compatible APIs. For background, see https://www.anthropic.com/news/developing-computer-use
Models
Currently, Computer Use is supported on the following models:
- Claude 3.5 Sonnet (anthropic/claude-3-5-sonnet-20241022)
**All other models are not supported.**
Usage
{
  "model": "anthropic/claude-3-5-sonnet-20241022",
  "messages": [
    {
      "role": "user",
      "content": "Save a picture of a cat to my desktop."
    }
  ],
  "tools": [
    {
      "type": "computer_20241022",
      "name": "computer",
      "display_width_px": 1024,
      "display_height_px": 768,
      "display_number": 1
    },
    {
      "type": "text_editor_20241022",
      "name": "str_replace_editor"
    },
    {
      "type": "bash_20241022",
      "name": "bash"
    }
  ],
  "tool_choice": "auto"
}
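As a minimal sketch, the request body above could be assembled in Python as follows. The helper name `build_computer_use_request` and its keyword arguments are hypothetical and not part of any API. `display_number` is omitted unless given, since it is documented as optional. How the body is sent (endpoint URL, authentication) depends on your provider and is deliberately left out.

```python
import json


def build_computer_use_request(prompt, width=1024, height=768, display_number=None):
    """Build a request body shaped like the documented example (hypothetical helper)."""
    computer_tool = {
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": width,
        "display_height_px": height,
    }
    # display_number is optional and only relevant for X11 environments.
    if display_number is not None:
        computer_tool["display_number"] = display_number

    return {
        "model": "anthropic/claude-3-5-sonnet-20241022",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            computer_tool,
            {"type": "text_editor_20241022", "name": "str_replace_editor"},
            {"type": "bash_20241022", "name": "bash"},
        ],
        "tool_choice": "auto",
    }


body = build_computer_use_request(
    "Save a picture of a cat to my desktop.", display_number=1
)
print(json.dumps(body, indent=2))
```

The printed JSON can then be posted with whatever HTTP client or SDK you already use against your OpenAI-compatible endpoint.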
Available Tools
Computer
Tool type: computer_20241022
Parameters:
- display_width_px: Required. The width, in pixels, of the display being controlled by the model.
- display_height_px: Required. The height, in pixels, of the display being controlled by the model.
- display_number: Optional. The display number to control (only relevant for X11 environments). If specified, the tool will be provided a display number in the tool definition.
Tool input schema (only for reference):
{
  "properties": {
    "action": {
      "description": """The action to perform. The available actions are:
* `key`: Press a key or key-combination on the keyboard.
  - This supports xdotool's `key` syntax.
  - Examples: "a", "Return", "alt+Tab", "ctrl+s", "Up", "KP_0" (for the numpad 0 key).
* `type`: Type a string of text on the keyboard.
* `cursor_position`: Get the current (x, y) pixel coordinate of the cursor on the screen.
* `mouse_move`: Move the cursor to a specified (x, y) pixel coordinate on the screen.
* `left_click`: Click the left mouse button.
* `left_click_drag`: Click and drag the cursor to a specified (x, y) pixel coordinate on the screen.
* `right_click`: Click the right mouse button.
* `middle_click`: Click the middle mouse button.
* `double_click`: Double-click the left mouse button.
* `screenshot`: Take a screenshot of the screen.""",
      "enum": [
        "key",
        "type",
        "mouse_move",
        "left_click",
        "left_click_drag",
        "right_click",
        "middle_click",
        "double_click",
        "screenshot",
        "cursor_position",
      ],
      "type": "string",
    },
    "coordinate": {
      "description": "(x, y): The x (pixels from the left edge) and y (pixels from the top edge) coordinates to move the mouse to. Required only by `action=mouse_move` and `action=left_click_drag`.",
      "type": "array",
    },
    "text": {
      "description": "Required only by `action=type` and `action=key`.",
      "type": "string",
    },
  },
  "required": ["action"],
  "type": "object",
}
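Before executing a `computer` tool call, it can help to check the input against the rules in the schema above. The following is a sketch only: `validate_computer_input` is a hypothetical helper, and the requirement that `coordinate` holds exactly two non-negative integers is an assumption, since the schema only says it is an array.

```python
ACTIONS = {
    "key", "type", "mouse_move", "left_click", "left_click_drag",
    "right_click", "middle_click", "double_click", "screenshot",
    "cursor_position",
}
NEEDS_COORDINATE = {"mouse_move", "left_click_drag"}
NEEDS_TEXT = {"key", "type"}


def validate_computer_input(tool_input):
    """Return a list of problems; an empty list means the input looks valid."""
    action = tool_input.get("action")
    if action not in ACTIONS:
        return [f"unknown action: {action!r}"]

    errors = []
    if action in NEEDS_COORDINATE:
        coord = tool_input.get("coordinate")
        # Assumption: exactly two non-negative integers, (x, y) in pixels.
        ok = (
            isinstance(coord, (list, tuple))
            and len(coord) == 2
            and all(
                isinstance(v, int) and not isinstance(v, bool) and v >= 0
                for v in coord
            )
        )
        if not ok:
            errors.append(f"action {action!r} requires coordinate [x, y]")
    if action in NEEDS_TEXT and not isinstance(tool_input.get("text"), str):
        errors.append(f"action {action!r} requires a text string")
    return errors


print(validate_computer_input({"action": "mouse_move", "coordinate": [512, 384]}))
print(validate_computer_input({"action": "key"}))
```

Returning a list of messages rather than raising makes it straightforward to pass the problems back to the model as a tool error so it can retry.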
Text Editor
Tool type: text_editor_20241022
Tool input schema (only for reference):
{
  "properties": {
    "command": {
      "description": "The commands to run. Allowed options are: `view`, `create`, `str_replace`, `insert`, `undo_edit`.",
      "enum": ["view", "create", "str_replace", "insert", "undo_edit"],
      "type": "string"
    },
    "file_text": {
      "description": "Required parameter of `create` command, with the content of the file to be created.",
      "type": "string"
    },
    "insert_line": {
      "description": "Required parameter of `insert` command. The `new_str` will be inserted AFTER the line `insert_line` of `path`.",
      "type": "integer"
    },
    "new_str": {
      "description": "Optional parameter of `str_replace` command containing the new string (if not given, no string will be added). Required parameter of `insert` command containing the string to insert.",
      "type": "string"
    },
    "old_str": {
      "description": "Required parameter of `str_replace` command containing the string in `path` to replace.",
      "type": "string"
    },
    "path": {
      "description": "Absolute path to file or directory, e.g. `/repo/file.py` or `/repo`.",
      "type": "string"
    },
    "view_range": {
      "description": "Optional parameter of `view` command when `path` points to a file. If none is given, the full file is shown. If provided, the file will be shown in the indicated line number range, e.g. [11, 12] will show lines 11 and 12. Indexing at 1 to start. Setting `[start_line, -1]` shows all lines from `start_line` to the end of the file.",
      "items": { "type": "integer" },
      "type": "array"
    }
  },
  "required": ["command", "path"],
  "type": "object"
}
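The line-number semantics in the schema above (1-based `view_range`, `-1` meaning end of file, and `insert_line` inserting after the given line) are easy to get off by one. The sketch below illustrates them on an in-memory string rather than a real file. The function names are hypothetical, and the rule that `old_str` must match exactly once is an assumption made here for safety; the schema does not state it.

```python
def view(text, view_range=None):
    """Return the requested lines; view_range is [start, end], 1-based, inclusive."""
    lines = text.split("\n")
    if view_range is None:
        return lines
    start, end = view_range
    if end == -1:  # -1 means "through the end of the file"
        end = len(lines)
    return lines[start - 1:end]


def insert(text, insert_line, new_str):
    """Insert new_str AFTER line insert_line (0 inserts at the top)."""
    lines = text.split("\n")
    return "\n".join(lines[:insert_line] + new_str.split("\n") + lines[insert_line:])


def str_replace(text, old_str, new_str=""):
    """Replace old_str with new_str; an omitted new_str deletes old_str."""
    # Assumption: refuse ambiguous edits where old_str is missing or repeated.
    if text.count(old_str) != 1:
        raise ValueError("old_str must occur exactly once")
    return text.replace(old_str, new_str)


doc = "a\nb\nc\nd"
print(view(doc, [2, 3]))
print(view(doc, [3, -1]))
print(insert(doc, 1, "x"))
print(str_replace(doc, "b", "B"))
```

A real handler would read and write the file at `path` and keep a history of previous contents to support `undo_edit`.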
Bash
Tool type: bash_20241022
Tool input schema (only for reference):
{
  "properties": {
    "command": {
      "description": "The bash command to run. Required unless the tool is being restarted.",
      "type": "string"
    },
    "restart": {
      "description": "Specifying true will restart this tool. Otherwise, leave this unspecified.",
      "type": "boolean"
    }
  }
}
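The schema above implies two kinds of `bash` calls: a restart, or a command to run. A minimal sketch of that decision follows. `plan_bash_call` is a hypothetical helper that only classifies the input and does not execute anything; actually running commands should happen in a sandboxed environment of your choosing.

```python
def plan_bash_call(tool_input):
    """Classify a bash tool input as ("restart", None) or ("run", command)."""
    # restart takes precedence; command is not required in that case.
    if tool_input.get("restart") is True:
        return ("restart", None)

    command = tool_input.get("command")
    if not isinstance(command, str) or not command:
        raise ValueError("command is required unless restart is true")
    return ("run", command)


print(plan_bash_call({"command": "ls ~/Desktop"}))
print(plan_bash_call({"restart": True}))
```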