Built-in Tools

Tools that ship with OpenAgent and can be enabled per Store without any external server.

Tools

OpenAgent includes built-in tool implementations that can be exposed through Tool records without running an external MCP server. Create or edit Tool records in the Tools area, then attach the selected records to a Store's Tools field. The model sees each enabled tool's name and description alongside any MCP tools, and decides when to call them.

Available tools

web_search

Searches the web and returns a list of results. The model receives result titles, URLs, and snippets and can decide which to fetch for more detail.

Useful for: current events, topics not in the knowledge base, information that changes frequently.

Supported search engines: DuckDuckGo (default, no key required), Bing, Google (requires API key and Search Engine ID), Baidu (requires API key). Configure the engine in the Tool record's Sub Type field.

A web_search Tool record exposes both web_search and image_search. image_search returns image URLs, source pages, dimensions, and thumbnails for vision-capable models; if the selected model cannot accept images, the tool returns an error telling the model to continue with text-based tools.

When web_search is used, the search results are saved in the Message's searchResults field. The webSearchEnabled flag on the Message is set to true.

web_fetch

Fetches the full text content of a given URL and returns the page's text after stripping HTML. Use it after web_search when the model needs to read the full content of a specific page, not just the snippet.

If a site blocks the plain HTTP request with a 403 response, OpenAgent can retry through the browser-backed fetch path and return the rendered text. JavaScript-rendered pages, login flows, or pages that need interaction should still use web_browser.
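The fetch-then-fall-back behavior described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not OpenAgent's actual implementation. The names `html_to_text`, `fetch_text`, `fetch_plain`, and `fetch_rendered` are hypothetical; the two fetchers are passed in as callables so the sketch does not assume any particular HTTP or browser library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return "\n".join(self._parts)


def html_to_text(html):
    """Strip tags from an HTML string and return its text, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_text(url, fetch_plain, fetch_rendered):
    """Try a plain HTTP fetch first; on a 403, fall back to a rendered fetch.

    fetch_plain(url) -> (status_code, html_body)   # hypothetical callable
    fetch_rendered(url) -> rendered_text           # hypothetical callable
    """
    status, body = fetch_plain(url)
    if status == 403:
        return fetch_rendered(url)
    return html_to_text(body)
```

Passing the fetchers in as arguments keeps the fallback decision separate from the transport, which also makes the logic easy to exercise with stubs.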

web_browser

Opens a real headless Chrome browser session (via chromedp). Enabling this tool type exposes four sub-tools to the model:

Sub-tool	What it does
`web_browser`	Navigate to a URL, wait for the page to render, return visible text content. Handles JavaScript SPAs and dynamic pages.
`browser_screenshot`	Navigate to a URL and capture a full-page screenshot, returned as base64-encoded PNG.
`browser_evaluate`	Navigate to a URL, evaluate a JavaScript expression in the page context, return the result.
`browser_click`	Navigate to a URL, click an element identified by a CSS selector, return the updated page content.

Use web_browser when web_fetch returns incomplete or empty content for a page (JavaScript-rendered apps, login redirects, etc.).

browser_use

Higher-level browser automation backed by the Claude-in-Chrome extension. Beyond reading page content, browser_use can interact with pages: click buttons, fill forms, navigate through multi-step flows, and extract structured data.

Use cases: logging into services, submitting forms, extracting data from paginated tables, interacting with web applications.

browser_use requires the Claude-in-Chrome browser extension to be installed and connected. It gives the agent direct control of a browser session.

shell

Executes shell commands on the server running OpenAgent and returns the output. Supports both one-shot foreground commands and long-running background sessions.

Foreground mode: runs a command and returns when it finishes. Configurable timeout (default 30s, max 300s).

Background session mode (set background: true or use the action parameter): starts a persistent shell session and returns a session_id. Subsequent calls control the session through action: poll reads output, write or submit sends input, send_keys sends key sequences (Enter, Ctrl+C, etc.), resize resizes the PTY, and stop terminates the session.

PTY mode (pty: true): allocates a pseudo-terminal for interactive CLI programs (vim, python REPL, npm interactive prompts, etc.).
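As an illustration of how a background PTY session might be driven, the sketch below builds the argument payloads for one session's lifecycle. The parameter names `background`, `pty`, `action`, and `session_id` and the action values come from the description above. The names `command`, `input`, `keys`, `cols`, and `rows`, and the helper functions themselves, are assumptions made for the example and may not match the real tool schema.

```python
import json


def shell_call(**params):
    """Serialize one hypothetical `shell` tool call as a JSON argument string."""
    return json.dumps(params, sort_keys=True)


def session_lifecycle(session_id):
    """Return the follow-up calls for an already started background session."""
    return [
        # Read whatever output the session has produced so far.
        shell_call(action="poll", session_id=session_id),
        # Send input to the running program (`input` is an assumed field name).
        shell_call(action="write", session_id=session_id, input="print(1 + 1)"),
        # Send a key sequence (`keys` is an assumed field name).
        shell_call(action="send_keys", session_id=session_id, keys=["Enter"]),
        # Resize the PTY (`cols` and `rows` are assumed field names).
        shell_call(action="resize", session_id=session_id, cols=120, rows=40),
        # Terminate the session.
        shell_call(action="stop", session_id=session_id),
    ]


# Start an interactive Python REPL in a background PTY session.
start = shell_call(command="python3", background=True, pty=True)
```

The first call would return a session_id, which every later call repeats so that OpenAgent can route it to the same session.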

When a foreground or PTY command times out, OpenAgent terminates the command's process group so child processes do not keep running after the parent shell exits. Stopping a background session follows the same cleanup path after sending the requested interrupt or stop signal.
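The process-group cleanup on timeout can be sketched in Python as below. This is a POSIX-only illustration of the general technique, not OpenAgent's code. The 30s default and 300s maximum are the documented values; clamping an oversized request (rather than rejecting it) and the `run_foreground` result shape are assumptions.

```python
import os
import signal
import subprocess

DEFAULT_TIMEOUT_S = 30   # documented default
MAX_TIMEOUT_S = 300      # documented maximum


def clamp_timeout(requested=None):
    """Return a usable timeout. Clamping to the maximum is an assumption."""
    if requested is None or requested <= 0:
        return DEFAULT_TIMEOUT_S
    return min(requested, MAX_TIMEOUT_S)


def run_foreground(command, timeout=None):
    """Run a shell command; on timeout, kill its entire process group."""
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        # The shell becomes the leader of a new session and process group,
        # so its process group ID equals proc.pid.
        start_new_session=True,
    )
    try:
        output, _ = proc.communicate(timeout=clamp_timeout(timeout))
        return {"exit_code": proc.returncode, "output": output, "timed_out": False}
    except subprocess.TimeoutExpired:
        # Signal the whole group so background children die with the shell.
        os.killpg(proc.pid, signal.SIGKILL)
        output, _ = proc.communicate()
        return {"exit_code": proc.returncode, "output": output, "timed_out": True}
```

Killing only the parent shell would leave a command such as `sleep 30 & sleep 30` with a surviving child; signalling the group avoids that.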

shell gives the agent arbitrary command execution on the server. Only enable it in deployments where you control who can access the Store and you trust the model's judgment. Do not expose a Store with shell enabled to untrusted users.

time

Returns the current date and time in UTC and the server's local timezone. Use this when the model needs to know the current time to answer questions or construct time-based queries.

Example: for "What happened in the news today?", the model can call time first to establish today's date, then call web_search with a date-qualified query.
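A rough sketch of what such a tool could return, and how the result might feed a date-qualified search query, is shown below. The result keys (`utc`, `local`, `timezone`) and both function names are hypothetical; the actual response format of the time tool is not specified here.

```python
from datetime import datetime, timezone


def current_time():
    """Return the current time in UTC and in the host's local timezone."""
    now_utc = datetime.now(timezone.utc)
    now_local = now_utc.astimezone()  # converts to the system local timezone
    return {
        "utc": now_utc.isoformat(timespec="seconds"),
        "local": now_local.isoformat(timespec="seconds"),
        "timezone": now_local.tzname(),
    }


def date_qualified_query(topic, time_result):
    """Append the local calendar date (YYYY-MM-DD) to a search topic."""
    local_date = time_result["local"][:10]
    return f"{topic} {local_date}"
```

For example, `date_qualified_query("news", current_time())` yields the topic followed by today's date on the server.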

office

Reads, writes, and processes office document formats (DOCX, XLSX, PPTX). Useful when a user attaches a document to the chat and asks the agent to analyze it, without going through the Files/Vectors pipeline.

Available office sub-tools include Word read/write, Excel read/write, pptx_read, pptx_write, pptx_template_analyze, and pptx_template_fill. The PowerPoint writer creates editable .pptx files from an inline PptxGenJS script passed in the tool call. The script exports build(pptx, ctx), adds slides through PptxGenJS, and can receive optional JSON data plus an assets_dir for images or icons.

For template-based PowerPoint generation, call pptx_template_analyze on a local .pptx file or chat attachment URL first. It returns slide, text slot, image, table, chart, SmartArt, and capacity metadata. Then call pptx_template_fill with a plan that selects, repeats, reorders, and fills template slides. Template filling can replace PNG/JPEG images while preserving the template picture frame, edit SmartArt node text, and resize supported SmartArt layouts. It validates missing targets, chart data, unsupported image formats, and new object collisions before writing the output file.

pptx_write requires Node.js on the OpenAgent host. Source deployments use the bundled worker under tool/pptx-worker; if dependencies are missing, OpenAgent can run npm ci in that worker directory before generating the deck.

local_file

Provides access to files on the local filesystem of the server running OpenAgent. Enabling this tool type exposes six sub-tools:

Sub-tool	What it does
`local_special_dirs`	Returns common OS directories (Desktop, Documents, Downloads) with their absolute paths. Call this first when the user says "my Desktop" or "Downloads" without giving a full path.
`local_file_scan`	Scans a directory recursively and returns descendant files/directories with type, name, path, size, modified time, and optional text previews. Use `preview_chars` to control preview length or set it to `0` to skip previews.
`local_file_read`	Reads text from a local file by absolute path. Supported document types are parsed; other files are read as UTF-8 text. Supports offset and limit for large files.
`local_pdf_ocr_read`	Reads text from scanned local PDFs through OCR. It posts the PDF to the Tool record's Provider URL, or uses OpenAgent's managed local OCR service when the Provider URL is empty.
`local_file_write`	Writes text content to a local file. Requires an absolute path. Safe by default: it will not overwrite an existing file unless `overwrite: true` is set.
`local_file_move`	Moves a file from one absolute path to another. Requires `confirmed: true`, so the model must get explicit user confirmation before moving.

local_file gives the agent read and write access to the server's filesystem. Only enable it in deployments where you are comfortable giving the model that level of access.
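The write and move guards described in the table can be sketched as follows. This is an illustration of the safe-by-default behavior, not OpenAgent's implementation; the function signatures and the `{"ok": ..., "error": ...}` result shape are assumptions.

```python
import os
import shutil


def local_file_write(path, content, overwrite=False):
    """Write text to an absolute path, refusing to replace existing files by default."""
    if not os.path.isabs(path):
        return {"ok": False, "error": "path must be absolute"}
    if os.path.exists(path) and not overwrite:
        return {"ok": False, "error": "file exists; set overwrite: true to replace it"}
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return {"ok": True, "path": path}


def local_file_move(src, dst, confirmed=False):
    """Move a file only after the caller signals explicit user confirmation."""
    if not confirmed:
        return {"ok": False, "error": "user confirmation required; set confirmed: true"}
    if not (os.path.isabs(src) and os.path.isabs(dst)):
        return {"ok": False, "error": "both paths must be absolute"}
    shutil.move(src, dst)
    return {"ok": True, "path": dst}
```

Returning an error result instead of raising gives the model a message it can act on, such as asking the user for confirmation and retrying.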

For PDF OCR, leave Provider URL empty to use the managed local service at http://127.0.0.1:8001/ocr/pdf, or set it to a compatible OCR HTTP endpoint that accepts multipart field file and returns {"text":"recognized text"}. The managed service requires Python 3.10+ on the OpenAgent host and installs its Python dependencies on first startup.
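A client for a compatible OCR endpoint might look like the sketch below, which uses only the Python standard library. The multipart field name `file`, the `{"text": ...}` response shape, and the managed service URL come from the description above. The helper names, the default filename, and the 60-second timeout are assumptions for illustration.

```python
import json
import urllib.request
import uuid

MANAGED_OCR_URL = "http://127.0.0.1:8001/ocr/pdf"  # managed local service


def build_multipart(field, filename, data, content_type="application/pdf"):
    """Encode one file as a multipart/form-data body. Returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def ocr_pdf(pdf_bytes, filename="scan.pdf", provider_url=""):
    """POST a PDF to the OCR endpoint and return the recognized text."""
    # An empty Provider URL falls back to the managed local service.
    url = provider_url or MANAGED_OCR_URL
    body, content_type = build_multipart("file", filename, pdf_bytes)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

A custom OCR service only needs to accept this request shape and answer with a JSON object containing a `text` key.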

Older deployments may refer to local_documents_scan, local_document_read, or local_text_write. Current OpenAgent exposes the generic local_file_scan, local_file_read, local_pdf_ocr_read, and local_file_write APIs instead.

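Controlling tool use from the system prompt

The system prompt can tell the model when to use each enabled tool. For example:

You have access to web_search and web_fetch.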
Before answering any question about current events, call web_search first.
Only call web_fetch on URLs returned by web_search — do not fetch arbitrary URLs.

If you find the model over-using or under-using a particular tool, adjusting the system prompt is usually more effective than removing the tool entirely.
