Contact emails
dome...@chromium.org, sushr...@microsoft.com, m...@chromium.org,
kenjibah...@chromium.org, fran...@microsoft.com

Explainer
https://github.com/webmachinelearning/prompt-api?tab=readme-ov-file#tool-use

Specification
None yet.

Summary
We would like to prototype Function Calling in the Prompt API. At a high level,
this allows developers to define additional capabilities that a language model
can invoke as it processes a prompt, either to trigger actions or to obtain
additional information needed to complete a task.

Blink component
Blink > AI > Prompt

Motivation
Large Language Models (LLMs) are most effective when they can move beyond text 
generation and interact with external data and systems. This capability, 
commonly known as "Function Calling" or "Tool Use," allows a model to query 
APIs, perform actions, or access real-time information, transforming it from a 
passive chatbot into an active assistant. We propose adding Function Calling to 
the Prompt API to bring this powerful capability to the web platform for 
on-device models.

Enabling developers to define tools that the on-device model can use will
unlock a new class of rich, AI-powered web experiences. Based on strong 
developer signals from Chrome’s built-in AI 
hackathon<https://googlechromeai.devpost.com/> and early preview 
program<https://developer.chrome.com/docs/ai/join-epp>, there is significant 
demand for this feature across a range of applications:

  * Natural language UX: A flight search website could use this capability to
    let users express their needs more directly, without having to discover
    and learn more advanced controls (e.g. search filters).
  * Data-Driven Applications: A developer building a data visualization
    dashboard could provide tools to query and present data. This would allow
    users to ask questions like, "Show me sales from last quarter for the EMEA
    region" and have the model translate the request into a structured query
    and graph definitions for the respective tools.
  * Creative and Productivity Tools: An editor could expose building blocks
    and fundamental actions as tools, allowing its users to create powerful
    macros and custom features without having to learn a new language.

Adding this capability to the Prompt API will provide a standardized,
model-agnostic way for web developers to integrate their site's logic with an
on-device LLM, all while keeping user data on-device.

Relationship to Script Tools
This proposal is related to, yet distinct from, the Script Tools API (intent to
prototype<https://groups.google.com/a/chromium.org/g/blink-dev/c/W444JxsqxZw/m/b7E_JLXjBwAJ>).
While both involve exposing JavaScript functions for AI use, they serve
distinct, complementary purposes. The key difference lies in who orchestrates
the interaction:

  * Prompt API Function Calling (This Proposal): The web page (web developer)
    is the orchestrator. A developer uses the Prompt API to call the on-device
    model to enhance their site's own features. The page defines the tools and
    initiates the prompt, controlling the entire interaction. This is an
    "inward-facing" capability for the site to use AI.
  * Script Tools: An external agent is the orchestrator. A web page uses the
    Script Tools API to expose its capabilities to an external agent (e.g. a
    browser or OS-level agent, or other features and apps, including
    accessibility-related ones). The external agent, acting on a user's
    behalf, discovers and invokes these tools to perform tasks on the page.
    This is an "outward-facing" capability for the site to work well when used
    through an external agent.

These APIs are two sides of the same coin, and we are closely collaborating to 
ensure a unified developer narrative. A website could use the Prompt API to 
enhance its own UX while also exposing Script Tools to allow an external agent 
(e.g. browser or OS agent, other apps/features) to automate tasks on the user’s 
behalf.

Alternative Considered: Direct MCP Integration
We considered exposing a lower-level protocol like the Model Context Protocol 
(MCP) directly to the web. While MCP is emerging as a standard for agents 
interacting with external tools and data, we believe that a higher-level, 
native API is better suited for the web platform, in alignment with W3C design 
principles and developer ergonomics.

Both this proposal and Script Tools provide an interface that is isomorphic
with the concepts in MCP but abstracts away the low-level protocol details.
This design has several advantages:

  * Developer Ergonomics: It provides a simple, familiar JavaScript API that
    is easier for web developers to adopt and integrate into their existing
    client-side logic.
  * Direct Access to In-Page State: The API naturally operates within the
    page's execution context, giving tools trivial access to the DOM, session
    data, and other transient state generated by the user. A direct MCP
    integration would require an additional bridge to communicate this live,
    in-tab context to wherever the tools are called.
  * Durability and Decoupling: It avoids tightly coupling the web platform to
    a specific version of an external protocol that is still rapidly evolving.
  * Alignment with Client-Side Upsides: Defining tools locally in JavaScript
    makes better use of client-side AI's advantages, in particular resilience
    to network flakiness and offline conditions, as well as privacy and cost
    benefits.

By providing a web platform specific API that is isomorphic to MCP, we deliver 
immediate value to developers in a way that is both powerful and ergonomic. 
This pragmatic approach is the right first step for the web platform and does 
not preclude a more direct MCP integration in the future as the protocol 
stabilizes.
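
As an illustration of what "isomorphic" could mean in practice, the sketch
below maps a page-defined tool onto the shape of an MCP tool descriptor
(name, description, inputSchema) and routes an MCP-style call back to the
local function. The page-side tool shape, the flight search tool, and the
helper names toMcpDescriptor and handleMcpStyleCall are all assumptions made
for this example; none of them are specified API.

```javascript
// Hypothetical page-defined tool; the object shape is assumed for illustration.
const searchFlights = {
  name: "searchFlights",
  description: "Search flights between two airports.",
  inputSchema: {
    type: "object",
    properties: {
      from: { type: "string" },
      to: { type: "string" },
    },
    required: ["from", "to"],
  },
  // Runs in the page, so a real tool could read DOM or session state directly.
  // Stubbed here instead of querying a real flight service.
  execute: async ({ from, to }) => `Flights from ${from} to ${to} (stubbed)`,
};

// Keep only the declarative fields, which are the ones MCP uses to describe
// a tool. The execute callback stays local to the page.
function toMcpDescriptor({ name, description, inputSchema }) {
  return { name, description, inputSchema };
}

// Route an MCP-style call ({ name, arguments }) to the matching local tool
// and wrap the outcome as MCP-style text content.
async function handleMcpStyleCall(tools, { name, arguments: args }) {
  const tool = tools.find((candidate) => candidate.name === name);
  if (!tool) {
    return {
      isError: true,
      content: [{ type: "text", text: `Unknown tool: ${name}` }],
    };
  }
  const text = await tool.execute(args);
  return { isError: false, content: [{ type: "text", text }] };
}
```

Because only the descriptor crosses the boundary, a bridge of this kind could
be layered on later without changing how the page defines its tools.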

Initial public proposal
None yet.

TAG review
None yet.

TAG review status
TBD.

Risks

Interoperability and Compatibility
Adding function calling to the Prompt API helps alleviate one interoperability
concern between the different models that implement the Prompt API.
Language models use special tokens to define tools and parameters. These
tokens, and the template used to declare tool availability, are
model-specific. For example, Phi 4, the model used by Microsoft Edge, uses the
following format to declare tools in its system prompt:
<|tool|>[{"name": "get_weather_updates", "description": "Fetches weather updates for a given city using the RapidAPI Weather API.", "parameters": {"city": {"description": "The name of the city for which to retrieve weather information.", "type": "str", "default": "London"}}}]<|/tool|>
Gemini Nano, the model used by Chrome, has a different format that is not
public (as a reference, see Gemma's documentation on the
topic<https://ai.google.dev/gemma/docs/capabilities/function-calling>).
Today, web developers eager to use function calling with the Prompt API on user
agents that implement it with Phi 4 can pass the exact string above and receive
responses like the one below, which they can parse into a JavaScript function
call:
<|tool_call|>[{"name": "get_weather_updates", "arguments": {"city": "San Francisco"}}]<|/tool_call|>
Such an implementation is not interoperable. Our function calling proposal
builds an abstraction that lets web developers use function calling in an
interoperable way by handling the conversion of high-level function
descriptions to model-specific syntax.
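
To illustrate the kind of model-specific plumbing such an abstraction would
hide, here is a minimal sketch that assumes the Phi 4 delimiters quoted above.
The helper names (formatToolsForPhi4, parseToolCalls, dispatchToolCalls), the
tool object shape, and the stubbed weather tool are hypothetical; they are not
part of the Prompt API or of any shipped implementation.

```javascript
// Hypothetical helpers for illustration only.
// Delimiters are copied from the Phi 4 examples quoted in this post.
const TOOL_OPEN = "<|tool|>";
const TOOL_CLOSE = "<|/tool|>";
const CALL_OPEN = "<|tool_call|>";
const CALL_CLOSE = "<|/tool_call|>";

// Serialize tool descriptions into the Phi 4 system-prompt declaration.
// Only the declarative fields are sent; the callback stays in the page.
function formatToolsForPhi4(tools) {
  const declarations = tools.map(({ name, description, parameters }) => ({
    name,
    description,
    parameters,
  }));
  return TOOL_OPEN + JSON.stringify(declarations) + TOOL_CLOSE;
}

// Extract the JSON array of calls from a model response.
// Returns an empty array when the response contains no tool call.
function parseToolCalls(response) {
  const start = response.indexOf(CALL_OPEN);
  if (start === -1) return [];
  const end = response.indexOf(CALL_CLOSE, start + CALL_OPEN.length);
  if (end === -1) return [];
  return JSON.parse(response.slice(start + CALL_OPEN.length, end));
}

// Invoke the matching page-defined function for each parsed call.
async function dispatchToolCalls(calls, registry) {
  const results = [];
  for (const { name, arguments: args } of calls) {
    const tool = registry.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    results.push(await tool.execute(args));
  }
  return results;
}

// Example tool with a stand-in implementation (no real weather service).
const weatherTool = {
  name: "get_weather_updates",
  description: "Fetches weather updates for a given city.",
  parameters: {
    city: { description: "The name of the city.", type: "str", default: "London" },
  },
  execute: async ({ city }) => `Weather for ${city}: (stubbed)`,
};

const reply =
  '<|tool_call|>[{"name": "get_weather_updates", "arguments": {"city": "San Francisco"}}]<|/tool_call|>';
dispatchToolCalls(parseToolCalls(reply), new Map([[weatherTool.name, weatherTool]]))
  .then((results) => console.log(results));
```

A page that hand-rolls this logic would need a second pair of functions for
every other model format, which is the duplication the proposal aims to remove.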
Gecko: No signal on function calling specifically (also see Prompt API 
position<https://github.com/mozilla/standards-positions/issues/1213>)
WebKit: No signal on function calling specifically (also see Prompt API 
position<https://github.com/WebKit/standards-positions/issues/495>)
Web developers: Early signals indicate high interest (see the Motivation
section and engagement on the WICG
repo<https://github.com/webmachinelearning/prompt-api/issues/7>)

Is this feature fully tested by
web-platform-tests<https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>?
Aiming for proper test coverage.

Requires code in //chrome?
True

Tracking bug
http://crbug.com/422803232

Estimated milestones
No milestones specified.

Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5085580563841024
