Re: [blink-dev] Intent to Experiment: Prompt API

Mike Taylor Fri, 16 May 2025 07:05:09 -0700

LGTM to experiment from M139 to M144 inclusive.

On 5/16/25 3:18 AM, Domenic Denicola wrote:

        Contact emails
a...@chromium.org, m...@chromium.org, btri...@chromium.org, dome...@chromium.org, kenjibah...@chromium.org
        Explainer
https://github.com/webmachinelearning/prompt-api/blob/main/README.md
        Specification
None yet, although some of the shared infrastructure in https://webmachinelearning.github.io/writing-assistance-apis/#supporting will be used.
        Summary
An API designed for interacting with an AI language model using text, image, and audio inputs. It supports various use cases, from generating image captions and performing visual searches to transcribing audio, classifying sound events, generating text following specific instructions, and extracting information or insights from text. It supports structured outputs which ensure that responses adhere to a predefined format, typically expressed as a JSON schema, to enhance response conformance and facilitate seamless integration with downstream applications that require standardized output formats.
This API is also exposed in Chrome Extensions, currently as an Origin Trial. This Intent is for exposure as an Origin Trial on the web.
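
To make the structured-output idea in the summary concrete, here is a minimal sketch. It assumes the session object exposes prompt(input, { responseConstraint }) and returns a JSON string, as the explainer described at the time of writing; the API is experimental and may change. The PromptSession interface, ratingSchema, parseRating and rateReview are hypothetical names introduced only for this illustration.

```typescript
// Structural type for the one call used here. Assumption: mirrors the
// explainer's session.prompt(); not an official type definition.
interface PromptSession {
  prompt(input: string, options?: { responseConstraint?: object }): Promise<string>;
}

// Hypothetical JSON schema: the model must answer with {"rating": 1..5}.
const ratingSchema = {
  type: "object",
  properties: { rating: { type: "number", minimum: 1, maximum: 5 } },
  required: ["rating"],
  additionalProperties: false,
};

// Parse defensively: even with a constraint, treat the output as untrusted.
function parseRating(raw: string): number | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const rating = (value as { rating?: unknown }).rating;
  return typeof rating === "number" && rating >= 1 && rating <= 5 ? rating : null;
}

// Ask for a rating and return it, or null if the response does not conform.
async function rateReview(session: PromptSession, review: string): Promise<number | null> {
  const raw = await session.prompt("Rate this review from 1 to 5: " + review, {
    responseConstraint: ratingSchema,
  });
  return parseRating(raw);
}
```

Because the caller only ever sees a number or null, downstream code does not depend on how a particular model phrases its answer.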
        Blink component
Blink>AI>Prompt <https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3EAI%3EPrompt%22>
        TAG review
https://github.com/w3ctag/design-reviews/issues/1093
        TAG review status

Pending


        Risks



        Interoperability and Compatibility
This feature, like all built-in AI features, has inherent interoperability risks due to the use of AI models whose behavior is not fully specified. See some general discussion in https://www.w3.org/reports/ai-web-impact/#interop.
In particular, because the output in response to a given prompt varies by language model, it is possible for developers to write brittle code that relies on specific output formats or quality, and does not work across multiple browsers or multiple versions of the same browser.
There are some reasons to be optimistic that web developers won't write such brittle code. Language models are inherently nondeterministic, so creating dependencies on their exact output is difficult. And many users will not have the hardware necessary to run a language model, so developers will need to code in a way such that the prompt API is always used as an enhancement, or has appropriate fallback to cloud services.
Several parts of the API design help steer developers in the right direction, as well. The API has clear availability testing features for developers to use, and requires developers to state their required capabilities (e.g., modalities and languages) up front. Most importantly, the structured outputs feature can help mitigate against writing brittle code that relies on specific output formats.
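
The availability check and up-front capability declaration described above might be used roughly as follows. This is a sketch under assumptions: the availability states and the expectedInputs option follow the explainer at the time of writing and may change, and LanguageModelLike, Backend, backendFor and chooseBackend are hypothetical names for this illustration. The model object is passed in as a parameter so the logic also works where the API does not exist.

```typescript
// Availability states as listed in the explainer (assumption: may change).
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

// Capabilities the page needs, declared before any session is created.
interface ExpectedInput {
  type: "text" | "image" | "audio";
  languages?: string[];
}
interface CreateOptions {
  expectedInputs?: ExpectedInput[];
}

// Structural stand-in for the global LanguageModel object.
interface LanguageModelLike {
  availability(options?: CreateOptions): Promise<Availability>;
}

type Backend = "on-device" | "on-device-after-download" | "fallback";

// Map an availability state to a strategy.
function backendFor(availability: Availability): Backend {
  switch (availability) {
    case "available":
      return "on-device";
    case "downloadable":
    case "downloading":
      return "on-device-after-download";
    default:
      return "fallback";
  }
}

// Treat a missing or failing API as "use the fallback", never as an error.
async function chooseBackend(
  lm: LanguageModelLike | undefined,
  needs: CreateOptions,
): Promise<Backend> {
  if (lm === undefined) return "fallback";
  try {
    return backendFor(await lm.availability(needs));
  } catch {
    return "fallback";
  }
}
```

In a page, the first argument would be something like (globalThis as any).LanguageModel, and needs might be { expectedInputs: [{ type: "text", languages: ["en"] }, { type: "image" }] }.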
Gecko: No signal (https://github.com/mozilla/standards-positions/issues/1213)
WebKit: No signal (https://github.com/WebKit/standards-positions/issues/495)
Web developers: Strongly positive (https://github.com/webmachinelearning/prompt-api/blob/main/README.md#stakeholder-feedback)
Other signals: We are also working with Microsoft Edge developers on this feature, with them contributing the structured output functionality.
        Activation
This feature would definitely benefit from having polyfills, backed by any of: cloud services, lazily-loaded client-side models using WebGPU, or the web developer's own server. We anticipate seeing an ecosystem of such polyfills grow as more developers experiment with this API.
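
One possible shape for such a polyfill-style wrapper is sketched below. It is not an existing library: Prompter and withFallback are hypothetical names, and the fallback could be backed by any of the options listed above (a cloud service, a lazily-loaded client-side model, or the developer's own server).

```typescript
// Anything that can answer a text prompt: a native session, a cloud client, etc.
interface Prompter {
  prompt(input: string): Promise<string>;
}

// Wrap an optional on-device prompter so callers always have an answer path.
function withFallback(native: Prompter | null, fallback: Prompter): Prompter {
  return {
    async prompt(input: string): Promise<string> {
      if (native !== null) {
        try {
          return await native.prompt(input);
        } catch {
          // The on-device path failed; fall through to the fallback.
        }
      }
      return fallback.prompt(input);
    },
  };
}
```

Calling code then depends only on the Prompter shape, so replacing the fallback implementation later does not affect it.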
        WebView application risks
Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?
None



        Goals for experimentation
Validate the technical implementation and developer experience of multimodal inputs with a broader audience and actual usage.
Assess how structured output improves ergonomics and could address interoperability concerns between implementations (e.g. different underlying models).
Gather extensive feedback from a wide range of web developers rooted in real world usage.
Identify diverse and innovative use cases to inform a roadmap of task APIs.
        Ongoing technical constraints

None



        Debuggability
It is possible that giving DevTools more insight into the nondeterministic states of the model, e.g. random seeds, could help with debugging. See discussion at https://github.com/webmachinelearning/prompt-api/issues/74.
We also have some internal debugging pages which give more detail on the model's status, e.g. chrome://on-device-internals, and parts of these might be suitable to port into DevTools.
        Will this feature be supported on all six Blink platforms
        (Windows, Mac, Linux, ChromeOS, Android, and Android WebView)?

No
Not all platforms will come with a language model. In particular, in the initial stages we are focusing on Windows, Mac, and Linux.
        Is this feature fully tested by web-platform-tests
        <https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>?

No
We plan to write web platform tests for the API surface as much as possible. The core responses from the model will be difficult to test, but some facets are testable, e.g. the adherence to structured output response constraints.
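
As an illustration of why constraint adherence is testable even when the model's wording is not, the sketch below checks a response string against a deliberately tiny subset of JSON Schema. It is not web-platform-tests code. ObjectSchema, PropertySchema and conformsTo are hypothetical names, and a real test would presumably use a complete schema validator.

```typescript
// Tiny JSON Schema subset: a flat object with primitive, optionally
// enumerated properties. Anything beyond this is out of scope here.
interface PropertySchema {
  type: "string" | "number" | "boolean";
  enum?: readonly unknown[];
}
interface ObjectSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: readonly string[];
  additionalProperties?: boolean;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// True if `raw` is JSON for an object that satisfies `schema`.
function conformsTo(raw: string, schema: ObjectSchema): boolean {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return false; // not valid JSON
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const record = value as Record<string, unknown>;

  // Every required key must be present.
  for (const key of schema.required ?? []) {
    if (!hasOwn(record, key)) return false;
  }
  // Every present key must be declared (if required) and correctly typed.
  for (const key of Object.keys(record)) {
    if (!hasOwn(schema.properties, key)) {
      if (schema.additionalProperties === false) return false;
      continue;
    }
    const prop = schema.properties[key];
    const v = record[key];
    if (typeof v !== prop.type) return false;
    if (prop.enum !== undefined && prop.enum.indexOf(v) === -1) return false;
  }
  return true;
}
```

A check like this asserts only on the shape of the response, so it holds across models and versions as long as the constraint is honored.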
        Flag name on about://flags

prompt-api-for-gemini-nano-multimodal-input


        Finch feature name

AIPromptAPIMultimodalInput


        Requires code in //chrome?

True


        Tracking bug
https://issues.chromium.org/issues/417530643
        Measurement

We have various use counters for the API, e.g. LanguageModel_Create


        Non-OSS dependencies
Does the feature depend on any code or APIs outside the Chromium open source repository and its open-source dependencies to function?
Yes: this feature depends on a language model, which is bridged to the open-source parts of the implementation via the interfaces in //services/on_device_model.
        Estimated milestones

Origin trial desktop first: 139
Origin trial desktop last: 144
DevTrial on desktop: 137
DevTrial on Android: 137



        Anticipated spec changes
Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known GitHub issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changing the naming or structure of the API in a non-backward-compatible way).
https://github.com/webmachinelearning/prompt-api/issues/42 is somewhat worth keeping an eye on, but we believe a forward-compatible approach is possible by just providing constant min = max values.
        Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5134603979063296?gate=5106702730657792
        Links to previous Intent discussions
Intent to Prototype: https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAM0wra_LXU8KkcVJ0x%3DzYa4h_sC3FaHGdaoM59FNwwtRAsOALQ%40mail.gmail.com
This intent message was generated by Chrome Platform Status <https://chromestatus.com/>.
--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+unsubscr...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAM0wra9oT0jygAYT00WPp0_wtZ-znrB2OdZ6GQb%2B3thFLP19pA%40mail.gmail.com.

