Built-In Capabilities

When developing an AI-enhanced application, you often need the model to search and browse the internet, or to perform other tasks beyond the AI model itself. This is where the built-in AI capabilities come in.

You can use the built-in capabilities with AI models that support tool calls, including common models such as the GPT-4 series, Claude series, and Llama 3.1 series.

1. How to use

To use the built-in AI capabilities, you need to pass certain HTTP request headers to the API. The headers are:

| Capability | Header | Description | Details |
| --- | --- | --- | --- |
| Search | `x-feature-search-internet: true` | Allows the AI model to search the internet. | |
| Browse Website | `x-feature-browse-website: true` | Allows the AI model to browse webpages. | Browse results are cached for 24 hours for the same URL. |
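As a sketch, the headers can be assembled like this in Python. The endpoint URL, API key placeholder, and `build_headers` helper are illustrative assumptions, not part of the API; only the two `x-feature-*` header names come from the table above.

```python
# Hypothetical endpoint; substitute your provider's real base URL.
API_URL = "https://api.example.com/v1/chat/completions"

def build_headers(api_key: str, search: bool = False, browse: bool = False) -> dict:
    """Assemble request headers, opting into the built-in capabilities."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if search:
        headers["x-feature-search-internet"] = "true"
    if browse:
        headers["x-feature-browse-website"] = "true"
    return headers

headers = build_headers("YOUR_API_KEY", search=True, browse=True)
# Send the chat request with these headers using your HTTP client of
# choice (e.g. requests or httpx), alongside your usual JSON payload.
```

Because the capabilities are enabled per request, you can turn them on only for the calls that need them and leave latency-sensitive calls unaffected.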

2. Capabilities

Search

Once the Search capability is enabled, the AI model can search the internet for information. The capability is designed to query multiple search engines, but currently only Google is supported.

Browse Website

Enabling the Browse Website capability allows the AI model to browse webpages. Browse results are cached for 24 hours for the same URL. This capability supports fetching multiple URLs at the same time.
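For example, a single user message can reference several pages at once, and the capability fetches them in one batch. The model name and URLs in this payload are placeholders for illustration:

```python
# Illustrative payload only; the model name and URLs are placeholders.
payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": (
                "Compare the pricing pages at "
                "https://example.com/pricing and https://example.org/pricing."
            ),
        }
    ],
}
# Send this with the `x-feature-browse-website: true` header set; both
# URLs are fetched in the same batch, and each result is cached for 24 hours.
```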

For response-latency reasons, the capability applies different timeout configurations depending on how many URLs are fetched in the same batch:

| Number of URLs to fetch | Timeout |
| --- | --- |
| count(urls) <= 1 | 5s |
| 1 < count(urls) <= 4 | 2s |
| 5 < count(urls) <= 9 | 1.5s |
| count(urls) > 10 | 1s |
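The schedule above can be sketched as a simple lookup. Note that the published ranges leave counts of exactly 5 and 10 unspecified; treating them like their neighbors here is an assumption, not documented behavior.

```python
def batch_timeout_seconds(url_count: int) -> float:
    """Per-batch fetch timeout following the table above.

    Assumption: counts of 5 and 10 fall between the published ranges;
    this sketch groups them with the adjacent tier.
    """
    if url_count <= 1:
        return 5.0
    if url_count <= 4:
        return 2.0
    if url_count <= 9:   # table says 5 < count <= 9; count == 5 assumed here
        return 1.5
    return 1.0           # table says count > 10; count == 10 assumed here
```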

3. Common questions

Q: Why do Anthropic models trigger the search/browse capabilities almost every time?

Different models have different sensitivities to external tools, and Anthropic models are more eager to use them, which may trigger the search/browse capabilities more often. You can instruct Anthropic models in the system prompt to use the tools less often.

e.g.:

System: ONLY use the search tools when it is necessary.

Q: It seems it takes longer to get a response when using the built-in AI capabilities. Why is that?

The built-in AI capabilities require additional processing time to perform their tasks, and the longer context they produce increases TTFT (time to first token). This can result in longer response times compared to calling the AI model directly without capabilities. Switching to smaller models will also help reduce latency.

Q: Why should I use built-in AI capabilities rather than implementing them myself through tool calls?

  1. Simplicity: The built-in AI capabilities are designed to be easy to use and require minimal setup. We handle all the complexity for you.
  2. Speed: We handle the requests at the edge, reducing request latency by up to 5x compared to resolving the request from your own client or server.
  3. Cost Efficiency: We cache requests for the same AI capability (depending on the specific capabilities you use), which reduces both latency and cost.
  4. Reliability: We have a robust infrastructure to handle the requests, which ensures the reliability of the AI capabilities.
  5. Security: We have a robust security model to ensure the security of the AI capabilities, which protects your data and privacy.