Public Information

Base Parameters

Parameter	Description
BasePath	Base path for API calls
Endpoint	Randomly generated path segment used when calling the API
AccessKey	AccessKey used to authenticate API calls

Request Path

https://{BasePath}/search/{Endpoint}/read

Request Method

POST

Request Header

Parameter	Type	Description
Authorization	String	Format: Bearer {AK}, where AK is the AccessKey.
Pragma	String	Optional. Value: no-cache. When set, the response is not cached and each request is handled independently. When omitted, the response for the same query term is cached for 10 minutes.
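
As a minimal sketch of the header rules above, the helper below builds the header dictionary for a request. The function name build_headers and its no_cache flag are hypothetical names chosen for illustration; the Content-Type header is taken from the request examples in this document.

```python
from typing import Dict


def build_headers(access_key: str, no_cache: bool = False) -> Dict[str, str]:
    """Build the request headers (hypothetical helper, not part of the API)."""
    if not access_key:
        raise ValueError("access_key must not be empty")
    headers = {
        # Documented format: Bearer {AK}
        "Authorization": f"Bearer {access_key}",
        # Used by the request examples in this document
        "Content-Type": "application/json",
    }
    if no_cache:
        # Opt out of the documented 10-minute response cache
        headers["Pragma"] = "no-cache"
    return headers
```

Calling build_headers(AK, no_cache=True) is useful when a page changes often and a cached result would be stale.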

Request Body Parameters

Parameter	Required	Type	Description
url	Y	String	Target URL to read
formats	N	List	Content format. Options: HTML, TEXT, MARKDOWN. Exactly one value must be selected; defaults to TEXT if not provided.
mode	N	String	Reading mode. Options: quality (quality mode, uses dynamic rendering); fast (fast mode, uses static page reading; the default if not provided); auto (auto mode, selects fast or quality automatically based on the URL). Any other value is treated as fast.
totalTimeout	N	Int	End-to-end request timeout in milliseconds. The request is aborted if it exceeds this time. Disabled by default.
timeout	N	Int	Read timeout in milliseconds. Default: 30000. Controls the API connection time, not the total request duration.
imageDownloadEnable	N	Bool	Image conversion switch (converts images to base64 encoding). Default: false. When formats is HTML or MARKDOWN and imageDownloadEnable is true, image links in the response are converted to base64. Note: enabling this incurs additional fees for images parsed from the webpage. Note: if the URL points to a PDF, this feature is disabled and the result is returned in base64 by default.
imageInContent	N	Bool	Works together with imageDownloadEnable. Default: true, meaning images converted to base64 are embedded in the main content. If set to false, an additional field image_base64_list is returned, containing all base64 images in a separate list.
pdfExtractEnable	N	Bool	Enables or disables content extraction when the URL points to a PDF. Default: false. true: return the parsed PDF text content. false: return the PDF as base64. Has no effect if the URL does not point to a PDF. Note: PDF content extraction incurs additional fees.
enhancedOcr	N	Bool	Enhanced PDF extraction that improves accuracy and coverage. Only effective if pdfExtractEnable is true. Default: false. Note: OCR recognition incurs additional fees.
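
The parameter rules above can be checked on the client before a request is sent. The following is a sketch only: build_read_payload and its argument names are hypothetical, and rejecting invalid combinations locally is a design choice of this example rather than documented server behavior (the table states, for instance, that the server treats unknown mode values as fast).

```python
from typing import Any, Dict, Optional

ALLOWED_FORMATS = ("HTML", "TEXT", "MARKDOWN")
ALLOWED_MODES = ("quality", "fast", "auto")


def build_read_payload(
    url: str,
    content_format: str = "TEXT",
    mode: str = "fast",
    total_timeout_ms: Optional[int] = None,
    timeout_ms: Optional[int] = None,
    image_download: bool = False,
    image_in_content: bool = True,
    pdf_extract: bool = False,
    enhanced_ocr: bool = False,
) -> Dict[str, Any]:
    """Build a request body from the documented parameters (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    fmt = content_format.upper()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"formats must be one of {ALLOWED_FORMATS}")
    if mode not in ALLOWED_MODES:
        # The server would fall back to fast; this sketch fails early instead.
        raise ValueError(f"mode must be one of {ALLOWED_MODES}")
    if enhanced_ocr and not pdf_extract:
        raise ValueError("enhancedOcr only takes effect with pdfExtractEnable")
    if image_download and fmt == "TEXT":
        raise ValueError("imageDownloadEnable applies to HTML or MARKDOWN only")

    # formats is a list that carries exactly one value
    payload: Dict[str, Any] = {"url": url, "formats": [fmt], "mode": mode}
    if total_timeout_ms is not None:
        payload["totalTimeout"] = total_timeout_ms
    if timeout_ms is not None:
        payload["timeout"] = timeout_ms
    if image_download:
        payload["imageDownloadEnable"] = True
        payload["imageInContent"] = image_in_content
    if pdf_extract:
        payload["pdfExtractEnable"] = True
        if enhanced_ocr:
            payload["enhancedOcr"] = True
    return payload
```

Optional fields are left out unless they are set explicitly, so the documented server-side defaults apply to anything omitted.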

Response Value Description

Field	Type	Description
html	String	Returned when formats is HTML. The HTML version of the page content.
markdown	String	Returned when formats is MARKDOWN. The Markdown version of the page content.
text	String	Returned when formats is TEXT. The plain-text version of the page content.
metadata.title	String	Webpage title.
metadata.description	String	Webpage description.
metadata.keywords	String	Webpage keywords.
logo	String	URL of the website logo.
site_name	String	Website name.
image_list	Array	List of image URLs. Only included if the page contains images.
image_base64_list	Array	List of images in base64 encoding. Returned when imageDownloadEnable is true and imageInContent is false.
pdf_pages	Int	Number of pages if the target is a PDF.
internal_links	Array	List of internal links (same domain as the target URL) found in the content.
external_links	Array	List of external links (different domain from the target URL) found in the content.
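
Because only the content field matching the requested format is returned, client code may want to look the field up defensively. The sketch below assumes the response body has already been decoded into a dictionary (for example with response.json()); extract_content and summarize are hypothetical helper names, and treating a missing field as absent is an assumption of this example.

```python
from typing import Any, Dict, Tuple

CONTENT_FIELDS = ("html", "markdown", "text")


def extract_content(result: Dict[str, Any]) -> Tuple[str, str]:
    """Return (field_name, content) for the first content field present."""
    for field in CONTENT_FIELDS:
        value = result.get(field)
        if isinstance(value, str):
            return field, value
    raise KeyError("response contains none of: html, markdown, text")


def summarize(result: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the metadata and link/image counts from a decoded response."""
    metadata = result.get("metadata") or {}
    return {
        "title": metadata.get("title"),
        "site_name": result.get("site_name"),
        # image_list is only included if the page contains images
        "image_count": len(result.get("image_list") or []),
        "internal_link_count": len(result.get("internal_links") or []),
        "external_link_count": len(result.get("external_links") or []),
    }
```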

Request Example

cURL

curl --location --request POST 'https://{BasePath}/search/{Endpoint}/read' \
--header 'Authorization: Bearer {AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://www.volcengine.com/docs/6369/67267",
    "formats": [
        "TEXT"
    ],
    "mode":"quality"
}'

Python (Requests)

import requests
import json

BasePath = 'xxxxxx'   # Replace with your obtained BasePath
Endpoint = 'xxxxxx'   # Replace with your obtained Endpoint
AK = 'xxxxxx'         # Replace with your obtained AccessKey

url = f"https://{BasePath}/search/{Endpoint}/read"

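# Request body: target URL, one content format, and the reading mode
payload = json.dumps({
    "url": "https://www.volcengine.com/docs/6369/67267",
    "formats": ["TEXT"],
    "mode": "quality"
})

# Bearer authentication with the AccessKey; the body is sent as JSON
headers = {
    "Authorization": f"Bearer {AK}",
    "Content-Type": "application/json"
}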

response = requests.post(url, headers=headers, data=payload)

print(response.text)

Response Example

{
    "markdown": "导航\n\n请求结构\n\n最近更新时间：2025.04.23 12:06:55首次发布时间：2021.02.25 21:02:47\n\n火山引擎的OpenAPI的请求结构如下：\n\n服务地址（Endpoint）用于访问火山引擎的云服务，通常是一个URL，客户端可以通过该地址与服务进行通信。  \n火山引擎服务地址的标准形式为：\n\nRegional服务 （区域化部署） Global服务（中心化部署）\n\n    \n    \n    {service}.{region}.volcengineapi.com\n\n例如：云服务器的Service为：ecs，其在\n\n    \n    \n    cn-beijing-autodriving\n\nRegion的服务地址为：\n\necs.cn-beijing-autodriving.volcengineapi.com\n\n    \n    \n    {service}.volcengineapi.com\n\n例如：访问控制的Service为：iam，作为Global服务，其服务地址为：iam.volcengineapi.com\n\n注意\n\n  * 当Service中存在下划线( _ )符号时，Endpoint需转为中划线( - )符号。存在大写字母时需转成小写。\n  * 存在部分云产品暂未适配标准Endpoint，请前往您所使用的云产品-API参考或开发指南中查看详情。\n\n推荐使用安全性更高的 HTTPS方式发送请求。\n\n请求方法详见各个接口具体的需求。在火山引擎中的OpenAPI大多数支持GET或POST请求。\n\n火山引擎的OpenAPI请求包含两类参数：公共请求参数和接口请求参数。其中公共请求参数在每个请求中都必须包含。接口请求参数需参考各个服务的接口文档。\n\n请求及返回结果使用UTF-8的字符集进行编码。\n\n售后在线咨询\n\n",
    "logo": "https://portal.volccdn.com/obj/volcfe/misc/favicon.png",
    "site_name": "API签名调用指南",
    "image_list": [],
    "metadata": {
        "title": "请求结构--API签名调用指南-火山引擎",
        "description": "火山引擎官方文档中心，产品文档、快速入门、用户指南等内容，你关心的都在这里，包含火山引擎主要产品的使用手册、API或SDK手册、常见问题等必备资料，我们会不断优化，为用户带来更好的使用体验",
        "keywords": "API签名调用指南"
    },
    "internal_links": [],
    "external_links": []
}