Skip to main content

Highlight data API

Use the Highlight Data API to obtain line coordinates for each line number returned by the Execution API.
Ensure that the Enable Highlight option is enabled in the exported tool settings in the workflow otherwise, highlight data will not be returned.

Request

Endpoint/deployment/api/<organization_id>/<api_name>/highlight/
URLhttps://us-central.unstract.com/deployment/api/<organization_id>/<api_name>/highlight/
MethodGET
HeadersAuthorization: Bearer <YOUR API KEY>

Parameters

ParameterTypeDefaultRequiredDescription
whisper_hashstring--YesThe Whisper hash returned in the Execution API response and you can find it in the below json path: result.metadata.whisper_hash
line_numbersstring--YesThe Line numbers returned in the Execution API response and you can find it in the below json path: result.metadata.line_numbers. Specify Comma-separated list of line numbers like 2,3,6
text_extractor_namestring--YesThe name of the text extractor adapter that is used in the Prompt studio e.g llm-whisperer-v2

Example Curl Request

curl -X GET --location 'https://us-central.unstract.com/deployment/api/org_HJqNgodUQsA99S3A/mar17/highlight/?whisper_hash=REPLACE_WITH_WHISPER_HASH&line_numbers=REPLACE_WITH_LINE_NUMBERS&text_extractor_name=REPLACE_WITH_TEXT_EXTRACTOR_NAME' \
--header 'Authorization: Bearer <Your API Key>'

Example 200 Response

{
"highlights":{
"2":{
"base_y":155,
"base_y_percent":4.8927,
"height":51,
"height_percent":1.6098,
"page":0,
"page_height":3168,
"raw":[
0,
155,
51,
3168
]
},
"3":{
"base_y":187,
"base_y_percent":5.9028,
"height":26,
"height_percent":0.8207,
"page":0,
"page_height":3168,
"raw":[
0,
187,
26,
3168
]
},
"6":{
"base_y":300,
"base_y_percent":9.4697,
"height":38,
"height_percent":1.1995,
"page":0,
"page_height":3168,
"raw":[
0,
300,
38,
3168
]
}
}
}

400: Missing required fields

This response occurs when any of required parameter is missing or empty.

{
"type":"validation_error",
"errors":[
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"whisper_hash"
},
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"text_extractor_name"
},
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"line_numbers"
}
]
}

Line Numbers

This data can be used show the bounding box of each line of text extracted from the document in the original document. Useful for highlighting the extracted text in the original document.

The line metadata contains the bounding box coordinates for each line of text extracted from the document. For each line, an array of arrays is provided with the bounding box coordinates. The bounding box information is represented as a array of four integers: [page_no, y, height, max_height] where:

  • page_no: The page number of the document.
  • base_y: The y-coordinate of the bounding box (bottom).
  • height: The height of the bounding box.
  • page_height: The height of the original page.
info

Highlighting is for the entire row of the original document. The x coordinates are not provided as they will effectively be the width of the page. x1=0 and x2=page_width always.


(0,y-height) +-------------------------+ (page_width,y-height)
| |
| |
(0,y) +-------------------------+ (page_width,y)

highlight_data = {
"2": { # Example highlight data for line 2
"base_y": 155, # Y-coordinate
"base_y_percent": 4.8927,
"height": 51, # Height
"height_percent": 1.6098,
"page": 0, # Page number
"page_height": 3168, # Page height
"raw": [
0, # Page number
155, # Y-coordinate
51, # Height
3168 # Page height
]
}
}