Highlight data API
Use the Highlight Data API to obtain line coordinates for each line number returned by the Execution API.
Ensure that the Enable Highlight
option is enabled in the exported tool settings in the workflow otherwise, highlight data will not be returned.
Request
Endpoint | /deployment/api/<organization_id>/<api_name>/highlight/ |
URL | https://us-central.unstract.com/deployment/api/<organization_id>/<api_name>/highlight/ |
Method | GET |
Headers | Authorization: Bearer <YOUR API KEY> |
Parameters
Parameter | Type | Default | Required | Description |
---|---|---|---|---|
whisper_hash | string | -- | Yes | The Whisper hash returned in the Execution API response and you can find it in the below json path: result.metadata.whisper_hash |
line_numbers | string | -- | Yes | The Line numbers returned in the Execution API response and you can find it in the below json path: result.metadata.line_numbers . Specify Comma-separated list of line numbers like 2,3,6 |
text_extractor_name | string | -- | Yes | The name of the text extractor adapter that is used in the Prompt studio e.g llm-whisperer-v2 |
Example Curl Request
curl -X GET --location 'https://us-central.unstract.com/deployment/api/org_HJqNgodUQsA99S3A/mar17/highlight/?whisper_hash=REPLACE_WITH_WHISPER_HASH&line_numbers=REPLACE_WITH_LINE_NUMBERS&text_extractor_name=REPLACE_WITH_TEXT_EXTRACTOR_NAME' \
--header 'Authorization: Bearer <Your API Key>'
Example 200
Response
{
"highlights":{
"2":{
"base_y":155,
"base_y_percent":4.8927,
"height":51,
"height_percent":1.6098,
"page":0,
"page_height":3168,
"raw":[
0,
155,
51,
3168
]
},
"3":{
"base_y":187,
"base_y_percent":5.9028,
"height":26,
"height_percent":0.8207,
"page":0,
"page_height":3168,
"raw":[
0,
187,
26,
3168
]
},
"6":{
"base_y":300,
"base_y_percent":9.4697,
"height":38,
"height_percent":1.1995,
"page":0,
"page_height":3168,
"raw":[
0,
300,
38,
3168
]
}
}
}
400: Missing required fields
This response occurs when any of required parameter is missing or empty.
{
"type":"validation_error",
"errors":[
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"whisper_hash"
},
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"text_extractor_name"
},
{
"code":"blank",
"detail":"This field may not be blank.",
"attr":"line_numbers"
}
]
}
Line Numbers
This data can be used show the bounding box of each line of text extracted from the document in the original document. Useful for highlighting the extracted text in the original document.
The line metadata contains the bounding box coordinates for each line of text extracted from the document. For each line, an array of arrays is provided with the bounding box coordinates. The bounding box information is represented as a array of four integers: [page_no, y, height, max_height] where:
page_no
: The page number of the document.base_y
: The y-coordinate of the bounding box (bottom).height
: The height of the bounding box.page_height
: The height of the original page.
Highlighting is for the entire row of the original document. The x coordinates are not provided as they will effectively be the width of the page. x1=0
and x2=page_width
always.
(0,y-height) +-------------------------+ (page_width,y-height)
| |
| |
(0,y) +-------------------------+ (page_width,y)
highlight_data = {
"2": { # Example highlight data for line 2
"base_y": 155, # Y-coordinate
"base_y_percent": 4.8927,
"height": 51, # Height
"height_percent": 1.6098,
"page": 0, # Page number
"page_height": 3168, # Page height
"raw": [
0, # Page number
155, # Y-coordinate
51, # Height
3168 # Page height
]
}
}