Version: 2.0.0

LLMWhisperer Modes

Feature matrix for LLMWhisperer modes

| Feature | Native Text | Low Cost | High Quality | Form |
| --- | --- | --- | --- | --- |
| PDF (not scanned) | | | | |
| PDF (scanned) | | | | |
| PDF (with forms) | | | | |
| Images | | | | |
| MS Office Document | | | | |
| MS Office Excel | | | | |
| MS Office Powerpoint | | | | |
| LibreOffice Writer | | | | |
| LibreOffice Calc | | | | |
| LibreOffice Impress | | | | |
| Checkbox and radio button detection | ✗ No | ✗ No | ✗ No | ✓ Yes |
| Lines reproduction in output | ✗ No | ✓ Yes | ✓ Yes | ✓ Yes |
| Extraction performance | Very fast | Fast | Medium | Medium |
| Image preprocessing (median filter and Gaussian blur) | ✗ No | ✓ Yes | ✗ No | ✗ No |
| Line splitting strategy choice | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes |
| Supported languages | All (Unicode) | 120*+ | 300+ | 300+ |
| Handwriting recognition | ✗ No | Basic support | ✓ Yes | ✓ Yes |
| Layout preserving output | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes |
| AI/ML based enhancement | ✗ No | ✗ No | ✓ Yes | ✓ Yes |
| Rotation and skew compensation | NA | ✗ No | ✓ Yes | ✓ Yes |
| Auto repair PDFs | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes |
| Dense text content | Best performance | Very good | Very good | Very good |
| High entropy content (each page contains a large variety of text sizes) | Best performance | Very good | Very good | Very good |
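
To check which modes cover a given set of requirements, a few Yes/No rows of the matrix can be encoded as data and filtered. The following is a minimal sketch, not part of LLMWhisperer: the feature keys, the `FEATURES` table, and `modes_supporting` are hypothetical names chosen for illustration, and the mode names are the column labels from the matrix rather than API identifiers.

```python
# Hypothetical encoding of a few rows of the feature matrix.
# Each feature maps to the set of modes marked "Yes" for it.
MODES = ["Native Text", "Low Cost", "High Quality", "Form"]

FEATURES = {
    "checkbox_radio_detection": {"Form"},
    "line_reproduction": {"Low Cost", "High Quality", "Form"},
    "image_preprocessing": {"Low Cost"},
    # Low Cost is listed as "Basic support" only, so it is left out here.
    "handwriting": {"High Quality", "Form"},
    "ai_ml_enhancement": {"High Quality", "Form"},
    "rotation_skew_compensation": {"High Quality", "Form"},
}


def modes_supporting(required):
    """Return the modes, in matrix column order, that offer every required feature."""
    return [m for m in MODES if all(m in FEATURES[f] for f in required)]
```

For example, `modes_supporting(["handwriting", "line_reproduction"])` returns the High Quality and Form modes, while asking for both image preprocessing and handwriting returns an empty list, since no single mode is marked "Yes" for both.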
Recommended use cases

Native Text
• Low latency requirement
• All documents are PDFs
• PDFs are native text PDFs

Low Cost
• Cost sensitive application
• High quality scanned PDFs
• High quality scanned images
• No handwritten documents

High Quality
• Medium/low quality scanned PDFs
• Medium/low quality scanned images
• Handwritten documents

Form
• Checkbox and radio button detection
• Medium/low quality scanned PDFs
• Medium/low quality scanned images
• Handwritten documents
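
The recommendations above can be read as a simple decision rule. The sketch below is one possible interpretation under stated assumptions, not an official selection algorithm: `DocumentProfile`, its fields, and `recommend_mode` are hypothetical names, and the returned strings are the mode labels used on this page, not API parameter values.

```python
from dataclasses import dataclass


@dataclass
class DocumentProfile:
    """Hypothetical description of the documents to be processed."""
    native_text_pdf: bool = False    # PDFs with an embedded text layer
    has_checkboxes: bool = False     # checkbox / radio button detection needed
    handwritten: bool = False        # documents contain handwriting
    high_quality_scan: bool = True   # scans are clean and legible


def recommend_mode(profile: DocumentProfile) -> str:
    """Map a document profile to a mode label using the listed use cases."""
    # Only Form mode detects checkboxes and radio buttons.
    if profile.has_checkboxes:
        return "Form"
    # Native text PDFs without handwriting suit the fastest mode.
    if profile.native_text_pdf and not profile.handwritten:
        return "Native Text"
    # Handwriting or medium/low quality scans call for High Quality.
    if profile.handwritten or not profile.high_quality_scan:
        return "High Quality"
    # Remaining case: high quality scans with no handwriting.
    return "Low Cost"
```

Checking for checkboxes first reflects that Form is the only mode listed with checkbox and radio button detection; the remaining order is a judgment call and can be rearranged to prioritize cost or latency.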