Releases: VikParuchuri/surya
Refactor surya; new table recognition model
Refactor
This is a complete refactor of surya - the code is now cleaner and better organized. Models are now imported and used differently. Here is an example for OCR:
from PIL import Image
from surya.recognition import RecognitionPredictor
from surya.detection import DetectionPredictor
image = Image.open(IMAGE_PATH)
langs = ["en"] # Replace with your languages or pass None (recommended to use None)
recognition_predictor = RecognitionPredictor()
detection_predictor = DetectionPredictor()
predictions = recognition_predictor([image], [langs], detection_predictor)
See the README for how to use other models.
Table recognition
There is a new table recognition model which detects colspans/rowspans better, along with header cells. It is also simpler to use, since it operates on the images alone rather than on the images and bboxes.
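A minimal sketch of the image-only call pattern, assuming the `TableRecPredictor` class in `surya.table_rec` as described in the README for this release; check the README for the exact interface. The wrapper `recognize_tables` and its lazy imports are hypothetical conveniences, not part of surya.

```python
def recognize_tables(image_paths):
    """Run table recognition on image files; only images are passed, no bboxes."""
    # Imported lazily so this module loads even when surya is not installed.
    from PIL import Image
    from surya.table_rec import TableRecPredictor  # assumed import path, see README

    images = [Image.open(path) for path in image_paths]
    table_rec_predictor = TableRecPredictor()
    # The predictor is called with a list of images only.
    return table_rec_predictor(images)
```

Calling `recognize_tables(["table.png"])` would load the model on first use; `table.png` is a placeholder path.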
What's Changed
- Layout improvements by @VikParuchuri in #267
- New table model; total refactor by @VikParuchuri in #279
- Add ci workflow by @VikParuchuri in #284
Full Changelog: v0.8.3...v0.9.0
Pin pypdfium2
Pin pypdfium2 version - newest version can cause issues.
New layout model
Layout model is twice as fast and more accurate.
What's Changed
- Update layout model by @VikParuchuri in #270
Full Changelog: v0.8.1...v0.8.2
Add bad OCR detection model
- Add a model to detect bad OCR text
- Add top_k predictions to layout
- Add in test suite
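To illustrate what `top_k` predictions mean for a layout classifier, here is a minimal sketch. It is not surya's implementation: the function name `top_k_labels`, the label names, and the scores are all hypothetical. It converts raw scores into softmax probabilities and keeps the k most likely labels.

```python
import math

def top_k_labels(scores, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    # Softmax with max-subtraction for numerical stability.
    max_score = max(scores.values())
    exps = {label: math.exp(s - max_score) for label, s in scores.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    # Sort by probability, highest first, and keep the first k.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Hypothetical raw scores for one layout block.
example_scores = {"Text": 2.0, "Table": 1.0, "Figure": 0.1, "Caption": -1.0}
top2 = top_k_labels(example_scores, k=2)
```

Returning several ranked candidates instead of a single label lets downstream code fall back to the second choice when the top prediction has low confidence.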
What's Changed
- Add OCR Error Detection Model by @tarun-menta in #261
- Add `top_k` to Surya Layout and Fix Confidence Value Issue by @iammosespaulr in #263
- Bad OCR detection model by @VikParuchuri in #268
New Contributors
- @tarun-menta made their first contribution in #261
Full Changelog: v0.8.0...v0.8.1
Surya Bugfixes and Improvements to `pdftext`
Update to the latest `pdftext` release, incorporating heuristic-based segmentation for enhanced performance and accuracy.
Full Changelog: v0.7.0...v0.8.0
New layout model
- New layout model that detects more block types, includes ordering, and performs better
- Remove ordering model
- Refactor internals to use common model definitions and processors
- Add flags for easier compilation
What's Changed
- Fix detection results being mutated in place by @iammosespaulr in #236
- Add support for compiled inference by @iammosespaulr in #237
- Dev by @VikParuchuri in #243
- Layout2 by @VikParuchuri in #249
- New layout model by @VikParuchuri in #252
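Results being mutated in place, as in the first item above, is a common class of bug. The sketch below is illustrative only and is not the code from #236: `scale_boxes_inplace` and `scale_boxes` are hypothetical helpers. It shows how postprocessing that edits its input changes the caller's data, and how returning new objects avoids that.

```python
def scale_boxes_inplace(boxes, factor):
    """Buggy pattern: rescales the caller's boxes in place."""
    for box in boxes:
        for i in range(len(box)):
            box[i] = box[i] * factor
    return boxes

def scale_boxes(boxes, factor):
    """Safer pattern: builds new lists and leaves the input untouched."""
    return [[coord * factor for coord in box] for box in boxes]

original = [[0, 0, 10, 20], [5, 5, 15, 25]]
scaled = scale_boxes(original, 2)
# original is unchanged here; scaled holds the doubled coordinates.

mutated_input = [[1, 2, 3, 4]]
scale_boxes_inplace(mutated_input, 2)
# mutated_input itself now holds the doubled coordinates.
```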
Full Changelog: v0.6.13...v0.7.0
Performance Improvements
- Don't pass around heatmaps by default
- Use threadpool vs processpool
- Optimize fn for postprocessing
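As a rough illustration of the thread pool vs. process pool choice, the sketch below runs a hypothetical `postprocess` function over a batch with `concurrent.futures.ThreadPoolExecutor` from the standard library. It is not surya's code. Threads avoid the pickling and process start-up costs of a process pool, which can matter when each task receives large arrays; `ProcessPoolExecutor` exposes the same `map` interface if separate processes are needed.

```python
from concurrent.futures import ThreadPoolExecutor

def postprocess(item):
    """Hypothetical stand-in for per-image postprocessing."""
    return sum(item) / len(item)

def postprocess_batch(batch, max_workers=4):
    # Threads share memory with the caller, so inputs are not pickled.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(postprocess, batch))

results = postprocess_batch([[1, 2, 3], [10, 20], [5]])
```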
Flatten table form fields, overlap postprocessing
- Overlap postprocessing if batch size >1
- Flatten form fields into the PDF for tables
Transformers 4.46, torch 2.5 fixes
- Fixes cudnn backend issue with torch 2.5
- Fixes issue with sdpa and transformers 4.46
- Flatten form fields into pdf by default for table rec
Threads
Threads cause issues on a small percentage of devices. Although they give good speedups on most devices, supporting them seems like a bad idea.