Commit

Merge pull request #160 from jze/upstream
Made output more compatible with the hOCR spec.
zuphilip authored Dec 4, 2016
2 parents 976a3ba + 060ff21 commit d472d29
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions ocropus-hocr
@@ -63,7 +63,9 @@ for arg in args.files:
     base,_ = ocrolib.allsplitext(arg)
     try:
         E("===",arg)
-        P("<div class='ocr_page' title='file %s'>"%arg)
+        image = ocrolib.read_image_binary(arg)
+        height, width = image.shape
+        P("<div class='ocr_page' title='image %s; bbox 0 0 %d %d'>"%(arg,width,height))

         # to proceed, we need a pseg file and a
         # subdirectory containing text lines
@@ -88,7 +90,7 @@ for arg in args.files:
             # and insert paragraph breaks as needed

             id = regions.id(i)
-            y0,x0,y1,x1 = regions.bboxMath(i)
+            y0,x0,y1,x1 = regions.bbox(i)
             if last_coords is not None:
                 lx0,ly0 = last_coords
                 dx,dy = x0-lx0,y1-ly0
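
The first hunk replaces the `file` property in the page title with an `image` property plus a page-level `bbox`. A minimal sketch of that formatting step follows, assuming the image is a 2-D array whose shape is `(height, width)`, as in the diff. The helper name `page_title` and the sample file name are hypothetical and are not part of `ocropus-hocr`.

```python
def page_title(image_name, shape):
    """Build the title string for an ocr_page element.

    `shape` is (height, width), the order a 2-D image array reports,
    while the bbox property lists x before y, so the two are swapped.
    """
    height, width = shape
    return "image %s; bbox 0 0 %d %d" % (image_name, width, height)


# Hypothetical page: 200 rows (height) by 300 columns (width).
title = page_title("0001.bin.png", (200, 300))
print("<div class='ocr_page' title='%s'>" % title)
```

The point of the swap is that the array shape is row-major while the `bbox` property is written as `x0 y0 x1 y1`.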
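
The second hunk switches from `regions.bboxMath(i)` to `regions.bbox(i)`. The sketch below illustrates why the choice of coordinate convention matters for hOCR output; it assumes that `bbox` returns raster coordinates `(y0, x0, y1, x1)` with the origin at the top left, and that a "math" convention measures y from the bottom of the page. Both helper functions are hypothetical illustrations, and `flip_vertical` is not claimed to match ocrolib's exact formula.

```python
def hocr_bbox(raster_bbox):
    """Format a (y0, x0, y1, x1) raster box as an hOCR bbox property.

    hOCR writes boxes as "bbox x0 y0 x1 y1" with y growing downward
    from the top-left corner of the page.
    """
    y0, x0, y1, x1 = raster_bbox
    return "bbox %d %d %d %d" % (x0, y0, x1, y1)


def flip_vertical(raster_bbox, page_height):
    """Illustrative bottom-left-origin variant of the same box
    (an assumption about what a 'math' convention looks like)."""
    y0, x0, y1, x1 = raster_bbox
    return (page_height - y1, x0, page_height - y0, x1)


line_box = (10, 20, 50, 120)          # a text line near the top of the page
print(hocr_bbox(line_box))
print(hocr_bbox(flip_vertical(line_box, 200)))
```

Under these assumptions, the flipped box places a line from the top of the page near the bottom, which is the kind of mismatch that top-left raster coordinates avoid.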
