Posted on 2012-07-25

Authors: Syed Saqib Bukhari, Faisal Shafait and Thomas M. Breuel

Keywords:

ridge
printed text
non-text segmentation
gaussian-filter bank
reading order

Q1: How is line skew determined?

The Gaussian kernel used to produce the ridges has a $\theta$ parameter. It could be used to detect skew, but since it is constant for an entire page, performance will probably degrade when the line skew varies across the page.
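To make the role of $\theta$ concrete, here is a minimal sketch of an oriented anisotropic Gaussian kernel in NumPy. The function name, the 3-sigma window size, and the normalisation are my own assumptions for illustration; the paper's exact kernel definition may differ.

```python
import numpy as np

def oriented_gaussian_kernel(sigma_x, sigma_y, theta):
    """Anisotropic 2-D Gaussian whose long axis is rotated by theta (radians).

    Hypothetical helper, not taken from the paper.
    """
    # Assumed window: cover 3 standard deviations of the larger sigma.
    half = int(np.ceil(3 * max(sigma_x, sigma_y)))
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

    # Rotate image coordinates into the kernel's own frame.
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)

    kernel = np.exp(-0.5 * ((u / sigma_x) ** 2 + (v / sigma_y) ** 2))
    return kernel / kernel.sum()  # normalise so the weights sum to 1
```

With `sigma_x` larger than `sigma_y`, the kernel is elongated along the direction given by `theta`, so a single fixed `theta` favours text lines of one orientation only.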

Q2: How are non-text portions detected?

The paper does not include a description, but cites "S. S. Bukhari, F. Shafait, and T. M. Breuel, “Improved document image segmentation algorithm using multiresolution morphology,” in Proc. SPIE Document Recognition and Retrieval XVIII, San Jose, CA, USA, Jan. 2011" as a source for an improved technique.

Q3: Which heuristics are used in reading order determination?

The paper attributes the reading order algorithm to "T. M. Breuel, “High performance document layout analysis,” in Symposium on Document Image Understanding Technology, Greenbelt, MD, USA, April 2003." The authors say they modified the algorithm for right-to-left scripts, but again give no details.

Q4: How large is the dataset and what does it contain?

25 Arabic documents and 20 Urdu documents are used.

Q5: Are there any techniques applicable to divans?

There might be, if any of them were described in detail. We already have more sophisticated text line detection techniques. For the rest, I'll need to read the cited works.