The first step in extracting information from a PDF file is to convert it
to text with the pdftotext -layout command.
Is there a Python tool or library that describes the layout of a page
with ASCII characters and helps in identifying and extracting the useful
pieces of information? For example, a function that selects N characters
on line I starting from column Y.
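
To make the request concrete, here is a minimal sketch of such a function
in plain Python. The name select_chars and its parameters are hypothetical,
not taken from any existing library, and it assumes that line and column
numbers are 0-based and that the text has already been produced by
pdftotext -layout.

```python
def select_chars(text, line, col, n):
    """Return up to n characters from the given line, starting at column col.

    Assumption: line and col are 0-based. An empty string is returned when
    the requested position lies outside the text.
    """
    # Split on "\n" only: pdftotext separates pages with form feeds ("\f"),
    # and str.splitlines() would treat those as extra line breaks.
    lines = text.split("\n")
    if line < 0 or line >= len(lines) or col < 0 or n <= 0:
        return ""
    return lines[line][col:col + n]


# Build a small two-line sample with fixed column positions.
sample = "Invoice No:".ljust(14) + "12345" + "\n" + "Date:".ljust(14) + "2015-06-01"
print(select_chars(sample, 0, 14, 5))  # prints 12345
```

Slicing past the end of a short line returns whatever characters are
there, so callers may want to strip or pad the result.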

If no such tool is available, what do you think is the best structure
for describing a two-dimensional page layout in Python?
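
One possible structure, offered only as a sketch: each page is a list of
strings padded to the same width, so that any rectangular region can be
cut out with two slices. The names PageGrid, region and load_pages are
hypothetical, indices are assumed to be 0-based, and pages are assumed to
be separated by form feed characters, which is how pdftotext marks page
breaks by default.

```python
class PageGrid:
    """A page stored as equal-width rows of characters (0-based indices)."""

    def __init__(self, text):
        rows = text.split("\n")
        # Pad every row to the width of the longest one so columns line up.
        self.width = max((len(row) for row in rows), default=0)
        self.rows = [row.ljust(self.width) for row in rows]

    def chars(self, line, col, n):
        """Return n characters on one line starting at column col."""
        cells = self.region(line, col, 1, n)
        return cells[0] if cells else ""

    def region(self, top, left, height, width):
        """Return a rectangular block as a list of strings."""
        if top < 0 or left < 0:
            return []
        return [row[left:left + width] for row in self.rows[top:top + height]]


def load_pages(text):
    """Split pdftotext output into one PageGrid per page."""
    return [PageGrid(page) for page in text.split("\f")]


# Two-page sample: a small table on page one, a single line on page two.
page_one = "Name".ljust(7) + "Qty" + "\n" + "Bolt".ljust(9) + "4" + "\n"
pages = load_pages(page_one + "\f" + "Page two")
print(pages[0].region(0, 7, 2, 3))
```

A list of padded strings keeps memory use low and makes column slicing
trivial. A NumPy character array would be an alternative if whole columns
need to be scanned often.
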
-- 
https://mail.python.org/mailman/listinfo/python-list
