Dear Friends,
Greetings from GuRu Prevails!
GuRu Prevails is the only organization in India that offers a training program
in Machine Learning for MS aspirants and corporate employees. As an add-on
module, we offer a 16-hour workshop on Python.
The workshop will be held at our training facility o
>
> > Thanks Kunal, I am happy as well as sad: happy because my job has become
> > much simpler, and sad because every time I think of a novel (honestly)
> > idea, I see someone else has already executed it. :(
>
My intention in telling you about the existing work was to let you know of the
On 24 May 2010 12:06, Rahul R wrote:
> Thanks Kunal, I am happy as well as sad: happy because my job has become
> much simpler, and sad because every time I think of a novel (honestly) idea,
> I see someone else has already executed it. :(
>
> David Heinemeier Hansson once said that reuse i
On Mon, May 24, 2010 at 7:51 PM, Dhananjay Nene wrote:
> You may want to try out pdfminer. It's very similar to xpdf in structure and
> should give you the parsed data in Unicode directly.
>
I tried it, but I got the same output as with xpdf. I guess it's because of the
point mentioned by Gora: 'you might n
I tried it, but it didn't work out well enough. The output is the same as what
I get from xpdf.
On Mon, May 24, 2010 at 7:51 PM, Dhananjay Nene wrote:
> You may want to try out pdfminer. It's very similar to xpdf in structure and
> should give you the parsed data in Unicode directly.
>
> On Mon, May 24, 2010
On Mon, 24 May 2010 19:13:26 +0530
Eknath Venkataramani wrote:
> I have around 45 pdfs to convert into raw text containing text in
> _HINDI_ . When I use the xpdf package, the generated text is very
> weird, so I'd like to write a program which would convert the pdf
> text into Unicode text as it
You may want to try out pdfminer. It's very similar to xpdf in structure and
should give you the parsed data in Unicode directly.
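
A minimal sketch of the pdfminer suggestion, for converting a whole directory
of PDFs in one go. It assumes the maintained pdfminer.six fork, whose
pdfminer.high_level.extract_text helper returns a Unicode string. The function
name convert_all, its directory arguments, and the extract parameter are
hypothetical and only there for illustration. If the PDFs embed fonts with a
non-Unicode encoding, the extracted text may still come out garbled whatever
tool is used.

```python
from pathlib import Path


def convert_all(pdf_dir, out_dir, extract=None):
    """Write one UTF-8 .txt file per PDF in pdf_dir; return the paths written."""
    if extract is None:
        # Assumption: pdfminer.six is installed and provides this helper.
        from pdfminer.high_level import extract_text as extract

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # Sorted so the files are processed in a predictable order.
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        text = extract(str(pdf_path))
        target = out_dir / (pdf_path.stem + ".txt")
        # Always write UTF-8 so Devanagari text survives on any platform.
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

Calling convert_all("pdfs", "text") would leave one .txt file per PDF in the
"text" directory (both directory names are placeholders). The extract
parameter lets another extraction function be swapped in for comparison.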
On Mon, May 24, 2010 at 7:13 PM, Eknath Venkataramani wrote:
> I have around 45 pdfs to convert into raw text containing text in _HINDI_ .
> When I use the xpdf pack
I have around 45 PDFs containing text in _HINDI_ that I need to convert into
raw text. When I use the xpdf package, the generated text is very weird, so I'd
like to write a program that would convert the PDF text into Unicode text as it
is.
The fonts used in the pdfs:
name
Hi Rahul,
Not sure if you'd be interested; however, David Beazley, who was here a couple
of weeks back, has a straightforward implementation of Lex and Yacc in
Python named PLY: http://www.dabeaz.com/ply/index.html
See if this interests you. This might be easier for the porting part, and
then add