[tesseract-ocr] Re: image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Hey guys, I have now got as far as having the picture in really rich gray tones, so that not everything is so "noisy" (image.convert('L') instead of image.convert('1')). But still no output. I think I really need to crop out the text and then remove the background. Maybe an expert can show me the best wa
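
A minimal sketch of the conversion discussed in the message above, assuming Pillow is installed. In Pillow, convert('1') applies Floyd-Steinberg dithering by default, which may be where the "noisy" look comes from; this sketch goes through convert('L') and then applies a fixed threshold instead. The function name to_binary and the threshold of 128 are illustrative assumptions, not taken from the thread.

```python
from PIL import Image


def to_binary(img, threshold=128):
    """Return a black-and-white version of img without dithering.

    to_binary and the default threshold are hypothetical choices for
    illustration; a real image may need a different threshold.
    """
    # Reduce to 8-bit grayscale first.
    gray = img.convert("L")
    # Map every pixel: brighter than the threshold becomes white, the rest black.
    return gray.point(lambda p: 255 if p > threshold else 0)


if __name__ == "__main__":
    # Tiny synthetic image: one dark pixel, one light pixel.
    demo = Image.new("RGB", (2, 1))
    demo.putpixel((0, 0), (10, 10, 10))
    demo.putpixel((1, 0), (240, 240, 240))
    result = to_binary(demo)
    print(result.getpixel((0, 0)), result.getpixel((1, 0)))
```

The result is still a mode "L" image containing only the values 0 and 255, so it can be saved or passed on for OCR like any other grayscale image.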

Re: [tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Thank you for the ever so nice, positive-looking and detailed help. I really feel like I can handle it by myself, really. Thank you so much. May the force be with you. On Friday, 5 May 2017 18:37:56 UTC+2, zdenop wrote: > > Really? And you think your image fits those examples? > E.g. texts are in

Re: [tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread Zdenko Podobný
Really? And you think your image fits those examples? E.g. the text is in straight lines, there is no noise, just the text, the DPI is OK, etc.? You will never get good output from bad input. Zdenko On Fri, May 5, 2017 at 10:31 AM, anita josic wrote: > Hi > > I read it now, but still don't know what I

[tesseract-ocr] Re: train a new font for the Persian language

2017-05-05 Thread shree
There is already Farsi/Persian traineddata for tesseract-ocr 4.0-alpha at https://github.com/tesseract-ocr/tessdata/raw/master/fas.traineddata Have you given it a try? Which font do you want to add to it? On Thursday, May 4, 2017 at 6:06:09 PM UTC+5:30, Ava Nimaee wrote: > > hi every one. i want

Re: [tesseract-ocr] Re: train a new font for the Persian language

2017-05-05 Thread universal reseller
First you must select a version and install it on your machine, then try to train your fonts. Admittedly it does not fully support Farsi yet, but its output is not bad either.

[tesseract-ocr] Any hints for Arabic user custom traineddata (e.g. new font)

2017-05-05 Thread bmwmine
Hi everybody, I am creating corpus data for training. It is more than 9000 pages (like 3 encyclopedias) of TIFF images in the "Traditional Arabic" font (I will add other fonts later). I will use Tesseract 4.00 alpha LSTM. Any hints would be useful. I took the same config as the recent ara.traineddata (12 MB). set char_spacing

[tesseract-ocr] Re: train a new font for the Persian language

2017-05-05 Thread bmwmine
> > Refer to this tutorial: > https://wn.com/training_tesseract_ocr_for_arabic_language_tutorial > It is for Arabic, but the two have similarities. Tesseract has no GUI. Once you have had a preview of what is going on, you can use the command line: https://github.com/tesseract-ocr/tesseract/wiki/Tr

Re: [tesseract-ocr] Re: image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Hello again, I tried to follow these instructions for using bazaar: https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc#config-files-and-augmenting-with-user-data I now have /usr/share/tesseract-ocr/tessdata/eng.user-words: (contains DHL and other words the i

Re: [tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Hi Zdenko, I have read it now, but I still don't know what I need to use. I have already read a lot, but I still don't know which part is missing. I am hoping for real feedback and help. I am not really making progress trying things on my own, as you can see. On Friday, 5 May 2017 09:23:58 UTC+2, zden

Re: [tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Hi, I have read it now, but I still don't know what I need to use. I have already read a lot, but I still don't know which part is missing. I am hoping for real feedback and help. I am not really making progress trying things on my own, as you can see. On Friday, 5 May 2017 09:23:58 UTC+2, zdenop wrote: > >

Re: [tesseract-ocr] Re: image_to_string() always returns an empty string (Python)

2017-05-05 Thread Zdenko Podobný
Did you read https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality? Zdenko On Fri, May 5, 2017 at 10:25 AM, anita josic wrote: > > Using > tesseract --tessdata-dir /usr/share/tesseract-ocr temp2.jpg -l eng -psm 20 > text > > in the terminal, I get the output > ‘33:; > in text.txt. Well

[tesseract-ocr] Re: image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Using tesseract --tessdata-dir /usr/share/tesseract-ocr temp2.jpg -l eng -psm 20 text in the terminal, I get the output ‘33:; in text.txt. Well, that is at least something, but far from what I intended to get. Looking forward to answers. On Friday, 5 May 2017 09:10:49 UTC+2, a
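
A call like the one in the message above can also be scripted from Python. This is a sketch only: build_tesseract_cmd and run_tesseract are hypothetical helper names, and the default psm=6 is an assumption. The documented page segmentation modes run from 0 to 13 in Tesseract 4.x (0 to 10 in 3.x), so 20 falls outside that range. Tesseract 3.x spells the flag -psm, while 4.x uses --psm.

```python
import subprocess


def build_tesseract_cmd(image_path, output_base, lang="eng", psm=6,
                        tessdata_dir=None, psm_flag="-psm"):
    """Build the argument list for the tesseract command-line tool.

    Hypothetical helper; psm=6 (single uniform block of text) is an
    assumed default, not a value from the thread.
    """
    cmd = ["tesseract"]
    if tessdata_dir:
        cmd += ["--tessdata-dir", tessdata_dir]
    # Canonical order: image, output base name, then options.
    cmd += [image_path, output_base, "-l", lang, psm_flag, str(psm)]
    return cmd


def run_tesseract(image_path, output_base, **kwargs):
    """Run tesseract and return the recognized text.

    Requires the tesseract binary on PATH; tesseract writes its
    result to <output_base>.txt.
    """
    subprocess.run(build_tesseract_cmd(image_path, output_base, **kwargs),
                   check=True)
    with open(output_base + ".txt", encoding="utf-8") as fh:
        return fh.read()


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_tesseract_cmd("temp2.jpg", "text",
                              tessdata_dir="/usr/share/tesseract-ocr"))
```

Passing the arguments as a list avoids shell quoting problems with file names, and check=True raises an error if tesseract exits with a failure instead of silently returning an empty result.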

Re: [tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread Zdenko Podobný
Did you read https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality? Zdenko On Fri, May 5, 2017 at 9:10 AM, anita josic wrote: > > > Hello > > I a

[tesseract-ocr] image_to_string() always returns an empty string (Python)

2017-05-05 Thread anita josic
Hello, I am trying to extract text from a picture, but I always get an empty text. The picture used in the code for image_to_string('temp2.jpg') is added