Thank you, Shree. I will let my colleague know and continue this discussion there.
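For whoever picks this up on the issue: below is a minimal sketch of how the two language orderings could be compared side by side. The helper names (`build_config`, `configs_for_all_orders`, `ocr_with_each_order`) are hypothetical and only illustrate the idea. The option string mirrors the one from my earlier message, and `custom_tha` stands for our custom model name.

```python
from itertools import permutations


def build_config(langs, oem=1, psm=3, preserve_spaces=True):
    """Build a Tesseract option string such as '-l eng+tha --oem 1 --psm 3'."""
    parts = ["-l", "+".join(langs), "--oem", str(oem), "--psm", str(psm)]
    if preserve_spaces:
        parts += ["-c", "preserve_interword_spaces=1"]
    return " ".join(parts)


def configs_for_all_orders(langs):
    """Return one option string per ordering of the given language names."""
    return [build_config(order) for order in permutations(langs)]


def ocr_with_each_order(image, langs):
    """Run OCR once per language ordering and return {config: text}.

    pytesseract is imported lazily so the helpers above work without it.
    """
    import pytesseract  # third-party; needs a local Tesseract install

    return {
        cfg: pytesseract.image_to_string(image, config=cfg)
        for cfg in configs_for_all_orders(langs)
    }
```

Calling `ocr_with_each_order(im_name, ["eng", "custom_tha"])` would return one result per ordering, which should make it easier to see whether only the first language is being applied to mixed lines.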
On Wednesday, 3 October 2018 at 00:21:27 UTC+7, shree wrote:
>
> Please see https://github.com/tesseract-ocr/tesseract/issues/1579
> and continue further discussion there.
>
> On Tue, Oct 2, 2018 at 9:52 AM Shree Devi Kumar <shree...@gmail.com> wrote:
>
>> There is an open issue in the issue tracker describing a similar problem.
>> It will help to move the discussion there.
>>
>> I will test with your sample image and also post a link to the issue.
>>
>> On Tue, 2 Oct 2018, 01:01 Rujrawee K, <hevalina...@gmail.com> wrote:
>>
>>> OK, Shree, I miscommunicated with my colleague. He said this problem
>>> occurs with both the default and the custom-trained models. That is, no
>>> matter which models are used, if a model was trained on a single language
>>> (with no other language in the training process) and is combined with
>>> another model via "-l", a line containing both languages is read in only
>>> one language, while a line containing a single language is read correctly
>>> (please see the result below for a clearer explanation).
>>>
>>> My answers are as follows:
>>>
>>> 1. We trained for LSTM.
>>> 2. We used "tessdata_best".
>>> 3. The code is shown below:
>>>
>>> import cv2
>>> import pytesseract
>>>
>>> config_name = ('-l eng+tha --oem 1 --psm 3 -c preserve_interword_spaces=1')
>>> im_name = cv2.imread(img_path_name, cv2.IMREAD_COLOR)
>>> text_name = pytesseract.image_to_string(im_name, config=config_name)
>>> print(text_name)
>>>
>>> [image: en_th.jpg]
>>>
>>> *The result is:* [image: result.jpg]
>>>
>>> As you can see, if a line of the input image contains both languages
>>> (English and Thai), it is read in only one language, but a line containing
>>> a single language is read in the correct language. Both models here are
>>> the default ones (the result is the same with the custom model).
>>>
>>> On Tuesday, 2 October 2018 at 10:14:11 UTC+7, shree wrote:
>>>>
>>>> 1. Have you trained for the legacy Tesseract engine or for LSTM?
>>>>
>>>> 2. Which default traineddata are you using?
>>>>
>>>> 3. For us to test, please provide an image, the commands used for
>>>> testing, and the output you got.
>>>>
>>>> On Mon, Oct 1, 2018 at 11:08 PM Rujrawee K <hevalina...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Shree,
>>>>> Yes, we tried that and it works OK. My problem is that when I train a
>>>>> new Thai model and then use it with the default eng model from
>>>>> Tesseract 4, as in "-l custom_tha+eng", it reads only the language that
>>>>> comes first in the command, in this case "custom_tha". The result is
>>>>> the same for "-l eng+custom_tha": it reads only "eng". When both
>>>>> languages use the default models from Tesseract 4, it reads both
>>>>> languages at the same time without a problem, apart from the accuracy.
>>>>> Did I miss something?
>>>>>
>>>>> On Tuesday, 2 October 2018 at 08:26:48 UTC+7, shree wrote:
>>>>>>
>>>>>> Have you tried
>>>>>>
>>>>>> https://github.com/tesseract-ocr/tessdata_fast/blob/master/script/Thai.traineddata
>>>>>>
>>>>>> which is supposed to support both Thai and English?
>>>>>>
>>>>>> On Mon, Oct 1, 2018 at 5:33 AM Rujrawee K <hevalina...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> After I trained my custom Thai language model for use with Tesseract
>>>>>>> 4, it works fine (accuracy aside), but it cannot read English because
>>>>>>> English is not included in the model. So I tried to combine my custom
>>>>>>> Thai model with the default eng model using "-l custom_tha+eng". The
>>>>>>> output shows that Tesseract still cannot read the English text. When
>>>>>>> I swap to "-l eng+custom_tha", it reads the English text but not the
>>>>>>> Thai text. It is as if Tesseract uses only one model to read the
>>>>>>> text. But when using both the tha and eng default models from
>>>>>>> Tesseract 4, it works fine.
>>>>>>> *My question is:* why does this happen, and is there any solution or
>>>>>>> suggestion for this problem?
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/5cd91f67-0aa1-40a3-a605-4b90d413b2cd%40googlegroups.com.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ____________________________________________________________
>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com