I tried the Leptonica pixAutoPhotoinvert function [1]. A quick test with the following C snippet produced the attached result:

    PIX *pix = pixRead("des_resume3.png");
    /* Binarize: pixels darker than 170 become foreground. */
    PIX *pix1 = pixThresholdToBinary(pix, 170);
    /* thresh is the threshold argument; no mask or debug output is requested. */
    PIX *autoinverted = pixAutoPhotoinvert(pix1, thresh, NULL, NULL);
    pixWrite("autoinverted.png", autoinverted, IFF_PNG);

[1] https://github.com/DanBloomberg/leptonica/blob/f7a4bdc48f54c973e6b7c47b9181ac0ef0bd2089/src/pageseg.c#L2370

Zdenko

On Wed, 6 Jan 2021 at 17:43, Deepak Sharma <dee...@intellectfaces.co.in> wrote:

> I am trying to preprocess resumes for building an OCR model. Please refer
> to the reference image attached in this message.
> As you can see, under the skills section, all the skills are surrounded by
> a bluish-green patch. How can I remove those colors from the image?
> Ideally, after preprocessing, the image should be just white (background)
> with black text.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/bc43973f-a2fb-40d7-af07-792fbebe04bdn%40googlegroups.com.
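
As a rough illustration of the two steps in the snippet (binarize at a threshold, then invert regions that come out mostly dark), here is a simplified plain-C sketch on a raw 8-bit grayscale buffer. It is not Leptonica's implementation: pixAutoPhotoinvert locates the inverted regions itself, while this toy version is handed a rectangle and applies a majority rule that is only an assumption for the example. The function names threshold_to_binary and invert_block_if_dark are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Binarize an 8-bit grayscale buffer: 1 = foreground (dark), 0 = background.
 * Uses the same convention as pixThresholdToBinary: value < thresh -> 1. */
void threshold_to_binary(const uint8_t *gray, uint8_t *bin,
                         size_t n, int thresh)
{
    for (size_t i = 0; i < n; i++)
        bin[i] = (gray[i] < thresh) ? 1 : 0;
}

/* Hypothetical helper: flip every pixel of one rectangular block in place
 * if more than half of its pixels are foreground (e.g. light text on a
 * dark patch). Returns 1 if the block was inverted, 0 otherwise.
 * The caller must keep the rectangle inside the image. */
int invert_block_if_dark(uint8_t *bin, size_t width,
                         size_t x0, size_t y0, size_t bw, size_t bh)
{
    size_t on = 0;

    /* Count foreground pixels inside the block. */
    for (size_t y = y0; y < y0 + bh; y++)
        for (size_t x = x0; x < x0 + bw; x++)
            on += bin[y * width + x];

    /* Majority rule (an assumption of this sketch). */
    if (2 * on <= bw * bh)
        return 0;

    for (size_t y = y0; y < y0 + bh; y++)
        for (size_t x = x0; x < x0 + bw; x++)
            bin[y * width + x] ^= 1;
    return 1;
}
```

With a threshold of 170, a colored patch that binarizes to mostly foreground would be flipped back to dark text on a white background, while ordinary text regions stay untouched.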