Try playing with the leptonica pixAutoPhotoinvert function [1].
A quick test with the following C code snippet produced the attached result:

PIX *pix = pixRead("des_resume3.png");
PIX *pix1 = pixThresholdToBinary(pix, 170);  /* input must be 4 or 8 bpp */
/* thresh is a binarization threshold; its value is not shown here */
PIX *autoinverted = pixAutoPhotoinvert(pix1, thresh, NULL, NULL);
pixWrite("autoinverted.png", autoinverted, IFF_PNG);
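
To make the idea behind the two calls easier to follow without leptonica installed, here is a minimal sketch in plain C. It is not leptonica's implementation: pixAutoPhotoinvert locates inverted regions inside a page, whereas this simplified stand-in treats one buffer (for example, one cropped patch) as a whole. The helper names threshold_to_binary and invert_if_mostly_dark are hypothetical, and the "more than half foreground" rule is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Binarize 8-bit gray samples: values below thresh become 1 (black
 * foreground), everything else becomes 0 (white background). This is
 * the same convention pixThresholdToBinary() uses. */
void threshold_to_binary(const uint8_t *gray, size_t n,
                         int thresh, uint8_t *bin)
{
    for (size_t i = 0; i < n; i++)
        bin[i] = (gray[i] < thresh) ? 1 : 0;
}

/* Hypothetical helper: if more than half of the pixels are foreground,
 * assume light text on a dark patch and flip every pixel.
 * Returns 1 if the buffer was inverted, 0 if it was left alone. */
int invert_if_mostly_dark(uint8_t *bin, size_t n)
{
    size_t fg = 0;
    for (size_t i = 0; i < n; i++)
        fg += bin[i];
    if (2 * fg <= n)
        return 0;              /* mostly background: nothing to do */
    for (size_t i = 0; i < n; i++)
        bin[i] ^= 1;           /* swap foreground and background */
    return 1;
}
```

Calling threshold_to_binary on a patch and then invert_if_mostly_dark on the result turns light text on a dark patch into dark text on a light background, which is the effect the leptonica calls above aim for on the colored skill boxes.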

[1]
https://github.com/DanBloomberg/leptonica/blob/f7a4bdc48f54c973e6b7c47b9181ac0ef0bd2089/src/pageseg.c#L2370

Zdenko


On Wed, 6 Jan 2021 at 17:43, Deepak Sharma <dee...@intellectfaces.co.in>
wrote:

> I am trying to preprocess resumes for building an OCR model. Please refer
> to the reference image attached to this message.
> As you can see, under the skills section, all the skills are surrounded by
> a bluish-green patch. How can I remove those colors from the
> image?
> Ideally, after preprocessing, the image should be just a white background
> with black text.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/bc43973f-a2fb-40d7-af07-792fbebe04bdn%40googlegroups.com
> .
>

