Hello Ajith,

Tom Browder suggests taking a look at Raku (née Perl6), and I concur. While I don't know Malayalam at all, I can write the regex code below with ease:

> #all code below using the Raku REPL:
> say '0123456789'.chars;
10
> say $/ if '0123456789' ~~ / \d+ /;
「0123456789」
> #now with Bengali digits:
> say '০১২৩৪৫৬৭৮৯'.chars;
10
> say $/ if '০১২৩৪৫৬৭৮৯' ~~ / \d+ /;
「০১২৩৪৫৬৭৮৯」
> #now with Malayalam digits:
> say '൦൧൨൩൪൫൬൭൮൯'.chars;
10
> say $/ if '൦൧൨൩൪൫൬൭൮൯' ~~ / \d+ /;
「൦൧൨൩൪൫൬൭൮൯」

More info here:

https://www.nntp.perl.org/group/perl.perl6.users/2020/06/msg8828.html
https://www.nntp.perl.org/group/perl.perl6.users/2020/06/msg8845.html

HTH, Bill.

On Sun, Jul 19, 2020 at 4:36 AM Ajith R <ajithramay...@yahoo.co.in> wrote:
>
> Hi,
>
> > First, there is a somewhat specific question about unspecified
> > substitutions. For all I know about these substitutions, you might
> > actually need XSLT to do them properly.
>
> The substitution that I had in mind requires referring to characters based
> on their Unicode properties like script, block...
>
> > I think you should absolutely use perl if it makes you happy.
> > Unix has a pretty interesting collection of various small tools (which
> > "do one thing and do it well" as you may hear), and shells facilitate
> > hooking up their outputs and inputs. Almost as if they were made to do
> > just that.
>
> I don't subscribe to using a tool for the sake of happiness. With my
> limited knowledge I want to select one that is adequate to do the job.
> The substitution that I wanted in many text files was deleting text from
> languages other than Malayalam, English and punctuation. This required a
> program that could match characters based on their Unicode block / script.
> I didn't find anything to suggest that sed could do that. Maybe I didn't
> search properly.
> Did I miss a utility (including sed) that can do the kind of substitution I
> mentioned above?
>
> Thanks,
> ajith
>
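
For the substitution described in the quoted message (deleting everything that is not Malayalam, English, or punctuation), a minimal Raku sketch along the same lines might look like the following. It assumes that "English" can be approximated by the Unicode Latin script, and that whitespace should be kept so words stay separated. The sub name keep-malayalam-english and the sample string are hypothetical, made up for illustration; treat this as a starting point rather than a finished solution.

```raku
# Hypothetical helper: keep only Malayalam-script characters, Latin-script
# characters, punctuation (general category P), and separators such as
# spaces (general category Z). Everything else is dropped.
sub keep-malayalam-english(Str $text --> Str) {
    # .comb with a regex returns every match; .join glues them back together.
    $text.comb(/ <:Script<Malayalam> + :Script<Latin> + :P + :Z> /).join
}

# Made-up sample mixing English, Malayalam, and Bengali text.
my $sample = 'Hello, നമസ്കാരം! নমস্কার.';
say keep-malayalam-english($sample);
```

Note that ASCII digits belong to the Common script, so under these assumptions they would be removed as well; adding [0..9] to the character class is one way to keep them.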