Hi.

On Sun, 08 Jan 2017 10:11:26 +0100
Hans <hans.ullr...@loop.de> wrote:

> Hi all, 
> 
> I have a little problem with using grep.
> 
> The problem: 
> 
> I have a wordlist with 3.5 million words in ASCII. Now I want to filter out
> all words with 5, 6, 7, 8, 9 and 10 characters into separate lists. The
> wordlist contains all sorts of characters: alphanumeric ones, and special
> characters like "^", "]" and others.
> So it must work the same whatever character grep reads. I found this:
> 
> grep -o -w -E '^[[:alnum:]]{5}' file1
> 
> 
> But it looks like it only matches text. I read the grep manual, and I see
> there are more options to choose from. But I did not completely understand
> whether I have to add every option, or whether there is one option that
> covers every kind of character.

As it should be. regex(7) specifies that character classes are defined
in wctype(3), which states that '[[:alnum:]]' merely implements
isalnum(3), which, in turn, is defined as (isalpha(c) || isdigit(c)).
So it never matches characters such as '^' or ']'.
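
A quick way to see this for yourself is sketched below. The two sample
lines are made up for illustration, and LC_ALL=C is only there to pin
the locale; only the purely alphanumeric line should come back.

```shell
# Two hypothetical five-character lines: one alphanumeric, one containing '^' and ']'.
matched=$(printf '%s\n' 'abc12' 'ab^]e' | LC_ALL=C grep -E '^[[:alnum:]]{5}$')

# Print whatever matched; the line with '^' and ']' is rejected by [[:alnum:]].
echo "$matched"
```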

So what you really need, for lines of exactly five characters (note the
final '$'), is:

egrep '^.{5}$' file1

or, if you need whole words (i.e. need to exclude spaces):

egrep '^[^ ]{5}$' file1
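
For the separate lists you asked about (5 to 10 characters), the '^.{5}$'
pattern above can be wrapped in a loop. This is only a rough sketch: the
sample file1 contents and the output names words_N.txt are made up here,
so substitute your real wordlist and whatever names you like. It uses
grep -E, which is equivalent to egrep, and sets LC_ALL=C so that every
byte counts as one character.

```shell
# Hypothetical sample input; replace this with your real wordlist in file1.
printf '%s\n' 'abc' 'abcde' 'ab^]e' 'ab de' 'abcdef' 'abcdefg' \
    'abcdefgh' 'abcdefghi' 'abcdefghij' 'abcdefghijk' > file1

# Write one list per length. The names words_N.txt are an assumption.
for n in 5 6 7 8 9 10; do
    # "^.{N}$" matches lines of exactly N characters, whatever they are.
    LC_ALL=C grep -E "^.{$n}\$" file1 > "words_$n.txt"
done
```

If lines containing spaces must be left out, swapping the '.' for '[^ ]'
in the pattern should do it.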

Reco
