Re: MUA statistics

Rob 'Feztaa' Park Sat, 10 Nov 2001 16:04:03 -0800

On Sat, Nov 10, 2001 at 06:20:32PM -0500, David T-G (dis)graced my inbox with:
> ...and then Rob 'Feztaa' Park said...
> % On Sat, Nov 10, 2001 at 01:56:28PM +0530, Prahlad Vaidyanathan (dis)graced my inbox with:
> % > > the 161 Web clients is one person (my best friend, he uses a webmail thing).
> 
> And he's still a friend?  I thought you had better taste ;-)


Heh, even in spite of that he's a cool guy. It's the same "Insert Random
Name Here" guy that I was talking about before. Apparently his ISP gives
him a broken email address, so he can't use a real mail client even if
he wanted to.

> % > That's not the only flaw - currently it looks through all
> % > the files in ~/Email, including =sent-mail, and =mutt-users,
> % > which will naturally be _extremely_ biased.
> % 
> % Well, there's no need to go through sent-mail, because that's just me,
> % and I know what I use. I would still like it to go through mutt-users,
> 
> Of course, if one were to do that and implement the non-duplication, then
> you would properly count yourself as a mutt user instead of having to
> abstain!

In that case, I could program it to ignore me. After all, once we get
the script to recognize who uses what, instead of just how many of a
given client have sent me email, it should be easy to discriminate
against people as well as folders ;)

> % > So, one could just do something like so:
> % > FILES=$(find $HOME/Email -type f -maxdepth 1 ! -name
> % > sent-mail ! -name mutt-users )
> % > 
> % > then, make it grep only through $FILES, and not '*'.
> % 
> % Yep.
> 
> Sounds good enough.  Better yet, you probably only have a limited number
> of incoming folders, and you were probably clever enough to set them up
> with a matching initial regexp (for me it's =F.*), so it would also be
> pretty easy to limit the file listing that way.

Nope, I didn't look that far ahead. 'course, I could rename all my mail
folders, but then I'd have to rewrite most of my procmail scripts, and
I'm really not in the mood for that.

> If you insist on looking into every file in the directory except for an
> exception or two, I should think that
> 
>   FILES=`ls -1 $HOME/Email | egrep -v 'sent-mail|mutt-users'`
> 
> would be quicker than a find, even if you also told ls to list in long
> format and excluded directories and then used awk to print only the last
> field of each line...  Then, of course, there's the sed method, too,
> which might be even faster than awk.
> 
> % > But, as far as the problem you mentioned goes - checking for
> % > the Sender/X-Sender fields (possibly piping it through sort
> % > and uniq to remove duplicates), and then grepping through it
> % > would be, to say the least, tedious (at least in bash).
> % 
> % Yeah, I wouldn't know where to start with a bash script, especially with
> % mbox style folders (I could grep through to find how many unique people
> % emailed me easily enough, but how would I link each person to a mua?)
> 
> I should think that, here, awk would be your friend; start looking every
> time you see a ^From_ and build an array of addresses you've seen and
> signatures you've seen and their counts, only incrementing if the address
> is new.  Yeah, perl (or probably even python :-) would be nicer, but
> don't give up on bash; do it for the exercise!

Well, I sort of know what to do, but grep has gimped REGEX.

If this command would work:

cat ~/mail/*|grep ^(From:|Message-|User-Agent|X-Mailer)

Then we could get a list of everybody who sent me a message, and the
corresponding Message-ID, User-Agent, and X-Mailer headers for each
person. Then it should be a simple sed or awk script to change this:

From: Pat McRotch
User-Agent: Mutt
X-Mailer: Mutt
From: Ben Dover
Message-ID: <[EMAIL PROTECTED]>
From: Phil McCrack
User-Agent: Some_thing_else

into this:

Pat McRotch: Mutt
Ben Dover: Pine
Phil McCrack: Some_thing_else

and then you could use that to count how many of each client you have.
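For illustration, here is a rough awk sketch of that transformation. It is
only a sketch: pair_muas is a made-up name, it assumes each From: line
arrives before the other headers of the same message (as in the sample
above), and it guesses "Pine" only when the Message-ID starts with "Pine.",
which is an assumption about how Pine builds its IDs. Anyone with no
recognizable header comes out as "unknown".

```shell
# pair_muas: read filtered header lines on stdin, print "Sender: client".
# Hypothetical helper, not an existing tool.
pair_muas() {
    awk '
        # Print the sender collected so far, if there is one.
        function emit() {
            if (name != "") print name ": " (mua != "" ? mua : "unknown")
        }
        # A new From: line closes the previous sender and starts the next.
        /^From: /       { emit(); name = substr($0, 7); mua = ""; next }
        # User-Agent wins over X-Mailer when both are present.
        /^User-Agent: / { mua = substr($0, 13); next }
        /^X-Mailer: /   { if (mua == "") mua = substr($0, 11); next }
        # Assumption: a Message-ID beginning with Pine. means Pine sent it.
        /^Message-I[Dd]: <Pine\./ { if (mua == "") mua = "Pine" }
        END             { emit() }
    '
}
```

Piping the result through something like "cut -d: -f2- | sort | uniq -c"
should then give the per-client totals, as long as no sender name contains
a colon.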

But, unfortunately, that grep command I stated earlier yields this:

bash: syntax error near unexpected token `^(F'

(and putting it in quotes makes it look for that literal string, not the
"this line or this line or this line or this line" that it should be).
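For the record, both failures have the same root: unquoted, bash tries to
interpret the ( and | itself, and quoted, plain grep uses basic regex
syntax, where ( | ) are ordinary characters. Quoting the pattern and
asking for extended syntax with -E (egrep is the older spelling) should
give the intended "this or this or this" match. A minimal sketch, with
extract_headers as a made-up helper name:

```shell
# extract_headers: print only the header lines of interest from the
# given mbox files. Hypothetical helper name, chosen for illustration.
extract_headers() {
    # Single quotes stop bash from parsing ( and | as shell syntax;
    # -E tells grep to treat them as grouping and alternation.
    cat -- "$@" | grep -E '^(From:|Message-|User-Agent|X-Mailer)'
}
```

Called as "extract_headers ~/mail/*", this does what the command above was
meant to do. Note that it will also pick up body lines that happen to
start with one of those strings.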

> % > It would probably make more sense in perl or something, but
> % > I don't know perl, so I'll leave it to someone else :-)
> % 
> % Well, I just bought a book on perl, perhaps I could learn how to do this
> % ;)
> 
> He's coming around!  Woo hoo!

ROFL. Python still rocks, I just thought I'd pick up perl because it's
an important language. After this I'm probably going to fool around with
assembler or C++ ;)

> % Not right away though, I've got a busy weekend ahead of me.
> 
> Oh, phooey.

Yep, gotta love it. I have this huge project due Monday. It's worth 3
credits (which I technically don't need, but I'm required to do this
project in order to graduate), and we had a month to work on it. I'm
thinking of starting tonight...

> % C:\WINDOWS C:\WINDOWS\GO C:\PC\CRAWL
> 
> I *love* this one!  Replace the spaces with semicolons, which still makes
> grammatical sense, and you have a real PATH value under DOS :-)

Yeah, it's pretty funny. I like some of the quotes more, though ("A
verbal agreement isn't worth the paper it's written on!" -- some guy I
forget). I can send you the file with all my sigs in it, if you want ;)

-- 
Rob 'Feztaa' Park
[EMAIL PROTECTED]
--
"640K ought to be enough for anybody."
                -- Bill Gates, 1981
