Thanks,
That worked well...

On Fri, 2003-06-13 at 17:32, Ben Kal wrote:
> On 11 Jun 2003 Grzesiek Sedek <[EMAIL PROTECTED]> wrote:
>
> > Does anyone have an idea how to extract clear text from an inbox file
> > (the actual file is from M$ Entourage on Mac, called Messages)? It got
> > corrupted and the mail client does not read it. It's quite big, 500 MB,
> > so I have to do it at least semi-automatically. The main problem is the
> > attachments (I don't need them): they are quite big; the rest of the
> > content is text.
>
> You do not describe what the contents of the file look like, so I must
> guess at what distinguishes attachments from message texts.
>
> My guess then is that the 500 MB file is essentially a text file, and that
> the attachments you want to get rid of are big solid blocks of characters:
> long sequences of lines, all of the same length, without any spaces in them.
> If that is true, a simple sed command will suffice:
>
>   sed -e '/^[^ ][^ ]*$/d' Messages > Messages_attachments_stripped
>
> This says: delete all lines that are not empty and do not contain spaces.
> Be careful. You may want to refine the regular expression that selects
> the lines to be deleted. As it stands, a line like
>   -------------------------------
> that someone may have used in a message text to make a line stand out
> as a header will also be deleted, as well as lines delimiting parts
> of messages, like
>   --346095821--1674543256--1308352331
>
> Ben

--
Grzesiek Sedek <[EMAIL PROTECTED]>
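
One possible refinement of the regular expression, along the lines Ben cautions about, is sketched below. It assumes (this is not stated anywhere in the thread) that the attachments are base64-encoded, so their lines consist only of letters, digits, `+`, `/` and `=`. The function name `strip_attachment_lines` and the 60-character threshold are illustrative choices, not something from the original messages.

```shell
#!/bin/sh
# Sketch: delete only lines that look like base64 attachment data.
# Assumptions (not confirmed by the thread):
#   - attachments are base64-encoded
#   - a line of 60 or more base64 characters is attachment data
# Reads the mailbox on stdin, writes the stripped text to stdout.
strip_attachment_lines() {
    # \#...# is used as the address delimiter so that "/" can appear
    # unescaped inside the bracket expression.
    LC_ALL=C sed -e '\#^[A-Za-z0-9+/=]\{60,\}$#d'
}

# Example invocation (file names taken from the thread):
#   strip_attachment_lines < Messages > Messages_attachments_stripped
```

Because `-` is not in the character class, separator lines made of dashes and delimiter lines such as `--346095821--1674543256--1308352331` are kept. The shorter final line of each attachment block is also kept, as is any unbroken word under 60 characters, so some manual cleanup may still be needed.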