Mike,

This is what I am looking for.

http://en.wikipedia.org/wiki/Automatic_summarization

I want to obtain a summary of a large document in the form of meaningful
sentences, not a bag of words. I have thousands of documents, each running
to 3-4 pages, and I plan to use R to cluster/classify them. Instead of
working with the original documents, I think it would be better to work
with a summary of each one, since this should avoid memory issues.
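
To make the request concrete, here is a minimal sketch of the kind of
sentence-level (extractive) summary meant above: score each sentence by the
average frequency of its words in the document and keep the top few in
their original order. It is written in Python purely as an illustration; it
is not an R package and not the method of any particular library. The
function names, the tiny stop-word list, and the naive sentence splitter
are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are far longer).
STOPWORDS = {
    "the", "a", "an", "of", "to", "and", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "was", "be", "by",
}


def tokenize(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # word frequencies over the whole document

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens)

    # Rank sentence indices by score (ties keep document order), then
    # restore the original order for the selected ones.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)
```

Each document would then be replaced by the output of summarize() before
clustering. Whether such a short summary keeps enough signal for
clustering/classification is something that would have to be checked on the
actual documents.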

Thank you.

Ravi



On Tue, May 31, 2011 at 10:02 PM, Mike Marchywka <marchy...@hotmail.com> wrote:

> ----------------------------------------
> > Date: Tue, 31 May 2011 03:25:56 -0700
> > From: viora...@gmail.com
> > To: r-help@r-project.org
> > Subject: [R] Text Summarization
> >
> > Is there a text mining/NLP package in R that could do text summarization?
> > For example, take a huge text as input and provide a summary of the text.
> >
> > In package tm, summarization is defined more as high frequency terms, which
> > is not what I want. I actually want a summary of what is present in the huge
> > volume of text.
> >
> Cliff's notes? Can you define it more precisely? There are some computational
> linguistics packages IIRC.
>
>
> > Any help on an R package would be appreciated. Thank you.
> >
> > Ravi
> >
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Text-Summarization-tp3562735p3562735.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>


