Hi list,
Closing this one off myself. This is what I did:
The error seems to stem from the update of tm to version 0.6, in which
the conversion to lower-case text should now be written as:
> docs <- tm_map(docs, content_transformer(tolower))
Everything else seems to work fine thereafter.
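For anyone who wants to check this on their own machine, here is a minimal sketch. It assumes tm 0.6 or later is installed, and the two-document corpus is made up for illustration (it is not the tutorial's data). The point it tries to show is that content_transformer() applies tolower to the text while leaving each element a proper tm document:

```r
library(tm)

# Toy corpus (hypothetical data, standing in for the tutorial's documents).
docs <- VCorpus(VectorSource(c("Hello World", "TEXT Mining")))

# Wrap tolower so that it acts on the content of each document and
# returns a document, rather than a bare character vector.
docs <- tm_map(docs, content_transformer(tolower))

# The text is lower-cased ...
content(docs[[1]])

# ... and the element is still a tm document.
inherits(docs[[1]], "PlainTextDocument")
```

If the second check returns TRUE, later steps such as DocumentTermMatrix should have the kind of object they expect.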
The issue in the tutorial concerns section 3.1, wherein Graham creates a
function toSpace. This seems to introduce an additional term that tm_map
and, later, DocumentTermMatrix do not know how to handle. This is
probably an incorrect interpretation of what's going on, but the fix
appears to be to run the tolower line above earlier in the preparation
stage.
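To make the ordering concrete, here is a rough sketch of the preparation steps as I ended up running them. The toSpace definition is written from my reading of the tutorial, so treat its exact form as an assumption, and the corpus is again a made-up stand-in:

```r
library(tm)

# Toy corpus (hypothetical data, not the tutorial's documents).
docs <- VCorpus(VectorSource(c("Text/Mining with R", "Hello@World LIST")))

# Lower-case first, wrapped in content_transformer().
docs <- tm_map(docs, content_transformer(tolower))

# toSpace: replace a pattern with a single space. Assumed to match the
# tutorial's helper; it is also built with content_transformer().
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/")
docs <- tm_map(docs, toSpace, "@")

# Building the matrix now works on proper tm documents.
dtm <- DocumentTermMatrix(docs)
dim(dtm)
```

I can't say for certain that the order is what matters rather than the wrapping itself, but this sequence ran without the errors described below.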
If anyone has more informed insight, please share.
Cheers
Sun
On 25/02/15 17:33, Sun Shine wrote:
Hi list
I've been working my way through a tutorial on text mining (
http://onepager.togaware.com/TextMiningO.pdf ) and all was well until
I came across this problem using the tm (text mining) package:
++++++++++code+++++++++++++++++++
> docs <- tm_map(docs, content_transformer(tolower))
Warning messages:
1: In mclapply(x$content[i], function(d) tm_reduce(d, x$lazy$maps)) :
all scheduled cores encountered errors in user code
2: In mclapply(content(x), FUN, ...) :
all scheduled cores encountered errors in user code
++++++++++end-code++++++++++++++++
After some searching, it appears the best fix for this problem is to
pass an explicit lazy=TRUE argument to tm_map, like this:
> docs <- tm_map(docs, content_transformer(tolower), lazy=TRUE)
However, a little further on in the tutorial to set up the text
matrix, a related (?) error was returned:
++++++++++code+++++++++++++++++++
> dtm <- DocumentTermMatrix(docs)
Error in UseMethod("meta", x) :
no applicable method for 'meta' applied to an object of class
"try-error"
In addition: Warning message:
In mclapply(unname(content(x)), termFreq, control) :
all scheduled cores encountered errors in user code
++++++++++end-code++++++++++++++++
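As a side note on what the message itself is saying: when a job fails, mclapply returns an object of class "try-error" in place of the result, so the 'meta' error suggests that at least one element of the corpus is an error object rather than a document. The sketch below uses only base R and a plain list as a hypothetical stand-in for the corpus content, to show what such an object looks like and how one might locate it:

```r
# Simulate a failed job: try() captures the error as a "try-error" object,
# the same class that mclapply() hands back for a failed job.
failed_doc <- try(stop("error in user code"), silent = TRUE)
class(failed_doc)

# Hypothetical stand-in for a corpus's content: one good element, one failed.
contents <- list("some valid text", failed_doc)

# Flag the elements that are errors rather than documents.
is_failed <- vapply(contents, inherits, logical(1), what = "try-error")
which(is_failed)

# The captured error message is still available for inspection.
conditionMessage(attr(failed_doc, "condition"))
```

Whether the same check works directly on a lazily mapped corpus is something I have not verified.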
I tried applying the explicit lazy=TRUE again, but it doesn't change
things. I have gone over the tutorial again and have followed all of
the steps (including loading the requisite libraries). Moreover,
searching the web returns several contradictory suggestions, and I'm
no wiser than I was before.
The closest I came to an answer was at Stack Overflow
http://stackoverflow.com/questions/24771165/r-project-no-applicable-method-for-meta-applied-to-an-object-of-class-charact
and that answer suggested using the latest tm (v 0.6) and claimed that
the earlier tolower step was wrong. However, my code already used the
recommended form: corpus <- tm_map(corpus, content_transformer(tolower))
Is there anyone on the list who could either point me to a
solution or help debug this, please?
I'm running R version 3.1.2 and tm version 0.6.
Many thanks
Sun
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.