"I am working on the following part of building a neural network to try
indeed classifying some text."

... and so you are most likely trying to reinvent wheels. There are
already many such tools available here:

Some of these are interfaces to C language code, and so are probably
much more efficient than anything you (or I) can do at the R level.

Of course, if this is mainly a programming exercise for you, then this
is largely irrelevant.


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Apr 30, 2018 at 9:25 AM, Luca Meyer <lucam1...@gmail.com> wrote:
> Thank you for both replies Don & Rui,
> The very issue here is that there is a search that needs to be done within
> a text field, and I agree with Rui's later comment that regexpr might indeed
> be the time-consuming piece of code.
> I might try to optimise this piece of code later on, but for the time being
> I am working on the following part of building a neural network to try
> indeed classifying some text.
> Again, thanks,
> Luca
> 2018-04-30 17:25 GMT+02:00 MacQueen, Don <macque...@llnl.gov>:
>> Luca,
>> If speed is important, you might improve performance by making d0 into a
>> true matrix, rather than a data frame (assuming d0 is indeed a data frame
>> at this point). Although data frames may look like matrices, they aren’t,
>> and they have some overhead that matrices don’t.  I don’t think you would
>> be able to use the [[nm]] syntax with a matrix, but [ , nm] should work,
>> provided the matrix has column names. Or you could perhaps index by column
>> number.
>> I had a project some years ago in which I reduced calculation time a lot
>> by extracting the numeric columns of a data frame and working with them,
>> then recombining them with the character columns. R’s performance working
>> with data frames has improved since then, so I really don’t know if it
>> would make a difference for your task.
>> -Don
>> --
>> Don MacQueen
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
>> Lab cell 925-724-7509
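
Don's suggestion above (fill a true matrix instead of a data frame, indexing columns with [ , nm]) could look roughly like the sketch below. The vectors x0 and patterns are hypothetical stand-ins for d0$X0 and d1[, 1], whose real contents are not shown in the thread; whether this is actually faster for the real data is not something the sketch demonstrates.

```r
# Hypothetical stand-ins for the thread's objects (real contents unknown)
x0 <- c("red apple", "green pear", "apple pie")   # plays the role of d0$X0
patterns <- c("apple", "pear")                    # plays the role of d1[, 1]

# Pre-allocate an integer matrix with column names V1, V2, ...
m <- matrix(0L, nrow = length(x0), ncol = length(patterns),
            dimnames = list(NULL, paste0("V", seq_along(patterns))))

for (i in seq_along(patterns)) {
  nm <- paste0("V", i)
  # Matrix columns are addressed as m[, nm]; this needs the column names set above
  m[, nm] <- as.integer(regexpr(patterns[i], x0) > 0)
}

m
```

If the result has to end up in the data frame, the finished matrix can be attached in one step, for example with cbind(d0, m).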
>> *From: *Luca Meyer <lucam1...@gmail.com>
>> *Date: *Monday, April 30, 2018 at 8:08 AM
>> *To: *Rui Barradas <ruipbarra...@sapo.pt>
>> *Cc: *"MacQueen, Don" <macque...@llnl.gov>, array R-help <
>> r-help@r-project.org>
>> *Subject: *Re: [R] How to visualise what code is processed within a for
>> loop
>> Hi Rui
>> Thank you for your suggestion,
>> I have tested the code you suggested against the code supplied by Don. The
>> timings are very close: to populate a 5954x899 0/1 matrix on my machine,
>> your procedure took 79 secs, while the one with ifelse took 80 secs, so
>> unfortunately there is no significant time saving.
>> Nevertheless thank you for your contribution.
>> Kind regards,
>> Luca
>> 2018-04-28 23:18 GMT+02:00 Rui Barradas <ruipbarra...@sapo.pt>:
>> I forgot to explain the reasoning behind my suggestion.
>> The logical condition returns FALSE/TRUE, which R codes as 0/1, so all you
>> have to do is coerce the result to integer.
>> This works because ifelse would return a 1 or a 0 depending on the
>> condition, that is, exactly the same values. Coercion is also more
>> efficient, since ifelse creates both vectors, the true part and the false
>> part, and then indexes them to return the appropriate values. That is
>> twice the work and uses a great deal of memory.
>> Rui Barradas
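
A small check of Rui's point, as a sketch only: the vector x0 and the pattern "apple" are made up for illustration and are not taken from the thread. It compares the two ways of turning the regexpr() condition into 0/1 values.

```r
# Made-up example data (not from the thread)
x0 <- c("red apple", "green pear", "apple pie")

hit <- regexpr("apple", x0) > 0        # logical vector: was the pattern found?

via_ifelse  <- ifelse(hit, 1, 0)       # builds the result from the constants 1 and 0
via_integer <- as.integer(hit)         # coerces TRUE/FALSE directly to 1/0

all(via_ifelse == via_integer)         # the values agree
typeof(via_ifelse)                     # "double"
typeof(via_integer)                    # "integer"
```

The values are the same, but the storage type differs: ifelse(hit, 1, 0) gives doubles, while as.integer(hit) gives integers. That only matters if later code checks the type.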
>> On 4/28/2018 10:12 PM, Rui Barradas wrote:
>> Hello,
>> instead of ifelse, the following is exactly the same and much more
>> efficient.
>> d0[[nm]] <- as.integer(regexpr(d1[i,1], d0$X0) > 0)
>> Hope this helps,
>> Rui Barradas
>> On 4/28/2018 8:45 PM, Luca Meyer wrote:
>> Thanks Don,
>>      for (i in 1:10){
>>        nm <- paste0("V", i)
>>        d0[[nm]] <- ifelse( regexpr(d1[i,1], d0$X0) > 0, 1, 0)
>>      }
>> is exactly what I needed.
>> Best regards,
>> Luca
>> 2018-04-25 23:03 GMT+02:00 MacQueen, Don <macque...@llnl.gov>:
>> Your code doesn't make sense to me in a couple of ways.
>> Inside the loop, the first line assigns a value to an object named "t".
>> Then, the second line does the same thing, assigns a value to an object
>> named "t".
>> The value of the object named "t" after the second line will be the output
>> of the ifelse() expression, whatever that is. This has the effect of making
>> the first line irrelevant. Whatever value t has after the first line is
>> replaced by whatever it gets from the second line.
>> It looks like the first line inside the loop is constructing the name of a
>> data frame column, and storing that name as a character string. However,
>> the second line doesn't use that name at all. If your goal is to update the
>> contents of a column, you need to assign something to that column in the
>> next line. Instead you assign it to the object named "t".
>> What you're looking for will be more along the lines of this:
>>      for (i in 1:10){
>>        nm <- paste0("V", i)
>>        d0[[nm]] <- ifelse( regexpr(d1[i,1], d0$X0) > 0, 1, 0)
>>      }
>> This may not be a complete solution, since I have no idea what the contents
>> or structure of d1 are, or what the regexpr() is expected to return.
>> And notice the use of double brackets, [[ and ]]. This is one way to
>> reference a column of a  data frame when you have the column's name stored
>> in a variable. Another way is d0[ , nm]
>> A couple of additional comments:
>>   "t" is a poor choice of object name, because it is one of R's built-in
>> functions (immediately after starting a fresh session of R, with nothing
>> left over from any previous session, type help("t") and see what you get).
>>   ifelse() is intended for use on vectors, not scalars, and it looks like
>> maybe you're using it on a scalar (can't be sure about this, though)
>> For example, ifelse() is designed for this kind of usage:
>> ifelse( c(TRUE, FALSE, TRUE) , 1:3, 11:13)
>> [1]  1 12  3
>> Although it works ok for these
>> ifelse(TRUE, 3, 4)
>> [1] 3
>> ifelse(FALSE, 3, 4)
>> [1] 4
>> They are not really what it is intended for.
>> --
>> Don MacQueen
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
>> Lab cell 925-724-7509
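
To make Don's explanation above concrete, here is a minimal sketch with a made-up two-row d0 (the real data are not shown in the thread). It shows that paste() only builds a character string, and that the column is updated only when the name is used to index the data frame.

```r
# Made-up data frame standing in for the poster's d0
d0 <- data.frame(X0 = c("red apple", "green pear"), V1 = 0,
                 stringsAsFactors = FALSE)

# The original first line only creates the text "d0$V1"; it does not refer to the column
label <- paste("d0$V", 1, sep = "")
is.character(label)                    # TRUE

# Storing the column name and indexing with [[ ]] does update the column
nm <- paste0("V", 1)
d0[[nm]] <- as.integer(regexpr("apple", d0$X0) > 0)

d0[[nm]]                               # 1 0
d0[, nm]                               # same column, the other indexing form
```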
>> On 4/24/18, 12:30 AM, "R-help on behalf of Luca Meyer" <
>> r-help-boun...@r-project.org on behalf of lucam1...@gmail.com> wrote:
>>      Hi,
>>      I am trying to debug the following code:
>>      for (i in 1:10){
>>        t <- paste("d0$V",i,sep="")
>>        t <- ifelse(regexpr(d1[i,1],d0$X0)>0,1,0)
>>      }
>>      and I would like to see what code R is actually processing. How can
>>      I do that?
>>      More to the point, I am trying to update my variables d0$V1 to d0$V10
>>      according to the presence or absence of some text (contained in the
>>      file d1) within the d0$X0 variable.
>>      The code seems to run ok: if I add print(table(t)) within the loop I
>>      can see that the ifelse procedure is working and that in some cases
>>      within the d0$V1 to d0$V10 variable range a 1 is assigned. But when
>>      checking my d0$V1 to d0$V10 after the for loop they are all still
>>      equal to zero...
>>      Thanks,
>>      Luca
>>          [[alternative HTML version deleted]]
>>      ______________________________________________
>>      R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>      https://stat.ethz.ch/mailman/listinfo/r-help
>>      PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>      and provide commented, minimal, self-contained, reproducible code.
