Re: [R] spliting first 10 words in a string

David Winsemius Tue, 02 Nov 2010 12:23:46 -0700


On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:

Hi all,
Thanks for all the help. I managed to do it with what Gaj suggested (Excel :( ).
The last solution from David is also great; I just don't understand why R put the words in 14 columns and three rows?

Because the maximum number of words was 14 and the fill argument was TRUE. There were three rows because there were three items in the supplied character vector.
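A minimal sketch of where the 14 and the 3 come from, assuming the same three-element `words` vector used elsewhere in this thread. The names `n_fields`, `con` and `wide` are only illustrative. Note that with fill=TRUE read.table decides the number of columns from the first five lines of input, so on longer input a later, wider line could be handled differently.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

# Number of space-separated words in each element of the vector
n_fields <- sapply(strsplit(words, " "), length)
n_fields          # 14 10 13

# One row per element; fill = TRUE pads the shorter rows out to the widest one
con <- textConnection(words)
wide <- read.table(con, fill = TRUE, stringsAsFactors = FALSE)
close(con)
dim(wide)         # 3 14
```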

I would like it to put just the first 10 words in the source field into 10 different destination fields, but in the same row. And so on... is that possible?

I don't know what a destination field might be. Those are not R datatypes.

This would trim the extra columns (in this example set to those greater than 8) by adding a lot of "NULL"s to the end of a colClasses specification ... at the expense of a warning message which can be ignored:

> read.table(textConnection(words), fill=T, colClasses =c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )

   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses =c(rep("character", :
  cols = 14 != length(data) = 38
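The warning appears because colClasses has 38 entries while only 14 columns were found. One possible variation, offered as a sketch rather than as the method used above, is to size colClasses from the data so the lengths agree, which should avoid the warning. The names `n_keep`, `n_cols` and `classes` are made up for this illustration, and the word count assumes the text is separated by plain spaces.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

n_cols <- max(sapply(strsplit(words, " +"), length))  # widest row, 14 here
n_keep <- min(8, n_cols)                              # columns to retain

# "character" for the columns to keep, "NULL" to drop the rest
classes <- c(rep("character", n_keep), rep("NULL", n_cols - n_keep))

con <- textConnection(words)
first8 <- read.table(con, fill = TRUE, colClasses = classes)
close(con)
dim(first8)       # 3 8
```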


If you want to assign the first column to a variable then just:

> first8 <- read.table(textConnection(words), fill=T, colClasses =c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)

> var1 <- first8[[1]]
> var1
[1] "I"   "I"   "but"
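If "destination fields in the same row" means new columns of the original data frame, the following is one way it might be done. It is only a sketch under assumptions: the toy data frame `dat` stands in for the real one, the text column is a factor named Opis as in the str() output quoted further down, and the new column names W1 to W10 are invented here.

```r
# Toy stand-in for the real data; Opis is a factor, as in the poster's str() output
dat <- data.frame(
  Opis = factor(c("I have a columnn with text that has quite a few words in it.",
                  "I would like to split these words in separate columns",
                  "short one"))
)

n_words <- 10

# strsplit() needs character input, so convert the factor first
pieces <- strsplit(as.character(dat$Opis), " +")

# Take the first n_words of each row; rows with fewer words are padded with NA
first_n <- t(sapply(pieces, function(p) p[seq_len(n_words)]))
colnames(first_n) <- paste0("W", seq_len(n_words))

# Attach the word columns next to the original columns, row for row
dat <- cbind(dat, as.data.frame(first_n, stringsAsFactors = FALSE))
str(dat)
```

Rows with fewer than ten words get NA in the unused columns, and an empty Opis value gives NA in all ten.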

--
David.


Thank you, m
-----Original Message-----

From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius

Sent: Tuesday, November 02, 2010 3:47 PM
To: Gaj Vidmar
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

Though <forbidden> in this list, in Excel it's just (literally!) five clicks away!
(with the column in question selected)
Data -> Text to Columns -> Delimited -> tick Space -> Finish
Pa je! (~Voila in Slovenian)
(then import back to R, keeping only the first 10 columns if so desired)


You could do the same thing without needing to leave R. Just
read.table( textConnection(..), header=FALSE, fill=TRUE)

read.table(textConnection(words), fill=T)

   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10      V11   V12 V13 V14
1   I  have     a columnn  with  text  that     has    quite       a      few words  in it.
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that possible    in  R?


Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitation Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended mainly in statistics, I did two years of geology, so (and also because of knowing what the poster's institute does) I even kinda imagine what these data are.

"Matevž Pavlič" <matevz.pav...@gi-zrmk.si> wrote in message
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...

Hi,

I am sorry, will try to be more exact from now on...

I have a data.frame with a field called Opis. It contains sentences that I would like to split into words or fields in the data.frame... when I say columns I mean as in an Excel table. I would like to split "Opis" into ten fields from the first ten words in the Opis field.
Here is an example of my data.frame.

'data.frame':   22928 obs. of  12 variables:
$ VrtinaID        : int  1 1 1 1 2 2 2 2 2 2 ...
$ ZapStev         : int  1 2 3 4 1 2 3 4 5 6 ...
$ GlobinaOd       : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GlobinaDo       : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ Opis            : Factor w/ 12754 levels "","(MIVKA) DROBENMELJAST PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884 9123 2500 4756 ...
$ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..: 154 125 101 101 NA 106 125 80 106 101 ...
$ GeolNastOd      : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GeolNastDo      : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ GeolNastOpis    : Factor w/ 113 levels "","B. M. S.",..: 56 53 53 53 56 53 53 53 53 53 ...
$ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
$ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
$ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1 1 1 26 1 1 1 1 1 ...

Hope that explains better...
Thank you, m

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, November 01, 2010 10:13 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] spliting first 10 words in a string


On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:

Hi all,
I have a columnn with text that has quite a few words in it. I would like to split these words in separate columns, but just first ten words in the string. Is that possible in R?


Not sure what a column means to you. It's not a precisely defined R type or class. (And you are requested to offer a concrete example rather than making us guess.)

words <- "I have a columnn with text that has quite a few words in it. I would like to split these words in separate columns, but just first ten words in the string. Is that possible in R?"

strsplit(words, " ")[[1]][1:10]

 [1] "I"       "have"    "a"       "columnn" "with"    "text"    "that"    "has"     "quite"   "a"
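One detail worth knowing about the `[1:10]` indexing, shown as a small sketch with a made-up string `short`: when there are fewer than ten words, the result is padded with NA, whereas head() returns only the words that exist.

```r
short <- "only three words"
parts <- strsplit(short, " ")[[1]]

parts[1:10]       # always length 10; the missing positions are NA
head(parts, 10)   # at most 10 elements; here just the 3 words
```

The padded form is convenient when every row must end up with the same number of columns.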


Or if in a dataframe:

words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

worddf <- data.frame(words=words)

t(sapply(strsplit(worddf$words, " "), "[", 1:10) )

     [,1]  [,2]    [,3]    [,4]      [,5]    [,6]    [,7]    [,8]      [,9]       [,10]
[1,] "I"   "have"  "a"     "columnn" "with"  "text"  "that"  "has"     "quite"    "a"
[2,] "I"   "would" "like"  "to"      "split" "these" "words" "in"      "separate" "columns"
[3,] "but" "just"  "first" "ten"     "words" "in"    "the"   "string." "Is"       "that"


--
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

