Works great, thanks. And you shortened my code a lot and removed the loop.

 
David Biau


>________________________________
> From: Uwe Ligges <lig...@statistik.tu-dortmund.de>
> To: Biau David <djmb...@yahoo.fr>
> Cc: arun <smartpink...@yahoo.com>; r help list <r-help@r-project.org>
> Sent: Sunday, 13 January 2013, 18:22
> Subject: Re: [R] extracting character values
> 
>
>
>On 13.01.2013 18:02, Biau David wrote:
>> OK,
>>
>> here is a minimal working example:
>>
>> au1 <- c('biau dj', 'jones kb', 'van den hoofs j', ' biau dj', 'biau dj',
>>          'campagna r', 'biau dj', 'weiss kr', 'verdegaal sh', 'riad s')
>> au2 <- c('weiss kr', 'ferguson pc', ' greidanus nv', ' porcher r',
>>          'ferguson pc', 'pessis e', 'leclerc p', 'biau dj', 'bovee jv',
>>          'biau d')
>> au3 <- c('bhumbra rs', 'lam b', 'garbuz ds', NA, 'chung p', ' biau dj',
>>          'marmor s', 'bhumbra r', 'pansuriya tc', NA)
>>
>> netw <- data.frame(au1, au2, au3)
>> res <- data.frame(matrix(NA, nrow=dim(netw)[1], ncol=dim(netw)[2]))
>>
>> for (i in 1:dim(netw)[2])
>> {
>>   wh <- regexpr('[a-z]{3,}', as.character(netw[,i]))
>>   res[i] <- substring(as.character(netw[,i]), wh,
>>                       wh + attr(wh,'match.length')-1)
>> }
>
>
>There may be an easier solution, but this should do:
>
>res <- data.frame(lapply(netw,
>      function(x)
>        gsub("^ *([[:alpha:] ]*) +[[:alpha:]]+$", "\\1", x)))
>
>Uwe Ligges
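
A minimal sketch of the one-liner above applied to a few names from the example data, for anyone reading this later. The pattern keeps everything before the final run of letters (the initials) via the capture group and drops the rest. The reduced `netw`, the `pattern` variable and `stringsAsFactors = FALSE` are additions made here for illustration; they are not part of the original suggestion.

```r
# Reduced example data: one column, a few names taken from the thread.
au1 <- c("biau dj", "van den hoofs j", "biau d", NA)
netw <- data.frame(au1, stringsAsFactors = FALSE)

# "\\1" keeps the captured group (letters and spaces before the final token);
# the trailing " +[[:alpha:]]+$" consumes the initials.
pattern <- "^ *([[:alpha:] ]*) +[[:alpha:]]+$"

res <- data.frame(lapply(netw, function(x) gsub(pattern, "\\1", x)),
                  stringsAsFactors = FALSE)
print(res)
```

Since gsub() is vectorised and passes NA through unchanged, no explicit loop over rows or columns is needed; lapply() handles the columns.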
>
>
>
>
>> The problem is with the author "van den hoofs j", for whom only 'van' is retrieved.
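
A small sketch of why only 'van' comes back: regexpr() reports just the first match of the pattern in each string, so the loop never sees 'den' or 'hoofs'. The variable names `first` and `all_runs` are introduced here for illustration only.

```r
x <- "van den hoofs j"

# regexpr() returns the position and length of the FIRST match only.
wh <- regexpr("[a-z]{3,}", x)
first <- substring(x, wh, wh + attr(wh, "match.length") - 1)
print(first)

# gregexpr() with regmatches() lists every run of 3 or more lowercase letters.
all_runs <- regmatches(x, gregexpr("[a-z]{3,}", x))[[1]]
print(all_runs)
```

Pasting the runs back together would be one way to rebuild the name, but a single substitution that removes the initials is shorter.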
>>
>> thanks,
>>
>>
>> David Biau
>>
>>
>>> ________________________________
>>> From: arun <smartpink...@yahoo.com>
>>> To: Biau David <djmb...@yahoo.fr>
>>> Sent: Sunday, 13 January 2013, 17:38
>>> Subject: Re: [R] extracting character values
>>>
>>> Hi,
>>>
>>>
>>>   res <- data.frame(matrix(NA, nrow=dim(netw)[1], ncol=dim(netw)[2]))
>>> #Error in matrix(NA, nrow = dim(netw)[1], ncol = dim(netw)[2]) :
>>>   # object 'netw' not found
>>> Can you provide an example dataset of netw?
>>> Thanks.
>>> A.K.
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Biau David <djmb...@yahoo.fr>
>>> To: r help list <r-help@r-project.org>
>>> Cc:
>>> Sent: Sunday, January 13, 2013 3:53 AM
>>> Subject: [R] extracting character values
>>>
>>> Dear all,
>>>
>>> I have a data frame of names (netw), with each cell containing the last name and
>>> initials of an author; some cells are NA. I would like to extract only the
>>> last name from each cell into a new data frame called 'res'.
>>>
>>>
>>> Here is what I do:
>>>
>>> res <- data.frame(matrix(NA, nrow=dim(netw)[1], ncol=dim(netw)[2]))
>>>
>>> for (i in 1:dim(netw)[2])
>>> {
>>>   wh <- regexpr('[a-z]{3,}', as.character(netw[,i]))
>>>   res[i] <- substring(as.character(netw[,i]), wh,
>>>                       wh + attr(wh,'match.length')-1)
>>> }
>>>
>>>
>>> The problem is that I cannot extract 'complex' names properly, such as
>>> ' van der hoops bf  ': here I only get 'van', whereas the real last name is
>>> 'van der hoops' and 'bf' are the initials. Basically, the last name always
>>> has at least 3 consecutive letters, but may consist of several such groups
>>> of letters separated by one or more spaces; the cell may also start with a
>>> space; initials never have more than 2 letters.
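
The rules in this paragraph can be sketched directly as two substitutions: trim the surrounding spaces, then drop a final token of one or two letters. This is only a sketch under those stated rules; `last_name` and `examples` are hypothetical names introduced here, and a surname whose last word has only two letters would be cut short.

```r
# Hypothetical helper: strip surrounding whitespace, then remove a trailing
# token of 1-2 letters (the initials) together with the spaces before it.
last_name <- function(x) {
  x <- gsub("^\\s+|\\s+$", "", x)        # leading/trailing whitespace
  sub("\\s+[[:alpha:]]{1,2}$", "", x)    # final token of at most 2 letters
}

examples <- c(" van der hoops bf  ", "biau dj", "lam b", "lam", NA)
out <- last_name(examples)
print(out)
```

Applied column by column, e.g. `data.frame(lapply(netw, last_name))`, this would give a data frame of the same shape as `netw`. NA cells stay NA because gsub() and sub() pass missing values through.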
>>>
>>> Would someone have a good idea for this? Thanks,
>>>
>>>
>>> David
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>>
>>
>
>
>