At 11:23 on 27/08/2024, Francesca PANCOTTO via R-help wrote:
Dear Contributors,
I have a problem with a database composed of many individuals for many
periods, for which I need to perform a manipulation of data as follows.
Here I report the procedure I need to do for the first 32 observations of
the first period.


cbind(VB1d[,1],s1id[,1])
      [,1] [,2]
 [1,]    6    8
 [2,]    9    5
 [3,]   NA    1
 [4,]    5    6
 [5,]   NA    7
 [6,]   NA    2
 [7,]    4    4
 [8,]    2    7
 [9,]    2    7
[10,]   NA    3
[11,]   NA    2
[12,]   NA    4
[13,]    5    6
[14,]    9    5
[15,]   NA    5
[16,]   NA    6
[17,]   10    3
[18,]    7    2
[19,]    2    1
[20,]   NA    7
[21,]    7    2
[22,]   NA    8
[23,]   NA    4
[24,]   NA    5
[25,]   NA    6
[26,]    2    1
[27,]    4    4
[28,]    6    8
[29,]   10    3
[30,]   NA    3
[31,]   NA    8
[32,]   NA    1


In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups,
randomly mixed in the larger group of 32.
For each group, I want the value that is reported for only two group
members to be assigned to all four group members.

For example, the value 8 in the first row, second column, is group 8. The
value of the variable VB1d for group 8 is 6. At row 28, again for s1id equal
to 8, I have 6.
But in row 22, where the second variable is again 8, VB1d is NA.
The situation is the same in each group: only two members have the correct
number, the other two are NA.
I need each group, identified by the values of the variable s1id, to
correctly report the value of the variable VB1d that is present for just two
group members.

I hope my explanation is acceptable.
The task appears complex to me right now, especially because I will need to
repeat this procedure for x12x14 similar databases.

Has anyone ever encountered a similar problem?
Thanks in advance for any help provided.

----------------------------------

Francesca Pancotto

Associate Professor Political Economy

University of Modena, Largo Santa Eufemia, 19, Modena

Office Phone: +39 0522 523264

Web: https://sites.google.com/view/francescapancotto/home

  ----------------------------------


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hello,

Here is a solution.
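Split the 1st column by the 2nd, keep only the non-NA values and unlist, to get a named vector.
The names are the group id followed by the position within the group, so the group id
can be recovered from them. Then put the group ids and the values together with cbind.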



mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))


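# Split VB1d by s1id, drop the NAs in each group, then flatten to a named vector.
# unlist() names each element with the group id followed by its position in the group.
res <- split(mat[, 1L], mat[, 2L]) |> lapply(\(x) x[!is.na(x)]) |> unlist()
nms <- names(res)
# Drop the last character of each name (the position) to recover the group id.
# This assumes fewer than 10 non-NA values per group.
res <- cbind(
  VB1d = res,
  s1id = substr(nms, 1, nchar(nms) - 1L) |> as.integer()
)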
res
#>    VB1d s1id
#> 11    2    1
#> 12    2    1
#> 21    7    2
#> 22    7    2
#> 31   10    3
#> 32   10    3
#> 41    4    4
#> 42    4    4
#> 51    9    5
#> 52    9    5
#> 61    5    6
#> 62    5    6
#> 71    2    7
#> 72    2    7
#> 81    6    8
#> 82    6    8
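
If the goal is instead to keep all 32 rows in their original order and fill the NA entries with the value of their group, something along these lines might work. This is only a sketch: it assumes every group has at least one non-NA value and that the non-NA values agree within a group, and the names filled and out are just illustrative.

```r
# Same 32 x 2 matrix as in the original post: column 1 is VB1d, column 2 is s1id.
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))

# ave() applies FUN within each s1id group and recycles the result
# back to the original positions, so the row order is preserved.
# FUN returns the first non-NA value of the group (NA if there is none).
filled <- ave(mat[, 1L], mat[, 2L], FUN = function(x) x[!is.na(x)][1L])

out <- cbind(VB1d = filled, s1id = mat[, 2L])
head(out)
```

Rows that already had a value keep it, and rows such as 3 and 22 receive the value of their group.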


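Since the original post mentions many periods, the per-group fill could perhaps be applied column by column. The sketch below assumes VB1d and s1id are matrices of the same dimensions with one column per period, as the VB1d[,1] and s1id[,1] indexing suggests. The toy data and the helper name fill_by_group are hypothetical, not taken from the real data.

```r
# Hypothetical helper: within each group, use the first non-NA value for all members.
fill_by_group <- function(values, groups) {
  ave(values, groups, FUN = function(x) x[!is.na(x)][1L])
}

# Toy data (made up): 8 individuals, 2 periods, 2 groups of 4 per period.
VB1d <- cbind(c(6, NA, 6, NA, 3, 3, NA, NA),
              c(5, 4, NA, 4, NA, 5, NA, NA))
s1id <- cbind(c(1, 1, 1, 1, 2, 2, 2, 2),
              c(2, 1, 2, 1, 1, 2, 1, 2))

# Fill each period (column) separately; sapply() binds the results
# back into a matrix with the same dimensions as VB1d.
VB1d_filled <- sapply(seq_len(ncol(VB1d)),
                      function(j) fill_by_group(VB1d[, j], s1id[, j]))
VB1d_filled
```

The same loop could be wrapped in a function and run over each of the similar databases, provided they share this layout.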

Hope this helps,

Rui Barradas

