At 11:23 on 27/08/2024, Francesca PANCOTTO via R-help wrote:
Dear Contributors,
I have a problem with a database composed of many individuals for many
periods, for which I need to perform a manipulation of data as follows.
Here I report the procedure I need to do for the first 32 observations of
the first period.
cbind(VB1d[,1],s1id[,1])
      [,1] [,2]
 [1,]    6    8
 [2,]    9    5
 [3,]   NA    1
 [4,]    5    6
 [5,]   NA    7
 [6,]   NA    2
 [7,]    4    4
 [8,]    2    7
 [9,]    2    7
[10,]   NA    3
[11,]   NA    2
[12,]   NA    4
[13,]    5    6
[14,]    9    5
[15,]   NA    5
[16,]   NA    6
[17,]   10    3
[18,]    7    2
[19,]    2    1
[20,]   NA    7
[21,]    7    2
[22,]   NA    8
[23,]   NA    4
[24,]   NA    5
[25,]   NA    6
[26,]    2    1
[27,]    4    4
[28,]    6    8
[29,]   10    3
[30,]   NA    3
[31,]   NA    8
[32,]   NA    1
In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups,
randomly mixed in the larger group of 32.
For each group, I want the value that is reported for only two group
members to be assigned to all four group members.
For example, the value 8 in the first row, second column, is group 8. The
value of the variable VB1d for group 8 is 6. At row 28, again for s1id equal
to 8, I have 6.
But in row 22, where the second variable is also 8, VB1d is NA.
It is the same in every group: only two members have the correct number, the
other two are NA.
I need each group, identified by the values of the variable s1id, to report
for all of its members the value of VB1d that is present for just two
group members.
I hope my explanation is acceptable.
The task appears complex to me right now, especially because I will need to
repeat this procedure for x12x14 similar databases.
Has anyone ever encountered a similar problem?
Thanks in advance for any help provided.
----------------------------------
Francesca Pancotto
Associate Professor Political Economy
University of Modena, Largo Santa Eufemia, 19, Modena
Office Phone: +39 0522 523264
Web: https://sites.google.com/view/francescapancotto/home
----------------------------------
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hello,
Here is a solution.
Split the 1st column by the 2nd, keep only the non-NA values and unlist,
to get a named vector.
Then recover the group ids from the names and put them together with the
values using cbind.
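# The 32 x 2 example data: column 1 is VB1d, column 2 is s1id
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))
# Split VB1d by group id, drop the NAs, flatten to a named vector
res <- split(mat[, 1L], mat[, 2L]) |> lapply(\(x) x[!is.na(x)]) |> unlist()
# unlist() names are the group id followed by a position index ("11", "12", ...)
nms <- names(res)
res <- cbind(
  VB1d = res,
  # drop the trailing index digit to recover the group id
  s1id = substr(nms, 1, nchar(nms) - 1L) |> as.integer()
)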
res
#>    VB1d s1id
#> 11    2    1
#> 12    2    1
#> 21    7    2
#> 22    7    2
#> 31   10    3
#> 32   10    3
#> 41    4    4
#> 42    4    4
#> 51    9    5
#> 52    9    5
#> 61    5    6
#> 62    5    6
#> 71    2    7
#> 72    2    7
#> 81    6    8
#> 82    6    8
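The result above keeps only the 16 non-NA rows. If the aim is instead to keep all 32 rows in their original order and fill each NA with its group's value, one possible sketch uses ave(). It assumes every group carries a single distinct non-NA value, as in the example data; fill_group is a hypothetical helper name, not something from the original post.

```r
# Same example data as above, written as two plain vectors.
VB1d <- c(6, 9, NA, 5, NA, NA, 4, 2, 2, NA, NA, NA, 5, 9, NA, NA,
          10, 7, 2, NA, 7, NA, NA, NA, NA, 2, 4, 6, 10, NA, NA, NA)
s1id <- c(8, 5, 1, 6, 7, 2, 4, 7, 7, 3, 2, 4, 6, 5, 5, 6,
          3, 2, 1, 7, 2, 8, 4, 5, 6, 1, 4, 8, 3, 3, 8, 1)

# Hypothetical helper: repeat the first non-NA value of x for every
# element of x (all NA if the group has no observed value).
# Assumption: the non-NA values within a group are all identical.
fill_group <- function(x) {
  ok <- x[!is.na(x)]
  rep(if (length(ok)) ok[1L] else NA, length(x))
}

# ave() applies fill_group within each level of s1id and returns the
# results in the original row order.
filled <- ave(VB1d, s1id, FUN = fill_group)
cbind(VB1d = filled, s1id = s1id)
```

With these data, row 22 (group 8) gets the value 6 and row 3 (group 1) gets the value 2. For several periods stored as matrix columns, the same call could be repeated for each column.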
Hope this helps,
Rui Barradas