On 25.01.2013 12:08, Berend Hasselman wrote:
On 25-01-2013, at 10:25, marcoguerzoni <marco.guerz...@unito.it> wrote:

dear all,

thank you for reading.

I have a dataset of artists and where and when they had an exhibition. I'd like to create an affiliation network in the form of a matrix, telling me
which artists have been in the same place at the same time.
I managed to do it, but given that I have 96000 observations the program takes
30 months to complete.
Here is what I have done.
The data look like this:

Artist <- c(1,2,3,2,4,4,5)
Begin <- as.Date(c('2006-08-23', '2006-03-21', '2006-03-06', '2006-01-13',
'2006-05-20', '2006-07-13', '2006-07-20'))
End <- as.Date(c('2006-10-23', '2006-11-30', '2006-05-06', '2006-12-13',
'2006-09-20', '2006-08-13', '2006-09-20'))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)

Artist is the name of the artist, Begin and End are the when, and Istitution is
the where.

my IF is working,


# number of unique artist
c <- unique(Artist)
d <- length(c)
a <- length(Artist)

B <- mat.or.vec(d,d)

for(i in 1:d) {
  for(j in 1:d) {
    if (Istitution[i] == Istitution[j]) {
      if (Begin[i] <= End[j]) {
        if (End[i]-Begin[j] >= 0) {
          B[i,j] <- B[i,j]+1
          B[i,i] <- 0
        }
      } else {
        if (End[j]-Begin[i] >= 0) {
          B[i,j] <- B[i,j]+1
          B[i,i] <- 0
        }
      }
    }
  }
  print(i)
}
Do you have a way to make the program simpler and faster?


It is not clear why you are only using the unique artists.
You shouldn't be using "c" as variable name. It is a builtin function.

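Since the result is symmetric you can change the j-loop to for(j in
(i+1):d), with the i-loop then running only to d-1 (for i equal to d,
(i+1):d counts down from d+1 to d instead of being empty).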
After the loop you can do

B[lower.tri(B)] <- t(B)[lower.tri(B)]

to fill the remainder of the matrix B. This would certainly be more
efficient.

But I don't quite understand what you are trying to do.
With your example you could compute the result you desire.

Gerrit's answer is concise.

Berend
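
The upper-triangle suggestion above can be sketched as follows. This is only an illustration under assumptions that are not in the original post: the loops run over all observations (n) rather than over the unique artists, the matrix is indexed by artist id through the helper vector ids, and two exhibitions count as simultaneous when Begin[i] <= End[j] and Begin[j] <= End[i]. The names n, ids, row_i and col_j are made up for this sketch.

```r
Artist <- c(1, 2, 3, 2, 4, 4, 5)
Begin <- as.Date(c('2006-08-23', '2006-03-21', '2006-03-06', '2006-01-13',
                   '2006-05-20', '2006-07-13', '2006-07-20'))
End <- as.Date(c('2006-10-23', '2006-11-30', '2006-05-06', '2006-12-13',
                 '2006-09-20', '2006-08-13', '2006-09-20'))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)

n <- length(Artist)           # number of observations (exhibitions)
ids <- sort(unique(Artist))   # artist ids, used as row/column labels
B <- matrix(0, length(ids), length(ids),
            dimnames = list(as.character(ids), as.character(ids)))

# Visit each pair of observations once (i < j) and fill the upper triangle.
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    same_place <- Istitution[i] == Istitution[j]
    overlap <- Begin[i] <= End[j] && Begin[j] <= End[i]
    if (same_place && overlap && Artist[i] != Artist[j]) {
      row_i <- match(Artist[i], ids)
      col_j <- match(Artist[j], ids)
      B[min(row_i, col_j), max(row_i, col_j)] <- 1
    }
  }
}

# Mirror the upper triangle into the lower one, as suggested above.
B[lower.tri(B)] <- t(B)[lower.tri(B)]
```

This halves the number of comparisons, but the work still grows with the square of the number of observations, so on its own it is unlikely to be enough for 96000 rows.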


thank you Berend,

what I would like to have is a symmetric matrix, where rows and columns are artists and the value is 1 (or TRUE) if they had an exhibition at the same time and in the same place. My inelegant code is working, but for 96000 observations it requires months and months. Gerrit's solution is very elegant, but I run out of memory...

The problem is the size. I am looking for a way to divide Gerrit's solution into smaller steps that can be handled.

thanx

Marco
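
One possible way to work in smaller steps, sketched here under assumptions and not taken from Gerrit's code (which is not shown in this message): split the data by institution, compare exhibitions only within each institution, and collect the co-exhibiting artists as a list of pairs instead of one large matrix. For scale, a single logical matrix over all 96000 observations would hold 96000^2 entries, roughly 37 GB at 4 bytes per entry. The names dat, pairs_one_place and edges are invented for this sketch.

```r
Artist <- c(1, 2, 3, 2, 4, 4, 5)
Begin <- as.Date(c('2006-08-23', '2006-03-21', '2006-03-06', '2006-01-13',
                   '2006-05-20', '2006-07-13', '2006-07-20'))
End <- as.Date(c('2006-10-23', '2006-11-30', '2006-05-06', '2006-12-13',
                 '2006-09-20', '2006-08-13', '2006-09-20'))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)
dat <- data.frame(Artist, Begin, End, Istitution)

# For the exhibitions of ONE institution, return the pairs of distinct
# artists whose exhibitions overlap in time (smaller id in column a).
pairs_one_place <- function(g) {
  if (nrow(g) < 2) return(data.frame(a = numeric(0), b = numeric(0)))
  b <- as.numeric(g$Begin)
  e <- as.numeric(g$End)
  m <- outer(b, e, "<=")            # m[i, j]: Begin[i] <= End[j]
  overlap <- m & t(m)               # both directions hold: intervals overlap
  idx <- which(overlap & upper.tri(overlap), arr.ind = TRUE)
  a1 <- g$Artist[idx[, 1]]
  a2 <- g$Artist[idx[, 2]]
  keep <- a1 != a2                  # ignore an artist paired with himself
  data.frame(a = pmin(a1, a2)[keep], b = pmax(a1, a2)[keep])
}

# One small overlap matrix per institution; keep each artist pair once.
edges <- unique(do.call(rbind,
                        lapply(split(dat, dat$Istitution), pairs_one_place)))
```

The pair list can be turned into a matrix afterwards if the number of distinct artists is small enough. If a single institution still has too many exhibitions, the same idea could be applied to finer groups, though splitting by time needs care with exhibitions that span a group boundary.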

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
