At 18:23 22/08/2010, Cecilia Carmo wrote:
I have done
intersect(names(df1), names(df2))
[1] "firm" "year"

This is the key I used to merge
merge(df1,df2,by=c("firm","year"))

And there is just one firm/year row in df1 that matches each firm/year row in df2. df1 has more firm/year rows than df2, and the extra ones don't match any row in df2.

That is what you believe but it seems that R disagrees.

I imagine the dataframes are too big to post, so what I would try first is to create new dataframes containing just the variables firm and year (say newdf1 and newdf2), merge them, and see whether I get the expected number of rows. If I do, then I would add the other variables back into the dataframes until the problem reappears.
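
A minimal sketch of that check, using made-up toy data in place of the real df1 and df2 (the columns sales and debt and all the values are assumptions for illustration only). It merges the key columns alone and then counts how often each firm/year key occurs in each input, since any key whose two counts multiply to more than 1 will produce repeated rows:

```r
# Hypothetical toy data standing in for the real df1 and df2.
df1 <- data.frame(firm = c("A", "A", "B", "C"),
                  year = c(2008, 2009, 2009, 2009),
                  sales = c(1, 2, 3, 4),
                  stringsAsFactors = FALSE)
df2 <- data.frame(firm = c("A", "A", "A", "B"),
                  year = c(2008, 2009, 2009, 2009),
                  debt = c(5, 6, 6, 7),
                  stringsAsFactors = FALSE)

keys <- c("firm", "year")

# Keep only the key columns, as suggested above.
newdf1 <- df1[, keys]
newdf2 <- df2[, keys]

# Merge on the keys alone and compare the row count with what you expect.
key_merge <- merge(newdf1, newdf2, by = keys)
nrow(key_merge)

# Count occurrences of each key in each input.
cnt1 <- aggregate(n1 ~ firm + year, data = cbind(newdf1, n1 = 1), FUN = sum)
cnt2 <- aggregate(n2 ~ firm + year, data = cbind(newdf2, n2 = 1), FUN = sum)
counts <- merge(cnt1, cnt2, by = keys)

# Keys that contribute more than one row to the merged result.
offending <- counts[counts$n1 * counts$n2 > 1, ]
offending
```

If the key-only merge already has too many rows, the keys themselves are duplicated somewhere. If it has the expected number, adding the other variables back a few at a time should show where the problem comes in.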


Cecília

On Sun, 22 Aug 2010 12:09:57 -0500, Erik Iverson <er...@ccbr.umn.edu> wrote:
Cecilia -
Find what columns you're matching on:
intersect(names(df1), names(df2))
Maybe that will shed some light on the issue.
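
The reason this is worth checking: when no by argument is given, merge matches on every column name the two dataframes share. A small sketch with made-up data (the frames a and b and the extra shared column country are hypothetical, not from the original post):

```r
# Hypothetical toy data: both frames also share a "country" column.
a <- data.frame(firm = c("A", "B"), year = c(2009, 2009),
                country = c("PT", "PT"), sales = c(10, 20),
                stringsAsFactors = FALSE)
b <- data.frame(firm = c("A", "B"), year = c(2009, 2009),
                country = c("PT", "ES"), debt = c(5, 7),
                stringsAsFactors = FALSE)

# The columns merge() will match on when `by` is not supplied.
shared <- intersect(names(a), names(b))
shared

# Default: matches on firm, year AND country, so firm B drops out here.
default_merge <- merge(a, b)

# Explicit key: matches on firm and year only; the shared non-key column
# comes back twice, as country.x and country.y.
keyed_merge <- merge(a, b, by = c("firm", "year"))
```

Naming the key explicitly, as in the second call, avoids surprises from columns that happen to have the same name in both dataframes.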
On 08/22/2010 12:02 PM, Cecilia Carmo wrote:
Thanks, but I don't have multiple matches and the lines repeated in the
final dataframe are exactly equal in all columns.

Cecília

On Sat, 21 Aug 2010 10:58:53 -0500, Hadley Wickham <had...@rice.edu> wrote:
You may find a close reading of ?merge helpful, particularly this
sentence: "If there is more than one match, all possible
matches contribute one row each" (so check that you don't have
multiple matches).
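
To illustrate that sentence, here is a small sketch with invented data (the columns sales and debt and all values are assumptions). A key that appears twice in one dataframe and three times in the other yields 2 * 3 = 6 merged rows, and if the duplicated input rows are identical, so are the merged rows:

```r
# Hypothetical toy data; the real df1 and df2 are not shown in the thread.
df1 <- data.frame(firm = c("A", "A", "B"),
                  year = c(2009, 2009, 2009),
                  sales = c(10, 10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(firm = c("A", "A", "A", "B"),
                  year = c(2009, 2009, 2009, 2009),
                  debt = c(5, 5, 5, 7),
                  stringsAsFactors = FALSE)

keys <- c("firm", "year")

# Firm A/2009 contributes 2 * 3 rows, firm B/2009 contributes 1 * 1.
merged <- merge(df1, df2, by = keys)
nrow(merged)

# Rows whose key has already appeared earlier in the same dataframe.
dup1 <- df1[duplicated(df1[, keys]), keys]
dup2 <- df2[duplicated(df2[, keys]), keys]
dup1
dup2
```

If dup1 and dup2 both come back empty for the real data, multiple matches on the key are not the cause and the problem lies elsewhere.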

Hadley

On Sat, Aug 21, 2010 at 10:45 AM, Cecilia Carmo <cecilia.ca...@ua.pt> wrote:
Hi everyone,

I have been merging many big dataframes (about 80000 rows each) and I never
had this problem, but now it happened to me and I want to know if someone
knows what could be happening.
The final dataframe has many rows, an impossible number! I have done
edit(dataframe) and I saw that there are many repeated rows (all equal).

Thanks for any help,

Cecília Carmo
Universidade de Aveiro
Portugal

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/




Michael Dewey
http://www.aghmed.fsnet.co.uk
