Ignore the third one, my math was bad... it worked out to 733 bytes/row, and it ended up being 6.6 gig because it compacted somewhat after the load finished and the load was light (I noticed that a bit later).
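For anyone checking that arithmetic, it is just total on-disk size divided by row count (a rough sketch in Python, using the 9,000,000 rows and the 8 gig / 6.6 gig sizes from the test quoted below; nothing Cassandra-specific):

    # Back-of-envelope per-row size from the numbers in this thread.
    rows = 9_000_000
    before_compaction = 8e9    # "8Gig of data" reported right after the load
    after_compaction = 6.6e9   # "6.6 gig" once compaction caught up under light load
    print(before_compaction / rows)   # ~889 bytes per 50-column row
    print(after_compaction / rows)    # ~733 bytes per row, i.e. KB-scale, not MB-scale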
But what about the other two? Is that approximately the time to expect?
Thanks,
Dean

On 8/10/12 3:50 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

>****** 3. In my test below, I see there is now 8Gig of data and 9,000,000
>rows. Does that sound right? Nearly 1MB of space is used per row for a
>50-column row???? That sounds like a huge amount of overhead. (My values
>are longs on every column, but that is still not much.) I was expecting
>maybe KB per row, but MB per row? My column names are "col"+i as well, so
>they are very short too.
>
>A common configuration is 1T drives per node, so I was wondering if
>anyone has run any tests with map/reduce on reading in all those rows (not
>doing anything with them, just reading them in).
>
>****** 1. How long does it take to go through the 500G that would be on
>that node?
>
>I ran some tests on just writing a fake table 50 columns wide and am
>seeing it will take about 31 hours to write 500G of information (a node
>is about full at 500G since we need to reserve 30-50% of the space for
>compaction and such). I.e. if I need to rerun any kind of indexing, it
>will take 31 hours... does this sound about normal/ballpark? Obviously
>many nodes will be below that, so that would be the worst case with 1T
>drives.
>
>****** 2. Anyone have any other data?
>
>Thanks,
>Dean
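For question 1, the 31-hour figure works out to roughly the following sustained rate (again only a rough sketch, assuming "500G" means 500 decimal gigabytes and reusing the ~733 bytes/row from above):

    # Implied single-node write throughput from the quoted test.
    total_bytes = 500e9          # assumed: "500G" taken as 500 * 10^9 bytes
    seconds = 31 * 3600          # the 31 hours quoted above
    bytes_per_sec = total_bytes / seconds
    rows_per_sec = bytes_per_sec / 733        # ~733 bytes per 50-column row
    print(bytes_per_sec / 1e6)   # ~4.5 MB/s sustained
    print(rows_per_sec)          # ~6,100 rows/s (~300k columns/s at 50 columns/row)

That is only the rate implied by the numbers in the mail, not a measured benchmark, so whether it is normal will depend on the client setup and cluster size.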