Re: better partitioning strategy in hive

2012-02-18 Thread rk vishu
> Hello All, We have a Hive table partitioned by date and hour (330 columns). We have 5 years' worth of data for the table. Each hourly partition has around 800MB. So total 43,800 partitions with one file per partition. When we run select count(*) from table, Hive is taking forever to s
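The partition count quoted above can be checked with a quick calculation. This is only a sketch: it assumes 365-day years (leap days ignored) and decimal terabytes, and the daily-partition figure is a hypothetical comparison point, not something proposed in the thread.

```python
# Back-of-the-envelope check of the figures quoted in the post.
YEARS = 5
DAYS_PER_YEAR = 365        # assumption: leap days ignored
HOURS_PER_DAY = 24
MB_PER_PARTITION = 800     # "around 800MB" per hourly partition


def partition_count(years, partitions_per_day):
    """Number of partitions produced over the given number of years."""
    return years * DAYS_PER_YEAR * partitions_per_day


hourly = partition_count(YEARS, HOURS_PER_DAY)   # date + hour partitioning
daily = partition_count(YEARS, 1)                # hypothetical: date only
total_tb = hourly * MB_PER_PARTITION / 1_000_000  # decimal TB, approximate

print(hourly, daily, total_tb)
```

The hourly figure matches the 43,800 partitions stated in the post; the total size is approximate because the per-partition size is itself given as "around 800MB".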

Medical App

2012-02-18 Thread Dalia Sobhy
Hi all, I am developing a medical app and have reached this design: 1. For structured data such as patient info, medical history, etc.: I would use an RDBMS. 2. For unstructured data such as scans, images, MRIs: I would use HBase, a NoSQL database. Is this architecture sound? I need your opinions.

Re: Optimized Hadoop

2012-02-18 Thread Anty
On Fri, Feb 17, 2012 at 3:27 AM, Todd Lipcon wrote: > Hey Schubert, > > Looking at the code on GitHub, it looks like your rewritten shuffle is > in fact just a backport of the shuffle from MR2. I didn't look closely > additionally, the rewritten shuffle in MR2 has some bugs, which harm the overa

Re: Medical App

2012-02-18 Thread Jamack, Peter
There are ways to integrate each, but you'll have to figure out what actually matches and what doesn't. Maybe a customer ID can link a medical image with their info, etc. I've done a combo of MongoDB as the non-structured OLTP, Postgres as the RDBMS, and then utilizing Hadoop as the analytic por

RE: Medical App

2012-02-18 Thread Dalia Sobhy
My use case is developing a scalable medical app. This app has two main parts: 1. Patient's info. 2. Patient's diagnosis, imaging, etc. 3. How to link between them. I want to build an API for developers to use... I want to use a NoSQL database, because this is my thesis topic and I am stuck in it

Re: Medical App

2012-02-18 Thread Alexander Lorenz
Hi, for unstructured data, HBase; Sqoop as the connector between HBase and the RDBMS. As a REST interface for HBase you could use Stargate. Oozie as the framework for periodic jobs. best, Alex sent via my mobile device On Feb 18, 2012, at 11:17 PM, Dalia Sobhy wrote: > My usecase is developing a sc