> Hello All,
>
> We have a Hive table partitioned by date and hour (330 columns). We have 5
> years' worth of data for the table. Each hourly partition has around 800MB.
> So there are 43,800 partitions in total, with one file per partition.
>
> When we run select count(*) from table, Hive is taking forever to s
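As a quick sanity check on the figures quoted above, here is a minimal sketch of the arithmetic. It assumes 365-day years (leap days ignored) and decimal units (1 TB = 1,000,000 MB); the constant names are only illustrative.

```python
# Figures taken from the quoted message; names are illustrative.
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365        # assumption: leap days ignored
YEARS = 5
MB_PER_PARTITION = 800     # "around 800MB" per hourly partition

# One partition per hour over five years.
partitions = YEARS * DAYS_PER_YEAR * HOURS_PER_DAY

# Approximate total data volume in decimal terabytes.
total_tb = partitions * MB_PER_PARTITION / 1_000_000

print(partitions)
print(round(total_tb, 1))
```

Under those assumptions the partition count matches the 43,800 stated in the message, and the table holds roughly 35 TB.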
Hi all,
I am developing a medical app, and this is the design I have reached so far:
1. For structured data such as patient info, medical history, etc., I would use
an RDBMS.
2. For unstructured data such as scans, images, and MRIs, I would use HBase
(a NoSQL database).
Is this new architecture good?
I need your opinions.
On Fri, Feb 17, 2012 at 3:27 AM, Todd Lipcon wrote:
> Hey Schubert,
>
> Looking at the code on github, it looks like your rewritten shuffle is
> in fact just a backport of the shuffle from MR2. I didn't look closely
>
Additionally, the rewritten shuffle in MR2 has some bugs, which harm the
overa
There are ways to integrate each, but you'll have to figure out what
actually matches and what doesn't. Maybe a customer ID can link a medical
image with their info, etc.
I've done a combo of MongoDB as the unstructured OLTP store, Postgres as the
RDBMS, and then Hadoop as the analytic por
My use case is developing a scalable medical app.
This app has two main parts, plus a third concern:
1. Patient's info
2. Patient's diagnosis, imaging, etc.
3. How to link between them.
I want to provide an API for developers to use.
I want to use a NoSQL database, because this is my thesis topic and I am stuck
in it
Hi,
For unstructured data, use HBase, with Sqoop as the connector between HBase and
the RDBMS. As a REST interface for HBase you could use Stargate, and Oozie as a
framework for periodic jobs.
best,
Alex
sent via my mobile device
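
To make the Stargate suggestion above concrete, here is a minimal sketch of building the JSON body for storing one cell through the HBase REST gateway. It assumes the gateway's JSON representation, in which row keys, column names, and values are base64-encoded. The row key `patient-42`, the column `scan:mri`, and the helper names are hypothetical and not taken from the thread. Nothing is sent over the network here.

```python
import base64
import json


def b64(data: bytes) -> str:
    """Base64-encode bytes and return an ASCII string."""
    return base64.b64encode(data).decode("ascii")


def build_cell_put(row_key: str, column: str, value: bytes) -> str:
    """Build a JSON body holding one cell (hypothetical helper).

    Assumption: the REST gateway expects base64-encoded key,
    column ("family:qualifier"), and value fields.
    """
    payload = {
        "Row": [
            {
                "key": b64(row_key.encode("utf-8")),
                "Cell": [
                    {
                        "column": b64(column.encode("utf-8")),
                        "$": b64(value),
                    }
                ],
            }
        ]
    }
    return json.dumps(payload)


# Hypothetical row key and column; the value stands in for image bytes.
body = build_cell_put("patient-42", "scan:mri", b"\x00\x01raw-bytes")
print(body)
```

In a real deployment this body would be sent with an HTTP PUT to the table and row URL on the REST server with `Content-Type: application/json`. Using the patient ID as the row key is one way the image could be linked back to the patient's record in the RDBMS.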
On Feb 18, 2012, at 11:17 PM, Dalia Sobhy wrote:
> My usecase is developing a sc