Ian,
You still want to stick to your relational modeling. :-(
You need to play around more with hierarchical models to get a better
appreciation.
If you model as if you're working with an RDBMS, then you will end up with a poor
HBase table design.
In ERD models, you don't have the concept o
Sure. Maybe it's useful to talk about the functional aspect of relationships in
models. In an RDBMS, explicit relationships play a couple of roles:
- foreign key constraints: don't allow a tuple in relation A to point to a row
in relation B that doesn't exist
- join optimization - knowledge of how t
An entity is an entity.
When you couple them you are saying that there's a relationship to them in the
model.
What I am saying is that you can have an HBase model which is not a single
table, however when you look at your use case, you are querying data from a
single table at a time.
Going
Mike, what do you mean by "you can have entities, except that they are not
coupled"? You mean, they have no relationship to each other? Or the
relationship is defined elsewhere (e.g. application code)? The concept of
"coupling" seems a little overloaded and not as concise here as "relationship".
LOL...
Ian wrote:
"But, something just occurred to me: just because your physical implementation
(HBase) doesn't support normalized entities and relationships doesn't mean your
*problem* doesn't have entities and relationships. :) An Author is one entity,
a Title is another, and a Genre is a th
Mike and I get into good discussions about ERD modeling and HBase a lot ... :)
Mike's right that you should avoid a design that relies heavily on
relationships when modeling data in HBase, because relationships are tricky
(they're the first thing that gets thrown out the window in a database that
Sorry, but you missed the point.
(Note: This is why I keep trying to put a talk at Strata and the other
conferences on Schema design yet for some reason... it just doesn't seem
important enough or sexy enough... maybe if I worked for Cloudera/Intel/etc ...
;-)
Look,
The issue is what is a
I understand that there shouldn't be an unlimited number of column families. I
am using this example on purpose to see how it comes into play.
On Fri, Jul 5, 2013 at 12:07 PM, Michael Segel wrote:
> Why do you have so many column families (CF) ?
>
> It's not a question of the physical limitations, b
Why do you have so many column families (CF) ?
It's not a question of the physical limitations, but more an issue of data
design.
There aren't that many really good examples of where you would have multiple
column families that would require more than a handful of CFs.
When I teach or le
Asaf,
I am using the Genre/Author stuff as an example, but yes, at the moment I
only have 5 column families. However, over time I may have more (no upper
limit decided at this point). See below for more responses.
On Wed, Jul 3, 2013 at 3:42 PM, Asaf Mesika wrote:
> Do you have only 5 static a
Do you have only 5 static author names?
Keep in mind the column family name is defined when creating the table.
Regarding tall vs wide debate:
HBase is first and foremost a key-value database, so it reads and writes at
the column-value level. So it doesn't really care about rows.
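To illustrate the key-value view described above, here is a toy sketch in Java. It is not HBase code: ToyCellStore and its methods are made up for illustration, timestamps are left out, and a sorted in-memory map stands in for the store. The only point is that each cell is addressed by its full key, and a "row" is just the run of adjacent keys that share a row prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (hypothetical class, not the HBase API): every cell is one
// entry in a sorted map, keyed by row + family + qualifier.
class ToyCellStore {
    // Separator assumed never to appear inside row, family or qualifier names.
    private static final char SEP = '\u0000';
    private final TreeMap<String, String> cells = new TreeMap<>();

    // A write touches exactly one cell.
    void put(String row, String family, String qualifier, String value) {
        cells.put(row + SEP + family + SEP + qualifier, value);
    }

    // A point read looks up exactly one cell by its full key.
    String get(String row, String family, String qualifier) {
        return cells.get(row + SEP + family + SEP + qualifier);
    }

    // "Reading a row" is a scan over the adjacent keys sharing the row prefix.
    List<String> readRow(String row) {
        String prefix = row + SEP;
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> e : cells.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the row's key range
            }
            values.add(e.getValue());
        }
        return values;
    }
}
```

In this sketch a row with two cells and a row with two thousand cells differ only in how long the prefix scan runs.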
But it's not entire
Not off hand.
But it's something that I think I could cobble together over the next couple of days
if my wife runs out of projects for me to do around the house. ;-)
On Jul 3, 2013, at 12:57 PM, Stack wrote:
> On Wed, Jul 3, 2013 at 7:08 AM, Michael Segel
> wrote:
>
>> Really a bad title for th
On Wed, Jul 3, 2013 at 7:08 AM, Michael Segel wrote:
> Really a bad title for the section.
>
> Schema Smackdown? Really?
>
> 6.10.1 isn't really valid is it? Rows versus versions?
> IMHO it should be Columns versus versions. (Do you put a timestamp in the
> column qualifier name versus having
Really a bad title for the section.
Schema Smackdown? Really?
6.10.1 isn't really valid is it? Rows versus versions?
IMHO it should be Columns versus versions. (Do you put a timestamp in the
column qualifier name versus having an enormous number of versions allowed?)
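As a rough illustration of the columns-versus-versions choice, here is a toy sketch in Java. Nothing here is HBase API: ColumnsVsVersions, the "event:" qualifier prefix and the zero-padding width are all assumptions made up for the example. Option A puts the timestamp into the qualifier name, so every event is its own column. Option B keeps one column and stores events as versions, newest first, capped at maxVersions. The sketch prunes old versions eagerly for simplicity; that is an assumption of the toy, not a statement about when HBase discards data.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Toy comparison only (hypothetical class, not the HBase API).
class ColumnsVsVersions {

    // Option A: one column per event, timestamp embedded in the qualifier.
    // Zero-padding keeps the qualifiers sorted in time order.
    static TreeMap<String, String> asColumns(long[] timestamps, String[] values) {
        TreeMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            row.put(String.format("event:%013d", timestamps[i]), values[i]);
        }
        return row;
    }

    // Option B: a single column whose versions are keyed by timestamp,
    // newest first, keeping at most maxVersions entries.
    static TreeMap<Long, String> asVersions(long[] timestamps, String[] values,
                                            int maxVersions) {
        TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
        for (int i = 0; i < timestamps.length; i++) {
            versions.put(timestamps[i], values[i]);
            while (versions.size() > maxVersions) {
                versions.pollLastEntry(); // last entry is the oldest version
            }
        }
        return versions;
    }
}
```

With three events and maxVersions set to 2, option A keeps all three columns while option B keeps only the two newest values, which is the trade-off the question in parentheses is pointing at.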
There's more, but I
I have a major typo in the question so I apologize. I meant to say 5
families with 1000+ qualifiers each.
Let's work with an example (not the greatest example, but still). Let's
say we have a Genre class like this:
Class HistoryBooks{
ArrayList author1;
ArrayList author2;
ArrayList author3
If they are accessed mostly together they should all be a single column family.
The key with tall or wide is based on the total byte size of each KeyValue.
Your cells would need to be quite large for 50 to become a problem. I still
would recommend using a single CF though.
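To put rough numbers on the point about KeyValue byte size, here is a small Java sketch. It assumes the classic KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type, plus the row key, family, qualifier and value bytes) and ignores tags, block encoding and compression. The class and method names are made up for illustration and are not part of the HBase API.

```java
// Back-of-the-envelope estimate only (hypothetical class, not the HBase API).
class KeyValueSizeEstimate {
    // key length + value length + row length + family length + timestamp + type
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    // Approximate bytes for one cell. Note the row key and family name are
    // repeated in every cell of the row.
    static long cellBytes(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return (long) FIXED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    // Approximate bytes for a row of equally sized cells.
    static long rowBytes(int cells, int rowLen, int familyLen,
                         int qualifierLen, int valueLen) {
        return cells * cellBytes(rowLen, familyLen, qualifierLen, valueLen);
    }
}
```

For example, with a 16-byte row key, a 1-byte family name, 8-byte qualifiers and 100-byte values, cellBytes gives 145 bytes, and a row of 50 such cells comes to 7250 bytes.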
On
The section on Rows vs. Columns at
http://hbase.apache.org/book/schema.smackdown.html talks about expanding
horizontally vs. vertically.
Can someone please explain to me when to choose rows vs. columns? The
section reads, "To be clear, this guideline is in the context is in
extremely wide cases,