Re: hbase rowkey design

2016-05-16 Thread Heng Chen
In my company, we calculate UV/PV offline in batch and update it every day. If you do it online, url + timestamp could be the rowkey. 2016-05-16 18:13 GMT+08:00 齐忠 : > Yes, like Google Analytics. > > 2016-05-16 17:48 GMT+08:00 Heng Chen : > > You want to calculate UV/PV online? > > > > 2016-05-16 16
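A minimal sketch of the `url + timestamp` key suggested above, written in Python purely for illustration (an HBase client would build the same bytes through its own API). The zero-byte separator and the 8-byte big-endian timestamp are assumptions, not something the thread specifies; the fixed-width timestamp is what makes keys for one URL sort chronologically under byte-wise comparison.

```python
import struct

SEPARATOR = b"\x00"  # assumption: URLs never contain a NUL byte


def make_rowkey(url, ts_seconds):
    """Concatenate the URL, a separator, and a fixed-width big-endian timestamp."""
    return url.encode("utf-8") + SEPARATOR + struct.pack(">Q", ts_seconds)


def split_rowkey(rowkey):
    """Recover (url, timestamp); the timestamp is always the last 8 bytes."""
    url_bytes, ts_bytes = rowkey[:-9], rowkey[-8:]
    return url_bytes.decode("utf-8"), struct.unpack(">Q", ts_bytes)[0]


# Sample events taken from the original question in this thread.
events = [
    ("http://www.aaa.com?a=b&c=d&e=fa", 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=f", 1463387380),
    ("http://www.aaa.com?a=b&c=d&e=f", 1463387280),
]

# Byte-wise sorting mimics how rows would be ordered in the table.
keys = sorted(make_rowkey(url, ts) for url, ts in events)
for key in keys:
    print(split_rowkey(key))
```

With this layout, all events for one URL are contiguous and ordered by time, so a prefix scan on `url + separator` returns one URL's history. At 50T of logs per day, hashing the URL to a fixed width may be preferable to storing it verbatim; that choice is outside what the thread discusses.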

Re: hbase rowkey design

2016-05-16 Thread 齐忠
Yes, like Google Analytics. 2016-05-16 17:48 GMT+08:00 Heng Chen : > You want to calculate UV/PV online? > > 2016-05-16 16:46 GMT+08:00 齐忠 : > >> I have a very large log (50T per day). >> >> My log events are as follows: >> >> url,visitid,requesttime >> >> http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380 >> h

Re: hbase rowkey design

2016-05-16 Thread Heng Chen
You want to calculate UV/PV online? 2016-05-16 16:46 GMT+08:00 齐忠 : > I have a very large log (50T per day). > > My log events are as follows: > > url,visitid,requesttime > > http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380 > http://www.aaa.com?a=b&c=d&e=fa, 1, 1463387280 > http://www.aaa.com?a=b&c=d&e=fa, 2

hbase rowkey design

2016-05-16 Thread 齐忠
I have a very large log (50T per day). My log events are as follows: url,visitid,requesttime http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380 http://www.aaa.com?a=b&c=d&e=fa, 1, 1463387280 http://www.aaa.com?a=b&c=d&e=fa, 2, 1463387280 http://www.aaa.com?a=b&c=d&e=fab, 2, 1463387280 http://www.aaa.com?a=b&c

Re: Rowkey design

2015-11-30 Thread Mohammad Tariq
Hi Marko, Scan expects complete start and end row keys, IIRC. Order would anyway get disturbed as you are salting your keys. Tariq, Mohammad about.me/mti On Mon, Nov 30, 2015 at 1:19 PM, Marko Dinic wrote: > Hi Ted, > > Thank you for th

Re: Rowkey design

2015-11-29 Thread Marko Dinic
Hi Ted, Thank you for that information. Do you have some other suggestion, perhaps? Best regards, Marko On Monday, November 30, 2015, Ted Yu wrote: > bq. duplicate data to two different tables, one with > (salt-productId-timestamp) > and other with (salt-productId-place) keys > > I suggest thi

Re: Rowkey design

2015-11-29 Thread Marko Dinic
Hi Tariq, Thank you for your answer. But won't that break the ordering of my rows by timestamp thus making it impossible to scan by time range using STARTROW ENDROW? Best regards, Marko On Monday, November 30, 2015, Mohammad Tariq wrote: > Hi Marko, > > You could add the place(and unit as wel

Re: Rowkey design

2015-11-29 Thread Ted Yu
bq. duplicate data to two different tables, one with (salt-productId-timestamp) and other with (salt-productId-place) keys I suggest thinking twice about the above schema. It may become tricky to keep the data in the two tables in sync. Meaning, when the update to table1 succeeds but the update to table2 fails,

Re: Rowkey design

2015-11-29 Thread Mohammad Tariq
Hi Marko, You could add the place (and unit as well) to your key, if that doesn't make it too long, and then use RowFilter with SubstringComparator to get the desired rows. Tariq, Mohammad about.me/mti On Mon, Nov 30, 2015 at 3:49 AM, Mark

Rowkey design

2015-11-29 Thread Marko Dinic
Hello, everyone! I'm new to HBase and I need help designing rowkeys for use case that looks like this: - Products are listed, where each product has a product id. - Each product has a timestamp. - Each product is created in certain place (e.g. city) - Each product is created by some unit (e.g. fa

Re: hbase rowkey design ways

2015-08-23 Thread Ted Yu
Have you read the following? http://hbase.apache.org/book.html#rowkey.design Cheers On Sun, Aug 23, 2015 at 8:01 AM, jackiehbaseuser wrote: > > Hi > > How many ways are there to design an HBase rowkey? Please give some examples. > > Thank you very much! > > Best regards! > > qiguo

hbase rowkey design ways

2015-08-23 Thread jackiehbaseuser
Hi, How many ways are there to design an HBase rowkey? Please give some examples. Thank you very much! Best regards! qiguo

Re: Rowkey design question

2015-04-17 Thread Michael Segel
it, you can just retry. > You can either leave orphaned data (since the commit bit is not set, it's not > visible to a client), or you periodically look for those and clean them up. > > Hope this helps. Please let us know how it goes. > > -- Lars > > > ___

Re: Rowkey design question

2015-04-11 Thread lars hofhansl
From: Kristoffer Sjögren To: user@hbase.apache.org Sent: Wednesday, April 8, 2015 6:41 AM Subject: Re: Rowkey design question Yes, I think you're right. Adding one or more dimensions to the rowkey would indeed make the table narrower. And I guess it also make sense to store actual va

Re: Rowkey design question

2015-04-11 Thread Andrew Purtell
ny case and in all seriousness. Michael, feel free to educate > > yourself about what the intended use of coprocessors is - preferably > before > > you come here and start an argument ... again. We're more than happy to > > accept a patch from you with a "correct" i

Re: Rowkey design question

2015-04-11 Thread Kevin O'dell
Trying to figure out the best place to jump in here... Kristoffer, I would like to echo what Michael and Andrew have said. While a pre-aggregation co-proc may "work" in my experience with co-procs they are typically more trouble than they are worth. I would first try this outside the client t

Re: Rowkey design question

2015-04-11 Thread Sean Busbey
> > > > Can we just let this thread die? It didn't start with a useful > proposition. > > > > -- Lars > > > > From: Andrew Purtell > > To: "user@hbase.apache.org" > > Sent: Thursday, April 9, 2015 4:53 PM > > Subject: Re: Rowkey

Re: Rowkey design question

2015-04-11 Thread Michael Segel
is thread die? It didn't start with a useful proposition. > > -- Lars > > From: Andrew Purtell > To: "user@hbase.apache.org" > Sent: Thursday, April 9, 2015 4:53 PM > Subject: Re: Rowkey design question > > On Thu, Apr 9, 2015 at 2:26 PM, Michael

Re: Rowkey design question

2015-04-09 Thread Michael Segel
Andrew, In a nutshell, running end user code within the RS JVM is a bad design. To be clear, this is not just my opinion… I just happen to be more vocal about it. ;-) We’ve covered this ground before, and just because the code runs doesn’t mean it’s good. Or that the design is good. I would love

Re: Rowkey design question

2015-04-09 Thread Andrew Purtell
This is one person's opinion, to which he is absolutely entitled, but a blanket black-and-white statement like "coprocessors are poorly implemented" is obviously not an opinion shared by all those who have used them successfully, nor by the HBase committers, or we would remove the feature. On the ot

Re: Rowkey design question

2015-04-09 Thread Michael Segel
Ok… Coprocessors are poorly implemented in HBase. If you work in a secure environment, outside of the system coprocessors… (ones that you load from hbase-site.xml) , you don’t want to use them. (The coprocessor code runs on the same JVM as the RS.) This means that if you have a poorly written

Re: Rowkey design question

2015-04-08 Thread Kristoffer Sjögren
An HBase coprocessor. My idea is to move as much pre-aggregation as possible to where the data lives in the region servers, instead of doing it in the client. If there is good data locality inside and across rows within regions then I would expect aggregation to be faster in the coprocessor (utiliz

Re: Rowkey design question

2015-04-08 Thread Michael Segel
When you say coprocessor, do you mean HBase coprocessors or do you mean a physical hardware coprocessor? In terms of queries… HBase can perform a single get() and return the result back quickly. (The size of the data being returned will impact the overall timing.) HBase also caches the resu

Re: Rowkey design question

2015-04-08 Thread Kristoffer Sjögren
But if the coprocessor is omitted then CPU cycles from region servers are lost, so where would the query execution go? Queries need to be quick (sub-second rather than seconds) and HDFS is quite latency hungry, unless there are optimizations that I'm unaware of? On Wed, Apr 8, 2015 at 7:43 PM,

Re: Rowkey design question

2015-04-08 Thread Michael Segel
I think you misunderstood. The suggestion was to put the data in to HDFS sequence files and to use HBase to store an index in to the file. (URL to the file, then offset in to the file for the start of the record…) The reason you want to do this is that you’re reading in large amounts of data

Re: Rowkey design question

2015-04-08 Thread Kristoffer Sjögren
Yes, I think you're right. Adding one or more dimensions to the rowkey would indeed make the table narrower. And I guess it also makes sense to store actual values (bigger qualifiers) outside HBase. Keeping them in Hadoop, why not? Pulling hot ones out on SSD caches would be an interesting solution.

Re: Rowkey design question

2015-04-08 Thread Michael Segel
Ok… First, I’d suggest you rethink your schema by adding an additional dimension. You’ll end up with more rows, but a narrower table. In terms of compaction… if the data is relatively static, you won’t have compactions because nothing changed. But if your data is that static… why not put the

Re: Rowkey design question

2015-04-08 Thread Kristoffer Sjögren
I just read through HBase MOB design document and one thing that caught my attention was the following statement. "When HBase deals with large numbers of values > 100kb and up to ~10MB of data, it encounters performance degradations due to write amplification caused by splits and compactions." Is

Re: Rowkey design question

2015-04-08 Thread Kristoffer Sjögren
A small set of qualifiers will be accessed frequently so keeping them in block cache would be very beneficial. Some very seldom. So this sounds very promising! The reason why i'm considering a coprocessor is that I need to provide very specific information in the query request. Same thing with the

Re: Rowkey design question

2015-04-07 Thread Nick Dimiduk
Those rows are written out into HBase blocks on cell boundaries. Your column family has a BLOCKSIZE attribute, for which you may or may not have overridden the default of 64k. Cells are written into a block until it is >= the target block size. So your single 500mb row will be broken down into thousand
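As a rough worked example of the arithmetic above, the following Python snippet estimates how many blocks a 500 MB row would span at the default 64 KB block size. It is only an approximation: as the message says, a block is closed once it reaches or exceeds the target size, and a single cell larger than the target (such as the 1-5 MB values in this thread) is not split across blocks, so the real count will differ somewhat.

```python
KIB = 1024
MIB = 1024 * KIB

row_size = 500 * MIB    # the single large row discussed in this thread
block_size = 64 * KIB   # the default block size mentioned above

# Ceiling division: a partially filled final block still counts as a block.
approx_blocks = -(-row_size // block_size)
print(approx_blocks)
```

The point of the estimate is that reading a handful of qualifiers does not require loading the whole row, only the blocks that hold the requested cells.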

Re: Rowkey design question

2015-04-07 Thread Kristoffer Sjögren
Sorry, I should have explained my use case a bit more. Yes, it's a pretty big row and it's "close" to worst case. Normally there would be fewer qualifiers and the largest qualifiers would be smaller. The reason these rows get big is that they store aggregated data in indexed compressed fo

Re: Rowkey design question

2015-04-07 Thread Michael Segel
Sorry, but your initial problem statement doesn’t seem to parse … Are you saying that you have a single row with approximately 100,000 elements where each element is roughly 1-5KB in size and in addition there are ~5 elements which will be between one and five MB in size? And you then mention a co

Re: Rowkey design question

2015-04-07 Thread Imants Cekusins
> how HBase loads the data into memory. If you init Get and specify columns with addColumn, it is likely that only data for these columns is read and loaded in memory. Rowkey is best kept short. So are column qualifiers.

Rowkey design question

2015-04-07 Thread Kristoffer Sjögren
Hi, I have a row with around 100,000 qualifiers with mostly small values around 1-5KB and maybe 5 larger ones around 1-5 MB. A coprocessor does random access of 1-10 qualifiers per row. I would like to understand how HBase loads the data into memory. Will the entire row be loaded or only the qualif

Re: Newbie question: Rowkey design

2013-12-17 Thread Wilm Schumacher
> prefix). >> >> You can have a look at secondary index. Secondary index is very helpful. >> >> >> >> >> 2013/12/16 Wilm Schumacher >> >>> Hi, >>> >>> I'm a newbie to hbase and have a question on the rowkey design and I >

Re: Newbie question: Rowkey design

2013-12-17 Thread yonghu
irectly (say you have a random number as the row key's > prefix). > > You can have a look at secondary index. Secondary index is very helpful. > > > > > 2013/12/16 Wilm Schumacher > > > Hi, > > > > I'm a newbie to hbase and have a question on the

Re: Newbie question: Rowkey design

2013-12-16 Thread Tao Xiao
base and have a question on the rowkey design and I > hope this question isn't too newbie-like for this list. I have a question > which cannot be answered by knowledge of code but by experience with > large databases, thus this mail. > > For the sake of explanation I create a s

Newbie question: Rowkey design

2013-12-16 Thread Wilm Schumacher
Hi, I'm a newbie to hbase and have a question on the rowkey design and I hope this question isn't too newbie-like for this list. I have a question which cannot be answered by knowledge of code but by experience with large databases, thus this mail. For the sake of explanation I crea

Re: Hbase RowKey design schema

2013-08-29 Thread Doug Meil
Hi there, One thing to mention about the BigTable paper is they reverse the URL so that scans work with subdomains. www.subdomain1.cnn.com -> com.cnn.subdomain1.www www.subdomain2.cnn.com -> com.cnn.subdomain2.www If you don't reverse the URL there isn't an easy scan (short of creating another
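The reversal described above can be sketched in a few lines of Python. The `cnn.com` hosts are the ones from the message; the `bbc.co.uk` host and the helper name `reverse_domain` are made up for illustration.

```python
def reverse_domain(host):
    """Reverse the dot-separated labels: www.cnn.com -> com.cnn.www."""
    return ".".join(reversed(host.split(".")))


hosts = [
    "www.subdomain2.cnn.com",
    "www.bbc.co.uk",            # hypothetical extra host, not from the thread
    "www.subdomain1.cnn.com",
]

# Sorting the reversed names mimics rowkey order in the table.
keys = sorted(reverse_domain(h) for h in hosts)

# Everything under cnn.com is now one contiguous key range.
cnn_rows = [k for k in keys if k.startswith("com.cnn.")]
print(cnn_rows)
```

Because all subdomains of a site share a key prefix after reversal, a single scan over that prefix covers them, which is the benefit the message attributes to the BigTable paper's layout.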

Re: Hbase RowKey design schema

2013-08-29 Thread Shahab Yunus
What advantage you will be gaining by compressing? Less space? But then it will add compression/decompression performance overhead. A trade-off but a especially significant as space is cheap and redundancy is OK with such data stores. Having said that, more importantly, what are your read use-case

Hbase RowKey design schema

2013-08-29 Thread Wasim Karani
I am using HBase to store webtable content like how google is using bigtable. For reference of google bigtable My question is on RowKey, how we should be forming it. What google is doing is saving the URL in a reverse order as you can see in the PDF document "com.cnn.www" so that all the links ass

Re: issue about rowkey design

2013-08-19 Thread Michael Segel
@despegar.com] > Sent: Sunday, August 18, 2013 6:25 PM > To: user@hbase.apache.org; Kiru Pakkirisamy > Subject: Re: issue about rowkey design > > You can use a secondary table as a 'secondary index' setting your row as > value (or column) in it. > Enviado desde

RE: issue about rowkey design

2013-08-18 Thread Vladimir Rodionov
18, 2013 6:25 PM To: user@hbase.apache.org; Kiru Pakkirisamy Subject: Re: issue about rowkey design You can use a secondary table as a 'secondary index' setting your row as value (or column) in it. Enviado desde mi BlackBerry de Personal (http://www.personal.com.ar/) -Original Message-

Re: issue about rowkey design

2013-08-18 Thread fgaule
y-To: user@hbase.apache.org Subject: Re: issue about rowkey design what you mean secondary index? has hbase secondary index? On Sat, Aug 17, 2013 at 12:48 AM, Kiru Pakkirisamy < kirupakkiris...@yahoo.com> wrote: > We did design with something equivalent to userid as the key and all the > user se

Re: issue about rowkey design

2013-08-18 Thread ch huang
e.org > Sent: Friday, August 16, 2013 8:06 AM > Subject: Re: issue about rowkey design > > > HBase is all about denormalization and designing for the usecase/query > pattern. If it's possible for your application it will be better to > provide three different indexes, as opp

Re: issue about rowkey design

2013-08-16 Thread Kiru Pakkirisamy
From: Bryan Beaudreault To: user@hbase.apache.org Sent: Friday, August 16, 2013 8:06 AM Subject: Re: issue about rowkey design HBase is all about denormalization and designing for the usecase/query pattern.  If it's possible for your application it will be better to provide

Re: issue about rowkey design

2013-08-16 Thread Bryan Beaudreault
HBase is all about denormalization and designing for the usecase/query pattern. If it's possible for your application it will be better to provide three different indexes, as opposed to fitting them all into one rowkey design. On Fri, Aug 16, 2013 at 5:33 AM, ch huang wrote: >

issue about rowkey design

2013-08-16 Thread ch huang
Hi all, I have data (a very large amount) with user id, session id, and visit time. My query patterns are: "find all user ids in a certain time range, find all session ids of one user, and find all session ids in a certain time range". My difficulty is that I cannot find a rowkey that is good for all the s

Re: Rowkey design and presplit table

2013-03-07 Thread Lukáš Drbal
Hello guys, sorry for my longest response, iam working on cluster update from 0.94.1 to 0.94.5. Ted: yes, i'll post my solution after import data into production cluster Asaf: "Why do you need to use prefix split policy?" Maybe i don't need it. I want distribute "unknown" keys to all nodes, avoi

Re: Rowkey design and presplit table

2013-03-07 Thread James Taylor
ROUP BY category_id HAVING count(comment_id) > 100 Regards, James On 03/06/2013 11:42 PM, Asaf Mesika wrote: I would convert each id to long and then use Bytes.toBytes to convert this long to a byte array. If it is an int then even better. Now, write all 3 longs one after another to one
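The suggestion quoted above (convert each id to a long and write the three longs one after another) can be sketched in Python as follows. `Bytes.toBytes(long)` in HBase produces 8 big-endian bytes, which `struct.pack(">q", ...)` reproduces; the field order (category, article, comment) is an assumption, since the quoted message is cut off, and the ids are assumed to be non-negative so that byte order matches numeric order.

```python
import struct


def make_rowkey(category_id, article_id, comment_id):
    """Pack three non-negative ids as fixed-width big-endian longs (24 bytes)."""
    return struct.pack(">qqq", category_id, article_id, comment_id)


def parse_rowkey(rowkey):
    """Inverse of make_rowkey: returns (category_id, article_id, comment_id)."""
    return struct.unpack(">qqq", rowkey)


key = make_rowkey(1, 2, 10)
print(len(key), parse_rowkey(key))

# Fixed-width binary keys sort numerically ...
binary_ok = make_rowkey(1, 2, 3) < make_rowkey(1, 2, 10) < make_rowkey(1, 3, 0)
# ... whereas underscore-joined decimal strings do not.
string_misorders = "1_2_10" < "1_2_3"
print(binary_ok, string_misorders)
```

Fixed-width fields also remove the need for a separator, so no parsing by delimiter is required when reading the key back.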

Re: Rowkey design and presplit table

2013-03-06 Thread Asaf Mesika
int. Why do you need to use prefix split policy? On Monday, March 4, 2013, Lukáš Drbal wrote: > Hi, > > i have one question about rowkey design and presplit table. > > My usecase: > I need store a lot of comments where each comment are for one article and > this article has o

Re: Rowkey design and presplit table

2013-03-04 Thread Ted Yu
> > > > > Lukas Drbal > > > > > > > > > 2013/3/4 Jilal Oussama > > > > > > > You can split in your application using a regular expression on the > > > > underscore char if the language supports them (like sp

Re: Rowkey design and presplit table

2013-03-04 Thread Lukáš Drbal
4 Jilal Oussama > > > > > You can split in your application using a regular expression on the > > > underscore char if the language supports them (like splitting data of a > csv > > > file) > > > > > > > > > 2013/3/4 Lukáš Drbal >

Re: Rowkey design and presplit table

2013-03-04 Thread Ted Yu
> > Lukas Drbal > > > 2013/3/4 Jilal Oussama > > > You can split in your application using a regular expression on the > > underscore char if the language supports them (like splitting data of a csv > > file) > > > > > > 2013/3/4 Lukáš Drbal >

Re: Rowkey design and presplit table

2013-03-04 Thread Lukáš Drbal
using a regular expression on the > underscore char if the language supports them (like splitting data of a csv > file) > > > 2013/3/4 Lukáš Drbal > > > Hi, > > > > i have one question about rowkey design and presplit table. > > > > My usecase: > >

Re: Rowkey design and presplit table

2013-03-04 Thread Jilal Oussama
You can split in your application using a regular expression on the underscore character, if the language supports them (like splitting the fields of a CSV file). 2013/3/4 Lukáš Drbal > Hi, > > i have one question about rowkey design and presplit table. > > My usecase: > I need store a lot
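For illustration, splitting an underscore-delimited key with a regular expression could look like this in Python. The key layout `categoryId_articleId_commentId` and the sample value are assumptions based on the ids named in the original question, not something this message states.

```python
import re


def split_rowkey(rowkey):
    """Split an underscore-delimited rowkey into its string components."""
    return re.split(r"_", rowkey)


# Hypothetical key: categoryId_articleId_commentId
category_id, article_id, comment_id = split_rowkey("12_3456_789")
print(category_id, article_id, comment_id)
```

This only works if no component can itself contain an underscore, which holds for numeric ids.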

Rowkey design and presplit table

2013-03-04 Thread Lukáš Drbal
Hi, I have one question about rowkey design and presplit tables. My use case: I need to store a lot of comments, where each comment is for one article and each article has one category. What I need: 1) read one comment by id (where I know commentId, articleId and categoryId) 2) read all comments for

Re: RowKey design with hashing

2013-02-24 Thread Jean-Marc Spaggiari
If you never care about scans ordering i.e. you only do point gets to > > see > > >>> whether you've already seen an email address, do the hash part. > > >>> > > >>> I'd prefer #1 over #2, because it would let you do efficient key > prefix > >

Re: Rowkey design question

2013-02-21 Thread Mohammad Tariq
Another good point. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Fri, Feb 22, 2013 at 3:45 AM, Asaf Mesika wrote: > An easier way is to place one byte before the time stamp which is called a > bucket. You can calculate it by using modulo on the time stamp by the > num

Re: Rowkey design question

2013-02-21 Thread Asaf Mesika
An easier way is to place one byte, called a bucket, before the timestamp. You can calculate it as the timestamp modulo the number of buckets. We are now in the process of field testing it. On Tuesday, February 19, 2013, Paul van Hoven wrote: > Yeah it worked fine. > > But a
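A minimal Python sketch of the bucket prefix described above. The bucket count of 16, the millisecond timestamps, and the 8-byte big-endian encoding are assumptions chosen for the example. The second function shows the usual consequence for reads: a time-range query becomes one scan per bucket, with the results merged on the client.

```python
import struct

NUM_BUCKETS = 16  # assumption: any value up to 256 fits in one byte


def bucketed_key(ts_millis):
    """Prefix the timestamp with a one-byte bucket (timestamp mod bucket count)."""
    bucket = ts_millis % NUM_BUCKETS
    return bytes([bucket]) + struct.pack(">Q", ts_millis)


def scan_ranges(start_millis, stop_millis):
    """One (start_row, stop_row) pair per bucket; stop_row is exclusive."""
    return [
        (
            bytes([bucket]) + struct.pack(">Q", start_millis),
            bytes([bucket]) + struct.pack(">Q", stop_millis),
        )
        for bucket in range(NUM_BUCKETS)
    ]


ts = 1361232000123
key = bucketed_key(ts)
# The row is found by exactly one of the per-bucket ranges.
matches = [r for r in scan_ranges(ts, ts + 1) if r[0] <= key < r[1]]
print(len(matches))
```

Consecutive timestamps land in different buckets, so writes spread across regions, while rows inside each bucket stay ordered by time.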

Re: Rowkey design question

2013-02-19 Thread Mohammad Tariq
You can use FuzzyRowFilter to do that. Have a look at this link. You might find it helpful. Warm Rega

Re: Rowkey design question

2013-02-19 Thread Paul van Hoven
Yeah it worked fine. But as I understand: If I prefix my row key with something like md5-hash + timestamp then the rowkeys are probably evenly distributed but how would I perform then a scan restricted to a special time range? 2013/2/19 Mohammad Tariq : > No. before the timestamp. All the row

Re: Rowkey design question

2013-02-19 Thread Mohammad Tariq
No. before the timestamp. All the row keys which are identical go to the same region. This is the default Hbase behavior and is meant to make the performance better. But sometimes the machine gets overloaded with reads and writes because we get concentrated on that particular machine. For example t

Re: Rowkey design question

2013-02-19 Thread Paul van Hoven
Hey Tariq, thanks for your quick answer. I'm not sure if I got the idea in the second part of your answer. You mean if I use a timestamp as a rowkey I should append a hash like this: 135727920+MD5HASH and then the data would be distributed more equally? 2013/2/19 Mohammad Tariq : > Hello Pa

Re: Rowkey design question

2013-02-19 Thread Mohammad Tariq
Hello Paul, Try this and see if it works : scan.setStartRow(Bytes.toBytes(startDate.getTime() + "")); scan.setStopRow(Bytes.toBytes(endDate.getTime() + 1 + "")); Also try not to use TS as the rowkey, as it may lead to RS hotspotting. Just add a hash to your rowkeys so that data
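The two calls above build string-encoded bounds, and the `+ 1` on the stop row matters because the stop row of a scan is exclusive. The Python sketch below mimics that logic for the `timestamp+ipaddress` keys from the original question; the concrete timestamps and IP addresses are made up. It assumes all timestamps have the same number of decimal digits, since string-encoded numbers only sort chronologically in that case.

```python
def scan_bounds(start_millis, end_millis):
    """String-encoded start row and exclusive stop row, as in the snippet above."""
    start_row = str(start_millis).encode("ascii")
    stop_row = str(end_millis + 1).encode("ascii")
    return start_row, stop_row


def in_range(rowkey, start_row, stop_row):
    """Emulates scan semantics: start inclusive, stop exclusive, byte-wise order."""
    return start_row <= rowkey < stop_row


start, stop = scan_bounds(1361232000000, 1361318400000)

# A row stamped exactly at the end of the range (hypothetical IP suffix).
last_row = b"1361318400000" + b"192.168.0.1"

included = in_range(last_row, start, stop)
# Without the +1, the same row would fall outside the scan.
excluded_without_plus_one = not in_range(last_row, start, b"1361318400000")
print(included, excluded_without_plus_one)
```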

Rowkey design question

2013-02-19 Thread Paul van Hoven
Hi, I'm currently playing with hbase. The design of the rowkey seems to be critical. The rowkey for a certain database table of mine is: timestamp+ipaddress It looks something like this when performing a scan on the table in the shell: hbase(main):012:0> scan 'ToyDataTable' ROW

Re: RowKey design with hashing

2013-02-13 Thread Ted Yu
> >>> If you never care about scans ordering i.e. you only do point gets to > see > >>> whether you've already seen an email address, do the hash part. > >>> > >>> I'd prefer #1 over #2, because it would let you do efficient key prefix >

Re: RowKey design with hashing

2013-02-13 Thread Mehmet Simsek
>>> >>> I'd prefer #1 over #2, because it would let you do efficient key prefix >>> block encoding (FAST_DIFF). >>> >>> -- Lars >>> >>> >>> >>> >>> From: Nurettin Şimşek >>> To: user@hbase.apac

Re: RowKey design with hashing

2013-02-13 Thread Ted Yu
__ > > From: Nurettin Şimşek > > To: user@hbase.apache.org > > Sent: Wednesday, February 13, 2013 12:35 AM > > Subject: RowKey design with hashing > > > > Hi All, > > > > In our project mail adresses are row key. Which rowkey d

Re: RowKey design with hashing

2013-02-13 Thread Jean-Marc Spaggiari
pache.org > Sent: Wednesday, February 13, 2013 12:35 AM > Subject: RowKey design with hashing > > Hi All, > > In our project mail adresses are row key. Which rowkey design  we should > choose? > > 1) com.yahoo@ (Reversed) > 2) x...@yahoo.com > 3) md5 hash(x...@yahoo.com) > 4) Any other solution. > > Many thanks. > > -- > M. Nurettin ŞİMŞEK

Re: RowKey design with hashing

2013-02-13 Thread lars hofhansl
. -- Lars From: Nurettin Şimşek To: user@hbase.apache.org Sent: Wednesday, February 13, 2013 12:35 AM Subject: RowKey design with hashing Hi All, In our project mail adresses are row key. Which rowkey design  we should choose? 1) com.yahoo@ (Reversed) 2) x...@yahoo.

Re: RowKey design with hashing

2013-02-13 Thread Nurettin Şimşek
Thanks Jean, 3 can be good for us.

Re: RowKey design with hashing

2013-02-13 Thread Jean-Marc Spaggiari
I don't see any issue with #2 and it might be the simplest one. But it all depends on your read pattern. If you need to scan by domain, 1 is better. If you need to list the emails without knowing them, 2 might be better. If you only access them by a specific address, 3 can be good. So I will say, a
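The three options weighed above can be compared side by side with a small Python sketch. The addresses are invented `example.com` style placeholders, and the exact reversed format (`reversed.domain@local`) is an assumption modeled on the `com.yahoo@...` form in the original question.

```python
import hashlib


def reversed_domain_key(email):
    """Option 1: reversed domain first, so one domain is a contiguous key range."""
    local, _, domain = email.rpartition("@")
    return ".".join(reversed(domain.split("."))) + "@" + local


def plain_key(email):
    """Option 2: the address as-is."""
    return email


def md5_key(email):
    """Option 3: MD5 hex digest; evenly spread, but only useful for point gets."""
    return hashlib.md5(email.encode("utf-8")).hexdigest()


emails = ["bob@example.org", "alice@mail.example.com", "carol@example.com"]

by_domain = sorted(reversed_domain_key(e) for e in emails)
print(by_domain)
print(md5_key(emails[0]))
```

Option 1 supports scans by domain, option 2 keeps the key human-readable, and option 3 gives fixed-length, evenly distributed keys at the cost of any meaningful scan order, which matches the trade-offs described in this message.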

Re: RowKey design with hashing

2013-02-13 Thread Nurettin Şimşek
I want to look up email addresses by equality. There are many, many domains, not only yahoo. What are the disadvantages of using hashing?

Re: RowKey design with hashing

2013-02-13 Thread Amit Sela
ow keys as '' without adding '@yahoo.com'. > > -- > Regards, > Alexander Ignatov > > > > On 2/13/2013 12:35 PM, Nurettin Şimşek wrote: > >> Hi All, >> >> In our project mail adresses are row key. Which rowkey design we should >

Re: RowKey design with hashing

2013-02-13 Thread Alexander Ignatov
If you have only one domain 'yahoo.com' for all mail addresses you probably can use row keys as '' without adding '@yahoo.com'. -- Regards, Alexander Ignatov On 2/13/2013 12:35 PM, Nurettin Şimşek wrote: Hi All, In our project mail adresses are row key.

RowKey design with hashing

2013-02-13 Thread Nurettin Şimşek
Hi All, In our project, mail addresses are the row keys. Which rowkey design should we choose? 1) com.yahoo@ (Reversed) 2) x...@yahoo.com 3) md5 hash(x...@yahoo.com) 4) Any other solution. Many thanks. -- M. Nurettin ŞİMŞEK

Rowkey design for time series data

2012-07-03 Thread Bartosz M. Frak
Hey Guys, Before I get to my thoughts on the rowkey design, here's some background info about the problem we are trying to tackle. We are producing about 60TB of data a year (uncompressed). Most of this data is collected continuously from various detectors around our facility. Vast maj