Hi folks,
I am writing to ask how to filter and partition a set of files through Spark.
The situation is that I have N big files (too big to fit on a single machine), and
each line of the files starts with a category (say Sport, Food, etc.), while there
are fewer than 100 categories in total. I need a progr
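Since there are fewer than 100 categories, one natural Spark shape for this is keyBy on the leading category token followed by partitionBy with a HashPartitioner, so each category's lines land together. As a plain-Java sketch of just the grouping step (no Spark dependency; the class and helper names here are made up for illustration, not Spark APIs):

```java
import java.util.*;

// Illustrative sketch: bucket lines by their leading category token,
// mimicking what keyBy(...).partitionBy(...) would achieve in Spark.
public class CategoryBuckets {
    // Extract the category, assumed to be the first whitespace-delimited token.
    static String categoryOf(String line) {
        int sp = line.indexOf(' ');
        return sp < 0 ? line : line.substring(0, sp);
    }

    // Group lines into per-category buckets.
    static Map<String, List<String>> bucket(List<String> lines) {
        Map<String, List<String>> buckets = new HashMap<>();
        for (String line : lines) {
            buckets.computeIfAbsent(categoryOf(line), k -> new ArrayList<>())
                   .add(line);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Sport a", "Food b", "Sport c");
        System.out.println(bucket(lines).get("Sport").size()); // prints 2
    }
}
```

In Spark the same grouping would happen shuffle-side, and each partition could then be written out (or filtered) independently.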
from TABLE-NAME GROUP BY FIELD1, FIELD2;”
JavaSchemaRDD result = hsc.hql(hql);
List<Row> grp = result.collect();
for (Row row : grp) {
    for (int z = 2; z < row.length(); z++) {
        // Do something with the results
    }
}
Curt
From: SiMaYunRui
Date: Sunday, February 15, 2015 at 10:37 AM
To: "use
getting an exact answer this way -- the approximation is only important
for distributing work among all executors. Even if the approximation is
inaccurate, you'll still correct for it; you'll just end up with unequal partitions.
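A minimal single-machine sketch of that idea (the names are illustrative, not Spark APIs): approximate split points only decide how elements are bucketed, then a counting pass locates the bucket containing the target rank, and only that bucket is sorted. Bad splits give unequal buckets but still the exact answer:

```java
import java.util.*;

// Sketch: approximate split points balance the work; the result stays exact
// because we locate the bucket holding the target rank and sort just it.
public class ApproxRankLocator {
    // Return the value with zero-based global rank r, using the given splits.
    static double valueAtRank(List<Double> data, long r, double[] splits) {
        // Bucket i holds values in [splits[i-1], splits[i]).
        int nBuckets = splits.length + 1;
        List<List<Double>> buckets = new ArrayList<>();
        for (int i = 0; i < nBuckets; i++) buckets.add(new ArrayList<>());
        for (double v : data) {
            int b = 0;
            while (b < splits.length && v >= splits[b]) b++;
            buckets.get(b).add(v);
        }
        // Walk buckets until the cumulative count passes r.
        long seen = 0;
        for (List<Double> bucket : buckets) {
            if (seen + bucket.size() > r) {
                Collections.sort(bucket); // sort only the bucket we need
                return bucket.get((int) (r - seen));
            }
            seen += bucket.size();
        }
        throw new IllegalArgumentException("rank out of range");
    }

    public static void main(String[] args) {
        List<Double> data = Arrays.asList(9.0, 1.0, 7.0, 3.0, 5.0);
        // A deliberately bad split still yields the exact median (rank 2),
        // just with unequal buckets: {1.0} vs {9.0, 7.0, 3.0, 5.0}.
        System.out.println(valueAtRank(data, 2, new double[]{2.0})); // prints 5.0
    }
}
```

In Spark the bucketing pass corresponds to a range partition and the per-bucket sort happens inside a single partition, so the full data set is never globally sorted.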
Imran

On Sun, Feb 15, 2015 at 9:37 AM, SiMaYunRui wrote:
hello,
I am a newbie to Spark and am trying to figure out how to compute a percentile over a
big data set. I googled this topic but did not find any useful code
examples or explanations. It seems I can use the sortByKey transformation to get my
data set in order, but I'm not quite sure how I can ge
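The "sort then index" idea behind sortByKey reduces a percentile to an index lookup once the data is globally ordered (in Spark this would be sortByKey().zipWithIndex() plus a rank lookup). A plain-Java sketch of the rank arithmetic, using the nearest-rank definition (the class name is illustrative):

```java
import java.util.*;

// Sketch: once data is sorted, the p-th percentile is the element at a
// computed rank; the local sort stands in for a distributed sortByKey.
public class SortedPercentile {
    // Nearest-rank percentile, p in (0, 100].
    static double percentile(List<Double> data, double p) {
        List<Double> sorted = new ArrayList<>(data);
        Collections.sort(sorted);
        // Nearest-rank: ceil(p/100 * n), converted to a 0-based index.
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        List<Double> data = Arrays.asList(15.0, 20.0, 35.0, 40.0, 50.0);
        System.out.println(percentile(data, 40)); // prints 20.0
    }
}
```

This does a full sort, which is fine when an exact answer is needed; the rank-location approach discussed above avoids sorting everything.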