Re: KNN for large data set

2015-01-22 Thread Sudipta Banerjee
> neighbors locally. You can start with LSH + k-nearest in Google
>> scholar: http://scholar.google.com/scholar?q=lsh+k+nearest -Xiangrui
>>
>> On Tue, Jan 20, 2015 at 9:55 PM, DEVAN M.S. wrote:
>> > Hi all,
>> >
>> > Please help me to find out be

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Sudipta Banerjee
Don't ever reply to my queries :D
On Thu, Jan 22, 2015 at 11:02 PM, Lukas Nalezenec <lukas.naleze...@firma.seznam.cz> wrote:
> +1
>
> On 22.1.2015 18:30, Marco Shaw wrote:
> > Sudipta - Please don't ever come here or post here again.
> >
> > On Thu, Jan 22

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Sudipta Banerjee
nworks/MapR and look for their "express VMs" which
> can usually run on Oracle Virtualbox or VMware.
>
> Marco
>
> On Thu, Jan 22, 2015 at 7:36 AM, Sudipta Banerjee <asudipta.baner...@gmail.com> wrote:
>
>> Hi Apache-Spark team,

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Sudipta Banerjee
take the responses you've gotten so far as about as much
> > answer as can be had here and do some work yourself, and come back
> > with much more specific questions, and it will all be helpful and
> > polite again.
> >
> > On Thu, Jan 22, 2015 at 2:51 PM, Sudipta

Re: Are these numbers abnormal for spark streaming?

2015-01-22 Thread Sudipta Banerjee
ay | 9 hours 15 minutes 8 seconds | 9 hours 10 minutes 58 seconds | 9 hours 12 minutes | 9 hours 13 minutes 2 seconds | 9 hours 14 minutes 10 seconds | 9 hours 15 minutes 8 seconds
>
> Are these "normal"? I was wondering what the scheduling delay and total
> delay terms are, and if it's normal for them to be 9 hours.
>
> I've got a standalone Spark master and 4 Spark nodes. The streaming app
> has been given 4 cores, and it's using 1 core per worker node. The
> streaming app is submitted from a 5th machine, and that machine has nothing
> but the driver running. The worker nodes are running alongside Cassandra
> (and reading and writing to it).
>
> Any insights would be appreciated.
>
> Regards,
> Ashic.
--
Sudipta Banerjee
Consultant, Business Analytics and Cloud Based Architecture
Call me +919019578099

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Sudipta Banerjee
"Hi, I have $10,000, please find me some means of transportation so I can
> get to work."
>
> Please provide (a lot) more details. If you can't, consider using one of
> the pre-built express VMs from either Cloudera, Hortonworks or MapR, for
> example.
>
> Marco

Fwd: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Sudipta Banerjee
Hi Apache-Spark team,
What are the system requirements for installing Hadoop and Apache Spark? I have attached a screenshot of GParted.
Thanks and regards,
Sudipta
--
Sudipta Banerjee
Consultant, Business Analytics and Cloud Based Architecture
Call me +919019578099