Ok…
After having some off-line exchanges with Shashidhar Rao, I came up with an idea…
Apply machine learning to either implement or improve autoscaling up or down
within a Storm/Akka cluster.
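As a concrete starting point for that idea, here is a toy sketch (plain Python; the class name, thresholds, and capacity figure are all invented for illustration, not part of Storm or Akka) of a predictive autoscaler: fit a linear trend to recent load samples and scale the worker count toward the predicted load.

```python
import math
from collections import deque

class PredictiveAutoscaler:
    """Toy ML-flavoured autoscaler: fits a least-squares trend to a
    sliding window of load samples and sizes the worker pool for the
    predicted next-step load. Purely illustrative."""

    def __init__(self, capacity_per_worker=100.0, window=10):
        self.capacity = capacity_per_worker
        self.samples = deque(maxlen=window)

    def observe(self, load):
        self.samples.append(load)

    def predict_next_load(self):
        n = len(self.samples)
        if n < 2:  # not enough history: fall back to the last sample
            return self.samples[-1] if self.samples else 0.0
        xs = range(n)
        mean_x = sum(xs) / n
        mean_y = sum(self.samples) / n
        cov = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(xs, self.samples))
        var = sum((x - mean_x) ** 2 for x in xs)
        slope = cov / var
        # extrapolate one step beyond the window
        return mean_y + slope * (n - mean_x)

    def desired_workers(self):
        return max(1, math.ceil(self.predict_next_load() / self.capacity))
```

A real system would replace the linear fit with a learned model and feed the decision into the cluster's rebalance API; this only shows the shape of the control loop.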
While I don’t know what constitutes an acceptable PhD thesis, or senior project
for undergrads… this is
I would suggest studying Spark, Flink, and Storm, and preparing your research paper based on your understanding and findings.
Maybe you will invent a new Spark ☺
Regards,
Vaquar khan
On 16 Jul 2015 00:47, "Michael Segel" wrote:
Silly question…
When thinking about a PhD thesis… do you want to tie it to a specific
technology or do you want to investigate an idea but then use a specific
technology.
Or is this an outdated way of thinking?
"I am doing my PHD thesis on large scale machine learning e.g Online learning,
Look at this :
http://www.forbes.com/sites/lisabrownlee/2015/07/10/the-11-trillion-internet-of-things-big-data-and-pattern-of-life-pol-analytics/
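Since the thesis topic above mentions online learning, a minimal sketch of the idea may help: an incremental SGD regressor sees each example once, updates its weights, and discards the example, so it can train on an unbounded stream. Plain Python, purely illustrative; the class and its parameters are invented for this sketch.

```python
class OnlineSGDRegressor:
    """Minimal online linear regression via stochastic gradient descent.
    Each call to partial_fit performs one update from a single example,
    which is the defining property of online learning."""

    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, x, y):
        # gradient of the squared error 0.5 * (pred - y)^2
        err = self.predict(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err
```

Streaming a few hundred examples of y = 2x through `partial_fit` drives the prediction toward the true line, without ever holding the dataset in memory.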
On Wed, Jul 15, 2015 at 10:19 PM shahid ashraf wrote:
Sorry Guys!
I mistakenly added my question to this thread( Research ideas using spark).
Moreover, people can ask any question; this Spark user group is for that.
Cheers!
😊
On Wed, Jul 15, 2015 at 9:43 PM, Robin East wrote:
Well, one of the strengths of Spark is that it standardizes general distributed
processing, allowing many different types of processing, such as graph
processing, stream processing, etc. The limitation is that it is less
performant than a system that focuses on only one type of processing (e.g.
graph processing).
Well said Will. I would add that you might want to investigate GraphChi, which
claims to be able to run a number of large-scale graph processing tasks on a
workstation much quicker than a very large Hadoop cluster. It would be
interesting to know how widely applicable GraphChi's approach is.
There seems to be a bit of confusion here - the OP (doing the PhD) had the
thread hijacked by someone with a similar name asking a mundane question.
It would be a shame to send someone away so rudely, who may do valuable
work on Spark.
Sashidar (not Sashid!) I'm personally interested in running g
Hi Daniel
Well said
Regards
Vineel
On Tue, Jul 14, 2015, 6:11 AM Daniel Darabos <
daniel.dara...@lynxanalytics.com> wrote:
> Hi Shahid,
> To be honest I think this question is better suited for Stack Overflow
> than for a PhD thesis.
Try to repartition it to a higher number (at least 3-4 times the total # of
CPU cores). What operation are you doing? It may happen that, if you are
doing a join/groupBy sort of operation, the task that is taking time is
holding all the values; in that case you need to use a Partitioner which
will distribute the data more evenly.
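One common way to get that even distribution for a skewed key is "salting": split a hot key into several sub-keys, aggregate per sub-key, then merge the partial results. The sketch below is plain Python standing in for Spark; `partition_of` is an invented, deterministic stand-in for Spark's HashPartitioner, and the salted-partition scheme is illustrative only.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8
SALT_BUCKETS = 4  # how many sub-keys a hot key is split into

def partition_of(key):
    # deterministic stand-in for a hash partitioner
    return zlib.crc32(str(key).encode()) % NUM_PARTITIONS

def salted_partition(key, salt):
    # spread one key over SALT_BUCKETS adjacent partitions;
    # aggregate per (key, salt) first, then merge the partials
    return (partition_of(key) + salt) % NUM_PARTITIONS

# A skewed dataset: the key "hot" dominates.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 10

# Without salting, every "hot" record lands in a single partition.
plain = Counter(partition_of(k) for k, _ in records)

# With salting, the hot key is spread over SALT_BUCKETS partitions.
spread = Counter(salted_partition(k, i % SALT_BUCKETS)
                 for i, (k, _) in enumerate(records))
```

In Spark itself the same effect is usually achieved by mapping `key` to `(key, salt)` before the `groupBy`/`join` and reducing twice; the sketch only shows why the partition counts even out.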
Hi Shahid,
To be honest I think this question is better suited for Stack Overflow than
for a PhD thesis.
On Tue, Jul 14, 2015 at 7:42 AM, shahid ashraf wrote:
> hi
>
> I have a 10-node cluster. I loaded the data onto HDFS, so the no. of
> partitions I get is 9. I am running a Spark application ,