Hello,
I have the edges of a graph stored as parquet files (about 3GB). I am loading
the graph and trying to compute the total number of triplets and triangles.
Here is my code:
val edges_parq = sqlContext.read.option("header", "true").parquet(args(0) +
  "/year=" + year) // note: the "header" option only applies to CSV; parquet ignores it
val edges: RDD[Edge[Int]] = ...
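For what it's worth, a minimal sketch of one way to get both counts with
GraphX, assuming the parquet schema has Long columns named "src" and "dst"
(hypothetical names, adjust to your schema):

import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}
import org.apache.spark.rdd.RDD

// Column names "src" and "dst" are assumptions about the schema
val edges: RDD[Edge[Int]] = edges_parq.rdd.map(row =>
  Edge(row.getAs[Long]("src"), row.getAs[Long]("dst"), 1))

val graph = Graph.fromEdges(edges, defaultValue = 0)
  .partitionBy(PartitionStrategy.RandomVertexCut) // triangleCount needs a partitioned graph

// Triplets (paths of length 2): sum over vertices of C(degree, 2)
val numTriplets = graph.degrees
  .map { case (_, d) => d.toLong * (d - 1) / 2 }
  .reduce(_ + _)

// triangleCount() counts each triangle once per corner vertex, so divide by 3
val numTriangles = graph.triangleCount().vertices
  .map { case (_, c) => c.toLong }
  .reduce(_ + _) / 3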
Hello,
I am trying to compute the conductance, bridge ratio and diameter of a given
graph, but I am facing some problems.
- For the conductance, my problem is how to compute the cuts so that they are
at least semi-clustered. Is partitionBy from GraphX related to dividing a
graph into multiple subgraphs?
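One note on that last question: GraphX's partitionBy only controls how edges
are distributed across machines according to a PartitionStrategy; it is not a
clustering algorithm, so the cut itself has to come from a separate clustering
step. A minimal sketch of the conductance computation, assuming a
Graph[Int, Int] whose vertex attribute is a two-way cluster id (0 or 1)
produced by such a step:

// conductance(S) = cut(S, V \ S) / min(vol(S), vol(V \ S))
val cut = graph.triplets.filter(t => t.srcAttr != t.dstAttr).count()

// vol(S) = sum of degrees in S = number of edge endpoints landing in S
val volumes = graph.triplets
  .flatMap(t => Seq((t.srcAttr, 1L), (t.dstAttr, 1L)))
  .reduceByKey(_ + _)
  .collectAsMap()

val conductance = cut.toDouble / math.min(volumes(0), volumes(1))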
>> ...allocation. Anyway, I
>> don't know whether these parameters work without dynamic allocation.
>>
>>> On Wed, Jul 11, 2018 at 5:11 PM Thodoris Zois wrote:
>>> Hello,
>>>
>>> Yeah, you are right, but I think that works only if you use Spark dynamic
>>> allocation...
> ...use the dynamic allocation parameters instead of spark.cores.max. I think
> spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors
> configuration values can help you.
>
> On Tue, Jul 10, 2018 at 5:07 PM Thodoris Zois <z...@ics.forth.gr> wrote:
> Actually after some experiments...
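For reference, a sketch of the settings mentioned above; the min/max values
are made-up examples, and on Mesos dynamic allocation also needs the external
shuffle service:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true") // required for dynamic allocation
  .set("spark.dynamicAllocation.minExecutors", "2")  // example value
  .set("spark.dynamicAllocation.maxExecutors", "10") // example value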
> ...for example, but it does with
> 8 or 9, so you can use smaller executors that better fit the resources
> available on the nodes, e.g. 4 cores and 1 GB RAM each
>
> Cheers,
> Pavel
>
>> On Mon, Jul 9, 2018 at 9:05 PM Thodoris Zois wrote:
>> Hello list,
Hello list,
We are running Apache Spark on a Mesos cluster and we are seeing some weird
executor behavior. When we submit an app with e.g. 10 cores and 2 GB of memory
per executor and max cores 30, we expect to see 3 executors running on the
cluster. However, sometimes there are only 2... Spark applications are no...
As far as I know from running Spark on Mesos, that is a running state and not a
pending one. What you see is normal, but if I am wrong somebody please correct
me. At startup the Spark driver operates normally (running state), but when it
comes to starting up the executors, it cannot allocate resources for them and
hangs...
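To make the arithmetic concrete: with the numbers from this thread, Spark can
launch at most floor(30 / 10) = 3 executors, but each one only starts if a
single Mesos offer contains 10 free cores plus the executor memory, which is
why fewer sometimes appear. A sketch of the settings involved:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.cores", "10")  // cores per executor
  .set("spark.executor.memory", "2g") // memory per executor
  .set("spark.cores.max", "30")       // app-wide cap => at most 30 / 10 = 3 executors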
> ...prevent
> potential data corruption issues. I would appreciate it if you could share
> some details of your approach.
>
>
> Thanks!
> madhav
> On Wed, May 2, 2018 at 3:34 AM, Thodoris Zois wrote:
> > That's what I did :) If you need further information I can post my
> > solution...
> ...logistic (meaning 0s and 1s) before
> modeling? What OS and Spark version are you using?
>
> Thank You,
>
> Irving Duran
>
>
> On Fri, Apr 27, 2018 at 2:34 PM Thodoris Zois <z...@ics.forth.gr> wrote:
> Hello,
>
> I am running an experiment...
Hello,
I am running an experiment to test logistic and linear regression on Spark
using MLlib.
My dataset is only 128 MB and something weird happens. Linear regression takes
about 127 seconds whether I use 1 or 500 iterations. On the other hand, logistic
regression most of the time does not manage to...
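For reference, a minimal sketch of that kind of comparison with the
DataFrame-based API; the train DataFrame and its "label"/"features" columns
are assumptions here:

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.regression.LinearRegression

// `train` is a hypothetical DataFrame with "label" and "features" columns
val linModel = new LinearRegression().setMaxIter(500).fit(train)
val logModel = new LogisticRegression().setMaxIter(500).fit(train)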
If you are looking for a Spark scheduler that runs on top of Kubernetes, then
this is the way to go:
https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala
You can also have a look...
Hello list!
I am trying to familiarize myself with Apache Spark. I would like to ask something
about partitioning and executors.
Can I have e.g. 500 partitions but launch only one executor that will run
operations on only 1 partition of the 500? And then I would like my job to die.
Is there any e...
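There may be better answers, but one way to touch a single partition is to
filter on the partition index, and you can keep the app to one executor with
e.g. spark.executor.instances=1 (YARN) or a spark.cores.max equal to the
executor's cores (standalone/Mesos). A sketch; the input path and partition
count are hypothetical:

import org.apache.spark.rdd.RDD

val rdd: RDD[String] = sc.textFile("hdfs:///some/input", minPartitions = 500)

// All 500 tasks are still scheduled, but only partition 0 does real work;
// the other 499 return an empty iterator immediately.
val firstPartitionOnly = rdd.mapPartitionsWithIndex { (idx, iter) =>
  if (idx == 0) iter else Iterator.empty
}
firstPartitionOnly.count()

// Alternatively, sc.runJob(rdd, (it: Iterator[String]) => it.size, Seq(0))
// computes partition 0 only, without scheduling the other 499 tasks.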