Re: Using Spark as a simulator

2017-07-07 Thread Steve Loughran
On 7 Jul 2017, at 08:37, Esa Heikkinen <esa.heikki...@student.tut.fi> wrote: I only want to simulate a very large "network" with even millions of parallel, time-synchronized actors (state machines). There is also communication between actors via some key-value database. I also want th
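
The setup described in the excerpt above (time-synchronized state machines that talk through a key-value store) can be illustrated with a small lockstep loop. This is only a plain-Python sketch, not Spark code and not anything proposed in the thread: the two-state machine, the ring topology, and the names `step_actor` and `simulate` are all hypothetical, chosen to show the idea that every actor reads the store from the previous tick and writes into a fresh store for the next one.

```python
def step_actor(state, inbox):
    """Hypothetical two-state machine: toggle state when any message arrives."""
    if inbox:
        return "busy" if state == "idle" else "idle"
    return state


def simulate(num_actors, ticks):
    """Advance all actors in lockstep for the given number of ticks."""
    states = {actor: "idle" for actor in range(num_actors)}
    # Key-value store: actor id -> list of pending messages (assumed layout).
    store = {0: ["start"]}
    for _ in range(ticks):
        next_states = {}
        next_store = {}
        for actor, state in states.items():
            # Each actor sees only messages written during the previous tick.
            inbox = store.get(actor, [])
            new_state = step_actor(state, inbox)
            next_states[actor] = new_state
            if new_state != state:
                # On a state change, notify the next actor in the ring.
                neighbour = (actor + 1) % num_actors
                next_store.setdefault(neighbour, []).append("ping")
        # Swap in the new state and store only after every actor has stepped.
        states, store = next_states, next_store
    return states


print(simulate(3, 4))
```

Because writes go to `next_store` rather than `store`, the order in which actors are processed within a tick does not affect the result, which is the property that would allow the per-actor step to run in parallel.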

RE: Using Spark as a simulator

2017-06-21 Thread Mahesh Sawaiker
<jornfra...@gmail.com> Sent: 20 June 2017 17:12 To: Esa Heikkinen Cc: user@spark.apache.org Subject: Re: Using Spark as a simulator It is fine, but you have to design it so that generated rows are written in large blocks for optimal performan

RE: Using Spark as a simulator

2017-06-20 Thread Mahesh Sawaiker
I have already seen one example where data is generated using Spark, and as far as I know there is no reason to think it's a bad idea. You can check the code here. I'm not very sure, but I think there is something there which generates data for the TPCDS benchmark, and you can provide how much data you want in

Re: Using Spark as a simulator

2017-06-20 Thread Jörn Franke
It is fine, but you have to design it so that generated rows are written in large blocks for optimal performance. The trickiest part of data generation is the conceptual part, such as the probability distributions, etc. You also have to check that you use a good random generator, for some cases
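
The two points above, a deliberate random generator and writing rows in large blocks, can be sketched without Spark. The following is a plain-Python illustration under stated assumptions: the seeding scheme, the exponential distribution, and the names `generate_partition` and `in_blocks` are hypothetical and not taken from the thread. In Spark, a per-partition generator like this would typically be driven from something like `RDD.mapPartitionsWithIndex`, so that each partition builds its own generator.

```python
import random


def generate_partition(partition_id, rows_per_partition, base_seed=42):
    """Yield synthetic rows for one partition from its own seeded generator."""
    # Assumed seeding scheme: one derived seed per partition, so output is
    # reproducible and partitions do not share generator state. This does not
    # by itself guarantee statistically independent streams.
    rng = random.Random(base_seed * 1_000_003 + partition_id)
    for row_index in range(rows_per_partition):
        # Example distribution (an assumption): exponential with rate 1.0.
        yield (partition_id, row_index, rng.expovariate(1.0))


def in_blocks(rows, block_size):
    """Group an iterator of rows into lists of at most block_size rows."""
    block = []
    for row in rows:
        block.append(row)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        # Emit the final, possibly smaller, block.
        yield block


for block in in_blocks(generate_partition(0, 10), 4):
    # A real job would hand each block to the writer in a single call.
    print(len(block), "rows")
```

Whether the standard-library generator is good enough depends on the simulation; the sketch only shows where the choice of generator and seed would plug in.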