Thanks for your input.  That 1 hour per data point would actually be a
problem, since we sometimes have reports with hundreds of data points and
need to generate 100,000 reports.  So we definitely need to distribute this,
but unfortunately I don't know where to start.
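
To put rough numbers on that, here is a back-of-envelope sketch in Python.
It assumes "hundreds of data points" means 100 per report, treats the 1 hour
per data point quoted below as an upper bound, and assumes the work splits
perfectly evenly across machines, which is optimistic. The function names
and the 1,000-machine figure are only illustrative.

```python
# Rough workload estimate using the figures mentioned in this thread.
HOURS_PER_DATA_POINT = 1      # upper bound quoted for a single c3.2xlarge
DATA_POINTS_PER_REPORT = 100  # assumption: "hundreds" taken as 100
REPORTS = 100_000


def total_machine_hours(reports, points_per_report, hours_per_point):
    """Total compute if every data point is calculated independently."""
    return reports * points_per_report * hours_per_point


def wall_clock_hours(machine_hours, machines):
    """Elapsed time assuming perfect linear scaling (an idealization)."""
    return machine_hours / machines


total = total_machine_hours(REPORTS, DATA_POINTS_PER_REPORT, HOURS_PER_DATA_POINT)
print(total)                          # 10000000
print(wall_clock_hours(total, 1000))  # 10000.0
```

Even with 1,000 machines that worst case is still over a year of wall-clock
time, so the per-data-point cost matters as much as the distribution itself.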

On Mon, Mar 7, 2016 at 2:42 PM, Anurag [via Apache Spark User List] <
ml-node+s1001560n26421...@n3.nabble.com> wrote:

> Definition - each answer by a user is an event (I suppose)
>
> Let's estimate the number of events that can happen in a day in your case.
>
> 1. Max of Survey fill-outs / user = 10 = x
> 2. Max of Questions per survey = 100 = y
> 3. Max of users = 100,000 = z
>
>
> Maximum answers received in a day = x * y * z = 100,000,000 = 100 million
>
> Assuming you use a single c3.2xlarge machine,
> each data point in the report will get calculated in less than 1 hour
> (telling from my personal experience)
>
> I guess that would help.
>
> Regards
> Anurag
>
>



