There's a mix of terms here. The CPU is the physical chip, which most likely contains more than one physical core. On Intel chips with Hyper-Threading, each physical core is presented to the OS as two logical (virtual) cores.

Then there are 'cores per executor' (Spark terminology, really task slots). So when you say core... which do you mean? Physical, virtual, or Spark core? ;-) This is what happens when terms get misused and adopted to mean something similar... e.g. 'real time'.

With respect to your question... the suggestion is that you may want to oversubscribe the number of partitions relative to 'cores' (Spark): while a task for one partition is blocked or waiting on something, a task for a different partition can keep the CPU busy, which improves utilization. (This assumes you're only running Spark jobs on the machine.) IMHO 2-4x is an overly aggressive estimate. But again, like most things... YMMV.
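To make the rule of thumb concrete, here's a minimal sketch. The local[4] master, the app name, and the 3x multiplier are all illustrative choices, not something from the guide:

import org.apache.spark.{SparkConf, SparkContext}

object PartitionSizing {
  def main(args: Array[String]): Unit = {
    // Local 4-slot setup purely for illustration; on a real cluster the
    // master URL and executor cores come from spark-submit instead.
    val conf = new SparkConf()
      .setAppName("partition-sizing-sketch")
      .setMaster("local[4]")
    val sc = new SparkContext(conf)

    // Task slots ("Spark cores") available to this app: 4 here.
    val slots = sc.defaultParallelism

    // The guide's 2-4 partitions per core; 3x is a hypothetical middle
    // ground that leaves runnable tasks whenever one task blocks.
    val numPartitions = slots * 3

    val data = sc.parallelize(1 to 1000000, numPartitions)
    println(s"slots = $slots, partitions = ${data.getNumPartitions}")

    sc.stop()
  }
}

Running this prints "slots = 4, partitions = 12": with 12 partitions and 4 slots, Spark schedules tasks 4 at a time and pulls the next partition as each slot frees up, rather than context-switching within a slot.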
> On Apr 1, 2016, at 4:12 AM, vaibhavrtk <vaibhav...@gmail.com> wrote:
>
> As per the Spark programming guide, "we should have 2-4 partitions for
> each CPU in your cluster." In this case, how does 1 CPU core process
> 2-4 partitions at the same time?
>
> Does it do context switching between tasks, or run them in parallel? If
> it does context switching, how is it efficient compared to a 1:1
> partition-to-core mapping?
>
> PS: If we are using the Kafka direct API, in which Kafka partitions =
> RDD partitions, does that mean we should give 40 Kafka partitions for
> 10 CPU cores?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Relation-between-number-of-partitions-and-cores-tp26658.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.