Re: Have I done everything correctly when subscribing to Spark User List

Sivakumaran S Mon, 08 Aug 2016 11:45:28 -0700

Does it have anything to do with the fact that the mail address is displayed as 
user @spark.apache.org <http://spark.apache.org/>? There is a space before ‘@’. 
This is as received in my mail client.


Sivakumaran


> On 08-Aug-2016, at 7:42 PM, Chris Mattmann <mattm...@apache.org> wrote:
> 
> Weird!
> 
> On 8/8/16, 11:10 AM, "Sean Owen" <so...@cloudera.com> wrote:
> 
>> I also don't know what's going on with the "This post has NOT been
>> accepted by the mailing list yet" message, because actually the
>> messages always do post. In fact this has been sent to the list 4
>> times:
>> 
>> https://www.mail-archive.com/search?l=user%40spark.apache.org&q=dueckm&submit.x=0&submit.y=0
>> 
>> On Mon, Aug 8, 2016 at 3:03 PM, Chris Mattmann <mattm...@apache.org> wrote:
>>> 
>>> On 8/8/16, 2:03 AM, "matthias.du...@fiduciagad.de" 
>>> <matthias.du...@fiduciagad.de> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> I am writing to you because I am not sure whether I did everything right 
>>>> when registering for and subscribing to the Spark User List.
>>>> 
>>>> I posted the appended question to the Spark User List after subscribing and 
>>>> receiving the "WELCOME to user@spark.apache.org" mail from 
>>>> "user-h...@spark.apache.org".
>>>> But this post is still in the state "This post has NOT been accepted by the 
>>>> mailing list yet.".
>>>> 
>>>> Is this because I forgot to do something or did something wrong with my 
>>>> user account (dueckm)? Or is it because no member of the Spark User List 
>>>> has reacted to that post yet?
>>>> 
>>>> Thanks a lot for your help.
>>>> 
>>>> Matthias
>>>> 
>>>> Fiducia & GAD IT AG | www.fiduciagad.de
>>>> Frankfurt a. M. Local Court (Amtsgericht), HRB 102381 | Registered office: 
>>>> Hahnstr. 48, 60528 Frankfurt a. M. | VAT ID no. DE 143582320
>>>> Management Board: Klaus-Peter Bruns (Chairman), Claus-Dieter Toben (Deputy 
>>>> Chairman), Jens-Olaf Bartels, Martin Beyer, Jörg Dreinhöfer, Wolfgang Eckert, 
>>>> Carsten Pfläging, Jörg Staff
>>>> Chairman of the Supervisory Board: Jürgen Brinkmann
>>>> 
>>>> ----- Forwarded by Matthias Dück/M/FAG/FIDUCIA/DE on 08.08.2016 
>>>> 10:57 -----
>>>> 
>>>> From: dueckm <matthias.du...@fiduciagad.de>
>>>> To: user@spark.apache.org
>>>> Date: 04.08.2016 13:27
>>>> Subject: Are join/groupBy operations with wide Java Beans using Dataset 
>>>> API much slower than using RDD API?
>>>> 
>>>> ________________________________________
>>>> 
>>>> 
>>>> 
>>>> Hello,
>>>> 
>>>> I built a prototype that uses join and groupBy operations via the Spark RDD 
>>>> API. Recently I migrated it to the Dataset API. Now it runs much slower than
>>>> with the original RDD implementation.
>>>> Did I do something wrong here? Or is this the price I have to pay for the 
>>>> more convenient API?
>>>> Is there a known way to deal with this effect (e.g. configuration via
>>>> "spark.sql.shuffle.partitions", but how could I determine the correct
>>>> value)?
>>>> In my prototype I use Java Beans with a lot of attributes. Does this slow
>>>> down Spark operations with Datasets?
>>>> 
>>>> Here is a simple example that shows the difference:
>>>> JoinGroupByTest.zip
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27473/JoinGroupByTest.zip>
>>>> - I build 2 RDDs, then join and group them. Afterwards I count and display
>>>> the joined RDDs (method de.testrddds.JoinGroupByTest.joinAndGroupViaRDD()).
>>>> - When I do the same with Datasets, it takes approximately 40 times
>>>> as long (method de.testrddds.JoinGroupByTest.joinAndGroupViaDatasets()).
>>>> 
>>>> Thank you very much for your help.
>>>> Matthias
>>>> 
>>>> PS1: Please excuse me for sending this post more than once. I am new to this
>>>> mailing list and probably did something wrong when registering/subscribing,
>>>> so my previous postings have not been accepted ...
>>>> 
>>>> PS2: See the appended screenshots taken from Spark UI (jobs 0/1 belong to
>>>> RDD implementation, jobs 2/3 to Dataset):
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27473/jobs.png>
>>>> 
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27473/Job_RDD_Details.png>
>>>> 
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27473/Job_Dataset_Details.png>
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> View this message in context: 
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Are-join-groupBy-operations-with-wide-Java-Beans-using-Dataset-API-much-slower-than-using-RDD-API-tp27473.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>> 
