https://stackoverflow.com/a/51854022/299676
On Tue, 8 Feb 2022 at 09:25, Stelios Philippou <stevo...@gmail.com> wrote:

> This has the information that you require in order to add an extra column
> with a sequence to it.
>
> On Tue, 8 Feb 2022 at 09:11, <capitnfrak...@free.fr> wrote:
>
>> Hello Gourav
>>
>> As you see here, orderBy has already given the solution for "equal
>> amount":
>>
>> >>> df = sc.parallelize([("orange",2),("apple",3),("tomato",3),("cherry",5)]).toDF(['fruit','amount'])
>> >>> df.select("*").orderBy("amount",ascending=False).show()
>> +------+------+
>> | fruit|amount|
>> +------+------+
>> |cherry|     5|
>> | apple|     3|
>> |tomato|     3|
>> |orange|     2|
>> +------+------+
>>
>> I want to add a column at the right whose name is "top" and whose value
>> auto-increments from 1 to N.
>>
>> Thank you.
>>
>> On 08/02/2022 13:52, Gourav Sengupta wrote:
>> > Hi,
>> >
>> > sorry once again, will try to understand the problem first :)
>> >
>> > As we can clearly see, the initial responses were incorrectly
>> > guessing the solution to be the monotonically_increasing_id function.
>> >
>> > What if there are two fruits with equal amount? For any real-life
>> > application, can we understand what we are trying to achieve by the
>> > rankings?
>> >
>> > Regards,
>> > Gourav Sengupta
>> >
>> > On Tue, Feb 8, 2022 at 4:22 AM ayan guha <guha.a...@gmail.com> wrote:
>> >
>> >> For this requirement you can use rank or dense rank.
>> >>
>> >> On Tue, 8 Feb 2022 at 1:12 pm, <capitnfrak...@free.fr> wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> For this query:
>> >>>
>> >>> >>> df.select("*").orderBy("amount",ascending=False).show()
>> >>> +------+------+
>> >>> | fruit|amount|
>> >>> +------+------+
>> >>> |tomato|     9|
>> >>> | apple|     6|
>> >>> |cherry|     5|
>> >>> |orange|     3|
>> >>> +------+------+
>> >>>
>> >>> I want to add a column "top", in which the value is 1, 2, 3...,
>> >>> meaning top1, top2, top3...
>> >>>
>> >>> How can I do it?
>> >>>
>> >>> Thanks.
>> >>>
>> >>> On 07/02/2022 21:18, Gourav Sengupta wrote:
>> >>>> Hi,
>> >>>>
>> >>>> can we understand the requirement first?
>> >>>>
>> >>>> What is it that you are trying to achieve by an auto-increment id?
>> >>>> Do you just want different IDs for rows, or do you want to keep
>> >>>> track of the record count of a table as well, or do you want to
>> >>>> use them for surrogate keys?
>> >>>>
>> >>>> Are you going to insert records multiple times into a table and
>> >>>> still need different values?
>> >>>>
>> >>>> I think without knowing the requirements, all the above responses,
>> >>>> like everything else where solutions are reached before
>> >>>> understanding the problem, have a high chance of being wrong.
>> >>>>
>> >>>> Regards,
>> >>>> Gourav Sengupta
>> >>>>
>> >>>> On Mon, Feb 7, 2022 at 2:21 AM Siva Samraj <samraj.mi...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> monotonically_increasing_id() will give the same functionality
>> >>>>>
>> >>>>> On Mon, 7 Feb, 2022, 6:57 am , <capitnfrak...@free.fr> wrote:
>> >>>>>
>> >>>>>> For a dataframe object, how can I add a column that is
>> >>>>>> auto_increment, like MySQL's behavior?
>> >>>>>>
>> >>>>>> Thank you.
>> >>>>>>
>> >>>>>> ---------------------------------------------------------------------
>> >>>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>> >>
>> >> --
>> >> Best Regards,
>> >> Ayan Guha