Hi Adam,

Thanks for putting so much effort into making ptest better. I really appreciate this.

Regards,
Deepak

On 5/14/18, 11:47 AM, "Adam Szita" <sz...@cloudera.com> wrote:

This is now committed and was deployed some hours ago - don't worry if you see your job failed, I resubmitted everything from the queue.
I will be keeping an eye on how ptest works after this change.

Thanks,
Adam

On 2 May 2018 at 17:34, Adam Szita <sz...@cloudera.com> wrote:

> I have a patch available for the voted version at
> https://issues.apache.org/jira/browse/HIVE-19077. Let me know what you
> think.
>
> On 27 April 2018 at 15:55, Adam Szita <sz...@cloudera.com> wrote:
>
>> Thanks to all for the responses.
>> As I see it, option 3 is the winning one. Next week I'm going to start
>> working on this one then (unless any objections of course).
>>
>> Adam
>>
>> On 26 April 2018 at 05:48, Deepak Jaiswal <djais...@hortonworks.com> wrote:
>>
>>> +1 for option 3. Thanks Adam for taking this up again.
>>>
>>> Regards,
>>> Deepak
>>>
>>> On 4/25/18, 4:54 PM, "Thejas Nair" <thejas.n...@gmail.com> wrote:
>>>
>>> Option 3 seems reasonable. I believe that used to be the state a while
>>> back (maybe 12 months back or so).
>>> When a 2nd ptest for the same jira runs, it checks if the latest patch has
>>> already been run.
>>>
>>> On Wed, Apr 25, 2018 at 7:37 AM, Peter Vary <pv...@cloudera.com> wrote:
>>> > I would vote for version 3. It would solve the big patch problem,
>>> > and removes the unnecessary test runs too.
>>> >
>>> > Thanks,
>>> > Peter
>>> >
>>> >> On Apr 25, 2018, at 11:01 AM, Adam Szita <sz...@cloudera.com> wrote:
>>> >>
>>> >> Hi all,
>>> >>
>>> >> I had a patch (HIVE-19077) committed with the original aim being the
>>> >> prevention of wasting resources when running ptest on the same patch
>>> >> multiple times:
>>> >> It is supposed to manage scenarios where a developer uploads
>>> >> HIVE-XYZ.1.patch, that gets queued in jenkins, then before execution
>>> >> HIVE-XYZ.2.patch (for the same jira) is uploaded and that gets queued also.
>>> >> When the first patch starts to execute, ptest will see that patch2 is the
>>> >> latest patch and will use that. After some time the second queued job will
>>> >> also run on this very same patch.
>>> >> This is just pointless and causes long queues to progress slowly.
>>> >>
>>> >> My idea was to remove these duplicates from the queue where I'd only keep
>>> >> the latest queued element if I see more queued entries for the same jira
>>> >> number. It's like when you go grocery shopping and you're already in line
>>> >> at the cashier but you realise you also need e.g. milk. You go grab it and join
>>> >> the END of the queue. So I believe losing one's spot in the queue is a fair
>>> >> punishment for making amends on one's patch.
>>> >>
>>> >> That said, Deepak made me realise that for big patches this will be very
>>> >> cumbersome due to the need of constant rebasing to avoid conflicts on patch
>>> >> application.
>>> >> I have three proposals now:
>>> >>
>>> >> 1: Leave this as it currently is (with HIVE-19077 committed) - *only the
>>> >> latest queued job will run of the same jira*
>>> >> pros: no wasting resources to run the same patches more times, 'scheduling'
>>> >> is fair: if you amend your patch you may lose your original spot in the
>>> >> queue
>>> >> cons: big patches that are prone to conflicts will be hard to get executed
>>> >> in ptest, devs will have to wait more time for their ptest results if they
>>> >> amend their patches
>>> >>
>>> >> 2: *Add a safety switch* to this queue checking feature (currently proposed
>>> >> in HIVE-19077), deduplication can be switched off on request
>>> >> pros: same as 1st, + ability to have more control on this mechanism i.e.
>>> >> turn it off for big/urgent patches
>>> >> cons: big patches that use the switch might still waste resources, also devs
>>> >> might use the safety switch inappropriately for their own evil benefit :)
>>> >>
>>> >> 3: Deduplication the other way around - *only the first queued job will run
>>> >> of the same jira*, ptest server will keep record of patch names and won't
>>> >> execute a patch with a seen name and jira number again
>>> >> pros: same patches will not be executed more times accidentally, big
>>> >> patches won't be a problem either, devs will get their ptest result back
>>> >> earlier even if more jobs are triggered for same jira/patch name
>>> >> cons: scheduling is less fair: devs can reserve their spots in the queue
>>> >>
>>> >>
>>> >> (0: restore original: I'm strongly against this, ptest queue is already too
>>> >> big as it is, we have to at least try and decrease its size by
>>> >> deduplicating jiras in it)
>>> >>
>>> >> I'm personally fine with any of the 1,2,3 methods listed above, with my
>>> >> favourites being 2 and 3.
>>> >> Let me know which one you think is the right path to go down.
>>> >>
>>> >> Thanks,
>>> >> Adam
>>> >>
>>> >> On 20 April 2018 at 20:14, Eugene Koifman <ekoif...@hortonworks.com> wrote:
>>> >>
>>> >>> Would it be possible to add patch name validation when it gets added to
>>> >>> the queue?
>>> >>> Currently I think it fails when the bot gets to the patch if it’s not
>>> >>> named correctly.
>>> >>> More common for branch patches
>>> >>>
>>> >>> On 4/20/18, 8:20 AM, "Zoltan Haindrich" <k...@rxd.hu> wrote:
>>> >>>
>>> >>> Hello,
>>> >>>
>>> >>> Some time ago the ptest queue worked the following way:
>>> >>>
>>> >>> * for some reason ATTACHMENT_ID was not set by the upstream jira scanner
>>> >>> tool; this triggered a feature in Jenkins: if for the same ticket
>>> >>> multiple patches were uploaded, they didn't trigger new runs (because
>>> >>> the parameters were the same)
>>> >>> * this was fixed at some point...around that time I started
>>> >>> getting multiple ptest executions for the same ticket - because I've
>>> >>> fixed a minor typo after submitting the first version of my patch...
>>> >>> * currently we also have a jenkins queue reader inside the ptest
>>> >>> job...which checks if the ticket is in the queue right now; and if it
>>> >>> is, it just exits...this logic kinda restores the earlier behaviour;
>>> >>> with the exception that if I upload a patch every day and the queue is
>>> >>> longer than 1 day (like now); I will never get a ptest run :D
>>> >>> * ...now here I come! I've just removed my patch from yesterday; because
>>> >>> I want a ptest run with my newest patch; and the only way to force the
>>> >>> above logic to do that....is by removing that attachment..
>>> >>>
>>> >>> So...could we go back to the state when the attachment_id was ignored?
>>> >>> I would recommend removing the ATTACHMENT_ID from the jenkins
>>> >>> parameters...
>>> >>>
>>> >>> cheers,
>>> >>> Zoltan
>>> >>>
>>> >>> JenkinsQueueUtil.java:
>>> >>> https://github.com/apache/hive/blob/f8a671d8cfe8a26d1d12c51f93207ec92577c796/testutils/ptest2/src/main/java/org/apache/hive/ptest/api/client/JenkinsQueueUtil.java#L82
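
For illustration only, here is a minimal Python sketch of option 3 from Adam's proposal quoted above (only the first queued job for a given jira and patch name actually runs). It is not the HIVE-19077 implementation: the names SeenPatchRegistry, should_run and run_queue are hypothetical, and the in-memory set stands in for whatever record the ptest server would really keep.

```python
class SeenPatchRegistry:
    """Hypothetical record of (jira, patch name) pairs already picked up for testing."""

    def __init__(self):
        self._seen = set()

    def should_run(self, jira, patch_name):
        """Return True only the first time a (jira, patch name) pair is offered."""
        key = (jira, patch_name)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def run_queue(queue, latest_patch):
    """Drain a simulated queue of jira ids.

    Each queued job looks up the latest patch of its jira when it starts,
    as described in the thread, and is skipped if that patch was already seen.
    """
    registry = SeenPatchRegistry()
    executed = []
    for jira in queue:
        patch = latest_patch[jira]
        if registry.should_run(jira, patch):
            executed.append((jira, patch))
    return executed


# Two jobs were queued for HIVE-XYZ (one per uploaded patch), one for HIVE-ABC.
queue = ["HIVE-XYZ", "HIVE-ABC", "HIVE-XYZ"]
latest_patch = {"HIVE-XYZ": "HIVE-XYZ.2.patch", "HIVE-ABC": "HIVE-ABC.1.patch"}
executed = run_queue(queue, latest_patch)
```

In this sketch the first HIVE-XYZ job keeps its spot in the queue and tests HIVE-XYZ.2.patch, while the later HIVE-XYZ job is skipped because that patch name was already recorded. That is the "less fair scheduling" trade-off noted under option 3.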