+1 for option 3. Thanks Adam for taking this up again. Regards, Deepak
On 4/25/18, 4:54 PM, "Thejas Nair" <thejas.n...@gmail.com> wrote: Option 3 seems reasonable. I believe that used to be the state a while back (maybe 12 months back or so). When 2nd ptest for same jira runs, it checks if the latest patch has already been run. On Wed, Apr 25, 2018 at 7:37 AM, Peter Vary <pv...@cloudera.com> wrote: > I would vote for version 3. It would solve the big patch problem, and removes the unnecessary test runs too. > > Thanks, > Peter > >> On Apr 25, 2018, at 11:01 AM, Adam Szita <sz...@cloudera.com> wrote: >> >> Hi all, >> >> I had a patch (HIVE-19077) committed with the original aim being the >> prevention of wasting resources when running ptest on the same patch >> multiple times: >> It is supposed to manage scenarios where a developer uploads >> HIVE-XYZ.1.patch, that gets queued in jenkins, then before execution >> HIVE-XYZ.2.patch (for the same jira) is uploaded and that gets queued also. >> When the first patch starts to execute ptest will see that patch2 is the >> latest patch and will use that. After some time the second queued job will >> also run on this very same patch. >> This is just pointless and causes long queues to progress slowly. >> >> My idea was to remove these duplicates from the queue where I'd only keep >> the latest queued element if I see more queued entries for the same jira >> number. It's like when you go grocery shopping and you're already in line >> at cashier but you realise you also need e.g. milk. You go grab it and join >> the END of the queue. So I believe it's a fair punishment for losing one's >> spot in the queue for making amends on their patch. >> >> That said Deepak made me realise that for big patches this will be very >> cumbersome due to the need of constant rebasing to avoid conflicts on patch >> application. >> I have three proposals now: >> >> 1: Leave this as it currently is (with HIVE-19077 committed) - *only the >> latest queued job will run of the same jira* >> pros: no wasting resources to run the same patches more times, 'scheduling' >> is fair: if you amend you're patch you may loose your original spot in the >> queue >> cons: big patches that are prone to conflicts will be hard to get executed >> in ptest, devs will have to wait more time for their ptest results if they >> amend their patches >> >> 2: *Add a safety switch* to this queue checking feature (currently proposed >> in HIVE-19077), deduplication can be switch off on request >> pros: same as 1st, + ability to have more control on this mechanism i.e. >> turn it off for big/urgent patches >> cons: big patches that use the swich might still waste resources, also devs >> might use safety switch inappropriately for their own evil benefit :) >> >> 3: Deduplication the other way around - *only the first queued job will run >> of the same jira*, ptest server will keep record of patch names and won't >> execute a patch with a seen name and jira number again >> pros: same patches will not be executed more times accidentally, big >> patches won't be a problem either, devs will get their ptest result back >> earlier even if more jobs are triggered for same jira/patch name >> cons: scheduling is less fair: devs can reserve their spots in the queue >> >> >> (0: restore original: I'm strongly against this, ptest queue is already too >> big as it is, we have to at least try and decrease its size by >> deduplicating jiras in it) >> >> I'm personally fine with any of the 1,2,3 methods listed above, with my >> favourites being 2 and 3. >> Let me know which one you think is the right path to go down on. >> >> Thanks, >> Adam >> >> On 20 April 2018 at 20:14, Eugene Koifman <ekoif...@hortonworks.com> wrote: >> >>> Would it be possible to add patch name validation when it gets added to >>> the queue? >>> Currently I think it fails when the bot gets to the patch if it’s not >>> named correctly. >>> More common for branch patches >>> >>> On 4/20/18, 8:20 AM, "Zoltan Haindrich" <k...@rxd.hu> wrote: >>> >>> Hello, >>> >>> Some time ago the ptest queue worked the following way: >>> >>> * for some reason ATTACHMENT_ID was not set by the upstream jira >>> scanner >>> tool; this triggered a feature in Jenkins: if for the same ticket >>> mutliple patches were uploaded; they didn't triggered new runs >>> (because >>> the parameters were the same) >>> * this have become fixed at some point...around that time I started >>> getting multiple ptest executions for the same ticket - because I've >>> fixed a minor typo after submitting the first version of my patch... >>> * currently we also have a jenkins queue reader inside the ptest >>> job...which checks if the ticket is in the queue right now; and if is >>> it, it just exits...this logic kinda restores the earlier behaviour; >>> with the exception that if I upload a patch every day and the queue is >>> longer that 1day (like now); I will never get a ptest run :D >>> * ...now here I come! I've just removed my patch from yesterday; >>> because >>> I want a ptest run with my newest patch; and the only way to force the >>> above logic to do that....is by removing that attachment.. >>> >>> >>> So...could we go back to the state when the attachment_id was ignored? >>> I would recommend to remove the ATTACHMENT_ID from the jenkins >>> parameters... >>> >>> cheers, >>> Zoltan >>> >>> JenkinsQueueUtil.java: >>> https://github.com/apache/hive/blob/f8a671d8cfe8a26d1d12c51f93207e >>> c92577c796/testutils/ptest2/src/main/java/org/apache/hive/ >>> ptest/api/client/JenkinsQueueUtil.java#L82 >>> >>> >>> >>> >