This is now committed and has been deployed some hours ago - don't worry if
you see your job failed I resubmitted everything from the queue.
I will be keeping an eye on how ptest works after this change.

Thanks,
Adam

On 2 May 2018 at 17:34, Adam Szita <sz...@cloudera.com> wrote:

> I have a patch available for the voted version at
> https://issues.apache.org/jira/browse/HIVE-19077. Let me know what you
> think.
>
> On 27 April 2018 at 15:55, Adam Szita <sz...@cloudera.com> wrote:
>
>> Thanks to all for the responses.
>> As I see it, option 3 is the winning one. Next week I'm going start
>> working on this one then (unless any objections of course).
>>
>> Adam
>>
>> On 26 April 2018 at 05:48, Deepak Jaiswal <djais...@hortonworks.com>
>> wrote:
>>
>>> +1 for option 3. Thanks Adam for taking this up again.
>>>
>>> Regards,
>>> Deepak
>>>
>>> On 4/25/18, 4:54 PM, "Thejas Nair" <thejas.n...@gmail.com> wrote:
>>>
>>>     Option 3 seems reasonable. I believe that used to be the state a
>>> while
>>>     back (maybe 12 months back or so).
>>>     When 2nd ptest for same jira runs, it checks if the latest patch has
>>>     already been run.
>>>
>>>
>>>     On Wed, Apr 25, 2018 at 7:37 AM, Peter Vary <pv...@cloudera.com>
>>> wrote:
>>>     > I would vote for version 3. It would solve the big patch problem,
>>> and removes the unnecessary test runs too.
>>>     >
>>>     > Thanks,
>>>     > Peter
>>>     >
>>>     >> On Apr 25, 2018, at 11:01 AM, Adam Szita <sz...@cloudera.com>
>>> wrote:
>>>     >>
>>>     >> Hi all,
>>>     >>
>>>     >> I had a patch (HIVE-19077) committed with the original aim being
>>> the
>>>     >> prevention of wasting resources when running ptest on the same
>>> patch
>>>     >> multiple times:
>>>     >> It is supposed to manage scenarios where a developer uploads
>>>     >> HIVE-XYZ.1.patch, that gets queued in jenkins, then before
>>> execution
>>>     >> HIVE-XYZ.2.patch (for the same jira) is uploaded and that gets
>>> queued also.
>>>     >> When the first patch starts to execute ptest will see that patch2
>>> is the
>>>     >> latest patch and will use that. After some time the second queued
>>> job will
>>>     >> also run on this very same patch.
>>>     >> This is just pointless and causes long queues to progress slowly.
>>>     >>
>>>     >> My idea was to remove these duplicates from the queue where I'd
>>> only keep
>>>     >> the latest queued element if I see more queued entries for the
>>> same jira
>>>     >> number. It's like when you go grocery shopping and you're already
>>> in line
>>>     >> at cashier but you realise you also need e.g. milk. You go grab
>>> it and join
>>>     >> the END of the queue. So I believe it's a fair punishment for
>>> losing one's
>>>     >> spot in the queue for making amends on their patch.
>>>     >>
>>>     >> That said Deepak made me realise that for big patches this will
>>> be very
>>>     >> cumbersome due to the need of constant rebasing to avoid
>>> conflicts on patch
>>>     >> application.
>>>     >> I have three proposals now:
>>>     >>
>>>     >> 1: Leave this as it currently is (with HIVE-19077 committed) -
>>> *only the
>>>     >> latest queued job will run of the same jira*
>>>     >> pros: no wasting resources to run the same patches more times,
>>> 'scheduling'
>>>     >> is fair: if you amend you're patch you may loose your original
>>> spot in the
>>>     >> queue
>>>     >> cons: big patches that are prone to conflicts will be hard to get
>>> executed
>>>     >> in ptest, devs will have to wait more time for their ptest
>>> results if they
>>>     >> amend their patches
>>>     >>
>>>     >> 2: *Add a safety switch* to this queue checking feature
>>> (currently proposed
>>>     >> in HIVE-19077), deduplication can be switch off on request
>>>     >> pros: same as 1st, + ability to have more control on this
>>> mechanism i.e.
>>>     >> turn it off for big/urgent patches
>>>     >> cons: big patches that use the swich might still waste resources,
>>> also devs
>>>     >> might use safety switch inappropriately for their own evil
>>> benefit :)
>>>     >>
>>>     >> 3: Deduplication the other way around - *only the first queued
>>> job will run
>>>     >> of the same jira*, ptest server will keep record of patch names
>>> and won't
>>>     >> execute a patch with a seen name and jira number again
>>>     >> pros: same patches will not be executed more times accidentally,
>>> big
>>>     >> patches won't be a problem either, devs will get their ptest
>>> result back
>>>     >> earlier even if more jobs are triggered for same jira/patch name
>>>     >> cons: scheduling is less fair: devs can reserve their spots in
>>> the queue
>>>     >>
>>>     >>
>>>     >> (0: restore original: I'm strongly against this, ptest queue is
>>> already too
>>>     >> big as it is, we have to at least try and decrease its size by
>>>     >> deduplicating jiras in it)
>>>     >>
>>>     >> I'm personally fine with any of the 1,2,3 methods listed above,
>>> with my
>>>     >> favourites being 2 and 3.
>>>     >> Let me know which one you think is the right path to go down on.
>>>     >>
>>>     >> Thanks,
>>>     >> Adam
>>>     >>
>>>     >> On 20 April 2018 at 20:14, Eugene Koifman <
>>> ekoif...@hortonworks.com> wrote:
>>>     >>
>>>     >>> Would it be possible to add patch name validation when it gets
>>> added to
>>>     >>> the queue?
>>>     >>> Currently I think it fails when the bot gets to the patch if
>>> it’s not
>>>     >>> named correctly.
>>>     >>> More  common for branch patches
>>>     >>>
>>>     >>> On 4/20/18, 8:20 AM, "Zoltan Haindrich" <k...@rxd.hu> wrote:
>>>     >>>
>>>     >>>    Hello,
>>>     >>>
>>>     >>>    Some time ago the ptest queue worked the following way:
>>>     >>>
>>>     >>>    * for some reason ATTACHMENT_ID was not set by the upstream
>>> jira
>>>     >>> scanner
>>>     >>>    tool; this  triggered a feature in Jenkins: if for the same
>>> ticket
>>>     >>>    mutliple patches were uploaded; they didn't triggered new runs
>>>     >>> (because
>>>     >>>    the parameters were the same)
>>>     >>>    * this have become fixed at some point...around that time I
>>> started
>>>     >>>    getting multiple ptest executions for the same ticket -
>>> because I've
>>>     >>>    fixed a minor typo after submitting the first version of my
>>> patch...
>>>     >>>    * currently we also have a jenkins queue reader inside the
>>> ptest
>>>     >>>    job...which checks if the ticket is in the queue right now;
>>> and if is
>>>     >>>    it, it just exits...this logic kinda restores the earlier
>>> behaviour;
>>>     >>>    with the exception that if I upload a patch every day and the
>>> queue is
>>>     >>>    longer that 1day (like now); I will never get a ptest run :D
>>>     >>>    * ...now here I come! I've just removed my patch from
>>> yesterday;
>>>     >>> because
>>>     >>>    I want a ptest run with my newest patch; and the only way to
>>> force the
>>>     >>>    above logic to do that....is by removing that attachment..
>>>     >>>
>>>     >>>
>>>     >>>    So...could we go back to the state when the attachment_id was
>>> ignored?
>>>     >>>    I would recommend to remove the ATTACHMENT_ID from the jenkins
>>>     >>> parameters...
>>>     >>>
>>>     >>>    cheers,
>>>     >>>    Zoltan
>>>     >>>
>>>     >>>    JenkinsQueueUtil.java:
>>>     >>>    https://github.com/apache/hive/blob/f8a671d8cfe8a26d1d12c51f
>>> 93207e
>>>     >>> c92577c796/testutils/ptest2/src/main/java/org/apache/hive/
>>>     >>> ptest/api/client/JenkinsQueueUtil.java#L82
>>>     >>>
>>>     >>>
>>>     >>>
>>>     >>>
>>>     >
>>>
>>>
>>>
>>>
>>
>

Reply via email to