Patrick Robb <pr...@iol.unh.edu> writes:

> There was some discussion at last week's CI meeting about usage of the
> Patchwork /events/ endpoint for polling for patches, and issues with that
> process. Here is a relevant blurb, explaining some issues Aaron has run
> into using the dpdk-ci repo "poll-pw.sh" shell script:
>
> ----------------
>
> * Discussion pertaining to looking at polling for series using the events
>   API. This events endpoint (with series created event) returns info that
>   a series has been created, but returns a limited set of data in the
>   payload, and this necessitates a followup request to patchwork. So, this
>   seems like it would actually increase the amount of requests made to the
>   patchwork server. Some related issues discussed are:
>   * You cannot query the events endpoint for only events from a particular
>     project (this matters for patchwork instances with many projects under
>     them). For DPDK there are only 4 projects under DPDK patchwork, so it’s
>     not a huge deal, but still a small issue.
>   * The datetime that the series-created event returns is the datetimes of
>     one of the commits in the series, not the datetime of when the series
>     was submitted. So, this means that if you amend a commit (this does not
>     update commit datetime) and resubmit a patchseries, the datetime on the
>     series-created record will not be “updated”. This can cause us to miss
>     series when polling via the events endpoint.

Sorry - I think there is still a misunderstanding here.

The datetime for the /series/ endpoint is what is provided in the patch
(so it may not be updated on a resubmit).

The datetime for the /events/ endpoint is when the event fires (that is,
when the series is received).

I can reply to the meeting minutes document with this as well.

> ------------------
>
> And for context, poll-pw.sh will check the /events/ endpoint for new
> series created events like so:
>
> --------------------
>
> URL="${URL}/events/?category=${resource_type}-completed"
>
> callcmd () # <patchwork id>
> {
>     eval $cmd
> }
>
> while true ; do
>     date_now=$(date --utc '+%FT%T')
>     since=$(date --utc '+%FT%T' -d $(cat $since_file | tr '\n' ' '))
>     page=1
>     while true ; do
>         ids=$(curl -s "${URL}&page=${page}&since=${since}" |
>               jq "try ( .[] | select( .project.name == \"$project\" ) )" |
>               jq "try ( .payload.${resource_type}.id )")
>         [ -z "$(echo $ids | tr -d '\n')" ] && break
>         for id in $ids ; do
>             if grep -q "^${id}$" $poll_pw_ids_file ; then
>                 continue
>             fi
>             callcmd $id
>             echo $id >>$poll_pw_ids_file
>
> -------------------
>
> But, as was discussed at the meeting, once you have the series ids, then
> you need to make a followup request to /series/{id}.
>
> UNH has a download_patchset.py polling script very much like poll-pw.sh
> except that, because we store extra info about our processed patchseries
> in a database (to facilitate lab.dpdk.org filtering functions), we use our
> database to get the most recently processed patchseries, instead of the
> "since_file." Our process (running every 10 minutes from Jenkins) is like
> this:
>
> 1. get the "since_id" from our database
> 2. get the "newest_id" from
>    https://patchwork.dpdk.org/api/events/?category=series-completed. Get
>    the [0] index of the json response (the most recent patchseries) and
>    save that series id.
> 3. for seriesID in range(since_id, newest_id): get patch from
>    https://patchwork.dpdk.org/api/series/{id}.
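
For illustration, the three steps quoted above could be sketched in Python
roughly as follows. This is only a sketch under assumptions, not the actual
download_patchset.py: the function names and the injected fetch_json helper
are hypothetical, and so is the choice to walk ids from since_id + 1 through
newest_id inclusive (treating since_id as already processed). It also assumes
the first event in the response is the most recent one and that the series id
sits at payload.series.id, as the jq filter in poll-pw.sh suggests.

```python
import json
from typing import Any, Callable, List
from urllib.request import urlopen

API = "https://patchwork.dpdk.org/api"


def http_get_json(url: str) -> Any:
    """Default fetcher: plain GET parsed as JSON (no auth, no retries)."""
    with urlopen(url) as resp:
        return json.load(resp)


def poll_by_id_range(since_id: int,
                     fetch_json: Callable[[str], Any] = http_get_json) -> List[Any]:
    """Fetch every series after since_id, up to the newest one seen in /events/.

    Assumption: since_id is the last series id already processed, so the
    walk starts at since_id + 1 and includes newest_id.
    """
    # Step 2: one request to /events/; index [0] is taken as the newest event.
    events = fetch_json(f"{API}/events/?category=series-completed")
    if not events:
        return []
    newest_id = events[0]["payload"]["series"]["id"]

    # Step 3: one follow-up request per series id in the range.
    # A missing id would make the default fetcher raise; not handled here.
    results = []
    for series_id in range(since_id + 1, newest_id + 1):
        results.append(fetch_json(f"{API}/series/{series_id}"))
    return results
```

With since_id = 102 and a newest id of 105, this makes four requests in
total: one to /events/ and three to /series/.
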
>
> So, both poll-pw.sh and our UNH script follow the process of making a
> request to /events/, and then followup requests for /series/. Thus the
> total number of requests being made on patchwork is (number of new
> patchseries + 1).
>
> -The most consequential difference in the two implementations is that
> poll-pw.sh makes a request to /events/ with the &since=${since} parameter,
> passing in a since datetime, and UNH does not. As Aaron explained at the
> CI meeting, because the datetime provided in the /events/ payload is not
> what one would expect (it gives the datetime of the commit, not when the
> series was submitted) this means that poll-pw.sh can miss series. With the
> UNH lab polling script we don't have this issue because we don't make use
> of the since parameter in our /events/ request. I think the options for
> poll-pw.sh going forward would be:
> 1. Update patchwork so that the datetime provided in the /events/ payload
>    is what is "expected" i.e. the datetime that the series was submitted
>    at.

That is already done.

> 2. Adopt the UNH process of discarding the &since=${since} parameter, and
>    rely solely on tracking the most recently processed patchseries id, get
>    the newest patchseries id from /events/, and traverse the range of
>    (since_id, newest_id).
>
> -I agree it makes sense for /events/ to support a "project" param.
>
> Thanks Aaron for raising this conversation. We can continue the
> conversation over email, or also in person at DPDK Prague!

Let's keep discussing.
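
As long as /events/ is queried without a project filter, the filtering has
to happen on the client, which is what the jq pipeline in poll-pw.sh does.
A rough Python equivalent is below, as a sketch only: the function name and
the `seen` set (standing in for the poll_pw_ids_file) are hypothetical, and
the event layout (project.name, payload.series.id) is assumed from the jq
filters quoted earlier.

```python
from typing import Any, Dict, Iterable, List, Set


def new_series_ids(events: Iterable[Dict[str, Any]],
                   project: str,
                   seen: Set[int]) -> List[int]:
    """Return ids of series events for `project` that are not in `seen`."""
    ids: List[int] = []
    for event in events:
        # Drop events from other projects (per the thread, the request
        # itself is not filtered by project).
        if (event.get("project") or {}).get("name") != project:
            continue
        series = (event.get("payload") or {}).get("series") or {}
        series_id = series.get("id")
        # Skip malformed events, ids already handled, and repeats in this batch.
        if series_id is None or series_id in seen or series_id in ids:
            continue
        ids.append(series_id)
    return ids
```

The caller would then add the returned ids to `seen` after processing each
one, much as poll-pw.sh appends to its ids file.
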