Dear Arjun,

On Mon, 1 Apr 2019 at 18:38, Arjun Salyan <arjun.salyan.ch...@itbhu.ac.in> wrote:
>
> Hi,
> I was working on keeping the PortIndex updated, and was able to achieve this:
>
> Sync Portindex from
> 'rsync://rsync.macports.org/macports//trunk/dports/PortIndex_darwin_16_i386/PortIndex'
> Update or Add ports that were recently built on 10.14_x86_64 (using time
> frame 'last 24 hours' for now).
> New ports, (SoapyAirspy, SoapyAirspyHF etc) were successfully added, and can
> now be seen on the demo app.
>
> This is exactly the approach I wrote in the proposal and I wanted to show a
> working demo, so that I can get feedback about how efficient this method is.
> The script I used: update_portindex.py . ( note: the code might not be very
> well written, I was just looking to get things working. Also, I am only
> updating ports built on '10.14_x86_64')

(It might have been easier to comment on the pull request, but I noticed that
those commits did not make it into the pull request.)

This is an interesting approach which should mostly work, just not always and
not very reliably. The drawbacks include:
- some ports will be skipped on the builder for various reasons (the port is
  known not to build on a particular builder, it may not be distributable, ...)
- the buildbot master may be down or experience problems, so data might go
  missing

A strange observation from your source code: you synced the PortIndex and ran
the conversion, but then loaded the data from another JSON file? Am I missing
something?

There are various ways to achieve the goal.

Note that if you run portindex yourself, it will detect which files have been
updated and only touch the data of those ports. The portindex command could be
modified to only output a file with the changes (when you pass some options to
it). This would still miss deletions, but it would be an efficient way with
almost no dependencies.

One way would be to generate the PortIndex yourself, always remember which git
shasum was used, and store that shasum in the database. The next time you
update, check and store the latest shasum, then ask git which paths have
changed between the two commits, and only update the ports whose paths match
the paths reported by git as changed.

It could also help if you stored a "complete" git history in the database
(shasum, which ports changed at that point, timestamp, parents). Not sure if
that's really so helpful, just mentioning it as an option.

An interesting approach might be to try to squeeze the git shasum into the
PortIndex. This could also help when submitting statistics, as it would be
easier to determine how old the database is / when the user last synced. (It
would not work for people with their own modifications of the tree.) If you had
the shasum in the PortIndex, you could still run git independently to check for
the difference.

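To make the "ask git which paths changed between two shasums" idea a bit more
concrete, here is a rough Python sketch. It is only an illustration under some
assumptions: the ports tree is a local git checkout laid out as
category/port/..., top-level entries starting with "_" or "." (such as
_resources) are not ports, and the function names are made up for this example.

```python
import subprocess


def changed_files(repo, old_sha, new_sha):
    """Return the paths that differ between two commits, as reported by git."""
    out = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", old_sha, new_sha],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def ports_from_paths(paths):
    """Map changed paths to 'category/port' directories.

    Assumption: the tree is laid out as category/port/..., and top-level
    entries starting with '_' or '.' are not ports.
    """
    ports = set()
    for path in paths:
        parts = path.split("/")
        # Skip top-level files and non-port directories.
        if len(parts) < 3 or parts[0].startswith(("_", ".")):
            continue
        ports.add(parts[0] + "/" + parts[1])
    return ports
```

You would call ports_from_paths(changed_files(tree, stored_sha, latest_sha))
and then update only those ports. Since git also lists files that were removed,
a port that was deleted shows up in the result as well, so checking whether its
Portfile still exists might be one way to catch deletions.
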
You could keep the full PortIndex in git after you sync it and check the diffs.
(Not sure whether it would be trivial to figure out which ports changed,
probably not.)

Just some random ideas.

Regarding updates of builds: just ask the database which build you synced last,
and then sync any builds newer than that, up to the latest one. You may need to
check whether a build was complete when you last enquired.

Mojca
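
P.S. A rough sketch of the build-sync idea, in case it helps. It assumes each
build is available as a dict with a "number" and a "complete" flag; those field
names are made up for illustration and are not taken from the buildbot API.
Stopping at the first incomplete build is one possible way to make sure that a
build which was still running at the last sync is not skipped the next time.

```python
def builds_to_sync(builds, last_synced):
    """Return builds newer than last_synced, oldest first.

    Stops at the first incomplete build, so that the stored "last synced"
    number never moves past a build that is still running.
    Assumption: each build is a dict with 'number' and 'complete' keys.
    """
    pending = sorted(
        (b for b in builds if b["number"] > last_synced),
        key=lambda b: b["number"],
    )
    ready = []
    for build in pending:
        if not build["complete"]:
            break
        ready.append(build)
    return ready
```

After syncing, the number of the last build in the returned list would be
stored in the database as the new starting point.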