2016-11-07 15:59 GMT+01:00 Thierry Goubier <thierry.goub...@gmail.com>:
> 2016-11-07 15:42 GMT+01:00 Nicolas Passerini <npasser...@gmail.com>:
>
>>
>> 2016-11-07 15:30 GMT+01:00 Thierry Goubier <thierry.goub...@gmail.com>:
>>
>>>> Thierry, If I'm not mistaken, Esteban is referring to the fact that in
>>>> FileTree we are still using Monticello to do the load of the packages and
>>>> even when we are running metadataless, we end creating fake meta data to
>>>> simulate an mcz ... you and I have had conversations about ways to
>>>> eliminate this "requirement" because it is meaningless in a git context ...
>>>>
>>>
>>> Yes, this I understood. I do believe that what I suggested at one point
>>> (have the ability to compare versions with an 'isAncestorOf') would be very
>>> nice for that transition (work in mcz as well as on git with/without
>>> metadata).
>>>
>>
>> I would like to listen to your ideas about this topic, but I am not sure
>> it is possible to achieve that compatibility. In fact we tried to do it for
>> Iceberg and at some point we decided to abort it.
>>
>
> I have an idea where such comparisons are done. I'd simply start by
> changing version numbers to the short commit ID, and see where it breaks ...
>

I started creating version names using the unix date (the number of
seconds since 1970), which lets me provide version numbers without
complex calculations and without breaking Monticello. The numbers are
not pretty, but we do not use them anyway; they exist only to satisfy
Monticello's requirements.

>
>> On one side, trying to re-create monticello sequential file numbers in
>> git is simply not possible, at least in a reliable way. On the other side,
>> loading the graph of package versions and dependencies is really slow for
>> big repositories (such as pharo-core), once we removed that requirement
>> Iceberg got like 100x faster.
>>
>
> Yes, this is performance sensitive code. I spent a significant amount of
> time trying to optimize the smalltalk part of that in gitfiletree a few
> years ago... before I was shown that it could only scale to ~ 1000 commits
> (with a repository that had more than 8000 different versions for a
> package).
>
> Delegating everything to `git log` solved the issue. No code duplication!
>
> Overall, this is something that has worried me since the beginning of
> libgit. Libgit is low level and pushes some of the processing into Pharo,
> which is not the best tool to do high-speed processing of tree structures.
> And then, instead of designing the best solution to our problem, you end up
> trying to get the best design that doesn't hit a performance issue...
>

I am using git log extensively; it turns out to be a very powerful tool.
Libgit provides a very similar tool called revwalk
(https://libgit2.github.com/libgit2/#HEAD/group/revwalk). So in either
case Pharo does not have to do the performance-sensitive computation
itself.

The problem I found is that, in order to recreate sequential numbers,
you have to load all commits into the image.
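
To make the unix-date naming above concrete, here is a minimal sketch in Python. It is not the actual Iceberg code, only an illustration. It assumes the usual Monticello `Package-author.number` name shape; the function names `version_name` and `version_number`, and the sample package and author, are hypothetical.

```python
import time


def version_name(package, author, commit_time=None):
    """Build a Monticello-style name 'Package-author.number'.

    The number is a unix timestamp (seconds since 1970) rather than a
    sequential counter, so it needs no knowledge of earlier versions.
    If commit_time is omitted, the current time is used.
    """
    seconds = int(time.time() if commit_time is None else commit_time)
    return f"{package}-{author}.{seconds}"


def version_number(name):
    """Extract the trailing number from a 'Package-author.number' name."""
    return int(name.rsplit(".", 1)[1])


# Hypothetical package and author, with an example timestamp.
name = version_name("MyPackage", "np", 1478530740)
print(name)
print(version_number(name))
```

The point of the design is that the number comes from a single commit's own timestamp, so nothing has to walk the history. A sequential number, by contrast, depends on how many versions came before it, which is what forces loading all commits.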