On 19/11/2019 19:47, Richard Earnshaw (lists) wrote:
> On 19/11/2019 19:32, Eric S. Raymond wrote:
>> Richard Earnshaw (lists) <richard.earns...@arm.com>:
>>> I was looking at the reposurgeon code last night, and I think I can see what
>>> the problem *might* be, but I haven't had time to produce a testcase.
>>>
>>> Some of our commits have mergeinfo that looks a bit like this:
>>>
>>> 202022-202023,202026,202028-202029,202036,202039-202041,202043-202044,202048-202049,202051-202056,202058-202061,202064-202065,202068-202071,202077,202079-202082,202084,202086-202088,202092-202104,202106-202113,202115-202119,202121,202124-202134,202139,202142-202146,202148-202150,202153-202154,202158-202159,202163-202165,202168,202172,202174,202179-202180,202184-202192,202195,202197,202202-202208,202225-202230,202232-202233,202237-202239,202242,202244-202245,202247,202250-202251,202258-202264,202266,202269,202271-202275,202279,202281-202282,202284,202286,202289-202292,202296-202299,202301-202302,202305,202309,202311-202323,202327-202335,202337,202339,202343-202346,202350,202352,202356-202357,202359-202360,202363-202371,202373-202374,202377,202379-202382,202384,202389,202391-202395,202398-202407,202409,202411,202416-202418,202421
>>>
>>> which is a massive long list with a number of holes in it.
>>>
>>> But I suspect the holes are really commits to other branches and that the
>>> above describes a linear chain along one branch. If so, rather than
>>> producing links to each subgroup (and perhaps dropping single non-list
>>> elements), the description can be mapped back to a contiguous sequence of
>>> commits down a branch and thus should really resolve to a single child being
>>> used for the merge source. At present, I think for the above we're seeing a
>>> child reference created for each subrange in that list.
>>
>> I have no doubt you are correct. Detecting such interrupted ranges is
>> going to be... interesting.
>>
>>> Incidentally, the mergeinfo pass on the gcc repo is currently taking about 8
>>> hours on my machine, that's 80-90% of the entire conversion time. But it
>>> might be related to the above.
>>
>> You must be running the old Python code, there was an O(n**2) in that
>> phase that has since been fixed. Try the Go code from
>> https://gitlab.com/esr/reposurgeon; it is *much* faster.
>>
>
> Nope, that was from running the Go version from yesterday. This one, to
> be precise: 1ab3c514c6cd5e1a5d6b68a8224df299751ca637
>
> This pass used to be very fast a couple of weeks back, but something
> went in recently that's caused a major slowdown.
>
> Oh, and I've been having problems with the ChangeLogs command as well.
> It used to run fine on my machine (128G), but now it's started blowing
> memory and taking my X server down.
>
> R.

Here's the stats output:

# Statistics on read and processing times
timing
        commits:  276738 (from 278380)
        parsing:   2.85%  14m22.861991058s
       cleaning:   0.32%  1m37.653100823s
       filemaps:   0.37%  1m52.851558995s
        commits:   4.40%  22m15.380157228s
     rootcommit:   0.00%  8.779µs
       branches:   0.04%  12.710113776s
        parents:   0.00%  121.73484ms
           root:   0.00%  267.997µs
    branchlinks:   0.00%  10.58361ms
      mergeinfo:  91.67%  7h43m15.416510183s
       branches:   0.00%  11.616µs
         dejunk:   0.04%  10.672889443s
      polishing:   0.04%  11.249533399s
      tagifying:   0.03%  10.528735532s
    tagcleaning:   0.03%  9.880052536s
     debubbling:   0.00%  1.384357053s
    renumbering:   0.20%  59.718288526s
          total:   9/sec  8h25m20.439895394s
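
For reference, the "interrupted ranges" idea quoted above could be sketched roughly as follows. This is only an illustration in Python, not reposurgeon's actual code: the function names and the `branch_revs` input (a sorted list of the revisions that really were committed on the merge source branch) are assumptions made up for the example. Two neighbouring subranges are joined whenever no commit on the source branch falls in the hole between them, so a list full of holes caused by commits to other branches collapses to a single range.

```python
from bisect import bisect_left, bisect_right


def parse_revision_list(text):
    """Parse the revision part of a mergeinfo entry, e.g. '5-7,9,11-12'.

    Returns a sorted list of inclusive (start, end) tuples. A trailing '*'
    (Subversion's marker for a non-inheritable range) is ignored here.
    """
    ranges = []
    for item in text.split(","):
        item = item.strip().rstrip("*")
        if not item:
            continue
        lo, _, hi = item.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return sorted(ranges)


def coalesce(ranges, branch_revs):
    """Join neighbouring ranges whose gap holds no source-branch commit.

    branch_revs is a hypothetical input: the sorted revision numbers that
    were committed on the merge source branch.
    """
    merged = []
    for start, end in ranges:
        if merged:
            prev_start, prev_end = merged[-1]
            # Count branch revisions strictly between prev_end and start.
            i = bisect_right(branch_revs, prev_end)
            j = bisect_left(branch_revs, start)
            if i >= j:  # hole is empty on this branch (or ranges overlap)
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


if __name__ == "__main__":
    ranges = parse_revision_list("202022-202023,202026,202028-202029")
    on_branch = [202022, 202023, 202026, 202028, 202029]
    print(coalesce(ranges, on_branch))
```

If the source branch did have a commit inside one of the holes (one that was deliberately not merged), the ranges on either side of it stay separate, so genuinely cherry-picked merges are still distinguishable from a plain linear chain.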