On Tue, Nov 14, 2023 at 9:38 AM Lehua Ding <lehua.d...@rivai.ai> wrote:
>
> On 2023/11/14 16:14, Richard Biener wrote:
> > On Mon, Nov 13, 2023 at 11:39 PM Vladimir Makarov <vmaka...@redhat.com> wrote:
> >>
> >> On 11/12/23 07:08, Lehua Ding wrote:
> >>> This patch adds a live_subreg problem to extend the original live_reg to
> >>> track the liveness of subregs.  We will only try to track pseudo registers
> >>> whose mode size is a multiple of the natural size and where eventually a
> >>> small portion of the inside will appear to use subreg.  With the live_reg
> >>> problem, the live_subreg problem will have the following output.
> >>> full_in/out mean the entire pseudo is live in/out, partial_in/out mean the
> >>> subregs of the pseudo are live in/out, and range_in/out indicates which
> >>> part of the pseudo is live.  all_in/out is the union of full_in/out and
> >>> partial_in/out:
> >>>
> >> I am not a maintainer or reviewer of the data-flow analysis framework and
> >> cannot approve this patch except for the changes in regs.h.  Richard
> >> Sandiford or Jeff Law, as global reviewers, can probably do this.
> >>
> >> As for the regs.h changes, they are OK for me after fixing the general
> >> issues I mentioned in my previous email (two spaces after sentence ends in
> >> the comments).
> >>
> >> I think all this code is a major compile-time and memory consumer in
> >> the whole patch set.  DF analysis is slow by itself even when only
> >> efficient data structures such as bitmaps are used, but you are
> >> introducing an even slower data structure, maps (I believe a
> >> better-performing data structure can be used instead).  In the very first
> >> version of LRA I used DFA, but it made LRA so slow that I had to introduce
> >> my own data structures, which are faster in the case of massive RTL
> >> changes in LRA.  The same problem exists when using generic C++ standard
> >> library data structures such as vectors and maps in critical code.  It is
> >> hard to get the needed performance when the exact implementation can vary
> >> or not be what you need, e.g. vector initial capacity, growth, etc.  But
> >> again, the performance issues can be addressed later.
> >
> > I think the important bit should be that the subreg live analysis should be
> > opt-in and, when not enabled, shouldn't have a bad effect on memory
> > usage and compile-time.  At -O0 and -O1 RA consumes a major
> > amount of compile-time.
>
> This is perfectly fine; the code inside the live_subreg problem has a
> branch that goes through logic similar to live_reg if it finds no subreg
> inside the program.  Also, when the optimization level is less than 2, it
> doesn't track subregs.  By the way, I'd like to ask whether you have
> particular programs to offer where RA has a big impact on compilation
> time?  Or any suggestions about it?
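
As an aside on the quoted patch description, the relationship between the full, partial, range and all sets can be sketched as below. This is only a minimal illustration under stated assumptions, not the patch's code: `LiveSubregInfo`, `mark_use`, `all_live` and the promotion of a pseudo to "full" once every part is live are all hypothetical, and plain Python sets stand in for whatever bitmaps or maps the real implementation uses.

```python
# Hypothetical illustration only: the names mirror the terms in the quoted
# patch description, not GCC's actual data structures or functions.
from dataclasses import dataclass, field


@dataclass
class LiveSubregInfo:
    full: set = field(default_factory=set)      # pseudos live in their entirety
    partial: set = field(default_factory=set)   # pseudos with only some parts live
    ranges: dict = field(default_factory=dict)  # pseudo -> set of live part indices

    def all_live(self):
        # "all" is described as the union of "full" and "partial".
        return self.full | self.partial


def mark_use(info, regno, nparts, parts=None):
    """Record a use of pseudo `regno`, which is split into `nparts` parts.

    parts=None means a whole-register use; otherwise `parts` lists the
    part indices touched by a subreg access.
    """
    if regno in info.full:
        return  # already entirely live, nothing more to track
    every_part = set(range(nparts))
    used = every_part if parts is None else set(parts)
    live = info.ranges.get(regno, set()) | used
    if live == every_part:
        # Assumption: once every part is live, treat the pseudo as fully live.
        info.partial.discard(regno)
        info.ranges.pop(regno, None)
        info.full.add(regno)
    else:
        info.partial.add(regno)
        info.ranges[regno] = live


info = LiveSubregInfo()
mark_use(info, 100, nparts=4, parts=[0])  # only part 0 of pseudo 100 is used
mark_use(info, 101, nparts=2)             # pseudo 101 is used as a whole
```

A program with no subreg accesses never populates `partial` or `ranges` in this sketch, which loosely mirrors the fallback to live_reg-like behaviour described above.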
I suggest you farm bugzilla for the compile-time-hog / memory-hog
testcases.  I do have a set of "large" testcases.  Scanning the results
points at PRs 36262, 37448, 39326, 69609, all having RA in the 20% area
at -O0 -g.

It's also a good idea to take, say, cc1files (a set of preprocessed
sources that produce GCC's cc1) and look at the overall impact of a
change on compile time and memory usage for those, which are
representative of "normal" TUs, as opposed to the PRs above, which often
are large machine-generated TUs (an important area where GCC usually
shines, at least at -O1).

Richard.

> --
> Best,
> Lehua (RiVAI)
> lehua.d...@rivai.ai