On Mon, Jun 7, 2021 at 7:26 PM Andrew MacLeod <amacl...@redhat.com> wrote: > > On 6/7/21 4:40 AM, Richard Biener wrote: > > On Wed, Jun 2, 2021 at 11:15 PM Andrew MacLeod <amacl...@redhat.com> wrote: > >> > >> My thoughts are I would put this into trunk, and assuming nothing comes > >> up over the next couple of days, port it back to GCC11 to resolve > >> 100299 and other excessive memory consumption PRs there as well. given > >> that its reusing bitmap code for the sparse representation, it seems > >> like it would be low risk. > >> > >> Are we OK with the addition of the bitmap_get_quad and bitmap_set_quad > >> routines in bitmap.c? It seems like they might be useful to others. > >> They are simple tweaks of bitmap_set_bit and bitmap_bit_p.. just dealing > >> with 4 bits at a time. I could make them local if this is a problem, > >> but i don't have access to the bitmap internals there. > > I think _quad is a bit too specific - it's aligned chunks so maybe > > > > void bitmap_set_aligned_chunk (bitmap, unsigned int chunk, unsigned > > int chunk_size, BITMAP_WORD chunk_value); > > > > and > > > > BITMAP_WORD bitmap_get_aligned_chunk (bitmap, unsigned int chunk, > > unsigned chunk_size); > > > > and assert that chunk_size is power-of-two and fits in BITMAP_WORD? > > > > (also note using unsigned ints and BITMAP_WORD for the data type) > > > > I've been using two-bit representations in a few places (but mostly > > setting/testing the > > respective bits independently), I suppose for example > > That's exactly how this started.. I was using a pair of bits for > pointers. UNDEFINED, zero, non-zero and varying... and checking the bits > independently. when I decided I needed 3 bits, the whole quad thing > evolved since picking up 3 or 4 consecutive bits one at a time seemed > too inefficient. > > > > > static dep_state > > query_loop_dependence (class loop *loop, im_mem_ref *ref, dep_kind kind) > > { > > unsigned first_bit = 6 * loop->num + kind * 2; > > if (bitmap_bit_p (&ref->dep_loop, first_bit)) > > return dep_independent; > > else if (bitmap_bit_p (&ref->dep_loop, first_bit + 1)) > > return dep_dependent; > > > > could use a chunk size of 2 and a single bitmap query. Incidentially this > > specific code uses 6 bits, so it's not fully aligned ... > > > > /* We use six bits per loop in the ref->dep_loop bitmap to record > > the dep_kind x dep_state combinations. */ > > > > enum dep_kind { lim_raw, sm_war, sm_waw }; > > enum dep_state { dep_unknown, dep_independent, dep_dependent }; > > > > ... but there's also at most a single bit set. > > > > Anyway, I'm OK with adding API to access aligned power-of-two sized chunks. > > Even not power-of-two sized unaligned chunks should be quite straight > > forward to implement if we limit the chunk size to BITMAP_WORD by > > simply advancing to the next bitmap word / element when necessary. > > > > An alternative low-level API would provide accesses to whole BITMAP_WORD > > entries and the quads could be implemented on top of that > > (bitmap_set_word/_get_word) > > > > Richard. > > > I think I'll stick to the power of 2 limitation for now. If someone > finds a pressing need or desire, they can enhance it :-) > > Wanna eyeball this an make sure I'm not doing something unportable.. I > just used your original 2 function names, and swapped out the 4 and a > couple of constants for computed values. Works fine for me. > > I also made the self test process 2, 4 and 8 bit quantities. > > Its going thru a test cycle now.
+ gcc_checking_assert (__builtin_popcount (chunk_size) == 1); please use pow2p_hwi (chunk_size) instead, __builtin_popcount might not be available with non-GCC host compilers. Otherwise looks good to me. Thanks, Richard. > Andrew > >