On 13 Aug 2025, at 17:12, Wei Yang wrote:

> On Tue, Aug 12, 2025 at 11:55:10AM -0400, Zi Yan wrote:
> [...]
>> +/*
>> + * gather_folio_orders - scan through [vaddr_start, len) and record folio 
>> orders
>> + * @vaddr_start: start vaddr
>> + * @len: range length
>> + * @pagemap_fd: file descriptor to /proc/<pid>/pagemap
>> + * @kpageflags_fd: file descriptor to /proc/kpageflags
>> + * @orders: output folio order array
>> + * @nr_orders: folio order array size
>> + *
>> + * gather_folio_orders() scan through [vaddr_start, len) and check all 
>> folios
>> + * within the range and record their orders. All order-0 pages will be 
>> recorded.
>
> I feel a little confused about the description here. Especially on the
> behavior when the range is not aligned on folio boundary.

I was too ambitious on this function. It is intended to just check after
split folio orders. I will move the function to split_huge_page_test.c
and rename it to gather_after_split_folio_orders() and
check_after_split_folio_orders().

>
> See following code at 1) and 2).
>
>> + * Non-present vaddr is skipped.
>> + *
>> + *
>> + * Return: 0 - no error, -1 - unhandled cases
>> + */
>> +static int gather_folio_orders(char *vaddr_start, size_t len,
>> +                           int pagemap_fd, int kpageflags_fd,
>> +                           int orders[], int nr_orders)
>> +{
>> +    uint64_t page_flags = 0;
>> +    int cur_order = -1;
>> +    char *vaddr;
>> +
>> +    if (!pagemap_fd || !kpageflags_fd)
>> +            return -1;
>
> If my understanding is correct, we use open() to get a file descriptor.
>
> On error it returns -1. And 0 is a possible valid value, but usually used by
> stdin. The code may work in most cases, but seems not right.

Will fix it to

if (pagemap_fd == -1 || kpageflags_fd == -1)

>
>> +    if (nr_orders <= 0)
>> +            return -1;
>> +
>
> Maybe we want to check orders[] here too?
>
>> +    for (vaddr = vaddr_start; vaddr < vaddr_start + len;) {
>> +            char *next_folio_vaddr;
>> +            int status;
>> +
>> +            status = get_page_flags(vaddr, pagemap_fd, kpageflags_fd,
>> +                                    &page_flags);
>> +            if (status < 0)
>> +                    return -1;
>> +
>> +            /* skip non present vaddr */
>> +            if (status == 1) {
>> +                    vaddr += psize();
>> +                    continue;
>> +            }
>> +
>> +            /* all order-0 pages with possible false postive (non folio) */
>
> Do we still false positive case? Non-present page returns 1, which is handled
> above.

Any order-0 non folio will be counted, like GFP_KERNEL pages.

>
>> +            if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>> +                    orders[0]++;
>> +                    vaddr += psize();
>> +                    continue;
>> +            }
>> +
>> +            /* skip non thp compound pages */
>> +            if (!(page_flags & KPF_THP)) {
>> +                    vaddr += psize();
>> +                    continue;
>> +            }
>> +
>> +            /* vpn points to part of a THP at this point */
>> +            if (page_flags & KPF_COMPOUND_HEAD)
>> +                    cur_order = 1;
>> +            else {
>> +                    /* not a head nor a tail in a THP? */
>> +                    if (!(page_flags & KPF_COMPOUND_TAIL))
>> +                            return -1;
>
> When reaches here, we know (page_flags & (KPF_COMPOUND_HEAD | 
> KPF_COMPOUND_TAIL)).
> So we have at least one of it set.
>
> Looks not possible to hit it?

Will remove it.
>
>> +
>> +                    vaddr += psize();
>> +                    continue;
>
> 1)
>
> In case vaddr points to the middle of a large folio, this will skip this folio
> and count from next one.
>
>> +            }
>> +
>> +            next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>> +
>> +            if (next_folio_vaddr >= vaddr_start + len)
>> +                    break;
>> +
>> +            while ((status = get_page_flags(next_folio_vaddr, pagemap_fd,
>> +                                             kpageflags_fd,
>> +                                             &page_flags)) >= 0) {
>> +                    /*
>> +                     * non present vaddr, next compound head page, or
>> +                     * order-0 page
>> +                     */
>> +                    if (status == 1 ||
>> +                        (page_flags & KPF_COMPOUND_HEAD) ||
>> +                        !(page_flags & (KPF_COMPOUND_HEAD | 
>> KPF_COMPOUND_TAIL))) {
>> +                            if (cur_order < nr_orders) {
>> +                                    orders[cur_order]++;
>> +                                    cur_order = -1;
>> +                                    vaddr = next_folio_vaddr;
>> +                            }
>> +                            break;
>> +                    }
>> +
>> +                    /* not a head nor a tail in a THP? */
>> +                    if (!(page_flags & KPF_COMPOUND_TAIL))
>> +                            return -1;
>> +
>> +                    cur_order++;
>> +                    next_folio_vaddr = vaddr + (1UL << (cur_order + 
>> pshift()));
>
> 2)
>
> If (vaddr_start + len) points to the middle of a large folio and folio is more
> than order 1 size, we may continue the loop and still count this last folio.
> Because we don't check next_folio_vaddr and (vaddr_start + len).
>
> A simple chart of these case.
>
>           vaddr_start                   +     len
>                |                               |
>                v                               v
>      +---------------------+              +-----------------+
>      |folio 1              |              |folio 2          |
>      +---------------------+              +-----------------+
>
> folio 1 is not counted, but folio 2 is counted.
>
> So at 1) and 2) handles the boundary differently. Not sure this is designed
> behavior. If so I think it would be better to record in document, otherwise
> the behavior is not obvious to user.

Will document it.

>
>> +            }
>> +
>> +            if (status < 0)
>> +                    return status;
>> +    }
>> +    if (cur_order > 0 && cur_order < nr_orders)
>> +            orders[cur_order]++;
>
> Another boundary case here.
>
> If we come here because (next_folio_vaddr >= vaddr_start + len) in the for
> loop instead of the while loop. This means we found the folio head at vaddr,
> but the left range (vaddr_start + len - vaddr) is less than or equal to order
> 1 page size.
>
> But we haven't detected the real end of this folio. If this folio is more than
> order 1 size, we still count it an order 1 folio.

Yes. Will document it.

Thanks for the review.

>
>> +    return 0;
>> +}
>> +
>> +int check_folio_orders(char *vaddr_start, size_t len, int pagemap_fd,
>> +                    int kpageflags_fd, int orders[], int nr_orders)
>> +{
>> +    int *vaddr_orders;
>> +    int status;
>> +    int i;
>> +
>> +    vaddr_orders = (int *)malloc(sizeof(int) * nr_orders);
>> +
>> +    if (!vaddr_orders)
>> +            ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders");
>> +
>> +    memset(vaddr_orders, 0, sizeof(int) * nr_orders);
>> +    status = gather_folio_orders(vaddr_start, len, pagemap_fd,
>> +                                 kpageflags_fd, vaddr_orders, nr_orders);
>> +    if (status)
>> +            goto out;
>> +
>> +    status = 0;
>> +    for (i = 0; i < nr_orders; i++)
>> +            if (vaddr_orders[i] != orders[i]) {
>> +                    ksft_print_msg("order %d: expected: %d got %d\n", i,
>> +                                   orders[i], vaddr_orders[i]);
>> +                    status = -1;
>> +            }
>> +
>> +out:
>> +    free(vaddr_orders);
>> +    return status;
>> +}
>
> -- 
> Wei Yang
> Help you, Help me


Best Regards,
Yan, Zi

Reply via email to