hello Dawei On 7/1/25 16:16, Dawei Li wrote: > Hi Arnaud, > > Thanks for the reply. > > On Mon, Jun 30, 2025 at 09:54:40AM +0200, Arnaud POULIQUEN wrote: >> Hello Dawei, >> >> Sorry for the late answer. >> >> On 6/22/25 06:12, Dawei Li wrote: >>> Hi Arnaud, >>> >>> Thanks for the reply. >>> >>> On Fri, Jun 20, 2025 at 09:52:03AM +0200, Arnaud POULIQUEN wrote: >>>> >>>> >>>> On 6/19/25 16:43, Dawei Li wrote: >>>>> Hi Arnaud, >>>>> Thanks for review. >>>>> >>>>> On Wed, Jun 18, 2025 at 03:07:36PM +0200, Arnaud POULIQUEN wrote: >>>>>> Hello Dawei, >>>>>> >>>>>> >>>>>> Please find a few comments below. It is not clear to me which parts of >>>>>> your >>>>>> implementation are mandatory and which are optional "nice-to-have" >>>>>> optimizations. >>>>> >>>>> It's more like an improvement. >>>>> >>>>>> >>>>>> Based on (potentially erroneous) hypothesis, you will find a suggestion >>>>>> for an >>>>>> alternative to the anonymous inode approach, which does not seem to be a >>>>>> common >>>>>> interface. >>>>> >>>>> AFAIC, annoymous inode is a common interface and used extensivly in >>>>> kernel development. >>>>> Some examples below. >>>>> >>>>>> >>>>>> >>>>>> On 6/9/25 17:15, Dawei Li wrote: >>>>>>> Hi, >>>>>>> >>>>>>> This is V4 of series which introduce new uAPI(RPMSG_CREATE_EPT_FD_IOCTL) >>>>>>> for rpmsg subsystem. >>>>>>> >>>>>>> Current uAPI implementation for rpmsg ctrl & char device manipulation is >>>>>>> abstracted in procedures below: >>>>>>> - fd = open("/dev/rpmsg_ctrlX") >>>>>>> - ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &info); /dev/rpmsgY devnode is >>>>>>> generated. >>>>>>> - fd_ep = open("/dev/rpmsgY", O_RDWR) >>>>>>> - operations on fd_ep(write, read, poll ioctl) >>>>>>> - ioctl(fd_ep, RPMSG_DESTROY_EPT_IOCTL) >>>>>>> - close(fd_ep) >>>>>>> - close(fd) >>>>>>> >>>>>>> This /dev/rpmsgY abstraction is less favorable for: >>>>>>> - Performance issue: It's time consuming for some operations are >>>>>>> invovled: >>>>>>> - Device node creation. >>>>>>> Depends on specific config, especially CONFIG_DEVTMPFS, the overall >>>>>>> overhead is based on coordination between DEVTMPFS and userspace >>>>>>> tools such as udev and mdev. >>>>>>> >>>>>>> - Extra kernel-space switch cost. >>>>>>> >>>>>>> - Other major costs brought by heavy-weight logic like device_add(). >>>>>> >>>>>> Is this a blocker of just optimization? >>>>> >>>>> Yep, performance is one of motivations of this change. >>>>> >>>>>> >>>>>>> >>>>>>> - /dev/rpmsgY node can be opened only once. It doesn't make much sense >>>>>>> that a dynamically created device node can be opened only once. >>>>>> >>>>>> >>>>>> I assume this is blocker with the fact that you need to open the >>>>>> /dev/rpmsg<x> >>>>>> to create the endpoint. >>>>> >>>>> Yes. You have to open /dev/rpmsgX which is generated by legacy ioctl to >>>>> instantiate a new endpoint. >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> - For some container application such as docker, a client can't access >>>>>>> host's dev unless specified explicitly. But in case of /dev/rpmsgY, >>>>>>> which >>>>>>> is generated dynamically and whose existence is unknown for clients in >>>>>>> advance, this uAPI based on device node doesn't fit well. >>>>>> >>>>>> does this could be solve in userspace parsing /sys/class/rpmsg/ >>>>>> directory to >>>>>> retreive the device? >>>>> >>>>> Hardly, because client still can't access /dev/rpmsgX which is generated >>>>> by host _after_ client is launched. >>>> >>>> >>>> This part is not clear to me; could you provide more details? >>>> I cannot figure out why a client can access /dev/rpmsg_ctrlX but not >>>> /dev/rpmsgX. >>> >>> Well, let's take docker as example: >> >>> >>> For docker, when a client is launched and it wants to access host's >>> device, it must make explicit request when it's launched: >>> >>> docker run --device=/dev/xxx >>> >>> Let's presume that xxx is /dev/rpmsgX generated dynamically by _host_. >>> Docker command above knows nothing about these rpmsg nodes which are >>> generated by host _after_ client is launched. And yes, parsing> >>> /sys/class/rpmsg may acquire info about rpmsg devices, but client still >>> can't access /dev/rpmsgX. >> >> One extra question:Are you using RPMsg over virtio? >> >> If yes, do you test the RPMsg name service (NS) announcement, that might also >> address your needs. >> >> The principle is that the remote processor sends a name service announcement >> to >> Linux, which probes the rpmsg character device and creates the /dev/rpmsgX >> device in a predefined order known by the remote processor. >> In such a case, the /dev/rpmsgX usage would be determined by the remote >> processor itself. >> >> Another advantage is that the RPMsg channel creation is not driven by either >> the >> host or the client. In such case host does no need to define/kwnow RPMSg >> endpoint addresses. >> >> You still need to call the open() file system operation, but this should be >> done >> one time during Docker client initialization. > > NS is nice, but perhaps it's not the approach for some cases. > > For offloading/accelerator scenarios, ACPU is responsible for making all > the important decisions, including creations of endpoints. Because all > the user-awared software stack is running on ACPU, and if you want to > create a endpoint _dynamically_, it must be from user's command which is > from ACPU. > > And this series is more about how rpmsg_char and rpmsg_ctrl coordinate > themselves about creating dynamic rpmsg endpoints in a more simple > and efficient way. > > And the whole point of series is "When you want to return a fd to > userspace which represents an instance of data structure in kernel, you > don't implement it as character device". Maybe some quotes from Christian[1] > can describe it better[1]: > > "I'm not sure why people are so in love with character device based apis. > It's terrible. It glues everything to devtmpfs which isn't namespacable > in any way. It's terrible to delegate and extremely restrictive in terms > of extensiblity if you need additional device entries (aka the loop > driver folly)." > > [1] https://lkml.org/lkml/2025/6/24/639
Thank you for all your explanations! It is always very interesting to understand the different ways to use RPMsg. In the end, I don't see any problem with your series. The possibility of using an anonymous inode seems valid to me. I am just wondering whether this should be implemented in the rpmsg_char driver or in a new driver, addressing this question to Mathieu and Bjorn. Regards, Arnaud > > Thanks, > > Dawei > >> >> Regards >> Arnaud >> >> >>> >>>> >>>> >>>>> >>>>>> >>>>>> You could face same kind of random instantiation for serial peripherals >>>>>> ( UART; >>>>>> USb, I2C,...) based on a device tree enumeration. I suppose that user >>>>>> space >>>>>> use to solve this. >>>>>> >>>>>>> >>>>>>> An anonymous inode based approach is introduced to address the issues >>>>>>> above. >>>>>>> Rather than generating device node and opening it, rpmsg code just >>>>>>> creates >>>>>>> an anonymous inode representing eptdev and return the fd to userspace. >>>>>> >>>>>> A drawback is that you need to share fb passed between processes. >>>>> >>>>> Fd is the abstraction of an unique endpoint device, it holds true for >>>>> both legacy and new approach. >>>>> >>>>> So I guess what you mean is that /dev/rpmsgX is global to all so other >>>>> process >>>>> can access it? >>>>> >>>>> But /dev/rpmsgX is designed to be opened only once, it's implemented as >>>>> singleton pattern. >>>>> >>>>> static int rpmsg_eptdev_open(struct inode *inode, struct file *filp) >>>>> { >>>>> ... >>>>> if (eptdev->ept) { >>>>> mutex_unlock(&eptdev->ept_lock); >>>>> return -EBUSY; >>>>> } >>>>> ... >>>>> eptdev->ept = ept; >>>>> ... >>>>> } >>>>> >>>>> [...] >>>>> >>>>>>> printf("loop[%d]\n", loop); >>>>>>> >>>>>>> gettimeofday(&start, NULL); >>>>>>> >>>>>>> while (loop--) { >>>>>> >>>>>> Do you need to create /close Endpoint sevral times in your real use case >>>>>> with >>>>>> high timing >>>>>> constraint? >>>>> >>>>> No, it's just a silly benchmark demo, large sample reduces noise >>>>> statistically. >>>>> >>>>>> >>>>>>> fd_info.fd = -1; >>>>>>> fd_info.flags = O_RDWR | O_CLOEXEC | O_NONBLOCK; >>>>>>> ret = ioctl(fd, RPMSG_CREATE_EPT_FD_IOCTL, &fd_info); >>>>>>> if (ret < 0 || fd_info.fd < 0) { >>>>>>> printf("ioctl[RPMSG_CREATE_EPT_FD_IOCTL] >>>>>>> failed, ret[%d]\n", ret); >>>>>>> } >>>>>>> >>>>>> >>>>>> >>>>>>> ret = ioctl(fd_info.fd, RPMSG_DESTROY_EPT_IOCTL, &info); >>>>>>> if (ret < 0) { >>>>>>> printf("new ioctl[RPMSG_DESTROY_EPT_IOCTL] >>>>>>> failed, ret[%d]\n", ret); >>>>>>> } >>>>>>> >>>>>>> close(fd_info.fd); >>>>>> >>>>>> It seems strange to me to use ioctl() for opening and close() for >>>>>> closing, from >>>>>> a symmetry point of view. >>>>> >>>>> Sorry to hear that. But no, it's a pretty normal skill in kernel codebase >>>>> , I had to copy some examples from reply to other reviewer[1]. >>>> >>>> I missed this one, apologize for the duplication. >>>> >>>>> >>>>> anon_inode_get_{fd,file} are used extensively in kernel for returning a >>>>> new >>>>> fd to userspace which is associated with an unique data structure in >>>>> kernel >>>>> space, in different ways: >>>>> >>>>> - via ioctl(), some examples are: >>>>> >>>>> - KVM ioctl(s) >>>>> - KVM_CREATE_VCPU -> kvm_vm_ioctl_create_vcpu >>>>> - KVM_GET_STATS_FD -> kvm_vcpu_ioctl_get_stats_fd >>>>> - KVM_CREATE_DEVICE -> kvm_ioctl_create_device >>>>> - KVM_CREATE_VM -> kvm_dev_ioctl_create_vm >>>>> >>>>> - DMA buf/fence/sync ioctls >>>>> - DMA_BUF_IOCTL_EXPORT_SYNC_FILE -> dma_buf_export_sync_file >>>>> - SW_SYNC_IOC_CREATE_FENCE -> sw_sync_ioctl_create_fence >>>>> - Couples of driver implement DMA buf by using anon file _implicitly_: >>>>> - UDMABUF_CREATE -> udmabuf_ioctl_create >>>>> - DMA_HEAP_IOCTL_ALLOC -> dma_heap_ioctl_allocate >>>>> >>>>> - gpiolib ioctls: >>>>> - GPIO_GET_LINEHANDLE_IOCTL -> linehandle_create >>>>> - GPIO_V2_GET_LINE_IOCTL >>>>> >>>>> - IOMMUFD ioctls: >>>>> >>>>> - VFIO Ioctls: >>>>> >>>>> - .... >>>>> >>>>> >>>>> - via other specific syscalls: >>>>> - epoll_create1 >>>>> - bpf >>>>> - perf_event_open >>>>> - inotify_init >>>>> - ... >>>> >>>> If we put the optimization aspect aside, what seems strange to me is that >>>> the >>>> purpose of rpmsg_char was to expose a FS character device to user space. >>>> If we >>>> need tobypass the use of /dev/rpmsgX, does it make sense to support an >>>> anonymous >>>> inode in this driver? I am clearly not legitimate to answer this >>>> question... >>> >>> You have every right to do so, after all, it's purely a technical >>> discussion :). >>> >>> I admit it's bit confusing to add annoymous inode logic to a file named >>> rpmsg_char.c which implies 'character' device. That's why I rename API >>> following Mathieu's comment: >>> - __rpmsg_chrdev_eptdev_alloc -> rpmsg_eptdev_alloc >>> - __rpmsg_chrdev_eptdev_add -> rpmsg_eptdev_add >>> >>> As to topic how these two uAPI(s) co-exist and affect each other. This >>> change is based on rules: >>> >>> 1. Never break existing uAPI. >>> 2. Try best to reuse existing codebase. >>> 3. Userspace can choose whatever approach they want to. >>> >>> Thanks, >>> >>> Dawei >>>> >>>> >>>> Thanks, >>>> Arnaud >>>> >>>>> >>>>> [1] https://lore.kernel.org/all/20250530125008.GA5355@wendao-VirtualBox/ >>>>> >>>>>> >>>>>> Regarding your implementation, I wonder if we could keep the >>>>>> /dev/rpmsg<x> >>>>>> device with specific open() and close() file operations associated with >>>>>> your new >>>>>> ioctl. >>>>>> >>>>>> - The ioctl would create the endpoint. >>>>>> - The open() and close() operations would simply manage the file >>>>>> descriptor and >>>>>> increment/decrement a counter to prevent premature endpoint destruction. >>>>>> >>>>>> >>>>>> Regards, >>>>>> Arnaud >>>>>> >>>>> >>>>> [...] >>>>> >>>>> Thanks, >>>>> >>>>> Dawei