New working version of fw_cfg sysfs module enclosed at the end of this mail, featuring:
- probing for the appropriate fw_cfg port/address for the architecture we're on. It's either that, or preprocessor #ifdef voodoo to try only the right access method matching the architecture we're compiling for. - the module compiles and runs fine now on both x86_64 and armv7hl+lpae, not tested on ppc or sun. - Reworked initialization to prevent leaks when bailing out due to errors (thanks Matt for pointing out one of those !) On Wed, Jul 15, 2015 at 01:00:45PM +0100, Matt Fleming wrote: > On Mon, 2015-07-13 at 16:09 -0400, Gabriel L. Somlo wrote: > > Speaking of the kernel: My default plan is to subscribe to > > kernelnewb...@kernelnewbies.org and submit it there (this is after > > all baby's first kernel module :) but if there's a better route for > > pushing it upstream, please feel free to suggest it to me. > > Congrats on your first kernel patch! Cc'ing kernelnewbies is OK but this > patches warrants a wider audience. Cc Greg Kroah-Hartman > (gre...@linuxfoundation.org) since he polices things related to sysfs, > the x86 maintainers (x...@kernel.org), and linux-ker...@vger.kernel.org. > The closest thing we have to a Linux system firmware list is probably > linux-...@vger.kernel.org. > > By all means, mention that it's your first kernel patch when submitting > this. > > Oh, you're also going to want to add documentation under > Documentation/ABI/testing/sysfs-firmware-fw_cfg as part of your patch > series. Also thanks for the good advice. I'm out on PTO next week, so planning to submit this to the kernel after I get back -- wouldn't want to not be around when I start getting feedback from the kernel list... :) On Wed, Jul 15, 2015 at 01:15:08PM +0200, Laszlo Ersek wrote: > On 07/15/15 13:06, Matt Fleming wrote: > > On Tue, 2015-07-14 at 13:00 -0400, Gabriel L. Somlo wrote: > >> > >> That being said, I did reimplement systemd's escape method in cca. 30 > >> lines of C, so that shouldn't be too onerous. > > > > I really don't see a reason to use systemd's naming scheme instead of > > the one already provided by the kernel. > > > >> Besides, Laszlo said he liked a real folder hierarchy, and I do too, > >> so I'm still pondering how much doing that would complicate the module > >> init function. I'm somewhat worried about what the added complexity > >> will mean in terms of upstream acceptance in the linux kernel :) > > > > Yeah, going that route will complicate the patch and you're going to get > > asked "Umm.. why?" by Greg Kroah-Hartman (definitely Cc Greg when > > sending this to the kernel mailing lists btw). > > > > But if you can provide sound technical arguments for the added > > complexity I'm sure the kernel folks will be happy. Laszlo's argument > > makes sense to me, i.e. wanting to use standard utilities to know > > whether a blob is available. > > > > Instead of rolling all this kobject-creation logic into your driver I'd > > suggest writing a patch to lib/kobject.c., since the problem sounds like > > something that should be solved with the kobject API. > > Haha, this feature just got extended by six months. :) > > > The question is, how simple can you make the code ;-) > > The first question is how strong Gabriel's nerves are, but (from past > experience) they *are* strong. :) Sir, you are too kind... :) Speaking of which, I think I hit the first snag, and it's of the design/bikeshedding kind: The code to build nested ksets (represending sub-sub-directories of /sys/firmware/fw_cfg/...) and cleaning them up on exit doesn't promise to be *too* horrible or bulky, but as I was getting ready to start writing it, I realized that, in theory, nothing is to stop the fw_cfg device from having files named e.g. "etc/foo" and "etc/foo/bar" That doesn't happen right now on the qemu side, but it could in theory, and I have no idea how I'd deal with the file/directory duality of "foo" in this situation, short of refusing to load the module, or leaving out one fw_cfg file or the other. Unless there's a way around this, I think it's safer to either stick with the default 's/\//!/g' scheme of the kobject library, or implement something like systemd's string escaping scheme that's been suggested earlier in this thread... Or maybe leaving out the occasional poorly-named file is still better than giving up the ability to run find on the fw_cfg sysfs folder ? Anyway, here goes version 0.2... Thanks much, --Gabriel /* * drivers/firmware/fw_cfg.c * * Expose entries from QEMU's firmware configuration (fw_cfg) device in * sysfs (read-only, under "/sys/firmware/fw_cfg/..."). * * Copyright 2015 Gabriel L. Somlo <so...@cmu.edu> * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License v2 as published * by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, but * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License * for more details. * * You should have received a copy of the GNU General Public License along * with this program; if not, write to the Free Software Foundation, Inc., * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA */ #include <linux/module.h> #include <linux/capability.h> #include <linux/slab.h> #include <linux/io.h> #include <linux/ioport.h> #include <linux/ctype.h> /* selector values for "well-known" fw_cfg entries */ #define FW_CFG_SIGNATURE 0x00 #define FW_CFG_FILE_DIR 0x19 /* size in bytes of fw_cfg signature */ #define FW_CFG_SIG_SIZE 4 /* fw_cfg "file name" is up to 56 characters (including terminating nul) */ #define FW_CFG_MAX_FILE_PATH 56 /* fw_cfg file directory entry type */ struct fw_cfg_file { uint32_t size; uint16_t select; uint16_t reserved; char name[FW_CFG_MAX_FILE_PATH]; }; /* fw_cfg device i/o access options type */ struct fw_cfg_access { phys_addr_t start; uint8_t size; uint8_t ctrl_offset; uint8_t data_offset; bool is_mmio; const char *name; }; /* fw_cfg device i/o access available options for known architectures */ static struct fw_cfg_access fw_cfg_modes[] = { { 0x510, 2, 0, 1, false, "fw_cfg on i386, sun4u" }, { 0x9020000, 10, 8, 0, true, "fw_cfg on arm" }, { 0xd00000510, 3, 0, 2, true, "fw_cfg on sun4m" }, { 0xf0000510, 3, 0, 2, true, "fw_cfg on ppc/mac" }, { } }; /* fw_cfg device i/o currently selected option set */ static struct fw_cfg_access *fw_cfg_mode; /* fw_cfg device i/o register addresses */ static void __iomem *fw_cfg_dev_base; static void __iomem *fw_cfg_dev_ctrl; static void __iomem *fw_cfg_dev_data; /* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */ static DEFINE_MUTEX(fw_cfg_dev_lock); /* type for fw_cfg "directory scan" visitor/callback function */ typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f); /* run a given callback on each fw_cfg directory entry */ static int fw_cfg_scan_dir(fw_cfg_file_callback callback) { int ret = 0; uint32_t count, i; struct fw_cfg_file f; mutex_lock(&fw_cfg_dev_lock); iowrite16(FW_CFG_FILE_DIR, fw_cfg_dev_ctrl); ioread8_rep(fw_cfg_dev_data, &count, sizeof(count)); for (i = 0; i < be32_to_cpu(count); i++) { ioread8_rep(fw_cfg_dev_data, &f, sizeof(f)); ret = callback(&f); if (ret) break; } mutex_unlock(&fw_cfg_dev_lock); return ret; } /* read chunk of given fw_cfg blob (caller responsible for sanity-check) */ static inline void fw_cfg_read_blob(uint16_t select, void *buf, loff_t pos, size_t count) { mutex_lock(&fw_cfg_dev_lock); iowrite16(select, fw_cfg_dev_ctrl); while (pos-- > 0) ioread8(fw_cfg_dev_data); ioread8_rep(fw_cfg_dev_data, buf, count); mutex_unlock(&fw_cfg_dev_lock); } /* clean up fw_cfg device i/o setup */ static void fw_cfg_io_cleanup(void) { if (fw_cfg_mode->is_mmio) { iounmap(fw_cfg_dev_base); release_mem_region(fw_cfg_mode->start, fw_cfg_mode->size); } else { ioport_unmap(fw_cfg_dev_base); release_region(fw_cfg_mode->start, fw_cfg_mode->size); } } /* probe and map fw_cfg device */ static int __init fw_cfg_io_probe(void) { char sig[FW_CFG_SIG_SIZE]; for (fw_cfg_mode = &fw_cfg_modes[0]; fw_cfg_mode->start; fw_cfg_mode++) { phys_addr_t start = fw_cfg_mode->start; uint8_t size = fw_cfg_mode->size; /* reserve and map mmio or ioport region */ if (fw_cfg_mode->is_mmio) { if (!request_mem_region(start, size, fw_cfg_mode->name)) continue; fw_cfg_dev_base = ioremap(start, size); if (!fw_cfg_dev_base) { release_mem_region(start, size); continue; } } else { if (!request_region(start, size, fw_cfg_mode->name)) continue; fw_cfg_dev_base = ioport_map(start, size); if (!fw_cfg_dev_base) { release_region(start, size); continue; } } /* set control and data register addresses */ fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset; fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset; /* verify fw_cfg device signature */ fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE); if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0) /* success, we're done */ return 0; /* clean up before probing next access mode */ fw_cfg_io_cleanup(); } return -ENODEV; } /* fw_cfg_sysfs_entry type */ struct fw_cfg_sysfs_entry { struct kobject kobj; struct fw_cfg_file f; struct list_head list; }; /* get fw_cfg_sysfs_entry from kobject member */ static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj) { return container_of(kobj, struct fw_cfg_sysfs_entry, kobj); } /* fw_cfg_sysfs_attribute type */ struct fw_cfg_sysfs_attribute { struct attribute attr; ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf); }; /* get fw_cfg_sysfs_attribute from attribute member */ static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr) { return container_of(attr, struct fw_cfg_sysfs_attribute, attr); } /* global cache of fw_cfg_sysfs_entry objects */ static LIST_HEAD(fw_cfg_entry_cache); /* kobjects removed lazily by kernel, mutual exclusion needed */ static DEFINE_SPINLOCK(fw_cfg_cache_lock); static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry) { spin_lock(&fw_cfg_cache_lock); list_add_tail(&entry->list, &fw_cfg_entry_cache); spin_unlock(&fw_cfg_cache_lock); } static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry) { spin_lock(&fw_cfg_cache_lock); list_del(&entry->list); spin_unlock(&fw_cfg_cache_lock); } static void fw_cfg_sysfs_cache_cleanup(void) { struct fw_cfg_sysfs_entry *entry, *next; list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) { /* will end up invoking fw_cfg_sysfs_cache_delist() * via each object's release() method (i.e. destructor) */ kobject_put(&entry->kobj); } } /* default_attrs: per-entry attributes and show methods */ #define FW_CFG_SYSFS_ATTR(_attr) \ struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \ .attr = { .name = __stringify(_attr), .mode = 0400 }, \ .show = fw_cfg_sysfs_show_##_attr, \ } static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf) { return sprintf(buf, "%d\n", e->f.size); } static ssize_t fw_cfg_sysfs_show_select(struct fw_cfg_sysfs_entry *e, char *buf) { return sprintf(buf, "%d\n", e->f.select); } static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf) { return sprintf(buf, "%s\n", e->f.name); } static FW_CFG_SYSFS_ATTR(size); static FW_CFG_SYSFS_ATTR(select); static FW_CFG_SYSFS_ATTR(name); static struct attribute *fw_cfg_sysfs_entry_attrs[] = { &fw_cfg_sysfs_attr_size.attr, &fw_cfg_sysfs_attr_select.attr, &fw_cfg_sysfs_attr_name.attr, NULL, }; /* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */ static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a, char *buf) { struct fw_cfg_sysfs_entry *entry = to_entry(kobj); struct fw_cfg_sysfs_attribute *attr = to_attr(a); if (!capable(CAP_SYS_ADMIN)) return -EACCES; return attr->show(entry, buf); } static const struct sysfs_ops fw_cfg_sysfs_attr_ops = { .show = fw_cfg_sysfs_attr_show, }; /* release: destructor, to be called via kobject_put() */ static void fw_cfg_sysfs_release_entry(struct kobject *kobj) { struct fw_cfg_sysfs_entry *entry = to_entry(kobj); fw_cfg_sysfs_cache_delist(entry); kfree(entry); } /* kobj_type: ties together all properties required to register an entry */ static struct kobj_type fw_cfg_sysfs_entry_ktype = { .default_attrs = fw_cfg_sysfs_entry_attrs, .sysfs_ops = &fw_cfg_sysfs_attr_ops, .release = fw_cfg_sysfs_release_entry, }; /* raw-read method and attribute */ static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj, struct bin_attribute *bin_attr, char *buf, loff_t pos, size_t count) { struct fw_cfg_sysfs_entry *entry = to_entry(kobj); if (!capable(CAP_SYS_ADMIN)) return -EACCES; if (pos > entry->f.size) return -EINVAL; if (count > entry->f.size - pos) count = entry->f.size - pos; fw_cfg_read_blob(entry->f.select, buf, pos, count); return count; } static struct bin_attribute fw_cfg_sysfs_attr_raw = { .attr = { .name = "raw", .mode = 0400 }, .read = fw_cfg_sysfs_read_raw, }; /* object set to represent fw_cfg sysfs subfolder */ static struct kset *fw_cfg_kset; static int __init fw_cfg_register_file(const struct fw_cfg_file *f) { int err; struct fw_cfg_sysfs_entry *entry; /* allocate new entry */ entry = kzalloc(sizeof(*entry), GFP_KERNEL); if (!entry) return -ENOMEM; /* set file entry information */ entry->f.size = be32_to_cpu(f->size); entry->f.select = be16_to_cpu(f->select); strcpy(entry->f.name, f->name); //FIXME: we could walk the '/' separated tokens of the file // name, creating kset subdirectories of fw_cfg_kset, // then use the last one as the parent for the kobj // being created below: /* register sysfs entry for this file */ entry->kobj.kset = fw_cfg_kset; /* NOTE: any '/' char in filename automatically turned into '!' */ err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype, NULL, "%s", entry->f.name); if (err) goto err_register; /* add raw binary content access */ err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw); if (err) goto err_add_raw; /* success, add entry to global cache */ fw_cfg_sysfs_cache_enlist(entry); return 0; err_add_raw: kobject_del(&entry->kobj); err_register: kfree(entry); return err; } static int __init fw_cfg_sysfs_init(void) { int err; err = fw_cfg_io_probe(); if (err) return err; /* create fw_cfg folder in sysfs */ fw_cfg_kset = kset_create_and_add("fw_cfg", NULL, firmware_kobj); if (!fw_cfg_kset) { err = -ENOMEM; goto err_sysfs; } /* process fw_cfg file directory entry, registering each file */ err = fw_cfg_scan_dir(fw_cfg_register_file); if (err) goto err_scan; /* success */ pr_debug("fw_cfg: loaded.\n"); return 0; err_scan: fw_cfg_sysfs_cache_cleanup(); //FIXME: we could recursively (depth-first) walk the kset // hierarchy, unregistering from the leaf end toward // the root, once the actual leaves (cache entries) // are gone via fw_cfg_sysfs_cache_cleanup() above. kset_unregister(fw_cfg_kset); err_sysfs: fw_cfg_io_cleanup(); return err; } static void __exit fw_cfg_sysfs_exit(void) { pr_debug("fw_cfg: unloading.\n"); fw_cfg_sysfs_cache_cleanup(); kset_unregister(fw_cfg_kset); //FIXME: depth-first recursion again fw_cfg_io_cleanup(); } module_init(fw_cfg_sysfs_init); module_exit(fw_cfg_sysfs_exit); MODULE_AUTHOR("Gabriel L. Somlo <so...@cmu.edu>"); MODULE_DESCRIPTION("QEMU fw_cfg sysfs support"); MODULE_LICENSE("GPL");