On Tue, Oct 25, 2022 at 11:56:17AM +0100, Jonathan Cameron wrote: > On Mon, 24 Oct 2022 13:42:54 -0400 > Gregory Price <gregory.pr...@memverge.com> wrote: > > > On Fri, Oct 14, 2022 at 04:10:44PM +0100, Jonathan Cameron wrote: > > > From: Huai-Cheng Kuo <hch...@avery-design.com.tw> > > > > > > The CDAT can be specified in two ways. One is to add ",cdat=<filename>" > > > in "-device cxl-type3"'s command option. The file is required to provide > > > the whole CDAT table in binary mode. The other is to use the default > > > that provides some 'reasonable' numbers based on type of memory and > > > size. > > > > > > The DOE capability supporting CDAT is added to hw/mem/cxl_type3.c with > > > capability offset 0x190. The config read/write to this capability range > > > can be generated in the OS to request the CDAT data. > > > > > > Signed-off-by: Huai-Cheng Kuo <hch...@avery-design.com.tw> > > > Signed-off-by: Chris Browy <cbr...@avery-design.com> > > > Signed-off-by: Jonathan Cameron <jonathan.came...@huawei.com> > > > > > > > In reviewing this patch under a debug kernel, I'm discovering some > > warnings / tracebacks that I want to get a sanity check on. They appear > > to be warnings, but i'm not 100% sure what to make of them. > > > > I get the following stack traces during boot (on the x86 machine). > > > > Removing the type-3 device from the configuration prevents the traceback > > from occurring, suggesting it's something to do with that code in > > particular (which tracks with the patch here) > > Thanks Gregory, > > We have an INIT_WORK() in pci_doe_submit_task() > that in the pci_doe_discovery() call is based a work structure that is > on the stack. As such should be INIT_WORK_ONSTACK() > > This is a little tricky becaues there is no particular reason to assume > all struct pci_doe_task instances will be on the stack though they are > today. We don't really want to break the layering as would be necessary > to have this init at the locations where we otherwise initialize the > containing structure. > > My first thought is add an 'onstack' bool to either the pci_doe_submit_task() > or perhaps better would be to add it to the pci_doe_task() and have > macros like DECLARE_CDAT_DOE_TASK_ONSTACK() set it appropriately so we > can call the right INIT_WORK_*() variant later. > > Ira / others, which way to go to fix this?
Yes Jonathan is on the right track here. Though I was confused why no one had ever seen this till now. I see it was because I never ran with the CONFIG_DEBUG_OBJECTS_WORK option. While we could have a declaration macro. I think the best solution is to separate the task from the internal implementation; see patch below. I was never fully happy with the idea of having the work struct in the user visible task object anyway. > > I'll also reply to the last version of that series to make sure people see > this discussion. Thanks I've been looking for time to look at this series and have missed it. So the ping over there helped! :-D Ira >From 9e16d5e399723412acbaec1bb9be807d5e5bf7fc Mon Sep 17 00:00:00 2001 From: Ira Weiny <ira.we...@intel.com> Date: Mon, 31 Oct 2022 11:31:30 -0700 Subject: [RFC PATCH] PCI/doe: Fix work struct declaration The callers of pci_doe_submit_task() allocate the pci_doe_task on the stack. This causes the work structure to be allocated on the stack without pci_doe_submit_task() knowing. Work item initialization needs to be done with either INIT_WORK_ONSTACK() or INIT_WORK() depending on how the work item is allocated. Jonathan suggested creating doe task allocation macros such as DECLARE_CDAT_DOE_TASK_ONSTACK(). This would work, however, having the work structure be part of pci_doe_task seems like a layering violation. Create an internal struct pci_doe_work which represents the work and use this for the work queue. RFC because I'm wondering if a gfp_t flags parameter should be added to pci_doe_submit_task() or not. [1] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-jonathan.came...@huawei.com/T/#m88a7f50dcce52f30c8bf5c3dcc06fa9843b54a2d Reported-by: Gregory Price <gregory.pr...@memverge.com> Reported-by: Jonathan Cameron <jonathan.came...@huawei.com> Signed-off-by: Ira Weiny <ira.we...@intel.com> --- drivers/pci/doe.c | 56 +++++++++++++++++++++++++++-------------- include/linux/pci-doe.h | 4 --- 2 files changed, 37 insertions(+), 23 deletions(-) diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c index e402f05068a5..6a9fa57e2cac 100644 --- a/drivers/pci/doe.c +++ b/drivers/pci/doe.c @@ -29,6 +29,12 @@ #define PCI_DOE_FLAG_CANCEL 0 #define PCI_DOE_FLAG_DEAD 1 +struct pci_doe_work { + struct pci_doe_task *task; + struct work_struct work; + struct pci_doe_mb *doe_mb; +}; + /** * struct pci_doe_mb - State for a single DOE mailbox * @@ -214,9 +220,9 @@ static void signal_task_complete(struct pci_doe_task *task, int rv) task->complete(task); } -static void signal_task_abort(struct pci_doe_task *task, int rv) +static void signal_task_abort(struct pci_doe_mb *doe_mb, struct pci_doe_task *task, + int rv) { - struct pci_doe_mb *doe_mb = task->doe_mb; struct pci_dev *pdev = doe_mb->pdev; if (pci_doe_abort(doe_mb)) { @@ -233,9 +239,10 @@ static void signal_task_abort(struct pci_doe_task *task, int rv) static void doe_statemachine_work(struct work_struct *work) { - struct pci_doe_task *task = container_of(work, struct pci_doe_task, - work); - struct pci_doe_mb *doe_mb = task->doe_mb; + struct pci_doe_work *doe_work = container_of(work, struct pci_doe_work, + work); + struct pci_doe_task *task = doe_work->task; + struct pci_doe_mb *doe_mb = doe_work->doe_mb; struct pci_dev *pdev = doe_mb->pdev; int offset = doe_mb->cap_offset; unsigned long timeout_jiffies; @@ -244,7 +251,7 @@ static void doe_statemachine_work(struct work_struct *work) if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags)) { signal_task_complete(task, -EIO); - return; + goto free_work; } /* Send request */ @@ -260,8 +267,8 @@ static void doe_statemachine_work(struct work_struct *work) if (rc == -EBUSY) dev_err_ratelimited(&pdev->dev, "[%x] busy detected; another entity is sending conflicting requests\n", offset); - signal_task_abort(task, rc); - return; + signal_task_abort(doe_mb, task, rc); + goto free_work; } timeout_jiffies = jiffies + PCI_DOE_TIMEOUT; @@ -269,30 +276,32 @@ static void doe_statemachine_work(struct work_struct *work) retry_resp: pci_read_config_dword(pdev, offset + PCI_DOE_STATUS, &val); if (FIELD_GET(PCI_DOE_STATUS_ERROR, val)) { - signal_task_abort(task, -EIO); - return; + signal_task_abort(doe_mb, task, -EIO); + goto free_work; } if (!FIELD_GET(PCI_DOE_STATUS_DATA_OBJECT_READY, val)) { if (time_after(jiffies, timeout_jiffies)) { - signal_task_abort(task, -EIO); - return; + signal_task_abort(doe_mb, task, -EIO); + goto free_work; } rc = pci_doe_wait(doe_mb, PCI_DOE_POLL_INTERVAL); if (rc) { - signal_task_abort(task, rc); - return; + signal_task_abort(doe_mb, task, rc); + goto free_work; } goto retry_resp; } rc = pci_doe_recv_resp(doe_mb, task); if (rc < 0) { - signal_task_abort(task, rc); - return; + signal_task_abort(doe_mb, task, rc); + goto free_work; } signal_task_complete(task, rc); +free_work: + kfree(doe_work); } static void pci_doe_task_complete(struct pci_doe_task *task) @@ -510,10 +519,14 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot); * * Excess data will be discarded. * + * Context: non-atomic; allocates with GFP_KERNEL + * * RETURNS: 0 when task has been successfully queued, -ERRNO on error */ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task) { + struct pci_doe_work *work; + if (!pci_doe_supports_prot(doe_mb, task->prot.vid, task->prot.type)) return -EINVAL; @@ -528,9 +541,14 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task) if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags)) return -EIO; - task->doe_mb = doe_mb; - INIT_WORK(&task->work, doe_statemachine_work); - queue_work(doe_mb->work_queue, &task->work); + work = kzalloc(sizeof(*work), GFP_KERNEL); + if (!work) + return -ENOMEM; + + work->task = task; + work->doe_mb = doe_mb; + INIT_WORK(&work->work, doe_statemachine_work); + queue_work(doe_mb->work_queue, &work->work); return 0; } EXPORT_SYMBOL_GPL(pci_doe_submit_task); diff --git a/include/linux/pci-doe.h b/include/linux/pci-doe.h index ed9b4df792b8..7c573cdbf17b 100644 --- a/include/linux/pci-doe.h +++ b/include/linux/pci-doe.h @@ -52,10 +52,6 @@ struct pci_doe_task { int rv; void (*complete)(struct pci_doe_task *task); void *private; - - /* No need for the user to initialize these fields */ - struct work_struct work; - struct pci_doe_mb *doe_mb; }; /** -- 2.37.2