Hi Brian,

Thanks for reviewing. Comments inline below.
-matt

On Jul 10, 2015, at 2:49 PM, Brian King wrote:

> On 06/19/2015 05:37 PM, Matthew R. Ochs wrote:
>>
>> 	cfg->init_state = INIT_STATE_NONE;
>> 	cfg->dev = pdev;
>> +	cfg->last_lun_index[0] = 0;
>> +	cfg->last_lun_index[1] = 0;
>> 	cfg->dev_id = (struct pci_device_id *)dev_id;
>> 	cfg->mcctx = NULL;
>> 	cfg->err_recovery_active = 0;
>> +	cfg->num_user_contexts = 0;
>
> No need to set these to zero, since they get allocated via scsi_host_alloc,
> which zeroes the memory for you.

Agreed. We've actually already fixed this; you'll see it in v2.

>> +/**
>> + * marshall_det_to_rele() - translate detach to release structure
>> + * @detach:	Destination structure for the translate/copy.
>> + * @rele:	Source structure from which to translate/copy.
>> + */
>> +static void marshall_det_to_rele(struct dk_cxlflash_detach *detach,
>> +				 struct dk_cxlflash_release *release)
>> +{
>> +	release->hdr = detach->hdr;
>> +	release->context_id = detach->context_id;
>
> Function name could be better. Marshal should have only one 'l' in this
> usage...

Sure, will fix.

>> +	if (wwid)
>> +		list_for_each_entry_safe(lun_info, temp, &global.luns, list) {
>> +			if (!memcmp(lun_info->wwid, wwid,
>> +				    DK_CXLFLASH_MANAGE_LUN_WWID_LEN))
>> +				return lun_info;
>> +		}
>
> You probably want to be grabbing global.slock here, right?
> list_for_each_entry_safe doesn't protect you from other threads deleting
> things off the list, it only allows you to delete entries off the list in
> your loop and continue to iterate.

Yes, will correct.

>> +
>> +	/* Store off lun in unpacked, AFU-friendly format */
>> +	lun_info->lun_id = lun_to_lunid(sdev->lun);
>> +	lun_info->lun_index = cfg->last_lun_index[sdev->channel];
>> +
>> +	writeq_be(lun_info->lun_id,
>> +		  &afu->afu_map->global.fc_port[sdev->channel]
>> +		  [cfg->last_lun_index[sdev->channel]++]);
>
> Do you need a slave_destroy that undoes this and makes the AFU forget
> about this LUN? I'm thinking you probably want to also destroy any
> user contexts to this device in slave_destroy as well. Otherwise
> if you have a superpipe open to a scsi_device and someone deletes that
> scsi_device through sysfs, you will have an open superpipe to that
> device still and no way to tear it down since you don't have a
> way to issue the detach ioctl any longer.

In v2, you'll see that we have removed any need for the slave_* routines.

>> +	pid_t pid = current->tgid, ctxpid = 0;
>> +	ulong flags = 0;
>> +
>> +	if (unlikely(ctx_ctrl & CTX_CTRL_CLONE))
>
> I'm seeing what is a bit of an overuse of likely / unlikely. Basically,
> IMHO, the only place where you should be using these compiler hints is in
> the performance path and even then, only when the odds are pretty high
> that you will go down the likely path.

We've actually revised the likely/unlikely usage since this patch. You'll
see it toned down in v2.

>>
>> +	if (likely(lun_info)) {
>> +		list_for_each_entry(lun_access, &ctx_info->luns, list)
>> +			if (lun_access->lun_info == lun_info) {
>> +				found = true;
>> +				break;
>> +			}
>
> Do we need some locking here?

Yes, will investigate adding a per-context lock.

>> +static int read_cap16(struct afu *afu, struct lun_info *lun_info,
>> +		      u32 port_sel)
>> +{
>> +	struct afu_cmd *cmd = NULL;
>> +	int rc = 0;
>> +
>
> Suggest using scsi_execute instead. That way it goes up through the block
> layer via the normal execution path, should result in less code, and then
> you also get error recovery for free. You can then get rid of the nasty
> looping on cxlflash_check_status.

Done.
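For reference, v2 ends up looking roughly like the following. This is a
sketch rather than the final code: it assumes the routine takes the
struct scsi_device directly, that lun_info grows a max_lba field, and
that we add a CMD_RETRIES define; CMD_BUFSIZE and MC_DISCOVERY_TIMEOUT
are the existing driver defines.

static int read_cap16(struct scsi_device *sdev, struct lun_info *lun_info)
{
	u8 *cmd_buf = kzalloc(CMD_BUFSIZE, GFP_KERNEL);
	u8 *scsi_cmd = kzalloc(MAX_COMMAND_SIZE, GFP_KERNEL);
	int rc = 0;
	int result;

	if (unlikely(!cmd_buf || !scsi_cmd)) {
		rc = -ENOMEM;
		goto out;
	}

	scsi_cmd[0] = SERVICE_ACTION_IN_16;	/* read cap(16) */
	scsi_cmd[1] = SAI_READ_CAPACITY_16;	/* service action */
	put_unaligned_be32(CMD_BUFSIZE, &scsi_cmd[10]);

	/* Normal execution path through the block layer; EH comes for free */
	result = scsi_execute(sdev, scsi_cmd, DMA_FROM_DEVICE, cmd_buf,
			      CMD_BUFSIZE, NULL, MC_DISCOVERY_TIMEOUT * HZ,
			      CMD_RETRIES, 0, NULL);
	if (unlikely(result)) {
		pr_err("%s: read cap(16) failed, result=0x%x\n",
		       __func__, result);
		rc = -EIO;
		goto out;
	}

	/* READ CAPACITY 16 parameter data: last LBA, then block length */
	lun_info->max_lba = get_unaligned_be64(&cmd_buf[0]);
	lun_info->blk_len = get_unaligned_be32(&cmd_buf[8]);
out:
	kfree(cmd_buf);
	kfree(scsi_cmd);
	return rc;
}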
>> +	cmd = cxlflash_cmd_checkout(afu);
>> +	if (unlikely(!cmd)) {
>> +		pr_err("%s: could not get a free command\n", __func__);
>> +		return -1;
>> +	}
>> +
>> +	cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
>> +			      SISL_REQ_FLAGS_SUP_UNDERRUN |
>> +			      SISL_REQ_FLAGS_HOST_READ);
>> +
>> +	cmd->rcb.port_sel = port_sel;
>> +	cmd->rcb.lun_id = lun_info->lun_id;
>> +	cmd->rcb.data_len = CMD_BUFSIZE;
>> +	cmd->rcb.data_ea = (u64) cmd->buf;
>> +	cmd->rcb.timeout = MC_DISCOVERY_TIMEOUT;
>> +
>> +	cmd->rcb.cdb[0] = 0x9E;	/* read cap(16) */
>
> Use SERVICE_ACTION_IN_16 here

Okay.

>> +	cmd->rcb.cdb[1] = 0x10;	/* service action */
>
> Use SAI_READ_CAPACITY_16 here

You got it.

>> + * rhte_checkin() - releases a resource handle table entry
>> + * @ctx_info:	Context owning the resource handle.
>> + * @rht_entry:	RHTE to release.
>> + */
>> +void rhte_checkin(struct ctx_info *ctx_info,
>> +		  struct sisl_rht_entry *rht_entry)
>> +{
>> +	rht_entry->nmask = 0;
>> +	rht_entry->fp = 0;
>> +	ctx_info->rht_out--;
>> +	ctx_info->rht_lun[rht_entry - ctx_info->rht_start] = NULL;
>
> Do you need some locking in both the checkout and checkin path?

Likely, will investigate this.

>>
>> +	/*
>> +	 * Clear the Format 1 RHT entry for direct access
>> +	 * (physical LUN) using the synchronization sequence
>> +	 * defined in the SISLite specification.
>> +	 */
>> +	rht_entry_f1 = (struct sisl_rht_entry_f1 *)rht_entry;
>> +
>> +	rht_entry_f1->valid = 0;
>> +	smp_wmb(); /* Make revocation of RHT entry visible */
>
> I think what you actually want to use here is dma_wmb() rather than
> smp_wmb(), assuming you are trying to ensure these writes occur in order
> with respect to the AFU. If you were to build the kernel as a UP kernel,
> rather than an SMP kernel, all these smp_wmb calls would turn into mere
> compiler barriers, which is not what you want, I think... Suggest auditing
> the entire patch as there are likely other barriers you may want to change
> as well.

What we're looking for here is a lightweight sync. Looking at the PPC
barrier.h, I see that you are correct: we need to specify dma_wmb() to
ensure we get an LWSYNC rather than a mere compiler barrier() when
CONFIG_SMP is not set. Agree with the audit idea.

>> +	/* Free the context; note that rht_lun was allocated at same time */
>> +	kfree(ctx_info);
>> +	cfg->num_user_contexts--;
>
> What guarantees atomicity of this increment? I don't see any locking
> around the callers...

Per Mikey Neuling's comments, we have already made this atomic.
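The shape of that change, as a sketch (the call sites shown are
illustrative):

/* In struct cxlflash_cfg, replacing the plain int */
atomic_t num_user_contexts;

/* Context creation path, replacing the bare increment */
atomic_inc(&cfg->num_user_contexts);

/* Context teardown path, replacing the unlocked cfg->num_user_contexts-- */
atomic_dec(&cfg->num_user_contexts);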
>> +	/* Take our LUN out of context, free the node */
>> +	list_for_each_entry_safe(lun_access, t, &ctx_info->luns, list)
>> +		if (lun_access->lun_info == lun_info) {
>> +			list_del(&lun_access->list);
>> +			kfree(lun_access);
>> +			lun_access = NULL;
>> +			break;
>> +		}
>
> Do you need to do some locking around this? Can others be reading /
> writing to / from this list concurrently?

Reading/writing would be via any of the ioctl threads when determining
access to a context. We'll likely cover this via a per-context lock.

>> +
>> +	/* Tear down context following last LUN cleanup */
>> +	if (list_empty(&ctx_info->luns)) {
>> +		spin_lock_irqsave(&cfg->ctx_tbl_slock, flags);
>> +		cfg->ctx_tbl[ctxid] = NULL;
>> +		spin_unlock_irqrestore(&cfg->ctx_tbl_slock, flags);
>> +
>> +		while (atomic_read(&ctx_info->nrefs) > 1) {
>> +			pr_debug("%s: waiting on threads... (%d)\n",
>> +				 __func__, atomic_read(&ctx_info->nrefs));
>> +			cpu_relax();
>
> Hmm. Would an msleep with an overall timeout be more appropriate here? It
> might actually be cleaner just to use a kref in the ctx_info struct to
> manage your reference count and then use the release function to do that
> cleanup, then you could simplify all this and eliminate the wait.

In v2, you'll see that we've already moved to a sleep instead of a
busy-wait here. Will need to investigate what moving to a kref would look
like.
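To make that discussion concrete, a kref conversion might look something
like this (sketch only; it assumes a struct kref is embedded in ctx_info
in place of the nrefs atomic, and the names are illustrative):

#include <linux/kref.h>

static void ctx_info_release(struct kref *kref)
{
	struct ctx_info *ctx_info = container_of(kref, struct ctx_info, kref);

	/* rht_lun was allocated alongside the context, so one kfree suffices */
	kfree(ctx_info);
}

/* Each ioctl thread pins the context for the duration of its use... */
	kref_get(&ctx_info->kref);

/*
 * ...and unpins it when done. Detach drops the initial reference after
 * pulling the context out of ctx_tbl, so whichever thread does the final
 * put frees the context and the busy-wait on nrefs disappears.
 */
	kref_put(&ctx_info->kref, ctx_info_release);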
>> +int cxlflash_mark_contexts_error(struct cxlflash_cfg *cfg)
>> +{
>> +	int i, rc = 0;
>> +	ulong lock_flags;
>> +	struct ctx_info *ctx_info = NULL;
>> +
>> +	spin_lock_irqsave(&cfg->ctx_tbl_slock, lock_flags);
>> +
>> +	for (i = 0; i < MAX_CONTEXT; i++) {
>> +		ctx_info = cfg->ctx_tbl[i];
>> +
>> +		if (ctx_info) {
>> +			cfg->ctx_tbl[i] = NULL;
>> +			list_add(&ctx_info->list, &cfg->ctx_err_recovery);
>> +			ctx_info->err_recovery_active = true;
>> +			unmap_context(ctx_info);
>> +		}
>> +	}
>> +
>> +	spin_unlock_irqrestore(&cfg->ctx_tbl_slock, lock_flags);
>> +
>> +	return rc;
>> +}
>
> This function doesn't appear to be called anywhere. Not even in the
> follow on patch...

You're correct, this isn't called in the first patch, but it is used in v2.

>> +
>> +	/* On first attach set fileops */
>> +	if (cfg->num_user_contexts == 0)
>> +		cfg->cxl_fops = cxlflash_cxl_fops;
>
> I'm not seeing any synchronization or locking here. How are you managing
> a hundred different users hitting your ioctl path at the same time? You
> are iterating through lists below with no locks held, but it seems to me
> like that list could be changing on you. Am I missing something here?

This line in particular is now an atomic. We've also added a state check
to block ioctls when needed.

>> +	ctxid = cxl_process_element(ctx);
>> +	if ((ctxid > MAX_CONTEXT) || (ctxid < 0)) {
>
> Too many parentheses

Fixed.

>> +	reg = readq_be(&afu->ctrl_map->mbox_r);	/* Try MMIO */
>> +	/* MMIO returning 0xff, need to reset */
>> +	if (reg == -1) {
>> +		pr_info("%s: afu=%p reason 0x%llx\n",
>> +			__func__, afu, recover->reason);
>> +		cxlflash_afu_reset(cfg);
>> +	} else {
>> +		pr_debug("%s: reason 0x%llx MMIO working, no reset performed\n",
>> +			 __func__, recover->reason);
>> +		rc = -EINVAL;
>
> This seems a little strange to me. The user asked you to recover the AFU,
> the AFU looks fine to you, so you do nothing, yet you fail the request to
> recover the AFU. What about multiple users issuing this IOCTL at the same
> time? Seems like we might want to just return success rather than make
> the user second guess what they did wrong in building the ioctl.

Correct, we've reworked this routine for v2. You'll see behavior similar
to what you've described.

>> +	case MODE_VIRTUAL:
>> +		last_lba = (((rht_entry->lxt_cnt * MC_CHUNK_SIZE *
>> +			      lun_info->blk_len) / CXLFLASH_BLOCK_SIZE) - 1);
>> +		break;
>
> Should this go in the second patch?

Yes, this should have been in the second patch.

>> +struct dk_cxlflash_uvirtual {
>> +	struct dk_cxlflash_hdr hdr;	/* Common fields */
>> +	__u64 context_id;		/* Context to own virtual resources */
>> +	__u64 lun_size;			/* Requested size, in 4K blocks */
>> +	__u64 rsrc_handle;		/* Returned resource handle */
>> +	__u64 last_lba;			/* Returned last LBA of LUN */
>> +};
>
> This isn't used in this patch, so should really be in the second patch.

Yes, this technically should be in the second patch.

>> +
>> +struct dk_cxlflash_release {
>> +	struct dk_cxlflash_hdr hdr;	/* Common fields */
>> +	__u64 context_id;		/* Context owning resources */
>> +	__u64 rsrc_handle;		/* Resource handle to release */
>> +};
>> +
>> +struct dk_cxlflash_resize {
>> +	struct dk_cxlflash_hdr hdr;	/* Common fields */
>> +	__u64 context_id;		/* Context owning resources */
>> +	__u64 rsrc_handle;		/* Resource handle of LUN to resize */
>> +	__u64 req_size;			/* New requested size, in 4K blocks */
>> +	__u64 last_lba;			/* Returned last LBA of LUN */
>> +};
>
> Looks like the same is true here. Ideally all the virtual LUN
> definitions, function prototypes, etc, would be in the second patch.

Agreed.

>> +#define DK_CXLFLASH_ATTACH		CXL_IOW(0x80, dk_cxlflash_attach)
>
> It would probably be good to add a comment to include/uapi/misc/cxl.h to
> indicate that cxlflash is reserving this ioctl range so you don't
> conflict with other cxl users.

Sounds reasonable. Will work with the maintainers of that file to do this.

>> +#define DK_CXLFLASH_USER_DIRECT		CXL_IOW(0x81, dk_cxlflash_udirect)
>> +#define DK_CXLFLASH_USER_VIRTUAL	CXL_IOW(0x82, dk_cxlflash_uvirtual)
>> +#define DK_CXLFLASH_VLUN_RESIZE		CXL_IOW(0x83, dk_cxlflash_resize)
>> +#define DK_CXLFLASH_RELEASE		CXL_IOW(0x84, dk_cxlflash_release)
>> +#define DK_CXLFLASH_DETACH		CXL_IOW(0x85, dk_cxlflash_detach)
>> +#define DK_CXLFLASH_VERIFY		CXL_IOW(0x86, dk_cxlflash_verify)
>> +#define DK_CXLFLASH_CLONE		CXL_IOW(0x87, dk_cxlflash_clone)
>> +#define DK_CXLFLASH_RECOVER_AFU		CXL_IOW(0x88, dk_cxlflash_recover_afu)
>> +#define DK_CXLFLASH_MANAGE_LUN		CXL_IOW(0x89, dk_cxlflash_manage_lun)
>> +
>> +#endif /* ifndef _CXLFLASH_IOCTL_H */
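For anyone following along, here is roughly how these ioctls are driven
from userspace, using the release structure quoted above. This is a
hypothetical sketch: the header install path, device node, and ID values
are placeholders, and a real caller would first obtain them via
DK_CXLFLASH_ATTACH and DK_CXLFLASH_USER_DIRECT.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/cxlflash_ioctl.h>	/* assumed install path */

int main(void)
{
	struct dk_cxlflash_release release;
	int fd = open("/dev/sdc", O_RDWR);	/* placeholder device node */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* hdr fields (version, flags, etc.) left zeroed for brevity */
	memset(&release, 0, sizeof(release));
	release.context_id = 0;		/* from a prior attach */
	release.rsrc_handle = 0;	/* from a prior user_direct */

	if (ioctl(fd, DK_CXLFLASH_RELEASE, &release) == -1)
		perror("DK_CXLFLASH_RELEASE");

	close(fd);
	return 0;
}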