Author: smh Date: Wed Feb 27 02:21:10 2013 New Revision: 247369 URL: http://svnweb.freebsd.org/changeset/base/247369
Log: Fixes queuing issues where mfi_release_command blindly sets the cm_flags = 0 without first removing the command from the relavent queue. This was causing panics in the queue functions which check to ensure a command is not on another queue. Fixed some cases where the error from mfi_mapcmd was lost and where the command was never released / dequeued in error cases. Ensure that all failures to mfi_mapcmd are logged. Fixed possible null pointer exception in mfi_aen_setup if mfi_get_log_state failed. Fixed mfi_parse_entries & mfi_aen_setup not returning possible errors. Corrected MFI_DUMP_CMDS calls with invalid vars SC vs sc. Commands which have timed out now set cm_error to ETIMEDOUT and call mfi_complete which prevents them getting stuck in the busy queue forever. Fixed possible use of NULL pointer in mfi_tbolt_get_cmd. Changed output formats to be more easily recognisable when debugging. Optimised mfi_cmd_pool_tbolt cleanup. Made information about driver limiting commands always display as for modern cards this can be severe. Fixed mfi_tbolt_alloc_cmd out of memory case which previously didnt return an error. Added malloc checks for request_desc_pool including free when subsiquent errors are detected. Fixed overflow error in SIMD reply descriptor check. Fixed tbolt_cmd leak in mfi_build_and_issue_cmd if there's an error during IO build. Elimintated double checks on sc->mfi_aen_cm & sc->mfi_map_sync_cm in mfi_shutdown. Move local hdr calculation after error check in mfi_aen_complete. Fixed wakeup on NULL in mfi_aen_complete. Fixed mfi_aen_cm cleanup in mfi_process_fw_state_chg_isr not checking if it was NULL. Changed mfi_alloc_commands to error if bus_dmamap_create fails. Previously we would try to continue with the number of allocated commands but lots of places in the driver assume sc->mfi_max_fw_cmds is whats available so its unsafe to do this without lots of changes. Removed mfi_total_cmds as its no longer used due the above change. Corrected mfi_tbolt_alloc_cmd to return ENOMEM where appropriate. Fixed timeouts actually firing at double what they should. Setting hw.mfi.max_cmds=-1 now configures to use the controller max. A few style (9) fixes e.g. braced single line conditions and double blank lines Cleaned up queuing macros Removed invalid queuing tests for multiple queues Trap and deal with errors when doing sends in mfi_data_cb Refactored frame sending into one method with error checking of the return code so we can ensure commands aren't left on the queue after error. This ensures that mfi_mapcmd & mfi_data_cb leave the queue in a valid state. Refactored how commands are cleaned up, mfi_release_command now ensures that all queues and command state is maintained in a consistent state. Prevent NULL pointer use in mfi_tbolt_complete_cmd Fixed use of NULL sc->mfi_map_sync_cm in wakeup Added defines to help with output of mfi_cmd and header flags. Fixed mfi_tbolt_init_MFI_queue invalidating cm_index of the acquired mfi_cmd. Reset now reinitialises sync map as well as AEN. Fixed possible use of NULL pointer in mfi_build_and_issue_cmd Fixed mfi_tbolt_init_MFI_queue call to mfi_process_fw_state_chg_isr causing panic on failure. Ensure that tbolt cards always initialise next_host_reply_index and free_host_reply_index (based off mfi_max_fw_cmds) on both startup and reset as per the linux driver. Fixed mfi_tbolt_complete_cmd not acknowledging unknown commands so it didn't clear the controller. Prevent locks from being dropped and re-acquired in the following functions which was allowing multiple threads to enter critical methods such as mfi_tbolt_complete_cmd & mfi_process_fw_state_chg_isr:- * mfi_tbolt_init_MFI_queue * mfi_aen_complete / mfi_aen_register * mfi_tbolt_sync_map_info * mfi_get_log_state * mfi_parse_entries The locking for these functions was promoting to higher level methods. This also fixed MFI_LINUX_SET_AEN_2 which was already acquiring the lock, so would have paniced for recursive lock. This also required changing malloc of ld_sync in mfi_tbolt_sync_map_info to M_NOWAIT which can hence now fail but this was already expected as its return was being tested. Removed the assignment of cm_index in mfi_tbolt_init_MFI_queue which breaks the world if the cmd returned by mfi_dequeue_free isn't the first cmd. Fixed locking in mfi_data_cb, this is an async callback from bus_dmamap_load which could hence be called after the caller has dropped the lock. If we don't have the lock we aquire it and ensure we unlock before returning. Fixed locking mfi_comms_init when mfi_dequeue_free fails. Fixed mfi_build_and_issue_cmd not returning tbolt cmds aquired to the pool on error. Fixed mfi_abort not dropping the io lock when mfi_dequeue_free fails. Added hw.mfi.polled_cmd_timeout sysctl that enables tuning of polled timeouts. This shouldn't be reduced below 50 seconds as its used for firmware patching which can take quite some time. Added hw.mfi.fw_reset_test sysctl which is avaliable when compiled with MFI_DEBUG and allows the testing of controller reset that was provoking a large number of the issues encountered here. Reviewed by: Doug Ambrisko Approved by: pjd (mentor) MFC after: 1 month Modified: head/sys/dev/mfi/mfi.c head/sys/dev/mfi/mfi_cam.c head/sys/dev/mfi/mfi_debug.c head/sys/dev/mfi/mfi_tbolt.c head/sys/dev/mfi/mfireg.h head/sys/dev/mfi/mfivar.h Modified: head/sys/dev/mfi/mfi.c ============================================================================== --- head/sys/dev/mfi/mfi.c Wed Feb 27 00:49:32 2013 (r247368) +++ head/sys/dev/mfi/mfi.c Wed Feb 27 02:21:10 2013 (r247369) @@ -108,6 +108,7 @@ static void mfi_bio_complete(struct mfi_ static struct mfi_command *mfi_build_ldio(struct mfi_softc *,struct bio*); static struct mfi_command *mfi_build_syspdio(struct mfi_softc *,struct bio*); static int mfi_send_frame(struct mfi_softc *, struct mfi_command *); +static int mfi_std_send_frame(struct mfi_softc *, struct mfi_command *); static int mfi_abort(struct mfi_softc *, struct mfi_command **); static int mfi_linux_ioctl_int(struct cdev *, u_long, caddr_t, int, struct thread *); static void mfi_timeout(void *); @@ -132,24 +133,30 @@ static int mfi_check_for_sscd(struct mfi SYSCTL_NODE(_hw, OID_AUTO, mfi, CTLFLAG_RD, 0, "MFI driver parameters"); static int mfi_event_locale = MFI_EVT_LOCALE_ALL; TUNABLE_INT("hw.mfi.event_locale", &mfi_event_locale); -SYSCTL_INT(_hw_mfi, OID_AUTO, event_locale, CTLFLAG_RW, &mfi_event_locale, - 0, "event message locale"); +SYSCTL_INT(_hw_mfi, OID_AUTO, event_locale, CTLFLAG_RWTUN, &mfi_event_locale, + 0, "event message locale"); static int mfi_event_class = MFI_EVT_CLASS_INFO; TUNABLE_INT("hw.mfi.event_class", &mfi_event_class); -SYSCTL_INT(_hw_mfi, OID_AUTO, event_class, CTLFLAG_RW, &mfi_event_class, - 0, "event message class"); +SYSCTL_INT(_hw_mfi, OID_AUTO, event_class, CTLFLAG_RWTUN, &mfi_event_class, + 0, "event message class"); static int mfi_max_cmds = 128; TUNABLE_INT("hw.mfi.max_cmds", &mfi_max_cmds); -SYSCTL_INT(_hw_mfi, OID_AUTO, max_cmds, CTLFLAG_RD, &mfi_max_cmds, - 0, "Max commands"); +SYSCTL_INT(_hw_mfi, OID_AUTO, max_cmds, CTLFLAG_RDTUN, &mfi_max_cmds, + 0, "Max commands limit (-1 = controller limit)"); static int mfi_detect_jbod_change = 1; TUNABLE_INT("hw.mfi.detect_jbod_change", &mfi_detect_jbod_change); -SYSCTL_INT(_hw_mfi, OID_AUTO, detect_jbod_change, CTLFLAG_RW, +SYSCTL_INT(_hw_mfi, OID_AUTO, detect_jbod_change, CTLFLAG_RWTUN, &mfi_detect_jbod_change, 0, "Detect a change to a JBOD"); +int mfi_polled_cmd_timeout = MFI_POLL_TIMEOUT_SECS; +TUNABLE_INT("hw.mfi.polled_cmd_timeout", &mfi_polled_cmd_timeout); +SYSCTL_INT(_hw_mfi, OID_AUTO, polled_cmd_timeout, CTLFLAG_RWTUN, + &mfi_polled_cmd_timeout, 0, + "Polled command timeout - used for firmware flash etc (in seconds)"); + /* Management interface */ static d_open_t mfi_open; static d_close_t mfi_close; @@ -361,7 +368,7 @@ mfi_attach(struct mfi_softc *sc) { uint32_t status; int error, commsz, framessz, sensesz; - int frames, unit, max_fw_sge; + int frames, unit, max_fw_sge, max_fw_cmds; uint32_t tb_mem_size = 0; if (sc == NULL) @@ -456,7 +463,14 @@ mfi_attach(struct mfi_softc *sc) * instead of compile time. */ status = sc->mfi_read_fw_status(sc); - sc->mfi_max_fw_cmds = status & MFI_FWSTATE_MAXCMD_MASK; + max_fw_cmds = status & MFI_FWSTATE_MAXCMD_MASK; + if (mfi_max_cmds > 0 && mfi_max_cmds < max_fw_cmds) { + device_printf(sc->mfi_dev, "FW MaxCmds = %d, limiting to %d\n", + max_fw_cmds, mfi_max_cmds); + sc->mfi_max_fw_cmds = mfi_max_cmds; + } else { + sc->mfi_max_fw_cmds = max_fw_cmds; + } max_fw_sge = (status & MFI_FWSTATE_MAXSGL_MASK) >> 16; sc->mfi_max_sge = min(max_fw_sge, ((MFI_MAXPHYS / PAGE_SIZE) + 1)); @@ -464,7 +478,8 @@ mfi_attach(struct mfi_softc *sc) if (sc->mfi_flags & MFI_FLAGS_TBOLT) { mfi_tbolt_init_globals(sc); - device_printf(sc->mfi_dev, "MaxCmd = %x MaxSgl = %x state = %x \n", + device_printf(sc->mfi_dev, "MaxCmd = %d, Drv MaxCmd = %d, " + "MaxSgl = %d, state = %#x\n", max_fw_cmds, sc->mfi_max_fw_cmds, sc->mfi_max_sge, status); tb_mem_size = mfi_tbolt_get_memory_requirement(sc); @@ -503,8 +518,8 @@ mfi_attach(struct mfi_softc *sc) 0, /* flags */ NULL, NULL, /* lockfunc, lockarg */ &sc->mfi_tb_init_dmat)) { - device_printf(sc->mfi_dev, "Cannot allocate init DMA tag\n"); - return (ENOMEM); + device_printf(sc->mfi_dev, "Cannot allocate init DMA tag\n"); + return (ENOMEM); } if (bus_dmamem_alloc(sc->mfi_tb_init_dmat, (void **)&sc->mfi_tb_init, BUS_DMA_NOWAIT, &sc->mfi_tb_init_dmamap)) { @@ -683,11 +698,14 @@ mfi_attach(struct mfi_softc *sc) /* ThunderBolt MFI_IOC2 INIT */ if (sc->mfi_flags & MFI_FLAGS_TBOLT) { sc->mfi_disable_intr(sc); + mtx_lock(&sc->mfi_io_lock); if ((error = mfi_tbolt_init_MFI_queue(sc)) != 0) { device_printf(sc->mfi_dev, "TB Init has failed with error %d\n",error); + mtx_unlock(&sc->mfi_io_lock); return error; } + mtx_unlock(&sc->mfi_io_lock); if ((error = mfi_tbolt_alloc_cmd(sc)) != 0) return error; @@ -723,8 +741,12 @@ mfi_attach(struct mfi_softc *sc) "hook\n"); return (EINVAL); } - if ((error = mfi_aen_setup(sc, 0), 0) != 0) + mtx_lock(&sc->mfi_io_lock); + if ((error = mfi_aen_setup(sc, 0), 0) != 0) { + mtx_unlock(&sc->mfi_io_lock); return (error); + } + mtx_unlock(&sc->mfi_io_lock); /* * Register a shutdown handler. @@ -764,7 +786,9 @@ mfi_attach(struct mfi_softc *sc) mfi_timeout, sc); if (sc->mfi_flags & MFI_FLAGS_TBOLT) { + mtx_lock(&sc->mfi_io_lock); mfi_tbolt_sync_map_info(sc); + mtx_unlock(&sc->mfi_io_lock); } return (0); @@ -774,21 +798,16 @@ static int mfi_alloc_commands(struct mfi_softc *sc) { struct mfi_command *cm; - int i, ncmds; + int i, j; /* * XXX Should we allocate all the commands up front, or allocate on * demand later like 'aac' does? */ - ncmds = MIN(mfi_max_cmds, sc->mfi_max_fw_cmds); - if (bootverbose) - device_printf(sc->mfi_dev, "Max fw cmds= %d, sizing driver " - "pool to %d\n", sc->mfi_max_fw_cmds, ncmds); - - sc->mfi_commands = malloc(sizeof(struct mfi_command) * ncmds, M_MFIBUF, - M_WAITOK | M_ZERO); + sc->mfi_commands = malloc(sizeof(sc->mfi_commands[0]) * + sc->mfi_max_fw_cmds, M_MFIBUF, M_WAITOK | M_ZERO); - for (i = 0; i < ncmds; i++) { + for (i = 0; i < sc->mfi_max_fw_cmds; i++) { cm = &sc->mfi_commands[i]; cm->cm_frame = (union mfi_frame *)((uintptr_t)sc->mfi_frames + sc->mfi_cmd_size * i); @@ -804,10 +823,20 @@ mfi_alloc_commands(struct mfi_softc *sc) mtx_lock(&sc->mfi_io_lock); mfi_release_command(cm); mtx_unlock(&sc->mfi_io_lock); + } else { + device_printf(sc->mfi_dev, "Failed to allocate %d " + "command blocks, only allocated %d\n", + sc->mfi_max_fw_cmds, i - 1); + for (j = 0; j < i; j++) { + cm = &sc->mfi_commands[i]; + bus_dmamap_destroy(sc->mfi_buffer_dmat, + cm->cm_dmamap); + } + free(sc->mfi_commands, M_MFIBUF); + sc->mfi_commands = NULL; + + return (ENOMEM); } - else - break; - sc->mfi_total_cmds++; } return (0); @@ -832,6 +861,29 @@ mfi_release_command(struct mfi_command * cm->cm_sg->sg32[0].addr = 0; } + /* + * Command may be on other queues e.g. busy queue depending on the + * flow of a previous call to mfi_mapcmd, so ensure its dequeued + * properly + */ + if ((cm->cm_flags & MFI_ON_MFIQ_BUSY) != 0) + mfi_remove_busy(cm); + if ((cm->cm_flags & MFI_ON_MFIQ_READY) != 0) + mfi_remove_ready(cm); + + /* We're not expecting it to be on any other queue but check */ + if ((cm->cm_flags & MFI_ON_MFIQ_MASK) != 0) { + panic("Command %p is still on another queue, flags = %#x", + cm, cm->cm_flags); + } + + /* tbolt cleanup */ + if ((cm->cm_flags & MFI_CMD_TBOLT) != 0) { + mfi_tbolt_return_cmd(cm->cm_sc, + cm->cm_sc->mfi_cmd_pool_tbolt[cm->cm_extra_frames - 1], + cm); + } + hdr_data = (uint32_t *)cm->cm_frame; hdr_data[0] = 0; /* cmd, sense_len, cmd_status, scsi_status */ hdr_data[1] = 0; /* target_id, lun_id, cdb_len, sg_count */ @@ -914,8 +966,10 @@ mfi_comms_init(struct mfi_softc *sc) uint32_t context = 0; mtx_lock(&sc->mfi_io_lock); - if ((cm = mfi_dequeue_free(sc)) == NULL) + if ((cm = mfi_dequeue_free(sc)) == NULL) { + mtx_unlock(&sc->mfi_io_lock); return (EBUSY); + } /* Zero out the MFI frame */ context = cm->cm_frame->header.context; @@ -944,15 +998,12 @@ mfi_comms_init(struct mfi_softc *sc) cm->cm_data = NULL; cm->cm_flags = MFI_CMD_POLLED; - if ((error = mfi_mapcmd(sc, cm)) != 0) { + if ((error = mfi_mapcmd(sc, cm)) != 0) device_printf(sc->mfi_dev, "failed to send init command\n"); - mtx_unlock(&sc->mfi_io_lock); - return (error); - } mfi_release_command(cm); mtx_unlock(&sc->mfi_io_lock); - return (0); + return (error); } static int @@ -1003,7 +1054,7 @@ mfi_get_log_state(struct mfi_softc *sc, struct mfi_command *cm = NULL; int error; - mtx_lock(&sc->mfi_io_lock); + mtx_assert(&sc->mfi_io_lock, MA_OWNED); error = mfi_dcmd_command(sc, &cm, MFI_DCMD_CTRL_EVENT_GETINFO, (void **)log_state, sizeof(**log_state)); if (error) @@ -1022,7 +1073,6 @@ mfi_get_log_state(struct mfi_softc *sc, out: if (cm) mfi_release_command(cm); - mtx_unlock(&sc->mfi_io_lock); return (error); } @@ -1035,32 +1085,32 @@ mfi_aen_setup(struct mfi_softc *sc, uint int error = 0; uint32_t seq; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + class_locale.members.reserved = 0; class_locale.members.locale = mfi_event_locale; class_locale.members.evt_class = mfi_event_class; if (seq_start == 0) { - error = mfi_get_log_state(sc, &log_state); + if ((error = mfi_get_log_state(sc, &log_state)) != 0) + goto out; sc->mfi_boot_seq_num = log_state->boot_seq_num; - if (error) { - if (log_state) - free(log_state, M_MFIBUF); - return (error); - } /* * Walk through any events that fired since the last * shutdown. */ - mfi_parse_entries(sc, log_state->shutdown_seq_num, - log_state->newest_seq_num); + if ((error = mfi_parse_entries(sc, log_state->shutdown_seq_num, + log_state->newest_seq_num)) != 0) + goto out; seq = log_state->newest_seq_num; } else seq = seq_start; - mfi_aen_register(sc, seq, class_locale.word); + error = mfi_aen_register(sc, seq, class_locale.word); +out: free(log_state, M_MFIBUF); - return 0; + return (error); } int @@ -1070,7 +1120,6 @@ mfi_wait_command(struct mfi_softc *sc, s mtx_assert(&sc->mfi_io_lock, MA_OWNED); cm->cm_complete = NULL; - /* * MegaCli can issue a DCMD of 0. In this case do nothing * and return 0 to it as status @@ -1098,12 +1147,13 @@ mfi_free(struct mfi_softc *sc) if (sc->mfi_cdev != NULL) destroy_dev(sc->mfi_cdev); - if (sc->mfi_total_cmds != 0) { - for (i = 0; i < sc->mfi_total_cmds; i++) { + if (sc->mfi_commands != NULL) { + for (i = 0; i < sc->mfi_max_fw_cmds; i++) { cm = &sc->mfi_commands[i]; bus_dmamap_destroy(sc->mfi_buffer_dmat, cm->cm_dmamap); } free(sc->mfi_commands, M_MFIBUF); + sc->mfi_commands = NULL; } if (sc->mfi_intr) @@ -1159,7 +1209,8 @@ mfi_free(struct mfi_softc *sc) /* End LSIP200113393 */ /* ThunderBolt INIT packet memory Free */ if (sc->mfi_tb_init_busaddr != 0) - bus_dmamap_unload(sc->mfi_tb_init_dmat, sc->mfi_tb_init_dmamap); + bus_dmamap_unload(sc->mfi_tb_init_dmat, + sc->mfi_tb_init_dmamap); if (sc->mfi_tb_init != NULL) bus_dmamem_free(sc->mfi_tb_init_dmat, sc->mfi_tb_init, sc->mfi_tb_init_dmamap); @@ -1176,16 +1227,14 @@ mfi_free(struct mfi_softc *sc) sc->mfi_tb_ioc_init_dmamap); if (sc->mfi_tb_ioc_init_dmat != NULL) bus_dma_tag_destroy(sc->mfi_tb_ioc_init_dmat); - for (int i = 0; i < sc->mfi_max_fw_cmds; i++) { - if (sc->mfi_cmd_pool_tbolt != NULL) { + if (sc->mfi_cmd_pool_tbolt != NULL) { + for (int i = 0; i < sc->mfi_max_fw_cmds; i++) { if (sc->mfi_cmd_pool_tbolt[i] != NULL) { free(sc->mfi_cmd_pool_tbolt[i], M_MFIBUF); sc->mfi_cmd_pool_tbolt[i] = NULL; } } - } - if (sc->mfi_cmd_pool_tbolt != NULL) { free(sc->mfi_cmd_pool_tbolt, M_MFIBUF); sc->mfi_cmd_pool_tbolt = NULL; } @@ -1250,16 +1299,14 @@ restart: cm->cm_error = 0; mfi_complete(sc, cm); } - if (++ci == (sc->mfi_max_fw_cmds + 1)) { + if (++ci == (sc->mfi_max_fw_cmds + 1)) ci = 0; - } } sc->mfi_comms->hw_ci = ci; /* Give defered I/O a chance to run */ - if (sc->mfi_flags & MFI_FLAGS_QFRZN) - sc->mfi_flags &= ~MFI_FLAGS_QFRZN; + sc->mfi_flags &= ~MFI_FLAGS_QFRZN; mfi_startio(sc); mtx_unlock(&sc->mfi_io_lock); @@ -1282,15 +1329,15 @@ mfi_shutdown(struct mfi_softc *sc) int error; - if (sc->mfi_aen_cm) + if (sc->mfi_aen_cm != NULL) { sc->cm_aen_abort = 1; - if (sc->mfi_aen_cm != NULL) mfi_abort(sc, &sc->mfi_aen_cm); + } - if (sc->mfi_map_sync_cm) + if (sc->mfi_map_sync_cm != NULL) { sc->cm_map_abort = 1; - if (sc->mfi_map_sync_cm != NULL) mfi_abort(sc, &sc->mfi_map_sync_cm); + } mtx_lock(&sc->mfi_io_lock); error = mfi_dcmd_command(sc, &cm, MFI_DCMD_CTRL_SHUTDOWN, NULL, 0); @@ -1304,9 +1351,8 @@ mfi_shutdown(struct mfi_softc *sc) cm->cm_flags = MFI_CMD_POLLED; cm->cm_data = NULL; - if ((error = mfi_mapcmd(sc, cm)) != 0) { + if ((error = mfi_mapcmd(sc, cm)) != 0) device_printf(sc->mfi_dev, "Failed to shutdown controller\n"); - } mfi_release_command(cm); mtx_unlock(&sc->mfi_io_lock); @@ -1372,8 +1418,10 @@ mfi_syspdprobe(struct mfi_softc *sc) TAILQ_FOREACH_SAFE(syspd, &sc->mfi_syspd_tqh, pd_link, tmp) { found = 0; for (i = 0; i < pdlist->count; i++) { - if (syspd->pd_id == pdlist->addr[i].device_id) + if (syspd->pd_id == pdlist->addr[i].device_id) { found = 1; + break; + } } if (found == 0) { printf("DELETE\n"); @@ -1626,6 +1674,8 @@ mfi_aen_register(struct mfi_softc *sc, i struct mfi_evt_detail *ed = NULL; int error = 0; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + current_aen.word = locale; if (sc->mfi_aen_cm != NULL) { prior_aen.word = @@ -1644,13 +1694,10 @@ mfi_aen_register(struct mfi_softc *sc, i } } - mtx_lock(&sc->mfi_io_lock); error = mfi_dcmd_command(sc, &cm, MFI_DCMD_CTRL_EVENT_WAIT, (void **)&ed, sizeof(*ed)); - mtx_unlock(&sc->mfi_io_lock); - if (error) { + if (error) goto out; - } dcmd = &cm->cm_frame->dcmd; ((uint32_t *)&dcmd->mbox)[0] = seq; @@ -1661,10 +1708,8 @@ mfi_aen_register(struct mfi_softc *sc, i sc->last_seq_num = seq; sc->mfi_aen_cm = cm; - mtx_lock(&sc->mfi_io_lock); mfi_enqueue_ready(cm); mfi_startio(sc); - mtx_unlock(&sc->mfi_io_lock); out: return (error); @@ -1682,11 +1727,11 @@ mfi_aen_complete(struct mfi_command *cm) sc = cm->cm_sc; mtx_assert(&sc->mfi_io_lock, MA_OWNED); - hdr = &cm->cm_frame->header; - if (sc->mfi_aen_cm == NULL) return; + hdr = &cm->cm_frame->header; + if (sc->cm_aen_abort || hdr->cmd_status == MFI_STAT_INVALID_STATUS) { sc->cm_aen_abort = 0; @@ -1712,16 +1757,13 @@ mfi_aen_complete(struct mfi_command *cm) } free(cm->cm_data, M_MFIBUF); - sc->mfi_aen_cm = NULL; wakeup(&sc->mfi_aen_cm); + sc->mfi_aen_cm = NULL; mfi_release_command(cm); /* set it up again so the driver can catch more events */ - if (!aborted) { - mtx_unlock(&sc->mfi_io_lock); + if (!aborted) mfi_aen_setup(sc, seq); - mtx_lock(&sc->mfi_io_lock); - } } #define MAX_EVENTS 15 @@ -1735,6 +1777,8 @@ mfi_parse_entries(struct mfi_softc *sc, union mfi_evt class_locale; int error, i, seq, size; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + class_locale.members.reserved = 0; class_locale.members.locale = mfi_event_locale; class_locale.members.evt_class = mfi_event_class; @@ -1746,13 +1790,10 @@ mfi_parse_entries(struct mfi_softc *sc, return (ENOMEM); for (seq = start_seq;;) { - mtx_lock(&sc->mfi_io_lock); if ((cm = mfi_dequeue_free(sc)) == NULL) { free(el, M_MFIBUF); - mtx_unlock(&sc->mfi_io_lock); return (EBUSY); } - mtx_unlock(&sc->mfi_io_lock); dcmd = &cm->cm_frame->dcmd; bzero(dcmd->mbox, MFI_MBOX_SIZE); @@ -1768,38 +1809,30 @@ mfi_parse_entries(struct mfi_softc *sc, cm->cm_data = el; cm->cm_len = size; - mtx_lock(&sc->mfi_io_lock); if ((error = mfi_mapcmd(sc, cm)) != 0) { device_printf(sc->mfi_dev, "Failed to get controller entries\n"); mfi_release_command(cm); - mtx_unlock(&sc->mfi_io_lock); break; } - mtx_unlock(&sc->mfi_io_lock); bus_dmamap_sync(sc->mfi_buffer_dmat, cm->cm_dmamap, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(sc->mfi_buffer_dmat, cm->cm_dmamap); if (dcmd->header.cmd_status == MFI_STAT_NOT_FOUND) { - mtx_lock(&sc->mfi_io_lock); mfi_release_command(cm); - mtx_unlock(&sc->mfi_io_lock); break; } if (dcmd->header.cmd_status != MFI_STAT_OK) { device_printf(sc->mfi_dev, "Error %d fetching controller entries\n", dcmd->header.cmd_status); - mtx_lock(&sc->mfi_io_lock); mfi_release_command(cm); - mtx_unlock(&sc->mfi_io_lock); + error = EIO; break; } - mtx_lock(&sc->mfi_io_lock); mfi_release_command(cm); - mtx_unlock(&sc->mfi_io_lock); for (i = 0; i < el->count; i++) { /* @@ -1815,15 +1848,13 @@ mfi_parse_entries(struct mfi_softc *sc, else if (el->event[i].seq < start_seq) break; } - mtx_lock(&sc->mfi_io_lock); mfi_queue_evt(sc, &el->event[i]); - mtx_unlock(&sc->mfi_io_lock); } seq = el->event[el->count - 1].seq + 1; } free(el, M_MFIBUF); - return (0); + return (error); } static int @@ -1940,11 +1971,12 @@ static int mfi_add_sys_pd(struct mfi_sof dcmd->mbox[0]=id; dcmd->header.scsi_status = 0; dcmd->header.pad0 = 0; - if (mfi_mapcmd(sc, cm) != 0) { + if ((error = mfi_mapcmd(sc, cm)) != 0) { device_printf(sc->mfi_dev, "Failed to get physical drive info %d\n", id); free(pd_info, M_MFIBUF); - return (0); + mfi_release_command(cm); + return (error); } bus_dmamap_sync(sc->mfi_buffer_dmat, cm->cm_dmamap, BUS_DMASYNC_POSTREAD); @@ -2094,6 +2126,8 @@ mfi_build_syspdio(struct mfi_softc *sc, int flags = 0, blkcount = 0, readop; uint8_t cdb_len; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + if ((cm = mfi_dequeue_free(sc)) == NULL) return (NULL); @@ -2140,6 +2174,7 @@ mfi_build_syspdio(struct mfi_softc *sc, cm->cm_sg = &pass->sgl; cm->cm_total_frame_size = MFI_PASS_FRAME_SIZE; cm->cm_flags = flags; + return (cm); } @@ -2152,6 +2187,8 @@ mfi_build_ldio(struct mfi_softc *sc, str uint32_t blkcount; uint32_t context = 0; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + if ((cm = mfi_dequeue_free(sc)) == NULL) return (NULL); @@ -2193,6 +2230,7 @@ mfi_build_ldio(struct mfi_softc *sc, str cm->cm_sg = &io->sgl; cm->cm_total_frame_size = MFI_IO_FRAME_SIZE; cm->cm_flags = flags; + return (cm); } @@ -2210,11 +2248,14 @@ mfi_bio_complete(struct mfi_command *cm) if ((hdr->cmd_status != MFI_STAT_OK) || (hdr->scsi_status != 0)) { bio->bio_flags |= BIO_ERROR; bio->bio_error = EIO; - device_printf(sc->mfi_dev, "I/O error, status= %d " - "scsi_status= %d\n", hdr->cmd_status, hdr->scsi_status); + device_printf(sc->mfi_dev, "I/O error, cmd=%p, status=%#x, " + "scsi_status=%#x\n", cm, hdr->cmd_status, hdr->scsi_status); mfi_print_sense(cm->cm_sc, cm->cm_sense); } else if (cm->cm_error != 0) { bio->bio_flags |= BIO_ERROR; + bio->bio_error = cm->cm_error; + device_printf(sc->mfi_dev, "I/O error, cmd=%p, error=%#x\n", + cm, cm->cm_error); } mfi_release_command(cm); @@ -2250,6 +2291,7 @@ mfi_startio(struct mfi_softc *sc) /* Send the command to the controller */ if (mfi_mapcmd(sc, cm) != 0) { + device_printf(sc->mfi_dev, "Failed to startio\n"); mfi_requeue_ready(cm); break; } @@ -2278,10 +2320,7 @@ mfi_mapcmd(struct mfi_softc *sc, struct return (0); } } else { - if (sc->MFA_enabled) - error = mfi_tbolt_send_frame(sc, cm); - else - error = mfi_send_frame(sc, cm); + error = mfi_send_frame(sc, cm); } return (error); @@ -2295,18 +2334,28 @@ mfi_data_cb(void *arg, bus_dma_segment_t union mfi_sgl *sgl; struct mfi_softc *sc; int i, j, first, dir; - int sge_size; + int sge_size, locked; cm = (struct mfi_command *)arg; sc = cm->cm_sc; hdr = &cm->cm_frame->header; sgl = cm->cm_sg; + /* + * We need to check if we have the lock as this is async + * callback so even though our caller mfi_mapcmd asserts + * it has the lock, there is no garantee that hasn't been + * dropped if bus_dmamap_load returned prior to our + * completion. + */ + if ((locked = mtx_owned(&sc->mfi_io_lock)) == 0) + mtx_lock(&sc->mfi_io_lock); + if (error) { printf("error %d in callback\n", error); cm->cm_error = error; mfi_complete(sc, cm); - return; + goto out; } /* Use IEEE sgl only for IO's on a SKINNY controller * For other commands on a SKINNY controller use either @@ -2378,10 +2427,17 @@ mfi_data_cb(void *arg, bus_dma_segment_t cm->cm_total_frame_size += (sc->mfi_sge_size * nsegs); cm->cm_extra_frames = (cm->cm_total_frame_size - 1) / MFI_FRAME_SIZE; - if (sc->MFA_enabled) - mfi_tbolt_send_frame(sc, cm); - else - mfi_send_frame(sc, cm); + if ((error = mfi_send_frame(sc, cm)) != 0) { + printf("error %d in callback from mfi_send_frame\n", error); + cm->cm_error = error; + mfi_complete(sc, cm); + goto out; + } + +out: + /* leave the lock in the state we found it */ + if (locked == 0) + mtx_unlock(&sc->mfi_io_lock); return; } @@ -2389,8 +2445,26 @@ mfi_data_cb(void *arg, bus_dma_segment_t static int mfi_send_frame(struct mfi_softc *sc, struct mfi_command *cm) { + int error; + + mtx_assert(&sc->mfi_io_lock, MA_OWNED); + + if (sc->MFA_enabled) + error = mfi_tbolt_send_frame(sc, cm); + else + error = mfi_std_send_frame(sc, cm); + + if (error != 0 && (cm->cm_flags & MFI_ON_MFIQ_BUSY) != 0) + mfi_remove_busy(cm); + + return (error); +} + +static int +mfi_std_send_frame(struct mfi_softc *sc, struct mfi_command *cm) +{ struct mfi_frame_header *hdr; - int tm = MFI_POLL_TIMEOUT_SECS * 1000; + int tm = mfi_polled_cmd_timeout * 1000; hdr = &cm->cm_frame->header; @@ -2444,6 +2518,7 @@ void mfi_complete(struct mfi_softc *sc, struct mfi_command *cm) { int dir; + mtx_assert(&sc->mfi_io_lock, MA_OWNED); if ((cm->cm_flags & MFI_CMD_MAPPED) != 0) { dir = 0; @@ -2471,11 +2546,12 @@ mfi_abort(struct mfi_softc *sc, struct m { struct mfi_command *cm; struct mfi_abort_frame *abort; - int i = 0; + int i = 0, error; uint32_t context = 0; mtx_lock(&sc->mfi_io_lock); if ((cm = mfi_dequeue_free(sc)) == NULL) { + mtx_unlock(&sc->mfi_io_lock); return (EBUSY); } @@ -2495,7 +2571,8 @@ mfi_abort(struct mfi_softc *sc, struct m cm->cm_data = NULL; cm->cm_flags = MFI_CMD_POLLED; - mfi_mapcmd(sc, cm); + if ((error = mfi_mapcmd(sc, cm)) != 0) + device_printf(sc->mfi_dev, "failed to abort command\n"); mfi_release_command(cm); mtx_unlock(&sc->mfi_io_lock); @@ -2511,7 +2588,7 @@ mfi_abort(struct mfi_softc *sc, struct m mtx_unlock(&sc->mfi_io_lock); } - return (0); + return (error); } int @@ -2549,7 +2626,8 @@ mfi_dump_blocks(struct mfi_softc *sc, in cm->cm_total_frame_size = MFI_IO_FRAME_SIZE; cm->cm_flags = MFI_CMD_POLLED | MFI_CMD_DATAOUT; - error = mfi_mapcmd(sc, cm); + if ((error = mfi_mapcmd(sc, cm)) != 0) + device_printf(sc->mfi_dev, "failed dump blocks\n"); bus_dmamap_sync(sc->mfi_buffer_dmat, cm->cm_dmamap, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(sc->mfi_buffer_dmat, cm->cm_dmamap); @@ -2592,7 +2670,8 @@ mfi_dump_syspd_blocks(struct mfi_softc * cm->cm_total_frame_size = MFI_PASS_FRAME_SIZE; cm->cm_flags = MFI_CMD_POLLED | MFI_CMD_DATAOUT | MFI_CMD_SCSI; - error = mfi_mapcmd(sc, cm); + if ((error = mfi_mapcmd(sc, cm)) != 0) + device_printf(sc->mfi_dev, "failed dump blocks\n"); bus_dmamap_sync(sc->mfi_buffer_dmat, cm->cm_dmamap, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(sc->mfi_buffer_dmat, cm->cm_dmamap); @@ -3306,8 +3385,10 @@ out: } case MFI_SET_AEN: aen = (struct mfi_ioc_aen *)arg; + mtx_lock(&sc->mfi_io_lock); error = mfi_aen_register(sc, aen->aen_seq_num, aen->aen_class_locale); + mtx_unlock(&sc->mfi_io_lock); break; case MFI_LINUX_CMD_2: /* Firmware Linux ioctl shim */ @@ -3636,7 +3717,7 @@ mfi_dump_all(void) deadline = time_uptime - MFI_CMD_TIMEOUT; mtx_lock(&sc->mfi_io_lock); TAILQ_FOREACH(cm, &sc->mfi_busy, cm_link) { - if (cm->cm_timestamp < deadline) { + if (cm->cm_timestamp <= deadline) { device_printf(sc->mfi_dev, "COMMAND %p TIMEOUT AFTER %d SECONDS\n", cm, (int)(time_uptime - cm->cm_timestamp)); @@ -3647,7 +3728,7 @@ mfi_dump_all(void) #if 0 if (timedout) - MFI_DUMP_CMDS(SC); + MFI_DUMP_CMDS(sc); #endif mtx_unlock(&sc->mfi_io_lock); @@ -3660,7 +3741,7 @@ static void mfi_timeout(void *data) { struct mfi_softc *sc = (struct mfi_softc *)data; - struct mfi_command *cm; + struct mfi_command *cm, *tmp; time_t deadline; int timedout = 0; @@ -3672,10 +3753,10 @@ mfi_timeout(void *data) } } mtx_lock(&sc->mfi_io_lock); - TAILQ_FOREACH(cm, &sc->mfi_busy, cm_link) { + TAILQ_FOREACH_SAFE(cm, &sc->mfi_busy, cm_link, tmp) { if (sc->mfi_aen_cm == cm || sc->mfi_map_sync_cm == cm) continue; - if (cm->cm_timestamp < deadline) { + if (cm->cm_timestamp <= deadline) { if (sc->adpreset != 0 && sc->issuepend_done == 0) { cm->cm_timestamp = time_uptime; } else { @@ -3685,6 +3766,13 @@ mfi_timeout(void *data) ); MFI_PRINT_CMD(cm); MFI_VALIDATE_CMD(sc, cm); + /* + * Fail the command instead of leaving it on + * the queue where it could remain stuck forever + */ + mfi_remove_busy(cm); + cm->cm_error = ETIMEDOUT; + mfi_complete(sc, cm); timedout++; } } @@ -3692,7 +3780,7 @@ mfi_timeout(void *data) #if 0 if (timedout) - MFI_DUMP_CMDS(SC); + MFI_DUMP_CMDS(sc); #endif mtx_unlock(&sc->mfi_io_lock); Modified: head/sys/dev/mfi/mfi_cam.c ============================================================================== --- head/sys/dev/mfi/mfi_cam.c Wed Feb 27 00:49:32 2013 (r247368) +++ head/sys/dev/mfi/mfi_cam.c Wed Feb 27 02:21:10 2013 (r247369) @@ -145,6 +145,7 @@ mfip_attach(device_t dev) MFI_SCSI_MAX_CMDS, sc->devq); if (sc->sim == NULL) { cam_simq_free(sc->devq); + sc->devq = NULL; device_printf(dev, "CAM SIM attach failed\n"); return (EINVAL); } @@ -155,7 +156,9 @@ mfip_attach(device_t dev) if (xpt_bus_register(sc->sim, dev, 0) != 0) { device_printf(dev, "XPT bus registration failed\n"); cam_sim_free(sc->sim, FALSE); + sc->sim = NULL; cam_simq_free(sc->devq); + sc->devq = NULL; mtx_unlock(&mfisc->mfi_io_lock); return (EINVAL); } @@ -187,11 +190,14 @@ mfip_detach(device_t dev) mtx_lock(&sc->mfi_sc->mfi_io_lock); xpt_bus_deregister(cam_sim_path(sc->sim)); cam_sim_free(sc->sim, FALSE); + sc->sim = NULL; mtx_unlock(&sc->mfi_sc->mfi_io_lock); } - if (sc->devq != NULL) + if (sc->devq != NULL) { cam_simq_free(sc->devq); + sc->devq = NULL; + } return (0); } Modified: head/sys/dev/mfi/mfi_debug.c ============================================================================== --- head/sys/dev/mfi/mfi_debug.c Wed Feb 27 00:49:32 2013 (r247368) +++ head/sys/dev/mfi/mfi_debug.c Wed Feb 27 02:21:10 2013 (r247369) @@ -57,14 +57,7 @@ __FBSDID("$FreeBSD$"); static void mfi_print_frame_flags(device_t dev, uint32_t flags) { - device_printf(dev, "flags=%b\n", flags, - "\20" - "\1NOPOST" - "\2SGL64" - "\3SENSE64" - "\4WRITE" - "\5READ" - "\6IEEESGL"); + device_printf(dev, "flags=%b\n", flags, MFI_FRAME_FMT); } static void @@ -205,16 +198,7 @@ mfi_print_cmd(struct mfi_command *cm) device_printf(dev, "cm=%p index=%d total_frame_size=%d " "extra_frames=%d\n", cm, cm->cm_index, cm->cm_total_frame_size, cm->cm_extra_frames); - device_printf(dev, "flags=%b\n", cm->cm_flags, - "\20" - "\1MAPPED" - "\2DATAIN" - "\3DATAOUT" - "\4COMPLETED" - "\5POLLED" - "\6Q_FREE" - "\7Q_READY" - "\10Q_BUSY"); + device_printf(dev, "flags=%b\n", cm->cm_flags, MFI_CMD_FLAGS_FMT); switch (cm->cm_frame->header.cmd) { case MFI_CMD_DCMD: @@ -237,7 +221,7 @@ mfi_dump_cmds(struct mfi_softc *sc) { int i; - for (i = 0; i < sc->mfi_total_cmds; i++) + for (i = 0; i < sc->mfi_max_fw_cmds; i++) mfi_print_generic_frame(sc, &sc->mfi_commands[i]); } Modified: head/sys/dev/mfi/mfi_tbolt.c ============================================================================== --- head/sys/dev/mfi/mfi_tbolt.c Wed Feb 27 00:49:32 2013 (r247368) +++ head/sys/dev/mfi/mfi_tbolt.c Wed Feb 27 02:21:10 2013 (r247369) @@ -55,14 +55,12 @@ __FBSDID("$FreeBSD$"); #include <dev/mfi/mfi_ioctl.h> #include <dev/mfi/mfivar.h> -struct mfi_cmd_tbolt *mfi_tbolt_get_cmd(struct mfi_softc *sc); +struct mfi_cmd_tbolt *mfi_tbolt_get_cmd(struct mfi_softc *sc, struct mfi_command *); union mfi_mpi2_request_descriptor * mfi_tbolt_get_request_descriptor(struct mfi_softc *sc, uint16_t index); void mfi_tbolt_complete_cmd(struct mfi_softc *sc); int mfi_tbolt_build_io(struct mfi_softc *sc, struct mfi_command *mfi_cmd, struct mfi_cmd_tbolt *cmd); -static inline void mfi_tbolt_return_cmd(struct mfi_softc *sc, - struct mfi_cmd_tbolt *cmd); union mfi_mpi2_request_descriptor *mfi_tbolt_build_mpt_cmd(struct mfi_softc *sc, struct mfi_command *cmd); uint8_t @@ -84,6 +82,15 @@ static void mfi_queue_map_sync(struct mf #define MFI_FUSION_ENABLE_INTERRUPT_MASK (0x00000008) + +extern int mfi_polled_cmd_timeout; +static int mfi_fw_reset_test = 0; +#ifdef MFI_DEBUG +TUNABLE_INT("hw.mfi.fw_reset_test", &mfi_fw_reset_test); +SYSCTL_INT(_hw_mfi, OID_AUTO, fw_reset_test, CTLFLAG_RWTUN, &mfi_fw_reset_test, + 0, "Force a firmware reset condition"); +#endif + void mfi_tbolt_enable_intr_ppc(struct mfi_softc *sc) { @@ -162,14 +169,14 @@ mfi_tbolt_adp_reset(struct mfi_softc *sc while (!( HostDiag & DIAG_WRITE_ENABLE)) { for (i = 0; i < 1000; i++); HostDiag = (uint32_t)MFI_READ4(sc, MFI_HDR); - device_printf(sc->mfi_dev, "ADP_RESET_TBOLT: retry time=%x, " - "hostdiag=%x\n", retry, HostDiag); + device_printf(sc->mfi_dev, "ADP_RESET_TBOLT: retry time=%d, " + "hostdiag=%#x\n", retry, HostDiag); if (retry++ >= 100) return 1; } - device_printf(sc->mfi_dev, "ADP_RESET_TBOLT: HostDiag=%x\n", HostDiag); + device_printf(sc->mfi_dev, "ADP_RESET_TBOLT: HostDiag=%#x\n", HostDiag); MFI_WRITE4(sc, MFI_HDR, (HostDiag | DIAG_RESET_ADAPTER)); *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"