Hi, On 18/10/19 2:34 PM, Simon Goldschmidt wrote: > On Thu, Oct 17, 2019 at 2:55 PM Simon Goldschmidt > <simon.k.r.goldschm...@gmail.com> wrote: >> >> On Thu, Oct 17, 2019 at 2:44 PM Vignesh Raghavendra <vigne...@ti.com> wrote: >>> >>> Hi, >>> >>> On 17/10/19 5:09 PM, Simon Goldschmidt wrote: >>>> On Mon, Oct 14, 2019 at 3:27 PM Vignesh Raghavendra <vigne...@ti.com> >>>> wrote: >>>>> >>>>> Add support for Direct Access Controller mode of Cadence QSPI. This >>>>> allows MMIO access to SPI NOR flash providing better read performance. >>>>> >>>>> Signed-off-by: Vignesh R <vigne...@ti.com> >>>>> Signed-off-by: Vignesh Raghavendra <vigne...@ti.com> >>>> >>>> I've tested this on my socfpga cyclone5 board with an mt25ql256a (running >>>> at >>>> 50MHz) and it seems to work fine. >>>> >>>> However, I had the impression it was a bit slower, not faster, although I >>>> haven't measured, and running at 50MHz with 4 data lines, reading the whole >>>> flash takes about 1.5 seconds only, so without actually measureing it, it's >>>> hard to tell if this performance is really worth the 440 bytes of extra >>>> code >>>> it adds? >>>> >>> >>> I should have noted in the commit msg that direct mode is used only when >>> AHB window is at least 8MB. Working with smaller window sizes would mean >>> code would fall back to indirect mode most of the time. >>> I see that socfgpa don't seem to have large AHB windows and therefore >>> would not use direct access mode. >> >> Aha. Yes, socfpga gen5 has a 1MB window only, I think. So the speed hasn't >> changed. >> >>> >>> On TI platforms its possible to use DMA to read data from memory mapped >>> region using mem to mem DMA channels which improves throughput by 5 times. >> >> Hmm, I'm using a QSPI at 50 MHz and transferring 31 MByte takes about 1.5 >> seconds. The maximum speed of that setup would be 25 MByte/s without any >> overhead (1.24 seconds for 31 MB). There's no way I could achieve an >> improvement >> by 5 on my platform! >> >>> >>>> Note that I'm already short on RAM in SPL (socfpga gen5), so these 440 >>>> bytes >>>> do hurt. Note that patch 1/2 of this series reduced SPL code size by 320 >>>> bytes >>>> for me :-) >>>> >>> >>> Hmm, I read this as patch 1 brings down memory consumption by 320 bytes >>> and patch 2 adds 440 bytes (delta being +120 bytes for entire series). >>> I can see if size can be optimized, but would like to avoid #ifdef'ery >>> within the driver if possible. >> >> Well, I can only say I'm currently struggling to keep SPI-NOR and MMC enabled >> at the same time in socfpga_socrates (moving more code to DM). And even if it >> sounds like not much, 440 bytes *are* much for me at this point. >> >> I still would prefer having a Kconfig option for this that can be disabled >> for socfpga. > > Oh well, maybe just go ahead adding this and I'll try how it fits. We can > alway > add reduced functionality later by checking for CONFIG_SPL_SPI_FLASH_TINY, > whicht might be a better config option to guard this than something > cadence-qspi > specific. >
Okay, thanks! Guarding with CONFIG_SPL_SPI_FLASH_TINY as and where required seems fair idea. BTW, it would be great if you could confirm Quad read works fine when spi-tx-bus-width = 1 and spi-rx-bus-width = 4 as I suggested in other thread? Regards Vignesh > Regards, > Simon > >> >> Regards, >> Simon >> >>> >>> Regards >>> Vignesh >>> >>>> Regards, >>>> Simon >>>> >>>> >>>>> --- >>>>> drivers/spi/cadence_qspi.c | 40 ++++++++++++---------- >>>>> drivers/spi/cadence_qspi.h | 19 ++++++----- >>>>> drivers/spi/cadence_qspi_apb.c | 61 +++++++++++++++++++++++++++++----- >>>>> 3 files changed, 87 insertions(+), 33 deletions(-) >>>>> >>>>> diff --git a/drivers/spi/cadence_qspi.c b/drivers/spi/cadence_qspi.c >>>>> index 673a2e9a6c4c..6c98cbf39ae4 100644 >>>>> --- a/drivers/spi/cadence_qspi.c >>>>> +++ b/drivers/spi/cadence_qspi.c >>>>> @@ -12,12 +12,13 @@ >>>>> #include <spi.h> >>>>> #include <spi-mem.h> >>>>> #include <linux/errno.h> >>>>> +#include <linux/sizes.h> >>>>> #include "cadence_qspi.h" >>>>> >>>>> #define CQSPI_STIG_READ 0 >>>>> #define CQSPI_STIG_WRITE 1 >>>>> -#define CQSPI_INDIRECT_READ 2 >>>>> -#define CQSPI_INDIRECT_WRITE 3 >>>>> +#define CQSPI_READ 2 >>>>> +#define CQSPI_WRITE 3 >>>>> >>>>> static int cadence_spi_write_speed(struct udevice *bus, uint hz) >>>>> { >>>>> @@ -189,6 +190,7 @@ static int cadence_spi_remove(struct udevice *dev) >>>>> >>>>> static int cadence_spi_set_mode(struct udevice *bus, uint mode) >>>>> { >>>>> + struct cadence_spi_platdata *plat = bus->platdata; >>>>> struct cadence_spi_priv *priv = dev_get_priv(bus); >>>>> >>>>> /* Disable QSPI */ >>>>> @@ -197,6 +199,10 @@ static int cadence_spi_set_mode(struct udevice *bus, >>>>> uint mode) >>>>> /* Set SPI mode */ >>>>> cadence_qspi_apb_set_clk_mode(priv->regbase, mode); >>>>> >>>>> + /* Enable Direct Access Controller */ >>>>> + if (plat->use_dac_mode) >>>>> + cadence_qspi_apb_dac_mode_enable(priv->regbase); >>>>> + >>>>> /* Enable QSPI */ >>>>> cadence_qspi_apb_controller_enable(priv->regbase); >>>>> >>>>> @@ -221,12 +227,12 @@ static int cadence_spi_mem_exec_op(struct spi_slave >>>>> *spi, >>>>> if (!op->addr.nbytes) >>>>> mode = CQSPI_STIG_READ; >>>>> else >>>>> - mode = CQSPI_INDIRECT_READ; >>>>> + mode = CQSPI_READ; >>>>> } else { >>>>> if (!op->addr.nbytes || !op->data.buf.out) >>>>> mode = CQSPI_STIG_WRITE; >>>>> else >>>>> - mode = CQSPI_INDIRECT_WRITE; >>>>> + mode = CQSPI_WRITE; >>>>> } >>>>> >>>>> switch (mode) { >>>>> @@ -236,19 +242,15 @@ static int cadence_spi_mem_exec_op(struct spi_slave >>>>> *spi, >>>>> case CQSPI_STIG_WRITE: >>>>> err = cadence_qspi_apb_command_write(base, op); >>>>> break; >>>>> - case CQSPI_INDIRECT_READ: >>>>> - err = cadence_qspi_apb_indirect_read_setup(plat, op); >>>>> - if (!err) { >>>>> - err = cadence_qspi_apb_indirect_read_execute >>>>> - (plat, op->data.nbytes, op->data.buf.in); >>>>> - } >>>>> + case CQSPI_READ: >>>>> + err = cadence_qspi_apb_read_setup(plat, op); >>>>> + if (!err) >>>>> + err = cadence_qspi_apb_read_execute(plat, op); >>>>> break; >>>>> - case CQSPI_INDIRECT_WRITE: >>>>> - err = cadence_qspi_apb_indirect_write_setup(plat, op); >>>>> - if (!err) { >>>>> - err = cadence_qspi_apb_indirect_write_execute >>>>> - (plat, op->data.nbytes, op->data.buf.out); >>>>> - } >>>>> + case CQSPI_WRITE: >>>>> + err = cadence_qspi_apb_write_setup(plat, op); >>>>> + if (!err) >>>>> + err = cadence_qspi_apb_write_execute(plat, op); >>>>> break; >>>>> default: >>>>> err = -1; >>>>> @@ -264,13 +266,17 @@ static int cadence_spi_ofdata_to_platdata(struct >>>>> udevice *bus) >>>>> ofnode subnode; >>>>> >>>>> plat->regbase = (void *)devfdt_get_addr_index(bus, 0); >>>>> - plat->ahbbase = (void *)devfdt_get_addr_index(bus, 1); >>>>> + plat->ahbbase = (void *)devfdt_get_addr_size_index(bus, 1, >>>>> + &plat->ahbsize); >>>>> plat->is_decoded_cs = dev_read_bool(bus, "cdns,is-decoded-cs"); >>>>> plat->fifo_depth = dev_read_u32_default(bus, "cdns,fifo-depth", >>>>> 128); >>>>> plat->fifo_width = dev_read_u32_default(bus, "cdns,fifo-width", >>>>> 4); >>>>> plat->trigger_address = dev_read_u32_default(bus, >>>>> >>>>> "cdns,trigger-address", >>>>> 0); >>>>> + /* Use DAC mode only when MMIO window is at least 8M wide */ >>>>> + if (plat->ahbsize >= SZ_8M) >>>>> + plat->use_dac_mode = true; >>>>> >>>>> /* All other paramters are embedded in the child node */ >>>>> subnode = dev_read_first_subnode(bus); >>>>> diff --git a/drivers/spi/cadence_qspi.h b/drivers/spi/cadence_qspi.h >>>>> index e655b027d788..619f0bed8efd 100644 >>>>> --- a/drivers/spi/cadence_qspi.h >>>>> +++ b/drivers/spi/cadence_qspi.h >>>>> @@ -23,6 +23,8 @@ struct cadence_spi_platdata { >>>>> u32 fifo_depth; >>>>> u32 fifo_width; >>>>> u32 trigger_address; >>>>> + fdt_addr_t ahbsize; >>>>> + bool use_dac_mode; >>>>> >>>>> /* Flash parameters */ >>>>> u32 page_size; >>>>> @@ -52,20 +54,21 @@ struct cadence_spi_priv { >>>>> void cadence_qspi_apb_controller_init(struct cadence_spi_platdata *plat); >>>>> void cadence_qspi_apb_controller_enable(void *reg_base_addr); >>>>> void cadence_qspi_apb_controller_disable(void *reg_base_addr); >>>>> +void cadence_qspi_apb_dac_mode_enable(void *reg_base); >>>>> >>>>> int cadence_qspi_apb_command_read(void *reg_base_addr, >>>>> const struct spi_mem_op *op); >>>>> int cadence_qspi_apb_command_write(void *reg_base_addr, >>>>> const struct spi_mem_op *op); >>>>> >>>>> -int cadence_qspi_apb_indirect_read_setup(struct cadence_spi_platdata >>>>> *plat, >>>>> - const struct spi_mem_op *op); >>>>> -int cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata >>>>> *plat, >>>>> - unsigned int rxlen, u8 *rxbuf); >>>>> -int cadence_qspi_apb_indirect_write_setup(struct cadence_spi_platdata >>>>> *plat, >>>>> - const struct spi_mem_op *op); >>>>> -int cadence_qspi_apb_indirect_write_execute(struct cadence_spi_platdata >>>>> *plat, >>>>> - unsigned int txlen, const u8 *txbuf); >>>>> +int cadence_qspi_apb_read_setup(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op); >>>>> +int cadence_qspi_apb_read_execute(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op); >>>>> +int cadence_qspi_apb_write_setup(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op); >>>>> +int cadence_qspi_apb_write_execute(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op); >>>>> >>>>> void cadence_qspi_apb_chipselect(void *reg_base, >>>>> unsigned int chip_select, unsigned int decoder_enable); >>>>> diff --git a/drivers/spi/cadence_qspi_apb.c >>>>> b/drivers/spi/cadence_qspi_apb.c >>>>> index 8dd0495dfcf4..fd491f2c8104 100644 >>>>> --- a/drivers/spi/cadence_qspi_apb.c >>>>> +++ b/drivers/spi/cadence_qspi_apb.c >>>>> @@ -189,6 +189,15 @@ void cadence_qspi_apb_controller_disable(void >>>>> *reg_base) >>>>> writel(reg, reg_base + CQSPI_REG_CONFIG); >>>>> } >>>>> >>>>> +void cadence_qspi_apb_dac_mode_enable(void *reg_base) >>>>> +{ >>>>> + unsigned int reg; >>>>> + >>>>> + reg = readl(reg_base + CQSPI_REG_CONFIG); >>>>> + reg |= CQSPI_REG_CONFIG_DIRECT; >>>>> + writel(reg, reg_base + CQSPI_REG_CONFIG); >>>>> +} >>>>> + >>>>> /* Return 1 if idle, otherwise return 0 (busy). */ >>>>> static unsigned int cadence_qspi_wait_idle(void *reg_base) >>>>> { >>>>> @@ -512,8 +521,8 @@ int cadence_qspi_apb_command_write(void *reg_base, >>>>> const struct spi_mem_op *op) >>>>> } >>>>> >>>>> /* Opcode + Address (3/4 bytes) + dummy bytes (0-4 bytes) */ >>>>> -int cadence_qspi_apb_indirect_read_setup(struct cadence_spi_platdata >>>>> *plat, >>>>> - const struct spi_mem_op *op) >>>>> +int cadence_qspi_apb_read_setup(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op) >>>>> { >>>>> unsigned int reg; >>>>> unsigned int rd_reg; >>>>> @@ -577,8 +586,9 @@ static int cadence_qspi_wait_for_data(struct >>>>> cadence_spi_platdata *plat) >>>>> return -ETIMEDOUT; >>>>> } >>>>> >>>>> -int cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata >>>>> *plat, >>>>> - unsigned int n_rx, u8 *rxbuf) >>>>> +static int >>>>> +cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata *plat, >>>>> + unsigned int n_rx, u8 *rxbuf) >>>>> { >>>>> unsigned int remaining = n_rx; >>>>> unsigned int bytes_to_read = 0; >>>>> @@ -639,9 +649,26 @@ failrd: >>>>> return ret; >>>>> } >>>>> >>>>> +int cadence_qspi_apb_read_execute(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op) >>>>> +{ >>>>> + u32 from = op->addr.val; >>>>> + void *buf = op->data.buf.in; >>>>> + size_t len = op->data.nbytes; >>>>> + >>>>> + if (plat->use_dac_mode && (from + len < plat->ahbsize)) { >>>>> + memcpy_fromio(buf, plat->ahbbase + from, len); >>>>> + if (!cadence_qspi_wait_idle(plat->regbase)) >>>>> + return -EIO; >>>>> + return 0; >>>>> + } >>>>> + >>>>> + return cadence_qspi_apb_indirect_read_execute(plat, len, buf); >>>>> +} >>>>> + >>>>> /* Opcode + Address (3/4 bytes) */ >>>>> -int cadence_qspi_apb_indirect_write_setup(struct cadence_spi_platdata >>>>> *plat, >>>>> - const struct spi_mem_op *op) >>>>> +int cadence_qspi_apb_write_setup(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op) >>>>> { >>>>> unsigned int reg; >>>>> >>>>> @@ -662,8 +689,9 @@ int cadence_qspi_apb_indirect_write_setup(struct >>>>> cadence_spi_platdata *plat, >>>>> return 0; >>>>> } >>>>> >>>>> -int cadence_qspi_apb_indirect_write_execute(struct cadence_spi_platdata >>>>> *plat, >>>>> - unsigned int n_tx, const u8 *txbuf) >>>>> +static int >>>>> +cadence_qspi_apb_indirect_write_execute(struct cadence_spi_platdata >>>>> *plat, >>>>> + unsigned int n_tx, const u8 >>>>> *txbuf) >>>>> { >>>>> unsigned int page_size = plat->page_size; >>>>> unsigned int remaining = n_tx; >>>>> @@ -735,6 +763,23 @@ failwr: >>>>> return ret; >>>>> } >>>>> >>>>> +int cadence_qspi_apb_write_execute(struct cadence_spi_platdata *plat, >>>>> + const struct spi_mem_op *op) >>>>> +{ >>>>> + u32 to = op->addr.val; >>>>> + const void *buf = op->data.buf.out; >>>>> + size_t len = op->data.nbytes; >>>>> + >>>>> + if (plat->use_dac_mode && (to + len < plat->ahbsize)) { >>>>> + memcpy_toio(plat->ahbbase + to, buf, len); >>>>> + if (!cadence_qspi_wait_idle(plat->regbase)) >>>>> + return -EIO; >>>>> + return 0; >>>>> + } >>>>> + >>>>> + return cadence_qspi_apb_indirect_write_execute(plat, len, buf); >>>>> +} >>>>> + >>>>> void cadence_qspi_apb_enter_xip(void *reg_base, char xip_dummy) >>>>> { >>>>> unsigned int reg; >>>>> -- >>>>> 2.23.0 >>>>> >>> >>> -- >>> Regards >>> Vignesh -- Regards Vignesh _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot