On 18/10/19 6:12 PM, Simon Goldschmidt wrote: > On Fri, Oct 18, 2019 at 2:40 PM Vignesh Raghavendra <vigne...@ti.com> wrote: >> >> Hi, >> >> On 18/10/19 2:34 PM, Simon Goldschmidt wrote: >>> On Thu, Oct 17, 2019 at 2:55 PM Simon Goldschmidt >>> <simon.k.r.goldschm...@gmail.com> wrote: >>>> >>>> On Thu, Oct 17, 2019 at 2:44 PM Vignesh Raghavendra <vigne...@ti.com> >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 17/10/19 5:09 PM, Simon Goldschmidt wrote: >>>>>> On Mon, Oct 14, 2019 at 3:27 PM Vignesh Raghavendra <vigne...@ti.com> >>>>>> wrote: >>>>>>> >>>>>>> Add support for Direct Access Controller mode of Cadence QSPI. This >>>>>>> allows MMIO access to SPI NOR flash providing better read performance. >>>>>>> >>>>>>> Signed-off-by: Vignesh R <vigne...@ti.com> >>>>>>> Signed-off-by: Vignesh Raghavendra <vigne...@ti.com> >>>>>> >>>>>> I've tested this on my socfpga cyclone5 board with an mt25ql256a >>>>>> (running at >>>>>> 50MHz) and it seems to work fine. >>>>>> >>>>>> However, I had the impression it was a bit slower, not faster, although I >>>>>> haven't measured, and running at 50MHz with 4 data lines, reading the >>>>>> whole >>>>>> flash takes about 1.5 seconds only, so without actually measureing it, >>>>>> it's >>>>>> hard to tell if this performance is really worth the 440 bytes of extra >>>>>> code >>>>>> it adds? >>>>>> >>>>> >>>>> I should have noted in the commit msg that direct mode is used only when >>>>> AHB window is at least 8MB. Working with smaller window sizes would mean >>>>> code would fall back to indirect mode most of the time. >>>>> I see that socfgpa don't seem to have large AHB windows and therefore >>>>> would not use direct access mode. >>>> >>>> Aha. Yes, socfpga gen5 has a 1MB window only, I think. So the speed hasn't >>>> changed. >>>> >>>>> >>>>> On TI platforms its possible to use DMA to read data from memory mapped >>>>> region using mem to mem DMA channels which improves throughput by 5 times. >>>> >>>> Hmm, I'm using a QSPI at 50 MHz and transferring 31 MByte takes about 1.5 >>>> seconds. The maximum speed of that setup would be 25 MByte/s without any >>>> overhead (1.24 seconds for 31 MB). There's no way I could achieve an >>>> improvement >>>> by 5 on my platform! >>>> >>>>> >>>>>> Note that I'm already short on RAM in SPL (socfpga gen5), so these 440 >>>>>> bytes >>>>>> do hurt. Note that patch 1/2 of this series reduced SPL code size by 320 >>>>>> bytes >>>>>> for me :-) >>>>>> >>>>> >>>>> Hmm, I read this as patch 1 brings down memory consumption by 320 bytes >>>>> and patch 2 adds 440 bytes (delta being +120 bytes for entire series). >>>>> I can see if size can be optimized, but would like to avoid #ifdef'ery >>>>> within the driver if possible. >>>> >>>> Well, I can only say I'm currently struggling to keep SPI-NOR and MMC >>>> enabled >>>> at the same time in socfpga_socrates (moving more code to DM). And even if >>>> it >>>> sounds like not much, 440 bytes *are* much for me at this point. >>>> >>>> I still would prefer having a Kconfig option for this that can be disabled >>>> for socfpga. >>> >>> Oh well, maybe just go ahead adding this and I'll try how it fits. We can >>> alway >>> add reduced functionality later by checking for CONFIG_SPL_SPI_FLASH_TINY, >>> whicht might be a better config option to guard this than something >>> cadence-qspi >>> specific. >>> >> >> Okay, thanks! Guarding with CONFIG_SPL_SPI_FLASH_TINY as and where >> required seems fair idea. BTW, it would be great if you could confirm >> Quad read works fine when spi-tx-bus-width = 1 and spi-rx-bus-width = 4 >> as I suggested in other thread? > > Yes, I'll do that when I find the time. Unfortunately, I can only do > that on the devices > at work, not at home.
Alright, thanks! > I also would like to check the memory-mapped code works on > socfpga by removing the 8MB size check, does that make sense? > Well, in that case memory-mapped mode will be used for accessing data within first 1MB of flash. Rest of the data would be accessed via indirect mode. I think if AHB window is small, then probably SoC designers did not intend, controller to be used in DAC or memory-mapped mode. Regards Vignesh > Regards, > Simon > >> >> Regards >> Vignesh >> >>> Regards, >>> Simon >>> >>>> >>>> Regards, >>>> Simon >>>> >>>>> >>>>> Regards >>>>> Vignesh >>>>> >>>>>> Regards, >>>>>> Simon >>>>>> >>>>>> >>>>>>> --- >>>>>>> drivers/spi/cadence_qspi.c | 40 ++++++++++++---------- >>>>>>> drivers/spi/cadence_qspi.h | 19 ++++++----- >>>>>>> drivers/spi/cadence_qspi_apb.c | 61 +++++++++++++++++++++++++++++----- >>>>>>> 3 files changed, 87 insertions(+), 33 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/spi/cadence_qspi.c b/drivers/spi/cadence_qspi.c >>>>>>> index 673a2e9a6c4c..6c98cbf39ae4 100644 >>>>>>> --- a/drivers/spi/cadence_qspi.c >>>>>>> +++ b/drivers/spi/cadence_qspi.c >>>>>>> @@ -12,12 +12,13 @@ >>>>>>> #include <spi.h> >>>>>>> #include <spi-mem.h> >>>>>>> #include <linux/errno.h> >>>>>>> +#include <linux/sizes.h> >>>>>>> #include "cadence_qspi.h" >>>>>>> >>>>>>> #define CQSPI_STIG_READ 0 >>>>>>> #define CQSPI_STIG_WRITE 1 >>>>>>> -#define CQSPI_INDIRECT_READ 2 >>>>>>> -#define CQSPI_INDIRECT_WRITE 3 >>>>>>> +#define CQSPI_READ 2 >>>>>>> +#define CQSPI_WRITE 3 >>>>>>> >>>>>>> static int cadence_spi_write_speed(struct udevice *bus, uint hz) >>>>>>> { >>>>>>> @@ -189,6 +190,7 @@ static int cadence_spi_remove(struct udevice *dev) >>>>>>> >>>>>>> static int cadence_spi_set_mode(struct udevice *bus, uint mode) >>>>>>> { >>>>>>> + struct cadence_spi_platdata *plat = bus->platdata; >>>>>>> struct cadence_spi_priv *priv = dev_get_priv(bus); >>>>>>> >>>>>>> /* Disable QSPI */ >>>>>>> @@ -197,6 +199,10 @@ static int cadence_spi_set_mode(struct udevice >>>>>>> *bus, uint mode) >>>>>>> /* Set SPI mode */ >>>>>>> cadence_qspi_apb_set_clk_mode(priv->regbase, mode); >>>>>>> >>>>>>> + /* Enable Direct Access Controller */ >>>>>>> + if (plat->use_dac_mode) >>>>>>> + cadence_qspi_apb_dac_mode_enable(priv->regbase); >>>>>>> + >>>>>>> /* Enable QSPI */ >>>>>>> cadence_qspi_apb_controller_enable(priv->regbase); >>>>>>> >>>>>>> @@ -221,12 +227,12 @@ static int cadence_spi_mem_exec_op(struct >>>>>>> spi_slave *spi, >>>>>>> if (!op->addr.nbytes) >>>>>>> mode = CQSPI_STIG_READ; >>>>>>> else >>>>>>> - mode = CQSPI_INDIRECT_READ; >>>>>>> + mode = CQSPI_READ; >>>>>>> } else { >>>>>>> if (!op->addr.nbytes || !op->data.buf.out) >>>>>>> mode = CQSPI_STIG_WRITE; >>>>>>> else >>>>>>> - mode = CQSPI_INDIRECT_WRITE; >>>>>>> + mode = CQSPI_WRITE; >>>>>>> } >>>>>>> >>>>>>> switch (mode) { >>>>>>> @@ -236,19 +242,15 @@ static int cadence_spi_mem_exec_op(struct >>>>>>> spi_slave *spi, >>>>>>> case CQSPI_STIG_WRITE: >>>>>>> err = cadence_qspi_apb_command_write(base, op); >>>>>>> break; >>>>>>> - case CQSPI_INDIRECT_READ: >>>>>>> - err = cadence_qspi_apb_indirect_read_setup(plat, op); >>>>>>> - if (!err) { >>>>>>> - err = cadence_qspi_apb_indirect_read_execute >>>>>>> - (plat, op->data.nbytes, >>>>>>> op->data.buf.in); >>>>>>> - } >>>>>>> + case CQSPI_READ: >>>>>>> + err = cadence_qspi_apb_read_setup(plat, op); >>>>>>> + if (!err) >>>>>>> + err = cadence_qspi_apb_read_execute(plat, op); >>>>>>> break; >>>>>>> - case CQSPI_INDIRECT_WRITE: >>>>>>> - err = cadence_qspi_apb_indirect_write_setup(plat, op); >>>>>>> - if (!err) { >>>>>>> - err = cadence_qspi_apb_indirect_write_execute >>>>>>> - (plat, op->data.nbytes, op->data.buf.out); >>>>>>> - } >>>>>>> + case CQSPI_WRITE: >>>>>>> + err = cadence_qspi_apb_write_setup(plat, op); >>>>>>> + if (!err) >>>>>>> + err = cadence_qspi_apb_write_execute(plat, op); >>>>>>> break; >>>>>>> default: >>>>>>> err = -1; >>>>>>> @@ -264,13 +266,17 @@ static int cadence_spi_ofdata_to_platdata(struct >>>>>>> udevice *bus) >>>>>>> ofnode subnode; >>>>>>> >>>>>>> plat->regbase = (void *)devfdt_get_addr_index(bus, 0); >>>>>>> - plat->ahbbase = (void *)devfdt_get_addr_index(bus, 1); >>>>>>> + plat->ahbbase = (void *)devfdt_get_addr_size_index(bus, 1, >>>>>>> + &plat->ahbsize); >>>>>>> plat->is_decoded_cs = dev_read_bool(bus, "cdns,is-decoded-cs"); >>>>>>> plat->fifo_depth = dev_read_u32_default(bus, "cdns,fifo-depth", >>>>>>> 128); >>>>>>> plat->fifo_width = dev_read_u32_default(bus, "cdns,fifo-width", >>>>>>> 4); >>>>>>> plat->trigger_address = dev_read_u32_default(bus, >>>>>>> >>>>>>> "cdns,trigger-address", >>>>>>> 0); >>>>>>> + /* Use DAC mode only when MMIO window is at least 8M wide */ >>>>>>> + if (plat->ahbsize >= SZ_8M) >>>>>>> + plat->use_dac_mode = true; >>>>>>> >>>>>>> /* All other paramters are embedded in the child node */ >>>>>>> subnode = dev_read_first_subnode(bus); >>>>>>> diff --git a/drivers/spi/cadence_qspi.h b/drivers/spi/cadence_qspi.h >>>>>>> index e655b027d788..619f0bed8efd 100644 >>>>>>> --- a/drivers/spi/cadence_qspi.h >>>>>>> +++ b/drivers/spi/cadence_qspi.h >>>>>>> @@ -23,6 +23,8 @@ struct cadence_spi_platdata { >>>>>>> u32 fifo_depth; >>>>>>> u32 fifo_width; >>>>>>> u32 trigger_address; >>>>>>> + fdt_addr_t ahbsize; >>>>>>> + bool use_dac_mode; >>>>>>> >>>>>>> /* Flash parameters */ >>>>>>> u32 page_size; >>>>>>> @@ -52,20 +54,21 @@ struct cadence_spi_priv { >>>>>>> void cadence_qspi_apb_controller_init(struct cadence_spi_platdata >>>>>>> *plat); >>>>>>> void cadence_qspi_apb_controller_enable(void *reg_base_addr); >>>>>>> void cadence_qspi_apb_controller_disable(void *reg_base_addr); >>>>>>> +void cadence_qspi_apb_dac_mode_enable(void *reg_base); >>>>>>> >>>>>>> int cadence_qspi_apb_command_read(void *reg_base_addr, >>>>>>> const struct spi_mem_op *op); >>>>>>> int cadence_qspi_apb_command_write(void *reg_base_addr, >>>>>>> const struct spi_mem_op *op); >>>>>>> >>>>>>> -int cadence_qspi_apb_indirect_read_setup(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - const struct spi_mem_op *op); >>>>>>> -int cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - unsigned int rxlen, u8 *rxbuf); >>>>>>> -int cadence_qspi_apb_indirect_write_setup(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - const struct spi_mem_op *op); >>>>>>> -int cadence_qspi_apb_indirect_write_execute(struct >>>>>>> cadence_spi_platdata *plat, >>>>>>> - unsigned int txlen, const u8 *txbuf); >>>>>>> +int cadence_qspi_apb_read_setup(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op); >>>>>>> +int cadence_qspi_apb_read_execute(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op); >>>>>>> +int cadence_qspi_apb_write_setup(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op); >>>>>>> +int cadence_qspi_apb_write_execute(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op); >>>>>>> >>>>>>> void cadence_qspi_apb_chipselect(void *reg_base, >>>>>>> unsigned int chip_select, unsigned int decoder_enable); >>>>>>> diff --git a/drivers/spi/cadence_qspi_apb.c >>>>>>> b/drivers/spi/cadence_qspi_apb.c >>>>>>> index 8dd0495dfcf4..fd491f2c8104 100644 >>>>>>> --- a/drivers/spi/cadence_qspi_apb.c >>>>>>> +++ b/drivers/spi/cadence_qspi_apb.c >>>>>>> @@ -189,6 +189,15 @@ void cadence_qspi_apb_controller_disable(void >>>>>>> *reg_base) >>>>>>> writel(reg, reg_base + CQSPI_REG_CONFIG); >>>>>>> } >>>>>>> >>>>>>> +void cadence_qspi_apb_dac_mode_enable(void *reg_base) >>>>>>> +{ >>>>>>> + unsigned int reg; >>>>>>> + >>>>>>> + reg = readl(reg_base + CQSPI_REG_CONFIG); >>>>>>> + reg |= CQSPI_REG_CONFIG_DIRECT; >>>>>>> + writel(reg, reg_base + CQSPI_REG_CONFIG); >>>>>>> +} >>>>>>> + >>>>>>> /* Return 1 if idle, otherwise return 0 (busy). */ >>>>>>> static unsigned int cadence_qspi_wait_idle(void *reg_base) >>>>>>> { >>>>>>> @@ -512,8 +521,8 @@ int cadence_qspi_apb_command_write(void *reg_base, >>>>>>> const struct spi_mem_op *op) >>>>>>> } >>>>>>> >>>>>>> /* Opcode + Address (3/4 bytes) + dummy bytes (0-4 bytes) */ >>>>>>> -int cadence_qspi_apb_indirect_read_setup(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - const struct spi_mem_op *op) >>>>>>> +int cadence_qspi_apb_read_setup(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op) >>>>>>> { >>>>>>> unsigned int reg; >>>>>>> unsigned int rd_reg; >>>>>>> @@ -577,8 +586,9 @@ static int cadence_qspi_wait_for_data(struct >>>>>>> cadence_spi_platdata *plat) >>>>>>> return -ETIMEDOUT; >>>>>>> } >>>>>>> >>>>>>> -int cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - unsigned int n_rx, u8 *rxbuf) >>>>>>> +static int >>>>>>> +cadence_qspi_apb_indirect_read_execute(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> + unsigned int n_rx, u8 *rxbuf) >>>>>>> { >>>>>>> unsigned int remaining = n_rx; >>>>>>> unsigned int bytes_to_read = 0; >>>>>>> @@ -639,9 +649,26 @@ failrd: >>>>>>> return ret; >>>>>>> } >>>>>>> >>>>>>> +int cadence_qspi_apb_read_execute(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op) >>>>>>> +{ >>>>>>> + u32 from = op->addr.val; >>>>>>> + void *buf = op->data.buf.in; >>>>>>> + size_t len = op->data.nbytes; >>>>>>> + >>>>>>> + if (plat->use_dac_mode && (from + len < plat->ahbsize)) { >>>>>>> + memcpy_fromio(buf, plat->ahbbase + from, len); >>>>>>> + if (!cadence_qspi_wait_idle(plat->regbase)) >>>>>>> + return -EIO; >>>>>>> + return 0; >>>>>>> + } >>>>>>> + >>>>>>> + return cadence_qspi_apb_indirect_read_execute(plat, len, buf); >>>>>>> +} >>>>>>> + >>>>>>> /* Opcode + Address (3/4 bytes) */ >>>>>>> -int cadence_qspi_apb_indirect_write_setup(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> - const struct spi_mem_op *op) >>>>>>> +int cadence_qspi_apb_write_setup(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op) >>>>>>> { >>>>>>> unsigned int reg; >>>>>>> >>>>>>> @@ -662,8 +689,9 @@ int cadence_qspi_apb_indirect_write_setup(struct >>>>>>> cadence_spi_platdata *plat, >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> -int cadence_qspi_apb_indirect_write_execute(struct >>>>>>> cadence_spi_platdata *plat, >>>>>>> - unsigned int n_tx, const u8 *txbuf) >>>>>>> +static int >>>>>>> +cadence_qspi_apb_indirect_write_execute(struct cadence_spi_platdata >>>>>>> *plat, >>>>>>> + unsigned int n_tx, const u8 >>>>>>> *txbuf) >>>>>>> { >>>>>>> unsigned int page_size = plat->page_size; >>>>>>> unsigned int remaining = n_tx; >>>>>>> @@ -735,6 +763,23 @@ failwr: >>>>>>> return ret; >>>>>>> } >>>>>>> >>>>>>> +int cadence_qspi_apb_write_execute(struct cadence_spi_platdata *plat, >>>>>>> + const struct spi_mem_op *op) >>>>>>> +{ >>>>>>> + u32 to = op->addr.val; >>>>>>> + const void *buf = op->data.buf.out; >>>>>>> + size_t len = op->data.nbytes; >>>>>>> + >>>>>>> + if (plat->use_dac_mode && (to + len < plat->ahbsize)) { >>>>>>> + memcpy_toio(plat->ahbbase + to, buf, len); >>>>>>> + if (!cadence_qspi_wait_idle(plat->regbase)) >>>>>>> + return -EIO; >>>>>>> + return 0; >>>>>>> + } >>>>>>> + >>>>>>> + return cadence_qspi_apb_indirect_write_execute(plat, len, buf); >>>>>>> +} >>>>>>> + >>>>>>> void cadence_qspi_apb_enter_xip(void *reg_base, char xip_dummy) >>>>>>> { >>>>>>> unsigned int reg; >>>>>>> -- >>>>>>> 2.23.0 >>>>>>> >>>>> >>>>> -- >>>>> Regards >>>>> Vignesh >> >> -- >> Regards >> Vignesh -- Regards Vignesh _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot