On 11/09/2023 17:39, Mukesh Ojha wrote:
> 
> 
> On 9/11/2023 2:22 PM, Bagas Sanjaya wrote:
>> On Sun, Sep 10, 2023 at 01:46:01AM +0530, Mukesh Ojha wrote:
>>> Hi All,
>>>
>>> This is a continuation of the conversation that happened at v4:
>>>
>>> https://lore.kernel.org/lkml/632c5b97-4a91-c3e8-1e6c-33d6c4f64...@quicinc.com/
>>>
>>> https://lore.kernel.org/lkml/695133e6-105f-de2a-5559-555cea0a0...@quicinc.com/
>>>
>>> We submitted an abstract to LPC on this topic and also started a mail thread
>>> with other SoC vendors, but it did not get much traction.
>>>
>>> https://lore.kernel.org/lkml/0199db00-1b1d-0c63-58ff-03efae02c...@quicinc.com/
>>>
>>> We explored most of the possibilities present in the kernel to address this
>>> issue[1], but solutions like kdump/fadump do not seem safe/secure/performant
>>> from our perspective.
>>>
>>> Hence, with this series we tried to keep the minidump kernel driver simple
>>> and tied to the pstore frontends, so that it collects data from the currently
>>> available frontends like dmesg, ftrace and pmsg. Also, we will be working
>>> towards enhancing generic pstore to capture more debug data that is helpful
>>> for first-level debugging, which can benefit other pstore users as well as
>>> us as minidump users.
>>>
>>> One such proposal was made here:
>>> https://lore.kernel.org/lkml/1683561060-2197-1-git-send-email-quic_mo...@quicinc.com/
>>>
>>> Looking forward to your comments.
>>>
>>> Thanks,
>>> Mukesh
>>>
>>> [1]
>>> Minidump is a best-effort mechanism to collect useful and predefined data
>>> for first-level debugging on end user devices running on Qualcomm SoCs.
>>> It is built on the premise that a System on Chip (SoC), or a subsystem of
>>> the SoC, crashes due to a range of hardware and software bugs. Hence, the
>>> ability to collect accurate data is only best-effort. The data collected
>>> could be invalid or corrupted, data collection itself could fail, and so on.
>>>
>>> Qualcomm devices in engineering mode provide a mechanism for generating
>>> full system ramdumps for post mortem debugging. In some cases, however,
>>> it is not feasible to capture the entire content of RAM. The minidump
>>> mechanism provides the means for selecting which snippets should be
>>> included in the ramdump.
>>>
>>> The core of the SMEM based minidump feature is part of Qualcomm's boot
>>> firmware code. It initializes shared memory (SMEM), which is a part of
>>> DDR, and allocates a small section of SMEM to the minidump table, also
>>> called the global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...)
>>> has its own table of segments to be included in the minidump, and all
>>> get their reference from the G-ToC. Each segment/region has some details
>>> like name, physical address and size, and it could be scattered anywhere
>>> in the DDR.
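
The table layout described above can be modelled roughly as follows. This is
only an illustrative sketch in Python, not the firmware or driver layout: the
class names (Region, SubsystemToc, GlobalToc), the method names and the example
region names and addresses are all hypothetical; only the per-segment fields
(name, physical address, size) come from the description.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """One segment/region; the three fields mirror the cover letter."""
    name: str        # hypothetical example: "md_dmesg"
    phys_addr: int   # physical address, may be anywhere in DDR
    size: int        # size in bytes


@dataclass
class SubsystemToc:
    """Per-subsystem table of segments (for example APSS or ADSP)."""
    name: str
    regions: list = field(default_factory=list)

    def register(self, region: Region) -> None:
        # Assumption for this sketch: region names are unique per subsystem.
        if any(r.name == region.name for r in self.regions):
            raise ValueError(f"region {region.name!r} already registered")
        self.regions.append(region)

    def total_size(self) -> int:
        return sum(r.size for r in self.regions)


@dataclass
class GlobalToc:
    """G-ToC: every subsystem table is reached through this object."""
    subsystems: dict = field(default_factory=dict)

    def subsystem(self, name: str) -> SubsystemToc:
        # Return the existing table, or create an empty one on first use.
        return self.subsystems.setdefault(name, SubsystemToc(name))


gtoc = GlobalToc()
apss = gtoc.subsystem("APSS")
apss.register(Region("md_dmesg", 0x8000_0000, 0x1000))   # made-up address
apss.register(Region("md_ftrace", 0x9000_0000, 0x2000))  # made-up address
print(apss.total_size())
```

The point of the model is only that regions are small, named, and
non-contiguous, and that each subsystem table is found through the G-ToC.
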
>>>
>>> The existing upstream Qualcomm remoteproc driver[1] already supports the SMEM
>>> based minidump feature for remoteproc instances like ADSP, MODEM, ...
>>> where predefined selective segments of the subsystem region can be dumped
>>> as part of coredump collection, which generates smaller artifacts
>>> compared to a complete coredump of the subsystem on crash.
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>>>
>>> In addition to managing and querying the APSS minidump description,
>>> the Linux driver maintains an ELF header in a segment. This segment
>>> gets updated with a section/program header whenever a new entry gets
>>> registered.
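
To make the ELF bookkeeping above concrete, here is a rough Python sketch of an
ELF64 header that gains one program header per registered region. It is an
assumption-laden illustration, not the driver's code: the class and method
names are hypothetical, only program headers (no section headers) are emitted,
and the choices of ET_CORE, EM_AARCH64 and PT_LOAD are guesses for the sake of
the example. The struct layouts themselves follow the standard ELF64 format.

```python
import struct

# Standard little-endian ELF64 layouts: 64-byte file header, 56-byte phdr.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
PHDR = struct.Struct("<IIQQQQQQ")

ET_CORE = 4        # e_type: core file (assumed for this sketch)
EM_AARCH64 = 183   # e_machine: AArch64 (assumed for this sketch)
PT_LOAD = 1        # p_type: loadable segment
PF_R = 4           # p_flags: readable


class ElfSegment:
    """Hypothetical model of the segment holding the ELF header."""

    def __init__(self):
        self.regions = []  # list of (phys_addr, size) tuples

    def register(self, phys_addr, size):
        # Each new entry adds one program header on the next build().
        self.regions.append((phys_addr, size))

    def build(self):
        # ELF magic, ELFCLASS64, little-endian, version 1; "16s" zero-pads.
        ident = b"\x7fELF" + bytes([2, 1, 1])
        phnum = len(self.regions)
        ehdr = EHDR.pack(ident, ET_CORE, EM_AARCH64, 1,
                         0,            # e_entry
                         EHDR.size,    # e_phoff: phdrs follow the header
                         0,            # e_shoff: no section headers here
                         0,            # e_flags
                         EHDR.size, PHDR.size, phnum,
                         0, 0, 0)      # e_shentsize, e_shnum, e_shstrndx
        parts = [ehdr]
        offset = EHDR.size + PHDR.size * phnum  # where region data would start
        for phys_addr, size in self.regions:
            parts.append(PHDR.pack(PT_LOAD, PF_R, offset,
                                   0,          # p_vaddr (unknown here)
                                   phys_addr,  # p_paddr
                                   size, size, # p_filesz, p_memsz
                                   0))         # p_align
            offset += size
        return b"".join(parts)


seg = ElfSegment()
seg.register(0x8000_0000, 0x1000)  # made-up address
seg.register(0x9000_0000, 0x2000)  # made-up address
blob = seg.build()
print(len(blob))
```

Rebuilding the whole header on every registration keeps e_phnum and the file
offsets consistent, which is the property the paragraph above relies on.
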
>>>
>>> Changes in v5:
>>>   - On a suggestion from Pavan.k to have a single function call for minidump
>>>     collection from the remoteproc driver, separated the logic into its own
>>>     minidump file called qcom_rproc_minidump.c and renamed the function from
>>>     qcom_minidump() to qcom_rproc_minidump(); however, dropped his suggestion
>>>     about reworking lazy deletion during region unregister in this series,
>>>     will pursue it in the next series.
>>>
>>>   - To simplify the minidump driver, removed the complexity of a frontend and
>>>     different backends (from Greg's suggestion); will pursue this once the
>>>     main driver gets mainlined.
>>>
>>>   - Moved the dynamic ramoops region allocation from the Device tree approach
>>>     to a command line approach, introducing command line parsing and memblock
>>>     reservation during early boot; documentation for it is not added yet,
>>>     will add it if this gets a positive response.
>>>
>>>   - Exported the linux banner from the kernel so that minidump can also be
>>>     built as a module; however, minidump is a debug module and should be built
>>>     into the kernel to get the most debug information from the kernel.
>>>
>>>   - Tried to address comments given on the dload patch series.
>>>
>>> Changes in v4:
>>> https://lore.kernel.org/lkml/1687955688-20809-1-git-send-email-quic_mo...@quicinc.com/
>>>   - Redesigned the driver and divided it into a front end and a backend (smem)
>>>     so that any new backend can be attached easily, avoiding code duplication.
>>>   - Reordered patches per driver and subsystem for easier review of the code.
>>>   - Moved minidump specific code from remoteproc to the minidump smem based
>>>     driver.
>>>   - Enabled all the drivers as modules.
>>>   - Addressed comments made on the documentation, yaml and Device tree files.
>>>     [Krzysztof/Konrad]
>>>   - Addressed comments made on the qcom_pstore_minidump driver and gave its
>>>     Device tree the same set of properties as ramoops. [Luca/Kees]
>>>   - Added a patch for the MAINTAINERS file.
>>>   - Included the defconfig change as one patch as per [Krzysztof]'s suggestion.
>>>   - Tried to remove the redundant file scope variables from the module as per
>>>     [Krzysztof]'s suggestion.
>>>   - Addressed comments made on v6 of the dload mode patch series:
>>>     https://lore.kernel.org/lkml/1680076012-10785-1-git-send-email-quic_mo...@quicinc.com/
>>>
>>> Changes in v3:
>>> https://lore.kernel.org/lkml/1683133352-10046-1-git-send-email-quic_mo...@quicinc.com/
>>>   - Addressed most of the comments by Srini on v2 and refactored the minidump
>>>     driver.
>>>      - Added platform device support.
>>>      - Added unregister region support.
>>>   - Added update region for clients.
>>>   - Added pending region support.
>>>   - Modified the documentation guide accordingly.
>>>   - Added the qcom_pstore_ramdump client driver, which adds a ramoops platform
>>>     device and also registers the ramoops region with minidump.
>>>   - Added the download mode patch series to this minidump series:
>>>     https://lore.kernel.org/lkml/1680076012-10785-1-git-send-email-quic_mo...@quicinc.com/
>>>
>>> Changes in v2:
>>> https://lore.kernel.org/lkml/1679491817-2498-1-git-send-email-quic_mo...@quicinc.com/
>>>   - Addressed the review comment made by [quic_tsoni/bmasney] to add
>>>     documentation.
>>>   - Addressed comments made by [srinivas.kandagatla].
>>>   - Dropped pstore 6/6 from the last series until I reach a conclusion on
>>>     getting the pstore region into minidump.
>>>   - Fixed the issue reported by the kernel test robot.
>>>
>>> Changes in v1:
>>> https://lore.kernel.org/lkml/1676978713-7394-1-git-send-email-quic_mo...@quicinc.com/
>>>
>>> Testing of the patches has been done on the sm8450 target after enabling
>>> configs like CONFIG_PSTORE_RAM and CONFIG_PSTORE_CONSOLE. Once the device
>>> boots up, run:
>>>
>>>   echo mini > /sys/module/qcom_scm/parameters/download_mode
>>>
>>> Try crashing it via devmem2 0xf11c000 (this is known to create an xpu
>>> violation and put the device in download mode) on the command prompt.
>>>
>>> The default storage type is set to USB, so the minidump would be downloaded
>>> with the help of an x86_64 machine (running the PCAT tool) attached to a
>>> Qualcomm device which has minidump boot firmware support.
>>>
>>> This will make the device go to download mode and collect the minidump onto
>>> the attached x86 machine running the Qualcomm PCAT tool (this comes as part
>>> of the Qualcomm package manager kit).
>>>
>>> After that we will see a bunch of predefined registered regions as binary
>>> blob files, with names starting with md_*, downloaded from the target device
>>> to the x86 machine at the location given in the PCAT tool. More about this
>>> can be found in the Qualcomm minidump guide patch.
>>>
>>
>> I tried to apply this series on top of 535a265d7f0dd50 (as suggested by
>> `b4 am -l -g`), but it conflicts on patch [04/17]. Please specify the
>> exact base commit or the series on which this series is based.
> 
> Apologies!
> I just realized it was 6.5-rc7, but let me rebase the series.
> 
> Sorry; for all the reviews done so far, I will definitely take care of them
> or reply.
> 

OK, see you in v6!

-- 
An old man doll... just what I always wanted! - Clara
