Hi community,

To follow up on the previous mail, I'd like to give you a more detailed
update on how we're addressing the histogram data handling issue in the
*chart2* module, which Tomaž and Regina pointed out.

   - The core problem lies in how *histograms* handle *X-axis* values. The
   existing *chart2* architecture *assumes a direct mapping between source
   data and chart X-axis values*.
   - However, histograms calculate X-axis values (bin-width) based on the
   input data, leading to a situation where these calculated values overwrite
   the original X-values and create issues in many places in the code.
   - We can get the render of the chart while creating it but when we tried
   to save(for example) the original values were gone by this time hence the
   problem.


Tomaž also said that he would like to introduce a variable
*XValuesCalculated* which will resolve most of the problem.

Further Analysis -

   -     *Original X-Values:* The original X-values are correctly stored in
   XLabeledDataSequence.
   -     *Histogram Calculations:* The histogram calculations occur within
   the *HistogramDataInterpreter* and then the calculated values overwrite
   the XLabeledDataSequence x values.
   -     *Need for Separation:* We need to store and display the
   histogram-calculated X-values while preserving the original data as many
   parts of the code depend on original values.




Analysis of *HistogramChartType.cxx*

*Purpose:* This file defines the HistogramChartType class, that is
responsible for setting up a histogram chart type. It manages properties
specific to histograms,
like bin width and range. It also implements the logic to create the
data-series and do the calculations.

*createCalculatedDataSeries()* Method:

   - This method retrieves the raw data through the *XLabeledDataSequence*
   from *m_aDataSeries*. This is where the logic modifies the source data
   (the sequence returned from getValues()).
   - It then uses *HistogramCalculator* to perform calculations and
   generate the data.
   - Finally, it creates a new *HistogramDataSequence* to store calculated
   values and then adds that to m_aDataSeries.


*Interaction with **XLabeledDataSequence*
*:*

   -         It gets the original data using the *getDataSequences2()*
   method of *DataSeries*.
   -         It gets the original x values by using *getValues*() method of
   the *XLabeledDataSequence* which returns an XDataSequence and then uses
   the *getData() *method of the XDataSequence to retrieve the x values in
   a sequence of uno::Any.



Analysis of *HistogramCalculator.cxx*

*Purpose*: This class performs the core histogram calculations using the
raw data that is provided to *computeBinFrequencyHistogram()* method.

*Bin Calculation Logic:*

   -         The class calculates the number of bins, bin width, and bin
   ranges based on the input data.
   -         It computes bin ranges and their frequencies.
   -         The calculated values are stored internally in *maBinRanges*
   and *maBinFrequencies*.




Understanding* HistogramDataSequence.cxx*

*Data Storage:*

   -         The primary data storage is the *mxValues* member, which is a
   *uno::Sequence<double>*. This sequence stores the calculated bin values
   for the histogram.
   -         There is also *mxLabels* which stores labels but we are not
   concerned with it.


*Interface Implementations:*

   -         *HistogramDataSequence* implements several UNO interfaces
   including *XNumericalDataSequence, XTextualDataSequence, XDataSequence*.
   -         *getNumericalData()*: Returns the *mxValues* as a
   *uno::Sequence<double>*.
   -         *getTextualData()*: Returns an empty *uno::Sequence<OUString>*.
   -        * getData()*: Returns the *mxValues* as a
   *uno::Sequence<uno::Any>*.




Proposed Solution:
To address these issues, this is how I am thinking of making the changes:

   * *Add **XValuesCalculated Property(As said by Tomaž)* :

   -         Add an *[optional, attribute] sequence<double> *
   *XValuesCalculated**;* to the *DataSeries* service definition (
   *offapi/com/sun/star/chart2/DataSeries.idl*).
   -         Implement the corresponding storage, getter, and setter
   methods in *DataSeries.hxx* and *DataSeries.cxx*. (Add storage for
   *m_aXValuesCalculated*)
   -         Add the corresponding property support in *OPropertySet.hxx*
   and *OPropertySet.cxx*?

       *Why add it to DataSeries? *-> If my understanding is correct,
*XValuesCalculated* is a property associated with a DataSeries and is not a
new data structure by itself, so we do not need to create a new IDL file
for it.
*      No Boolean Flag Needed:* -> we do not need an explicit boolean flag.

       I am using *.hasElements() *and *w**hy Length Check is Sufficient*:

   -     *Presence = Calculated Values*: The presence of elements within
   the *XValuesCalculated* sequence is sufficient to determine if
   calculated X-values exist.
   -     *Empty Sequence = No Calculated Values*: An empty sequence, which
   tells us that no values have been calculated. This is very convenient
   because UNO Sequences can be empty.
   -     *No Redundancy:* Using a separate boolean flag would add
   redundancy.



   *
* Modify Histogram Logic:*

   -         Update HistogramDataInterpreter.cxx to perform calculations
   and store calculated bin values in the *XValuesCalculated* property of
   the corresponding DataSeries object. This will not touch the original
   values in the *XLabeledDataSequence*.
   -         And the Histogram View utilises the BarChart view for the
   final rendering.


   *
* Modify Rendering Logic:*

   -         Modify chart renderers (such as VSeriesPlotter.cxx ?) to
   prioritize the *XValuesCalculated* property when available for plotting.
   When the *XValuesCalculated* is not available then it will use the
   original values of *XLabeledDataSequence*.


   *
*Adjust Axis Scaling:*

   -         Modify the code to use the *XValuesCalculated* property, when
   available, for axis scaling range calculation.


   * *Preserve Original Data in UI:*

   -         Ensure that UI components always display original data fetched
   from *XLabeledDataSequence* with the role of *"values-x".*
   -         Since users expect the UI tables to reflect the underlying
   data they have provided. If we show calculated bin values in tables/cells,
   it will be confusing and wrong.


   *
*Comprehensive Testing:*

   -         Create tests to ensure that the changes work and do not cause
   a regression to ensure proper functionality and maintain backward
   compatibility.
   -         We will also make sure the undo and redo are working properly.


   *
* Documentation:*

   -         Document all the changes to reflect the new property and its
   usage.



Key Benefits:

   1.    * Correct Histogram Display:* Histograms will correctly use
   calculated bin values for the X-axis.
   2.     *Preserved Original Data:* Original source data will be
   maintained for other charts and user interactions.
   3.     *Minimal Impact:* This change does not affect other chart types,
   it ensures backward compatibility, and it only impacts histogram and
   renderer-specific logic.
   4.     *Clear Separation:* Separates the concern of raw data and
   calculated data.
   5.    * Improved Architecture: *Addresses the underlying issue of
   assumptions embedded in the chart(*chart2*) module.





*Is the below file related?*
chart2/source/controller/dialogs/DataBrowser.cxx
<https://github.com/LibreOffice/core/blob/b0a4afc58e1c9434e56ddb96c41f4ebe5985ed0a/chart2/source/controller/dialogs/DataBrowser.cxx#L4>



Do we also have to modify *getValues()* in *XLabeledDataSequence* to return
calculated values or original values based on the presence of
*XValuesCalculated* property and chart type?

If other places need changes or I if have still missed files (please let me
know)
and the changes in the *offapi/com/sun/star/chart2/DataSeries.idl* are
correct?

On Thu, 23 Jan 2025 at 18:59, Regina Henschel (via Code Review) <
ger...@gerrit.libreoffice.org> wrote:

> Attention is currently required from: Devansh Varshney, Tomaž Vajngerl.
>
> Regina Henschel has posted comments on this change by Devansh Varshney. (
> https://gerrit.libreoffice.org/c/core/+/177364?usp=email )
>
> Change subject: xmloff: Histogram Chart ODF import and export support
> ......................................................................
>
>
> Patch Set 26:
>
> (2 comments)
>
> Patchset:
>
> PS26:
> There is a deeper problem. You have set the calculated binFrequencies and
> binRanges in place of the original data. That has give you a convenient way
> for getting a rendering as bar chart. Not only does this cause bugs
> tdf#164109 and tdf#162240, but now you can't export the <chart:series>
> element the same way as it is done for other chart types. I'm referring to
> the way that can be seen after comment '// export dataseries for current
> chart-type' in SchXMLExport.cxx.
>
> Currently in your patch, the <chart:series> element is not exported at
> all. When it will be exported, the new <loext:histogram-configuration>
> element has to become a child element after all the existing child
> elements. If the export looks similar to the existing chart types, it would
> be just before the comment '// close series'.
>
> I suggest to discuss the underlying problem with Tomaž Vajngerl. I don't
> see an easy solution and would have to try a lot. Until this is resolved,
> it doesn't make sense to make progress on the export.
>
>
> File schema/libreoffice/OpenDocument-v1.4+libreoffice-schema.rng:
>
>
> https://gerrit.libreoffice.org/c/core/+/177364/comment/2e8b89d8_b939b3f4?usp=email
> :
> PS26, Line 2149:         <rng:optional>
>                :           <rng:ref name="loext-histogram-configuration"/>
>                :         </rng:optional>
> > I mean the `loext:histogram-configuration` is placed under
> `chart:series` is this the correct way? […]
> The position is correct.
>
>
>
> --
> To view, visit https://gerrit.libreoffice.org/c/core/+/177364?usp=email
> To unsubscribe, or for help writing mail filters, visit
> https://gerrit.libreoffice.org/settings?usp=email
>
> Gerrit-MessageType: comment
> Gerrit-Project: core
> Gerrit-Branch: master
> Gerrit-Change-Id: Ibda198c1a3e4a7b53e3cf590c33cc9adb558be74
> Gerrit-Change-Number: 177364
> Gerrit-PatchSet: 26
> Gerrit-Owner: Devansh Varshney <varshney.devansh...@gmail.com>
> Gerrit-Reviewer: Jenkins
> Gerrit-Reviewer: Tomaž Vajngerl <qui...@gmail.com>
> Gerrit-CC: Regina Henschel <rb.hensc...@t-online.de>
> Gerrit-Attention: Devansh Varshney <varshney.devansh...@gmail.com>
> Gerrit-Attention: Tomaž Vajngerl <qui...@gmail.com>
> Gerrit-Comment-Date: Thu, 23 Jan 2025 13:29:36 +0000
> Gerrit-HasComments: Yes
> Gerrit-Has-Labels: No
> Comment-In-Reply-To: Devansh Varshney <varshney.devansh...@gmail.com>
>


-- 
*Regards,*
*Devansh*

Reply via email to