https://bz.apache.org/bugzilla/show_bug.cgi?id=60848

Jakub <jakub.vojti...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub.vojti...@gmail.com

--- Comment #14 from Jakub <jakub.vojti...@gmail.com> ---
Created attachment 39124
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=39124&action=edit
Test cases for intersection, sumproduct and index.

I've been able to reproduce the bug, looked at the code and played a bit with
similar related cases. The problem is not in the SUMPRODUCT function nor in the
unary minus. It is about implementation of implicit intersections in POI. Here
is some introduction to the intersections:
https://www.ablebits.com/office-addins-blog/excel-implicit-intersection/

The logic when implicit intersection happens and when not is quite complex.
Here's what I've found out using Excel 2010:
- In a "normal mode" common arithmetic operations causes implicit
intersections. For example assume there are values 1 to 5 in cells A1 to A5.
Then the formula "=2*A1:A5" evaluated in the B2 cell returns the value of A2
multiplied by 2, i.e. 4.
- If a formula is saved using Ctrl+Shift+Enter it is stored as array formula
(aka CSE formula, in excel it is wrapped in curly brackets). Implicit
intersection is suppressed in CSE formulas. Let's call it an "array mode". The
whole CSE formula is evaluated in array mode.
- Some functions changes the mode in which their arguments are evaluated, in
particular:
-- SUMPRODUCT will force the array mode on all its arguments. In above example
the formula "=SUM(2*A1:A5)" in the B2 cell returns 4 (implicit intersection
happens) but "=SUMPRODUCT(2*A1:A5)" returns 30 as it doesn't intersects and
sums the range multiplied by 2.
-- INDEX function will force the array mode on it's first argument.
Interestingly it also switches the evaluation of its second and third argument
to the normal mode. That is true if INDEX is used inside SUMPRODUCT. If INDEX
is used in a CSE formula, no switching happens. Implicit intersection is
suppressed on all evaluations in CSE formula.
- Some functions which also accept array arguments just keep the mode as it
was, for example functions MIN or MAX functions. 

Now, POI seems to work well in the array mode (in CSE formulas) as well as in
the normal mode. What is not implemented is switching of the mode by certain
functions (SUMPRODUCT in this case). Note there is a piece of code attempting
to do this. The org.apache.poi.ss.formula.functions.ArrayMode interface is
clearly intended for that. See where it is used in the
org.apache.poi.ss.formula.WorkbookEvaluator. However doesn't implement the
logic above correctly. Also the INDEX is the only function implementing the
ArrayMode interface.

This is what I've found out so far. It is demonstrated in the attached
workbook. There is a column in the workbook showing where POI would give wrong
results. It is not just the SUMPRODUCT, INDEX function alone also can give
errors.

If anyone thinks it works differently or knows about some formal specification
or documentation please comment.

I have some ideas of how to fix it. Seems to resolve the bugs in the attached
workbook however there is lot more to investigate and test...

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org

Reply via email to