Apache OpenOffice (AOO) Bugzilla – Issue 57401
COUNTIF in Array Formula counts 0's (zeros) when using non-numeric characters
Last modified: 2013-08-07 15:15:35 UTC
In Excel, entering the array formula {=COUNTIF(<range>,<range>)} in a cell produces a 1 if there are no duplicate entries in <range>, and some value greater than 1 if there are duplicates. In Calc 2.0, the array formula {=COUNTIF(<range>,<range>)} cannot be entered into a single cell within Calc; it is intead spread over the number of cells in <range>. To reproduce, for example, putting [Q,R,Q,T,U] in cells A1 through A5, and the array formula {=countif(A1:A5;A1:A5)} in B1, produces: A B ------------ 1 | Q 0 2 | R 0 3 | Q 0 4 | T 0 5 | U 0 Whereas in Excel, it produces: A B ------------ 1 | Q 2 2 | R 3 | Q 4 | T 5 | U Note that if the Excel spreadsheet is created first and saved in Excel, OO 2.0 will open the spreadsheet with the array formula properly contained only in cell B1, like this, even though its computation is incorrect: A B ------------ 1 | Q 0 2 | R 3 | Q 4 | T 5 | U In my example above, the data in <range> is non-numeric. Interestingly, if the data in <range> is numeric, then a spreadsheet created in Excel with this COUNTIF formula will open and compute properly in oo 2.0.
This bug also appears to be a regression from Staroffice 7. Created an attachment to document the problem: * Opening file 57401.xls in Excel or SO 7 gives an answer of 2 in cell B1 * Opening the same file in oo2.0 gives an answer of 0 in cell B1 * Also in oo2.0, there's this possibly related problem: - click on cell B1 - click at the end of the formula in the formula bar - make a change to the formula such as deleting the right parenthesis, and reinserting the right parenthesis - hit control-shift enter to complete the array formula - Result incorrectly spans 5 cells instead of the single cell
Created attachment 31419 [details] File demonstrating array formula incompatibility/bug
Correction 1: In Excel 2003, entering {=COUNTIF(<range>,<range>)} into a cell is treated as text. You must enter =COUNTIF(<range>,<range>), then press CTRL+SHIFT+ENTER to have the formula entered as an array formula. Also in Excel, in order to have the array formula spread over 5 cells (in other words, to actually have it be an array) you must select all five cells first, then enter the formula, then hit CTRL+SHIFT+ENTER Correction 2: As far as I can tell, the array formula is supposed to span five cells when you enter it. This is normal and proper behavior, for two reasons. First, it is counterintuitive to have to select the cells THEN enter the array formula, as Excel does it. Instead, OOo does the right thing: autodetects when a formula is set up to be an array formula and fills it down accordingly. Which brings me to reason number 2: OOo autodetects that you are using COUNTIF as an array formula because you entered a range in the second argument to COUNTIF. COUNTIF normally takes =COUNTIF(<range>,<criteria>), but you fed it {=COUNTIF(<range>,<range>)}. Thus, it sees the array formula brackets and sees a range instead of a criteria, and correctly autofills the entire 5 cell range without you needing to do it manually. Thus, what Excel does is actually improper IMO; it makes no sense to have an "array" formula in one cell only. Nevertheless, I agree with you that there is a bug in the calculation of the array formula when using non-numeric cell contents. Here's what I get when using the formula {=COUNTIF(A1:A5;A1:A5)} for three different data sets: A B ---------------- 1 | Q 0 2 | R 0 3 | Q 0 4 | T 0 5 | U 0 A B ---------------- 1 | 1 2 2 | 2 1 3 | 1 2 4 | 3 1 5 | 4 1 A B ---------------- 1 | @ 0 2 | # 0 3 | $ 0 4 | % 0 5 | ^ 0 The formula only works properly with the numeric data set. Confirming (Windows XP SP2, OOo 2.0) Changing summary to reflect actually bug.
Hi Eike, please have a look at this one. Frank
Accepted for the non-numeric part. Correction 3: it's not really a regression. It is true that when opening the attached file in OOo1.x/SO7 the result is 2 instead of 0, but this is only the result of the topleft corner of the array formula. If entered in a range, _all_ cells displayed the result 2, because the array context wasn't supported at all. The topleft case here was just a special form, and is equivalent to =COUNTIF(A1:A5;A1), which may be repeated in subsequent rows as in =COUNTIF(A1:A5;A2), =COUNTIF(A1:A5;A3), .... to obtain all values.
This seems easy to fix (Countif must handle svMatrix the same way Sumif already does). We should change it for 2.x.
In CWS dr51 sc/source/core/tool/interpr1.cxx 1.41.6.3
Reassigning to QA.
found fixed on cws dr51 using Linux, Solaris and Windows build
found fixed on master OOFm6 using Linux, Solaris and Windows build