On 17/09/2020 16:26, Stuart Owen wrote:
Hi,

comments below ...

On 17/09/2020 11:48, Andreas Reichel wrote:
Stuart,

maybe you could provide the simpliest possible test case, e. g.
creating your form with only 2 columns and 1 row, add the validation
and then:
I found when creating a sheet as simple as you suggest, POI wasn't able find the validations in the original either. Looking at the unpacked xml for the sheet, the validation is defined in a weird looking block:
<ext uri="{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}"
xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main";><x14:dataValidations count="1" xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main";><x14:dataValidation type="list" allowBlank="1" showInputMessage="1" showErrorMessage="1"><x14:formula1><xm:f>Sheet2!$A$1</xm:f></x14:formula1><xm:sqref>A1</xm:sqref></x14:dataValidation></x14:dataValidations></ext>
</extLst>
(I added the file to the github test case here: https://github.com/stuzart/poi-google-export-testcase/blob/master/src/test/resources/original-very-simple.xlsx)

1) extract the XML after your created the template
2) extract the XML after you have exporting it from Google spreadsheets
3) extract the XML after importing and re-exporting it from Libreoffice

Then compare the XML files and see what is actually different.

So, sticking with the original files I have unpacked the xlsx and taken a look at the xml differences. There are actually a lot of differences, too much to do a simple diff. However manually inspecting, I've noticed a difference in the dataValidation validation block for the sheet1.xml that looks suspicous. In the problematic exported case, it seems to be missing theĀ  count="1" attribute, whereas the other cases that work do have this.
i.e
original - https://github.com/stuzart/poi-google-export-testcase/blob/master/src/test/resources/original-unpacked/xl/worksheets/sheet1.xml#L17 exported - https://github.com/stuzart/poi-google-export-testcase/blob/master/src/test/resources/exported-unpacked/xl/worksheets/sheet1.xml#L1003 resaved - https://github.com/stuzart/poi-google-export-testcase/blob/master/src/test/resources/resaved-unpacked/xl/worksheets/sheet1.xml#L1007

I'm not familiar with the format enough to know if this is legal or not.

to add, I've tried validating the xml against its schema, and it appears to be legal.


Also I do not understand yet, why do you want to read the validation in
POI again? In my limited understanding, the Google Spreadsheet will
validate the user inputs already and assure, you will read a valid
file?
I work on a tool that is based on embedding OWL ontologies (semantic web stuff) into spreadsheets, allowing scientists (mostly biologists) to annotate data in a way that is user friendly and familiar to them, and hides the nasty stuff. We use a trick with data validation to map human readable labels, with their sementic URI identifier, and then be able to export this again. Using data validations allows us to do this without any additional macros or vb scripting. The exported annotations and data can then be fed into a database as a graph, and be queried and reasoned across. You can find out more at https://rightfield.org.uk/

Possible work-arounds in the meantime:

1) when opening the Google Spreadsheet file, read the Validations from
the original XLSX file in parallel and merge that information
(e.g. take the data from the Google Spreadsheet, but the structure and
definitions from the original XLSX).

2) alternatively, run it automated through Libreoffice first (it has a
command-line/shell only option)
Yes, option 2 is something I am thinking of as a fallback option.

many thanks,
Stuart

Best regards
Andreas

On Thu, 2020-09-17 at 11:28 +0100, Stuart Owen wrote:
Hi,
I've encountered an odd problem reading validations from an xlsx
spreadsheet uploaded to Google Drive, and then exported.To cut a long
story short, my particular use-case involves a pipeline of creating
an Excel spreadsheet, which is then shared and used via Google Sheets
to allow scientists to collaborate and annotate biological data
(using data validations), and is then exported and the annotations
read via Apache POI.
I have found that Apache POI can detect the validations in the
original spreadsheet, but not in the exported version. The
validations are read via XSSFSheet.getDataValidations(). The
validations however, can be read fine when opening in either Excel or
LibreOffice, and if resaved can then again be read by Apache POI.
To clarify things, I've put together a test case to demonstrate the
problem at https://github.com/stuzart/poi-google-export-testcase

I'm not sure if this is a problem with Google export, or Apache POI.
thanks,
Stuart
-------------------------------------------------------------------
--To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]





---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to