2010/10/25 Bob Jolliffe <bobjolli...@gmail.com> > 2010/10/25 Lars Helge Øverland <larshe...@gmail.com>: > > > > Hi thanks for the nice update, interesting. The thinking sounds very > > reasonable to me. > > Ola and I will work a bit on the optimization of SQL view. One issue here > is > > that when writing "generic" sql queries for tables like > > orgunitgroupsetstructure, one needs to select * (all) columns in order to > be > > portable because the table has each column per groupset, ie. its dynamic. > > But I guess we can fix these things when generating the SQL from Java / > > DHIS. > > Yes. The more we optimize these (in terms of number of fields) the > less enormous the excel file will be. And given that excel holds all > in memory we must do the best we can. > > > Re size, a good thing is that a typical pivot table won't contain the > whole > > database, rather the data for a district and its facilities. We want to > be > > able to compare with other districts (maybe inside the same province > only) > > but for those we only need the aggregated data at the district level (not > > facility data) - which means a lot less data. > > I'm trying to focus on the particular problems related to enormous > spreadsheets. If they are small we don't really have much of a > problem either way. But if we can deal with the really large ones > then we can maybe deal with medium large ones quite snappily. > > Yes definitely.
> > Its also interesting to see how excel uses a "minimal" xml (<r><x > > v="0"/></r> etc). We could consider that for DXF 2 datavalues. > > Size/performance vs readability. > > This is an OOXML "feature" which has been heavily criticized by almost > all in the xml world. Not being religious, I am in two minds about > it. I think it might be reasonable for the datavalues in dxf, but I'd > stick my heels in if we started doing the same thing for metadata. In > fact I think there is just such a minimal syntax which I have > suggested to Jo some time back as being maybe useful for a lightweight > mobile version of sdmx. I wouldn't have a problem with: > > <dv de='32' o='3' p='4' d=''2' /> <!-- where 'd' = disaggregation --> > > (I'd still like to see the other attributes, comment, createdby etc > being aggregated up to a 'datavalueset' but that's another story) > > Regarding ooxml I think there is a better justification doing it for a > data oriented document format like a spreadsheet, than there is for > word docs and presentations. Certainly M$ has achieved significant > performance benefits to Excel in terms of load times. File size is > not that significant a benefit because the compressed files shouldn't > be that different. What also really helps is the use of > sharedstrings. You can see that they do a similar shared strings type > optimization in the pivotcache. All of which allows us to stretch the > use of spreadsheets into domains where they shouldn't really go. > > Yes, was only thinking about data values. We can leave this for later.
_______________________________________________ Mailing list: https://launchpad.net/~dhis2-devs Post to : dhis2-devs@lists.launchpad.net Unsubscribe : https://launchpad.net/~dhis2-devs More help : https://help.launchpad.net/ListHelp