On 03.02.2014 18:54, Mayur wrote:
Found something interesting. Writer seems to reject any graphics in OOXML
documents, even VML ones, although there does seem to be code to support
them. If a couple of tiny glitches were fixed, Writer might start showing
VML shapes (at least). That would work for all the 2007 MS Word documents,
as well as for some 2010 documents that have the VML data in their
mc:AlternateContent tags. Here are the problems:
   i.  The function getNamespace() in
       oox/source/shape/ShapeContextHandler.cxx always returns 0. The problem
       seems to be a rather strange-looking definition of the NMSP_MASK
       constant in oox/source/token/namespaces.hxx.tail. It says there:
            const sal_Int32 TOKEN_MASK = static_cast< sal_Int32 >( (1 << 16) - 1 );
            const sal_Int32 NMSP_MASK  = static_cast< sal_Int32 >( SAL_MAX_INT16 & ~TOKEN_MASK );

       Why SAL_MAX_INT16? On Windows that would translate into
            TOKEN_MASK = static_cast<long>(0xFFFF);               // 65535
            NMSP_MASK  = static_cast<long>(0x7FFF & ~TOKEN_MASK); // 0x00007FFF & 0xFFFF0000 = 0
       whereas what we should really be looking for is the namespace value,
       which is in the higher two bytes. The following change fixes it:
            const sal_Int32 NMSP_MASK = static_cast< sal_Int32 >( SAL_MAX_INT32 & ~TOKEN_MASK );
       That sets NMSP_MASK to 0x7FFF0000 (0x7FFFFFFF & 0xFFFF0000), which
       selects the higher two bytes apart from the sign bit.
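
The mask arithmetic above can be checked in isolation. The following is only a sketch: it uses plain std::int32_t constants as stand-ins for sal_Int32, SAL_MAX_INT16 and SAL_MAX_INT32 (assumed here to be 0x7FFF and 0x7FFFFFFF), and namespaceOf(), kExampleElement and selfCheck() are hypothetical names that are not part of oox.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for SAL_MAX_INT16 / SAL_MAX_INT32 (assumed values).
constexpr std::int32_t kMaxInt16 = 0x7FFF;
constexpr std::int32_t kMaxInt32 = 0x7FFFFFFF;

// The lower 16 bits hold the token.
constexpr std::int32_t kTokenMask = (1 << 16) - 1;

// Definition as quoted: 0x00007FFF & 0xFFFF0000 share no bits.
constexpr std::int32_t kNmspMaskOld = kMaxInt16 & ~kTokenMask;

// Proposed definition: keeps bits 16..30, the sign bit stays clear.
constexpr std::int32_t kNmspMaskNew = kMaxInt32 & ~kTokenMask;

static_assert(kNmspMaskOld == 0, "old mask selects nothing");
static_assert(kNmspMaskNew == 0x7FFF0000, "new mask selects the upper half");

// Hypothetical helper: extract the namespace part of a packed element id.
constexpr std::int32_t namespaceOf(std::int32_t element, std::int32_t mask) {
    return element & mask;
}

// Hypothetical element: namespace id 0x0012 in the upper half, token 0x0345 below.
constexpr std::int32_t kExampleElement = (0x0012 << 16) | 0x0345;

// Small runtime check that can be called from any main().
inline void selfCheck() {
    assert(namespaceOf(kExampleElement, kNmspMaskOld) == 0);
    assert(namespaceOf(kExampleElement, kNmspMaskNew) == 0x00120000);
}
```

With the old mask every element maps to namespace 0, which matches the observed behaviour of getNamespace(); with the new mask the upper half survives.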

Good analysis.


        To my mind, this sort of compactness isn't called for. Maybe we
could simply have used a compact struct to store namespace and tag.

I always wondered why we keep namespace and token separable once they have been read into memory. While the XML file is scanned, it does make sense to use the same token for names in different namespaces. This keeps the number of tokens, and thus the complexity of the scanner, small (well, as small as possible). But once we have the tag (namespace and name) in memory, we should not have to extract the namespace or the name from it. After all, if I have a name n in two namespaces a and b, then a:n and b:n are two different things, and we can use enum values a_n and b_n with arbitrary values to represent them. But maybe I am missing something?
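
To make the enum idea concrete, here is a rough sketch, not a proposal for the actual oox code. The names Element, lookupElement(), describe() and selfCheck() are hypothetical, and the namespaces a and b and the name n are just the placeholders from the paragraph above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical qualified element ids. The values are arbitrary and carry
// no namespace/token bit layout.
enum class Element { a_n, b_n, Unknown };

// Resolve a (namespace, local name) pair once, while scanning.
inline Element lookupElement(const std::string& nmsp, const std::string& name) {
    static const std::map<std::pair<std::string, std::string>, Element> table = {
        {{"a", "n"}, Element::a_n},
        {{"b", "n"}, Element::b_n},
    };
    const auto it = table.find(std::make_pair(nmsp, name));
    return it == table.end() ? Element::Unknown : it->second;
}

// Handlers then switch on the qualified id; no masking is needed.
inline const char* describe(Element e) {
    switch (e) {
        case Element::a_n: return "n in namespace a";
        case Element::b_n: return "n in namespace b";
        default:           return "unknown element";
    }
}

// Small runtime check that can be called from any main().
inline void selfCheck() {
    assert(lookupElement("a", "n") != lookupElement("b", "n"));
}
```

The trade-off is that the lookup table grows with every namespace/name combination, whereas the packed representation only needs one entry per distinct name.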

-Andre


  ii. In oox/source/vml/vmldrawingfragment.cxx, there's a switch in the
function onCreateContext that says:
         case VMLDRAWING_WORD:
                   if ( isRootElement() )  {... }

       Is this so that we create a shape context only when a VML file is
received as a separate document fragment? Why not for the inline (v:rect or
other) objects? I tried removing the check and instead simply checking
whether nElement is a VML element, and VML drawings were suddenly visible
in Writer.

Is this a valid fix?



On Wed, Jan 29, 2014 at 1:24 PM, Andre Fischer <awf....@gmail.com> wrote:

On 28.01.2014 13:47, Mayur wrote:

Hi,

I am very new to the OpenOffice code, and need some help understanding the
open-xml handling code. Could someone please answer the following
questions?

   i. There seem to be two distinct pieces of code that do open-xml parsing
in different ways. There's one part in "writerfilter" that has some
generated code (xslt based) that provides factories and classes for
creating different object types. And then, for sc and sd, all of the
parsing code is in the "oox" module and seems to be hand-written. Why is
that? Are there plans to move the parsing code to a common module
(perhaps oox ...)?

Re why: OOXML import was developed while OpenOffice was maintained by
Sun, later Oracle.  There were at least three development teams involved
(for Writer, Calc, Draw/Impress). Sometimes they did not communicate with
each other as well as they should have.  Having different modules is one of
the results.  But, as far as I know, writerfilter has some calls into oox
for shared functionality.

Re future plans: Some of us are thinking about improving the OOXML
support.  Consolidation of the code base into one module is a long term
goal.



ii. Probably a related question - why are drawing-ml shapes and pictures
not supported in sw, while they are supported in sc and sd? The parsing
code seems to be there. The wps:wsp tag differs very little from the p:sp
tag. Is this in the works?

Well, see my above comments.
And, parsing OOXML is the easy part, importing the data into the
application model is the hard part.  Calc and Impress use the same model
for representing graphical objects, Writer has its own.

If you are interested in OOXML import/export then maybe we can work
together on improving it?

Regards,
Andre


thanks,
mayur


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org
For additional commands, e-mail: dev-h...@openoffice.apache.org



