Hi again Nick,
This problem appears to be Mac-specific: I have had more luck with a .doc file
created natively in Windows :-)
Now POIFSLister shows the ObjectPool and the item in it:
Root Entry -
  SummaryInformation <(0x05)SummaryInformation> [412 / 0x19c]
  DocumentSummaryInformation <(0x05)DocumentSummaryInformation> [280 / 0x118]
  WordDocument [4142 / 0x102e]
  1Table [2087 / 0x827]
  ObjectPool -
    _1432368106 -
      CompObj <(0x01)CompObj> [76 / 0x4c]
      ObjInfo <(0x03)ObjInfo> [6 / 0x6]
      Ole10Native <(0x01)Ole10Native> [568849 / 0x8ae11]
      EPRINT <(0x03)EPRINT> [5000 / 0x1388]
  CompObj <(0x01)CompObj> [113 / 0x71]
  Data [4096 / 0x1000]
Please can you point me to any resources which could help me to save the
embedded file to another file (i.e. read all the bytes and save them somewhere)?
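To make the question concrete, here is a rough sketch of the byte-level step I have in mind, once the contents of the Ole10Native stream have been read into a byte array. It assumes the stream uses the usual OLE Packager layout (4-byte little-endian total size, a 16-bit flag equal to 2, NUL-terminated label and file name, two more 16-bit fields, a length-prefixed command string, then a 4-byte payload size followed by the payload). That layout is an assumption on my part and other variants may exist. The class and method names are hypothetical and not part of POI, which may well have its own helper for this.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of POI: extracts the embedded payload
// from the raw bytes of an Ole10Native stream (assumed Packager layout).
final class Ole10NativePayload {

    // Returns the offset just past the NUL terminator of the string at pos.
    private static int skipCString(byte[] buf, int pos) {
        while (pos < buf.length && buf[pos] != 0) {
            pos++;
        }
        if (pos >= buf.length) {
            throw new IllegalArgumentException("Unterminated string in header");
        }
        return pos + 1;
    }

    static byte[] extract(byte[] stream) {
        ByteBuffer bb = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 4;                        // skip the 4-byte total size
        if (bb.getShort(pos) != 2) {        // flag value assumed for this layout
            throw new IllegalArgumentException("Unexpected Ole10Native header");
        }
        pos += 2;
        pos = skipCString(stream, pos);     // label
        pos = skipCString(stream, pos);     // original file name
        pos += 4;                           // two 16-bit fields, ignored here
        int commandLen = bb.getInt(pos);    // length of the command string
        pos += 4;
        if (commandLen < 0 || commandLen > stream.length - pos) {
            throw new IllegalArgumentException("Bad command length");
        }
        pos += commandLen;
        int dataSize = bb.getInt(pos);      // size of the embedded payload
        pos += 4;
        if (dataSize < 0 || dataSize > stream.length - pos) {
            throw new IllegalArgumentException("Bad payload size");
        }
        return Arrays.copyOfRange(stream, pos, pos + dataSize);
    }
}
```

The returned array could then be written out with something like java.nio.file.Files.write. What I am missing is the POIFS side, i.e. getting from the ObjectPool entry to the stream's bytes.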
Thanks,
- Chris
On 10 Jun 2013, at 09:33, Chris Bamford wrote:
> Hi Nick,
>
> I created a .doc file with an embedded MP3 (that is, I dragged an MP3 file
> from Finder and dropped it into the document whereupon Word displayed a small
> image of a loudspeaker - I took this as a positive sign!).
> I then added some text for good measure and saved it, taking care to save it
> as "Word 97 - 2004".
> Then I ran POIFSLister -sizes on it and got:
>
> Root Entry -
>   SummaryInformation <(0x05)SummaryInformation> [4096 / 0x1000]
>   DocumentSummaryInformation <(0x05)DocumentSummaryInformation> [4096 / 0x1000]
>   WordDocument [9152 / 0x23c0]
>   1Table [7280 / 0x1c70]
>   CompObj <(0x01)CompObj> [96 / 0x60]
>
> Looking closer in the debugger, I discovered that none of the entries shown
> are of type DirectoryNode, so I cannot even start the process of finding /
> extracting the MP3.
> Any ideas what I might be doing wrong?
> Thanks,
>
> - Chris
>
> On 7 Jun 2013, at 14:26, Chris Bamford wrote:
>
> Thanks Nick, must have missed that. Will check it out.
>
> Chris
>
> On 7 Jun 2013, at 14:12, Nick Burch wrote:
>
>> On Fri, 7 Jun 2013, Chris Bamford wrote:
>>> Is there a way to extract files embedded into Word docs (.doc, not .docx),
>>> using the HWPF package?
>>
>> Does the information on http://poi.apache.org/poifs/embeded.html not cover
>> what you need?
>>
>> Nick
>
>