Both the page and character index are clamped to the number of pages and the 
number of characters on a page, so you could set both to very high numbers. 
Adding character counts to the documentPages property might be useful here too.
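
As a rough illustration of what that clamping implies, here is a small Python 
model. It is not the widget's API: the function names, the 1-based inclusive 
indexing, and the newline between pages are all assumptions made for the 
sketch. It only shows why oversized page and character indexes end up 
selecting everything through the end of the document.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def extract_text(pages, start_page, start_char, end_page, end_char):
    """Return the text between two (page, char) positions.

    Positions are 1-based and inclusive (an assumption for this model).
    Out-of-range page and character indexes are clamped, not rejected.
    """
    if not pages:
        return ""
    start_page = clamp(start_page, 1, len(pages))
    end_page = clamp(end_page, 1, len(pages))
    parts = []
    for page_no in range(start_page, end_page + 1):
        text = pages[page_no - 1]
        # Only the first and last page of the range use the character indexes.
        first = clamp(start_char, 1, len(text)) if page_no == start_page else 1
        last = clamp(end_char, 1, len(text)) if page_no == end_page else len(text)
        parts.append(text[first - 1:last])
    return "\n".join(parts)


# Hypothetical three-page document, one string per page.
pages = ["First page.", "Second page text.", "Third."]
HUGE = 10**9  # far larger than any real page or character count

whole_document = extract_text(pages, 1, 1, HUGE, HUGE)
one_word = extract_text(pages, 2, 8, 2, 11)
```

In this model, asking for page 1, character 1 through HUGE, HUGE returns all 
three pages, which is the behaviour described above for very high numbers.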

Cheers

Monte

> On 13 Dec 2021, at 11:17 am, Paul Dupuis via use-livecode 
> <use-livecode@lists.runrev.com> wrote:
> 
> Thank you Monte,
> 
> We've just started to make a map from XPDF APIs to the PDF Widget APIs, so 
> I'll make sure that gets done soon and add any missing capabilities as 
> requests to the LC Quality Center.
> 
> With regard to the hilitedRange and hilitedRangeText properties, can you just 
> advise on the correct use to get a PDF's text? I.e., can you use a range of 1 
> to -1 to get the whole document text, or would that just be the current page 
> text?
> 
> Thanks in advance,
> 
> 
> On 12/12/2021 6:49 PM, Monte Goulding via use-livecode wrote:
>> Hi Folks
>> 
>> Currently you can extract text in the widget by setting the hilitedRange and 
>> getting the hilitedRangeText. It wouldn’t be that hard to add extracted text 
>> to the documentPages property. The PDF widget was built to meet the 
>> requirements for a client rather than to match the features of XPDF, so it’s 
>> worthwhile for anyone still using XPDF to take the time to audit their use and 
>> see if there are any extra features required. If so, please create feature 
>> requests for them. While XPDF will continue to function, we intend to stop 
>> including it in LiveCode.
>> 
>> Cheers
>> 
>> Monte
>> 
>>> On 12 Dec 2021, at 12:27 am, Paul Dupuis via use-livecode 
>>> <use-livecode@lists.runrev.com> wrote:
>>> 
>>> I suspect it is for backward compatibility.
>>> 
>>> When I turned over the XPDF external to LiveCode, I asked that they 
>>> maintain it for a couple of years. I had expected we'd migrate our apps to the 
>>> PDF widget by then, but business factors mean we're only now just starting 
>>> a migration.
>>> 
>>> That's why I jumped in on this thread - we HAVE to have the ability to 
>>> extract text and images from the PDF widget (as you can with the External) 
>>> - to migrate to the Widget.
>>> 
>>> I suspect many other commercial developers who used the External still have 
>>> active code using it that they have not migrated yet OR the issue of the 
>>> undocumented (or, even worse, missing) properties of the widget most likely 
>>> would have been raised before now.
>>> 
>>> To migrate, all the commands and functions of the External need to be mapped 
>>> to the properties of the Widget. We have probably a couple hundred calls to 
>>> the External in our code, all of which need to be mapped, updated, and 
>>> tested - so no trivial task.
>>> 
>>> 
>>> On 12/11/2021 6:50 AM, matthias rebbe via use-livecode wrote:
>>>> Ah, I thought you were referring only to XPDF.
>>>> By the way, do you have any idea why both the XPDF external and the PDF 
>>>> widget are maintained? Wouldn't it make sense to have only one PDF solution 
>>>> included? Or am I missing something?
>>>> 
>>>> Regards,
>>>> Matthias
>>>> 
>>>> 
>>>>> On 11.12.2021 at 02:01, Paul Dupuis via use-livecode 
>>>>> <use-livecode@lists.runrev.com> wrote:
>>>>> 
>>>>> Yes, I am familiar with the XPDF external (based on Google's PDFium 
>>>>> library), having designed it and paid Monte to code it and then turned it 
>>>>> over to LiveCode.
>>>>> 
>>>>> I was referring to the PDF Widget (also based on Google's PDFium), which 
>>>>> should have a comparable property for fetching the text of a page. The LC 
>>>>> dictionary does not list any property for returning the page text, so I 
>>>>> assume that is a Dictionary/Documentation error and that Monte can tell 
>>>>> us the correct property of the PDF widget that will return the text of a 
>>>>> page.
>>>>> 
>>>>> 
>>>>> On 12/10/2021 7:05 PM, matthias rebbe via use-livecode wrote:
>>>>>> Paul,
>>>>>> 
>>>>>> Here on macOS, the dictionary of LC 10 DP1 definitely lists the function 
>>>>>> XPDFViewer_Text(viewerName, pageNumber).
>>>>>> By the way, checking this showed me that this function seems to be 
>>>>>> deprecated, and that the command
>>>>>>      XPDFViewer_Unicode viewerName, pageNumber, variableName
>>>>>> should be used instead.
>>>>>> 
>>>>>> 
>>>>>>> On 10.12.2021 at 23:22, Paul Dupuis via use-livecode 
>>>>>>> <use-livecode@lists.runrev.com> wrote:
>>>>>>> 
>>>>>>> There must be an undocumented property for the text of a page - there 
>>>>>>> was a function to return the full text of a page in the External (XPDF) 
>>>>>>> and to get the full text of the PDF file, you just stepped through the 
>>>>>>> pages (1..N) getting and concatenating the page text.
>>>>>>> 
>>>>>>> Monte? LC 10.0.0 Dictionary does not list a property for the page text.
>>>>>>> 
>>>>>>> 
>>>>>>> On 12/10/2021 4:46 PM, Torsten Holmer via use-livecode wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I have a PDF file with text and pictures, but I just want the text.
>>>>>>>> 
>>>>>>>> I can do it manually with Ctrl-A and Ctrl-Copy by viewing the file 
>>>>>>>> with Preview on macOS.
>>>>>>>> 
>>>>>>>> I have a business licence and want to use the PDF widget but I cannot 
>>>>>>>> find a way to do it.
>>>>>>>> 
>>>>>>>> Can someone help me out?
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Torsten
>>>>>>>> _______________________________________________
>>>>>>>> use-livecode mailing list
>>>>>>>> use-livecode@lists.runrev.com
>>>>>>>> Please visit this url to subscribe, unsubscribe and manage your 
>>>>>>>> subscription preferences:
>>>>>>>> http://lists.runrev.com/mailman/listinfo/use-livecode