Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Klaas Freitag

On 09.11.22 at 21:18, Tobias Leupold wrote:

On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:

On 09.11.22 at 20:22, Nate Graham wrote:




That is essentially what my old document scanning script did -- using the same
tools. But without a GUI of course. Nice idea! Seems like I'm not the only one
who worked on getting this very task done as easy as possible ;-)


Yes, well, that is what people in the office need. "Classical" office
suites are just the beginning. Things that we Linux guys considered
black magic for a long time (such as printing, scanning, OCR, doc analysis
etc.) are now super easy and available on mobile platforms. And that is
what we compete with on the office desktop.


I know people who run Kraft on Chromebooks to enable the users to use
the "nice" productivity apps available with Android. Tough times ;-)


regards,
Klaas







Have you checked out Skanpage? It does PDF scanning, including creating
multi-page PDF documents out of the scanned files. It also integrates
with the Purpose framework to offer a simple "Share" menu that lets you
email scanned documents very quickly.

Nate

On 11/9/22 06:32, Tobias Leupold wrote:

Hi all!

Nowadays, sending PDFs of scanned documents via email or uploading them
somewhere has become a recurring task. For years, I was using shell scripts
to kind-of automate scanning, doing some post-processing and conversion --
after a fashion. But I thought that there should be some more
straightforward tool for this.

The known general-purpose scanning applications we have didn't do what I
wanted. So, at the beginning of the year, I started to write a quite
specialized scanning program whose only purpose is to make scanning
documents and turning them into a PDF file as easy as possible.

The result is Scandoc. It currently lives at
https://invent.kde.org/tleupold/scandoc

The Readme contains a description of what it is. It uses KSaneCore to
access a scanner and runs helper programs (well-known ones by default) to
post-process the scanned pages and save them as a PDF file. By default,
ImageMagick's convert tool is invoked for the colour/sharpness/gamma
post-processing and TeX Live's pdfjam is used for the PDF conversion.
However, one can use any CLI helper program or script for those tasks. E.g.
the repository contains an example script to output searchable PDFs by
using the Tesseract OCR engine.
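
As a rough illustration of that helper pipeline (a sketch only, not
Scandoc's actual code): the Python below builds one ImageMagick convert
call per scanned page and one pdfjam call to join the results. The function
names, the file naming and the option values (-gamma, -sharpen 0x1) are
assumptions made for the example; the options Scandoc really passes are not
stated here. On ImageMagick 7 the binary may be called magick rather than
convert.

```python
import shutil
import subprocess
from pathlib import Path


def postprocess_command(page: Path, out: Path, gamma: float = 1.0) -> list[str]:
    """Build an ImageMagick call; the option values are illustrative only."""
    return ["convert", str(page), "-gamma", str(gamma), "-sharpen", "0x1", str(out)]


def pdf_command(pages: list[Path], pdf: Path) -> list[str]:
    """Build a pdfjam call that joins the processed pages into one PDF."""
    return ["pdfjam", "--outfile", str(pdf), *(str(p) for p in pages)]


def pages_to_pdf(pages: list[Path], pdf: Path, workdir: Path) -> None:
    """Post-process each scanned page, then merge them (hypothetical helper)."""
    for tool in ("convert", "pdfjam"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"helper program not found: {tool}")
    processed = []
    for page in pages:
        out = workdir / f"{page.stem}-processed.png"
        subprocess.run(postprocess_command(page, out), check=True)
        processed.append(out)
    subprocess.run(pdf_command(processed, pdf), check=True)
```

Keeping the command construction separate from the execution makes it easy
to swap in another helper, such as an OCR script, without touching the rest.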

Scandoc has now been used in production for half a year in my (dentist's)
office, and -- from what I have heard from the (so far, of course, only
few) users -- it makes this very task of creating PDF files from documents
a lot easier and can be used quite conveniently.

I thus wondered if this would be something we could use in Extragear.
At the very least, I wanted to share this with you; maybe someone will find
it useful :-)

Cheers, Tobias









Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Thomas Baumgart
On Thursday, 10 November 2022 09:28:17 CET Klaas Freitag wrote:

> On 09.11.22 at 21:18, Tobias Leupold wrote:
> > On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
> >> On 09.11.22 at 20:22, Nate Graham wrote:
>
> >
> > That is essentially what my old document scanning script did -- using the
> > same tools. But without a GUI of course. Nice idea! Seems like I'm not the
> > only one who worked on getting this very task done as easy as possible ;-)
>
> Yes, well, that is what people in the office need. "Classical" office
> suites are just the beginning. Things that we Linux guys considered
> black magic for a long time (such as printing, scanning, OCR, doc analysis
> etc.) are now super easy and available on mobile platforms. And that is
> what we compete with on the office desktop.

I don't know if you have thought of this when writing invoices. Include
an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice. I am
thinking about adding a webcam interface so that KMyMoney can use it to
read the QR code and use the data to fill out the form for online payments
that KMyMoney already has. Or even use the PDF file as input and pull out
the QR code. That would improve productivity, I think. I know, a bit off
topic here.
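
For readers who have not seen one: the content of an EPC QR code is just a
short block of text, one field per line. A minimal sketch of building such a
payload, assuming the field order of the EPC069-12 guideline as I understand
it (service tag, version, character set, identification, BIC, name, IBAN,
amount, purpose, structured reference, remittance text); the function name
and the length limits checked here are my own assumptions and should be
verified against the guideline:

```python
from decimal import Decimal


def build_epc_payload(name: str, iban: str, amount: Decimal,
                      remittance: str, bic: str = "") -> str:
    """Return the text to encode in the QR code (hypothetical helper)."""
    if len(name) > 70 or len(remittance) > 140:
        raise ValueError("name or remittance text too long")
    lines = [
        "BCD",                  # service tag
        "002",                  # version
        "1",                    # character set: 1 = UTF-8
        "SCT",                  # identification: SEPA credit transfer
        bic,                    # BIC, left empty here
        name,                   # beneficiary name
        iban.replace(" ", ""),  # beneficiary IBAN without spaces
        f"EUR{amount:.2f}",     # amount in euro
        "",                     # purpose code (unused)
        "",                     # structured reference (unused)
        remittance,             # unstructured remittance text
    ]
    return "\n".join(lines)


payload = build_epc_payload("Jane Doe", "DE89 3704 0044 0532 0130 00",
                            Decimal("12.5"), "Invoice 42")
```

The resulting string can be handed to any QR code generator; a reader on the
paying side only has to split it by line to fill in a transfer form.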

> I know people who run Kraft on Chromebooks to enable the users to use
> the "nice" productivity apps available with Android. Tough times ;-)



-- 

Regards

Thomas Baumgart

-
'Knowing a computer language is neither a necessary nor a sufficient
condition to know how to construct a computer program' -- J.R. Tyrer
-




Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Sune Vuorela
On 2022-11-10, Thomas Baumgart wrote:
> I don't know if you have thought of this when writing invoices. Include
> an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice. I am
> thinking about adding a webcam interface so that KMyMoney can use it to
> read the QR code and use the data to fill out the form for online payments
> that KMyMoney already has. Or even use the PDF file as input and pull out
> the QR code. That would improve productivity, I think. I know, a bit off
> topic here.

libprison and the Itinerary application might help you with this; together,
they should have the features to read barcodes from PDFs and from a webcam,
and to generate barcodes.

/Sune



Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Tobias Leupold

On 10.11.22 at 10:09, Thomas Baumgart wrote:

On Thursday, 10 November 2022 09:28:17 CET Klaas Freitag wrote:


On 09.11.22 at 21:18, Tobias Leupold wrote:

On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:

On 09.11.22 at 20:22, Nate Graham wrote:




That is essentially what my old document scanning script did -- using the same
tools. But without a GUI of course. Nice idea! Seems like I'm not the only one
who worked on getting this very task done as easy as possible ;-)


Yes, well, that is what people in the office need. "Classical" office
suites are just the beginning. Things that we Linux guys considered
black magic for a long time (such as printing, scanning, OCR, doc analysis
etc.) are now super easy and available on mobile platforms. And that is
what we compete with on the office desktop.


I don't know if you have thought of this when writing invoices. Include
an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice. I am
thinking about adding a webcam interface so that KMyMoney can use it to
read the QR code and use the data to fill out the form for online payments
that KMyMoney already has. Or even use the PDF file as input and pull out
the QR code. That would improve productivity, I think. I know, a bit off
topic here.


Would be a cool feature for sure -- but I don't think I could add this
to Scandoc in a meaningful way, could I?



I know people who run Kraft on Chromebooks to enable the users to use
the "nice" productivity apps available with Android. Tough times ;-)


Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Thomas Baumgart
On Thursday, 10 November 2022 10:51:32 CET Tobias Leupold wrote:

> On 10.11.22 at 10:09, Thomas Baumgart wrote:
> > On Thursday, 10 November 2022 09:28:17 CET Klaas Freitag wrote:
> >
> >> On 09.11.22 at 21:18, Tobias Leupold wrote:
> >>> On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
> >>>> On 09.11.22 at 20:22, Nate Graham wrote:
> >>
> >>>
> >>> That is essentially what my old document scanning script did -- using the
> >>> same tools. But without a GUI of course. Nice idea! Seems like I'm not the
> >>> only one who worked on getting this very task done as easy as possible ;-)
> >>
> >> Yes, well, that is what people in the office need. "Classical" office
> >> suites are just the beginning. Things that we Linux guys considered
> >> black magic for a long time (such as printing, scanning, OCR, doc analysis
> >> etc.) are now super easy and available on mobile platforms. And that is
> >> what we compete with on the office desktop.
> >
> > I don't know if you have thought of this when writing invoices. Include
> > an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice. I am
> > thinking about adding a webcam interface so that KMyMoney can use it to
> > read the QR code and use the data to fill out the form for online payments
> > that KMyMoney already has. Or even use the PDF file as input and pull out
> > the QR code. That would improve productivity, I think. I know, a bit off
> > topic here.
> 
> Would be a cool feature for sure -- but I don't think I could add this 
> to Scandoc in a meaningful way, can I?

Oh, sorry for the confusion: my remark was not meant as a suggestion for
Scandoc itself. Scandoc would probably be in a position to provide the PDF
from which this information can then be extracted.

-- 

Regards

Thomas Baumgart

-
The good thing about FOSS is that people can see your code and comment it.
The bad thing is that people can see your code and comment it. (asoliverez)
-




Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Thomas Baumgart
Sune,

On Thursday, 10 November 2022 10:31:57 CET Sune Vuorela wrote:

> On 2022-11-10, Thomas Baumgart wrote:
> > I don't know if you have thought of this when writing invoices. Include
> > an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice. I am
> > thinking about adding a webcam interface so that KMyMoney can use it to
> > read the QR code and use the data to fill out the form for online payments
> > that KMyMoney already has. Or even use the PDF file as input and pull out
> > the QR code. That would improve productivity, I think. I know, a bit off
> > topic here.
>
> libprison and the Itinerary application might help you with this; together,
> they should have the features to read barcodes from PDFs and from a webcam,
> and to generate barcodes.

Thanks for mentioning it; I am aware of that. Generating would be something
more in the area of Kraft and printing invoices. I know that Itinerary
uses the libs I looked into for KMyMoney and will contact Volker once I am
ready to play with it. Time is the limiting factor (as usual).

-- 

Regards

Thomas Baumgart

-
Q: How do you make a water bed more bouncy? A: Fill it with spring water!
-




Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Klaas Freitag

On 10.11.22 at 10:09, Thomas Baumgart wrote:

Hi Thomas,



I don't know if you have thought of this when writing invoices. Include
an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice.


The next version of Kraft, which I am about to release, already has EPC QR
code capabilities [1]. That was a feature request from users. I should
write documentation and a blog post about that...



I am
thinking about adding a webcam interface so that KMyMoney can use it to
read the QR code and use the data to fill out the form for online payments
that KMyMoney already has. Or even use the PDF file as input and pull out
the QR code. That would improve productivity, I think.


Well, if it were only about Kraft, we would probably be better off using
metadata embedded in the PDF to detect that. Having it as an OCR-based
feature would make it more generic, of course.



I know, a bit off topic here.

Is it? Where else would we discuss that?

regards,
Klaas

[1] 
https://github.com/dragotin/kraft/commit/4adecbf263844dd27141b6b1cf5c15a89af15102







Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Tobias Leupold
On Thursday, 10 November 2022, 12:25:26 CET, Klaas Freitag wrote:
> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>
> Hi Thomas,
>
> > I don't know if you have thought of this when writing invoices. Include
> > an EPC QR (https://de.wikipedia.org/wiki/EPC-QR-Code) on the invoice.
>
> The next version of Kraft, which I am about to release, already has EPC QR
> code capabilities [1]. That was a feature request from users. I should
> write documentation and a blog post about that...
>
> > I am
> > thinking about adding a webcam interface so that KMyMoney can use it to
> > read the QR code and use the data to fill out the form for online payments
> > that KMyMoney already has. Or even use the PDF file as input and pull out
> > the QR code. That would improve productivity, I think.
>
> Well, if it were only about Kraft, we would probably be better off using
> metadata embedded in the PDF to detect that. Having it as an OCR-based
> feature would make it more generic, of course.
>
> > I know, a bit off topic here.
>
> Is it? Where else would we discuss that?

This mailing list is the right place for sure, but I think Thomas meant it's a 
bit off-topic for this thread (about whether or not Scandoc would be something 
enriching and/or suitable for Extragear).

> regards,
> Klaas
> 
> [1]
> https://github.com/dragotin/kraft/commit/4adecbf263844dd27141b6b1cf5c15a89af15102






Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Volker Krause
On Thursday, 10 November 2022 12:25:26 CET Klaas Freitag wrote:
> On 10.11.22 at 10:09, Thomas Baumgart wrote:
> > I am
> > thinking about adding a webcam interface so that KMyMoney can use it to
> > read the QR code and use the data to fill out the form for online payments
> > that KMyMoney already has. Or even use the PDF file as input and pull out
> > the QR code. That would improve productivity, I think.
>
> Well, if it were only about Kraft, we would probably be better off using
> metadata embedded in the PDF to detect that. Having it as an OCR-based
> feature would make it more generic, of course.

Right, we likely need to support all of those methods for handling arbitrary
input documents anyway.

I'd be very interested in EPC QR sample documents to check whether ZXing and
our PDF extractor can handle those correctly :) (EPC QR is slightly special
because it is text-based while defining the text codec used inside the text
itself, which could potentially confuse content/codec auto-detection.)
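
To make the codec issue concrete, here is a small sketch (not the ZXing or
Itinerary code) of a decoder that reads the character set field from the
ASCII-only header first and only then decodes the rest. The number-to-codec
table and the field names follow my reading of the EPC069-12 guideline and
are assumptions to be checked against it; decode_epc is a hypothetical name.

```python
# Character set field (third line) mapped to Python codec names (assumed table).
CHARSETS = {
    "1": "utf-8", "2": "iso-8859-1", "3": "iso-8859-2", "4": "iso-8859-4",
    "5": "iso-8859-5", "6": "iso-8859-7", "7": "iso-8859-10", "8": "iso-8859-15",
}

FIELDS = ["service_tag", "version", "charset", "identification", "bic", "name",
          "iban", "amount", "purpose", "reference", "remittance_text",
          "information"]


def decode_epc(raw: bytes) -> dict[str, str]:
    """Decode raw QR bytes using the codec the payload itself declares."""
    # The first three lines are plain ASCII in every supported codec.
    header = raw.split(b"\n", 3)[:3]
    if len(header) < 3 or header[0].strip() != b"BCD":
        raise ValueError("not an EPC QR payload")
    codec = CHARSETS.get(header[2].strip().decode("ascii", "replace"))
    if codec is None:
        raise ValueError("unknown character set field")
    lines = [line.rstrip("\r") for line in raw.decode(codec).split("\n")]
    lines += [""] * (len(FIELDS) - len(lines))  # trailing fields may be absent
    return dict(zip(FIELDS, lines))


sample = ("BCD\n002\n2\nSCT\n\nMüller GmbH\nDE89370400440532013000\n"
          "EUR12.50\n\n\nInvoice 42").encode("iso-8859-1")
fields = decode_epc(sample)
```

A reader that guesses the codec from the bytes alone could get the "ü" in
such a sample wrong, which is why non-ASCII test documents are the
interesting ones.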

Regards,
Volker



Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Thomas Baumgart
On Thursday, 10 November 2022 17:38:21 CET Volker Krause wrote:

> On Thursday, 10 November 2022 12:25:26 CET Klaas Freitag wrote:
> > On 10.11.22 at 10:09, Thomas Baumgart wrote:
> > > I am
> > > thinking about adding a webcam interface so that KMyMoney can use it to
> > > read the QR code and use the data to fill out the form for online payments
> > > that KMyMoney already has. Or even use the PDF file as input and pull out
> > > the QR code. That would improve productivity, I think.
> >
> > Well, if it were only about Kraft, we would probably be better off using
> > metadata embedded in the PDF to detect that. Having it as an OCR-based
> > feature would make it more generic, of course.
>
> Right, we likely need to support all of those methods for handling arbitrary
> input documents anyway.
>
> I'd be very interested in EPC QR sample documents to check whether ZXing and
> our PDF extractor can handle those correctly :) (EPC QR is slightly special
> because it is text-based while defining the text codec used inside the text
> itself, which could potentially confuse content/codec auto-detection.)

I tried ZXing a while back to read in EPC QRs I had at the time. Results looked
very promising (of course no special chars included, so the codec did not really
matter). I'll see if I can get hold of some and send them off to you via PM.

-- 

Regards

Thomas Baumgart

-
If a cluttered desk is characteristic of a cluttered mind,
what does an empty desk mean 1 2 a b k X
-




Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Klaas Freitag

Hello Volker,

On 10.11.22 at 17:38, Volker Krause wrote:

On Thursday, 10 November 2022 12:25:26 CET Klaas Freitag wrote:

On 10.11.22 at 10:09, Thomas Baumgart wrote:

I am
thinking about adding a webcam interface so that KMyMoney can use it to
read the QR code and use the data to fill out the form for online payments
that KMyMoney already has. Or even use the PDF file as input and pull out
the QR code. That would improve productivity, I think.


Well, if it were only about Kraft, we would probably be better off using
metadata embedded in the PDF to detect that. Having it as an OCR-based
feature would make it more generic, of course.


Right, we likely need to support all of those methods for handling arbitrary
input documents anyway.

I'd be very interested in EPC QR sample documents to check whether ZXing and
our PDF extractor can handle those correctly :) (EPC QR is slightly special
because it is text-based while defining the text codec used inside the text
itself, which could potentially confuse content/codec auto-detection.)


I created a test doc here (my hosting):
https://pjatniza.net/owncloud/index.php/s/Km8iIjLBYJHe5nY

If you need more, let me know, but the QR code only encodes some quite
"interesting" text, so it would not really change anything structural
if I sent you another invoice or two built with Kraft. Yeah, text
encoding maybe... ;-)


regards,
Klaas






Re: Would Scandoc be something for Extragear?

2022-11-10 Thread Volker Krause
On Thursday, 10 November 2022 18:00:27 CET Klaas Freitag wrote:
> Hello Volker,
>
> On 10.11.22 at 17:38, Volker Krause wrote:
> > On Thursday, 10 November 2022 12:25:26 CET Klaas Freitag wrote:
> >> On 10.11.22 at 10:09, Thomas Baumgart wrote:
> >>> I am
> >>> thinking about adding a webcam interface so that KMyMoney can use it to
> >>> read the QR code and use the data to fill out the form for online
> >>> payments
> >>> that KMyMoney already has. Or even use the PDF file as input and pull out
> >>> the QR code. That would improve productivity, I think.
> >>
> >> Well, if it were only about Kraft, we would probably be better off using
> >> metadata embedded in the PDF to detect that. Having it as an OCR-based
> >> feature would make it more generic, of course.
> >
> > Right, we likely need to support all of those methods for handling
> > arbitrary input documents anyway.
> >
> > I'd be very interested in EPC QR sample documents to check whether ZXing
> > and our PDF extractor can handle those correctly :) (EPC QR is slightly
> > special because it is text-based while defining the text codec used
> > inside the text itself, which could potentially confuse content/codec
> > auto-detection.)
>
> I created a test doc here (my hosting):
> https://pjatniza.net/owncloud/index.php/s/Km8iIjLBYJHe5nY
>
> If you need more, let me know, but the QR code only encodes some quite
> "interesting" text, so it would not really change anything structural
> if I sent you another invoice or two built with Kraft. Yeah, text
> encoding maybe... ;-)

Thanks! 

No problem with ZXing, but as this is ASCII-only that is expected.

The QR code is a vector graphic in the PDF (good, but rare), which required a
minor tweak in the PDF extractor used by Itinerary to be properly recognized,
but that works as well now.

Besides being potentially useful for PDF import in KMyMoney, this could also 
enable things like a KMail plugin that automatically detects invoices and 
offers triggering payment for example, similar to what we do with adding travel 
tickets to the calendar :)

Regards,
Volker
