Re: Would Scandoc be something for Extragear?
On 09.11.22 at 21:18, Tobias Leupold wrote:
> On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
>> On 09.11.22 at 20:22, Nate Graham wrote:
>
> That is essentially what my old document scanning script did -- using
> the same tools. But without a GUI, of course. Nice idea! Seems like I'm
> not the only one who worked on making this very task as easy as
> possible ;-)

Yes, well, that is what people in the office need. "Classical" office
suites are just the beginning. Things that we Linux guys long considered
black magic (such as printing, scanning, OCR, document analysis, etc.)
are now super easy and available on mobile platforms. And that is what
we compete with on the office desktop.

I know people who run Kraft on Chromebooks to enable the users to use
the "nice" productivity apps available with Android. Tough times ;-)

regards,
Klaas

>>> Have you checked out Skanpage? It does PDF scanning, including
>>> creating multi-page PDF documents out of the scanned files. It also
>>> integrates with the Purpose framework to offer a simple "Share" menu
>>> that lets you email scanned documents very quickly.
>>>
>>> Nate
>>>
>>> On 11/9/22 06:32, Tobias Leupold wrote:
>>>> Hi all!
>>>>
>>>> Nowadays, sending PDFs of scanned documents via email or uploading
>>>> them somewhere has become a recurring task. For years, I was using
>>>> shell scripts to kind-of automate scanning, doing some
>>>> post-processing and conversion -- after a fashion. But I thought
>>>> that there should be some more straightforward tool for this. The
>>>> known general-purpose scanning applications we have didn't do what
>>>> I wanted.
>>>>
>>>> So, at the beginning of the year, I started to write a quite
>>>> specialized scanning program whose only purpose is to make scanning
>>>> documents and turning them into a PDF file as easy as possible.
>>>>
>>>> The result is Scandoc. It currently lives at
>>>> https://invent.kde.org/tleupold/scandoc
>>>>
>>>> The Readme contains a description of what it is. It uses KSaneCore
>>>> to access a scanner and runs helper programs (well-known ones by
>>>> default) to post-process the scanned pages and save them as a PDF
>>>> file. By default, ImageMagick's convert tool is invoked for the
>>>> colour/sharpness/gamma post-processing and TeX Live's pdfjam is
>>>> used for the PDF conversion. However, one can use any CLI helper
>>>> program or script for those tasks. E.g. the repository contains an
>>>> example script to output searchable PDFs by using the Tesseract OCR
>>>> engine.
>>>>
>>>> Scandoc has been used in production for half a year now in my
>>>> (dentist's) office, and -- from what I heard from the (so far, of
>>>> course, only a few) users -- it makes this very task of creating
>>>> PDF files from documents a lot easier and can be used quite
>>>> conveniently.
>>>>
>>>> I thus wondered if this would be something we could use in
>>>> Extragear. At least, I wanted to share this with you; maybe someone
>>>> will find this useful :-)
>>>>
>>>> Cheers, Tobias
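Editor's sketch of the helper chain described in the original post: one ImageMagick call per page, one pdfjam call to join the pages, and optionally a Tesseract call for a searchable PDF. This is not Scandoc's code; the function names, file names and the gamma/sharpen values are assumptions chosen for illustration, and Scandoc's real default arguments may differ. The functions only build argument lists, so nothing is executed unless you pass them to `subprocess.run`.

```python
def postprocess_command(page: str, out: str, gamma: float = 1.2) -> list[str]:
    """Build an ImageMagick call that adjusts gamma and sharpens one page.

    The gamma and sharpen values are illustrative assumptions,
    not Scandoc's defaults.
    """
    return ["convert", page, "-gamma", str(gamma), "-sharpen", "0x1", out]


def pdf_command(pages: list[str], pdf: str) -> list[str]:
    """Build a pdfjam call that joins the processed pages into one PDF."""
    return ["pdfjam", "--outfile", pdf, *pages]


def ocr_command(page: str, out_base: str, lang: str = "eng") -> list[str]:
    """Build a Tesseract call that writes a searchable PDF (out_base.pdf)."""
    return ["tesseract", page, out_base, "-l", lang, "pdf"]
```

To actually run a step, something like `subprocess.run(postprocess_command("page1.png", "page1_clean.png"), check=True)` would do, assuming the tools are installed.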
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022 09:28:17 CET, Klaas Freitag wrote:
> On 09.11.22 at 21:18, Tobias Leupold wrote:
>> On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
>>> On 09.11.22 at 20:22, Nate Graham wrote:
>>
>> That is essentially what my old document scanning script did -- using
>> the same tools. But without a GUI, of course. Nice idea! Seems like
>> I'm not the only one who worked on making this very task as easy as
>> possible ;-)
>
> Yes, well, that is what people in the office need. "Classical" office
> suites are just the beginning. Things that we Linux guys long
> considered black magic (such as printing, scanning, OCR, document
> analysis, etc.) are now super easy and available on mobile platforms.
> And that is what we compete with on the office desktop.

I don't know if you have thought of this for writing invoices: include
an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on the
invoice. I am thinking about adding a webcam interface so that KMyMoney
can use it to read the QR code and use the data to fill out the form
for online payments that KMyMoney already has. Or even use the PDF file
as input and pull out the QR code. That would improve productivity, I
think. I know, a bit off topic here.

> I know people who run Kraft on Chromebooks to enable the users to use
> the "nice" productivity apps available with Android. Tough times ;-)

--
Regards

Thomas Baumgart

'Knowing a computer language is neither a necessary nor a sufficient
condition to know how to construct a computer program' -- J.R. Tyrer
Re: Would Scandoc be something for Extragear?
On 2022-11-10, Thomas Baumgart wrote:
> I don't know if you have thought of this for writing invoices: include
> an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on the
> invoice. I am thinking about adding a webcam interface so that
> KMyMoney can use it to read the QR code and use the data to fill out
> the form for online payments that KMyMoney already has. Or even use
> the PDF file as input and pull out the QR code. That would improve
> productivity, I think. I know, a bit off topic here.

libprison and the Itinerary application might help you with this;
together they should have the features to read barcodes from PDFs and
from a webcam, and to generate barcodes.

/Sune
Re: Would Scandoc be something for Extragear?
On 10.11.22 at 10:09, Thomas Baumgart wrote:
> On Thursday, 10 November 2022 09:28:17 CET, Klaas Freitag wrote:
>> On 09.11.22 at 21:18, Tobias Leupold wrote:
>>> On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
>>>> On 09.11.22 at 20:22, Nate Graham wrote:
>>>
>>> That is essentially what my old document scanning script did --
>>> using the same tools. But without a GUI, of course. Nice idea! Seems
>>> like I'm not the only one who worked on making this very task as
>>> easy as possible ;-)
>>
>> Yes, well, that is what people in the office need. "Classical" office
>> suites are just the beginning. Things that we Linux guys long
>> considered black magic (such as printing, scanning, OCR, document
>> analysis, etc.) are now super easy and available on mobile platforms.
>> And that is what we compete with on the office desktop.
>
> I don't know if you have thought of this for writing invoices: include
> an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on the
> invoice. I am thinking about adding a webcam interface so that
> KMyMoney can use it to read the QR code and use the data to fill out
> the form for online payments that KMyMoney already has. Or even use
> the PDF file as input and pull out the QR code. That would improve
> productivity, I think. I know, a bit off topic here.

Would be a cool feature for sure -- but I don't think I could add this
to Scandoc in a meaningful way, could I?

>> I know people who run Kraft on Chromebooks to enable the users to use
>> the "nice" productivity apps available with Android. Tough times ;-)
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022 10:51:32 CET, Tobias Leupold wrote:
> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>> On Thursday, 10 November 2022 09:28:17 CET, Klaas Freitag wrote:
>>> On 09.11.22 at 21:18, Tobias Leupold wrote:
>>>> On Wednesday, 9 November 2022, 20:59:15 CET, Klaas Freitag wrote:
>>>>> On 09.11.22 at 20:22, Nate Graham wrote:
>>>>
>>>> That is essentially what my old document scanning script did --
>>>> using the same tools. But without a GUI, of course. Nice idea!
>>>> Seems like I'm not the only one who worked on making this very task
>>>> as easy as possible ;-)
>>>
>>> Yes, well, that is what people in the office need. "Classical"
>>> office suites are just the beginning. Things that we Linux guys long
>>> considered black magic (such as printing, scanning, OCR, document
>>> analysis, etc.) are now super easy and available on mobile
>>> platforms. And that is what we compete with on the office desktop.
>>
>> I don't know if you have thought of this for writing invoices:
>> include an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on
>> the invoice. I am thinking about adding a webcam interface so that
>> KMyMoney can use it to read the QR code and use the data to fill out
>> the form for online payments that KMyMoney already has. Or even use
>> the PDF file as input and pull out the QR code. That would improve
>> productivity, I think. I know, a bit off topic here.
>
> Would be a cool feature for sure -- but I don't think I could add this
> to Scandoc in a meaningful way, could I?

Oh, sorry for the confusion: that is not what I meant. Scandoc would
probably be in a position to provide the PDF from which this
information can then be extracted.

--
Regards

Thomas Baumgart

The good thing about FOSS is that people can see your code and comment
it. The bad thing is that people can see your code and comment it.
(asoliverez)
Re: Would Scandoc be something for Extragear?
Sune,

On Thursday, 10 November 2022 10:31:57 CET, Sune Vuorela wrote:
> On 2022-11-10, Thomas Baumgart wrote:
>> I don't know if you have thought of this for writing invoices:
>> include an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on
>> the invoice. I am thinking about adding a webcam interface so that
>> KMyMoney can use it to read the QR code and use the data to fill out
>> the form for online payments that KMyMoney already has. Or even use
>> the PDF file as input and pull out the QR code. That would improve
>> productivity, I think. I know, a bit off topic here.
>
> libprison and the Itinerary application might help you with this;
> together they should have the features to read barcodes from PDFs and
> from a webcam, and to generate barcodes.

Thanks for mentioning it; I am aware of that. Generating would be
something more in the area of Kraft and printing invoices. I know that
Itinerary uses the libs I looked into for KMyMoney, and I will contact
Volker when I am ready to play with it. Time is the limiting factor (as
usual).

--
Regards

Thomas Baumgart

Q: How do you make a water bed more bouncy?
A: Fill it with spring water!
Re: Would Scandoc be something for Extragear?
On 10.11.22 at 10:09, Thomas Baumgart wrote:

Hi Thomas,

> I don't know if you have thought of this for writing invoices: include
> an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on the
> invoice.

The next version of Kraft, which I am about to release, already has EPC
QR code capabilities [1]. That was a feature request from users. I
should have docs and a blog post about that...

> I am thinking about adding a webcam interface so that KMyMoney can use
> it to read the QR code and use the data to fill out the form for
> online payments that KMyMoney already has. Or even use the PDF file as
> input and pull out the QR code. That would improve productivity, I
> think.

Well, if it were only about Kraft, we would probably be better off
using metadata embedded in the PDF to detect that. Having it as an
OCR-based feature would of course make it more generic.

> I know, a bit off topic here.

Is it? Where else would we discuss that?

regards,
Klaas

[1] https://github.com/dragotin/kraft/commit/4adecbf263844dd27141b6b1cf5c15a89af15102
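Editor's sketch of what the text inside an EPC QR code looks like, for readers who have not seen one. It follows the publicly documented line layout (service tag `BCD`, version, character-set number, `SCT`, BIC, name, IBAN, amount, purpose, structured reference, unstructured remittance text). This is not Kraft's implementation; `build_epc_payload` and its parameters are hypothetical names, and IBAN validation and the payload size limit are deliberately left out. Turning the string into an actual QR image would be a separate step with a barcode library.

```python
from decimal import Decimal


def build_epc_payload(name: str, iban: str, amount: str,
                      remittance: str = "", bic: str = "") -> str:
    """Return the newline-separated text that goes into an EPC QR code.

    Hypothetical helper for illustration; it does not validate the IBAN.
    """
    value = Decimal(amount)
    if not Decimal("0.01") <= value <= Decimal("999999999.99"):
        raise ValueError("amount out of range for an EPC QR code")
    lines = [
        "BCD",                  # service tag
        "002",                  # version
        "1",                    # character set: 1 means UTF-8
        "SCT",                  # SEPA credit transfer
        bic,                    # optional in version 002
        name,                   # beneficiary name
        iban.replace(" ", ""),  # beneficiary account
        f"EUR{value:.2f}",      # amount, rounded to two decimals
        "",                     # purpose code (unused here)
        "",                     # structured reference (unused here)
        remittance,             # unstructured remittance text
    ]
    return "\n".join(lines)
```

For example, `build_epc_payload("Red Cross", "BE72 0000 0000 1616", "1", remittance="Urgency fund", bic="BPOTBEB1")` yields eleven lines starting with `BCD`, `002`, `1`, `SCT`.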
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022, 12:25:26 CET, Klaas Freitag wrote:
> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>
> Hi Thomas,
>
>> I don't know if you have thought of this for writing invoices:
>> include an EPC QR code (https://de.wikipedia.org/wiki/EPC-QR-Code) on
>> the invoice.
>
> The next version of Kraft, which I am about to release, already has
> EPC QR code capabilities [1]. That was a feature request from users.
> I should have docs and a blog post about that...
>
>> I am thinking about adding a webcam interface so that KMyMoney can
>> use it to read the QR code and use the data to fill out the form for
>> online payments that KMyMoney already has. Or even use the PDF file
>> as input and pull out the QR code. That would improve productivity, I
>> think.
>
> Well, if it were only about Kraft, we would probably be better off
> using metadata embedded in the PDF to detect that. Having it as an
> OCR-based feature would of course make it more generic.
>
>> I know, a bit off topic here.
>
> Is it? Where else would we discuss that?

This mailing list is the right place for sure, but I think Thomas meant
it's a bit off-topic for this thread (which is about whether or not
Scandoc would be enriching and/or suitable for Extragear).

> regards,
> Klaas
>
> [1] https://github.com/dragotin/kraft/commit/4adecbf263844dd27141b6b1cf5c15a89af15102
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022 12:25:26 CET, Klaas Freitag wrote:
> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>> I am thinking about adding a webcam interface so that KMyMoney can
>> use it to read the QR code and use the data to fill out the form for
>> online payments that KMyMoney already has. Or even use the PDF file
>> as input and pull out the QR code. That would improve productivity, I
>> think.
>
> Well, if it were only about Kraft, we would probably be better off
> using metadata embedded in the PDF to detect that. Having it as an
> OCR-based feature would of course make it more generic.

Right, we likely need to support all of those methods for handling
arbitrary input documents anyway.

I'd be very interested in EPC QR sample documents to check whether
ZXing and our PDF extractor can handle those correctly :) (EPC QR is
slightly special in that it is text-based while defining the text codec
inside the text itself, which could potentially confuse content/codec
auto-detection.)

Regards,
Volker
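Editor's sketch of the codec issue mentioned above: the third line of an EPC QR payload is a number naming the character set, so a reader can split the raw bytes first (the header lines are plain ASCII in every listed codec) and only then decode the rest. This is not the ZXing or Itinerary code path; `decode_epc_payload` is a hypothetical helper, and the table maps the character-set numbers from the public EPC QR description to Python codec names.

```python
# Character-set numbers used in line 3 of an EPC QR payload,
# mapped to Python codec names.
EPC_CODECS = {
    "1": "utf-8",
    "2": "iso8859-1",
    "3": "iso8859-2",
    "4": "iso8859-4",
    "5": "iso8859-5",
    "6": "iso8859-7",
    "7": "iso8859-10",
    "8": "iso8859-15",
}


def decode_epc_payload(raw: bytes) -> list[str]:
    """Decode raw QR bytes using the codec the payload itself declares."""
    lines = raw.splitlines()  # for bytes this splits on \n, \r and \r\n only
    if len(lines) < 3 or lines[0] != b"BCD":
        raise ValueError("not an EPC QR payload")
    codec = EPC_CODECS.get(lines[2].decode("ascii", errors="replace").strip())
    if codec is None:
        raise ValueError("unknown character-set number in line 3")
    return [line.decode(codec) for line in lines]
```

A scanner that guesses the codec from the bytes alone can get this wrong: a Latin-1 payload containing "ü" is not valid UTF-8, while the declared character-set number resolves the question without guessing.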
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022 17:38:21 CET, Volker Krause wrote:
> On Thursday, 10 November 2022 12:25:26 CET, Klaas Freitag wrote:
>> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>>> I am thinking about adding a webcam interface so that KMyMoney can
>>> use it to read the QR code and use the data to fill out the form for
>>> online payments that KMyMoney already has. Or even use the PDF file
>>> as input and pull out the QR code. That would improve productivity,
>>> I think.
>>
>> Well, if it were only about Kraft, we would probably be better off
>> using metadata embedded in the PDF to detect that. Having it as an
>> OCR-based feature would of course make it more generic.
>
> Right, we likely need to support all of those methods for handling
> arbitrary input documents anyway.
>
> I'd be very interested in EPC QR sample documents to check whether
> ZXing and our PDF extractor can handle those correctly :) (EPC QR is
> slightly special in that it is text-based while defining the text
> codec inside the text itself, which could potentially confuse
> content/codec auto-detection.)

I tried ZXing a while back to read the EPC QR codes I had at the time.
The results looked very promising (of course, no special characters
were included, so the codec did not really matter). I'll see if I can
get hold of some and send them off to you via PM.

--
Regards

Thomas Baumgart

If a cluttered desk is characteristic of a cluttered mind, what does an
empty desk mean?
Re: Would Scandoc be something for Extragear?
Hello Volker,

On 10.11.22 at 17:38, Volker Krause wrote:
> On Thursday, 10 November 2022 12:25:26 CET, Klaas Freitag wrote:
>> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>>> I am thinking about adding a webcam interface so that KMyMoney can
>>> use it to read the QR code and use the data to fill out the form for
>>> online payments that KMyMoney already has. Or even use the PDF file
>>> as input and pull out the QR code. That would improve productivity,
>>> I think.
>>
>> Well, if it were only about Kraft, we would probably be better off
>> using metadata embedded in the PDF to detect that. Having it as an
>> OCR-based feature would of course make it more generic.
>
> Right, we likely need to support all of those methods for handling
> arbitrary input documents anyway.
>
> I'd be very interested in EPC QR sample documents to check whether
> ZXing and our PDF extractor can handle those correctly :) (EPC QR is
> slightly special in that it is text-based while defining the text
> codec inside the text itself, which could potentially confuse
> content/codec auto-detection.)

I created a test doc here (my hosting):
https://pjatniza.net/owncloud/index.php/s/Km8iIjLBYJHe5nY

If you need more, let me know, but the QR code only encodes some quite
"interesting" text, so it should not change anything structural if I
sent you another invoice or two built with Kraft. Yeah, text encoding
maybe... ;-)

regards,
Klaas
Re: Would Scandoc be something for Extragear?
On Thursday, 10 November 2022 18:00:27 CET, Klaas Freitag wrote:
> Hello Volker,
>
> On 10.11.22 at 17:38, Volker Krause wrote:
>> On Thursday, 10 November 2022 12:25:26 CET, Klaas Freitag wrote:
>>> On 10.11.22 at 10:09, Thomas Baumgart wrote:
>>>> I am thinking about adding a webcam interface so that KMyMoney can
>>>> use it to read the QR code and use the data to fill out the form
>>>> for online payments that KMyMoney already has. Or even use the PDF
>>>> file as input and pull out the QR code. That would improve
>>>> productivity, I think.
>>>
>>> Well, if it were only about Kraft, we would probably be better off
>>> using metadata embedded in the PDF to detect that. Having it as an
>>> OCR-based feature would of course make it more generic.
>>
>> Right, we likely need to support all of those methods for handling
>> arbitrary input documents anyway.
>>
>> I'd be very interested in EPC QR sample documents to check whether
>> ZXing and our PDF extractor can handle those correctly :) (EPC QR is
>> slightly special in that it is text-based while defining the text
>> codec inside the text itself, which could potentially confuse
>> content/codec auto-detection.)
>
> I created a test doc here (my hosting):
> https://pjatniza.net/owncloud/index.php/s/Km8iIjLBYJHe5nY
>
> If you need more, let me know, but the QR code only encodes some quite
> "interesting" text, so it should not change anything structural if I
> sent you another invoice or two built with Kraft. Yeah, text encoding
> maybe... ;-)

Thanks! No problem with ZXing, but as this is ASCII-only, that is
expected. The QR code is a vector graphic in the PDF (good, but rare).
That required a minor tweak in the PDF extractor used by Itinerary for
it to be properly recognized, but it works now as well.

Besides being potentially useful for PDF import in KMyMoney, this could
also enable things like a KMail plugin that automatically detects
invoices and offers to trigger the payment, for example, similar to
what we do with adding travel tickets to the calendar :)

Regards,
Volker