FWIW, OpenAI, the maker of ChatGPT, has an open-source (MIT license) speech transcriber and translator called Whisper. I tried it, but it was incredibly slow. It took several minutes on a 2019 MacBook Pro to transcribe the first few lines of dialogue.
https://github.com/openai/whisper

From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> on behalf of Coates, Sarah N <sarah.coa...@ufl.edu>
Date: Friday, April 7, 2023 at 7:16 AM
To: CODE4LIB@LISTS.CLIR.ORG <CODE4LIB@LISTS.CLIR.ORG>
Subject: Re: [CODE4LIB] Using ChatGPT to transcribe manuscripts?

Thank you so much, all, for the excellent suggestions! I've passed them on to my colleague, who sends her thanks as well.

Sarah

----------------------
Sarah Coates, CA
University Archivist
University Archives
PO Box 117005
George A. Smathers Libraries
University of Florida
Gainesville, FL 32611-7005
sarah.coa...@ufl.edu
352-273-2817
________________________________
From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> on behalf of Lena G. Bohman <lena.g.boh...@hofstra.edu>
Sent: Friday, April 7, 2023 9:50 AM
To: CODE4LIB@LISTS.CLIR.ORG <CODE4LIB@LISTS.CLIR.ORG>
Subject: Re: [CODE4LIB] Using ChatGPT to transcribe manuscripts?

Quartex has a product that does this as well: https://www.amdigital.co.uk/create/am-quartex/htr-and-ocr

I looked seriously at them when I was working on a repository upgrade a
year ago or so. They're really aiming at archives, so it didn't work out, but I was impressed with the product.

Lena

Lena Bohman
Data and Research Impact Librarian
Long Island Jewish - Forest Hills Liaison
Donald and Barbara Zucker School of Medicine at Hofstra/Northwell
________________________________
From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> on behalf of Mia <mia.ri...@gmail.com>
Sent: Friday, April 7, 2023 6:34 AM
To: CODE4LIB@LISTS.CLIR.ORG <CODE4LIB@LISTS.CLIR.ORG>
Subject: Re: [CODE4LIB] Using ChatGPT to transcribe manuscripts?

Hi all,

A second vote for READ's Transkribus where the use case is transcribing handwritten or typed text from an image. They also have a text correction option. I know people have experimented with ChatGPT for OCR correction, but I suspect it'd do best where the text only contains common words. It might well make up words where it doesn't know the correct word such as an archaic term, personal name or jargon.

Cheers, Mia
--------------------------------------------
http://openobjects.org.uk/
http://twitter.com/mia_out
Check out my book!
http://bit.ly/CrowdsourcingOurCulturalHeritage
P.S. I mostly use this address for list mail and don't check it daily

On Thu, 6 Apr 2023 at 21:06, Clapp, Sharon B.
(Library) <000000beaa7a0956-dmarc-requ...@lists.clir.org> wrote:
> Hi all,
> If you haven't looked at this https://readcoop.eu/transkribus/ - it might
> be of interest...
> Transkribus <https://readcoop.eu/transkribus/>
> Unlock historical documents with AI
> Transkribus is an AI-powered platform for text recognition, transcription and searching of historical
> readcoop.eu
>
> Sharon Clapp
> Digital Resources Librarian, Elihu Burritt Library
> scl...@ccsu.edu
> 860-832-2059
> ________________________________
> From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> on behalf of Matt Sherman <matt.r.sher...@gmail.com>
> Sent: Thursday, April 6, 2023 3:15 PM
> To: CODE4LIB@LISTS.CLIR.ORG <CODE4LIB@LISTS.CLIR.ORG>
> Subject: Re: [CODE4LIB] Using ChatGPT to transcribe manuscripts?
>
> You can use AI to help with this, though I don't think ChatGPT is built out
> for this kind of work due to its interfacing approach. There are other AI
> systems you can try. There was a Code4AI pre-conference this year and the
> presenter was using TensorFlow to talk about/show off AI. So you would want
> to look into the various AIs available and see which ones are being
> built/trained around the kind of work you want to do.
>
> On Thu, Apr 6, 2023 at 3:12 PM Coates, Sarah N <sarah.coa...@ufl.edu> wrote:
>
> > Hi, all!
> >
> > I had a colleague ask an interesting question that I didn't know the
> > answer to, so I thought I'd pass it along to your collective wisdom. The
> > question is: Could ChatGPT assist with transcription work for manuscript
> > letters (handwritten, in either print or cursive)? For example, could
> > ChatGPT help figure out a word if the transcription only had 3 of the 6
> > letters? Would ChatGPT be able to identify that word? Has anyone heard of
> > someone using ChatGPT (or a similar program) to assist with transcription
> > of manuscript letters?
> >
> > Thanks!
> > Sarah