On Feb 26, 2024, at 4:05 PM, Eric Lease Morgan <emor...@nd.edu> wrote:

> Who out here in Code4Lib Land is practicing with either one or both of the 
> following things: 1) fine-tuning large-language models, or 2) 
> retrieval-augmented generation (RAG). If there is somebody out there, then 
> I'd love to chat...


Many things.

  1. First of all, I'm happy there were a number of different replies. I 
learned something.

  2. Second, I believe the phrase "artificial intelligence" (AI) is a poor choice 
of words if not a misnomer. What is "intelligence" anyway, and why should I 
give any credence to fake intelligence? Is the ability to do mathematics very 
quickly intelligent? Is the ability to store and retrieve vast amounts of 
information intelligent? I say not, but some people call such things "smart". AI 
has ebbed & flowed over the course of computing history. In the 1990s AI was 
implemented as "expert systems". We are experiencing an ebb.

  3. Third, computer technology evolves. Think of all the computer technology 
evolutions libraries have experienced. Cards to MARC. MARC to OPAC. Print 
indexes to indexes on CD-ROMs. Field searching to free-text searching with 
relevancy ranking. Every time these things happen, some blindly embrace the 
evolution, some are skeptical, and some believe the evolution is a fad. This is 
natural. Generative AI is just another example; the current flavor of AI is a 
mash-up of natural language processing, image processing, and data science, all 
on steroids.

  4. Fourth, with the advent of generative AI, for the first time in my life, I 
feel threatened by a computer. A computer can do some of my job. It can write 
software. It can summarize text. It can classify text. It can create MARC 
records. Yikes!?

  5. Fifth, I found a few places to discuss AI in libraries. First of all, there 
is the AI4LAM Slack channel, and there are a couple of similar channels in the 
Code4Lib Slack (#ai-dl-ml and #generative-ai).

  6. Sixth, a few projects were brought to my attention, and of particular 
interest to me were WARC-GPT, Talpa, and Daybooks of Susan B. Anthony. [1, 2, 
3] In each of these cases the developers: 1) had a collection, 2) used 
large-language model technology to index/analyze the content, 3) provided a 
mechanism to query the collection/analysis, and 4) returned a useful result. A 
minimal sketch of this pattern follows.
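
To make that pattern concrete, here is a minimal sketch written in Python. It is 
not how any of the projects above is implemented; it assumes scikit-learn is 
installed, it substitutes TF-IDF for a large-language model's embeddings in the 
indexing step, and its generate_answer() function is merely a placeholder for 
the call to whatever model does the generating:

  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.metrics.pairwise import cosine_similarity

  # 1) a collection; here, just a toy list of documents
  collection = [
      "Susan B. Anthony kept daybooks recording her daily activities.",
      "MARC records describe bibliographic items in machine-readable form.",
      "Web archives are commonly stored in WARC files.",
  ]

  # 2) index/analyze the content (TF-IDF here; vector embeddings in practice)
  vectorizer = TfidfVectorizer()
  index = vectorizer.fit_transform(collection)

  # 3) a mechanism to query the collection: return the k most similar documents
  def retrieve(query, k=2):
      scores = cosine_similarity(vectorizer.transform([query]), index)[0]
      ranked = sorted(zip(scores, collection), reverse=True)
      return [document for (score, document) in ranked[:k]]

  # 4) return a useful result; generate_answer() is a placeholder for the step
  # where the query plus the retrieved passages are sent to a language model
  def generate_answer(query, passages):
      return query + "\n" + "\n".join(passages)

  print(generate_answer("What is a WARC file?", retrieve("What is a WARC file?")))

Retrieval-augmented generation swaps the TF-IDF index for embeddings and the 
placeholder for an actual model call, but the shape of the workflow -- 
collection, index, query, result -- is the same.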

Finally, I see generative AI as a tool, and just like any other tool -- a 
hammer, for example -- one needs to practice in order to use it effectively. My 
toolbox is getting bigger.


Links

[1] WARC-GPT - https://github.com/harvard-lil/warc-gpt
[2] Talpa - https://www.talpa.ai/
[3] Daybooks of Susan B. Anthony - 
https://thisismattmiller.com/post/using-gpt-on-library-collections/

--
Eric Morgan <emor...@nd.edu>
Navari Family Center for Digital Scholarship
Hesburgh Libraries
University of Notre Dame
