[Corpora-List] A Robust Pre-trained Model in French for Biomedical and Clinical domains

Mickael ROUVIER via Corpora Wed, 05 Apr 2023 01:55:41 -0700

Dear all, 

We are proud to announce DrBERT, our first biomedical language model for 
French. It is now available on Hugging Face, and the paper is on arXiv 
(https://arxiv.org/abs/2304.00958).


You can now use the model on your own documents and get state-of-the-art 
performance in only three lines of code. 
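
As an illustration, here is a minimal sketch of what loading the model could 
look like with the Hugging Face `transformers` library. The model identifier 
`Dr-BERT/DrBERT-7GB`, the example sentence, and the helper names 
`build_fill_mask`, `top_tokens` and `demo` are assumptions made for this 
example, not taken from the announcement; please check the Hugging Face page 
for the checkpoints actually published.

```python
# Sketch only: MODEL_ID is an assumed checkpoint name; verify it on the
# Hugging Face organisation page before use.
MODEL_ID = "Dr-BERT/DrBERT-7GB"


def build_fill_mask(model_id=MODEL_ID):
    """Load a masked-language-model pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline
    return pipeline("fill-mask", model=model_id)


def top_tokens(predictions, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 3)) for p in ranked[:k]]


def demo():
    """Predict the masked word in a (hypothetical) French clinical sentence."""
    fill_mask = build_fill_mask()
    # Ask the tokenizer for its mask token rather than hard-coding it.
    mask = fill_mask.tokenizer.mask_token
    sentence = f"Le patient souffre d'une {mask} chronique."
    return top_tokens(fill_mask(sentence))
```

Calling `demo()` requires `transformers` and a backend such as PyTorch, plus 
network access for the first download. The core of it is the import, the 
`pipeline(...)` call and the prediction itself; the rest is only formatting.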

Check out: 
- Project website: https://drbert.univ-avignon.fr/ 
- Hugging Face models: https://huggingface.co/Dr-BERT 


Our model was trained on 128 GPUs of the Jean Zay supercomputer and evaluated 
on 11 distinct practical biomedical tasks for French, drawn from public and 
private data. These tasks include named entity recognition (NER), 
part-of-speech (POS) tagging, binary, multi-class and multi-label 
classification, and multiple-choice question answering. The results show that 
DrBERT improved performance on most tasks compared to prior approaches, 
indicating that pre-training from scratch is still the most effective strategy 
for BERT language models in the French biomedical domain. 

Tutorials on biomedical natural language processing are coming soon, stay 
tuned! 

With Yanis Labrak (LIA / Zenidoc), Adrien Bazoge (LS2N), Richard Dufour (LS2N), 
Mickael Rouvier (LIA), Emmanuel Morin (LS2N), Béatrice Daille (LS2N) and 
Pierre-Antoine Gourraud (Nantes University / CHU Nantes). 

Best regards.

