** Apologies for cross-posting **

Applications are invited for a fully funded PhD position at the Leiden 
Institute of Advanced Computer Science (LIACS), The Netherlands, on the use of 
Formal Methods to enhance the efficiency, transparency, and understanding of 
Transformer-based language models.

While large language models (LLMs) have proven successful in many areas of 
Natural Language Processing, they suffer from high data and resource usage, and 
display limited generalization capacity in tasks that humans excel at. In this 
PhD project you will have the opportunity to investigate how formal methods can 
help in developing more efficient and more transparent models for Natural 
Language Understanding. Specifically, you will investigate the use of implicit 
or explicit structural bias in Transformer-based language models to reduce 
training data and model parameters; additionally, you will look at novel 
techniques for evaluating models for their generalization capabilities on 
Natural Language Understanding tasks such as Natural Language Inference, 
possibly in a multilingual and multimodal setting.

The specific project content will be decided based on the applicant's 
interests and the expertise of the supervisor, dr. Gijs Wijnholds.

Topics include (but are not limited to):

- Using logical methods to define task-relevant constraints on LLM finetuning;
- Incorporating structured representations in regularized training of smaller 
language models;
- Assessing the generalization capacity of Transformer-based models in the 
context of formal language theory, model probing, and Natural Language 
Inference;
- Evaluation of Natural Language Understanding models in the presence of 
ambiguity and/or annotator disagreement;
- Understanding multilingual Natural Language Inference in Vision-Language 
Models.

For the official vacancy text, please see 
https://www.universiteitleiden.nl/en/vacancies/2023/qw4/23-81914335phd-candidate-formal-methods-in-natural-language-processing

The application deadline is January 13, 2024. The ideal starting date is 
March 2024, but this is negotiable depending on circumstances.

For further information, feel free to reach out to dr. Gijs Wijnholds 
([email protected]).

dr. Gijs Wijnholds
Assistant Professor in Natural Language Processing
Text Mining and Retrieval Group <https://tmr.liacs.nl/>
Leiden Institute of Advanced Computer Science
https://gijswijnholds.github.io