Hi Kilian

Hope all has been well.

I'm surprised that people are still "wording around" nowadays. Some
suggestions:

1. Can't we rename "MWEs" to "fixed/idiomatic expressions" instead? One can
reformulate these as character sequences/strings of various lengths and
vocabularies.
2. Also, one can interpret these without reference to any syntactic
categories (nouns, verbs, etc.).
3. They simply represent lexical information (some of it reflecting/encoding
historico-social habits, though one should also be aware of the ethical
aspects of reinforcing some "traditional values"). Perhaps a more
sophisticated view of language could help wean practitioners from a
mindset that relies on "linguistic structure(s)" as we've had them thus far
(i.e. based on "words" and "sentences")?
4. Re "their meaning often does not result from the direct combination of
the meanings of their parts": non-compositionality may be a better
description of a more realistic view of language; it should probably be our
default expectation (instead of the cherry-picked compositional
counterparts).

I think efforts towards mitigating a mental dependency on "words" would be
a good direction to pursue. What do you think?
Can we get SIGLEX to update in this regard?

Best
Ada


On Wed, Feb 8, 2023 at 11:12 AM Kilian Evang via Corpora <
[email protected]> wrote:

> [Apologies for cross-postings]
>
>
> ********************************************************************************
>
> Call for Papers: Deadline extended
>
> 19th Workshop on Multiword Expressions (MWE 2023)
>
> Organized and sponsored by SIGLEX, the Special Interest Group
> on the Lexicon of the ACL
>
> Full-day workshop collocated with EACL 2023, Dubrovnik, Croatia, May 5
> or 6, 2023
>
> Hybrid (on-site & on-line)
>
> NEW: Submission deadline: February 20, 2023
>
> NEW: Invited speakers announced (see below)
>
> NEW: Best paper award (see below)
>
> MWE 2023 website: https://multiword.org/mwe2023/
>
>
> ********************************************************************************
>
> Multiword expressions (MWEs) are word combinations that exhibit
> lexical, syntactic, semantic, pragmatic, and/or statistical
> idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog,
> pay a visit and pull one's leg. The notion encompasses closely related
> phenomena: idioms, compounds, light-verb constructions, phrasal verbs,
> rhetorical figures, collocations, institutionalised phrases, etc.
> Their behaviour is often unpredictable; for example, their meaning
> often does not result from the direct combination of the meanings of
> their parts. Given their irregular nature, MWEs often pose complex
> problems in linguistic modelling (e.g. annotation), NLP tasks (e.g.
> parsing), and end-user applications (e.g. natural language
> understanding and MT), hence still representing an open issue for
> computational linguistics (Constant et al. 2017).
>
> For almost two decades, modelling and processing MWEs for NLP has been
> the topic of the MWE workshop organised by the MWE section of SIGLEX
> in conjunction with major NLP conferences since 2003. Impressive
> progress has been made in the field, but our understanding of MWEs
> still requires much research considering their need and usefulness in
> NLP applications. This is also relevant to domain-specific NLP
> pipelines that need to tackle terminologies most often realised as
> MWEs. Following previous years, for this 19th edition of the workshop,
> we identified the following topics on which contributions are
> particularly encouraged:
>
> MWE processing and identification in specialized languages and
> domains: Multiword terminology extraction from domain-specific corpora
> (Bonin et al. 2010) is of particular importance to various
> applications, such as MT (Semmar & Laib, 2017), or for the
> identification and monitoring of neologisms and technical jargon
> (Chatzitheodorou et al., 2021). We expect approaches that deal with
> the processing of MWEs as well as the processing of terminology in
> specialised domains can benefit from each other.
>
> MWE processing to enhance end-user applications: MWEs have gained
> particular attention in end-user applications, including MT (Zaninello
> & Birch 2020; Han et al. 2021, 2022), simplification (Kochmar et al.
> 2020), language learning and assessment (Paquot et al. 2019;
> Christiansen & Arnon 2017), social media mining (Maisto et al. 2017),
> and abusive language detection (Zampieri et al. 2020; Caselli et al.
> 2020). We believe that it is crucial to extend and deepen these first
> attempts to integrate and evaluate MWE technology in these and further
> end-user applications.
>
> MWE identification and interpretation in pre-trained language models:
> Most current MWE processing is limited to their identification and
> detection using pre-trained language models, but we still lack
> understanding about how MWEs are represented and dealt with therein
> (Nedumpozhimana & Kelleher 2021; Garcia et al. 2021; Fakharian & Cook
> 2021), and how to better model the compositionality of MWEs from
> semantics (Moreau et al. 2018). Now that NLP has shifted towards end-to-end
> neural models like BERT, capable of solving complex tasks with little
> or no intermediary linguistic symbols, questions arise about the
> extent to which MWEs should be implicitly or explicitly modelled
> (Shwartz & Dagan, 2019).
>
> MWE processing in low-resource languages: The PARSEME shared tasks
> (Ramisch et al. 2020; 2018; Savary et al. 2017), among others, have
> fostered significant progress in MWE identification, providing
> datasets that include low-resource languages, evaluation measures, and
> tools that now allow fully integrating MWE identification into
> end-user applications. A few efforts have recently explored methods
> for the automatic interpretation of MWEs (Bhatia et al. 2018; 2017),
> and their processing in low-resource languages (Liu & Wang 2020; Kumar
> et al. 2017). Resource creation and sharing should be pursued in
> parallel with the development of methods able to capitalize on small
> datasets (Han et al. 2020).
>
> Through this workshop, we would like to bring together and encourage
> researchers in various NLP subfields to submit MWE-related research,
> so that approaches that deal with processing of MWEs including
> processing for low-resource languages and for various applications can
> benefit from each other. We also intend to consolidate the converging
> effects of previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and
> MWE-LEX 2020, the joint MWE-WOAH panel in 2021, and the MWE-SIGUL 2022
> joint session, extending our scope to MWEs in e-lexicons and WordNets,
> MWE annotation, as well as grammatical constructions. Correspondingly,
> we call for papers on research related (but not limited) to MWEs and
> constructions in:
>
> Computationally-applicable theoretical work in psycholinguistics and
> corpus linguistics;
>
> Annotation (expert, crowdsourcing, automatic) and representation in
> resources such as corpora, treebanks, e-lexicons, and WordNets (also
> for low-resource languages);
>
> Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
> LFG, TAG, UD, etc.);
>
> Discovery and identification methods, including for specialized
> languages and domains such as clinical or biomedical NLP;
>
> Interpretation of MWEs and understanding of text containing them;
>
> Language acquisition, language learning, and non-standard language
> (e.g. tweets, speech);
>
> Evaluation of annotation and processing techniques;
>
> Retrospective comparative analyses from the PARSEME shared tasks;
>
> Processing for end-user applications (e.g. MT, NLU, summarisation,
> language learning, etc.);
>
> Implicit and explicit representation in pre-trained language models
> and end-user applications;
>
> Evaluation and probing of pre-trained language models;
>
> Resources and tools (e.g. lexicons, identifiers) and their integration
> into end-user applications;
>
> Multiword terminology extraction;
>
> Adaptation and transfer of annotations and related resources to new
> languages and domains including low-resource ones.
>
>
> Shared Task
>
> We do not have a shared task this year, but a new release of the
> PARSEME corpus of verbal MWEs is currently underway. We encourage
> submission of research papers that include analyses of the new edition
> of the PARSEME data and improvements over the results for PARSEME 2020
> shared task as well as SemEval 2022 task 2 on idiomaticity prediction.
>
>
> *** Special Track on MWEs in Clinical NLP ***
>
> Pursuing the MWE Section’s tradition of synergies with other
> communities, this year, we are organizing a joint session with the
> Clinical NLP workshop for shared papers/poster presentations. Since
> clinical texts contain an important amount of multiword expressions
> (e.g. medical terms or domain-specific collocations), a joint session
> is deemed beneficial for both communities. The goal is to foster
> future synergies that could address scientific challenges in the
> creation of resources, models and applications to deal with multiword
> expressions and related phenomena in the specialised domain of
> ClinicalNLP. Submissions describing research on MWEs in the
> specialized domain of ClinicalNLP, especially introducing new datasets
> or new tools and resources, are welcome. Papers accepted in this track
> will have the option to present their work in the Clinical NLP
> workshop at ACL 2023 as well, after being presented at MWE 2023.
>
>
> Invited Speakers
>
> We are looking forward to invited talks by two amazing speakers:
>
> Leo Wanner, Universitat Pompeu Fabra
>
> TBD
>
>
> Best paper award
>
> All full papers in the workshop will be considered by the program
> committee for a best paper award. The decision will be announced in
> the closing session.
>
>
> Submission formats
>
> The workshop invites two types of submissions:
>
> archival submissions that present substantially original research in
> both long paper format (8 pages + references) and short paper format
> (4 pages + references).
>
> non-archival submissions of abstracts describing relevant research
> presented/published elsewhere which will not be included in the MWE
> proceedings.
>
>
> Paper submission and templates
>
> Papers should be submitted via the workshop's START submission page
> (https://softconf.com/eacl2023/mwe2023/). Please choose the
> appropriate submission format (archival/non-archival). Archival papers
> with existing reviews will also be accepted through the ACL Rolling
> Review. Submissions must follow the ACL 2023 stylesheet.
>
>
> Archival papers with existing reviews from ACL Rolling Review will
> also be considered. A paper may not be simultaneously under review
> through ARR and MWE. A paper that has received or will receive reviews
> through ARR may not be submitted for review to MWE.
>
>
> Important Dates
>
> Paper submission: February 20, 2023
>
> ARR paper commitment: March 6, 2023
>
> Notification of acceptance: March 13, 2023
>
> Camera-ready papers due: March 27, 2023
>
> Workshop: May 5 or 6, 2023
>
>
> All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
>
>
> Organizing Committee
>
> Program chairs: Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
>
> Publication chair: Archna Bhatia
>
> Publicity chair: Kilian Evang
>
>
> Anti-harassment policy
>
> The workshop follows the ACL anti-harassment policy.
>
>
> Contact
>
> For any inquiries regarding the workshop, please send an email to the
> Organizing Committee at [email protected].
> _______________________________________________
> Corpora mailing list -- [email protected]
> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
> To unsubscribe send an email to [email protected]
>
