Hi Ken

Thanks for the message. Unfortunately, it looks like there have been no
prior discussions on any of the topics I suggested, and the earliest post I
can access dates back only to 22 Nov 2020. I can surely start a discussion,
but that might look to be the first/only discussion on the list? (I went
through all the conversations accessible thus far and only saw
announcements.)

Perhaps more importantly:
as this seems to be an issue that could also affect other areas of concern
to the general audience of the Corpora-List (*not just for MWEs/SIGLEX*),
is there a way that we all can make some changes in the "language space"
across the board?

Thanks and best
Ada


On Wed, Feb 8, 2023 at 5:57 PM Ken Litkowski <[email protected]> wrote:

> Dear Ada,
>
> When I added the SIGLEX discussion code back in 2010, I did so with the
> idea that we would have discussions of topics just like yours. The
> discussion has since morphed and is now located on the Google group, via
> https://groups.google.com/g/siglex-members. There, you will find a place
> "Search conversations ..." where you can add your topic so that it will be
> sent to all, rather than just the announcements that are the main topics.
>
>     Ken (webmaster retiree)
> On 2/8/2023 10:18 AM, Ada Wan via Corpora wrote:
>
> Hi Kilian
>
> Hope all has been well.
>
> I'm surprised that people are still "wording around" nowadays. Some
> suggestions:
>
> 1. Can't we rename "MWEs" to "fixed/idiomatic expressions" instead? One
> can reformulate these as sequences/strings/expressions of various
> lengths/vocabs in characters.
> 2. Also, one can interpret these without information/association with any
> syntactic categories, nouns or verbs etc.
> 3. They do just represent lexical info (some reflecting/encoding
> historico-social habits, though one also should be aware of the ethical
> aspects of reinforcing some "traditional values"). Perhaps a more
> sophisticated view of language could help wean practitioners from a
> mindframe that relies on "linguistic structure(s)" as we've had it thus far
> (i.e. based on "words" and "sentences")?
> 4. Re "their meaning often does not result from the direct combination of
> the meanings of their parts": non-compositionality may be a better
> description of a more realistic view of language; it should prob be our
> default expectation (instead of the cherry-picked compositional
> counterparts).
>
> I think efforts towards mitigating a mental dependency on "words" would be
> a good direction to pursue, what do you think?
> Can we get SIGLEX to update in this regard?
>
> Best
> Ada
>
>
> On Wed, Feb 8, 2023 at 11:12 AM Kilian Evang via Corpora <
> [email protected]> wrote:
>
>> [Apologies for cross-postings]
>>
>>
>> ********************************************************************************
>>
>> Call for Papers: Deadline extended
>>
>> 19th Workshop on Multiword Expressions (MWE 2023)
>>
>> Organized and sponsored by SIGLEX, the Special Interest Group
>> on the Lexicon of the ACL
>>
>> Full-day workshop collocated with EACL 2023, Dubrovnik, Croatia, May 5
>> or 6, 2023
>>
>> Hybrid (on-site & on-line)
>>
>> NEW: Submission deadline: February 20, 2023
>>
>> NEW: Invited speakers announced (see below)
>>
>> NEW: Best paper award (see below)
>>
>> MWE 2023 website: https://multiword.org/mwe2023/
>>
>>
>> ********************************************************************************
>>
>> Multiword expressions (MWEs) are word combinations that exhibit
>> lexical, syntactic, semantic, pragmatic, and/or statistical
>> idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog,
>> pay a visit and pull one's leg. The notion encompasses closely related
>> phenomena: idioms, compounds, light-verb constructions, phrasal verbs,
>> rhetorical figures, collocations, institutionalised phrases, etc.
>> Their behaviour is often unpredictable; for example, their meaning
>> often does not result from the direct combination of the meanings of
>> their parts. Given their irregular nature, MWEs often pose complex
>> problems in linguistic modelling (e.g. annotation), NLP tasks (e.g.
>> parsing), and end-user applications (e.g. natural language
>> understanding and MT), hence still representing an open issue for
>> computational linguistics (Constant et al. 2017).
>>
>> For almost two decades, modelling and processing MWEs for NLP has been
>> the topic of the MWE workshop organised by the MWE section of SIGLEX
>> in conjunction with major NLP conferences since 2003. Impressive
>> progress has been made in the field, but our understanding of MWEs
>> still requires much research considering their need and usefulness in
>> NLP applications. This is also relevant to domain-specific NLP
>> pipelines that need to tackle terminologies most often realised as
>> MWEs. Following previous years, for this 19th edition of the workshop,
>> we identified the following topics on which contributions are
>> particularly encouraged:
>>
>> MWE processing and identification in specialized languages and
>> domains: Multiword terminology extraction from domain-specific corpora
>> (Bonin et al. 2010) is of particular importance to various
>> applications, such as MT (Semmar & Laib, 2017), or for the
>> identification and monitoring of neologisms and technical jargon
>> (Chatzitheodorou et al., 2021). We expect approaches that deal with
>> the processing of MWEs as well as the processing of terminology in
>> specialised domains can benefit from each other.
>>
>> MWE processing to enhance end-user applications: MWEs have gained
>> particular attention in end-user applications, including MT (Zaninello
>> & Birch 2020; Han et al. 2021, 2022), simplification (Kochmar et al.
>> 2020), language learning and assessment (Paquot et al. 2019;
>> Christiansen & Arnon 2017), social media mining (Maisto et al. 2017),
>> and abusive language detection (Zampieri et al. 2020; Caselli et al.
>> 2020). We believe that it is crucial to extend and deepen these first
>> attempts to integrate and evaluate MWE technology in these and further
>> end-user applications.
>>
>> MWE identification and interpretation in pre-trained language models:
>> Most current MWE processing is limited to their identification and
>> detection using pre-trained language models, but we still lack
>> understanding about how MWEs are represented and dealt with therein
>> (Nedumpozhimana & Kelleher 2021; Garcia et al. 2021; Fakharian & Cook
>> 2021), and how to better model the compositionality of MWEs from semantics
>> (Moreau et al. 2018). Now that NLP has shifted towards end-to-end
>> neural models like BERT, capable of solving complex tasks with little
>> or no intermediary linguistic symbols, questions arise about the
>> extent to which MWEs should be implicitly or explicitly modelled
>> (Shwartz & Dagan, 2019).
>>
>> MWE processing in low-resource languages: The PARSEME shared tasks
>> (Ramisch et al. 2020; 2018; Savary et al. 2017), among others, have
>> fostered significant progress in MWE identification, providing
>> datasets that include low-resource languages, evaluation measures, and
>> tools that now allow fully integrating MWE identification into
>> end-user applications. A few efforts have recently explored methods
>> for the automatic interpretation of MWEs (Bhatia et al. 2018; 2017),
>> and their processing in low-resource languages (Liu & Wang 2020; Kumar
>> et al. 2017). Resource creation and sharing should be pursued in
>> parallel with the development of methods able to capitalize on small
>> datasets (Han et al. 2020).
>>
>> Through this workshop, we would like to bring together and encourage
>> researchers in various NLP subfields to submit MWE-related research,
>> so that approaches that deal with processing of MWEs including
>> processing for low-resource languages and for various applications can
>> benefit from each other. We also intend to consolidate the converging
>> effects of previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and
>> MWE-LEX 2020, the joint MWE-WOAH panel in 2021, and the MWE-SIGUL 2022
>> joint session, extending our scope to MWEs in e-lexicons and WordNets,
>> MWE annotation, as well as grammatical constructions. Correspondingly,
>> we call for papers on research related (but not limited) to MWEs and
>> constructions in:
>>
>> Computationally-applicable theoretical work in psycholinguistics and
>> corpus linguistics;
>>
>> Annotation (expert, crowdsourcing, automatic) and representation in
>> resources such as corpora, treebanks, e-lexicons, and WordNets (also
>> for low-resource languages);
>>
>> Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
>> LFG, TAG, UD, etc.);
>>
>> Discovery and identification methods, including for specialized
>> languages and domains such as clinical or biomedical NLP;
>>
>> Interpretation of MWEs and understanding of text containing them;
>>
>> Language acquisition, language learning, and non-standard language
>> (e.g. tweets, speech);
>>
>> Evaluation of annotation and processing techniques;
>>
>> Retrospective comparative analyses from the PARSEME shared tasks;
>>
>> Processing for end-user applications (e.g. MT, NLU, summarisation,
>> language learning, etc.);
>>
>> Implicit and explicit representation in pre-trained language models
>> and end-user applications;
>>
>> Evaluation and probing of pre-trained language models;
>>
>> Resources and tools (e.g. lexicons, identifiers) and their integration
>> into end-user applications;
>>
>> Multiword terminology extraction;
>>
>> Adaptation and transfer of annotations and related resources to new
>> languages and domains including low-resource ones.
>>
>>
>> Shared Task
>>
>> We do not have a shared task this year, but a new release of the
>> PARSEME corpus of verbal MWEs is currently underway. We encourage
>> submission of research papers that include analyses of the new edition
>> of the PARSEME data and improvements over the results for PARSEME 2020
>> shared task as well as SemEval 2022 task 2 on idiomaticity prediction.
>>
>>
>> *** Special Track on MWEs in Clinical NLP ***
>>
>> Pursuing the MWE Section’s tradition of synergies with other
>> communities, this year, we are organizing a joint session with the
>> Clinical NLP workshop for shared papers/poster presentations. Since
>> clinical texts contain an important amount of multiword expressions
>> (e.g. medical terms or domain-specific collocations), a joint session
>> is deemed beneficial for both communities. The goal is to foster
>> future synergies that could address scientific challenges in the
>> creation of resources, models and applications to deal with multiword
>> expressions and related phenomena in the specialised domain of
>> Clinical NLP. Submissions describing research on MWEs in the
>> specialized domain of Clinical NLP, especially introducing new datasets
>> or new tools and resources, are welcome. Papers accepted in this track
>> will have the option to present their work in the Clinical NLP
>> workshop at ACL 2023 as well, after being presented at MWE 2023.
>>
>>
>> Invited Speakers
>>
>> We are looking forward to invited talks by two amazing speakers:
>>
>> Leo Wanner, Universitat Pompeu Fabra
>>
>> TBD
>>
>>
>> Best paper award
>>
>> All full papers in the workshop will be considered by the program
>> committee for a best paper award. The decision will be announced in
>> the closing session.
>>
>>
>> Submission formats
>>
>> The workshop invites two types of submissions:
>>
>> archival submissions that present substantially original research in
>> both long paper format (8 pages + references) and short paper format
>> (4 pages + references).
>>
>> non-archival submissions of abstracts describing relevant research
>> presented/published elsewhere which will not be included in the MWE
>> proceedings.
>>
>>
>> Paper submission and templates
>>
>> Papers should be submitted via the workshop's START submission page
>> (https://softconf.com/eacl2023/mwe2023/). Please choose the
>> appropriate submission format (archival/non-archival). Archival papers
>> with existing reviews will also be accepted through the ACL Rolling
>> Review. Submissions must follow the ACL 2023 stylesheet.
>>
>>
>> Archival papers with existing reviews from ACL Rolling Review will
>> also be considered. A paper may not be simultaneously under review
>> through ARR and MWE. A paper that has received or will receive reviews through
>> ARR may not be submitted for review to MWE.
>>
>>
>> Important Dates
>>
>> Paper submission: February 20, 2023
>>
>> ARR paper commitment: March 6, 2023
>>
>> Notification of acceptance: March 13, 2023
>>
>> Camera-ready papers due: March 27, 2023
>>
>> Workshop: May 5 or 6, 2023
>>
>>
>> All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
>>
>>
>> Organizing Committee
>>
>> Program chairs: Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
>>
>> Publication chair: Archna Bhatia
>>
>> Publicity chair: Kilian Evang
>>
>>
>> Anti-harassment policy
>>
>> The workshop follows the ACL anti-harassment policy.
>>
>>
>> Contact
>>
>> For any inquiries regarding the workshop, please send an email to the
>> Organizing Committee at [email protected].
>> _______________________________________________
>> Corpora mailing list -- [email protected]
>> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
>> To unsubscribe send an email to [email protected]
>>
>
> --
> Ken Litkowski                     TEL.: 301-482-0237
> CL Research                       EMAIL: [email protected]
> 9208 Gue Road                     Home Page: http://www.clres.com
> Damascus, MD 20872-1025 USA       Blog: http://www.clres.com/blog
>
>