Hi Ken

Thanks for the message. Unfortunately, it looks like there have been no prior discussions on any of the topics I suggested, and the earliest post I can access dates back only to 22 Nov 2020. I can certainly start a discussion, but might that look like the first/only discussion on the list? (I went through all the conversations accessible thus far and saw only announcements.)

Perhaps more importantly: as this seems to be an issue that could also affect other areas of concern to the general audience of the Corpora-List (*not just MWEs/SIGLEX*), is there a way that we can all make some changes in the "language space" across the board?

Thanks and best
Ada

On Wed, Feb 8, 2023 at 5:57 PM Ken Litkowski <[email protected]> wrote:

> Dear Ada,
>
> When I added the SIGLEX discussion code back in 2010, I did so with the
> idea that we would have discussion of topics just like yours. That
> discussion has since morphed into the Google group at
> https://groups.google.com/g/siglex-members. There, you will find a place
> "Search conversations ..." where you can add your topic so that it will be
> sent to all, rather than just the announcements that are mainly the topics.
>
> Ken (webmaster retiree)
> On 2/8/2023 10:18 AM, Ada Wan via Corpora wrote:
>
> Hi Kilian
>
> Hope all has been well.
>
> I'm surprised that people are still "wording around" nowadays. Some
> suggestions:
>
> 1. Can't we rename "MWEs" to "fixed/idiomatic expressions" instead? One
> can reformulate these as sequences/strings/expressions of various
> lengths/vocabs in characters.
> 2. Also, one can interpret these without information/association with any
> syntactic categories, nouns or verbs etc.
> 3. They do just represent lexical info (some reflecting/encoding
> historico-social habits, though one also should be aware of the ethical
> aspects of reinforcing some "traditional values"). Perhaps a more
> sophisticated view of language could help wean practitioners from a
> mindframe that relies on "linguistic structure(s)" as we've had it thus far
> (i.e. based on "words" and "sentences")?
> 4. Re "their meaning often does not result from the direct combination of
> the meanings of their parts": non-compositionality may be a better
> description of a more realistic view of language; it should prob be our
> default expectation (instead of the cherry-picked compositional
> counterparts).
>
> I think efforts towards mitigating a mental dependency on "words" would be
> a good direction to pursue, what do you think?
> Can we get SIGLEX to update in this regard?
>
> Best
> Ada
>
>
> On Wed, Feb 8, 2023 at 11:12 AM Kilian Evang via Corpora <
> [email protected]> wrote:
>
>> [Apologies for cross-postings]
>>
>>
>> ********************************************************************************
>>
>> Call for Papers: Deadline extended
>>
>> 19th Workshop on Multiword Expressions (MWE 2023)
>>
>> Organized and sponsored by SIGLEX, the Special Interest Group
>> on the Lexicon of the ACL
>>
>> Full-day workshop collocated with EACL 2023, Dubrovnik, Croatia, May 5
>> or 6, 2023
>>
>> Hybrid (on-site & on-line)
>>
>> NEW: Submission deadline: February 20, 2023
>>
>> NEW: Invited speakers announced (see below)
>>
>> NEW: Best paper award (see below)
>>
>> MWE 2023 website: https://multiword.org/mwe2023/
>>
>>
>> ********************************************************************************
>>
>> Multiword expressions (MWEs) are word combinations that exhibit
>> lexical, syntactic, semantic, pragmatic, and/or statistical
>> idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog,
>> pay a visit and pull one's leg. The notion encompasses closely related
>> phenomena: idioms, compounds, light-verb constructions, phrasal verbs,
>> rhetorical figures, collocations, institutionalised phrases, etc.
>> Their behaviour is often unpredictable; for example, their meaning
>> often does not result from the direct combination of the meanings of
>> their parts. Given their irregular nature, MWEs often pose complex
>> problems in linguistic modelling (e.g. annotation), NLP tasks (e.g.
>> parsing), and end-user applications (e.g. natural language
>> understanding and MT), hence still representing an open issue for
>> computational linguistics (Constant et al. 2017).
>>
>> For almost two decades, modelling and processing MWEs for NLP has been
>> the topic of the MWE workshop organised by the MWE section of SIGLEX
>> in conjunction with major NLP conferences since 2003. Impressive
>> progress has been made in the field, but our understanding of MWEs
>> still requires much research considering their need and usefulness in
>> NLP applications. This is also relevant to domain-specific NLP
>> pipelines that need to tackle terminologies most often realised as
>> MWEs. Following previous years, for this 19th edition of the workshop,
>> we identified the following topics on which contributions are
>> particularly encouraged:
>>
>> MWE processing and identification in specialized languages and
>> domains: Multiword terminology extraction from domain-specific corpora
>> (Bonin et al. 2010) is of particular importance to various
>> applications, such as MT (Semmar & Laib, 2017), or for the
>> identification and monitoring of neologisms and technical jargon
>> (Chatzitheodorou et al., 2021). We expect approaches that deal with
>> the processing of MWEs as well as the processing of terminology in
>> specialised domains can benefit from each other.
>>
>> MWE processing to enhance end-user applications: MWEs have gained
>> particular attention in end-user applications, including MT (Zaninello
>> & Birch 2020; Han et al. 2021, 2022), simplification (Kochmar et al.
>> 2020), language learning and assessment (Paquot et al. 2019;
>> Christiansen & Arnon 2017), social media mining (Maisto et al. 2017),
>> and abusive language detection (Zampieri et al. 2020; Caselli et al.
>> 2020). We believe that it is crucial to extend and deepen these first
>> attempts to integrate and evaluate MWE technology in these and further
>> end-user applications.
>>
>> MWE identification and interpretation in pre-trained language models:
>> Most current MWE processing is limited to their identification and
>> detection using pre-trained language models, but we still lack
>> understanding about how MWEs are represented and dealt with therein
>> (Nedumpozhimana & Kelleher 2021; Garcia et al. 2021; Fakharian & Cook
>> 2021), and how to better model the compositionality of MWEs from
>> semantics (Moreau et al. 2018). Now that NLP has shifted towards
>> end-to-end neural models like BERT, capable of solving complex tasks
>> with little or no intermediary linguistic symbols, questions arise
>> about the extent to which MWEs should be implicitly or explicitly
>> modelled (Shwartz & Dagan, 2019).
>>
>> MWE processing in low-resource languages: The PARSEME shared tasks
>> (Ramisch et al. 2020; 2018; Savary et al. 2017), among others, have
>> fostered significant progress in MWE identification, providing
>> datasets that include low-resource languages, evaluation measures, and
>> tools that now allow fully integrating MWE identification into
>> end-user applications. A few efforts have recently explored methods
>> for the automatic interpretation of MWEs (Bhatia et al. 2018; 2017),
>> and their processing in low-resource languages (Liu & Wang 2020; Kumar
>> et al. 2017). Resource creation and sharing should be pursued in
>> parallel with the development of methods able to capitalize on small
>> datasets (Han et al. 2020).
>>
>> Through this workshop, we would like to bring together and encourage
>> researchers in various NLP subfields to submit MWE-related research,
>> so that approaches that deal with processing of MWEs including
>> processing for low-resource languages and for various applications can
>> benefit from each other. We also intend to consolidate the converging
>> effects of previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and
>> MWE-LEX 2020, the joint MWE-WOAH panel in 2021, and the MWE-SIGUL 2022
>> joint session, extending our scope to MWEs in e-lexicons and WordNets,
>> MWE annotation, as well as grammatical constructions. Correspondingly,
>> we call for papers on research related (but not limited) to MWEs and
>> constructions in:
>>
>> Computationally-applicable theoretical work in psycholinguistics and
>> corpus linguistics;
>>
>> Annotation (expert, crowdsourcing, automatic) and representation in
>> resources such as corpora, treebanks, e-lexicons, and WordNets (also
>> for low-resource languages);
>>
>> Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
>> LFG, TAG, UD, etc.);
>>
>> Discovery and identification methods, including for specialized
>> languages and domains such as clinical or biomedical NLP;
>>
>> Interpretation of MWEs and understanding of text containing them;
>>
>> Language acquisition, language learning, and non-standard language
>> (e.g. tweets, speech);
>>
>> Evaluation of annotation and processing techniques;
>>
>> Retrospective comparative analyses from the PARSEME shared tasks;
>>
>> Processing for end-user applications (e.g. MT, NLU, summarisation,
>> language learning, etc.);
>>
>> Implicit and explicit representation in pre-trained language models
>> and end-user applications;
>>
>> Evaluation and probing of pre-trained language models;
>>
>> Resources and tools (e.g. lexicons, identifiers) and their integration
>> into end-user applications;
>>
>> Multiword terminology extraction;
>>
>> Adaptation and transfer of annotations and related resources to new
>> languages and domains including low-resource ones.
>>
>>
>> Shared Task
>>
>> We do not have a shared task this year, but a new release of the
>> PARSEME corpus of verbal MWEs is currently underway. We encourage
>> submission of research papers that include analyses of the new edition
>> of the PARSEME data and improvements over the results for PARSEME 2020
>> shared task as well as SemEval 2022 task 2 on idiomaticity prediction.
>>
>>
>> *** Special Track on MWEs in Clinical NLP ***
>>
>> Pursuing the MWE Section’s tradition of synergies with other
>> communities, this year, we are organizing a joint session with the
>> Clinical NLP workshop for shared papers/poster presentations. Since
>> clinical texts contain an important amount of multiword expressions
>> (e.g. medical terms or domain-specific collocations), a joint session
>> is deemed beneficial for both communities. The goal is to foster
>> future synergies that could address scientific challenges in the
>> creation of resources, models and applications to deal with multiword
>> expressions and related phenomena in the specialised domain of
>> ClinicalNLP. Submissions describing research on MWEs in the
>> specialized domain of ClinicalNLP, especially introducing new datasets
>> or new tools and resources, are welcome. Papers accepted in this track
>> will have the option to present their work in the Clinical NLP
>> workshop at ACL 2023 as well, after being presented at MWE 2023.
>>
>>
>> Invited Speakers
>>
>> We are looking forward to invited talks by two amazing speakers:
>>
>> Leo Wanner, Universitat Pompeu Fabra
>>
>> TBD
>>
>>
>> Best paper award
>>
>> All full papers in the workshop will be considered by the program
>> committee for a best paper award. The decision will be announced in
>> the closing session.
>>
>>
>> Submission formats
>>
>> The workshop invites two types of submissions:
>>
>> archival submissions that present substantially original research in
>> both long paper format (8 pages + references) and short paper format
>> (4 pages + references).
>>
>> non-archival submissions of abstracts describing relevant research
>> presented/published elsewhere which will not be included in the MWE
>> proceedings.
>>
>>
>> Paper submission and templates
>>
>> Papers should be submitted via the workshop's START submission page
>> (https://softconf.com/eacl2023/mwe2023/). Please choose the
>> appropriate submission format (archival/non-archival). Archival papers
>> with existing reviews will also be accepted through the ACL Rolling
>> Review. Submissions must follow the ACL 2023 stylesheet.
>>
>>
>> Archival papers with existing reviews from ACL Rolling Review will
>> also be considered. A paper may not be simultaneously under review
>> through ARR and MWE. A paper that has or will receive reviews through
>> ARR may not be submitted for review to MWE.
>>
>>
>> Important Dates
>>
>> Paper submission: February 20, 2023
>>
>> ARR paper commitment: March 6, 2023
>>
>> Notification of acceptance: March 13, 2023
>>
>> Camera-ready papers due: March 27, 2023
>>
>> Workshop: May 5 or 6, 2023
>>
>>
>> All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
>>
>>
>> Organizing Committee
>>
>> Program chairs: Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
>>
>> Publication chair: Archna Bhatia
>>
>> Publicity chair: Kilian Evang
>>
>>
>> Anti-harassment policy
>>
>> The workshop follows the ACL anti-harassment policy.
>>
>>
>> Contact
>>
>> For any inquiries regarding the workshop, please send an email to the
>> Organizing Committee at [email protected].
>> _______________________________________________
>> Corpora mailing list -- [email protected]
>> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
>> To unsubscribe send an email to [email protected]
>>
>
> _______________________________________________
> Corpora mailing list -- [email protected]
> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
> To unsubscribe send an email to [email protected]
>
> --
> Ken Litkowski                     TEL.: 301-482-0237
> CL Research                       EMAIL: [email protected]
> 9208 Gue Road                     Home Page: http://www.clres.com
> Damascus, MD 20872-1025 USA       Blog: http://www.clres.com/blog
>
