
CLASSLA-web 2.0 Web Corpora for South Slavic languages
by Taja Kuzman Pungeršek 02 Mar '26


Dear colleagues,

We are happy to announce that we have released the second version of the South Slavic CLASSLA-web corpora.

The corpus collection contains approximately 38 million texts and 17 billion words, collected from the web in 2024, and covers the full South Slavic language group: Bosnian, Bulgarian, Croatian, Macedonian, Montenegrin, Serbian, and Slovenian. Compared to CLASSLA-web 1.0, the new web corpora are significantly expanded and largely consist of new texts. The corpora are linguistically annotated, automatically classified by genre and enriched with topic labels.

The web corpus collection is intended for a wide range of uses, including corpus linguistics, lexicography, and other linguistic research, as well as for natural language processing tasks such as training and evaluating language models, and creating genre- or topic-specific datasets.

A detailed description of the resource can be found in the accompanying paper (https://doi.org/10.48550/arXiv.2601.11170).

Further information on both CLASSLA-web 1.0 and 2.0 versions, including details on corpus construction, additional resources, a video describing the workflow, and citation guidelines, is available on the CLASSLA-web website: https://clarinsi.github.io/classla-web/

If you are interested in language resources and technologies for South Slavic languages, we invite you to browse the CLASSLA-web corpora via the CLARIN.SI concordancers (https://www.clarin.si/ske/#open) or download them under a CC0 license from the CLARIN.SI repository: http://hdl.handle.net/11356/2079

Best wishes,
CLASSLA-web authors: Taja Kuzman Pungeršek, Peter Rupnik, Vít Suchomel and Nikola Ljubešić, supported by CLARIN.SI <https://www.clarin.si/info/about/> and CLASSLA <https://www.clarin.si/info/k-centre/>


Data in Historical Linguistics Seminar Series – Seminar 4
by Andrea Farina 02 Mar '26


The fourth talk of the Data in Historical Linguistics Seminar Series will take place remotely on Monday 9 March 2026 at 5pm GMT. Ilia Afanasev (University of Vienna, Austria) will be presenting on "A study of the diatopic variation in Old East Slavic charters by means of a small unannotated corpus-based language comparison".

Registration for this talk will close at midnight on Friday 6 March and the link for this can be accessed here: https://forms.gle/WXumpbms2BFMkFtu9

Participants will receive a Microsoft Teams link via email on the morning of the talk. The abstract for this talk can be found at this page <https://datainhistoricallinguistics.wordpress.com/2025/12/19/monday-9-march…>.

The programme and registration links for all talks in the series can be found on our website: https://datainhistoricallinguistics.wordpress.com/2026-programme/

This seminar series is run by Andrea Farina (King’s College London) and Dr Mathilde Bru and is aimed at PhD students and early career researchers. The purpose of this seminar series is to bring together researchers working on historical linguistics with a quantitative approach, and to discuss current avenues of research in this topic. We hope that these seminars will nurture international collaboration and establish academic ties among researchers working on similar topics in this field.

Join our mailing list <https://datainhistoricallinguistics.wordpress.com/join-us/>!


Call for participation: ‘Statistics for linguistics with R’ bootcamp (06 - 10/07/2026)
by Magali Paquot 02 Mar '26


The Linguistics Research Unit of the Institute of Language and Communication (Université catholique de Louvain, Belgium) will be hosting Stefan Gries’s next bootcamp on statistics for linguistics with R from 06 to 10 July 2026.

The ‘Statistics for linguistics with R’ bootcamp is a hands-on introduction to statistical methods for both graduate students and seasoned researchers and is loosely based on the third edition (2021) of Gries’s textbook Statistics for linguistics with R. The course is intended for linguists who already have a basic knowledge of statistics and some experience using R and who wish to improve their proficiency in statistical modeling of linguistic data.

Using the open source software and programming language R, we will deal with:

• fundamental aspects of fixed-effects regression modeling for both numeric and binary response variables; these include exploration of data and their preparation for modeling, model formulation and selection, and numerical and visual interpretation and evaluation of models;
• more advanced aspects of fixed-effects regression modeling such as contrasts for ordinal predictors, orthogonal contrasts, curvature of numeric predictors, and maybe general linear hypothesis tests;
• the theoretical foundations of mixed-effects regression modeling;
• applications of mixed-effects modeling for both numeric and binary response variables;
• tree-based methods and random forests: 'fitting' and interpreting them with importance scores, partial dependence scores, and detecting (not just capturing) interactions.

Online registration will start on 2 March 2026, 1 pm CET (today!). The number of participants is limited.

https://www.uclouvain.be/en/research-institutes/ilc/cecl/rling2026

Contact email: magali.paquot(a)uclouvain.be

Magali Paquot
Convenor


GRACE@IberLEF2026: Clinical Argument Mining shared task in Spanish connecting Explainable AI and Evidence-Based Medicine
by aitziber.atucha＠ehu.eus 02 Mar '26


Registration open!!

GRACE@IberLEF2026: https://www.codabench.org/competitions/13280/

We apologize for multiple postings of this e-mail.

GRACE@IberLEF2026 announces the first edition of a novel Argument Mining shared task in Spanish connecting Explainable AI and Evidence-Based Medicine across clinical trials and medical licensing examinations.

⚗️ Argument Mining
Argument Mining automatically extracts claims and evidence from clinical text and reveals how they support or challenge each other, enabling transparent, traceable clinical reasoning.

🌍 Spanish, First
GRACE is the first Argument Mining task in Spanish for the clinical domain, filling a key gap in multilingual biomedical NLP with fine-grained, entity-level annotations.

Track 01 🔬 Clinical Trial Evidence & Argumentation
This track focuses on abstracts of Randomized Controlled Trials (RCTs). Their standardized design, contrasting an intervention with a control group, provides a transparent path from data to conclusions, making argumentative components more accessible to automated systems.
Goal: Identify argumentative components (claims and premises) and detect support/attack relations at the sentence level.

Track 02 🩺 Clinical Case Reasoning (MIR)
This track uses cases from the MIR (Médico Interno Residente) exam, Spain's national medical specialization test. Each instance pairs a dense clinical narrative with five competing diagnostic or treatment options, only one of which is correct.
Goal: Extract fine-grained evidence spans that justify the correct option while refuting the incorrect alternatives.

📅 Important Dates
📂 Release of Training & Dev Sets: March 18
🚀 Official Test Set Release: April 22
⏰ Deadline for Result Submission: May 3
📊 Publication of Results: May 8
📄 System Paper Submission: May 24
✅ Notification of Acceptance: June 17
🎤 IberLEF Workshop (at SEPLN): September 22
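To make the Track 01 goal more concrete, here is a minimal Python sketch of how claims, premises and support/attack relations could be represented at the sentence level. The class names, field names and example sentences are all assumptions made for illustration; they are not the GRACE annotation schema or data format, which the announcement does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """An argumentative component at sentence level (hypothetical schema)."""
    comp_id: str
    kind: str       # "claim" or "premise"
    text: str


@dataclass(frozen=True)
class Relation:
    """A directed relation from a source component to a target component."""
    source: str
    target: str
    label: str      # "support" or "attack"


def related(target_id: str, relations: List[Relation], label: str) -> List[str]:
    """Return the ids of components linked to target_id with the given label."""
    return [r.source for r in relations if r.target == target_id and r.label == label]


# Invented placeholder sentences, not taken from the GRACE data.
components = [
    Component("c1", "claim", "Drug A is more effective than placebo."),
    Component("p1", "premise", "Symptoms improved in 60% of the Drug A group versus 30% of controls."),
    Component("p2", "premise", "The difference disappeared at the 12-month follow-up."),
]
relations = [
    Relation("p1", "c1", "support"),
    Relation("p2", "c1", "attack"),
]

print(related("c1", relations, "support"))
print(related("c1", relations, "attack"))
```

Keeping relations as separate directed records, rather than nesting them inside components, is one possible design choice: it lets a single premise support one claim and attack another.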


HPSG 2026 -- Second CfP
by Antonio Machicao y Priemer 28 Feb '26


SECOND CfP: The 33rd International Conference on Head-Driven Phrase Structure Grammar (Norway)

Short Title: HPSG 2026
Date: 03-Aug-2026 - 04-Aug-2026
Location: Western Norway University of Applied Sciences (Bergen, Norway)
Contact: Petter Haugereid, Berthold Crysmann & Antonio Machicao y Priemer
Email: hpsg2026(a)easychair.org
Conference Website: https://petterha.github.io/hpsg2026/
Conference fee: 68€ (faculty) / 43€ (student)
Linguistic Field(s): General Linguistics; Linguistic Theories; Computational Linguistics; Syntax; Morphology; Semantics; Cognitive Science

Meeting Description:
The 33rd International Conference on Head-Driven Phrase Structure Grammar will be held on 03-04 August 2026 at the Western Norway University of Applied Sciences (Bergen, Norway). The HPSG 2026 conference will be a two-day main conference (03-04 August). It will be co-located with the DELPH-IN meeting held over the preceding week (27-31 July).

Anonymous abstracts are invited that address linguistic, foundational, or computational issues relating to or in the spirit of the framework of Head-Driven Phrase Structure Grammar. Submissions should be 4 pages long, plus 1 page for data, figures and references. They should be submitted in PDF format. The submissions should not include the authors’ names, and authors are asked to avoid self-references. Presentations are in-person by default, although exceptions can be negotiated.

All abstracts should be submitted by 10 April 2026, via Easychair: https://easychair.org/conferences/?conf=hpsg2026

Deadline for abstracts: 10 April 2026
Reviews due: 10 May 2026
Notification of acceptance: 15 May 2026
Conference: 03-04 August 2026
Conference proceedings submission: 15 October 2026

Keynote speakers:
* Dag Trygve Truslew Haug (Universitetet i Oslo, Norway)
* Nurit Melnik (Open University, Israel)

All abstracts will be reviewed anonymously by at least two reviewers. Each accepted abstract will be given 30 minutes for presentation. Additionally, 10 minutes will be reserved for discussion.

A call for contributions to the proceedings will be issued after the conference. The proceedings will undergo a separate (final) round of reviews (accept/reject), to enable indexing of the proceedings. The proceedings of previous conferences are available at: https://proceedings.hpsg.xyz/

Programme Committee:
- Anne Abeillé (LLF, Université de Paris)
- Gabrielle Aguila-Multner (Universität Zürich)
- Emily M. Bender (University of Washington)
- Gabriela Bîlbîie (University of Bucharest)
- Felix Bildhauer (Institut für Deutsche Sprache Mannheim)
- Olivier Bonami (Universite Paris Diderot)
- Francis Bond (Palacký University)
- Rui Chaves (University at Buffalo, SUNY)
- Berthold Crysmann (CNRS - LLF, Université de Paris)
- Petter Haugereid (Western Norway University of Applied Sciences)
- Fabiola Henri (University at Buffalo)
- Anke Holler (University of Göttingen)
- Jong-Bok Kim (Kyung Hee University)
- Jean-Pierre Koenig (University at Buffalo, The State University of New York)
- Andy Lücking (Goethe University Frankfurt)
- Antonio Machicao y Priemer (Humboldt-Universität zu Berlin)
- Jakob Maché (Universidade de Lisboa)
- Nurit Melnik (The Open University of Israel)
- Luis Morgado Da Costa (Palacký University Olomouc)
- Stefan Müller (Humboldt-Universität zu Berlin)
- Tsuneko Nakazawa (The University of Tokyo)
- Joanna Nykiel (UC Davis)
- David Oshima (Nagoya University)
- Gerald Penn (University of Toronto)
- Frank Richter (Goethe Universität Frankfurt)
- Manfred Sailer (Goethe Universität Frankfurt)
- Frank Van Eynde (Katholieke Universiteit Leuven)
- Giuseppe Varaschin (Humboldt-Universität zu Berlin)
- Elodie Winckel (Friedrich-Alexander Universität Erlangen-Nürnberg)
- Shûichi Yatabe (The University of Tokyo)
- Eun-Jung Yoo (Seoul National University)
- Olga Zamaraeva (Universidade da Coruña)

--
Dr. Antonio Machicao y Priemer
Department of German Studies and Linguistics - Humboldt-Universität zu Berlin
Homepage: https://hu.berlin/aMyP
Project: Building register into the architecture of language – an HPSG account (CRC 1412, Project A04)
Series: Textbooks in Language Science (https://langsci-press.org/catalog/series/tbls)


2nd Call for Abstracts -- 2026 NARNiHS Research Incubator -- North American Research Network in Historical Sociolinguistics
by Lauersdorf, Mark R. 28 Feb '26


2nd CALL FOR ABSTRACTS [only three weeks left to submit]

*** 2026 NARNiHS Research Incubator ***
North American Research Network in Historical Sociolinguistics
*** 8th edition ***
07-09 May 2026 -- entirely online!

==> Abstract Submission Deadline ==> 23 March 2026, 11:59 PM (U.S. Eastern Time)

The 2026 NARNiHS Research Incubator is an entirely online event (**with free registration**). This event offers an opportunity for scholars in historical sociolinguistics from all over the world to participate in discussions of cutting-edge research without the limitations imposed by international travel. We encourage our fellow historical sociolinguists and scholars from related fields in our global scholarly community to join us online for our Research Incubator this spring.

Abstract submission deadline: 23 March 2026, 11:59 PM (U.S. Eastern Time)
Abstract submission online: https://easyabs.linguistlist.org/submit/2026_Incubator/

The North American Research Network in Historical Sociolinguistics (NARNiHS) is accepting abstracts for its 2026 NARNiHS Research Incubator. The 8th edition of this inclusive NARNiHS event seeks to provide a collaborative environment where presenters bring work that is in-progress, exploratory, proof-of-concept, prototyping. The Incubator's audience actively participates in workshopping these new ideas, brainstorming along with the presenters to forge scholarly paths and develop research solutions. We see the NARNiHS Research Incubator as a place for testing and pushing boundaries; developing new theories, methods, models, and tools in historical sociolinguistics; seeking feedback from peers; and engaging in productive assessment of fledgling ideas and nascent projects.

NARNiHS welcomes papers in all areas of historical sociolinguistics, which is understood as the application/development of sociolinguistic theories, methods, and models for the study of historical language variation and change over time, or more broadly, the study of the interaction of language and society in historical periods and from historical perspectives. Thus, a wide range of linguistic areas, subdisciplines, and methodologies easily find their place within the field, and we encourage submission of abstracts that reflect this broad scope.

Successful abstracts for this research incubator environment will demonstrate thorough grounding in historical sociolinguistics, scientific rigor in the formulation of research questions, and promise for rich discussion of ideas. Abstracts should be explicit about which theoretical frameworks, methodological protocols, and analytical strategies are being applied or critiqued. Data sources and examples should be sufficiently (if briefly) presented, so as to allow reviewers a full understanding of the scope and claims of the research. Please note that the connection of your research to the field of historical sociolinguistics should be explicitly outlined in your abstract. Abstracts should not exceed one page (not including examples and references, see below). Failure to adhere to these criteria will likely result in rejection.

We are soliciting abstracts for 25-minute presentations. Presenters will have the entire 25 minutes for their presentations, with discussion happening in the "incubation session" at the end of each panel. Presentations will be grouped into thematic panels of three presentations, each panel followed by an hour-long discussion with the audience led by specialists, creating a brainstorming/workshopping environment that encourages maximum exchange of ideas. Discussion will encompass specific feedback on the individual papers as well as consideration of overarching questions of theory, methods, and models emerging from the papers. To facilitate such discussion, authors will be required to submit a draft of their presentation materials for distribution to the panel discussants and to the other presenters a few days prior to the start of the conference.

Abstracts will be accepted until Monday, 23 March 2026 -- late abstracts will not be considered.

*** Abstract Content Requirements:

1) Abstracts should be explicit about which theoretical frameworks, methodological protocols, and analytical strategies are being applied or critiqued.
2) Data sources and examples should be sufficiently (if briefly) presented, so as to allow reviewers a full understanding of the scope and claims of the research.
3) The connection of your research to the field of historical sociolinguistics should be explicitly outlined.

*** Abstract Format Guidelines:

1) Abstracts must be submitted in PDF format.
2) Abstracts must fit on one standard 8.5×11 inch or A4 page, with margins no smaller than 1 inch / 2.5 cm and a font style and size no smaller than Times New Roman 12-point. All additional content (visualizations, trees, tables, figures, captions, examples, and references) must fit on a single (1) additional page. No exceptions to these requirements are allowed; abstracts exceeding these limits will be rejected without review.
3) Anonymize your abstract. We realize that sometimes complete anonymity is not attainable, but there is a difference between the nature of the research creating an inability to anonymize and careless non-anonymizing (in citations, references, file names, etc.). Be sure to anonymize your PDF file (you may do so in Adobe Acrobat Reader by clicking on "File", then "Properties", removing your name if it appears in the "Author" line of the "Description" tab, and re-saving the file before submission). Do not use your name when saving your PDF (e.g. Smith_Abstract.pdf); file names will not be automatically anonymized by the EasyAbs system. Rather, use non-identifying information in your file name (e.g. HistSoc4Lyfe.pdf). Your name should only appear in the online form accompanying your abstract submission. Papers that are not sufficiently anonymized wherever possible will be rejected without review.

*** General Conference Requirements:

1) Abstracts must be submitted electronically, using the following link: https://easyabs.linguistlist.org/submit/2026_Incubator/
2) Papers must be delivered as projected in the abstract or represent bona fide developments of the same research.
3) Authors are expected to virtually attend the conference and present their own papers.
4) Presentations will be delivered via Zoom. Technical details and instructions regarding the platform will be sent to authors in due time.

Please contact us at NARNiHistSoc(a)gmail.com with any questions.


CfP: RiCL Journal special issue - Learner Corpus Research meets the Common European Framework of Reference for Languages and the Companion Volume
by María Belén Díez Bedmar 28 Feb '26


*Call for papers: special issue of Research in Corpus Linguistics (RiCL)*

*‘Learner Corpus Research meets the Common European Framework of Reference for Languages and the Companion Volume’*

Due to the importance of the *Common European Framework of Reference for Languages* (CEFR) and the *Companion Volume* (CV) (Council of Europe, 2001, 2020) for the learning, teaching and assessment of languages, most learner corpora nowadays employ the learner’s CEFR level to specify the student’s communicative language competence level or proficiency level. Most learner corpora compiled in CEFR-aligned high-stakes foreign language accreditation/certification exams (for instance, the *Cambridge Learner Corpus* (O’Keeffe & Mark, 2017) or the *FineDesc Learner Corpus* (Díez-Bedmar, 2025)) are composed of the production of learners of English in successful certification/accreditation exams at different CEFR levels. Learner corpora for other target languages are compiled in similar foreign language accreditation/certification exam conditions, such as the CELI corpus for Italian (Spina et al., 2023) or the Merlin corpus for Italian, German and Czech (Wisniewski, 2020). Other learner corpora composed of accreditation/certification exams that are not aligned to the CEFR, or learner corpora compiled in other contexts, have also been partially or fully aligned to the CEFR levels, as reported by Gablasova et al. (2019) regarding the *Trinity Lancaster Corpus*, Thewissen (2013) concerning *ICLE* or Tono (2018) as to the *JEFLL corpus*.

The use of LCR results from CEFR-aligned learner corpora to inform or facilitate the implementation of the CEFR/CV is, however, still limited. Among the most important LCR contributions in this respect are those by the *English Profile Project* (Salamoura, 2008) and the CEFR-J Project (Tono, 2019). The former used the *Cambridge Learner Corpus* to provide linguistic information on the language produced at each CEFR level (UCLES/CUP, 2011) and freely available online resources (e.g., the English Grammar Profile and the English Vocabulary Profile). The latter employed learner corpora, among other corpora types, to adapt the CEFR for the L1 Japanese EFL context (12 sublevels) and to create the Vocabulary Profile and the Grammar Profile.

Despite these efforts, CEFR/CV end-users and stakeholders find difficulties in the implementation of the CEFR/CV in their L1 contexts, on most occasions due to the language-neutral nature of the CEFR/CV descriptors (see Díez-Bedmar & Byram, 2019; Díez-Bedmar & Luque-Agulló, 2023; Luque-Agulló & Díez-Bedmar, 2025). It is for this reason that they demand fine-tuned descriptors, i.e., CEFR/CV descriptors informed by CEFR-aligned L1-specific learner corpus results (Díez-Bedmar, 2018).

In the Spanish context, the *FineDesc Project* (Grant PID2020-117041GA-I00, funded by MICIU/AEI/10.13039/501100011033) has provided L1 Spanish CEFR/CV end-users and stakeholders with fine-tuned descriptors thanks to the analysis of the 1.3-million-word *FineDesc Learner Corpus* (Díez-Bedmar, 2025), a freely available L1-specific learner corpus by L1 Spanish monolingual students (or bilingual L1 Spanish/a co-official language in Spain). The learner corpus results have informed fine-tuned descriptors not only for linguistic competence, but also for sociolinguistic and pragmatic competences, when students engage in different communicative language activities at B1, B2 and C1 levels (see Díez-Bedmar et al., 2026). These fine-tuned descriptors aim at paving the way for the implementation of the CEFR/CV in the L1 Spanish context.

These are just some examples which show how LCR may inform the CEFR/CV and facilitate its implementation in different contexts. Other efforts are being made by the LCR community by using either general CEFR-aligned learner corpora or L1-specific ones, as shown in some papers presented at the International Online Conference ‘Bringing together research on the CEFR/CV and LCR: a focus on descriptors’, which was organized by the FineDesc Project (https://web.ujaen.es/investiga/finedesc/index.php).

It is the objective of this special issue to bring together research on the different ways in which LCR may meet the CEFR/CV. Contributions which employ any reliably CEFR-aligned learner corpus with this objective in mind are welcome, whether they were presented at the conference or not. Researchers who do not have access to any CEFR-aligned learner corpus are encouraged to use the *FineDesc Learner Corpus*, freely available at www.finedesc.com.

*Potential topics for this special issue are (but are not limited to):*

- The fine-tuning of CEFR/CV descriptors for L1-specific contexts thanks to LCR.
- The integration of CAF results in fine-tuned descriptors.
- The (cross-sectional) analysis of CEFR-aligned learner corpora considering the linguistic, sociolinguistic or pragmatic competences to inform CEFR-aligned pedagogical resources/language assessment.
- The exploitation of CEFR-aligned learner corpora to design tools/software which may help analyse learner corpora at different CEFR levels.
- The overcoming of any difficulties in CEFR/CV implementation with the help of LCR.

*Important dates*

Deadline for proposals: April 30, 2026
Outcome of proposal review: May 21, 2026
Deadline for manuscript first drafts: December 1, 2026
Notification of reviewer outcome: March 5, 2027
Deadline for manuscript final drafts: May 28, 2027
Special issue publication: Autumn 2027

*Proposal format and submission*

Potential contributors should prepare an extended 800-word abstract of the proposed paper following RiCL’s submission guidelines, which can be found at https://ricl.aelinco.es/index.php/ricl/about/submissions

The abstract should include a tentative title, motivate the study, state the research questions, provide methodological information (learner corpus or learner corpora analysed and the procedure employed to align it/them to CEFR/CV levels as well as the statistical tests employed), tentative results and clear information on how the results inform the CEFR/CV and its implementation.

Please submit your proposal to the special issue editor, María Belén Díez-Bedmar (belendb(a)ujaen.es), before the deadline (April 30, 2026), including in your proposal your name(s), email(s) and affiliation(s).

*Peer review*

All accepted manuscripts will undergo double-blind peer review.

*References*

Council of Europe (2001). Common European Framework of Reference for Languages: learning, teaching, assessment. Council of Europe Publishing.

Council of Europe (2020). Common European Framework of Reference for Languages: learning, teaching, assessment – Companion Volume. Council of Europe Publishing.

Díez-Bedmar, M. B. (2018). Fine-tuning descriptors for CEFR B1 level: insights from learner corpora. *ELT Journal*, *72*(2), 199-209. https://doi.org/10.1093/elt/ccx052

Díez-Bedmar, M. B. (2025). FineDesc Learner Corpus 2.0 (España, 2510243469857). SafeCreative. https://www.safecreative.org/validity

Díez-Bedmar, M. B., & Byram, M. (2019). The current influence of the CEFR in secondary education: teachers’ perceptions. *Language, Culture and Curriculum*, *32*(1), 1–15. https://doi.org/10.1080/07908318.2018.1493492

Díez-Bedmar, M. B., Laso-Martín, N. J., Maíz-Arévalo, C., & Carrió-Pastor, M. L. (2026). Supplement to the Common European Framework of Reference for Languages and the Companion Volume: L1 Spanish users (Spain, 2601214327051). SafeCreative. https://www.safecreative.org/validity

Díez-Bedmar, M. B., & Luque-Agulló, G. (2023). Analysing the CEFR/CV in University Language Centres in Spain: The Raters' Perspective. In M. Fernández Álvarez & A. L. Gordenstein Montes (Eds.), *Global Perspectives on Effective Assessment in English Language Teaching* (pp. 1-33). IGI Global.

Gablasova, D., Brezina, V., & McEnery, T. (2019). The Trinity Lancaster Corpus. Development, description and application. *International Journal of Learner Corpus Research*, *5*(2), 126-158. https://doi.org/10.1075/ijlcr.19001.gab

Luque-Agulló, G., & Díez-Bedmar, M. B. (2025). Listening to the teachers: CEFR implementation in University language Centres in Spain. *Revista de Lingüística y Lenguas Aplicadas* (RLyLA), *20*, 72-86. https://doi.org/10.4995/rlyla.2025.21285

O’Keeffe, A., & Mark, G. (2017). The English Grammar Profile of learner competence: Methodology and key findings. *International Journal of Corpus Linguistics*, *22*(4), 457-489. https://doi.org/10.1075/ijcl.14086.oke

Salamoura, A. (2008). Aligning English Profile research data to the CEFR. Cambridge ESOL: *Research Notes*, *33*, 5–7.

Spina, S., Fioravanti, I., Forti, L., & Zanda, F. (2023). The CELI corpus: Design and linguistic annotation of a new online learner corpus. *Second Language Research*, *40*(2), 457-477. https://doi.org/10.1177/02676583231176370

Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: Insights from an error-tagged EFL learner corpus. *The Modern Language Journal*, *97*(S1), 77-101.

Tono, Y. (2018). Corpus approaches to L2 Learner Profiling Research. In Y. N. Leung, J. Katchen, S. Y. Hwang & Y. Chen (Eds.), *Reconceptualizing English language teaching and learning in the 21st century: A special monograph in memory of Professor Kai-Chong Cheung* (pp. 390-402). Taipei, Taiwan: Crane Publishing Company.

Tono, Y. (2019). Coming full circle – from CEFR to CEFR-J and back. *CEFR Journal. Research and Practice*, *1*, 5-17. https://doi.org/10.37546/JALTSIG.CEFR1-1

UCLES/CUP (2011). *English Profile. Introducing the CEFR for English Version 1.1*. Cambridge University Press.

Wisniewski, K. (2020). SLA developmental stages in the CEFR-related learner corpus MERLIN: Inversion and verb-end structures in German A2 and B1 learner texts. *International Journal of Learner Corpus Research*, *6*(1), 1–17. https://doi.org/10.1075/ijlcr.18008.wis


IberLef 2026 - Shared Task: Multilingual Easy to Read Translation - MER-TRANS-2026 - Registration Open
by Horacio Saggion 28 Feb '26


MER-TRANS-2026: https://lastus-taln-upf.github.io/mertrans-iberlef-2026/
IBERLEF 2026: https://sites.google.com/view/iberlef-2026

Apologies for cross-posting.

We are pleased to announce the launch of the Shared Task: Multilingual Easy to Read Translation (MER-TRANS) in the context of the IberLEF 2026 Evaluation Forum.

- Context

Linguistic access to information is increasingly recognised as a fundamental citizens’ right. For example, the United Nations Convention on the Rights of Persons with Disabilities (UNCRPD) includes accessibility as one key enabler for a more inclusive society, while the European Union adopted the European Accessibility Act (EAA) in 2019, which requires that many everyday products and services, including digital information services, comply with accessibility standards. The UNCRPD and the EAA stress that information has to be provided in accessible language, such as plain language or easy-to-read language, in order to allow the different population segments with language comprehension difficulties to exercise their recognized right to participate in society and public life.

Providing information in easy language in a real-world setting is challenging. High-quality, easy-to-read content must typically be written or translated by human experts following specialized guidelines, and then often validated by target readers. This manual process requires considerable effort, time, and expertise, which limits how much content can be produced or translated into easy formats. Therefore, this MER-TRANS shared task aims at advancing the state of the art in the field of automatic easy-to-read translation with specific emphasis on Romance languages, more concretely Catalan, Italian, and Spanish.

- Task Overview

In this shared task, we invite participating teams to automatically produce easy-to-read versions of texts and/or sentences. The texts will be sampled from the iDEM corpus, a multilingual corpus in the domain of democratic participation, which has been simplified by human experts following easy-to-read recommendations and high-quality validation procedures. There will also be a surprise language task to be revealed closer to the release of the test data. Up to three submissions per language will be allowed per participant team.

- Data and Evaluation

Unlike previous challenges where the data remained of restricted use, the iDEM corpus, with original and adapted versions, will be made fully available to the community: the documents for participants to produce adaptations will be provided during the test phase, and the full dataset during the IberLEF workshop in September 2026.

A small trial dataset will be released; it will be sampled from the iDEM corpus, considering the occurrence of different difficulties addressed by the easy-to-read adaptations. The examples will contain both the original text excerpts and easy-to-read versions. The test data will consist of only the original complex text excerpts.

Evaluation will be carried out with automatic metrics borrowed from current text simplification evaluation research, such as SARI and/or BLEU, and semantic similarity scores when appropriate.

- System Description Papers and Proceedings

All participating teams will be invited to submit a system description paper describing their methods, models, and experimental findings; further information about formatting and length will be given in due time. Submitted papers will be reviewed by at least two peer reviewers. Papers will be required to describe fully reproducible solutions which contribute to Open Science.

Accepted system description papers will be published at no cost in the Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2026), hosted by CEUR Workshop Proceedings (CEUR-WS.org). Authors of accepted papers will be invited to present their work at the IberLEF 2026 workshop. Presentation at the workshop is encouraged but not mandatory for publication in the proceedings.

- Who Should Participate?

Participation is open to all.

- Important Dates

Registration opens: Open now
Registration closes: 5 March 2026
Trial data release: 06 Mar 2026
Test data release: 06 Apr 2026
System outputs submission deadline: 13 Apr 2026
Evaluation results published: 27 Apr 2026
System Description Papers: 01 Jun 2026
Papers acceptance: 14 Jun 2026
Camera-ready papers due: 21 Jun 2026
IberLEF 2026 Workshop: 22 Sept 2026

- How to Participate

In order to participate, teams must register using the registration form available at the task website: https://lastus-taln-upf.github.io/mertrans-iberlef-2026/
Note that registration is mandatory in order to access the data and submit system outputs.

- Task Organizers

Horacio Saggion (Universitat Pompeu Fabra, Spain)
Nelson Pérez Rojas (Universidad de Costa Rica, Costa Rica)
Stefan Bott (Universitat Pompeu Fabra, Spain)
Nouran Khallaf (University of Leeds, UK)
Mehrzad Tareh (Universitat Pompeu Fabra, Spain)
Daniel Adanza (Universitat Pompeu Fabra, Spain)
Almudena Rascon (Plena Inclusion Madrid, Spain)
Sandra Szasz (Universitat Pompeu Fabra, Spain)

--
Horacio Saggion
Full Professor / Chair in Computer Science and Artificial Intelligence
Head of the Natural Language Processing Group - TALN
Project Coordinator iDEM Project (HE)
Co-PI of the AI-BOOST project (HE)
PI of the IDEAL project (HE)
Universitat Pompeu Fabra
https://twitter.com/h_saggion
https://www.linkedin.com/in/horacio-saggion-1749b916
Computer Science and Artificial Intelligence Head of the Natural Language Processing Group - TALN Project Coordinator iDEM Project (HE) Co-PI of the AI-BOOST project (HE) PI of the IDEAL project (HE) Universitat Pompeu Fabra https://twitter.com/h_saggion https://www.linkedin.com/in/horacio-saggion-1749b916

CfP - Journal of Procesamiento del Lenguaje Natural and SEPLN 2026
by Martínez Cámara 28 Feb '26

Call for papers for issue 77 of the journal Procesamiento del Lenguaje Natural and 42nd SEPLN Conference

http://www.sepln.org/en/journal
http://www.sepln.org/en/journal/author-guidelines
SEPLN 2026 Conference website: https://sepln2026.org

Introduction

The journal Procesamiento del Lenguaje Natural aims to provide an international forum for the dissemination of high-quality scientific and technical contributions in the field of Natural Language Processing (NLP). It welcomes original, unpublished work that has not been submitted simultaneously to other journals or conference proceedings. The journal promotes the advancement of research in NLP, fosters the exchange of innovative ideas, supports the identification of emerging research directions, and showcases real-world applications and technological developments in this rapidly evolving discipline.

Each year, the Spanish Society for Natural Language Processing (SEPLN) publishes two issues of the journal, including original research articles, project descriptions, book reviews, and summaries of doctoral theses.

The scientific quality of the journal is supported by its indexing in major international databases, including the 2024 JCR index (JIF: 1.3, JCI: 0.48, Q2-Linguistics – Q4-Computer Sciences, Artificial Intelligence ESCI), the SCImago Journal Ranking (2024 SJR: 0.570, Q2-Computer Science Applications, Q1-Linguistics and Language), the Scopus Index (2024 CiteScore: 7.3), and the SNIP index with 1.61 points. More information at: http://www.sepln.org/en/journal/quality.

Topics

- NLP for low-resource languages
- Efficient and sustainable NLP methods
- Ethics, Bias and Fairness in NLP
- Trustworthiness and explainability in NLP
- Security and privacy in NLP
- Text and Multimodal Generation
- Language Modeling
- Multimodality and Language Grounding to Vision
- Knowledge and common sense
- Computational lexicography and terminology
- Linguistic theories, Cognitive Modeling and Psycholinguistics
- Morphological and Syntactic analysis
- Corpus linguistics
- Development of linguistic resources and tools
- Semantics, pragmatics, and discourse
- Machine translation
- Speech synthesis and recognition
- Audio indexing and retrieval
- Dialogue systems and interactive systems / Conversational assistants
- Monolingual and multilingual information extraction and retrieval
- Question answering systems
- Automatic textual content analysis
- Sentiment analysis, opinion mining and argument mining
- Plagiarism detection
- Negation and speculation processing
- Text summarization
- Text simplification
- Image retrieval
- NLP in specific domains (Medicine, Law, Education…)

Submission Information

Proposals must be submitted by March 23rd, 2026, and must meet certain format and style requirements. All submissions must be in PDF format and submitted electronically using the OpenReview system. Submitted papers will be subjected to a blind review by at least three members of the program committee.

Categories of papers

Regular papers with original contributions will be accepted for publication in the Journal of Procesamiento del Lenguaje Natural. Research project and system demonstration papers will be published in the CEUR proceedings.

Information for Authors

Proposals of regular papers with original contributions can be written in Spanish or English and should be at most 10 A4-size pages of content, plus unlimited pages for references. The papers must include the following sections:

- The title of the communication (in English and Spanish).
- An abstract in English and Spanish (maximum 150 words).
- A list of keywords or related topics (in English and Spanish).
- The documents must not include headers or footers.

As reviewing will be double-blind, the paper should not include the authors’ names and affiliations. Furthermore, self-references that reveal the authors’ identity should be avoided. The articles should only include the title, the abstract, the keywords and the proposal. We recommend using the LaTeX and Word templates that can be downloaded from the SEPLN website (the author guidelines have been updated): http://www.sepln.org/index.php/en/journal/author-guidelines

SEPLN Workshops Proceedings. High-quality papers will be published in issue 77 of the journal Procesamiento del Lenguaje Natural. Papers that the reviewers assess as quality papers, but not as high-quality papers, will be published in the CEUR proceedings. These works will be presented as posters at the 42nd SEPLN Conference.

Research project & System Demonstration Session. This session will focus on ongoing NLP research projects and system demonstration papers. The summaries of ongoing research projects must include the following information:

- Project title.
- Author name, affiliation and contact information. The review of this kind of paper is not blind.
- Funding institutions.
- Research groups participating in the project.
- Language: English. We will not accept research project summaries in Spanish or other languages.
- An abstract of a maximum of 150 words and a list of keywords.
- Minimum length: 5 A4-size pages.
- Maximum length: 6 A4-size pages (including references).

System demonstration papers must be related to NLP applications, and they must describe the technical details and the NLP components used or developed. The paper must be written in English, with a minimum length of 5 A4-size pages and a maximum length of 6 A4-size pages of content, references included.

The research project summaries and the system demonstration papers will be published in the SEPLN Workshop Proceedings. Accordingly, the paper format must follow the CEUR template. We have adapted the CEUR LaTeX template to SEPLN 2026 and you can download it here (LaTeX, Word).

Note on camera ready

The final version of the paper (camera ready) should be submitted together with a cover letter explaining how the suggestions of the reviewers were implemented in the final version. This cover letter will be considered when deciding whether to finally accept or reject the selected paper.

Preprint policy

The Journal allows the publication of preprints (non-refereed papers posted online, for example on arXiv) at any time, but during the review period the preprint must indicate that the paper is “under review” in the Journal Procesamiento del Lenguaje Natural. Likewise, if the paper is accepted, the preprint must be updated with the DOI, the name of the Journal and the bibliographic information of the paper.

Important dates

- Submission deadline: March 23rd, 2026
- Notification of acceptance: May 16th, 2026
- Camera ready: May 30th, 2026
- Publication: September 2026
- Congress: September 22-25, 2026

For all general enquiries, please contact: sepln2026(a)unileon.es

All information related to the congress can be found at: https://sepln2026.org/

Editorial Committee of the Procesamiento del Lenguaje Natural

Last deadline extension: Language Driven Deliberation Technology Workshop (DELITE 2026) @LREC 2026 - Deadline (archival/non archival): March 9th
by Gabriella Lapesa 28 Feb '26

----------------------------------------------------------
Third Call for Papers: DELITE 2026
The 2nd Workshop on Language-driven Deliberation Technology
Co-located with LREC 2026, Palma, Mallorca (Spain)

*** Last deadline extension: March 9th (archival and non-archival) ***
*** Important info: LREC has extended the early bird registration fee for workshop authors to March 31st (acceptance notification: March 25th) ***
----------------------------------------------------------

OVERVIEW
--------

Deliberation is ubiquitous: from navigating divergent interests in everyday personal life to reaching consensus in the political decision-making process, deliberation describes the communicative process by which a group of people exchange ideas, weigh different arguments, and ultimately reach mutual understanding. In recent years, deliberative processes have gained momentum and have been shown to improve everyday and political decision-making. For the first time, technological solutions are maturing to the point that they can be deployed to support deliberation.

The DELITE workshop provides a forum for presenting new advances in technology around deliberation by addressing researchers in Natural Language Processing, human-computer interaction, corpus linguistics, political science and philosophy, as well as stakeholders and domain experts involved in integrating such technology into decision-making processes.

The topic is particularly timely in the age of LLMs and collective intelligence, which has heightened public awareness of the potential and drawbacks of language technology. While LLMs are transforming the way that much AI research is carried out, it is becoming clear that handling natural argumentation, particularly the sort of discussion found in deliberative settings, presents deep challenges for LLMs that are not likely to be overcome soon. The complex pragmatic structure of such discussions, the subjectivity of the phenomena involved (emotions, storytelling), nuanced presentation, framing and reframing of ideas, and resolution of differences of opinion all lie many orders of magnitude beyond the current parameterization spaces of such models.

We view deliberation as an exercise in Collective Intelligence: the enhanced capacity of groups to make decisions due to collaboration and structured interaction. AI systems should augment and never replace human deliberation, by supporting facilitators, providing discussion summaries, and amplifying/enacting diversity in group decision-making processes.

TOPICS OF INTEREST
------------------

We welcome submissions that address the gaps facing this nascent field, including the scarcity of data on large-scale deliberation, the need for stakeholder requirements, and the need for technology that fosters trust. Topics include, but are not limited to:

* Deliberation theory in NLP models
* In-domain versus across-domain resources
* Integrating language systems into deliberation processes and interfaces
* Technological solutions for online deliberation at scale
* Argument mining for deliberation scenarios
* Visualizing language systems results for human sensemaking
* Empirical foundations for evaluation
* Integrating and reflecting on recent advances in LLMs for deliberation scenarios
* Collective Intelligence frameworks for deliberation at scale
* Human-AI collaboration in group decision-making
* Explainability, ethical questions, and addressing bias

APPLICATION AREAS
-----------------

We welcome submissions from all areas of application, including public policy making, democratic innovations, deliberative democracy, political decision making, citizen engagement and co-creation, intelligence services and military, conflict resolution/mitigation, case analysis in healthcare, legal decision making, and scholarly discourse.

SUBMISSION
----------

DELITE 2026 introduces new submission formats to foster diversity and inclusion, specifically opening the venue to junior researchers and fields where conference papers are not standard (e.g., Social Sciences).

* Standard Papers: Oral and poster presentations of long and short papers.
* Extended Abstracts (non-archival): A new format designed to be inclusive of researchers from fields where conference papers are not standard (e.g., Social Sciences).
* PhD Project Proposals: A non-archival submission option allowing doctoral students to collect feedback on their research plans without the pressure of a full-fledged publication.
* Non-Archival Reports: Poster presentations of non-archival reports of ongoing projects to serve community building.

Standard papers must describe original (completed or in progress) and unpublished work. These papers can be long (8 pages, excluding references) or short (4 pages, excluding references) and must be anonymized to support double-blind reviewing, i.e., they must not include authors’ names and affiliations and should avoid links to non-anonymized repositories. Standard papers that do not conform to these requirements will be rejected without review.

Extended abstracts and non-archival papers must be at most 2 pages, excluding references and an additional page as an appendix for tables/figures.

Submission of all papers is electronic, using the Softconf START conference management system. Papers must follow the LREC 2026 two-column format, using the supplied official style files. The templates can be downloaded from the Style Files and Formatting page provided on the website. Please do not modify these style files, nor should you use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.

Submission link: https://softconf.com/lrec2026/DELITE2026/

The LRE 2026 Map and the "Share your LRs!" initiative
------------------------------------------------------

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e., also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).

IMPORTANT DATES
---------------------

* Archival paper submission: *** March 9th (last extension) ***
* Non-archival paper submission: *** March 9th (last extension) ***
* Notification of acceptance: 25 March 2026
* Camera-ready: 30 March 2026
* Workshop day: 16 May 2026

WORKSHOP ORGANIZERS
-------------------

* Lucas Anastasiou, The Open University, UK
* Katarina Boland, Heinrich Heine University Düsseldorf, Germany
* Anna De Liddo, The Open University, UK
* Neele Falk, University of Stuttgart, Germany
* Annette Hautli-Janisz, University of Passau, Germany
* Gabriella Lapesa, GESIS Leibniz Institute for the Social Sciences, Germany & Heinrich-Heine University of Düsseldorf, Germany
* Julia Romberg, GESIS Leibniz Institute for the Social Sciences, Germany

CONTACT
---------------------

e-mail: lucas.anastasiou@open.ac.uk
website: https://idea.kmi.open.ac.uk/the-2nd-workshop-on-language-driven-deliberatio…
