3rd International Workshop on Natural Scientific Language Processing (NSLP 2026): 
Final Call for Papers

12 May 2026 – Co-located with LREC 2026
Palma, Mallorca (Spain)

NSLP 2026 features two shared tasks:

NSLP 2026 – important dates:

NSLP 2026 website (including the shared tasks):

Scientific research has witnessed steep growth over the past decades. The number of scholarly publications is growing exponentially, doubling every 15-17 years. Consequently, both general and specialised repositories, databases, knowledge graphs, and digital libraries have been developed to publish and manage scientific artifacts. Examples include the Open Research Knowledge Graph (ORKG), the Semantic Scholar Academic Graph (S2AG), PubMed Central, and the ACL Anthology. These resources enable the collection, reuse, tracking, and expansion of scientific findings, and facilitate downstream applications such as scientific search engines.

However, developing robust systems that deal with scholarly text requires addressing various challenges. The current status quo of scientific communication consists mostly of scholarly articles published as unstructured PDF documents, which are not machine-readable in the sense that relevant scientific information cannot be extracted easily; extracting and utilising this information as part of the scientific process is therefore a laborious and time-consuming task. Developing methods for converting unstructured information into structured formats is one of the major challenges in the field of Natural Scientific Language Processing (NSLP). This goal encompasses related challenges such as detecting, disambiguating, and linking mentions of scientific artifacts (e.g., software tools, datasets, or language resources), and tracking state-of-the-art models and their evaluation scores (including new versions of existing models). Extracting and managing heterogeneous scientific knowledge effectively remains a challenging, ongoing research area: existing efforts are often fragmented, addressing separate issues with distinct datasets and conceptual approaches.

NSLP 2026 addresses current topics and issues in Natural Scientific Language Processing. It is proposed and organised with the support of NFDI for Data Science and Artificial Intelligence (NFDI4DS), a long-term project with approx. 20 partners working towards a German national research data infrastructure for DS and AI. The workshop aims to bring together the international community of researchers working on NSLP and related topics (including research knowledge graphs) to discuss current issues and possible solutions. NSLP 2026 includes two keynote speakers, presentations of accepted papers (oral and poster presentations), and two shared tasks.

Topics of interest include, but are not limited to

Important Dates

Submission Guidelines

The NSLP 2026 workshop invites submissions of: regular long papers; short papers; position papers. We especially encourage submissions from junior researchers and students from diverse backgrounds.
When submitting a paper through START, the authors will be asked to provide essential information about resources (in a broad sense, i.e., also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).

Keynote Speakers

Shared Tasks

1. ClimateCheck 2026: Scientific Fact-Checking of Social Media Claims

The rise of climate discourse on social media offers new channels for public engagement but also amplifies mis- and disinformation. As online platforms increasingly shape public understanding of science, tools that ground claims in trustworthy, peer-reviewed evidence are necessary. The new iteration builds on the results and insights of ClimateCheck 2025 (run at SDP 2025, co-located with ACL 2025) and offers the following subtasks:

Subtask 1: Abstract retrieval and claim verification: given a claim and a corpus of publications, retrieve the ten most relevant abstracts and classify each claim-abstract pair as supports, refutes, or not enough information.

Subtask 2: Disinformation narrative classification: given a claim, predict which climate disinformation narrative it expresses, according to a predefined taxonomy.

New training data will be released for both subtasks, with the Subtask 1 dataset tripling in size compared to the previous iteration. The new iteration will also focus on sustainability, emphasising the need to build climate-friendly NLP systems with minimal environmental impact.
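As a rough illustration of the retrieve-then-classify setting in Subtask 1 (not an official baseline), the sketch below ranks abstracts by simple term overlap and attaches a label to each claim-abstract pair. The tokenizer, the overlap scorer, and the placeholder verifier are purely illustrative stand-ins for real retrieval and NLI components:

```python
from collections import Counter

# The three labels defined by the subtask.
LABELS = {"supports", "refutes", "not enough information"}

def tokenize(text):
    # Crude whitespace tokenizer; a real system would use a proper NLP tokenizer.
    return [t.lower().strip(".,;:") for t in text.split()]

def overlap_score(claim, abstract):
    # Multiset term overlap as a stand-in for a sparse or dense retriever.
    return sum((Counter(tokenize(claim)) & Counter(tokenize(abstract))).values())

def retrieve_top_k(claim, corpus, k=10):
    # corpus: {abstract_id: abstract_text}; returns the k best-scoring ids.
    ranked = sorted(corpus, key=lambda i: overlap_score(claim, corpus[i]), reverse=True)
    return ranked[:k]

def verify(claim, abstract):
    # Placeholder: a trained claim-verification model would predict one of LABELS.
    return "not enough information"
```

A submission would then pair each claim with its ten retrieved abstracts and one label per claim-abstract pair.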

Shared task co-organisers: Raia Abu Ahmad, Aida Usmanova, Max Upravitelev, Georg Rehm

2. SOMD 2026: Software Mention Detection & Coreference Resolution

Understanding software mentions is crucial for reproducibility and for interpreting experimental results. Citations of software are often informal and lack persistent identifiers, making it hard to infer and disambiguate knowledge about software efficiently. This task builds on SOMD 2025 (run at SDP 2025, co-located with ACL 2025) and focuses on entity disambiguation, an under-investigated problem in this context. More precisely, we address coreference resolution of software mentions across multiple documents: given a set of software mentions extracted from multiple scientific publications, cluster these mentions so that all mentions in a particular cluster refer to the same real-world software. We define three subtasks with varying challenges:

Subtask 1: Software coreference resolution over gold standard mentions. Addresses the task based on high-quality (gold standard) mentions of software that are expert-annotated in multiple publications.

Subtask 2: Software coreference resolution over predicted mentions. Addresses the task on software mentions that are automatically extracted using a baseline model, i.e., reflecting a typical information extraction scenario, where upstream pipelines (such as entity and metadata extraction) are imperfect.

Subtask 3: Software coreference resolution at scale. Addresses the task using predicted mentions of software and metadata at a larger scale. This challenges models to scale effectively, maintain accuracy, and distinguish among an increasingly dense field of similar or overlapping software mentions.
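As a naive illustration of the cross-document coreference setting (not the official baseline), the sketch below clusters software mentions by a normalised surface form. All function names and the normalisation rule are hypothetical:

```python
def normalise(mention):
    # Crude normalisation; real systems must handle versions, aliases,
    # abbreviations, and context (e.g. "R" the language vs. "R" elsewhere).
    return mention.lower().replace("-", " ").strip()

def cluster_mentions(mentions):
    # mentions: list of (doc_id, surface_form) pairs extracted from publications;
    # returns clusters whose members are assumed to denote the same software.
    clusters = {}
    for doc_id, surface in mentions:
        clusters.setdefault(normalise(surface), []).append((doc_id, surface))
    return list(clusters.values())
```

Such a surface-form baseline conflates distinct tools that share a name and splits aliases of the same tool, which is precisely the kind of failure the three subtasks are designed to probe at increasing scale.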

Shared task co-organisers: Sharmila Upadhyaya, Stefan Dietze, Frank Krüger, Wolfgang Otto

Organisers

Programme Committee