The workshop "Leveraging Derived Text Formats to Unlock Copyrighted Collections for Open Science" will be held at the Language Resources and Evaluation Conference (LREC 2026).
Derived Text Formats (DTF), also known as extracted features, offer a promising solution for enabling research on textual data that cannot be shared in its original form due to copyright or privacy restrictions. This workshop brings together researchers, legal experts, and infrastructure providers to explore the creation, standardization, legal framing, and scientific use of derived data in linguistics, digital humanities, and language technology.
We invite contributions from the community that address practical experiences, challenges, and solutions related to:
The workshop will be held as a hybrid event. The exact workshop date will be communicated in due time.
Submissions should be 4 to 8 pages in length (excluding references and potential Ethics Statements). Submissions should follow the LREC stylesheet, available on the conference website on the Author’s kit page at https://lrec2026.info/authors-kit/. Submissions will be reviewed by the workshop organizers and the programme committee.
Submissions will be handled via the Softconf submission system at https://softconf.com/lrec2026/DTF.
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).
For questions, please contact: dtf-at-lrec2026@googlegroups.com
For updates, see https://text-plus.org/en/aktuelles/veranstaltungen/2026-05-12-lrec-dtf/