CALD-pseudo workshop at EACL 2024

A cross-disciplinary forum for advancing privacy protection of unstructured text data and data openness through pseudonymization.

Dates and venue

 
Venue: Hotel Radisson Blu, St. Julians, Malta
Date: March 21 or 22, 2024
Submission deadline: December 18, 2023
Submission website: https://softconf.com/eacl2024/CALD-pseudo-2024/
Registration: coming soon
Quick links
Call for papers
Submission information
Important dates
Invited speakers
- Anders Søgaard, Denmark
- Ildikó Pilán, Norway
Program committee
Organizers

Call for papers

 
We invite submissions to the first edition of the CALD-pseudo workshop on Computational Approaches to Language Data Pseudonymization, to be held at EACL 2024 on March 21 or 22, 2024.
Description
Accessibility of research data is critical for advances in many research fields, but textual data often cannot be shared because of the personal and sensitive information it contains, e.g. names, political opinions, sensitive personal information and medical data. The General Data Protection Regulation (GDPR; EU Commission, 2016) suggests pseudonymization as a solution to secure open access to research data, but we need to learn more about pseudonymization as an approach before adopting it for manipulation of research data (Volodina et al., 2023). The main challenge is how to pseudonymize data effectively so that individuals cannot be identified, while at the same time keeping the data usable for the research (e.g. in computational linguistics, linguistics) and natural language processing tasks for which it was collected.
Topics of interest
This workshop invites a broad community of researchers in all concerned cross-disciplinary fields to jointly discuss challenges within pseudonymization, such as
* automatic approaches to detection and labelling of personal information in unstructured language data, including events and other context-dependent cues revealing a person;
* developing context-sensitive algorithms for replacement of personal information in unstructured data;
* studies into the effects of pseudonymization on unstructured data, e.g. applicability of pseudonymized data for the intended research questions, readability of pseudonymized data or addition of unwelcome biases through pseudonymization;
* effectiveness of pseudonymization as a way of protecting writer identity;
* reidentification studies, e.g. adversarial learning techniques that attempt to breach the privacy protections of pseudonymized data;
* constructing datasets for automatic pseudonymization, including methodological and ethical aspects of those;
* approaches to the evaluation of automatic pseudonymization, both in concealing private information and in preserving the semantics of the non-personal data;
* pseudonymization tools and software: evaluating the available tools and software for pseudonymization in different languages, and their ease of use, scalability, and performance;
* and numerous other open questions.

Submission information

 
Authors are invited to submit original, unpublished research papers by December 18, 2023, in the following categories:

Full papers (up to 8 pages) for substantial contributions
Short or demo papers (up to 4 pages) for ongoing or preliminary work

All submissions must be in PDF format, follow the EACL 2024 guidelines described in the ARR CfP (https://aclrollingreview.org/cfp), and use the official ACL style templates available here: https://github.com/acl-org/acl-style-files

Direct submission deadline: December 18, 2023 at https://softconf.com/eacl2024/CALD-pseudo-2024/
Deadline for registration of ARR-reviewed papers: January 17, 2024. (Further instructions will follow.)

We also invite authors of papers on the topics of the workshop accepted to Findings to reach out to the organizing committee of CALD-pseudo to present them at the workshop.
Every paper will be reviewed by at least two members of the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author’s identity, e.g., “We previously showed (Smith, 1991) …”, should be avoided. Instead, use citations such as “Smith previously showed (Smith, 1991) …”. Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/soundness, meaningful comparison, and the significance and impact of ideas or results.

Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.

Important dates

 
* December 18, 2023: Workshop paper deadline
* January 17, 2024: Re-submission of pre-reviewed ARR papers
* January 20, 2024: Notification of acceptance
* January 30, 2024: Camera-ready papers due
* March 21 or 22, 2024: Workshop date

Invited speakers

Anders Søgaard, University of Copenhagen, Denmark
Title TBA
BIO
Anders Søgaard is Full Professor in Natural Language Processing and Machine Learning, Dpt. of Computer Science, University of Copenhagen. He is also affiliated with the Pioneer Centre for Artificial Intelligence, Dpt. of Philosophy, and Center for Social Data Science. He was previously at University of Potsdam, Amazon and Google Research. He has won eight best paper awards and several prestigious gran

Ildikó Pilán, the Norwegian Computing Center, Norway
Title (preliminary): Pseudonymisation and related techniques: a quest for determining what personal information to rewrite and how
BIO
Ildikó Pilán is a Senior Research Scientist at the Norwegian Computing Center, Norway. Her most impactful research comes from linguistic complexity studies within the domain of language learning and, more recently, from the area of anonymization and pseudonymization, where she has been actively working on preparing datasets, benchmarks and models for automatic anonymization and pseudonymization of Norwegian and English data in the project Cleanup (e.g. Lison et al., 2021; Pilán et al., 2022). Her fields of expertise include Natural Language Processing, Machine Learning, privacy protection, data privacy, medical text processing and Intelligent Computer-Assisted Language Learning.

Program

 
TBA

Program committee

 
* Ahrenberg Lars, Linköping University, Sweden
* Ainiala Terhi, University of Helsinki, Finland
* Aldrin Emilia, Halmstad University, Sweden
* Arhar Holdt Špela, University of Ljubljana, Slovenia
* Caines Andres, University of Cambridge, United Kingdom
* Dalianis Hercules, Stockholm University, Sweden
* Dannélls Dana, University of Gothenburg, Sweden
* Dobnik Simon, University of Gothenburg, Sweden
* Grouin Cyril, LIMSI, CNRS, Université Paris-Saclay, France
* Hämäläinen Lasse, University of Helsinki, Finland
* Henriksson Aron, Stockholm University, Sweden
* Kokkinakis Dimitrios, University of Gothenburg, Sweden
* Lassus Jannika, University of Helsinki, Finland
* Lindström Tiedemann Therese, University of Helsinki, Finland
* Lison Pierre, Norwegian Computing Center, Norway
* Lindén Krister, University of Helsinki, Finland
* Ljunglöf Peter, Chalmers University of Technology / University of Gothenburg, Sweden
* Marko Karoline, University of Graz, Austria
* Megyesi Beáta, Stockholm University, Sweden
* Nelson Boel, Aarhus University, Denmark
* Nordman Lieselott, University of Helsinki, Finland
* Ochs Sebastian, Technical University of Darmstadt, Germany
* Pilán Ildikó, Norwegian Computing Center, Norway
* Raheja Vipul, Grammarly, USA
* Sánchez Ruenes David, University of Rovira i Virgili, Spain
* Scheffler Tatjana, Ruhr University Bochum, Germany
* Torra Vicenc, Umeå University, Sweden
* Vakili Thomas, Stockholm University, Sweden
* Vydiswaran VG Vinod, University of Michigan, USA
* Volodina Elena, University of Gothenburg, Sweden
* Vu Xuan-Son, Umeå University, Sweden

Organizers

 
General co-chairs
* Elena Volodina, University of Gothenburg, Sweden
* Therese Lindström Tiedemann, University of Helsinki, Finland
* Simon Dobnik, University of Gothenburg, Sweden (publication chair)
* Xuan-Son Vu, Umeå University, Sweden
Organizing co-chairs
* Ricardo Muñoz Sánchez, University of Gothenburg, Sweden
* Maria Irena Szawerna, University of Gothenburg, Sweden
Contact: mormor.karl@svenska.gu.se
Anti-harassment policy
The CALD-pseudo workshop adheres to the ACL’s anti-harassment policy: https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy
Acknowledgments
The workshop is organized within the project “Grandma Karl is 27 years old” and is supported by a research grant on pseudonymization from the Swedish Research Council.