An initial meeting for the project, featuring two guest talks by invited speaker Pierre Lison. He presented results from his anonymisation project, CLEANUP.

Internal kick-off meeting, Gothenburg

Date, time and place: 2023-03-08 – 2023-03-10, F314

Two guest talks by Pierre Lison

Speaker bio: Pierre Lison is a senior researcher at the Norwegian Computing Center, a research institute located in Oslo and conducting research in computer science, statistical modelling and machine learning. Pierre’s research interests include privacy-enhancing NLP, spoken dialogue systems, multilingual corpora and weak supervision. Pierre currently leads the CLEANUP project on data-driven models for text sanitization. He also holds a part-time position as associate professor at the University of Oslo.

Talk 1: Privacy-enhancing NLP: a primer

Date, time and place: 2023-03-09, 13.15-14.30, J335

Abstract: Text documents often contain personal data in some form – either related to the authors themselves or to some other individuals mentioned in the text. This obviously raises a number of privacy issues, especially when those documents are to be published online or included as training data for NLP models. Fortunately, privacy-enhancing techniques can be applied to protect the privacy of the individuals referred to in a given document. In this talk, I’ll review some of those techniques, such as text sanitization, text rewriting, and privacy-preserving training. I’ll then describe in more detail our own work on data-driven text sanitization based on explicit measures of privacy risks, and will also present how such methods can be evaluated using our recently released Text Anonymization Benchmark (TAB).

Talk 2: skweak: Weak Supervision Made Easy for NLP

Date, time and place: 2023-03-10, 10.00-12.00, J577

Abstract: I will present our recent work on weak supervision, and the skweak toolkit that emerged from it. The talk is based on the paper https://aclanthology.org/2021.acl-demo.40/