Datasets through the L👀king-Glass

Datasets through the L👀king-Glass is a webinar series focusing on the data aspects of learning-based methods. Our aim is to build a community of scientists interested in understanding how the data we use affects algorithms and society as a whole, rather than only optimizing for a performance metric. We draw inspiration from a variety of topics, such as data curation for building datasets, metadata, shortcuts, fairness, ethics, and philosophy in AI.

All previous talks whose speakers have agreed to share the recording can be found in our YouTube playlist.

Next webinar: TBA

Date: TBA

Where: Zoom

Previous talks:

All previous abstracts can be found here.

  • S01E01 - Dr. Roxana Daneshjou (Stanford University School of Medicine, Stanford, CA, USA). 27th Feb 2023. Challenges with equipoise and fairness in AI/ML datasets in dermatology
  • S01E02 - Dr. David Wen (Oxford University Clinical Academic Graduate School, University of Oxford, Oxford, UK). 27th Feb 2023. Characteristics of open access skin cancer image datasets: implications for equitable digital health
  • S01E03 - Prof. Colin Fleming (Ninewells Hospital, Dundee, UK). 27th Feb 2023. Characteristics of skin lesions datasets
  • S02E01 - Prof. Amber Simpson (Queen’s University, Canada). 5th June 2023. The medical segmentation decathlon
  • S02E02 - Dr. Esther E. Bron (Erasmus MC - University Medical Center Rotterdam, the Netherlands). 5th June 2023. Image analysis and machine learning competitions in dementia
  • S02E03 - Dr. Ujjwal Baid (University of Pennsylvania, USA). 5th June 2023. Brain tumor segmentation challenge 2023
  • S03E01 - Dr. Thijs Kooi (Lunit, South Korea). 18th September 2023. Optimizing annotation cost for AI based medical image analysis
  • S03E02 - Dr. Andre Pacheco (Federal University of Espírito Santo, Brazil). 18th September 2023. PAD-UFES-20: the challenges and opportunities in creating a skin lesion dataset
  • S04E01 - Dr. Jessica Schrouff (Google DeepMind, UK). 4th December 2023. Detecting shortcut learning for fair medical AI
  • S04E02 - Rhys Compton and Lily Zhang (New York University, USA). 4th December 2023. When more is less: Incorporating additional datasets can hurt performance by introducing spurious correlations
  • S04E03 - Dr. Enzo Ferrante (CONICET, Argentina). 4th December 2023. Building and auditing a large-scale x-ray segmentation dataset with automatic annotations: Navigating fairness without ground-truth
  • S05E01 - Hubert Dariusz Zając and Natalia-Rozalia Avlona (University of Copenhagen, Denmark). 25th March 2024. Ground Truth Or Dare: Factors Affecting The Creation Of Medical Datasets For Training AI
  • S05E02 - Dr. Annika Reinke (DKFZ, Germany). 25th March 2024. Why your Dataset Matters: Choosing the Right Metrics for Biomedical Image Analysis
  • S05E03 - Alceu Bissoto and Dr. Sandra Avila (UNICAMP, Brazil). 25th March 2024. The Performance of Transferability Metrics does not Translate to Medical Tasks


Amelia Jiménez-Sánchez & Veronika Cheplygina at IT University of Copenhagen (Denmark). This project has received funding from the Independent Research Fund Denmark - Inge Lehmann grant number 1134-00017B.


If you want to receive information about upcoming seminars, please sign up to our mailing list. We use the GDPR-compliant Brevo (formerly Sendinblue) as our mail provider. If you have any concerns relating to our data handling, please read our privacy notice.

Please be aware that messages from mailing providers are often tagged as junk, so the confirmation email might end up in your spam folder. Please check there if it does not arrive in your inbox. The sender will be PURRlab @ IT University of Copenhagen (amji @ itu.dk). Please add this sender to your contacts. If you have any problems subscribing to our mailing list, please contact Amelia.