Yassine El Kheir - Researcher in Speech Processing and Deepfake Detection

Hi 👋, I’m Yassine El Kheir

I’m a PhD student at the German Research Center for Artificial Intelligence (DFKI) in Berlin, working on robust speech representations in self-supervised learning (SSL) models for audio deepfake detection under the supervision of Prof. Sebastian Möller and Dr. Tim Polzehl.

Research Interests 👀

  • 🔍 Self-Supervised Learning (SSL) for Speech
  • 🎭 Audio Deepfake Detection, Anti-Spoofing
  • 🌍 Multilingual and Non-Native Speech Processing/Recognition
  • 🗣 Automatic Speech Recognition (ASR) and NLP

News ✨

  • 2025-02-05: 🎉 Excited to announce that the MorphBPE Tokenizer, used in Qatar's Fanar LLM, has been published!
  • 2025-01-25: 🎉 Excited to announce that a new paper on model interpretability, Layer-wise Analysis of SSL Models for Audio Deepfake Detection, has been accepted to Findings of NAACL 2025!
  • 2024-12-12: Invited as a researcher for a two-week project at the SDAIA Winter School.

Education 🎓

  • πŸ“ PhD in Computer Science (2024 - Exp. 2027) - DFKI, Berlin, Germany
  • πŸ“ MSc in Machine Learning (2021 - 2022) - KTH Royal Institute of Technology, Sweden
  • πŸ“ Master in Data Science (2020 - 2021) - EURECOM & TΓ©lΓ©com Paris, France
  • πŸ“ Master in Digital Engineering (2019 - 2022) - TΓ©lΓ©com Paris, France
  • πŸ“ Preparatory Classes (CPGE) (2017 - 2019) - LycΓ©e Mohammed VI, Morocco

Jobs 🧑‍💻

  • 2024.07 - ongoing: PhD Student - Researcher @ DFKI, Berlin, Germany
  • 2022.07 - 2024.07: Research Associate @ Qatar Computing Research Institute (QCRI), Qatar
  • 2022.02 - 2022.07: Machine Learning Intern @ Snappet, Netherlands

Selected Publications 📜

  1. Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection - NAACL Findings 2025
  2. Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic - ACL 2024
  3. Speech Representation Analysis Based on Inter- and Intra-Model Similarities - IEEE WICASSP 2024
  4. L1-aware Multilingual Mispronunciation Detection Framework - IEEE ICASSP 2024

For a full list of my publications, visit my Google Scholar.

Projects 🚀

  • News-Polygraph: 🤖 A collaborative research project building a comprehensive, multimodal technology platform for analyzing and detecting disinformation (speech component: deepfake detection).
  • Fanar LLM: 🤖 An Arabic-centric large language model supporting multiple dialects.
  • QVoice: 🗣️ The first Arabic speech mispronunciation detection system.
  • AraVoiceL2 Dataset: 🎀 A dataset of non-native Arabic speech for phoneme-level mispronunciation detection.

Awards & Scholarships 🏅

  • πŸ† Telecom Paris Scholarship (2022-2023)
  • πŸ† Excellence Scholarship FIRSI (2019-2022)
  • πŸ† Prepa FIRSI Scholarship (2018-2019)

Contact 📬