Publications
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
Authors: Yassine El Kheir, Youness Samih, Suraj Maharjan, Tim Polzehl, Sebastian Möller
Publication date: 2025/2/8
Conference: NAACL Findings 2025
Description: This paper conducts a comprehensive layer-wise analysis of self-supervised learning (SSL) models for audio deepfake detection across diverse contexts, including multilingual datasets (English, Chinese, Spanish), partial, song, and scene-based deepfake scenarios. By systematically evaluating the contributions of different transformer layers, we uncover critical insights into model behavior and performance. Our findings reveal that lower layers consistently provide the most discriminative features, while higher layers capture less relevant information. Notably, all models achieve competitive equal error rate (EER) scores even when employing a reduced number of layers. This indicates that we can reduce computational costs and increase the inference speed of detecting deepfakes by utilizing only a few lower layers. This work enhances our understanding of SSL models in deepfake detection, offering valuable insights applicable across varied linguistic and contextual settings.
MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies
Authors: Ehsaneddin Asgari, Yassine El Kheir, Mohammad Ali Sadraei Javaheri
Publication Date: 2025/2/2
Conference: Submitted to ACL 2025 (arXiv:2502.00894)
Description: We introduce MorphBPE, a morphology-aware extension of BPE that integrates linguistic structure into subword tokenization while preserving statistical efficiency, specifically for morphologically rich languages.
Fanar: An Arabic-Centric Multimodal Generative AI Platform
Authors: Fanar Team, Ummar Abbas, Mohammad Shahmeer Ahmad, Firoj Alam, Enes Altinisik, Ehsannedin Asgari, Yazan Boshmaf, Sabri Boughorbel, Sanjay Chawla, Shammur Chowdhury, Fahim Dalvi, Kareem Darwish, Nadir Durrani, Mohamed Elfeky, Ahmed Elmagarmid, Mohamed Eltabakh, Masoomali Fatehkia, Anastasios Fragkopoulos, Maram Hasanain, Majd Hawasly, Mus'ab Husaini, Soon-Gyo Jung, Ji Kim Lucas, Walid Magdy, Safa Messaoud, Abubakr Mohamed, Tasnim Mohiuddin, Basel Mousi, Hamdy Mubarak, Ahmad Musleh, Zan Naeem, Mourad Ouzzani, Dorde Popovic, Amin Sadeghi, Husrev Taha Sencar, Mohammed Shinoy, Omar Sinan, Yifan Zhang, Ahmed Ali, Yassine El Kheir, Xiaosong Ma, Chaoyi Ruan
Publication Date: 2025/1/18
Report: arXiv preprint arXiv:2501.13944
Description: Fanar is a platform for Arabic-centric multimodal generative AI systems, supporting language, speech, and image generation tasks, with key components like Fanar Star and Fanar Prime, offering state-of-the-art Arabic language models and advanced capabilities like Islamic Retrieval Augmented Generation (RAG).
Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic
Authors: Yassine El Kheir, Hamdy Mubarak, Ahmed Ali, Shammur Absar Chowdhury
Publication Date: 2024/8/5
Conference: ACL 2024
Description: This paper presents a novel framework for dialectal sound and vowelization recovery in Arabic, addressing the challenge of recognizing borrowed and dialectal sounds in phonologically diverse languages, using limited data to improve performance.
Larabench: Benchmarking Arabic AI with Large Language Models
Authors: Ahmed Abdelali, Hamdy Mubarak, Shammur Chowdhury, Maram Hasanain, Basel Mousi, Sabri Boughorbel, Samir Abdaljalil, Yassine El Kheir, Daniel Izham, Fahim Dalvi, Majd Hawasly, Nizi Nazar, Youssef Elshahawy, Ahmed Ali, Nadir Durrani, Nataša Milić-Frayling, Firoj Alam
Publication date: 2024/3
Conference: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Pages: 487-520
Automatic Pronunciation Assessment - A Review
Authors: Yassine El Kheir, Ahmed Ali, Shammur Absar Chowdhury
Publication Date: 2023/10/21
Conference: Findings of EMNLP 2023
Description: A comprehensive review of recent advancements in automatic pronunciation assessment for both phonemic and prosodic aspects, discussing methods, challenges, and resources, with directions for future research.
L1-aware Multilingual Mispronunciation Detection Framework
Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali
Publication Date: 2023/9/14
Conference: IEEE ICASSP 2024
Description: This paper introduces the L1-MultiMDD framework, which incorporates L1-aware speech representations to detect mispronunciations across multiple languages, improving multilingual mispronunciation detection and diagnosis (MDD) by integrating an L1-L2 embedding and multi-task learning.
Multi-View Multi-Task Representation Learning for Mispronunciation Detection
Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali
Publication Date: 2023/6/2
Conference: Speech and Language Technology in Education Workshop (SLaTE 2023)
Description: This paper proposes a novel architecture for mispronunciation detection that uses multiple views of the input data assisted by auxiliary tasks to learn more distinctive phonetic representations in low-resource settings, outperforming the state-of-the-art models.
QVoice: Arabic Speech Pronunciation Learning Application
Authors: Yassine El Kheir, Fouad Khnaisser, Shammur Absar Chowdhury, Hamdy Mubarak, Shazia Afzal, Ahmed Ali
Publication date: 2023/5/9
Conference: INTERSPEECH 2023