An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Updated Oct 10, 2023 - Python
This code provides a word-level language identification tool that labels the language of each word in code-mixed text, for example text that mixes English with Hindi written in Roman script.
Implementation of meta-transfer-learning for ASR and LM (ACL 2020)
Multilingual Meta-Embeddings for Named Entity Recognition (RepL4NLP & EMNLP 2019)
PyTorch implementation of CS-Tacotron, an end-to-end generative TTS model for code-switching speech synthesis.
[ACL 2024] 💬 MaskLID: Code-Switching Language Identification through Iterative Masking
Hierarchical Korean-English Code-Switching Speech Recognition Benchmark (EACL Findings 2026) | Korean-English mixed-speech recognition benchmark
Local WhatsApp-to-persona workbench for parsing chats, ranking fine-tuning paths, and exporting SFT/DPO datasets, memory profiles, and character files.
Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning (CALCS 2018, ACL)
Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection
VoiceTut-TTS is an Egyptian-Arabic text-to-speech system fine-tuned from OmniVoice on ~380 hours of Egyptian podcast speech. It produces natural Egyptian speech with seamless Arabic ↔ English code-switching, ships 15 built-in studio voices, and supports zero-shot voice cloning.
EduMIND — Multimodal Bilingual Lecture Assistant with Anti-Forget RAG, Whisper ASR, VietMix NMT (CMI-aware), and Human-in-the-Loop Active Learning via Label Studio. Built for code-mixed Vietnamese–English academic environments.
Implementation of a deep learning model (BiLSTM) to detect code-switching
A sequence tagging model with active learning
Repository containing code for abusive tweet detection, location detection, and gender detection
POSIT aims to segment and tag mixed text that contains English and C-like code, so that the user knows both what each token is and what role it serves within its language, such as an AST tag or PoS tag.
A production-oriented Levantine Text-to-Speech pipeline built on a fine-tuned XTTS-v2 optimized for real-time conversational agents.
CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning
Point of Interest Error Rate (PIER) Metric for Code-Switching ASR: A specialized evaluation metric designed to focus on critical points in multilingual speech recognition, providing a more accurate analysis of code-switched utterances.
A package for determining the matrix language in bilingual sentences