publications
- BibleTTS - a large, high-fidelity, multilingual, and uniquely African speech corpus
- Preparing an endangered language for the digital age - The Case of Judeo-Spanish
- Corpora compilation for prosody-informed speech processing
- Congolese Swahili Machine Translation for Humanitarian Response
- TICO-19 – The Translation Initiative for COvid-19
- Participatory Research for Low-resourced Machine Translation - A Case Study in African Languages
- Gamayun – Language Technology for Humanitarian Response
- CATOTRON – A Neural Text-to-Speech System in Catalan
- Masakhane – Machine Translation For Africa
- Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response
- Prosodic phrase alignment for machine dubbing
- Building an open source automatic speech recognition system for Catalan
- Bilingual prosodic dataset compilation for spoken language translation
- Visualizing punctuation restoration in speech transcripts with Prosograph
- Attentional parallel RNNs for generating punctuation in transcribed speech
- Revising the METU-Sabancı Turkish treebank: An exercise in surface-syntactic annotation of agglutinative languages
- Prosograph: A tool for prosody visualisation of large speech corpora
- Automatic extraction of parallel speech corpora from dubbed movies
- From raw data to semantically enriched hyperlinking: Recent advances in the LinkedTV analysis workflow
- Processing the manuscripts of Atatürk