Nile Bits, LLC. · Cairo, Egypt · Posted 2026-05-18
Company Description

Project Overview
Join a cutting-edge initiative focused on building advanced AI voice infrastructure for Arabic-speaking markets. The project involves developing state-of-the-art Arabic speech technologies, including:
- Natural Text-to-Speech (TTS)
- Real-Time Automatic Speech Recognition (ASR)
- End-to-End Speech-to-Speech Conversational Systems
The solutions are tailored to regional Arabic dialects, including Egyptian, Gulf, Levantine, and others.

Job Description
We are seeking a highly skilled Senior Applied Machine Learning Engineer with deep expertise in speech and audio technologies. In this role, you will design, fine-tune, and optimize advanced machine learning models for Arabic voice applications. You will work across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment.

This position is ideal for engineers who are passionate about transforming cutting-edge research into scalable, low-latency systems that support natural and accurate Arabic speech interactions.

Key Responsibilities
- Benchmark and evaluate TTS and ASR models using Arabic-specific test sets, measuring metrics such as Word Error Rate (WER), naturalness, and dialect coverage.
- Fine-tune generative models for voice cloning, zero-shot speaker adaptation, and speech synthesis.
- Build and maintain Arabic-focused data pipelines, including:
  - Audio collection and preprocessing
  - Diacritization (Tashkil)
  - Data cleaning and augmentation
- Optimize model inference for production environments using:
  - Quantization
  - KV-cache tuning
  - Streaming inference techniques
- Integrate and evaluate complete speech-to-speech conversational pipelines.
- Conduct experiments based on recent research papers and convert findings into production-ready solutions.
- Collaborate with engineering and product teams to deploy robust and scalable speech systems.
Qualifications

Required Qualifications
- 5+ years of experience in Machine Learning, Applied AI, or AI Research.
- Strong programming skills in Python.
- Extensive hands-on experience with PyTorch and the Hugging Face ecosystem.
- Proven experience training and fine-tuning neural models for:
  - Text-to-Speech (TTS)
  - Automatic Speech Recognition (ASR)
  - Audio codecs
- Deep understanding of modern speech architectures such as:
  - Whisper
  - Conformer
  - HiFi-GAN
  - Diffusion-based models
- Experience with audio processing techniques including:
  - Voice Activity Detection (VAD)
  - Speaker Diarization
  - Neural Vocoders
- Demonstrated ability to implement and adapt research papers into practical production experiments.
- Strong understanding of Arabic language challenges, including:
  - Diacritization (Tashkil)
  - Dialectal variations
  - Code-switching
- Experience with inference optimization techniques such as:
  - Quantization
  - Streaming inference
  - NVIDIA TensorRT

Preferred Qualifications
- Experience developing custom NVIDIA CUDA kernels for high-performance model inference.
- Familiarity with speculative decoding and other advanced acceleration techniques.
- Experience deploying models at scale in cloud or GPU-based production environments.
- Contributions to open-source speech or machine learning projects.
Additional Information

WHY YOU’LL LOVE US
- Free benefits for all employees (our famous games room, daily breakfast, fruits, coffee and other hot drinks, soft drinks and juices, company days out and parties…)
- Social insurance
- Open-door management policy
- Full medical insurance
- Accommodation and transportation allowance
- Friendly environment that values innovation and efficiency
- Exciting opportunities for career growth and talent development
- Feedback encouragement
- Recognition and reward programs
- Competitive salaries and incentives
- Flexible and comfortable schedule
- Fun committees
- Monetary rewards
- Fun, smart and creative people
- Career possibilities with a growing team
- Paid vacations
- Social benefits

For more information about Nile Bits, please visit our website: https://www.nilebits.com