Senior Data Engineer - Hybrid - KSA - (10-12 Months) - RTG

Robusta Studio · Cairo, Egypt · Posted 2026-05-14

Robusta assists organizations in transitioning to a digital-first approach, crafting unforgettable experiences for their customers. We provide strategy, design, product, and technology services to prominent businesses and brands, utilizing our go-to-market expertise to facilitate seamless customer experiences and enhance conversion rates.

About The Role

We are seeking a highly experienced Senior Data Engineer to lead the technical design, implementation, and delivery of an enterprise-grade, AI-ready Data Lakehouse platform. This role is critical in building the foundational data layer for a large-scale digital transformation initiative that will support AI agents, digital workers, and knowledge graph (ontology) systems.

The ideal candidate will have strong software engineering experience with a focus on data pipeline development, data architecture, and scalable distributed systems. You will play a key role in designing and maintaining robust data infrastructure that enables advanced analytics and AI capabilities.

This position also involves technical leadership, mentoring engineering teams in a collaborative co-building model, and ensuring long-term operational ownership.

Key Responsibilities

- Lakehouse Architecture & Implementation: Design and deploy a unified Data Lakehouse utilizing the Medallion architecture (Bronze, Silver, Gold) and open table formats (e.g., Delta Lake, Apache Iceberg) on cloud infrastructure hosted within Saudi Arabia.
- Data Ingestion & Pipeline Engineering: Build reusable, automated ingestion frameworks (batch and streaming) capable of processing both structured data (RDBMS, APIs) and unstructured data (PDFs, policy documents) to feed downstream AI models and semantic reasoning engines.
- Data Quality & Governance: Implement automated data quality "circuit breakers" (completeness, uniqueness, referential integrity) and end-to-end data lineage tracking frameworks.
- Optimization: Optimize data processing workflows for performance, scalability, and cost-efficiency.
- System Monitoring and Maintenance: Monitor and maintain data systems, responding to SEVs and other urgent issues to ensure continuous operations.
- Security & Compliance: Ensure the platform adheres strictly to NCA (National Cybersecurity Authority) and NDMO (National Data Management Office) standards. Implement AES-256 encryption at rest, TLS 1.2+ in transit, robust Key Management Systems (KMS), and centralized audit logging.
- Access Control Integration: Design and deploy granular Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC), integrating seamlessly with existing enterprise Identity Providers (e.g., Active Directory).
- Capability Building & Handover: Lead hands-on knowledge transfer sessions, pair-program with client engineers, create operational runbooks, and conduct "Game Day" failure simulations to ensure the client's team is fully ready to operate the platform independently.

Requirements

- Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Experience: 5+ years of proven experience in Data Engineering, Distributed Systems, or Big Data Architecture, with at least 2 years specifically leading Data Lakehouse or Cloud Data Platform implementations.

Technical Skills & Core Technologies

- Programming Languages: Proficiency in Python, Java, or Scala.
- Data Architecture & System Design: Strong expertise in designing data-intensive applications, complex data modeling, and schema design for enterprise environments.
- Distributed Systems & Lakehouse Technologies: Deep, hands-on experience with distributed processing engines (e.g., Apache Spark, Kafka, Hadoop) and modern open table formats (e.g., Delta Lake, Apache Iceberg, Apache Hudi).
- ETL/ELT & Orchestration: Experience designing and building robust data pipelines using modern transformation and orchestration tools (e.g., Apache Airflow, Prefect, dbt).
- Database Ecosystems: Proven track record with relational databases (e.g., PostgreSQL, MySQL), NoSQL platforms (e.g., MongoDB, Cassandra), and distributed SQL query engines such as Hive and Trino.
- Cloud Infrastructure: Proven experience deploying enterprise data solutions on major cloud providers, specifically within localized Saudi cloud regions. Expertise in Oracle Cloud Infrastructure (OCI) or Google Cloud Platform (GCP) is highly preferred, though experience with AWS or Azure is acceptable.
- Analytical Skills: Strong problem-solving skills with a keen eye for detail and a passion for data.
- AI/Data Science Enablement: Prior experience building data pipelines optimized for Machine Learning, Natural Language Processing (NLP), vector embeddings, or Knowledge Graphs/Ontologies is highly desirable.
- Security & Networking: Strong understanding of enterprise network security, Private Endpoints, Identity & Access Management (IAM), and cryptographic key management.
- Communication: Excellent written and verbal communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders.
- Leadership Skills: Demonstrated ability to lead technical teams, manage stakeholder expectations, and successfully transition complex systems to internal IT/Data teams.
- Regulatory Knowledge: Familiarity with Saudi Arabian data compliance frameworks (NCA CCC, NDMO, SDAIA) is highly preferred.