Infrastructure Subject Matter Expert for BCDR & DR Automation - KSA
DeepSource Technologies · Posted 2026-06-11
Role Overview: The Infrastructure SME plays a critical role in ensuring that the underlying IT infrastructure fully supports Business Continuity and Disaster Recovery (BCDR) objectives, with a strong focus on DR Automation. This role bridges infrastructure and automation teams, ensuring resilience, scalability, and seamless failover/failback execution across all infrastructure layers.

Key Responsibilities

1. Infrastructure Architecture & Readiness
· Review and validate the end-to-end infrastructure architecture supporting automated DR failover and failback, including:
  o Network, security, compute, storage, virtualization, containers, and data center components
· Ensure the design supports high availability, resiliency, and recoverability aligned with business requirements.

2. DR Automation Integration
· Act as the primary bridge between infrastructure teams and the DR Automation team, ensuring alignment and seamless collaboration.
· Review and validate automated failover/failback workflows across infrastructure components, including:
  o Network, security, servers, storage, DNS, virtualization platforms, and container environments
· Collaborate on the development of pre-failover validation scripts to ensure readiness before execution.

3. Recovery Objectives & Capacity Planning
· Review and validate infrastructure-level RTOs, ensuring alignment with application and business recovery requirements.
· Ensure sufficient capacity and performance within DR sites and automation platforms to support:
  o Full failover scenarios
  o Partial or phased failover scenarios

4. Technical Leadership & Engagement
· Lead and actively participate in technical discussions and workshops across:
  o Discovery
  o Validation
  o Tabletop exercises
· Provide domain expertise and recommendations to ensure robust infrastructure design and DR strategy alignment.

5. Performance & Validation
· Oversee and validate infrastructure performance testing during and after DR failover/failback activities.
· Ensure that systems meet defined performance benchmarks and recovery objectives post-recovery.

6. Compliance & Audit Readiness
· Review and ensure adherence to audit and regulatory requirements, particularly around:
  o Logging
  o Monitoring
  o Traceability of DR activities
· Support audit readiness by ensuring proper documentation and controls are in place.

7. Cross-Functional Collaboration
· Collaborate with Application, Network, Security, Database, and Business teams to ensure end-to-end alignment.
· Coordinate with stakeholders to ensure dependencies are properly managed across infrastructure and application layers.

8. Continuous Improvement & Optimization
· Identify opportunities to optimize infrastructure resilience, performance, and cost efficiency.
· Drive continuous improvement initiatives based on test results, incidents, and evolving business needs.

Qualifications
· Strong expertise in enterprise infrastructure design and operations (network, compute, storage, virtualization, cloud).
· Hands-on experience with Disaster Recovery architectures and DR automation tools.
· Deep understanding of failover/failback mechanisms and infrastructure dependencies.
· Experience in capacity planning, performance testing, and high availability design.
· Knowledge of regulatory and compliance requirements related to DR and infrastructure.
· Strong stakeholder communication and cross-team coordination skills.

Preferred Qualifications
· Experience in large-scale BCDR and DR Automation programs.
· Certifications in infrastructure technologies, cloud platforms, or DR/BCDR frameworks.