My client is looking for an experienced ML Infrastructure Engineer to support the deployment, optimisation and scaling of advanced machine learning models in production environments. This role sits at the intersection of research and engineering, focused on ensuring models are reliably transitioned from experimentation through to large-scale deployment.

You will work closely with research and platform teams to build and maintain high-performance inference systems, improve deployment processes and help drive infrastructure improvements that enable faster model iteration and release cycles.

This is a strong opportunity to work on technically complex challenges within a fast-moving and highly collaborative environment.

The Role
- Productionise machine learning models from research through validation, staging and live deployment
- Build, maintain and optimise scalable inference infrastructure supporting high-throughput, low-latency workloads
- Improve performance and reliability across GPU-based environments
- Design and implement model serving and deployment workflows
- Develop monitoring and observability tools to track system performance, errors and utilisation
- Support data preparation and model integration as part of the wider development lifecycle
- Collaborate with research, engineering and infrastructure teams to improve deployment efficiency and platform scalability
- Evaluate and integrate third-party infrastructure and inference tooling where appropriate

Requirements
- Proven experience deploying and maintaining ML inference systems in production environments
- Strong programming experience in Python and familiarity with modern machine learning frameworks
- Experience working with containerisation and orchestration technologies such as Kubernetes or similar
- Exposure to distributed systems and cloud-based infrastructure
- Experience supporting GPU workloads and performance optimisation
- Strong troubleshooting skills across performance, scaling and system reliability
- Comfortable working cross-functionally within research-led environments
- Ability to operate in fast-paced teams with evolving technical priorities

Nice to Have
- Experience building or improving model serving infrastructure
- Understanding of distributed training or inference techniques
- Experience debugging low-level performance or hardware-related issues
- Exposure to real-time or latency-sensitive ML applications