


Data Scientist (Geospatial & Site Intelligence)

Location: London
Type: Full-time, one-year contract with possible renewals and extensions
Industry: Energy Infrastructure / AI / Regulatory Technology

About Us

Nyxium is a London-based deep-tech company backed by top-tier investors. We are building an infrastructure intelligence platform for siting, permitting, and lifecycle compliance across high-impact sectors including renewable energy, nuclear/SMRs, data centers, mining, and grid infrastructure. Nyxium integrates agentic AI, geospatial analytics, machine learning, and structured decision logic to help teams evaluate feasibility, reduce permitting and interconnection risk, and monitor compliance through construction and operations.

Why join us? You will learn how to build a company from scratch, with fast decision-making, direct access to the founders, and a high level of independence.

Role Overview

We are hiring a full-time Geospatial Data Scientist to lead the sourcing, ingestion, validation, and structuring of the core datasets that power Nyxium's site intelligence workflows. This is a foundational role: the quality and reliability of the data layer directly determines the credibility of Nyxium's outputs.
You will work closely with the founding team and engineering leads to ensure Nyxium's datasets are authoritative, scalable, and ready for production. This role is especially focused on expanding and operationalising Nyxium's dataset coverage to global scale.

What You Will Own

You will own the lifecycle of critical datasets, including:
- Geospatial base layers (vector and raster)
- Environmental constraints and hazard datasets
- Weather and climate datasets (including time series)
- Community and socioeconomic indicators (for community-level risk and constraints)
- Metadata, provenance, and versioning practices to support auditability and trust

Core Responsibilities

1) Data Sourcing and Acquisition
- Identify authoritative and regularly updated data sources across the US and Europe
- Acquire datasets through government portals, open data platforms, APIs and bulk downloads, and licensed datasets (when required)
- Evaluate credibility, metadata integrity, update frequency, and licensing constraints
- Document sources clearly so every dataset is traceable and defensible

2) Data Engineering and Storage Design
- Design efficient schemas for large geospatial datasets (vector and raster)
- Optimise storage for query performance, scalability, and reproducibility
- Recommend when datasets should be stored internally, accessed via live API calls, or cached as tiles or AOI subsets
- Support integration with geospatial databases (e.g., BigQuery GIS, PostGIS)

3) Data Quality, Cleaning, and Enhancement
- Build reproducible validation pipelines for geospatial correctness and consistency
- Clean, normalise, and standardise datasets across coordinate systems, geometries, spatial resolutions, and time ranges
- Handle missing data and inconsistent attributes
- Create derived layers and features where needed (aggregation, proximity features, indices)
- Ensure outputs are structured for downstream feasibility scoring and analysis

4) Global Dataset Expansion
- Identify global and national equivalents for core datasets, including:
  - Geospatial and environmental: protected areas; wetlands and water resources; flood, fire, and seismic hazards; land cover, slope, and elevation; climate and weather constraints
  - Economic: electricity cost proxies; land cost proxies; labour cost proxies; regional development indicators relevant to infrastructure feasibility
  - Community and sociological: social vulnerability-type indicators; demographic structure; community risk proxies; other regionally available datasets relevant to development feasibility
- Ensure cross-jurisdiction metadata compatibility (global vs national vs regional)
- Maintain a consistent schema so the platform can scale across countries

Required Qualifications
- MSc or PhD in Geospatial Science, Environmental Modelling, Engineering, or a related field
- Strong Python experience with geospatial libraries such as GeoPandas, Shapely, Rasterio, PyProj, GDAL, and Dask (or similar tools for scaling)
- Experience working with large geospatial datasets and geospatial databases (PostGIS, BigQuery GIS, etc.)
- Strong data validation and data quality discipline
- Ability to work independently and deliver production-grade datasets

Preferred Qualifications (Strong Plus)
- Experience with climate risk, hazard modelling, or environmental datasets
- Familiarity with European data portals and sources
- Experience building ETL pipelines and automated ingestion workflows
- Experience with cloud-based storage and orchestration (GCP, AWS)
- Familiarity with infrastructure-related datasets (grid, transport, land use, permitting layers)
- Exposure to decision-support systems, optimisation, or scoring workflows

What Makes This Role Unique
- You will architect and scale the data foundation of a high-impact infrastructure intelligence platform
- Your work will directly shape product credibility and decision quality
- You will work closely with a technical founding team building a serious product, not a prototype

How to Apply

Please send:
- Your CV
- A short description (or link) of one data pipeline you built (GitHub welcome but not required)
- A short paragraph describing your experience with geospatial and/or environmental datasets