AI Researcher – Autonomy & Robotics

Our mission is twofold:
On the civilian side, we build labeled 3D information layers from all types of sensors—powering smart cities, drones, autonomous vehicles, infrastructure monitoring, public safety, and far beyond.
On the defense side, we bring true real-time intelligence to the edge—anywhere, for any sensor, at any point on the map—enabling fast, informed, and decisive action in the field.

At Algolight, we combine the best of the civilian world (advanced technologies, open-minded thinking, freedom to innovate, and personal growth) with a bold vision: to become the company that enables AI to truly understand the world.
At the same time, we embrace the best of the defense world—missions with real meaning, a genuine sense of purpose, and the development of systems that save lives.

This is a rare combination of cutting-edge technology, deep impact, global vision, and an innovative environment that pushes you to the highest professional levels.

We’re building autonomy & robotics capabilities that work across all sensors and all operating conditions.
At the core of this effort is a long-term mission: to build a foundation model grounded in massive 2D and 3D photogrammetric data, capable of open-vocabulary perception across environments, platforms, and sensing regimes.

This means going far beyond closed-set detection. We are training models that understand the world through large-scale 2D imagery, multi-view geometry, 3D reconstructions, point clouds, meshes, and spatio-temporal context—and can generalize to previously unseen objects, structures, and scenes.

We’re looking for an AI Researcher who lives at the intersection of foundation models, open-vocabulary detection, multi-modal learning, and real-world deployment, and who wants to turn photogrammetric data into general, reusable world models.

🛠️ What You’ll Be Responsible For

Full ownership of training, fine-tuning, evaluation, and deployment of neural networks for autonomy & robotics tasks:
Detection, Open-Vocabulary Detection, Segmentation, Classification, Tracking / Video Understanding, Visual Question Answering (VQA), Pose Estimation, anomaly/change cues, and spatio-temporal reasoning.
Building foundation models from large-scale 2D and 3D photogrammetric data: multi-view imagery, stereo, SfM/MVS outputs, depth maps, point clouds, meshes, DSM/DTM, and time-varying reconstructions.
Developing open-vocabulary and open-set perception pipelines: text-conditioned models, vision-language supervision, weak/open supervision, and representations that generalize beyond fixed label sets.
Working with multi-modal, multi-sensor data:
VIS / IR / SWIR / Thermal / SAR / LiDAR / Radar / Audio / Seismic / Event / Multi- & Hyper-Spectral, including domain adaptation, augmentation, simulation, and synthetic data strategies when labels are sparse or incomplete.
Designing data and representation strategies for 2D↔3D learning: cross-view consistency, geometry-aware losses, temporal alignment, scene-level representations, and robustness across changing viewpoints and conditions.
Defining and enforcing data and evaluation standards: labeling taxonomies (including open-vocab schemas), QA pipelines, and dataset health metrics; evaluation metrics (precision/recall/IoU, calibration-aware metrics), latency budgets, benchmark suites, and systematic error analysis.
Training & inference optimization for edge and cloud: quantization / pruning / distillation, mixed precision, batch-vs-stream scheduling, conversion to ONNX / TensorRT, and serving with Triton Inference Server where relevant.
Model-level multi-sensor fusion: early/mid/late fusion, temporal models, transformers for video and variable sensor cardinality, and graceful degradation when sensors are missing or degraded.
MLOps / Research Ops: experiment tracking and reproducibility (W&B / MLflow / DVC), model and data versioning, A/B testing, performance dashboards, and regression monitoring over time.
Close collaboration with embedded/edge, autonomy, and physical sensing teams—productizing foundation models into live pipelines that operate under real constraints (latency, bandwidth, power), and iterating based on field feedback.

✅ Requirements

3+ years of hands-on experience training deep learning models for at least two of:
Detection (including open-vocabulary), Segmentation, Classification, Tracking / Video, Pose Estimation, VQA/Q&A, anomaly/change understanding.
Strong practical command of PyTorch (mandatory) and modern training workflows (distributed training is a plus).
Experience working with large-scale 2D and/or 3D vision data (multi-view imagery, depth, point clouds, meshes, or photogrammetry outputs).
Proven ability to debug models deeply across data / labels / geometry / model behavior.
Strong collaboration skills and comfort working in a multi-disciplinary, fast-moving environment.

⭐ Strong Advantages

Experience with open-vocabulary / vision-language models (e.g., CLIP-like supervision, text-conditioned detection/segmentation).
Experience with 2D-3D learning, geometry-aware training, or photogrammetry-driven pipelines.
ONNX / TensorRT / Triton and real deployment experience.
Edge inference optimization (e.g., Jetson) and real-time latency budgeting.
Experience with synthetic data, simulation, self-supervised / contrastive learning, and domain adaptation.
Familiarity with streaming environments (ROS2 / GStreamer / DeepStream) as they relate to production systems.

🚀 What Awaits You

A central role in building foundation models for real-world autonomy and robotics, grounded in massive 2D and 3D photogrammetric data.
Access to rare datasets and large-scale collection pipelines spanning aerial, ground, and multi-platform sensing.
Collaboration with elite sensing and AI teams across multispectral, hyperspectral, seismic, radar, thermal, audio, and more—with deep opportunities for cross-domain learning and tech transfer.
Access to state-level resources: field teams, test infrastructures, sensors, operational challenges, and exceptional minds—where research becomes deployed capability.
A culture that values depth, ownership, and ambition—where you can help define how AI understands the physical world.
💰 Excellent compensation for the right candidate!

🔎 Sensors We Work With

UV, VIS, NIR, SWIR, MWIR, LWIR, Bolometric, Radar, Microphone, Geophone, SAR, Polarimetric cameras, Fiber sensing, Event cameras, Multi-/Hyper-spectral, Laser imaging cameras, Vibrometers, and more.
