(Freelance) Pre-Training Engineer

Kog

Posted on Nov 8, 2025

Location: Paris, France
Employment Type: Contract
Location Type: Remote
Department: Engineering

About Kog

Kog is a French deeptech company building an ultra-fast execution layer for real-time AI.

We target up to 10x gains through GPU optimization and, crucially, up to 10x gains through model and training architecture design.

We start on AMD GPUs and will expand to other accelerators.

Our aim is a modular, real-time AI platform where developers and users can generate, customize, and operate AI agents and applications collaboratively, with a strong focus on European sovereignty, efficiency, and user control.

Context

We have access to a large NVIDIA H100 training cluster with 200+ GPUs. Our immediate priority is to optimize LLM pretraining efficiency on this cluster. We are competent with our current setup, but not yet at the level we need. We want a hands-on engineer who already runs pretraining at scale elsewhere and can quickly profile our runs, fix what is wrong, and document best practices so our team can execute independently.
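
To make "pretraining efficiency" concrete, here is a minimal sketch of the MFU (model FLOPs utilization) arithmetic typically tracked for runs like these; every number below is an illustrative assumption, not a measurement from our cluster.

    # MFU sketch; all figures are illustrative assumptions.
    H100_PEAK_BF16_FLOPS = 989e12  # dense BF16 peak per H100 SXM; check your SKU
    n_gpus = 256                   # assumed job size
    params = 7e9                   # assumed model parameter count
    tokens_per_sec = 2.5e6         # assumed cluster-wide training throughput

    flops_per_token = 6 * params   # standard forward+backward approximation
    mfu = (flops_per_token * tokens_per_sec) / (n_gpus * H100_PEAK_BF16_FLOPS)
    print(f"MFU: {mfu:.1%}")       # ~41% here; strong runs land around 40-50%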

Missions

  • Profile current LLM pretraining runs on an NVIDIA H100 cluster using NCCL and SLURM (see the sketch after this list).

  • Critique and improve launch, scheduling, data and model parallelism, checkpointing, fault tolerance, and monitoring practices.

  • Implement targeted fixes to improve stability, throughput, and cost efficiency on 200+ GPU jobs.

  • Produce clear documentation and runbooks enabling the team to sustain improvements.

  • Bonus: bring practical LLM expertise that improves pretraining efficiency or quality through better model or training architecture choices.
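
As a flavor of the first bullet, a minimal sketch of NCCL process-group initialization under SLURM, assuming one srun task per GPU and MASTER_ADDR/MASTER_PORT exported by the sbatch script; this is an illustrative assumption, not our actual launch code.

    import os
    import torch
    import torch.distributed as dist

    # Minimal sketch: NCCL process-group init under SLURM, assuming one
    # srun task per GPU and MASTER_ADDR/MASTER_PORT set in the sbatch script.
    rank = int(os.environ["SLURM_PROCID"])         # global rank assigned by SLURM
    world_size = int(os.environ["SLURM_NTASKS"])   # total processes in the job
    local_rank = int(os.environ["SLURM_LOCALID"])  # rank within this node

    torch.cuda.set_device(local_rank)  # pin each process to its own GPU
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    # Setting NCCL_DEBUG=INFO in the job environment surfaces topology and
    # ring/tree selection in the logs, which is where profiling often starts.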

Profile

  • Recent hands-on experience training neural networks on NVIDIA clusters with H100 GPUs and NCCL.

  • Deep knowledge of SLURM for large multi-node jobs.

  • Strong expertise in end-to-end profiling of training workloads and removing bottlenecks.

  • Proven track record of LLM pretraining on clusters of several hundred GPUs.

  • Pragmatic engineering mindset focused on launching, optimizing, and monitoring real training.

  • Not a research role. We value practical, production-grade execution over theory.

  • You have done this elsewhere and can step in immediately.

Contract

  • Freelance engagement of 1 day up to 2 weeks.

  • Remote-friendly.

  • Attractive day rate.

  • Strong performance can lead to a full-time offer.

If you feel you're up to the task, apply right below!