GPU Engineer

Gensyn

Remote

Posted on Tuesday, February 27, 2024

The world will be unrecognisable in 5 years.

Machine learning models are driving our cars, testing our eyesight, detecting our cancer, giving sight to the blind, giving speech to the mute, and dictating what we consume, enjoy, and think. These AI systems are already an integral part of our lives and will shape our future as a species.

Soon, we'll conjure unlimited content: from never-ending TV series (where we’re the main character) to personalised tutors that are infinitely patient and leave no student behind. We’ll augment our memories with foundation models—individually tailored to us through RLHF and connected directly to our thoughts via Brain-Machine Interfaces—blurring the lines between organic and machine intelligence and ushering in the next generation of human development.

This future demands immense, globally accessible, uncensorable, computational power. Gensyn is the machine learning compute protocol that translates machine learning compute into an always-on commodity resource—outside of centralised control and as ubiquitous as electricity—accelerating AI progress and ensuring that this revolutionary technology is accessible to all of humanity through a free market.

Our Principles:

AUTONOMY

Don’t ask for permission - we have a constraint culture, not a permission culture.
Claim ownership of any work stream and set its goals/deadlines, rather than waiting to be assigned work or relying on job specs.
Push & pull context on your work rather than waiting for information from others and assuming people know what you’re doing.
No middle managers - we don’t (and will likely never) have middle managers.

FOCUS

Small team - misalignment and politics scale super-linearly with team size. Small protocol teams rival much larger traditional teams.
Thin protocol - build and design thinly.
Reject waste - guard the company’s time, rather than wasting it in meetings without clear purpose/focus, or bikeshedding.

REJECT MEDIOCRITY

Give direct feedback to everyone immediately rather than avoiding unpopularity, expecting things to improve naturally, or trading short-term pain for extreme long-term pain.
Embrace an extreme learning rate rather than assuming limits to your ability/knowledge.

Responsibilities

👉 Write performant GPU kernels and GPU compute infrastructure - from integrating with common frameworks such as PyTorch, down to an intermediate representation (IR) for training - with particular focus on ensuring reproducibility

👉 Write novel algorithms with numerical properties suitable for modern cryptographic systems - designing stable compute flows for deep learning algorithms

👉 Ownership - in the following areas:

Front-end - handle the handshake between common deep learning frameworks and Gensyn's custom ops, hooking up GPU kernels all the way up to the users
Number representations - deal with mixed precision, floating-point data types, and fixed-point variants

Minimum Requirements

✅ Solid software engineering skills - practising software engineer who has contributed significantly to, or shipped, production code

✅ Deep understanding of parallel programming and hardware - specifically as it pertains to GPUs

✅ Ability to operate on:

GPU Kernels (CUDA, PTX, IR); and/or
Low-level GPU-specific optimisations - for performance and numerical stability

✅ Highly self-motivated with excellent verbal and written communication skills

✅ Comfortable working in an applied research environment - with extremely high autonomy

Nice to haves

🔥 Architecture understanding - full understanding of a computer architecture specialised for training NN graphs (Intel Xeon CPU, GPUs, TPUs, custom accelerators)

🔥 Open-source contributions to high-performance GPU codebases

🔥 Compilation understanding - strong understanding of compilation with regard to one or more high-performance computer architectures (CPU, GPU, custom accelerator, or a heterogeneous system of all such components)

🔥 Compiler knowledge - base-level understanding of a traditional compiler (LLVM, GCC) and graph traversals

🔥 Proven technical foundation - in CPU and GPU architectures, numeric libraries, and modular software design

🔥 Deep Learning understanding - both recent architecture trends and the fundamentals of how training works, plus experience with machine learning frameworks and their internals (e.g. PyTorch, TensorFlow, scikit-learn)

🔥 Exposure to Deep Learning Compiler frameworks - e.g. TVM, MLIR, TensorComprehensions, Triton, JAX

🔥 Kernel experience - experience writing highly performant GPU kernels from scratch and optimising them

*If you fall outside these criteria, we still encourage you to apply, as there may be openings at higher or lower levels than those listed above.

Compensation / Benefits:

💰 Competitive salary + share of equity and token pool

🌐 Fully remote work - we hire between the West Coast (PT) and Central Europe (CET) time zones

🛫 4x all-expenses-paid company retreats around the world per year

💻 Whatever equipment you need

❤️ Paid sick leave

🏥 Private health, vision, and dental insurance - including spouse/dependents [🇺🇸 only]

Note: please only submit CVs in .pdf format.
