Member of Technical Staff, Machine Learning at sync. (W24)
$140K - $215K  •  0.30% - 1.50%
AI lipsync tool for video content creators
San Francisco, CA, US
Full-time
6+ years
About sync.

at sync. we're making video as fluid and editable as a word document.

how much time would you save if you could record every video in a single take?

no more re-recording yourself because you didn't like what you said, or how you said it.

just shoot once, revise yourself to do exactly what you want, and post. that's all.

this is the future of video: AI modified >> AI generated

we're playing at the edge of science + fiction.

our team is young, hungry, uniquely experienced, and advised by some of the greatest research minds + startup operators in the world. we're driven to solve impossible problems, impossibly fast.

our founders are the original team behind the open-sourced wav2lip – the most prolific lip-sync model to date, with 9k+ GitHub stars.

we’re at the stage in computer vision today that NLP was at two years ago – a bunch of disparate, specialized models (e.g. sentiment classification, translation, summarization), which generalized large language models then displaced.

we’re taking the same approach – curating high-quality datasets + training a series of specialized models to accomplish specific tasks, while building towards a more generalized approach: one model to rule them all.

post-batch, our growth is e^x – we need help asap to scale up our infra, training, and product velocity.

we look for the following: [1] raw intelligence [2] boundless curiosity [3] exceptional resolve [4] high agency [5] outlier hustle

About the role
Skills: Torch/PyTorch, Python, Machine Learning, Computer Vision

About sync.

We’re a team of artists, engineers, and researchers building tools to understand and modify people in video. In the last year we graduated at the top of our YC batch (W24), raised a $5.5M seed backed by GV, won an AI Grant from Nat Friedman and Daniel Gross, and scaled from $0 to millions in revenue.

We’re building a zero-shot generalized video model to understand, generate, and gain fine-grained control over any human in any video. We’ve already released a state-of-the-art generalized lip-syncing model for content translation and word-level video editing — you can access our models through our developer playground and API.

We had a breakthrough in research: we learned a highly accurate generalized representation of the human face, unlocking editing tasks that were never possible before.

As we build out these model capabilities, we ship ML pipelines around generation and video understanding to bring them into the hands of users. We’re looking for a monstrous machine learning engineer who loves to tinker at the edge of science and fiction.

What are we working on?

We live in an extraordinary time. 

Video generation is world modeling. Deep learning unlocked the ability to decipher the world around us, to understand it, to compress its data and information into the weights of a 70B-parameter network.

By simply changing these underlying numbers – these latent representations — we can reimagine and reconstruct reality in any way we see fit.

This is profound. A high schooler can craft a masterpiece with an iPhone. A studio can produce a movie at a tenth of the cost, 10x faster. Every video can be distributed worldwide in any language with perfect preservation of meaning, instantly. Video becomes as malleable as text.

But we have two fundamental problems to tackle before this is a reality:
[1] Large models are great at generating entirely new scenes and worlds, but struggle with precise control and fine-grained edits. The ability to make subtle, intentional adjustments – the kind that separates good content from great content – doesn’t exist.

[2] If video generation is world modeling, each human is a world unto themselves. We each have our idiosyncrasies that make us unique — creating primitives to capture, express, and modify them is the key to breaking through the uncanny valley.

While our focus in research is to push the boundary on what’s possible through new models, our focus in ML engineering is to push the boundaries of video editing by extending existing pipelines and creating entirely new workflows that make users feel like they’re playing with magic.

Key Responsibilities

  • Architect next-generation ML infrastructure for AI video generation
  • Build core abstractions that enable rapid ML feature development
  • Design high-performance inference systems that scale to millions of users
  • Lead critical ML platform decisions that unlock new product capabilities
  • Bridge research innovations with production-ready features

Required Skills and Experience

  • Exceptional track record building ML platforms that power consumer products
  • Deep expertise in video processing/generation at massive scale
  • Built and scaled ML systems handling millions of daily inferences
  • History of creating ML architectures that become team standards
  • Strong product sense and ability to shape ML feature development

Technical Requirements

  • Advanced knowledge of ML serving architectures and distributed systems
  • Deep understanding of video codecs, processing, and optimization
  • Experience building real-time ML systems with strict latency requirements
  • Expert-level Python and ML framework knowledge
  • Strong foundation in system design and performance optimization

Outcomes

  • Transform research possibilities into reliable product features
  • Build ML infrastructure that scales 100x with minimal changes
  • Create abstractions that accelerate entire team's development
  • Drive order-of-magnitude improvements in key metrics
  • Enable new classes of features previously thought impossible

Looking for someone who:

  • Has shipped multiple ML systems that scaled to millions of users
  • Makes everyone around them better through system design and mentorship
  • Consistently delivers breakthrough technical solutions
  • Thrives on hard technical challenges but focuses on user impact
  • Can see around corners to anticipate scaling challenges before they hit


Our goal is to keep the team lean, hungry, and shipping fast.

These are the qualities we embody and look for:

[1] Raw intelligence: We tackle complex problems and push the boundaries of what's possible.

[2] Boundless curiosity: We're always learning, exploring new technologies, and questioning assumptions.

[3] Exceptional resolve: We persevere through challenges and never lose sight of our goals.

[4] High agency: We take ownership of our work and drive initiatives forward autonomously.

[5] Outlier hustle: We work smart and hard, going above and beyond to achieve extraordinary results.

[6] Obsessively data-driven: We base our decisions on solid data and measurable outcomes.

[7] Radical candor: We communicate openly and honestly, providing direct feedback to help each other grow.

Technology

Next.js, NestJS, Python, PyTorch, AWS/GCP/Azure, Kubernetes

Interview Process

We’re a small team that works hard to create outsized impact, and we expect whoever we hire into this role to have a high degree of agency and maniacal urgency.

Our interview process is grounded in reality – it’s hard to get a sense of how well we’d work together from a traditional interview question or take-home test.

Here is our process:

[1] 30 mins, intro call to understand goals, evaluate mutual fit, and set up next steps.

[2] 3 hrs, technical assessment and interview loop. You can solve it live (pair programming) or offline – you choose. We understand everyone is different, and solving a real-world problem in an environment and at a pace similar to what you'd face in the role gives you the best chance to succeed.

[3] 4 hrs, in-person on-site interview. We’ll fly you out to SF, where we’ll work on a problem together.

From there, you’ll get an offer or decision in <24 hrs.

We want to set expectations – we work hard, work fast, and do a lot with very little. You'd be joining an outlier team at the ground floor, and a culture of obsession is what we care about most.

Other jobs at sync.

Product design • Full-time • San Francisco, CA, US • $120K - $180K • 0.30% - 1.00% • 6+ years

Full stack • Full-time • San Francisco, CA, US / Remote • $130K - $200K • 0.30% - 1.00% • 6+ years

Machine learning • Full-time • Hyderabad, TS, IN / Bengaluru, KA, IN / Remote (Hyderabad, Bengaluru, Mumbai) • ₹3M - ₹10M INR • 0.15% - 0.75% • 3+ years

Machine learning • Full-time • San Francisco, CA, US • $140K - $215K • 0.30% - 1.50% • 6+ years
