at sync. we're making video as fluid and editable as a word document.
how much time would you save if you could record every video in a single take?
no more re-recording yourself because you didn't like what you said, or how you said it.
just shoot once, revise yourself to do exactly what you want, and post. that's all.
this is the future of video: AI modified >> AI generated
we're playing at the edge of science + fiction.
our team is young, hungry, uniquely experienced, and advised by some of the greatest research minds + startup operators in the world. we're driven to solve impossible problems, impossibly fast.
our founders are the original team behind the open-source wav2lip — the most prolific lip-sync model to date, w/ 9k+ GitHub stars.
we’re at the stage in computer vision today that NLP was at two years ago — a bunch of disparate, specialized models (e.g. sentiment classification, translation, summarization), until generalized LLMs displaced them.
we’re taking the same approach – curating high-quality datasets + training a series of specialized models to accomplish specific tasks, while building up towards a more generalized approach: one model to rule them all.
post-batch our growth is e^x – we need help asap to scale up our infra, training, and product velocity.
we look for the following: [1] raw intelligence [2] boundless curiosity [3] exceptional resolve [4] high agency [5] outlier hustle
About sync.
We’re a team of artists, engineers, and researchers building tools to understand and modify people in video — in the last year we graduated at the top of our YC batch (W24), raised a $5.5M seed backed by GV, won the AI grant from Nat Friedman and Daniel Gross, and scaled to $2M+ ARR from $0.
We’re building a zero-shot generalized video model to understand, generate, and gain fine-grained control over any human in any video. We’ve already released a state-of-the-art generalized lip-syncing model for content translation and word-level video editing — you can access our models through our developer playground and API.
We had a research breakthrough — we learned a highly accurate generalized representation of the human face. tl;dr: this unlocks many editing tasks that were never possible before.
As we build out these model capabilities, we’re expanding our team to bring on a monster product designer who understands how developers think – and who can turn their ideas into code that delights users.
What are we working on?
We live in an extraordinary time.
Video generation is world modeling. Deep learning unlocked the ability to decipher the world around us, to understand it, to compress its information into the weights of a 70B-parameter network.
By simply changing these underlying numbers – these latent representations – we can reimagine and reconstruct reality in any way we see fit.
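To make the "changing the underlying numbers" idea concrete, here is a toy sketch. The encoder and decoder below are just random linear maps standing in for a real trained model (none of these names come from sync.'s actual codebase) — the point is only the loop: encode a frame, nudge a few latent coordinates, decode, and get a targeted global edit instead of regenerating from scratch.

```python
import numpy as np

# Toy stand-ins for a trained encoder/decoder pair: random linear maps,
# purely to illustrate the edit-in-latent-space loop. A real model would
# be a deep network with learned weights.
rng = np.random.default_rng(0)
D, Z = 64, 8                          # "frame" dim, latent dim
W_enc = rng.normal(size=(Z, D)) / np.sqrt(D)
W_dec = rng.normal(size=(D, Z)) / np.sqrt(Z)

def encode(x):
    """Frame -> latent representation."""
    return W_enc @ x

def decode(z):
    """Latent representation -> reconstructed frame."""
    return W_dec @ z

frame = rng.normal(size=D)            # stand-in for one video frame
z = encode(frame)

# The edit: change one latent coordinate instead of touching D pixels.
z_edit = z.copy()
z_edit[0] += 0.5

edited_frame = decode(z_edit)
# A low-dimensional, intentional change in latent space produces a
# coherent change across the whole reconstructed frame.
```

The edit lives in an 8-dimensional space rather than the 64-dimensional pixel space — that asymmetry is what makes latent editing tractable for precise, fine-grained control.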
This is profound. A high schooler can craft a masterpiece with an iPhone. A studio can produce a movie at a tenth of the cost 10x faster. Every video can be distributed worldwide in any language with perfect preservation of meaning, instantly. Video becomes as malleable as text.
But we have two fundamental problems to tackle before this is a reality:
[1] Large models are great at generating entirely new scenes and worlds, but struggle with precise control and fine-grained edits. The ability to make subtle, intentional adjustments – the kind that separates good content from great content – doesn’t exist.
[2] If video generation is world modeling, each human is a world unto themselves. We each have our idiosyncrasies that make us unique — creating primitives to capture, express, and modify them is the key to breaking through the uncanny valley.
While our focus in research is to push the boundary on what’s possible through new models and pipelines, our focus in product is to design and ship intuitive experiences that simply delight users and provide maximal utility.
Looking for someone who:
Our goal is to keep the team lean, hungry, and shipping fast.
These are the qualities we embody and look for:
[1] Raw intelligence: We tackle complex problems and push the boundaries of what's possible.
[2] Boundless curiosity: We're always learning, exploring new technologies, and questioning assumptions.
[3] Exceptional resolve: We persevere through challenges and never lose sight of our goals.
[4] High agency: We take ownership of our work and drive initiatives forward autonomously.
[5] Outlier hustle: We work smart and hard, going above and beyond to achieve extraordinary results.
[6] Obsessively data-driven: We base our decisions on solid data and measurable outcomes.
[7] Radical candor: We communicate openly and honestly, providing direct feedback to help each other grow.
Tech stack: Next.js, Nest.js, Python, PyTorch, AWS/GCP/Azure, Kubernetes
We’re a small team that works hard to create outsized impact.
We expect whoever we hire into this role to have a high degree of agency and maniacal urgency.
Our interview process is grounded in reality — it's hard to get a sense of how well we'd work together from a traditional interview question or take-home test.
Here is our process:
[1] 30 mins, intro call to understand goals, evaluate mutual fit, and set up next steps.
[2] 3 hrs, design assessment and interview loop. You can solve it p2p live or offline — you choose. Everyone works differently, and solving a real-world problem in an environment and at a pace similar to the actual role gives you the best chance to succeed.
[3] 4 hrs, IRL on-site interview. we’ll fly you out to SF where we’ll work on a problem together.
From there, you’ll get an offer or decision in <24 hrs.
we want to set expectations – we work hard, work fast, and do a lot with very little. You'd be joining an outlier team at the ground floor, and a culture of obsession is what we care about most.
Open roles:
[1] Product design | full-time | San Francisco, CA, US | $120K - $180K | 0.30% - 1.00% equity | 6+ years
[2] Full stack | full-time | San Francisco, CA, US / Remote | $130K - $200K | 0.30% - 1.00% equity | 6+ years
[3] Machine learning | full-time | Hyderabad, TS, IN / Bengaluru, KA, IN / Remote (Hyderabad, TS, IN; Bengaluru, KA, IN; Mumbai, MH, IN) | ₹3M - ₹10M INR | 0.15% - 0.75% equity | 3+ years
[4] Machine learning | full-time | San Francisco, CA, US | $140K - $215K | 0.30% - 1.50% equity | 6+ years