Founding Software Engineer at Reducto (W24)
$120K - $200K  •  0.20% - 1.00%
Unlocking data behind complex documents
San Francisco, CA, US
Full-time
1+ years
About Reducto

Nearly 80% of enterprise data is in unstructured formats like PDFs

PDFs are the status quo for enterprise knowledge in nearly every industry. Insurance claims, financial statements, invoices, and health records are all stored in a structure that’s simply impractical for use in digital workflows. This isn’t an inconvenience—it’s a critical bottleneck that leads to dozens of wasted hours every week.

Traditional approaches fail at reliably extracting information in complex PDFs

OCR and even more sophisticated ML approaches work for simple text documents but are unreliable for anything more complex. Text from different columns is jumbled together, figures are ignored, and tables are a nightmare to get right. Overcoming this usually requires a large engineering effort dedicated to building specialized pipelines for every document type you work with.

Reducto breaks document layouts into subsections and then contextually parses each depending on the type of content. This is made possible by a combination of vision models, LLMs, and a suite of heuristics we built over time. Put simply, we can help you:

  • Accurately extract text and tables even with nonstandard layouts
  • Automatically convert graphs to tabular data and summarize images in documents
  • Extract important fields from complex forms with simple, natural language instructions
  • Build powerful retrieval pipelines using Reducto’s document metadata
  • Intelligently chunk information using the document’s layout data
About the role
Skills: Torch/PyTorch, Python, TypeScript, Computer Vision

The vast majority of enterprise data is in files like PDFs and spreadsheets. That includes everything from financial statements to medical records. Reducto helps AI teams turn those really complex documents into LLM-ready inputs with exceptional accuracy.

Hundreds of companies have signed up to use Reducto since our launch, and we’re now processing tens of millions of pages every month for teams ranging from startups to Fortune 10 enterprises. We’re hiring founding software engineers to help us continue to serve our customers as we build the ingestion layer that connects human data with LLMs.

As a founding engineer, you’ll work on our core API and on-prem deployments. That means you’ll have a hand in everything that our customers need.

The core work will include:

  • Making improvements to API design and pre-processing algorithms (chunking, structured extraction, etc.) based on customer feedback.
  • Training and fine-tuning vision models for tasks like extraction, segmentation, and detection.
  • Building internal tooling and evals to better understand and analyze failure cases.
  • Experimenting with new techniques and output structures to improve LLM accuracy.
  • Working directly with the founders to shape the product direction and engineering strategy

We would love to meet you if you: 

  • Are an autonomous and resourceful engineer with 2-5 years of experience building real-world applications, with a very high bar for quality.
  • Have a solid fundamental understanding of Python and algorithms.
  • Can rapidly go from 0 to 1 in building apps in TypeScript/Next.js.
  • Are excited about working directly with customers to design and build features

Bonus points if you:

  • Have prior experience founding a company or building products at early stages
  • Are ambitious and driven, and care a lot about doing great work with great people
  • Keep up with the latest developments in ML/AI

This is an in-person role at our office in SF.
