We provide a simple API for creating, storing, versioning, and collaborating on multi-modal AI datasets of any size. With Activeloop's open-core stack, you can rapidly transform and stream data while training models at scale. Deep Lake powers foundational model training by acting as a vector database with significant benefits, such as (1) the ability to use multi-modal datasets to fine-tune your own LLMs, (2) storage of both the embeddings and the original data with automatic version control, so no embedding re-computation is needed, and (3) a truly serverless service with no vendor lock-in. How cool is that?
GitHub loves us - we're one of the fastest-growing libraries there, and we're used by little-known companies like Google, Waymo, and Intel. No big deal.
Our founding team hails from places like Princeton, Stanford, Google, and Tesla, and we're backed by Y Combinator & other Silicon Valley heavyweights.
Activeloop is hiring, and we want you! Check out our open roles on our YC page and join the fun.
10-min demo: https://activeloop.wistia.com/medias/aibvo0dst2
Whitepaper: https://www.deeplake.ai/whitepaper
At Activeloop, we are transforming the way organizations harness their data for AI with Deep Lake and Multi-modal AI Search. Whether you're answering critical clinical questions or searching across vast repositories of scientific papers, we empower you to index, search, and organize billions of documents, images, and videos intuitively, using natural language powered by Large Language Models. Join us in making data more accessible and actionable than ever before.
About the Role
We are looking for a Python ML Engineer with a strong foundation in machine learning, large-scale data systems, and deep learning. The ideal candidate will have expertise in developing and optimizing ML pipelines, implementing efficient indexing techniques, and integrating state-of-the-art retrieval and organization methods. You will collaborate with software engineers, customers, and business stakeholders to develop ML solutions that deliver significant value to the organization and our clients.
Machine Learning Pipeline Development: Design, implement, and optimize robust machine learning pipelines for large-scale datasets.
Algorithm Optimization: Develop and refine algorithms for semantic understanding, retrieval performance, and relevance ranking.
Data Integration: Work on integrating data storage and retrieval solutions within Deep Lake to support efficient data access for ML models.
Query Understanding and Processing: Build advanced pipelines for query processing, including contextual interpretation and intent recognition, to improve data interaction.
Model Development: Fine-tune and deploy machine learning models tailored to data organization and retrieval tasks.
Performance Evaluation: Establish metrics and testing frameworks to continuously evaluate and improve system performance.
Scalability and Efficiency: Optimize ML systems for high throughput, low latency, and large-scale dataset handling.
Ways to Stand Out from the Crowd
Why Join Activeloop?
Activeloop Deep Lake is at the forefront of the transition from traditional software to AI, accelerating AI deployment across various industries. Our products power advanced LLMs, generative models, and computer vision models. Trusted by industry leaders, we are expanding our team to further advance AI applications. We pride ourselves on being an inclusive, equal opportunity workplace, committed to diversity and accessibility for all applicants.
We are building Deep Lake, the Data Lake for Deep Learning: https://github.com/activeloopai/deeplake
The landscape of computation resources across different special hardware and cloud providers is becoming increasingly fragmented.
We're building a platform that unifies and abstracts away infrastructure for easier and highly efficient machine learning and deep learning.
Full-time | Mountain View, CA, US / Remote (CA, US) | Full stack | $120K - $200K | 6+ years
Full-time | Mountain View, CA, US / Remote (CA, US) | Machine learning | $120K - $200K | 6+ years
Full-time | Mountain View / Remote (CA, US) | Backend | $120K - $200K | 6+ years