Senior Machine Learning Engineer - Data
About Us
At Colossyan, we’re building the future of workplace learning with AI video.
Top companies like P&G, Porsche, BASF, BDO, and Paramount already use Colossyan to create engaging and interactive video content faster and more cost-effectively than traditional video production. Nearly 1 million videos have been created using Colossyan, and we’ve been recognised as a G2 Leader in multiple product categories.
Here’s an overview of our standout features:
- Create text-to-speech videos hosted by one of our 150+ AI avatars
- Translate your video content to 70+ languages in just four clicks
- Bring documents to life with our document-to-video feature
- Personalize your videos by creating a custom avatar of yourself, complete with a cloned voice
- Make learning content interactive with features like branching, multiple choice quizzes, and more
To learn more about our product features, visit colossyan.com.
The role
We’re looking for a Senior ML Engineer - Data to play a key role in shaping the foundation of our AI models by curating, processing, and optimizing large-scale datasets.
In this role, you’ll work closely with research and product teams to ensure our models are trained on the highest quality data. You’ll design robust data pipelines, develop automated evaluation frameworks, and explore innovative techniques like semi-supervised learning and human-in-the-loop ML to continuously improve model performance.
This is an opportunity to make a real impact: your work will directly influence the effectiveness and accuracy of our AI-driven products.
Key Responsibilities:
- Design and develop scalable data pipelines, including sourcing, scraping, filtering, post-processing, de-duplicating, and versioning of data for AI model training.
- Build frameworks for data evaluation and quality assessment, ensuring that our models are trained on high-quality, reliable data.
- Develop automated evaluation pipelines to benchmark new models before deployment in our production API.
- Collaborate with research and product teams to incorporate their data needs and optimize pipelines for various tasks.
- Conduct open-ended research on data quality improvements, including semi-supervised learning, human-in-the-loop ML, and fine-tuning with human feedback.
What we’re looking for:
- 5+ years of experience as a Data Engineer, ML Engineer, or Data Scientist handling large-scale data.
- Strong belief in high-quality data and the impact of data curation on model performance.
- Experience with end-to-end ML training pipelines.
- Expertise in large-scale distributed systems.
- Strong programming skills in Python and experience with PyTorch.
- (Preferred) Experience working with visual media and computer vision algorithms.
At Colossyan, we believe that diversity drives innovation and inclusion fosters a sense of belonging. We are committed to creating a workplace where everyone feels valued, respected, and empowered to bring their authentic selves to work.
We actively seek to build a diverse team and encourage candidates of all backgrounds and beliefs to apply to our open positions.
We strongly encourage individuals from underrepresented and/or marginalised identities to apply. If you need any accommodations for your interview, please email recruitment@colossyan.com.
- Department: Research
- Locations: Budapest
- Remote status: Fully Remote
About Colossyan
Here at Colossyan, we are building the future of learning with AI video. 🚀
Our easy-to-use AI video platform is reshaping the landscape of L&D content creation. Join top companies like P&G, Porsche, BASF, BDO, and Paramount and say goodbye to expensive filming, scheduling delays, and low engagement. Colossyan enables you to create training videos using AI at a fraction of the cost of traditional production, with higher effectiveness than text-only material. 🎬
📝 Create videos from text
Create effective videos from text, PDFs, professionally designed templates, or using an AI-powered Prompt-to-Video tool. Harness the power of our advanced text-to-speech technology, complemented by engaging ready-to-use templates, localization tools, and a simple and intuitive video editor.
🗣️ Pick your perfect AI presenter
We offer an extensive and diverse library of 50+ best-quality AI avatars, making it easier than ever to personalize your videos with hyper-realistic presenters. Leverage Colossyan’s unique Conversations feature to practice scenario-based learning with multiple avatars in one scene, or create an AI presenter of yourself with our Custom AI Avatar add-on.
🌍 Localize in four clicks
Produce videos in 70+ languages and accents, and easily translate your Colossyan videos in just four clicks using our auto translation feature.
🎁 Try Colossyan for free
Experience the Colossyan difference with our risk-free 14-day trial. Unlock your team's potential with AI-driven video learning.
Our dedication to exceptional support, endorsed by G2 with Best Support and Best Relationship rankings, ensures a seamless and rewarding journey. 🏆
Ready to transform your team’s workplace learning?
Try Colossyan today! 🚀