Israel Office | Full-time | Intermediate
Glassbox is looking for an ML Engineer to join our Global R&D team
We are Glassbox, a world leader in digital experience analytics, on a mission to deliver frictionless digital journeys to brands and their customers all over the world.
We are a hyper-growth scale-up that recently acquired SessionCam, a strong player in the CX field, and IPO’d in June!
So, now is the perfect time to come to Glassbox and help us accelerate our global leadership position!
Will you join us on this journey?
What You Will Do
We are growing fast, and we are looking for an experienced Data Engineer who will play a crucial role in enabling our continued growth.
As a Data Engineer in our ML team, you will handle exceptional volumes of customer data (hundreds of billions of monthly events!) and help us apply cutting-edge Machine Learning technologies at a large scale for some of the largest organizations in the world.
You will work closely with several R&D teams within Glassbox (Product, Software, and QA) and play a pivotal role in defining our next-generation data pipeline and ML platform infrastructure.
- Utilize a combination of cloud-based and open source frameworks to solve our most complex data problems
- Define, design, and develop multiple data pipelines and ETL processes (normalization, aggregation, transformation, data movement, etc.) implemented in Scala & Python, leveraging the Hadoop & Spark frameworks
- Work closely with ML Researchers to design and develop algorithms and models that will help provide our customers with actionable insights about their digital channels
- Maintain and improve our ML infrastructure
- Develop processes and tools to monitor, analyze, maintain and improve data pipeline operation, performance, and usability
Requirements
- Expert skills in Python and PySpark -- Must. Scala is a plus
- Proven experience in building, optimizing, and maintaining big data pipelines using popular open-source frameworks (Kafka, Spark, Hive, Presto, Airflow, etc.)
- Deep understanding of and experience with data storage in RDBMS, NoSQL databases, data lakes, and data warehouses (such as S3, Redshift, ClickHouse, Postgres, Elasticsearch, Cassandra, etc.)
- Hands-on experience with public cloud platforms and services (e.g. AWS S3, RDS/Aurora, EMR, Redshift, Step Functions, Athena, etc.)
- Knowledge of data pipeline optimization for execution time, complexity, and compute cost
- Independent self-starter and team player with excellent communication and interpersonal skills
- Degree or equivalent experience in Computer Science or similar Engineering fields
- Prior experience with ML methodologies and frameworks
- Prior experience deploying ML models in a production environment (Docker, Kubernetes, etc.)
- Fluent written and spoken English
- Prior experience with and understanding of digital customer experience (CX) or a similar business domain -- Big advantage
- Familiarity with MLOps best practices
- Thrives in and enjoys working in startups and fast-paced environments