What you’ll Do:
- Design, build, optimize, launch, and support new and existing data models and ETL processes in production
- Monitor ETL jobs to ensure pipelines run on schedule and complete successfully
- Work with BI analysts and data scientists to transform data into actionable insights, features, and models
- Design and implement dimensional data models and data warehouses
- Ensure data quality and track data lineage
- Participate in performance tuning to continually improve the quality of the data platform
- Develop services supporting ML model deployment
- Oversee data governance across the organization
What you’ll Need:
- Bachelor’s degree or equivalent experience in Computer Science or related field
- 1+ (Junior)/3+ (Senior) years of experience in custom ETL design and implementation
- Experience in building real-world data pipelines
- Strong SQL and Python skills
- Comfortable with Git version control
- A growth and can-do mindset, with a willingness to learn new things and share knowledge with others
- Good understanding of Hadoop, Hive, Spark, and Presto
- Good understanding of databases such as MySQL, PostgreSQL, and MongoDB
- Familiarity with Apache Druid is a plus