What You’ll Do:
- Accelerate the ML Lifecycle: Design and scale MLOps infrastructure and agentic AI workflows to speed up end-to-end model development, validation, and CI/CD deployment pipelines.
- Bridge Engineering & Data Science: Collaborate with data scientists to optimize model architectures, enforce clean code standards, and turn experimental models into production-ready, scalable software.
- Drive Application Integration: Build the robust APIs, microservices, and backend business logic that connect machine learning models with downstream applications and serve predictions efficiently.
- Build Data & Feature Pipelines: Construct and maintain scalable data pipelines and feature stores to ensure reliable, high-throughput data ingestion for both training and real-time inference.
- Own Production Health & Observability: Establish robust monitoring, anomaly detection, and incident response systems to track data quality, model drift, and A/B testing performance.
- Optimize System Performance: Troubleshoot production bottlenecks and optimize models for latency, throughput, and compute efficiency (e.g., GPU/CPU utilization and API cost management).
What You’ll Need:
- Have experience in software engineering, machine learning engineering, data science, or a related field; 2+ years of professional experience is required for Senior-level candidates.
- Demonstrate a solid understanding of core ML concepts, including supervised/unsupervised learning, optimization, and evaluation metrics, along with hands-on experience in frameworks such as PyTorch, TensorFlow, or scikit-learn.
- Have practical experience developing agentic AI workflows or LLM applications, including autonomous agents or orchestration frameworks like LangChain, AutoGen, CrewAI, or custom tool-calling/RAG architectures.
- Show strong proficiency with Python or Go in production environments; exposure to distributed computing frameworks like Apache Spark is a strong plus.
- Understand systems architecture and distributed systems deeply, with the ability to design, maintain, and scale complex ML pipelines and application logic.
- Have experience with container orchestration tools like Kubernetes or workflow orchestration tools like Apache Airflow; this is highly desirable for managing scalable ML workloads.
- Bring research to reality, with a proven ability to read, dissect, and practically implement advanced algorithms and methodologies from published AI/ML research papers.
- Apply a pragmatic, high-velocity engineering approach guided by a long-term vision. Use a systematic "hacker" mindset to build clever, resourceful solutions under tight constraints, balancing rapid delivery with sustainable, scalable practices that minimize technical debt.
- Communicate fluently in both English and Thai, spoken and written.