Senior Data Engineer

Orysys Limited • Colombo • Full-time

Job Description

We are seeking a highly skilled and experienced Senior Data Engineer to lead the architecture, design, and deployment of our enterprise data infrastructure. In this role, you will take ownership of complex data pipelines, drive architectural decisions, and ensure our data ecosystems are scalable, secure, and production-ready. You will bridge the gap between data engineering, backend services, and machine learning operations (MLOps).

Key Responsibilities:

  • Pipeline Architecture: Architect and build highly scalable, fault-tolerant ETL/ELT pipelines from scratch and oversee their journey to production.
  • Big Data Ecosystems: Design and manage open-source big data environments using Hadoop, Apache Spark, Kafka, and NiFi for high-throughput, real-time, and batch data processing.
  • Backend & API Integration: Lead backend development for data serving using Python FastAPI and Spring Boot. Secure APIs utilizing OAuth2 protocols.
  • Orchestration & Transformation: Utilize Apache Airflow for complex workflow orchestration and dbt for robust data transformation.
  • DevOps & Infrastructure: Deploy and manage containerized applications using Docker and Kubernetes. Leverage deep knowledge of Linux commands and network configurations for seamless deployments.
  • MLOps & Advanced Data: Design infrastructures that support MLOps practices for model deployment and monitoring. Handle complex time-series data storage and analysis.
  • Mentorship: Provide technical guidance, code reviews, and mentorship to junior and mid-level data engineers.

Requirements and Qualifications:

  • Experience: Minimum of 5 years of hands-on experience in Data Engineering.
  • Ability to lead a small team and guide junior engineers.
  • Adherence to best practices for production-ready setups.
  • Effective use of code design patterns.
  • Multithreading knowledge.
  • Expertise in Git, including endpoint version handling.
  • Extensive hands-on experience is expected.
  • Exposure to Google Cloud Platform services and architecture (or those of any other cloud vendor).
  • Programming: Strong proficiency in Python. Solid experience with backend frameworks like FastAPI and Spring Boot.
  • Big Data & Streaming: Extensive experience with Hadoop environments, Apache Spark, Kafka, and NiFi.
  • ETL/Orchestration: Deep expertise in ETL pipeline creation, productionizing data workflows, Airflow, and dbt.
  • Databases: Advanced SQL skills with deep knowledge of DB2 SQL and PostgreSQL. Capable of handling massive time-series datasets.
  • Infrastructure & Security: Proficiency with Linux CLI, Docker, Kubernetes, network configuration, and OAuth2 implementation.
  • MLOps: Proven knowledge and hands-on experience with MLOps principles and productionizing machine learning