Job Title: Data Pipeline Engineer
Location: Remote
Employment Type: Contract
Job Summary:
We are seeking a highly skilled Data Pipeline Engineer to design, build, and optimize robust data pipelines and platforms supporting large-scale data analytics and processing workloads. The ideal candidate will have hands-on experience with Databricks, Azure SQL Server, PySpark, and modern ETL/ELT frameworks, coupled with a solid understanding of data modeling, containerization, and scalability best practices.
Responsibilities:
· Design, develop, and maintain data pipelines using Databricks (Python and PySpark) to support ingestion, transformation, and storage of large datasets.
· Implement ETL/ELT processes to extract data from various sources, ensuring data integrity, consistency, and accuracy across systems.
· Develop and optimize data models to support analytical and operational data needs.
· Build and manage Azure SQL deployments, including Azure SQL Managed Instance, Azure SQL Database, and SQL Server on Azure VMs, for data storage and access.
· Apply data engineering best practices for performance tuning, scalability, and cost optimization across the data pipeline infrastructure.
· Utilize Docker and Azure Kubernetes Service (AKS) for containerized deployment and orchestration of data pipeline components.
· Collaborate with data scientists, analysts, and business stakeholders to translate business requirements into technical solutions.
· Monitor and troubleshoot data pipeline performance, ensuring high availability, reliability, and data quality.
· Maintain thorough documentation of data architecture, data flows, and pipeline configurations.
· Drive automation initiatives to streamline deployment, testing, and data operations.
Qualifications:
· Proven experience with Databricks using Python, SQL, and PySpark.
· Strong knowledge of the Azure SQL ecosystem, including Azure SQL Managed Instance, Azure SQL Database, and SQL Server on Azure VMs.
· Expertise in data modeling, ETL/ELT design patterns, and data integration frameworks.
· Solid understanding of containerization (Docker) and Kubernetes (AKS) for data engineering workloads.
· Experience in performance optimization and scalability in cloud data environments.
· Strong SQL development skills, including query optimization and stored procedure development.
· Hands-on experience with Azure data services (Azure Data Factory, Synapse, Storage Accounts) is a plus.
· Familiarity with CI/CD pipelines for automated deployment of data solutions.