Role: Data Engineer
Location: New York, NY (Onsite)
Experience: 10 to 15 years
Full-time, Permanent (FTE)
Job Description
Must Have Technical/Functional Skills
· Hands-on experience in building ETL using Databricks SaaS infrastructure.
· Experience in developing data pipeline solutions to ingest and exploit new and existing data sources.
· Expertise in SQL, programming languages such as Python, and ETL tools such as Databricks.
· Perform code reviews to verify that requirements are met, execution patterns are optimal, and established standards are followed.
· Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), and AWS Data Integration (Glue).
· Advanced understanding of containers and container orchestration, including Docker and Kubernetes, and a variety of AWS tools and services.
· Good understanding of AWS Identity and Access Management (IAM), AWS networking, and AWS monitoring tools.
· Proficiency in CI/CD and deployment automation using GitLab pipelines.
· Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
· Proficiency in one or more programming languages, e.g., Python, Scala.
· Experience with Starburst and Trino, and in building SQL queries in a federated architecture.
· Good knowledge of Lakehouse architecture.
· Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala).
· Build data ingestion workflows from various sources (structured, semi-structured,
· Develop reusable components and frameworks for efficient data processing.
· Implement best practices for data quality, validation, and governance.
· Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
· Tune Spark jobs for performance and scalability in a cloud-based environment.
· Maintain robust data lake or Lakehouse architecture.
· Ensure high availability, security, and integrity of data pipelines and platforms.
· Support troubleshooting, debugging, and performance optimization in production workloads.
Roles & Responsibilities
· Migrate applications from on-premises environments to cloud service providers.
· Develop products and services on the latest technologies by contributing to development, enhancement, testing, and implementation.
· Develop, modify, and extend code for building cloud infrastructure, and automate it using CI/CD pipelines.
· Partner with business stakeholders and peers to pursue solutions that achieve business goals through an agile software development methodology.
· Perform problem analysis, data analysis, reporting, and communication.
· Work with peers across the system to define and implement best practices and standards.
· Assess applications and help determine the appropriate application infrastructure patterns.
· Use best practices and knowledge of internal and external drivers to improve products and services.