The candidate should have 8+ years of experience in Data Engineering
Designing, creating, testing, and maintaining the complete data management & processing pipeline
Working closely with stakeholders & the solution architect
Ensuring the architecture meets the business requirements
Building highly scalable, robust & fault-tolerant systems
Owning the complete ETL process
Knowledge of the Hadoop ecosystem and the frameworks within it: HDFS, YARN,
MapReduce, Apache Pig, Hive, Flume, Sqoop, ZooKeeper, Oozie, Impala, and Kafka
Must have knowledge of and working experience with a real-time processing framework
(Apache Spark), PySpark, and AWS Redshift
Must have experience with SQL-based technologies (e.g., MySQL, Oracle DB) and NoSQL
technologies (e.g., Cassandra, MongoDB)
Should have programming skills in Python, Scala, or Java