Role: Hadoop Admin
Location: Austin, TX (Onsite)
FTE ONLY
Must Have Technical/Functional Skills
• Performance tuning of Hadoop clusters and Hadoop workloads.
• Monitor Hadoop cluster job performance and perform capacity planning at the application/queue level.
• Monitor Hadoop cluster connectivity and security.
• Manage and review Hadoop log files.
• File system management and monitoring.
• Implementation and ongoing administration of Hadoop infrastructure.
Roles & Responsibilities
• Responsible for implementation and ongoing administration of Hadoop infrastructure.
• Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
• Working with data delivery teams to set up new Hadoop users/applications. This includes onboarding activities such as setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, Spark and MapReduce access for the new users/applications.
• Cluster maintenance as well as creation and removal of nodes using tools like Ambari and other home-grown tools.
• Performance tuning of Hadoop clusters and Hadoop workloads.
• Monitor Hadoop cluster job performance and perform capacity planning at the application/queue level.
• Monitor Hadoop cluster connectivity and security.
• Manage and review Hadoop log files.
• File system management and monitoring.
• HDFS support and maintenance.
• Working closely with the infrastructure, network, database, application and business intelligence teams to ensure high data quality and availability.
• Collaborating with application teams to install operating system and Hadoop updates, patches, and version upgrades when required.
• Optimize analytical jobs and queries against data in the HDFS/Hive environments.
• Develop Bash or Python scripts and use Linux utilities and commands to ease day-to-day operations.
• Maintain central dashboards for all system, data, utilization and availability metrics.