Role: Hadoop Administrator
Provide administrative support for Think Big customers on Hadoop platforms. These customers often have 24/7 contracts, so the successful applicant must be prepared to work in shifts and be on call to support customer sites per contractual obligations.
The Hadoop Administrator manages and controls the Hadoop System environment for Teradata customers. The Hadoop Administrator requires specific technical knowledge about the administration and control of the Hadoop System, including the associated operating system, related tools, network, and hardware.
* Minimum of 1 year of experience managing and supporting large-scale production Hadoop environments on any of the Hadoop distributions (Apache, Teradata, Hortonworks, Cloudera, MapR, IBM BigInsights, Pivotal HD)
* 3-8 years of experience with scripting languages (Linux shell, SQL, Python); must be proficient in shell scripting
* 3-8 years of experience in administrative activities such as:
  * Management of data, users, and job execution on the Hadoop System
  * Periodic backups of the system
  * Planning for and supporting hardware and software installation and upgrades
* 2+ years of experience supporting Hadoop production systems (configuration management, monitoring, and performance tuning).
* 1+ years of experience with Hadoop monitoring tools (Nagios, Ganglia, Cloudera Manager, Ambari, etc.).
* Product knowledge of Hadoop distributions such as Cloudera, Hortonworks, Pivotal Greenplum, or MapR.
* Administration, maintenance, control, and optimization of Hadoop capacity, security, configuration, process scheduling, and errors.
Nice-to-have experience:
* Building and supporting any of the Hadoop distributions, including (but not limited to) design, configuration, installation and upgrade, monitoring, and performance tuning
* Experience with high availability, backup, archive, and restore (BAR), and disaster recovery (DR) strategies and principles.
* Hadoop software installation and upgrades
* Experience with ANY ONE of the following:
* Proficiency in Hive internals (including HCatalog), Sqoop, Pig, Oozie, and Flume/Kafka.
* Development or administration on NoSQL technologies like HBase, MongoDB, Cassandra, Accumulo, etc.
* Development or administration on Web or cloud platforms like Amazon S3, EC2, Redshift, Rackspace, OpenShift, etc.
* Development/scripting experience with configuration management and provisioning tools (e.g., Puppet, Chef)
* Web/Application Server & SOA administration (Tomcat, JBoss, etc.)
* Development, implementation, or deployment experience on the Hadoop ecosystem (HDFS, MapReduce, Hive, HBase)
* Analysis and optimization of workloads, performance monitoring and tuning, and automation.
* Addressing challenges of query execution across a distributed database platform on modern hardware architectures
* Articulating and discussing the principles of performance tuning, workload management and/or capacity planning
* Defining standards and developing and implementing best practices to manage and support data platforms
Experience on any one of the following will be an added advantage:
* Hadoop integration with large-scale distributed data platforms like Teradata, Teradata Aster, Vertica, Greenplum, Netezza, DB2, Oracle, etc.
* Proficiency with at least one of the following: Java, Python, Perl, Ruby, C or Web-related development
* Knowledge of Business Intelligence and/or Data Integration (ETL) operations delivery techniques, processes, and methodologies
* Exposure to data acquisition, transformation, and integration tools like Talend, Informatica, etc., and BI tools like Tableau, Pentaho, etc.
Global Delivery Center (GDC)