8+ Cloudera Jobs in India
Experience: 12-15 Years
Key Responsibilities:
- Client Engagement & Requirements Gathering: Independently engage with client stakeholders to understand data landscapes and requirements, translating them into functional and technical specifications.
- Data Architecture & Solution Design: Architect and implement Hadoop-based Cloudera CDP solutions, including data integration, data warehousing, and data lakes.
- Data Processes & Governance: Develop data ingestion and ETL/ELT frameworks (see the PySpark sketch after this list), ensuring robust data governance and quality practices.
- Performance Optimization: Provide SQL expertise and optimize Hadoop ecosystems (HDFS, Ozone, Kudu, Spark Streaming, etc.) for maximum performance.
- Coding & Development: Hands-on coding in relevant technologies and frameworks, ensuring project deliverables meet stringent quality and performance standards.
- API & Database Management: Integrate APIs and manage databases (e.g., PostgreSQL, Oracle) to support seamless data flows.
- Leadership & Mentoring: Guide and mentor a team of data engineers and analysts, fostering collaboration and technical excellence.
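A minimal sketch of the kind of ingestion/ETL step described above, assuming Spark with Hive support on a CDP cluster; the job name, paths, and table names are hypothetical placeholders:

```python
# Minimal batch-ingestion sketch for a Cloudera CDP cluster.
# Assumes Spark with Hive support; all paths and table names are
# hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("customer-ingest")          # hypothetical job name
    .enableHiveSupport()                 # requires a Hive metastore
    .getOrCreate()
)

# Land raw CSV files from HDFS, apply light cleansing, and publish a
# partitioned Hive table for downstream consumers.
raw = (
    spark.read
    .option("header", "true")
    .csv("hdfs:///data/raw/customers/")  # hypothetical landing zone
)

cleaned = (
    raw.dropDuplicates(["customer_id"])
       .withColumn("load_date", F.current_date())
)

(
    cleaned.write
    .mode("overwrite")
    .partitionBy("load_date")
    .saveAsTable("analytics.customers")  # hypothetical Hive table
)
```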
Skills Required:
a. Technical Proficiency:
• Extensive experience with Hadoop ecosystem tools and services (HDFS, YARN, Cloudera Manager, Impala, Kudu, Hive, Spark Streaming, etc.).
• Proficiency in programming languages such as Python and Scala with Spark, and a strong grasp of SQL performance tuning (see the tuning sketch after this list).
• ETL tool expertise (e.g., Informatica, Talend, Apache NiFi) and data modelling knowledge.
• API integration skills for effective data flow management.
b. Project Management & Communication:
• Proven ability to lead large-scale data projects and manage project timelines.
• Excellent communication, presentation, and critical thinking skills.
c. Client & Team Leadership:
• Engage effectively with clients and partners, leading onsite and offshore teams.
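One way to demonstrate the SQL performance-tuning skill above is join-strategy control in Spark SQL. The sketch below uses tiny stand-in tables (hypothetical names) so it is self-contained; on a cluster these would be large Hive tables:

```python
# Illustrative Spark SQL tuning: broadcasting a small dimension table
# replaces a shuffle-heavy sort-merge join with a broadcast hash join.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join-tuning").getOrCreate()

# Tiny stand-in tables so the example runs anywhere; the names are
# hypothetical.
spark.createDataFrame(
    [(1, 10), (2, 20)], ["order_id", "region_id"]
).createOrReplaceTempView("sales_fact")
spark.createDataFrame(
    [(10, "APAC"), (20, "EMEA")], ["region_id", "region_name"]
).createOrReplaceTempView("region_dim")

# The BROADCAST hint ships the small table to every executor instead of
# shuffling both sides of the join.
result = spark.sql("""
    SELECT /*+ BROADCAST(d) */ f.order_id, d.region_name
    FROM sales_fact f
    JOIN region_dim d ON f.region_id = d.region_id
""")
result.explain()  # the plan should show BroadcastHashJoin
```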
Key Responsibilities:
• Install, configure, and maintain Hadoop clusters.
• Monitor cluster performance and ensure high availability.
• Manage Hadoop ecosystem components (HDFS, YARN, Ozone, Spark, Kudu, Hive).
• Perform routine cluster maintenance and troubleshooting.
• Implement and manage security and data governance.
• Monitor system health and optimize performance (see the health-check sketch after this list).
• Collaborate with cross-functional teams to support big data applications.
• Perform Linux administration tasks and manage system configurations.
• Ensure data integrity and backup procedures.
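As a concrete example of the monitoring duties above, here is a small health-check sketch that shells out to the standard `hdfs dfsadmin -report` command and flags dead DataNodes. It assumes the script runs on a node with a configured HDFS client, and the exact report wording may vary by Hadoop version:

```python
# Routine cluster health check an admin might script: runs the real
# `hdfs dfsadmin -report` CLI and counts dead DataNodes.
import subprocess

def hdfs_report() -> str:
    """Return the output of `hdfs dfsadmin -report` (needs HDFS privileges)."""
    return subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    ).stdout

def dead_datanodes(report: str) -> int:
    # Recent Hadoop versions print a line like "Dead datanodes (N):";
    # adjust the parsing if your version formats the report differently.
    for line in report.splitlines():
        if line.startswith("Dead datanodes"):
            return int(line.split("(")[1].split(")")[0])
    return 0

if __name__ == "__main__":
    print("dead datanodes:", dead_datanodes(hdfs_report()))
```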
Responsibilities:
- Provide support services to our Gold & Enterprise customers using our flagship product suites. This may include assistance during the engineering and operation of distributed systems, as well as responses for mission-critical systems and production customers.
- Lead end-to-end delivery and customer success of next-generation features related to scalability, reliability, robustness, usability, security, and performance of the product
- Lead and mentor others on concurrency and parallelization to deliver scalability, performance, and resource optimization in a multithreaded and distributed environment
- Demonstrate the ability to actively listen to customers and show empathy for the business impact when they experience issues with our products
Required Skills:
- 10+ years of experience with a highly scalable, distributed, multi-node environment (100+ nodes)
- Hadoop operations, including ZooKeeper, HDFS, YARN, Hive, and related components such as the Hive metastore, Cloudera Manager/Ambari, etc.
- Authentication and security configuration and tuning (Knox, LDAP, Kerberos, SSL/TLS; second priority: SSO/OAuth/OIDC, Ranger/Sentry)
- Java troubleshooting, e.g., collection and evaluation of jstacks and heap dumps (see the collection sketch after this list)
- Linux, NFS, Windows, including application installation, scripting, basic command line
- Docker and Kubernetes configuration and troubleshooting, including Helm charts, storage options, logging, and basic kubectl CLI
- Experience working with scripting languages (Bash, PowerShell, Python)
- Working knowledge of application, server, and network security management concepts
- Familiarity with virtual machine technologies
- Knowledge of databases like MySQL and PostgreSQL
- Certification on any of the leading cloud providers (AWS, Azure, GCP) and/or Kubernetes is a big plus
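For the jstack item above, support engineers typically capture several thread dumps a short interval apart so stuck threads can be compared across snapshots. A minimal sketch, assuming a JDK (`jstack`) on the PATH; the PID and file names are illustrative:

```python
# Collect a series of thread dumps from a JVM process for comparison.
import subprocess
import time
from pathlib import Path

def collect_jstacks(pid: int, count: int = 3, interval_s: int = 10) -> list[Path]:
    """Take `count` thread dumps of the given JVM, `interval_s` apart."""
    dumps = []
    for i in range(count):
        out = subprocess.run(
            ["jstack", "-l", str(pid)],  # -l also prints held-lock info
            capture_output=True, text=True, check=True,
        ).stdout
        path = Path(f"jstack_{pid}_{i}.txt")
        path.write_text(out)
        dumps.append(path)
        if i < count - 1:
            time.sleep(interval_s)
    return dumps

# Example: collect_jstacks(12345) for a hypothetical NameNode PID.
```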
One of the leading payments banks
Requirements:
- Proficiency in shell scripting.
- Proficiency in automation of tasks.
- Proficiency in Pyspark/Python.
- Proficiency in writing and understanding Sqoop jobs (see the JDBC ingestion sketch after this list).
- Understanding of Cloudera Manager.
- Good understanding of RDBMS.
- Good understanding of Excel.
- Familiarity with Hadoop ecosystem and its components.
- Understanding of data loading tools such as Flume and Sqoop.
- Ability to write reliable, manageable, and high-performance code.
- Good knowledge of database principles, practices, structures, and theories.
- Good understanding of, or hands-on experience with, Kafka administration / Apache Kafka streaming.
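A Sqoop-style RDBMS pull can also be expressed directly in PySpark via JDBC, which exercises the same skills listed above. In this sketch the connection details are hypothetical, and the PostgreSQL JDBC driver is assumed to be on the Spark classpath (e.g., via `--jars`):

```python
# Parallel RDBMS-to-HDFS ingestion in PySpark, analogous to
# `sqoop import ... -m 8`. All connection details are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdbms-ingest").getOrCreate()

transactions = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/bank")  # hypothetical
    .option("dbtable", "public.transactions")
    .option("user", "etl_user")
    .option("password", "change-me")
    .option("numPartitions", 8)           # parallel readers, like sqoop -m 8
    .option("partitionColumn", "txn_id")  # must be a numeric/date column
    .option("lowerBound", 1)
    .option("upperBound", 1000000)
    .load()
)

transactions.write.mode("overwrite").parquet("hdfs:///data/staging/transactions")
```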
Implementing, managing, and administering the overall Hadoop infrastructure.
Taking care of the day-to-day running of Hadoop clusters.
A Hadoop administrator has to work closely with the database, network, BI, and application teams to make sure that all big data applications are highly available and performing as expected.
If working with the open-source Apache distribution, Hadoop admins have to manually set up all the configuration files: core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml. However, when working with a popular Hadoop distribution such as Hortonworks, Cloudera, or MapR, the configuration files are set up on startup and the Hadoop admin need not configure them manually.
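These four site files share a simple XML layout, so they are easy to inspect programmatically. A minimal sketch, assuming the common /etc/hadoop/conf location (managed distributions lay this out for you):

```python
# Read key properties out of the Hadoop site files. The conf directory
# path is an assumption; point it at your HADOOP_CONF_DIR.
import xml.etree.ElementTree as ET
from pathlib import Path

CONF_DIR = Path("/etc/hadoop/conf")

def load_props(site_file: str) -> dict:
    """Parse a Hadoop *-site.xml into a {name: value} dict."""
    root = ET.parse(CONF_DIR / site_file).getroot()
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }

core = load_props("core-site.xml")
hdfs = load_props("hdfs-site.xml")
print("fs.defaultFS    :", core.get("fs.defaultFS"))
print("dfs.replication :", hdfs.get("dfs.replication", "3 (default)"))
```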
The Hadoop admin is responsible for capacity planning and estimating the requirements for lowering or increasing the capacity of the Hadoop cluster.
The Hadoop admin is also responsible for deciding the size of the Hadoop cluster based on the data to be stored in HDFS.
Ensuring that the Hadoop cluster is up and running at all times.
Monitoring cluster connectivity and performance.
Managing and reviewing Hadoop log files (see the log-review sketch below).
Backup and recovery tasks.
Resource and security management.
Troubleshooting application errors and ensuring that they do not recur.
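For the log-review task, a simple first pass is to summarize ERROR/FATAL lines by the emitting class. This sketch assumes a typical log4j line layout and a default NameNode log path, both of which vary by cluster:

```python
# Summarize ERROR/FATAL entries in a Hadoop daemon log by Java class.
from collections import Counter
from pathlib import Path

LOG = Path("/var/log/hadoop-hdfs/hadoop-hdfs-namenode.log")  # assumed path

def summarize_errors(log_path: Path) -> Counter:
    """Count ERROR/FATAL log lines by the class that emitted them."""
    counts = Counter()
    with log_path.open(errors="replace") as fh:
        for line in fh:
            parts = line.split()
            # Typical log4j layout: date time LEVEL class: message
            if len(parts) >= 4 and parts[2] in ("ERROR", "FATAL"):
                counts[parts[3].rstrip(":")] += 1
    return counts

if __name__ == "__main__":
    for cls, n in summarize_errors(LOG).most_common(10):
        print(f"{n:5d}  {cls}")
```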