8+ Cloudera Jobs in India

About the Role:
We are looking for an experienced and imaginative Generative AI Architect & Engineer to lead, design, and build cutting-edge solutions using Large Language Models (LLMs) such as OpenAI’s GPT-4/5, Google’s Gemini, Meta’s LLaMA, and Anthropic’s Claude. You will spearhead the development of a corporate-wide GenAI platform that harnesses proprietary enterprise knowledge and empowers sales, operations, and other departments with instant, contextual insights, guidance, and automation.
As the GenAI leader, you will be responsible for building scalable pipelines, integrating enterprise content repositories, fine-tuning or adapting foundational models, and creating secure and intuitive access patterns for business teams. You are passionate about creating highly usable solutions that solve real-world problems, and are fluent across architecture, implementation, and MLOps best practices.
Key Responsibilities:
- Architect and implement an end-to-end GenAI solution leveraging LLMs to serve as a contextual assistant across multiple business units.
- Develop pipelines to ingest, clean, and index enterprise knowledge (documents, wikis, CRM, chat transcripts, etc.) using Retrieval-Augmented Generation (RAG) patterns and vector databases (see the sketch after this list).
- Lead fine-tuning, prompt engineering, and evaluation of LLMs, adapting open-source or commercial models to enterprise needs.
- Design a secure, scalable, API-first microservice platform, including middleware and access control, integrated into corporate systems.
- Work closely with sales, operations, and customer support teams to gather use cases and translate them into impactful GenAI features.
- Drive experimentation and benchmarking to evaluate various open and closed LLMs (OpenAI, Claude, Gemini, LLaMA, Mistral, etc.) for best performance and cost-efficiency.
- Collaborate with DevOps teams to enable MLOps workflows, CI/CD pipelines, versioning, and A/B testing for AI models.
- Contribute to technical documentation, best practices, and internal knowledge sharing.
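As an illustrative sketch of the RAG pattern referenced above (the embedding model, FAISS index type, and sample content are assumptions, not a prescribed stack):

```python
# Minimal RAG retrieval sketch: embed documents, index them in FAISS,
# then fetch the top-matching chunk to ground an LLM prompt.
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "Gold-tier SLAs promise a 4-hour response window.",    # placeholder enterprise content
    "Renewal quotes are generated in the CRM, not email.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
emb = model.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(emb.shape[1])  # inner product == cosine on normalized vectors
index.add(np.asarray(emb, dtype="float32"))

query = model.encode(["What is the Gold SLA response time?"], normalize_embeddings=True)
_, ids = index.search(np.asarray(query, dtype="float32"), 1)
context = docs[ids[0][0]]

# The grounded prompt would then be sent to the chosen LLM (GPT-4, Claude, Gemini, etc.).
prompt = f"Answer using only this context:\n{context}\n\nQ: What is the Gold SLA response time?"
print(prompt)
```

In production the same pattern scales out: chunked documents, a managed vector store (Weaviate, Pinecone, Chroma), and access control applied at retrieval time.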
Key Qualifications:
- 4–5+ years of hands-on experience in AI/ML product development or applied research.
- Demonstrated experience working with LLMs (OpenAI, LLaMA, Claude, Gemini, Mistral, etc.) and RAG pipelines, including vector search (FAISS, Weaviate, Pinecone, Chroma, etc.).
- Strong Python skills and experience with frameworks such as LangChain, LlamaIndex, Hugging Face Transformers, Ray, or equivalent.
- Deep understanding of NLP, model fine-tuning, embeddings, tokenization, and content ingestion pipelines.
- Exposure to enterprise content systems (e.g., SharePoint, Confluence, Salesforce, internal wikis, etc.) and integrating with them securely.
- Solid foundation in software architecture, microservices, API design, and cloud deployments (Azure, AWS, or GCP).
- Experience with security, RBAC, and compliance practices in enterprise-grade solutions.
- Ability to lead projects independently and mentor junior engineers or data scientists.
How to Apply:
Submit your resume and a short technical project summary or portfolio (GitHub, Hugging Face, blog posts).

Key Responsibilities:
• Install, configure, and maintain Hadoop clusters.
• Monitor cluster performance and ensure high availability.
• Manage Hadoop ecosystem components (HDFS, YARN, Ozone, Spark, Kudu, Hive).
• Perform routine cluster maintenance and troubleshooting.
• Implement and manage security and data governance.
• Monitor system health and optimize performance.
• Collaborate with cross-functional teams to support big data applications.
• Perform Linux administration tasks and manage system configurations.
• Ensure data integrity and backup procedures.

Responsibilities:
- Provide support services to our Gold & Enterprise customers using our flagship product suites. This includes assistance during the engineering and operation of distributed systems, as well as responses for mission-critical systems and production customers.
- Lead end-to-end delivery and customer success of next-generation features related to scalability, reliability, robustness, usability, security, and performance of the product
- Lead and mentor others on concurrency and parallelization to deliver scalability, performance, and resource optimization in multithreaded and distributed environments
- Demonstrate the ability to actively listen to customers and show empathy for the business impact they experience when issues arise with our products
Required Skills:
- 10+ years of experience with highly scalable, distributed, multi-node environments (100+ nodes)
- Hadoop operations, including ZooKeeper, HDFS, YARN, Hive, and related components such as the Hive Metastore and Cloudera Manager/Ambari
- Authentication and security configuration and tuning (Knox, LDAP, Kerberos, SSL/TLS; secondary: SSO/OAuth/OIDC, Ranger/Sentry)
- Java troubleshooting, e.g., collection and evaluation of jstacks and heap dumps (see the sketch after this list)
- Linux, NFS, Windows, including application installation, scripting, basic command line
- Docker and Kubernetes configuration and troubleshooting, including Helm charts, storage options, logging, and basic kubectl CLI
- Experience working with scripting languages (Bash, PowerShell, Python)
- Working knowledge of application, server, and network security management concepts
- Familiarity with virtual machine technologies
- Knowledge of databases like MySQL and PostgreSQL
- Certification on any of the leading cloud providers (AWS, Azure, GCP) and/or Kubernetes is a big plus
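As a small illustration of that kind of JVM triage (jstack and jmap are standard JDK tools; the PID and output paths below are placeholders):

```python
import subprocess

pid = "12345"  # placeholder PID of the JVM process under investigation

# Capture a thread dump (with lock info) for deadlock/contention analysis.
jstack = subprocess.run(["jstack", "-l", pid], capture_output=True, text=True)
with open(f"jstack_{pid}.txt", "w") as f:
    f.write(jstack.stdout)

# Capture a heap dump of live objects for memory-leak analysis
# in a tool such as Eclipse MAT.
subprocess.run(
    ["jmap", f"-dump:live,format=b,file=heap_{pid}.hprof", pid],
    check=True,
)
```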

Requirements:
- Proficiency in shell scripting.
- Proficiency in automation of tasks.
- Proficiency in Pyspark/Python.
- Proficiency in writing and understanding Sqoop jobs.
- Understanding of Cloudera Manager.
- Good understanding of RDBMS.
- Good understanding of Excel.
- Familiarity with Hadoop ecosystem and its components.
- Understanding of data-loading tools such as Flume and Sqoop (see the Sqoop sketch after this list).
- Ability to write reliable, manageable, and high-performance code.
- Good knowledge of database principles, practices, structures, and theories.
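As an illustrative sketch of a Sqoop import driven from Python (the connection string, credentials, table, and paths are placeholders; the flags shown are standard Sqoop options):

```python
import subprocess

# Import a MySQL table into HDFS with 4 parallel mappers.
# Host, database, credentials, and paths are placeholders.
subprocess.run(
    [
        "sqoop", "import",
        "--connect", "jdbc:mysql://dbhost.example.com/sales",
        "--username", "etl_user",
        "--password-file", "/user/etl/.db_password",  # avoids passwords on the CLI
        "--table", "orders",
        "--target-dir", "/data/raw/orders",
        "--num-mappers", "4",
    ],
    check=True,
)
```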

Good understanding of, or hands-on experience with, Kafka administration / Apache Kafka Streams (a minimal sketch follows).
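A minimal sketch of a routine Kafka admin task, assuming the kafka-python client; the broker address, topic name, and partition/replication settings are placeholders:

```python
from kafka.admin import KafkaAdminClient, NewTopic

# Create a topic with explicit partitioning and replication -- a routine
# Kafka admin task. Broker address and topic settings are placeholders.
admin = KafkaAdminClient(bootstrap_servers="broker1.example.com:9092")
admin.create_topics([
    NewTopic(name="orders-events", num_partitions=6, replication_factor=3)
])
admin.close()
```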
Implementing, managing, and administering the overall Hadoop infrastructure.
Taking care of the day-to-day running of Hadoop clusters.
A Hadoop administrator works closely with the database, network, BI, and application teams to make sure that all the big data applications are highly available and performing as expected.
When working with the open-source Apache distribution, Hadoop admins have to manually set up all the configuration files: core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml. However, when working with a popular Hadoop distribution like Hortonworks, Cloudera, or MapR, the configuration files are generated by the distribution's management tooling, and the Hadoop admin need not edit them by hand. A minimal illustration follows.
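Purely as an illustration of what such a configuration file contains, here is a sketch that writes a bare-bones core-site.xml with Python's standard library; fs.defaultFS is a real Hadoop property, but the NameNode host and port are placeholders:

```python
import xml.etree.ElementTree as ET

# Minimal core-site.xml: fs.defaultFS tells clients where the NameNode lives.
# The host/port below are placeholders; adjust for your cluster.
conf = ET.Element("configuration")
prop = ET.SubElement(conf, "property")
ET.SubElement(prop, "name").text = "fs.defaultFS"
ET.SubElement(prop, "value").text = "hdfs://namenode.example.com:8020"

ET.ElementTree(conf).write("core-site.xml", xml_declaration=True, encoding="utf-8")
```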
The Hadoop admin is responsible for capacity planning and for estimating the requirements for lowering or increasing the capacity of the Hadoop cluster.
The Hadoop admin is also responsible for deciding the size of the Hadoop cluster based on the data to be stored in HDFS (a rough worked example follows).
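As a rough, illustrative sketch of that sizing arithmetic (replication factor 3 is the common HDFS default; the raw data volume, overhead factor, and per-node disk capacity below are assumptions, not fixed rules):

```python
# Back-of-the-envelope HDFS sizing: raw data is multiplied by the
# replication factor, plus working/temp space, then divided by the
# usable disk per DataNode to estimate the node count.
raw_data_tb = 100             # data to be stored in HDFS (assumption)
replication_factor = 3        # common HDFS default replication
overhead = 1.25               # ~25% headroom for temp/intermediate data (assumption)
usable_disk_per_node_tb = 40  # usable disk per DataNode (assumption)

required_tb = raw_data_tb * replication_factor * overhead
nodes = -(-required_tb // usable_disk_per_node_tb)  # ceiling division
print(f"~{required_tb:.0f} TB required -> about {int(nodes)} DataNodes")
```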
Ensuring that the Hadoop cluster is up and running at all times.
Monitoring cluster connectivity and performance.
Managing and reviewing Hadoop log files.
Backup and recovery tasks.
Resource and security management.
Troubleshooting application errors and ensuring that they do not recur.

