
Location: Bangalore/Pune/Hyderabad/Nagpur
4-5 years of overall experience in software development.
- Experience with Hadoop (Apache/Cloudera/Hortonworks) and/or other MapReduce platforms
- Experience with Hive, Pig, Sqoop, Flume and/or Mahout
- Experience with NoSQL stores: HBase, Cassandra, MongoDB
- Hands-on experience with Spark development; knowledge of Storm, Kafka, Scala
- Good knowledge of Java
- Good background in configuration management/ticketing systems such as Maven, Ant, JIRA, etc.
- Knowledge of any data integration and/or EDW tools is a plus
- Knowledge of Python/Perl/Shell is good to have
Please note: HBase, Hive, and Spark are a must.

Similar jobs
Level of skills and experience:
5 years of hands-on experience using Python, Spark, and SQL.
Experienced in AWS Cloud usage and management.
Experience with Databricks (Lakehouse, ML, Unity Catalog, MLflow).
Experience using various ML models and frameworks such as XGBoost, LightGBM, Torch.
Experience with orchestrators such as Airflow and Kubeflow.
Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Fundamental understanding of Parquet, Delta Lake and other data file formats.
Proficiency with an IaC tool such as Terraform, CDK or CloudFormation.
Strong written and verbal English communication skills, and proficiency in communicating with non-technical stakeholders.
Requirements
• Extensive and expert programming experience in at least one general-purpose programming language (e.g., Java, C, C++) and tech stack to write maintainable, scalable, unit-tested code.
• Experience with multi-threading and concurrent programming.
• Extensive experience in object-oriented design, knowledge of design patterns, and a huge passion and ability to design intuitive modules and class-level interfaces.
• Excellent coding skills: should be able to convert design into code fluently.
• Knowledge of Test-Driven Development.
• Good understanding of relational databases (e.g., MySQL) and NoSQL stores (e.g., HBase, Elasticsearch, Aerospike).
• Strong desire to solve complex and interesting real-world problems.
• Experience with full life cycle development in any programming language on a Linux platform.
• Go-getter attitude that is reflected in the energy and intent behind assigned tasks.
• Experience working in a startup-like environment with high levels of ownership and commitment.
• BTech, MTech or Ph.D. in Computer Science or a related technical discipline (or equivalent).
• Experience in building highly scalable business applications, which involve implementing large, complex business flows and dealing with huge amounts of data.
• 3+ years of experience in the art of writing code and solving problems on a large scale.
• Open communicator who shares thoughts and opinions frequently, listens intently, and takes constructive feedback.
About the Role-
Thinking big and executing beyond what is expected. The challenges cut across algorithmic problem solving, systems engineering, machine learning and infrastructure at a massive scale.
Reason to Join-
An opportunity for innovators, problem solvers and learners. The work will be innovative, empowering, rewarding and fun. Amazing office and competitive pay, along with an excellent benefits package.
Requirements and Responsibilities- (please read carefully before applying)
- Overall experience of 3-6 years with Java/Python frameworks and machine learning.
- Develop web services using REST, XSD, and XML technologies, Java, Python, AWS, and APIs.
- Experience with Elasticsearch, Solr, or Lucene (search engines, text mining, indexing).
- Experience with highly scalable tools like Kafka, Spark, Aerospike, etc.
- Hands-on experience in design, architecture, implementation, performance and scalability, and distributed systems.
- Design, implement, and deploy highly scalable and reliable systems.
- Troubleshoot the Solr indexing process and querying engine.
- Bachelor's or Master's degree in Computer Science from a Tier 1 institution.
The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in private and public cloud environments. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and many more, providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth making us the leading contributor to Big Data platforms and ecosystems and a leading provider for enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components, develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the open source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects both upstream and downstream. If you are curious about the project and its code quality, you can check the project and the code at the following link. You can start development before you join. This is one of the beauties of the OSS world.
Apache Hive: https://hive.apache.org/
Responsibilities:
- Build robust and scalable data infrastructure software.
- Design and create services and system architecture for your projects.
- Improve code quality through writing unit tests, automation, and code reviews.
- Write Java code and/or build several services in the Cloudera Data Warehouse.
- Work with a team of engineers who review each other's code and designs and hold each other to an extremely high bar for quality.
- Understand the basics of Kubernetes.
- Build out the production and test infrastructure.
- Develop automation frameworks to reproduce issues and prevent regressions.
- Work closely with other developers providing services to our system.
- Help to analyze and understand how customers use the product, and improve it where necessary.
Qualifications:
- Deep familiarity with the Java programming language.
- Hands-on experience with distributed systems.
- Knowledge of database concepts and RDBMS internals.
- Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
- Experience working in a distributed team.
- 3+ years of experience in software development.
- Bachelor’s or master’s degree in Computer Engineering, Computer Science, Computer Applications, Mathematics, Statistics, or a related technical field. Relevant experience of at least 3 years is accepted in lieu of the above if from a different stream of education.
- Well-versed in, and 3+ years of hands-on, demonstrable experience with:
  ▪ Stream and batch big data pipeline processing using Apache Spark and/or Apache Flink
  ▪ Distributed cloud-native computing, including serverless functions
  ▪ Relational, object store, document, graph, etc. database design and implementation
  ▪ Microservices architecture, API modeling, design, and programming
- 3+ years of hands-on development experience in Apache Spark using Scala and/or Java.
- Ability to write executable code for services using Spark RDD, Spark SQL, Structured Streaming, Spark MLlib, etc., with a deep technical understanding of the Spark processing framework.
- In-depth knowledge of standard programming languages such as Scala and/or Java.
- 3+ years of hands-on development experience in one or more libraries and frameworks such as Apache Kafka, Akka, Apache Storm, Apache NiFi, ZooKeeper, the Hadoop ecosystem (i.e., HDFS, YARN, MapReduce, Oozie and Hive), etc.; extra points if you can demonstrate your knowledge with working examples.
- 3+ years of hands-on development experience in one or more relational and NoSQL datastores such as PostgreSQL, Cassandra, HBase, MongoDB, DynamoDB, Elasticsearch, Neo4j, etc.
- Practical knowledge of distributed systems involving partitioning, bucketing, the CAP theorem, replication, horizontal scaling, etc.
- Passion for distilling large volumes of data and analyzing performance, scalability, and capacity issues in Big Data platforms.
- Ability to clearly distinguish between system and Spark job performance, and to perform Spark performance tuning and resource optimization.
- Perform benchmarking/stress tests and document the best practices for different applications.
- Proactively work with tenants on improving overall performance and ensure the system is resilient and scalable.
- Good understanding of virtualization and containerization; must demonstrate experience in technologies such as Kubernetes, Istio, Docker, OpenShift, Anthos, Oracle VirtualBox, Vagrant, etc.
- Well-versed in, with demonstrable working experience of, API management, API gateways, service mesh, identity and access management, and data protection and encryption.
- Hands-on, demonstrable working experience with DevOps tools and platforms, viz. Jira, Git, Jenkins, code quality and security plugins, Maven, Artifactory, Terraform, Ansible/Chef/Puppet, Spinnaker, etc.
- Well-versed in AWS, Azure, and/or Google Cloud; must demonstrate experience in at least FIVE (5) services offered under AWS, Azure, and/or Google Cloud in any of these categories: Compute or Storage, Database, Networking & Content Delivery, Management & Governance, Analytics, Security, Identity, & Compliance (or) equivalent demonstrable cloud platform experience.
- Good understanding of storage, networks, and storage networking basics, which will enable you to work in a cloud environment.
- Good understanding of network, data, and application security basics, which will enable you to work in a cloud as well as a business applications/API services environment.
Looking for a part-time candidate for job support with good skills in Python, Hadoop, Oracle and Perl.
Working hours: 2-3 hours daily for 1 year; payment from 200-700. WhatsApp +1 mad C00 Vwxe your details







