



Location: Bangalore/Pune/Hyderabad/Nagpur
4-5 years of overall experience in software development.
- Experience on Hadoop (Apache/Cloudera/Hortonworks) and/or other Map Reduce Platforms
- Experience on Hive, Pig, Sqoop, Flume and/or Mahout
- Experience on NO-SQL – HBase, Cassandra, MongoDB
- Hands on experience with Spark development, Knowledge of Storm, Kafka, Scala
- Good knowledge of Java
- Good background of Configuration Management/Ticketing systems like Maven/Ant/JIRA etc.
- Knowledge around any Data Integration and/or EDW tools is plus
- Good to have knowledge of using Python/Perl/Shell
Please note - Hbase hive and spark are must.

Similar jobs

Requirements
- 3+ years work experience with production-grade python. Contribution to open source repos is preferred
- Experience writing concurrent and distributed programs, AWS lambda, Kubernetes, Docker, Spark is preferred.
- Experience with one relational & 1 non-relational DB is preferred
- Prior work in the ML domain will be a big boost
What You’ll Do
- Help realize the product vision: Production-ready machine learning models with monitoring within moments, not months.
- Help companies deploy their machine learning models at scale across a wide range of use-cases and sectors.
- Build integrations with other platforms to make it easy for our customers to use our product without changing their workflow.
- Write maintainable, scalable performant python code
- Building gRPC, rest API servers
- Working with Thrift, Protobufs, etc.


A small description about the Company.
It is an Account Engagement Platform which helps B2B organizations to achieve predictable revenue growth by putting the power of AI, big data, and machine learning behind every member of the revenue team.
Looking for PYTHON DEVELOPER.
The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.
Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.
Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.
Actively participates with other consultants in problem-solving and approach development.
Responsibilities :
Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.
Perform data analysis to validate data models and to confirm the ability to meet business needs.
Assist with and support setting the data architecture direction, ensuring data architecture deliverables are developed, ensuring compliance to standards and guidelines, implementing the data architecture, and supporting technical developers at a project or business unit level.
Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.
Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.
Coordinate with Data Architects, Program Managers and participate in recurring meetings.
Help and mentor team members to understand the data model and subject areas.
Ensure that the team adheres to best practices and guidelines.
Requirements :
- Strong working knowledge of at least 3 years of Spark, Java/Scala/Pyspark, Kafka, Git, Unix / Linux, and ETL pipeline designing.
- Experience with Spark optimization/tuning/resource allocations
- Excellent understanding of IN memory distributed computing frameworks like Spark and its parameter tuning, writing optimized workflow sequences.
- Experience of relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., Redshift, Bigquery, Cassandra, etc).
- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.
- Have a deep understanding of the various stacks and components of the Big Data ecosystem.
- Hands-on experience with Python is a huge plus
· Core responsibilities to include analyze business requirements and designs for accuracy and completeness. Develops and maintains relevant product.
· BlueYonder is seeking a Senior/Principal Architect in the Data Services department (under Luminate Platform ) to act as one of key technology leaders to build and manage BlueYonder’ s technology assets in the Data Platform and Services.
· This individual will act as a trusted technical advisor and strategic thought leader to the Data Services department. The successful candidate will have the opportunity to lead, participate, guide, and mentor other people in the team on architecture and design in a hands-on manner. You are responsible for technical direction of Data Platform. This position reports to the Global Head, Data Services and will be based in Bangalore, India.
· Core responsibilities to include Architecting and designing (along with counterparts and distinguished Architects) a ground up cloud native (we use Azure) SaaS product in Order management and micro-fulfillment
· The team currently comprises of 60+ global associates across US, India (COE) and UK and is expected to grow rapidly. The incumbent will need to have leadership qualities to also mentor junior and mid-level software associates in our team. This person will lead the Data platform architecture – Streaming, Bulk with Snowflake/Elastic Search/other tools
Our current technical environment:
· Software: Java, Springboot, Gradle, GIT, Hibernate, Rest API, OAuth , Snowflake
· • Application Architecture: Scalable, Resilient, event driven, secure multi-tenant Microservices architecture
· • Cloud Architecture: MS Azure (ARM templates, AKS, HD insight, Application gateway, Virtue Networks, Event Hub, Azure AD)
· Frameworks/Others: Kubernetes, Kafka, Elasticsearch, Spark, NOSQL, RDBMS, Springboot, Gradle GIT, Ignite
Senior Software Engineer - 221254.
We (the Software Engineer team) are looking for a motivated, experienced person with a data driven approach to join our Distribution Team in Budapest or Szeged to help design, execute and improve our test sets and infrastructure for producing high-quality Hadoop software.
A Day in the life
You will be part of a team that makes sure our releases are predictable and deliver high value to the customer. This team is responsible for automating and maintaining our test harness, and making test results reliable and repeatable.
You will…
•work on making our distributed software stack more resilient to high-scale endurance runs and customer simulations
•provide valuable fixes to our product development teams to the issues you’ve found during exhaustive test runs
•work with product and field teams to make sure our customer simulations match the expectations and can provide valuable feedback to our customers
•work with amazing people - We are a fun & smart team, including many of the top luminaries in Hadoop and related open source communities. We frequently interact with the research community, collaborate with engineers at other top companies & host cutting edge researchers for tech talks.
•do innovative work - Cloudera pushes the frontier of big data & distributed computing, as our track record shows. We work on high-profile open source projects, interacting daily with engineers at other exciting companies, speaking at meet-ups, etc.
•be a part of a great culture - Transparent and open meritocracy. Everybody is always thinking of better ways to do things, and coming up with ideas that make a difference. We build our culture to be the best workplace in our careers.
You have...
•strong knowledge in at least 1 of the following languages: Java / Python / Scala / C++ / C#
•hands-on experience with at least 1 of the following configuration management tools: Ansible, Chef, Puppet, Salt
•confidence with Linux environments
•ability to identify critical weak spots in distributed software systems
•experience in developing automated test cases and test plans
•ability to deal with distributed systems
•solid interpersonal skills conducive to a distributed environment
•ability to work independently on multiple tasks
•self-driven & motivated, with a strong work ethic and a passion for problem solving
•innovate and automate and break the code
The right person in this role has an opportunity to make a huge impact at Cloudera and add value to our future decisions. If this position has piqued your interest and you have what we described - we invite you to apply! An adventure in data awaits.

Experience:
The candidate should have about 2+ years of experience with design and development in Java/Scala. Experience in algorithm, data-structure, database and distributed System is mandatory.
Required Skills:
Mandatory: -
- Core Java or Scala
- Experience in Big Data, Spark
- Extensive experience in developing spark job. Should possess good Oops knowledge and be aware of enterprise application design patterns.
- Should have the ability to analyze, design, develop and test complexity of spark job.
- Working knowledge of Unix/Linux.
- Hands on experience in Spark, creating RDD, applying operation - transformation-action
Good To have: -
- Python
- Spark streaming
- Py Spark
- Azure/AWS Cloud Knowledge of Data Storage and Compute side
Cloudera Data Warehouse Hive team looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in a private and public cloud environment. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and so many more providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth making us the leading contributor to Big Data platforms and ecosystems and a leading provider for enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components, develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.
https://hive.apache.org/" target="_blank">Apache Hive
Responsibilities:
-
Build robust and scalable data infrastructure software
-
Design and create services and system architecture for your projects
-
Improve code quality through writing unit tests, automation, and code reviews
-
The candidate would write Java code and/or build several services in the Cloudera Data Warehouse.
-
Worked with a team of engineers who reviewed each other's code/designs and held each other to an extremely high bar for the quality of code/designs
-
The candidate has to understand the basics of Kubernetes.
-
Build out the production and test infrastructure.
-
Develop automation frameworks to reproduce issues and prevent regressions.
-
Work closely with other developers providing services to our system.
-
Help to analyze and to understand how customers use the product and improve it where necessary.
Qualifications:
-
Deep familiarity with Java programming language.
-
Hands-on experience with distributed systems.
-
Knowledge of database concepts, RDBMS internals.
-
Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
-
Has experience working in a distributed team.
-
Has 3+ years of experience in software development.

Spark / Scala experience should be more than 2 years.
Combination with Java & Scala is fine or we are even fine with Big Data Developer with strong Core Java Concepts. - Scala / Spark Developer.
Strong proficiency Scala on Spark (Hadoop) - Scala + Java is also preferred
Complete SDLC process and Agile Methodology (Scrum)
Version control / Git
- You will be responsible for design, development and testing of Products
- Contributing in all phases of the development lifecycle
- Writing well designed, testable, efficient code
- Ensure designs are in compliance with specifications
- Prepare and produce releases of software components
- Support continuous improvement by investigating alternatives and technologies and presenting these for architectural review
- Some of the technologies you will be working on: Core Java, Solr, Hadoop, Spark, Elastic search, Clustering, Text Mining, NLP, Mahout and Lucene etc.

