MapReduce Jobs in Pune
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.
Experience : 7+ years
Location : Pune / Hyderabad
- Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
- Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
- Technical Hands on experience in design, coding, development and managing Large Hadoop implementation
- Proficient in SQL, Hive, PIG, Spark SQL, Shell Scripting, Kafka, Flume, Scoop with large Big Data and Data Warehousing projects with either Java, Python or Scala based Hadoop programming background
- Proficient with various development methodologies like waterfall, agile/scrum and iterative
- Good Interpersonal skills and excellent communication skills for US and UK based clients
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.
We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.
Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.
We have our own products!
Eagle – Data warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican - Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
Why join us!
Datametica is a place to innovate, bring new ideas to live and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over these years are the key factors in achieving our success.
Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Access to various learning tools and programs
Certification Reimbursement Policy
Check out more about us on our website below!
We are hiring for Tier 1 MNC for the software developer with good knowledge in Spark,Hadoop and Scala
About Amber (https://amberstudent.com)
Long-term accommodation booking platform for students (think booking.com for
student housing). Amber helps 80M students worldwide, find and book full-time accommodations near their universities, without the hassle of negotiation, nonstandardized and cumbersome paperwork, and a broken payment process.
We are the leading student housing platform globally, with 1M+ student housing units listed in 6 countries and across 80 cities.
We are growing rapidly and targeting $400M in annual gross bookings value by 2022.
If you are passionate about making international mobility and living, seamless and accessible, then - Join us in building the future of student housing!
We are amongst the fastest growing companies in Asia-Pacific as per
Financial times https://www.ft.com/high-growth-asia-pacific-ranking-2022 .
- In charge of converting raw data into usable information for analytics and business decision-making
- Setting up accurate data pipelines to structure the Data and optimize the cost
- Create and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies.
- Work with stakeholders including the Executive, Product, Analytics and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Minimum 2 years of previous experience as a data engineer or in a similar role.
- Technical expertise in data models, data mining, and segmentation
- Knowledge and hands-on with of programming languages (e.g. Java, Python
- and Scala)
- Hands-on experience with SQL database design and AWS lambda function.
- Experience with big data tools: Spark, and Kafka.
- Experience with AWS cloud services: Redshift and S3.
- Experience in ETL frameworks like AWS Glue.
- Experience in designing Data warehousing and streaming processes.
What will you get from amber:
- Fast-paced growth (can skip intermediate levels)
- Total freedom and authority (everything under you, just get the job done!)
- Open and Inclusive Environment
- Great Compensation (and ESOPs)
We have an urgent requirements of Big Data Developer profiles in our reputed MNC company.
or Python Aws
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
• Customer Office (Mumbai) / Remote Work
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
Mid / Senior Big Data Engineer
Role: Big Data EngineerNumber of open positions: 5Location: PuneAt Clairvoyant, we're building a thriving big data practice to help enterprises enable and accelerate the adoption of Big data and cloud services. In the big data space, we lead and serve as innovators, troubleshooters, and enablers. Big data practice at Clairvoyant, focuses on solving our customer's business problems by delivering products designed with best in class engineering practices and a commitment to keep the total cost of ownership to a minimum.
- 4-10 years of experience in software development.
- At least 2 years of relevant work experience on large scale Data applications.
- Strong coding experience in Java is mandatory
- Good aptitude, strong problem solving abilities, and analytical skills, ability to take ownership as appropriate
- Should be able to do coding, debugging, performance tuning and deploying the apps to Prod.
- Should have good working experience on
- o Hadoop ecosystem (HDFS, Hive, Yarn, File formats like Avro/Parquet)
- o Kafka
- o J2EE Frameworks (Spring/Hibernate/REST)
- o Spark Streaming or any other streaming technology.
- Strong coding experience in Java is mandatory
- Ability to work on the sprint stories to completion along with Unit test case coverage.
- Experience working in Agile Methodology
- Excellent communication and coordination skills
- Knowledgeable (and preferred hands on) - UNIX environments, different continuous integration tools.
- Must be able to integrate quickly into the team and work independently towards team goals
- Take the complete responsibility of the sprint stories' execution
- Be accountable for the delivery of the tasks in the defined timelines with good quality.
- Follow the processes for project execution and delivery.
- Follow agile methodology
- Work with the team lead closely and contribute to the smooth delivery of the project.
- Understand/define the architecture and discuss the pros-cons of the same with the team
- Involve in the brainstorming sessions and suggest improvements in the architecture/design.
- Work with other team leads to get the architecture/design reviewed.
- Work with the clients and counter-parts (in US) of the project.
- Keep all the stakeholders updated about the project/task status/risks/issues if there are any.
Experience: 4 to 9 years
Keywords: java, scala, spark, software development, hadoop, hive
Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
You’ll spend time on the following:
- You will partner with teammates to create complex data processing pipelines in order to solve our clients’ most ambitious challenges
- You will collaborate with Data Scientists in order to design scalable implementations of their models
- You will pair to write clean and iterative code based on TDD
- Leverage various continuous delivery practices to deploy data pipelines
- Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
- Develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
- Create data models and speak to the tradeoffs of different modeling approaches
Here’s what we’re looking for:
- You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
- You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
- Hands on experience in MapR, Cloudera, Hortonworks and/or cloud (AWS EMR, Azure HDInsights, Qubole etc.) based Hadoop distributions
- You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
- Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems
- Strong communication and client-facing skills with the ability to work in a consulting environment
Easebuzz is a payment solutions (fintech organisation) company which enables online merchants to accept, process and disburse payments through developer friendly APIs. We are focusing on building plug n play products including the payment infrastructure to solve complete business problems. Definitely a wonderful place where all the actions related to payments, lending, subscription, eKYC is happening at the same time.
We have been consistently profitable and are constantly developing new innovative products, as a result, we are able to grow 4x over the past year alone. We are well capitalised and have recently closed a fundraise of $4M in March, 2021 from prominent VC firms and angel investors. The company is based out of Pune and has a total strength of 180 employees. Easebuzz’s corporate culture is tied into the vision of building a workplace which breeds open communication and minimal bureaucracy. An equal opportunity employer, we welcome and encourage diversity in the workplace. One thing you can be sure of is that you will be surrounded by colleagues who are committed to helping each other grow.
Easebuzz Pvt. Ltd. has its presence in Pune, Bangalore, Gurugram.
Salary: As per company standards.
Designation: Data Engineering
Experience with ETL, Data Modeling, and Data Architecture
Design, build and operationalize large scale enterprise data solutions and applications using one or more of AWS data and analytics services in combination with 3rd parties
- Spark, EMR, DynamoDB, RedShift, Kinesis, Lambda, Glue.
Experience with AWS cloud data lake for development of real-time or near real-time use cases
Experience with messaging systems such as Kafka/Kinesis for real time data ingestion and processing
Build data pipeline frameworks to automate high-volume and real-time data delivery
Create prototypes and proof-of-concepts for iterative development.
Experience with NoSQL databases, such as DynamoDB, MongoDB etc
Create and maintain optimal data pipeline architecture,
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Evangelize a very high standard of quality, reliability and performance for data models and algorithms that can be streamlined into the engineering and sciences workflow
Build and enhance data pipeline architecture by designing and implementing data ingestion solutions.
- 6+ years of recent hands-on Java development
- Developing data pipelines in AWS or Google Cloud
- Great understanding of designing for performance, scalability, and reliability of data intensive application
- Hadoop MapReduce, Spark, Pig. Understanding of database fundamentals and advanced SQL knowledge.
- In-depth understanding of object oriented programming concepts and design patterns
- Ability to communicate clearly to technical and non-technical audiences, verbally and in writing
- Understanding of full software development life cycle, agile development and continuous integration
- Experience in Agile methodologies including Scrum and Kanban
ears of Exp: 3-6+ Years
Skills: Scala, Python, Hive, Airflow, Spark
Languages: Java, Python, Shell Scripting
GCP: BigTable, DataProc, BigQuery, GCS, Pubsub
AWS: Athena, Glue, EMR, S3, Redshift
MongoDB, MySQL, Kafka
Platforms: Cloudera / Hortonworks
AdTech domain experience is a plus.
Job Type - Full Time