
50+ Hadoop Jobs in India

Apply to 50+ Hadoop Jobs on CutShort.io. Find your next job, effortlessly. Browse Hadoop Jobs and apply today!

xyz

at xyz

Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore)
4 - 6 yrs
₹12L - ₹15L / yr
Java
Big Data
Apache Hive
Hadoop
Spark

Job Title Big Data Developer

Job Description

Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.

Solid experience in software development and in leading teams of engineers and scrum teams.

4+ years of hands-on experience working with MapReduce, Hive, and Spark (Core, SQL and PySpark).

Solid data warehousing concepts.

Knowledge of Financial reporting ecosystem will be a plus.

4+ years of experience in Data Engineering/Data Warehousing using Big Data technologies will be an added advantage.

Expert on Distributed ecosystem.

Hands-on experience with programming using Core Java or Python/Scala

Expert in Hadoop and Spark architecture and their working principles.

Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing huge amounts of data.

Experience in UNIX shell scripting.

Roles & Responsibilities

Ability to design and develop optimized Data pipelines for batch and real time data processing

Should have experience in analysis, design, development, testing, and implementation of system applications

Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.

Excellent technical and analytical aptitude

Good communication skills.

Excellent Project management skills.

Results driven Approach.

Mandatory Skills: Big Data, PySpark, Hive

Neysa Networks Pvt Ltd

at Neysa Networks Pvt Ltd

2 candid answers
Swapna Uchil
Posted by Swapna Uchil
Mumbai
7 - 10 yrs
Best in industry
TensorFlow
PyTorch
Python
Hadoop
R Programming
+9 more

Day in the life...


As a machine learning engineer at Neysa, you would be required to

- Collaborate with network engineers and IT teams to identify network-related challenges and areas where ML can provide solutions. Understand the specific network remediation problems that need to be addressed.

- Develop ML-based models and algorithms specific to issues that affect computer networks, for example, congestion, security threats and human errors.

- Be comfortable handling multiple data types and sources, then pre-processing and cleaning them for modelling purposes.

- Develop machine learning models that analyse networking data to predict (and possibly prevent) issues, detect anomalies or optimise performance. Choose the right approach for each, such as deep learning, reinforcement learning, or traditional statistical methods. 

- Train and evaluate the efficacy of your models, create performance metrics to assess robustness and effectiveness, and use feedback loops to make course corrections.

- Design solutions that can scale to handle large network environments efficiently. This typically means optimising execution latency and resource usage.  

- Integrate your models into the running and maintenance of a network. These integrations could be with other machines, human operators, or both.

- Document your design, architecture and your thought process. Work with the technical writers to make sure your message gets through.

- Stay updated with all that is changing in AI and ML.

 

Must have skills

 

- You should have expertise in machine learning algorithms, data processing, and model development. It would be best to have proficiency in associated frameworks such as TensorFlow, PyTorch or scikit-learn.

- You should understand how computer networks work, how they are used, how they fail, and what happens if they fail.

- You should be proficient in programming languages like Python, R, Go, LISP, etc. It would also help if you were handy with old-school shell scripting.

- Experience with data processing tools like Hadoop, Spark or Kafka. 

- An above-average understanding of Linux and operating systems in general.

 

What separates the best from the rest

 

ML and AI are continuously evolving, and so is understanding how these technologies can be applied to solve real-world problems. To do your best, you may also need to

- Conceptualise ways to map new problems to existing methodologies or create new ones.

- Be prepared to iterate, reiterate and then iterate your approach.

- Be able to interact with subject matter experts in multiple fields to identify potential use cases for machine learning.

 

What you can expect

 

An environment where you can do your best work...

- The best equipment that complements your talents

- The best tools in the business for you to bring your creations to life

- A great environment

- Flexible work hours and flexible work locations

- The opportunity to make your mark and shape the future

- And have fun...

Staffbee Solutions INC
Remote only
6 - 10 yrs
₹1L - ₹1.5L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+11 more

Looking for freelance?

We are seeking a freelance Data Engineer with 7+ years of experience

 

Skills Required: Deep knowledge of any cloud (AWS, Azure, Google Cloud), Databricks, data lakes, data warehousing, Python/Scala, SQL, BI, and other analytics systems

 

What we are looking for

We are seeking an experienced Senior Data Engineer with experience in architecture, design, and development of highly scalable data integration and data engineering processes

 

  • The Senior Consultant must have a strong understanding and experience with data & analytics solution architecture, including data warehousing, data lakes, ETL/ELT workload patterns, and related BI & analytics systems
  • Strong in scripting languages like Python, Scala
  • 5+ years of hands-on experience with one or more of these data integration/ETL tools.
  • Experience building on-prem data warehousing solutions.
  • Experience with designing and developing ETLs, Data Marts, Star Schema
  • Designing a data warehouse solution using Synapse or Azure SQL DB
  • Experience building pipelines using Synapse or Azure Data Factory to ingest data from various sources
  • Understanding of integration run times available in Azure.
  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL), as well as working familiarity with a variety of databases


Impetus technologies
Anywhere in India
10 - 12 yrs
₹3L - ₹15L / yr
Vue.js
AngularJS (1.x)
Angular (2+)
React.js
Javascript
+11 more

Experience:


Should have a minimum of 10-12 years of Experience.

Should have Product Development/Maintenance/Production Support experience in a support organization

Should have a good understanding of the services business for Fortune 1000 companies from the operations point of view

Ability to read, understand and communicate complex technical information

Ability to express ideas in an organized, articulate and concise manner

Ability to face stressful situations with a positive attitude

Any certification related to support services will be an added advantage

 


Education: BE, B.Tech (CS), MCA

Location: India

Primary Skills:

 

Hands-on experience with the OpenStack framework. Ability to set up a private cloud using an OpenStack environment. Awareness of various OpenStack services and modules

Strong experience with OpenStack services like Neutron, Cinder, Keystone, etc.

Proficiency in programming languages such as Python, Ruby, or Go.

Strong knowledge of Linux systems administration and networking.

Familiarity with virtualization technologies like KVM or VMware.

Experience with configuration management and IaC tools like Ansible, Terraform.

Subject matter expertise in OpenStack security

Solid experience with Linux and shell scripting

Sound knowledge of cloud computing concepts and technologies, such as Docker, Kubernetes, AWS, GCP, Azure, etc.

Ability to configure OpenStack environment for optimum resources

Good knowledge of security and operations in an OpenStack environment

Strong knowledge of Linux internals, networking, storage, security

Strong knowledge of VMware Enterprise products (ESX, vCenter)

Hands on experience with HEAT orchestration

Experience with CI/CD, monitoring, operational aspects

Strong experience working with REST APIs and JSON

Exposure to Big data technologies ( Messaging queues, Hadoop/MPP, NoSQL databases)

Hands on experience with open source monitoring tools like Grafana/Prometheus/Nagios/Ganglia/Zabbix etc.

Strong verbal and written communication skills are mandatory

Excellent analytical and problem solving skills are mandatory

 

Role & Responsibilities


Advise customers and colleagues on cloud and virtualization topics

Work with the architecture team on cloud design projects using OpenStack

Collaborate with product, customer success, and presales on customer projects

Participate in onsite assessments and workshops when requested 

Provide subject matter expertise and mentor colleagues

Set up OpenStack environments for projects

Design, deploy, and maintain OpenStack infrastructure.

Collaborate with cross-functional chapters to integrate OpenStack with other services (k8s, DBaaS)

Develop automation scripts and tools to streamline OpenStack operations.

Troubleshoot and resolve issues related to OpenStack services.

Monitor and optimize the performance and scalability of OpenStack components.

Stay updated with the latest OpenStack releases and contribute to the OpenStack community.

Work closely with Architects and Product Management to understand requirements

Should be capable of working independently and be responsible for end-to-end implementation

Should work with complete ownership and handle all issues without missing SLAs

Work closely with engineering team and support team

Should be able to debug the issues and report appropriately in the ticketing system

Contribute to improving the efficiency of assignments through quality improvements and innovative suggestions

Should be able to debug/create scripts for automation

Should be able to configure monitoring utilities & set up alerts

Should be hands on in setting up OS, applications, databases and have passion to learn new technologies

Should be able to scan logs, errors, and exceptions and get to the root cause of the issue

Contribute to developing a knowledge base in collaboration with other team members

Maintain customer loyalty through Integrity and accountability

Groom and mentor team members on project technologies and work

Molecular Connections

at Molecular Connections

4 recruiters
Molecular Connections
Posted by Molecular Connections
Bengaluru (Bangalore)
8 - 10 yrs
₹15L - ₹20L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
  1. Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
  2. A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
  3. Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, ZooKeeper, YARN, Flume, Pig, NiFi, Scala and Oozie.
  4. Hands on experience in creating real - time data streaming solutions using Apache Spark core, Spark SQL & DataFrames, Kafka, Spark streaming and Apache Storm.
  5. Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include NameNode, DataNode, ResourceManager, NodeManager and JobHistory Server.
  6. Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
  7. Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
  8. Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
  9. Exposure to Cloudera development environment and management using Cloudera Manager.
  10. Worked extensively with Spark using Scala on a cluster for computational analytics, installed it on top of Hadoop, and built advanced analytical applications using Spark with Hive and SQL/Oracle.
  11. Implemented Spark using Python, utilizing DataFrames and the Spark SQL API for faster data processing; imported data from different sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and then loaded the data into HDFS.
  12. Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
  13. Hands-on experience with Spark MLlib, used for predictive intelligence, customer segmentation and smooth maintenance in Spark Streaming.
  14. Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
  15. Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  16. Working on creating data pipelines for ingesting, aggregating, and loading consumer response data into Hive external tables in an HDFS location to serve as a feed for Tableau dashboards.
  17. Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
  18. In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
  19. Hands on expertise in real time analytics with Apache Spark.
  20. Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
  21. Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
  22. Experience with Microsoft cloud and with setting up clusters on Amazon EC2 and S3, including automating the setup and extension of clusters in the AWS cloud.
  23. Worked extensively with Spark using Python on a cluster for computational analytics, installed it on top of Hadoop, and built advanced analytical applications using Spark with Hive and SQL.
  24. Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
  25. Knowledge of installing, configuring, supporting and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions and on Amazon Web Services (AWS).
  26. Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
  27. Experience in creating Data frames using PySpark and performing operation on the Data frames using Python.
  28. In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
  29. Establishing multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provide the access for pulling the information we need for analysis. 
  30. Generated various kinds of knowledge reports using Power BI based on Business specification. 
  31. Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
  32. Well experienced in projects using JIRA, testing, and the Maven and Jenkins build tools.
  33. Experienced in designing, building, deploying and utilizing almost all of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
  34. Good experience with use-case development, with Software methodologies like Agile and Waterfall.
  35. Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
  36. Good working experience in importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle and Netezza to HDFS, and performing transformations on it using Hive, Pig and Spark.
  37. Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
  38. Proficient in NoSQL databases including HBase, Cassandra and MongoDB, and their integration with a Hadoop cluster.
  39. Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
AdElement

at AdElement

2 recruiters
Sachin Bhatevara
Posted by Sachin Bhatevara
Pune
0 - 1 yrs
₹0.1L - ₹3L / yr
Java
Javascript
React.js
Angular (2+)
AngularJS (1.x)
+10 more

We are looking for computer science/engineering final-year students or fresh graduates who have a solid understanding of computer science fundamentals (algorithms, data structures, object-oriented programming) and strong Java programming skills. You will get to work on machine learning algorithms as applied to online advertising or do data analytics. You will learn how to collaborate in small, agile teams, do rapid development and testing, and get to taste the invigorating feel of a start-up company.


Experience

None required


Required Skills

-Solid foundation in computer science, with strong competencies in data structures, algorithms, and software design

-Java / Python programming

-UI/UX HTML5 CSS3, Javascript

-MYSQL, Relational Databases

-MVC Framework, ReactJS


Optional Skills

-Familiarity with online advertising, web technologies

-Familiarity with Hadoop, Spark, Scala


Education

UG - B.Tech/B.E. - Computers; PG - M.Tech - Computers

Accolite Digital
Nitesh Parab
Posted by Nitesh Parab
Bengaluru (Bangalore), Hyderabad, Gurugram, Delhi, Noida, Ghaziabad, Faridabad
4 - 8 yrs
₹5L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
SSIS
SQL Server Integration Services (SSIS)
+10 more

Job Title: Data Engineer

Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.

Responsibilities:

  • Design, build, and maintain data pipelines to collect, store, and process data from various sources.
  • Create and manage data warehousing and data lake solutions.
  • Develop and maintain data processing and data integration tools.
  • Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
  • Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
  • Ensure data quality and integrity across all data sources.
  • Develop and implement best practices for data governance, security, and privacy.
  • Monitor data pipeline performance and errors, and troubleshoot issues as needed.
  • Stay up-to-date with emerging data technologies and best practices.

Requirements:

Bachelor's degree in Computer Science, Information Systems, or a related field.

Experience with ETL tools like Matillion, SSIS, Informatica

Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle.

Experience in writing complex SQL queries

Strong programming skills in languages such as Python, Java, or Scala.

Experience with data modeling, data warehousing, and data integration.

Strong problem-solving skills and ability to work independently.

Excellent communication and collaboration skills.

Familiarity with big data technologies such as Hadoop, Spark, or Kafka.

Familiarity with data warehouse/Data lake technologies like Snowflake or Databricks

Familiarity with cloud computing platforms such as AWS, Azure, or GCP.

Familiarity with Reporting tools

Teamwork/ growth contribution

  • Helping the team conduct interviews and identify the right candidates
  • Adhering to timelines
  • Timely status communication and upfront communication of any risks
  • Teach, train, and share knowledge with peers.
  • Good Communication skills
  • Proven abilities to take initiative and be innovative
  • Analytical mind with a problem-solving aptitude

Good to have :

Master's degree in Computer Science, Information Systems, or a related field.

Experience with NoSQL databases such as MongoDB or Cassandra.

Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.

Knowledge of machine learning and statistical modeling techniques.

If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.

iLink Systems

at iLink Systems

1 video
1 recruiter
Ganesh Sooriyamoorthu
Posted by Ganesh Sooriyamoorthu
Chennai, Pune, Noida, Bengaluru (Bangalore)
5 - 15 yrs
₹10L - ₹15L / yr
Apache Kafka
Big Data
Java
Spark
Hadoop
+1 more
  • KSQL
  • Data Engineering spectrum (Java/Spark)
  • Spark Scala / Kafka Streaming
  • Confluent Kafka components
  • Basic understanding of Hadoop


Mumbai, Navi Mumbai
6 - 14 yrs
₹16L - ₹37L / yr
Python
PySpark
Data engineering
Big Data
Hadoop
+3 more

Role: Principal Software Engineer


We are looking for a passionate Principal Engineer - Analytics to build data products that extract valuable business insights for efficiency and customer experience. This role will require managing, processing and analyzing large amounts of raw information in scalable databases. This will also involve developing unique data structures and writing algorithms for an entirely new set of products. The candidate will be required to have critical thinking and problem-solving skills. The candidate must be experienced in software development with advanced algorithms and must be able to handle large volumes of data. Exposure to statistics and machine learning algorithms is a big plus. The candidate should have some exposure to cloud environments, continuous integration and agile scrum processes.



Responsibilities:


• Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule

• Software Development that creates data driven intelligence in the products which deals with Big Data backends

• Exploratory analysis of the data to be able to come up with efficient data structures and algorithms for given requirements

• The system may or may not involve machine learning models and pipelines but will require advanced algorithm development

• Managing data in large-scale data stores (such as NoSQL DBs, time series DBs, geospatial DBs, etc.)

• Creating metrics and evaluating algorithms for better accuracy and recall

• Ensuring efficient access and usage of data through the means of indexing, clustering etc.

• Collaborate with engineering and product development teams.


Requirements:


• Master’s or Bachelor’s degree in Engineering in one of these domains - Computer Science, Information Technology, Information Systems, or related field from top-tier school

• OR Master’s degree or higher in Statistics, Mathematics, with hands on background in software development.

• 8 to 10 years of experience in product development, including algorithmic work

• 5+ years of experience working with large data sets or doing large-scale quantitative analysis

• Understanding of SaaS based products and services.

• Strong algorithmic problem-solving skills

• Able to mentor and manage a team and take responsibility for team deadlines.


Skill set required:


• In-depth knowledge of the Python programming language

• Understanding of software architecture and software design

• Must have fully managed a project with a team

• Experience with Agile project management practices

• Experience with data processing, analytics and visualization tools in Python (such as pandas, matplotlib, SciPy, etc.)

• Strong understanding of SQL and of querying NoSQL databases (e.g. MongoDB, Cassandra, Redis)

Telstra

at Telstra

1 video
1 recruiter
Mahesh Balappa
Posted by Mahesh Balappa
Bengaluru (Bangalore), Hyderabad, Pune
3 - 7 yrs
Best in industry
Spark
Hadoop
NOSQL Databases
Apache Kafka

About Telstra

 

Telstra is Australia’s leading telecommunications and technology company, with operations in more than 20 countries, including India, where we’re building a new Innovation and Capability Centre (ICC) in Bangalore.

 

We’re growing, fast, and for you that means many exciting opportunities to develop your career at Telstra. Join us on this exciting journey, and together, we’ll reimagine the future.

 

Why Telstra?

 

  • We're an iconic Australian company with a rich heritage that's been built over 100 years. Telstra is Australia's leading Telecommunications and Technology Company. We've been operating internationally for more than 70 years.
  • International presence spanning over 20 countries.
  • We are one of the 20 largest telecommunications providers globally
  • At Telstra, the work is complex and stimulating, but with that comes a great sense of achievement. We are shaping tomorrow's modes of communication with our innovation-driven teams.

 

Telstra offers an opportunity to make a difference to the lives of millions of people by providing flexibility in work and a rewarding career that you will be proud of!

 

About the team

Being part of Networks & IT means you'll be part of a team that focuses on extending our network superiority to enable the continued execution of our digital strategy.

With us, you'll be working with world-leading technology and change the way we do IT to ensure business needs drive priorities, accelerating our digitisation programme.

 

Focus of the role

Any new engineer joining the data chapter will mostly be developing reusable data processing and storage frameworks that can be used across the data platform.

 

About you

To be successful in the role, you'll bring skills and experience in:

 

Essential 

  • Hands-on experience in Spark Core, Spark SQL, SQL/Hive/Impala, Git/SVN/Any other VCS and Data warehousing
  • Skilled in the Hadoop Ecosystem(HDP/Cloudera/MapR/EMR etc)
  • Azure data factory/Airflow/control-M/Luigi
  • PL/SQL
  • Exposure to NOSQL(Hbase/Cassandra/GraphDB(Neo4J)/MongoDB)
  • File formats (Parquet/ORC/AVRO/Delta/Hudi etc.)
  • Kafka/Kinesis/Eventhub

 

Highly Desirable

Experience with and knowledge of the following:

  • Spark Streaming
  • Cloud exposure (Azure/AWS/GCP)
  • Azure data offerings - ADF, ADLS2, Azure Databricks, Azure Synapse, Eventhubs, CosmosDB etc.
  • Presto/Athena
  • Azure DevOps
  • Jenkins/ Bamboo/Any similar build tools
  • Power BI
  • Prior experience building, or working in a team building, reusable frameworks
  • Data modelling.
  • Data Architecture and design principles. (Delta/Kappa/Lambda architecture)
  • Exposure to CI/CD
  • Code Quality - Static and Dynamic code scans
  • Agile SDLC      

 

If you've got a passion to innovate and succeed as part of a great team, and are looking for the next step in your career, we'd welcome you to apply!

___________________________

 

We’re committed to building a diverse and inclusive workforce in all its forms. We encourage applicants from diverse gender, cultural and linguistic backgrounds and applicants who may be living with a disability. We also offer flexibility in all our roles, to ensure everyone can participate.

To learn more about how we support our people, including accessibility adjustments we can provide you through the recruitment process, visit tel.st/thrive.

Ascendeum

at Ascendeum

3 recruiters
Swezelle Esteves
Posted by Swezelle Esteves
Remote only
1 - 5 yrs
₹8L - ₹10L / yr
Python
Data Analytics
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
+4 more

Job Responsibilities: 

 

  • Identify valuable data sources and automate collection processes 
  • Undertake preprocessing of structured and unstructured data. 
  • Analyze large amounts of information to discover trends and patterns 
  • Helping develop reports and analysis. 
  • Present information using data visualization techniques. 
  • Assessing tests and implementing new or upgraded software and assisting with strategic decisions on new systems. 
  • Evaluating changes and updates to source production systems. 
  • Develop, implement, and maintain leading-edge analytic systems, taking complicated problems and building simple frameworks 
  • Providing technical expertise in data storage structures, data mining, and data cleansing. 
  • Propose solutions and strategies to business challenges 

 

Desired Skills and Experience: 

 

  • At least 1 year of experience in Data Analysis 
  • Complete understanding of Operations Research, Data Modelling, ML, and AI concepts. 
  • Knowledge of Python is mandatory, familiarity with MySQL, SQL, Scala, Java or C++ is an asset 
  • Experience using visualization tools (e.g. Jupyter Notebook) and data frameworks (e.g. Hadoop) 
  • Analytical mind and business acumen 
  • Strong math skills (e.g. statistics, algebra) 
  • Problem-solving aptitude 
  • Excellent communication and presentation skills. 
  • Bachelor’s / Master's Degree in Computer Science, Engineering, Data Science or other quantitative or relevant field is preferred  
Pune
5 - 9 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
This role is for a developer with strong core application or system programming skills in Scala and Java, and good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.
Full-stack Hadoop development experience with Scala development.
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
Requirements:-
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark, Scala and experience on Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defects fixes.
Qualifications & Experience
• Degree holder in a numerate subject
• Hands on Experience on Hadoop, Spark, Scala, Impala, Avro and messaging like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands-on experience in JDK 1.8 and a strong skill set covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
Hyderabad
3 - 5 yrs
₹10L - ₹14L / yr
Microservices
Java
Ansible
Spring Boot
Spring MVC
+8 more

Roles and Responsibilities

Java + Microservices Developer Responsibilities

Hands-on experience of a minimum of 3-5 years in developing scalable and extensible systems using Java.
Hands-on experience with microservices.
Experience with frameworks like Spring, Spring MVC, Spring Boot, Hibernate, etc.
Good knowledge of or hands-on experience with JavaScript (minimum 1 year).
Good working exposure to Big Data technologies like Hadoop, Spark, Scala, etc.
Experience with Jenkins, Maven, and Git.
Solid and fluent understanding of algorithms and data structures.
Excellent software design, problem-solving and analytical skills.
Candidates who graduated from good schools like IITs, NIITs, IIITs (preferred).
Excellent communication skills.
Experience in database technologies such as SQL and NoSQL.
Good understanding of Elasticsearch, Redis, and sync and async routines.
Read more
Hyderabad
7 - 12 yrs
₹12L - ₹24L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more

Skills

• Minimum 7 years of experience with Hadoop.
• Minimum 2 years of hands-on experience with AWS (EMR/S3 and other AWS services and dashboards).
• Minimum 2 years of good experience with the Spark framework.
• Good understanding of the Hadoop ecosystem including Hive, MR, Spark and Zeppelin.
• Responsible for troubleshooting and recommendations for Spark and MR jobs. Should be able to use existing logs to debug issues.
• Responsible for implementation and ongoing administration of Hadoop infrastructure including monitoring, tuning and troubleshooting.
• Triage production issues when they occur with other operational teams.
• Hands-on experience troubleshooting incidents: formulating theories, testing hypotheses and narrowing down possibilities to find the root cause.
Read more
Play Games24x7

at Play Games24x7

2 recruiters
Agency job
via Zyoin Web Private Limited by Vishali Vashnavi
Bengaluru (Bangalore)
8 - 12 yrs
₹40L - ₹50L / yr
Java
J2EE
PostgreSQL
MySQL
MongoDB
+19 more
Requirements:
• B. E. /B. Tech. in Computer Science or MCA from a reputed university.
• 3.5 plus years of experience in software development, with emphasis on JAVA/J2EE Server side
programming.
• Hands on experience in core Java, multithreading, RMI, socket programming, JDBC, NIO, webservices
and design patterns.
• Knowledge of distributed system, distributed caching, messaging frameworks, ESB etc.
• Experience in Linux operating system and PostgreSQL/MySQL/MongoDB/Cassandra database.
• Additionally, knowledge of HBase, Hadoop and Hive is desirable.
• Familiarity with message queue systems and AMQP and Kafka is desirable.
• Experience as a participant in agile methodologies.
• Excellent written and verbal communication skills and presentation skills.
• This is not a fullstack requirement, we are looking for a purely backend expert.
Read more
Miracle Software Systems, Inc
Ratnakumari Modhalavalasa
Posted by Ratnakumari Modhalavalasa
Visakhapatnam
3 - 5 yrs
₹2L - ₹4L / yr
Hadoop
Apache Sqoop
Apache Hive
Apache Spark
Apache Pig
+9 more
Position : Data Engineer

Duration : Full Time

Location : Vishakhapatnam, Bangalore, Chennai

years of experience : 3+ years

Job Description :

- 3+ Years of working as a Data Engineer with thorough understanding of data frameworks that collect, manage, transform and store data that can derive business insights.

- Strong communications (written and verbal) along with being a good team player.

- 2+ years of experience within the Big Data ecosystem (Hadoop, Sqoop, Hive, Spark, Pig, etc.)

- 2+ years of strong experience with SQL and Python (Data Engineering focused).

- Experience with GCP Data Services such as BigQuery, Dataflow, Dataproc, etc. is an added advantage and preferred.

- Any prior experience in ETL tools such as DataStage, Informatica, DBT, Talend, etc. is an added advantage for the role.
Read more
Hyderabad
3 - 7 yrs
₹3L - ₹10L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more


Experience : 3 to 7 Years
Number of Positions : 20

Job Location : Hyderabad

Notice : 30 Days

 

1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight

2. Experience in developing lambda functions with AWS Lambda

3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark

4. Should be able to code in Python and Scala.

5. Snowflake experience will be a plus

 

Hadoop and Hive are good to have; an understanding of them is enough.

Read more
Tata Digital Pvt Ltd
Agency job
via Seven N Half by Priya Singh
Bengaluru (Bangalore)
8 - 13 yrs
₹10L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

 

Data Engineer

 

- Highly skilled and proficient in Azure Data Engineering tech stacks (ADF, Databricks)

- Should be well experienced in design and development of Big Data integration platforms (Kafka, Hadoop).

- Highly skilled and experienced in building medium to complex data integration pipelines for data at rest and streaming data using Spark.

- Strong knowledge of R/Python.

- Advanced proficiency in solution design and implementation through Azure Data Lake, SQL and NoSQL databases.

- Strong in Data Warehousing concepts

- Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes

- Highly motivated, self-starter and quick learner

- Must have good knowledge of data modelling and understanding of data analytics

- Exposure to statistical procedures, experiments and Machine Learning techniques is an added advantage.

- Experience in leading a small team of 6/7 Data Engineers.

- Excellent written and verbal communication skills

 

Read more
Astegic

at Astegic

3 recruiters
Nikita Pasricha
Posted by Nikita Pasricha
Remote only
5 - 7 yrs
₹8L - ₹15L / yr
Data engineering
SQL
Relational Database (RDBMS)
Big Data
Scala
+14 more

WHAT YOU WILL DO:

● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Hadoop and AWS 'big data' technologies (EC2, EMR, S3, Athena).
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
● Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
● Work with data and analytics experts to strive for greater functionality in our data systems.

REQUIRED SKILLS & QUALIFICATIONS:

● 5+ years of experience in a Data Engineer role.
● Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
● Experience building and optimizing 'big data' data pipelines, architectures and data sets.
● Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
● Strong analytic skills related to working with unstructured datasets.
● Build processes supporting data transformation, data structures, metadata, dependency and workload management.
● A successful history of manipulating, processing and extracting value from large disconnected datasets.
● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
● Strong project management and organizational skills.
● Experience supporting and working with cross-functional teams in a dynamic environment.
● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.
● Experience with AWS cloud services: EC2, EMR, S3, Athena
● Experience with Linux
● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.

PREFERRED SKILLS & QUALIFICATIONS:

● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.

Read more
Number Theory

at Number Theory

3 recruiters
Nidhi Mishra
Posted by Nidhi Mishra
Gurugram
2 - 4 yrs
₹10L - ₹15L / yr
Hadoop
Spark
HDFS
Scala
Java
+2 more
Position Overview: Data Engineer (2+ yrs)
Our company is seeking to hire a skilled software developer to help with the development of our AI/ML platform.
Your duties will primarily revolve around building Platform by writing code in Scala, as well as modifying platform
to fix errors, work on distributed computing, adapt it to new cloud services, improve its performance, or upgrade
interfaces. To be successful in this role, you will need extensive knowledge of programming languages and the
software development life-cycle.

Responsibilities:
• Analyze, design, develop, troubleshoot and debug the Platform.
• Write code, guide other team members on best practices, and perform testing and debugging of applications.
• Specify, design and implement minor changes to existing software architecture. Build highly complex enhancements and resolve complex bugs. Build and execute unit tests and unit plans.
• Duties and tasks are varied and complex, needing independent judgment. Fully competent in own area of expertise.

Experience:
The candidate should have about 2+ years of experience with design and development in Java/Scala. Experience in
algorithm, Distributed System, Data-structure, database and architectures of distributed System is mandatory.

Required Skills:
1. In-depth knowledge of Hadoop, Spark architecture and its components such as HDFS, YARN and executor, cores and memory param
2. Knowledge of Scala/Java.
3. Extensive experience in developing Spark jobs. Should possess good OOP knowledge and be aware of
enterprise application design patterns.
4. Good knowledge of Unix/Linux.
5. Experience working on large-scale software projects
6. Keep an eye out for technological trends and open-source projects that can be used.
7. Knows common programming languages and frameworks
Read more
Number Theory

at Number Theory

3 recruiters
Nidhi Mishra
Posted by Nidhi Mishra
Gurugram
5 - 12 yrs
₹10L - ₹40L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more
Job Description – Big Data Architect
Number Theory is looking for an experienced software/data engineer who will focus on owning and rearchitecting dynamic pricing engineering systems.
Job Responsibilities:
• Evaluate and recommend the Big Data technology stack best suited for the NT AI at scale Platform and other products
• Lead the team in defining a proper Big Data architecture design.
• Design and implement features on the NT AI at scale platform using Spark and other Hadoop stack components.
• Drive significant technology initiatives end to end and across multiple layers of architecture
• Provide strong technical leadership in adopting and contributing to open source technologies related to Big Data across multiple engagements
• Design/architect complex, highly available, distributed, failsafe compute systems dealing with considerable, scalable amounts of data
• Identify and work on incorporating non-functional requirements into the solution (performance, scalability, monitoring etc.)

Requirements:
• A successful candidate will have 8+ years of experience in the implementation of a high-end software product.
• Provides technical leadership in the Big Data space (Spark and Hadoop stack like MapReduce, HDFS, Hive, HBase, Flume, Sqoop etc.; NoSQL stores like Cassandra, HBase etc.) across engagements and contributes to open-source Big Data technologies.
• Rich hands-on experience in Spark, having worked on Spark at a larger scale.
• Visualize and evangelize next generation infrastructure in the Big Data space (Batch, Near Real-time, Real-time technologies).
• Passionate about continuous learning, experimenting, applying and contributing towards cutting edge open-source technologies and software paradigms
• Expert-level proficiency in Java and Scala.
• Strong understanding and experience in distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN; MR & HDFS) and associated technologies: one or more of Hive, Sqoop, Avro, Flume, Oozie, Zookeeper, etc.
• Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib)
• Operating knowledge of cloud computing platforms (AWS, Azure)

Good to have:

• Operating knowledge of different enterprise hadoop distribution (C) –
• Good Knowledge of Design Patterns
• Experience working within a Linux computing environment, and use of command line tools including knowledge of shell/Python scripting for automating common tasks.
Read more
Tata Digital Pvt Ltd
Agency job
via Seven N Half by Priya Singh
Mumbai, Mangalore, Gurugram
5 - 11 yrs
₹1L - ₹15L / yr
SOA
EAI
ESB
J2EE
RESTful APIs
+14 more

Role / Purpose - Lead Developer - API and Microservices

Must have a strong hands-on development track record building integration utilizing a variety of integration products, tools, protocols, technologies, and patterns.

  • Must have an in-depth understanding of SOA/EAI/ESB concepts, SOA Governance, Event-Driven Architecture, message-based architectures, file sharing, and exchange platforms, data virtualization and caching strategies, J2EE design patterns, frameworks
  • Should possess experience with at least one of middleware technologies (Application Servers, BPMS, BRMS, ESB & Message Brokers), Programming languages (e.g. Java/J2EE, JavaScript, COBOL, C), Operating Systems (e.g. Windows, Linux, MVS), and Databases (DB2, MySQL, No SQL Databases like MongoDB, Cassandra, Hadoop, etc.)
  • Must have experience implementing API Service architectures (SOAP, REST) using any of the market-leading API Management tools such as Apigee and frameworks such as Spring Boot for Microservices
  • Should have Advanced skills in implementing API Service architectures (SOAP, REST) using any of the market-leading API Management tools such as Apigee or similar frameworks such as Spring Boot for Microservices 
  • Appetite to manage large-scale projects and multiple tracks
  •  Experience and knowhow of the e-commerce domain and retail experience are preferred
  •  Good communication & people managerial skills
Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
Rani Galipalli
Posted by Rani Galipalli
Remote only
4 - 8 yrs
₹8L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Job Description

 

  1. Solid technical skills with a proven and successful history working with data at scale and empowering organizations through data
  2. Big data processing frameworks: Spark, Scala, Hadoop, Hive, Kafka, EMR with Python
  3. Advanced experience and hands-on architecture and administration experience on big data platforms

 

Read more
Simpl

at Simpl

3 recruiters
Elish Ismael
Posted by Elish Ismael
Bengaluru (Bangalore)
3 - 10 yrs
₹10L - ₹50L / yr
Java
Apache Spark
Big Data
Hadoop
Apache Hive
About Simpl
The thrill of working at a start-up that is starting to scale massively is something else. Simpl (FinTech startup of the year - 2020) was formed in 2015 by Nitya Sharma, an investment banker from Wall Street and Chaitra Chidanand, a tech executive from the Valley, when they teamed up with a very clear mission - to make money simple so that people can live well and do amazing things. Simpl is the payment platform for the mobile-first world, and we’re backed by some of the best names in fintech globally (folks who have invested in Visa, Square and Transferwise), and
has Joe Saunders, Ex Chairman and CEO of Visa as a board member.

Everyone at Simpl is an internal entrepreneur who is given a lot of bandwidth and resources to create the next breakthrough towards the long term vision of “making money Simpl”. Our first product is a payment platform that lets people buy instantly, anywhere online, and pay later. In
the background, Simpl uses big data for credit underwriting, risk and fraud modelling, all without any paperwork, and enables Banks and Non-Bank Financial Companies to access a whole new consumer market.
In place of traditional forms of identification and authentication, Simpl integrates deeply into merchant apps via SDKs and APIs. This allows for more sophisticated forms of authentication that take full advantage of smartphone data and processing power

Skillset:
• Workflow manager/scheduler like Airflow, Luigi, Oozie
• Good handle on Python
• ETL Experience
• Batch processing frameworks like Spark, MR/PIG
• File formats: parquet, JSON, XML, thrift, avro, protobuf
• Rule engine (drools - business rule management system)
• Distributed file systems like HDFS, NFS, AWS, S3 and equivalent
• Built/configured dashboards

Nice to have:
• Data platform experience, e.g. building data lakes, working with near-realtime applications/frameworks like storm, flink, spark.
• AWS
• File encoding types: Thrift, Avro, Protobuf, Parquet, JSON, XML
• HIVE, HBASE
Read more
xpressbees
Alfiya Khan
Posted by Alfiya Khan
Pune, Bengaluru (Bangalore)
6 - 8 yrs
₹15L - ₹25L / yr
Big Data
Data Warehouse (DWH)
Data modeling
Apache Spark
Data integration
+10 more
Company Profile
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.

Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.

What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correctness and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.

Qualifications & Experience relevant for the role

• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud ( AWS).

• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with cross-functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
Read more
Ushur Technologies Pvt Ltd

at Ushur Technologies Pvt Ltd

1 video
2 recruiters
Priyanka N
Posted by Priyanka N
Bengaluru (Bangalore)
6 - 12 yrs
Best in industry
MongoDB
Spark
Hadoop
Big Data
Data engineering
+5 more
What You'll Do:
● Our Infrastructure team is looking for an excellent Big Data Engineer to join a core group that
designs the industry’s leading Micro-Engagement Platform. This role involves design and
implementation of architectures and frameworks of big data for industry’s leading intelligent
workflow automation platform. As a specialist in Ushur Engineering team, your responsibilities will
be to:
● Use your in-depth understanding to architect and optimize databases and data ingestion pipelines
● Develop HA strategies, including replica sets and sharding, for highly available clusters
● Recommend and implement solutions to improve performance, resource consumption, and
resiliency
● On an ongoing basis, identify bottlenecks in databases in development and production
environments and propose solutions
● Help DevOps team with your deep knowledge in the area of database performance, scaling,
tuning, migration & version upgrades
● Provide verifiable technical solutions to support operations at scale and with high availability
● Recommend appropriate data processing toolset and big data ecosystems to adopt
● Design and scale databases and pipelines across multiple physical locations on cloud
● Conduct Root-cause analysis of data issues
● Be self-driven, constantly research and suggest latest technologies

The experience you need:
● Engineering degree in Computer Science or related field
● 10+ years of experience working with databases, most of which should have been around
NoSql technologies
● Expertise in implementing and maintaining distributed, Big data pipelines and ETL
processes
● Solid experience in one of the following cloud-native data platforms (AWS Redshift/ Google
BigQuery/ SnowFlake)
● Exposure to real time processing techniques like Apache Kafka and CDC tools
(Debezium, Qlik Replicate)
● Strong experience in Linux Operating System
● Solid knowledge of database concepts, MongoDB, SQL, and NoSql internals
● Experience with backup and recovery for production and non-production environments
● Experience in security principles and its implementation
● Exceptionally passionate about always keeping the product quality bar at an extremely
high level
Nice-to-haves
● Proficient with one or more of Python/Node.Js/Java/similar languages

Why you want to Work with Us:
● Great Company Culture. We pride ourselves on having a values-based culture that
is welcoming, intentional, and respectful. Our internal NPS of over 65 speaks for
itself - employees recommend Ushur as a great place to work!
● Bring your whole self to work. We are focused on building a diverse culture, with
innovative ideas where you and your ideas are valued. We are a start-up and know
that every person has a significant impact!
● Rest and Relaxation. 13 Paid leaves, wellness Fridays offs (aka a day off to care
for yourself- every last Friday of the month), 12 paid sick Leaves, and more!
● Health Benefits. Preventive health checkups, Medical Insurance covering the
dependents, wellness sessions, and health talks at the office
● Keep learning. One of our core values is Growth Mindset - we believe in lifelong
learning. Certification courses are reimbursed. Ushur Community offers wide
resources for our employees to learn and grow.
● Flexible Work. In-office or hybrid working model, depending on position and
location. We seek to create an environment for all our employees where they can
thrive in both their profession and personal life.
Read more
Pune
5 - 8 yrs
₹1L - ₹15L / yr
Informatica
Informatica PowerCenter
Spark
Hadoop
Big Data
+6 more

Technical/Core skills

  1. Minimum 3 yrs of exp as an Informatica Big Data Developer (BDM) in a Hadoop environment.
  2. Knowledge of Informatica PowerExchange (PWX).
  3. Minimum 3 yrs of exp in big data querying tools like Hive and Impala.
  4. Ability to design and develop complex mappings using Informatica Big Data Developer.
  5. Create and manage Informatica PowerExchange and CDC real time implementations
  6. Strong Unix skills for writing shell scripts and troubleshooting existing scripts.
  7. Good knowledge of big data platforms and their frameworks.
  8. Good to have experience in Cloudera Data Platform (CDP)
  9. Experience with building stream processing systems using Kafka and Spark
  10. Excellent SQL knowledge

 

Soft skills :

  1. Ability to work independently 
  2. Strong analytical and problem solving skills
  3. Attitude of learning new technology
  4. Regular interaction with vendors, partners and stakeholders
Read more
Cloudera

at Cloudera

2 recruiters
Sushmitha Rengarajan
Posted by Sushmitha Rengarajan
Bengaluru (Bangalore)
3 - 20 yrs
₹1L - ₹44L / yr
ETL
Informatica
Data Warehouse (DWH)
Relational Database (RDBMS)
Data Structures
+7 more

 

The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in a private and public cloud environment. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and so many more, providing unlimited learning opportunities.

A Day in the Life

Over the past 10+ years, Cloudera has experienced tremendous growth, making us the leading contributor to Big Data platforms and ecosystems and a leading provider for enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.

You will manage product development for our CDP components, develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.

Opportunity:

Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.

Apache Hive

 

Responsibilities:

•Build robust and scalable data infrastructure software

•Design and create services and system architecture for your projects

•Improve code quality through writing unit tests, automation, and code reviews

•The candidate would write Java code and/or build several services in the Cloudera Data Warehouse.

•Work with a team of engineers who review each other's code/designs and hold each other to an extremely high bar for the quality of code/designs.

•The candidate has to understand the basics of Kubernetes.

•Build out the production and test infrastructure.

•Develop automation frameworks to reproduce issues and prevent regressions.

•Work closely with other developers providing services to our system.

•Help to analyze and to understand how customers use the product and improve it where necessary. 

Qualifications:

•Deep familiarity with Java programming language.

•Hands-on experience with distributed systems.

•Knowledge of database concepts, RDBMS internals.

•Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus. 

•Has experience working in a distributed team.

•Has 3+ years of experience in software development.

 

Read more
Cloudera

at Cloudera

2 recruiters
Sushmitha Rengarajan
Posted by Sushmitha Rengarajan
Remote, Bengaluru (Bangalore)
5 - 20 yrs
₹1L - ₹44L / yr
Java
Kubernetes
Docker
Hadoop
Apache Kafka
+3 more

 

Senior Software Engineer - 221254.

 

We (the Software Engineer team) are looking for a motivated, experienced person with a data driven approach to join our Distribution Team in Budapest or Szeged to help design, execute and improve our test sets and infrastructure for producing high-quality Hadoop software.

 

A Day in the life

 

You will be part of a team that makes sure our releases are predictable and deliver high value to the customer. This team is responsible for automating and maintaining our test harness, and making test results reliable and repeatable.

 

You will…

•work on making our distributed software stack more resilient to high-scale endurance runs and customer simulations

•provide valuable fixes to our product development teams to the issues you’ve found during exhaustive test runs

•work with product and field teams to make sure our customer simulations match the expectations and can provide valuable feedback to our customers

•work with amazing people - We are a fun & smart team, including many of the top luminaries in Hadoop and related open source communities. We frequently interact with the research community, collaborate with engineers at other top companies & host cutting edge researchers for tech talks.

•do innovative work - Cloudera pushes the frontier of big data & distributed computing, as our track record shows. We work on high-profile open source projects, interacting daily with engineers at other exciting companies, speaking at meet-ups, etc.

•be a part of a great culture - Transparent and open meritocracy. Everybody is always thinking of better ways to do things, and coming up with ideas that make a difference. We build our culture to be the best workplace in our careers.

 

You have...

•strong knowledge in at least 1 of the following languages: Java / Python / Scala / C++ / C#

•hands-on experience with at least 1 of the following configuration management tools: Ansible, Chef, Puppet, Salt

•confidence with Linux environments

•ability to identify critical weak spots in distributed software systems

•experience in developing automated test cases and test plans

•ability to deal with distributed systems

•solid interpersonal skills conducive to a distributed environment

•ability to work independently on multiple tasks

•self-driven & motivated, with a strong work ethic and a passion for problem solving

•innovate and automate and break the code

The right person in this role has an opportunity to make a huge impact at Cloudera and add value to our future decisions. If this position has piqued your interest and you have what we described - we invite you to apply! An adventure in data awaits.

 

Read more
EnterpriseMinds

at EnterpriseMinds

2 recruiters
phani kalyan
Posted by phani kalyan
Pune
9 - 14 yrs
₹20L - ₹40L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+3 more
Job Id: SG0601

Hi,

Enterprise Minds is looking for a Data Architect for the Pune location.

Req Skills:
Python, PySpark, Hadoop, Java, Scala
Read more
Subhanu Consulting

at Subhanu Consulting

4 recruiters
Rashmi Anand
Posted by Rashmi Anand
Bengaluru (Bangalore)
8 - 15 yrs
₹10L - ₹15L / yr
J2EE
Apache Kafka
API
JMS
Hadoop
+4 more
  • Produce clean code and automated tests
  • Align with enterprise architecture frameworks and standards
  • Be the role-model for all engineers in the team in terms of technical competency
  • Research, assess and adopt new technologies as required
  • Be a guide and mentor to the team members and help in ramping up the overall skill-base of the team.
  • Produce detailed estimates and optimized work plans for requirements and changes
  • Ensure that features are delivered on time and that they meet the business needs
  • Strive for quality of performance, usability, reliability, maintainability, and extensibility
  • Identify opportunities for process and tool improvements
  • Use analytical rigor to produce effective solutions to poorly defined problems
  • Follow the Build to Ship mantra in practice, with full DevOps implementation
  • 10+ years of core software development and product creation experience in CPaaS.
  • Working knowledge of VoIP, communication APIs, J2EE, JMS/Kafka, Web Services, Hadoop, React, Node.js, GoLang.
  • Working knowledge of various CPaaS channels: SMS, voice, WhatsApp, RCS, Email.
  • Working knowledge of DevOps, automation testing, test-driven development, behavior-driven development, serverless or microservices
  • Experience with AWS / Azure deployments
  • Solid background in large scale software development.
  • Full stack understanding of web/mobile/API/database development concepts and patterns
  • Exposure to microservices, IaaS, PaaS, service mesh, SaaS and cloud-native application development.

  • Understanding of Agile Scrum and SDLC principles.
  • Containerization and orchestration: Docker, Kubernetes, OpenShift, Consul, etc.
  • Knowledge of NFV (OpenStack, vSphere, vCloud, etc.)
  • Experience in Data Analytics/AI/ML or Marketing Tech domain is an added advantage
Read more
Tier 1 MNC
Chennai, Pune, Bengaluru (Bangalore), Noida, Gurugram, Kochi (Cochin), Coimbatore, Hyderabad, Mumbai, Navi Mumbai
3 - 12 yrs
₹3L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+1 more
Greetings,
We are hiring for a Tier 1 MNC: a software developer with good knowledge of Spark, Hadoop, and Scala.
Read more
Celebal Technologies

at Celebal Technologies

2 recruiters
Payal Hasnani
Posted by Payal Hasnani
Jaipur, Noida, Gurugram, Delhi, Ghaziabad, Faridabad, Pune, Mumbai
5 - 15 yrs
₹7L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more
Job Responsibilities:

• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team

Must have:
• Minimum 8 years of experience (hands-on as well as leadership) in software / data engineering across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or another cloud provider
• Demonstrable knowledge of Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT, DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills

Good to have:
• Knowledge of PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL Pool, Databricks, Power BI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel

Work Environment
• Customer Office (Mumbai) / Remote Work

Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
Read more
6sense

at 6sense

15 recruiters
Kunjan Bhagat
Posted by Kunjan Bhagat
Remote only
4 - 10 yrs
Best in industry
Java
API
Python
Kubernetes
Docker
+7 more

The Company:

It’s no surprise that 6sense is named a top workplace year after year — we have industry-leading technology developed and taken to market by a world-class team. 6sense is Top Rated on Glassdoor with a 4.9/5 and our CEO Jason Zintak was recognized as the #1 CEO in the small & medium business category by Glassdoor’s 2021 Top CEO Employees Choice Awards (https://www.glassdoor.com/Award/Top-CEOs-at-SMBs-LST_KQ0%2C16.htm).

In 2021, the employee feedback platform Comparably recognized the company for Best Company for Diversity, Best Company for Women, Best CEO, Best Company Culture, Best Company Perks & Benefits and Happiest Employees. In addition, 6sense has also won several accolades that demonstrate its reputation as an employer of choice including the Glassdoor Best Place to Work (2022), TrustRadius Tech Cares (2021) and Inc. Best Workplaces (2022, 2021, 2020, 2019).

6sense reinvents the way organizations create, manage, and convert pipeline to revenue. The 6sense Revenue AI captures anonymous buying signals, predicts the right accounts to target at the ideal time, and recommends the channels and messages to boost revenue performance. Removing guesswork, friction and wasted sales effort, 6sense empowers sales, marketing, and customer success teams to significantly improve pipeline quality, accelerate sales velocity, increase conversion rates, and grow revenue predictably.

Senior Software Engineer - Infrastructure, Cloud

Responsibilities:

Develop and deploy services to improve the availability, ease of use/management, and visibility of 6sense systems

Building and scaling out our services and infrastructure

Learning and adopting technologies that may aid in solving our challenges

Own our critical underlying systems like AWS, Kubernetes, Mesos, infrastructure deployment, and compute cluster architecture (which services frameworks and engines like Hadoop/Hive/Presto)

Write/review/debug production code, develop documentation and capacity plans, and debug live production problems
Contributing back to open-source projects if we need to add or patch functionality
Support the overall Software Engineering team to resolve any issues they encounter

Minimum Qualifications:

5+ years of experience with Linux/Unix system administration and networking fundamentals
3+ years in a Software Engineering role or equivalent experience
4+ years of working with AWS
4+ years of experience working with Kubernetes, Docker.

Strong skills in reading code as well as writing clean, maintainable, and scalable code
Good knowledge of Python
Experience designing, building, and maintaining scalable services and/or service-oriented architecture
Experience with high-availability
Experience with modern configuration management tools (e.g. Ansible/AWX, Chef, Puppet, Pulumi) and idempotency

Bonus Requirements:

Knowledge of standard security practices
Knowledge of the Hadoop ecosystem (e.g. Hadoop, Hive, Presto) including deployment, scaling, and maintenance
Experience with operating and maintaining VPN/SSH/ZeroTrust access infrastructure
Experience with CDNs such as CloudFront and Akamai
Good knowledge of JavaScript, Java, Golang
Exposure to modern build systems such as Bazel, Buck, or Pants

Every person in every role at 6sense owns a part of defining the future of our industry-leading technology. You’ll join a team where curiosity is prized, no one’s satisfied with the status quo, and everyone’s all-in on the collective good. 6sense is a place where difference-makers roll up their sleeves, take risks, act with integrity, and measure success by the value we create for our customers.

We want 6sense to be the best chapter of your career.

Feel part of something

You’ll be part of building tomorrow’s tech, revolutionizing how marketing and sales teams create, manage, and convert pipeline to revenue. And you’ll be seen and appreciated by co-workers who challenge you, cheer you on, and always have your back.

At 6sense, you’ll experience the passion from customers and colleagues alike for our market-leading vision, and you're entrusted with applying your unique talents to help bring that vision to life.

Build a career

As part of a company on a rocketship trajectory, there’s no way around it: You’re going to experience unparalleled career growth. With colleagues as humble and hungry as you are, and a leadership philosophy grounded in trust, transparency, and empowerment, every day is a chance to improve on the one before.

Enjoy access to our Udemy Training Library with 5,000+ courses, give and get recognition from your coworkers, and spend time with our executive team every two weeks in our All Hands gathering to connect, learn and ask leaders about whatever is on your mind.

Enjoy work, and your life

This is a place where you’ll do your best work and inspire others to do theirs — where you’re guaranteed to make real connections, for life, along the way.

We want to help you prioritize health and wellness, today and tomorrow. Take advantage of family medical coverage; a monthly stipend to support your physical, mental, and financial wellness; and generous paid parental leave benefits. Plus, we have an open time-off policy, so you can take the time you need.

Set for success 

A vision as big as ours only comes to life when we’re all winning together.

We’ll make sure you have the equipment you need to work at home or in one of our offices. And have the right snacks, pens or lighting with our work-from-home expense reimbursement allowance. We also partner with WeWork to make sure that if your choice is a hybrid of home and office, we have you covered in the locations they’re offered.

That’s the commitment we make to every one of our employees. If this sounds like a place where you'll thrive as you take your success to the next level, let’s chat!

Read more
IntraEdge

at IntraEdge

1 recruiter
Poornima V
Posted by Poornima V
Remote only
4 - 16 yrs
₹11L - ₹27L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Company Name: Intraedge Technologies Ltd (https://intraedge.com/)

Type: Permanent, Full time

Location: Any

A Bachelor’s degree in computer science, computer engineering, other technical discipline, or equivalent work experience

  • 4+ years of software development experience
  • 4+ years of experience with programming languages and tools: Python, Spark, Scala, Hadoop, Hive
  • Demonstrated experience with Agile or other rapid application development methods
  • Demonstrated experience with object-oriented design and coding.

Please mail your resume to poornimakattherateintraedgedotcom along with NP, how soon you can join, ECTC, availability for interview, and location.
Read more
Product based company
Bengaluru (Bangalore)
3 - 12 yrs
₹5L - ₹30L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+6 more

Responsibilities:

  • Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
  • Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
  • Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
  • Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
  • Collaborate with product development and dev ops teams in implementing the data collection and aggregation solutions
  • Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
  • Analyse large amounts of information to discover trends and patterns
  • Mine and analyse data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.

Requirements

  • Bachelor’s or Master’s degree in a highly numerate discipline such as Engineering, Science, or Economics
  • 2-6 years of proven experience working as a Data Engineer, preferably at an e-commerce/web-based or consumer technology company
  • Hands-on experience working with different big data tools like Hadoop, Spark, Flink, Kafka and so on
  • Good understanding of AWS ecosystem for big data analytics
  • Hands on experience in creating data pipelines either using tools or by independently writing scripts
  • Hands on experience in scripting languages like Python, Scala, Unix Shell scripting and so on
  • Strong problem solving skills with an emphasis on product development.
  • Experience using business intelligence tools e.g. Tableau, Power BI would be an added advantage (not mandatory)
Read more
6sense

at 6sense

15 recruiters
Romesh Rawat
Posted by Romesh Rawat
Remote only
5 - 8 yrs
₹30L - ₹45L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more

About Slintel (a 6sense company) :

Slintel, a 6sense company, the leader in capturing technographics-powered buying intent, helps companies uncover the 3% of active buyers in their target market. Slintel evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market & sales intelligence.

Slintel's customers have access to the buying patterns and contact information of more than 17 million companies and 250 million decision makers across the world.

Slintel is a fast growing B2B SaaS company in the sales and marketing tech space. We are funded by top tier VCs, and going after a billion dollar opportunity. At Slintel, we are building a sales development automation platform that can significantly improve outcomes for sales teams, while reducing the number of hours spent on research and outreach.

We are a big data company and perform deep analysis on technology buying patterns and buyer pain points to understand where buyers are in their journey. Over 100 billion data points are analyzed every week to derive recommendations on where companies should focus their marketing and sales efforts. Third party intent signals are then clubbed with first party data from CRMs to derive meaningful recommendations on whom to target on any given day.

6sense is headquartered in San Francisco, CA and has 8 office locations across 4 countries.

6sense, an account engagement platform, secured $200 million in a Series E funding round, bringing its total valuation to $5.2 billion 10 months after its $125 million Series D round. The investment was co-led by Blue Owl and MSD Partners, among other new and existing investors.

LinkedIn (Slintel) : https://www.linkedin.com/company/slintel/

Industry : Software Development

Company size : 51-200 employees (189 on LinkedIn)

Headquarters : Mountain View, California

Founded : 2016

Specialties : Technographics, lead intelligence, Sales Intelligence, Company Data, and Lead Data.

Website (Slintel) : https://www.slintel.com/slintel

LinkedIn (6sense) : https://www.linkedin.com/company/6sense/

Industry : Software Development

Company size : 501-1,000 employees (937 on LinkedIn)

Headquarters : San Francisco, California

Founded : 2013

Specialties : Predictive intelligence, Predictive marketing, B2B marketing, and Predictive sales

Website (6sense) : https://6sense.com/

Acquisition News : 

https://inc42.com/buzz/us-based-based-6sense-acquires-b2b-buyer-intelligence-startup-slintel/ 

Funding Details & News :

Slintel funding : https://www.crunchbase.com/organization/slintel

6sense funding : https://www.crunchbase.com/organization/6sense

https://www.nasdaq.com/articles/ai-software-firm-6sense-valued-at-%245.2-bln-after-softbank-joins-funding-round

https://www.bloomberg.com/news/articles/2022-01-20/6sense-reaches-5-2-billion-value-with-softbank-joining-round

https://xipometer.com/en/company/6sense

Slintel & 6sense Customers :

https://www.featuredcustomers.com/vendor/slintel/customers

https://www.featuredcustomers.com/vendor/6sense/customers

About the job

Responsibilities

  • Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse
  • Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
  • Assemble large, complex data sets from third-party vendors to meet business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology
  • Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems

Requirements

  • 3+ years of experience in a Data Engineer role
  • Proficiency in Linux
  • Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena
  • Must have experience with Python/Scala
  • Must have experience with Big Data technologies like Apache Spark
  • Must have experience with Apache Airflow
  • Experience with data pipeline and ETL tools like AWS Glue
  • Experience working with AWS cloud services (EC2, S3, RDS, Redshift) and other data solutions, e.g. Databricks, Snowflake

 

Desired Skills and Experience

Python, SQL, Scala, Spark, ETL

 

Read more
Ganit Business Solutions

at Ganit Business Solutions

3 recruiters
Vijitha VS
Posted by Vijitha VS
Remote only
4 - 7 yrs
₹10L - ₹30L / yr
Scala
ETL
Informatica
Data Warehouse (DWH)
Big Data
+4 more

Job Description:

We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built a data warehouse to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

Responsibilities:

  • Develop, test, and implement data solutions based on functional / non-functional business requirements.
  • You would be required to code in Scala and PySpark daily on Cloud as well as on-prem infrastructure
  • Build Data Models to store the data in a most optimized manner
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implementing the ETL process and optimal data pipeline architecture
  • Monitoring performance and advising any necessary infrastructure changes.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions
  • Must be able to write quality code and build secure, highly available systems.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Review peers’ code and pipelines for optimization issues and code standards before deploying to Production

Skill Sets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
  • Proficient understanding of distributed computing principles
  • Experience in working with batch processing/ real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
  • Implemented complex projects dealing with the considerable data size (PB).
  • Optimization techniques (performance, scalability, monitoring, etc.)
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Creation of DAGs for data engineering
  • Expert at Python/Scala programming, especially for data engineering/ETL purposes

 

 

 

Read more
Hiring for a leading client
New Delhi
3 - 5 yrs
₹10L - ₹15L / yr
Big Data
Apache Kafka
Business Intelligence (BI)
Data Warehouse (DWH)
Coding
+15 more
Job Description:
Senior Software Engineer - Data Team

We are seeking a highly motivated Senior Software Engineer with hands-on experience building scalable, extensible data solutions, identifying and addressing performance bottlenecks, collaborating with other team members, and implementing best practices for data engineering. Our engineering process is fully agile and has a really fast release cycle, which keeps our environment very energetic and fun.

What you'll do:

Design and development of scalable applications.
Work with Product Management teams to get maximum value out of existing data.
Contribute to continual improvement by suggesting improvements to the software system.
Ensure high scalability and performance
You will advocate for good, clean, well documented and performing code; follow standards and best practices.
We'd love for you to have:

Education: Bachelor/Master Degree in Computer Science.
Experience: 3-5 years of relevant experience in BI/DW with hands-on coding experience.

Mandatory Skills

Strong in problem-solving
Strong experience with Big Data technologies: Hive, Hadoop, Impala, HBase, Kafka, Spark
Strong experience with orchestration frameworks like Apache Oozie, Airflow
Strong experience of Data Engineering
Strong experience with Database and Data Warehousing technologies and ability to understand complex design, system architecture
Experience with the full software development lifecycle, design, develop, review, debug, document, and deliver (especially in a multi-location organization)
Good knowledge of Java
Desired Skills

Experience with Python
Experience with reporting tools like Tableau, QlikView
Experience of Git and CI-CD pipeline
Awareness of cloud platforms, e.g. AWS
Excellent communication skills with team members, Business owners, across teams
Be able to work in a challenging, dynamic environment and meet tight deadlines
Read more
Bengaluru (Bangalore), Pune, Hyderabad
4 - 6 yrs
₹6L - ₹22L / yr
Apache HBase
Apache Hive
Apache Spark
Go Programming (Golang)
Ruby on Rails (ROR)
+5 more
Urgently required: Hadoop Developer at a reputed MNC

Location: Bangalore/Pune/Hyderabad/Nagpur

4-5 years of overall experience in software development.
- Experience with Hadoop (Apache/Cloudera/Hortonworks) and/or other MapReduce platforms
- Experience with Hive, Pig, Sqoop, Flume and/or Mahout
- Experience with NoSQL: HBase, Cassandra, MongoDB
- Hands-on experience with Spark development; knowledge of Storm, Kafka, Scala
- Good knowledge of Java
- Good background in Configuration Management/Ticketing systems like Maven/Ant/JIRA etc.
- Knowledge of any Data Integration and/or EDW tools is a plus
- Good to have: knowledge of Python/Perl/Shell

 

Please note: HBase, Hive and Spark are a must.

Read more
Indium Software

at Indium Software

16 recruiters
Karunya P
Posted by Karunya P
Bengaluru (Bangalore), Hyderabad
1 - 9 yrs
₹1L - ₹15L / yr
SQL
Python
Hadoop
HiveQL
Spark
+1 more

Responsibilities:

 

* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.

SQL Specialist – Strong knowledge of and seasoned experience with SQL queries

Languages: Python

* Good communicator, shows initiative, works well with stakeholders.

* Experience working closely with Data Analysts, providing the data they need and guiding them on issues.

* Solid experience with ETL and Hadoop/Hive/PySpark/Presto/SparkSQL

* Solid communication and articulation skills

* Able to handle stakeholders independently, with minimal intervention from the reporting manager.

* Develop strategies to solve problems in logical yet creative ways.

* Create custom reports and presentations accompanied by strong data visualization and storytelling

 

We would be excited if you have:

 

* Excellent communication and interpersonal skills

* Ability to meet deadlines and manage project delivery

* Excellent report-writing and presentation skills

* Critical thinking and problem-solving capabilities

Read more
Top startup of India -  News App
Noida
6 - 10 yrs
₹35L - ₹65L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
TensorFlow
+6 more
This will be an individual contributor role, and only candidates from Tier 1/2 and product-based companies may apply.

Requirements-

● B.Tech/Masters in Mathematics, Statistics, Computer Science or another quantitative field
● 2-3+ years of work experience in ML domain (2-5 years experience)
● Hands-on coding experience in Python
● Experience in machine learning techniques such as Regression, Classification, Predictive modeling, Clustering, Deep Learning stack, NLP.
● Working knowledge of TensorFlow/PyTorch
Optional Add-ons-
● Experience with distributed computing frameworks: Map/Reduce, Hadoop, Spark etc.
● Experience with databases: MongoDB
Read more
Top startup of India -  News App
Noida
2 - 5 yrs
₹20L - ₹35L / yr
Linux/Unix
Python
Hadoop
Apache Spark
MongoDB
+4 more
Responsibilities
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Maintain, organize & automate data processes for various use cases.
● Identifying trends, doing follow-up analysis, preparing visualizations.
● Creating daily, weekly and monthly reports of product KPIs.
● Create informative, actionable and repeatable reporting that highlights relevant business trends and opportunities for improvement.

Required Skills And Experience:
● 2-5 years of work experience in data analytics- including analyzing large data sets.
● BTech in Mathematics/Computer Science
● Strong analytical, quantitative and data interpretation skills.
● Hands-on experience with Python, Apache Spark, Hadoop, NoSQL databases (MongoDB preferred), Linux is a must.
● Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Experience with Google Cloud Data Analytics Products such as BigQuery, Dataflow, Dataproc etc. (or similar cloud-based platforms).
● Experience working within a Linux computing environment, and use of command-line tools including knowledge of shell/Python scripting for automating common tasks.
● Previous experience working at startups and/or in fast-paced environments.
● Previous experience as a data engineer or in a similar role.
Read more
Chennai
5 - 13 yrs
₹9L - ₹28L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more
  • Demonstrable experience owning and developing big data solutions, using Hadoop, Hive/HBase, Spark, Databricks, ETL/ELT for 5+ years
  • 10+ years of Information Technology experience, preferably with Telecom / wireless service providers.
  • Experience in designing data solutions following Agile practices (SAFe methodology); designing for testability, deployability and releasability; rapid prototyping, data modeling, and decentralized innovation
  • DataOps mindset: allowing the architecture of a system to evolve continuously over time, while simultaneously supporting the needs of current users
  • Create and maintain Architectural Runway and Non-Functional Requirements.
  • Design for a Continuous Delivery Pipeline (CI/CD data pipeline) and enable Built-in Quality & Security from the start.
  • Demonstrated understanding, and ideally use, of at least one recognised architecture framework or standard, e.g. TOGAF, Zachman Architecture Framework, etc.
  • The ability to apply data, research, and professional judgment and experience to ensure our products are making the biggest difference to consumers
  • Demonstrated ability to work collaboratively
  • Excellent written, verbal and social skills - You will be interacting with all types of people (user experience designers, developers, managers, marketers, etc.)
  • Ability to work in a fast-paced, multiple-project environment on an independent basis and with minimal supervision
  • Technologies: .NET, AWS, Azure; Azure Synapse, NiFi, RDS, Apache Kafka, Azure Databricks, Azure Data Lake Storage, Power BI, Reporting Analytics, QlikView, SQL on-prem Data Warehouse; BSS, OSS & Enterprise Support Systems

Read more
US Based Product Organization
Bengaluru (Bangalore)
10 - 15 yrs
₹25L - ₹45L / yr
Hadoop
HDFS
Apache Hive
Zookeeper
Cloudera
+8 more

Responsibilities :

  • Provide Support Services to our Gold & Enterprise customers using our flagship product suites. This may include assistance provided during the engineering and operations of distributed systems as well as responses for mission-critical systems and production customers.
  • Lead end-to-end delivery and customer success of next-generation features related to scalability, reliability, robustness, usability, security, and performance of the product
  • Lead and mentor others on concurrency and parallelization to deliver scalability, performance, and resource optimization in a multithreaded and distributed environment
  • Demonstrate the ability to actively listen to customers and show empathy to the customer’s business impact when they experience issues with our products


Required Skills:

  • 10+ years of Experience with a highly scalable, distributed, multi-node environment (100+ nodes)
  • Hadoop operation including Zookeeper, HDFS, YARN, Hive, and related components like the Hive metastore, Cloudera Manager/Ambari, etc
  • Authentication and security configuration and tuning (KNOX, LDAP, Kerberos, SSL/TLS, second priority: SSO/OAuth/OIDC, Ranger/Sentry)
  • Java troubleshooting, e.g., collection and evaluation of jstacks, heap dumps
  • Linux, NFS, Windows, including application installation, scripting, basic command line
  • Docker and Kubernetes configuration and troubleshooting, including Helm charts, storage options, logging, and basic kubectl CLI
  • Experience working with scripting languages (Bash, PowerShell, Python)
  • Working knowledge of application, server, and network security management concepts
  • Familiarity with virtual machine technologies
  • Knowledge of databases like MySQL and PostgreSQL
  • Certification from any of the leading cloud providers (AWS, Azure, GCP) and/or in Kubernetes is a big plus
Picture the future
Agency job
via Jobdost by Sathish Kumar
Hyderabad
4 - 7 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

CORE RESPONSIBILITIES

  • Create and manage cloud resources in AWS 
  • Data ingestion from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
  • Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
  • Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
  • Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
  • Define process improvement opportunities to optimize data collection, insights and displays.
  • Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
  • Identify and interpret trends and patterns from complex data sets 
  • Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
  • Key participant in regular Scrum ceremonies with the agile teams  
  • Proficient at developing queries, writing reports and presenting findings 
  • Mentor junior members and bring best industry practices 

 

QUALIFICATIONS

  • 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
  • Strong background in math, statistics, computer science, data science or related discipline
  • Advanced knowledge of one of the following languages: Java, Scala, Python, C#
  • Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
  • Proficient with:
      • Data mining/programming tools (e.g. SAS, SQL, R, Python)
      • Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
      • Data visualization (e.g. Tableau, Looker, MicroStrategy)
  • Comfortable learning about and deploying new technologies and tools. 
  • Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
  • Good written and oral communication skills and ability to present results to non-technical audiences 
  • Knowledge of business intelligence and analytical tools, technologies and techniques.


Mandatory Requirements 

  • Experience in AWS Glue
  • Experience in Apache Parquet 
  • Proficient in AWS S3 and data lake 
  • Knowledge of Snowflake
  • Understanding of file-based ingestion best practices.
  • Scripting language: Python and PySpark

 

Hyderabad
4 - 7 yrs
₹12L - ₹28L / yr
Python
Spark
Big Data
Hadoop
Apache Hive
Must have:

  • At least 4 to 7 years of relevant experience as Big Data Engineer
  • Hands-on experience in Scala or Python
  • Hands-on experience with major components of the Hadoop ecosystem such as HDFS, MapReduce, Hive, and Impala.
  • Strong programming experience in building applications/platforms using Scala or Python.
  • Experienced in using Spark RDD transformations and actions to implement business analysis


We specialize in productizing solutions built on new technology.
Our vision is to build engineers with entrepreneurial and leadership mindsets who can create highly impactful products and solutions using technology to deliver immense value to our clients.
We strive to bring innovation and passion to everything we do, whether it is services, products, or solutions.
NA

at NA

Agency job
via Talent folks by Rijooshri Saikia
Bengaluru (Bangalore)
7 - 13 yrs
₹10L - ₹12L / yr
Team Management
Java
Hadoop
Microservices
People Management
+1 more

Senior Team Lead, Software Engineering (96386)

 

Role: Senior Team Lead


Skills: Must be an expert in the following:

  1. Java
  2. Microservices
  3. Hadoop
  4. People Management Skills.

                   
Knowledge of the following will be a plus:

AWS

Location: Bangalore, India – North Gate.

 

Acceldata

at Acceldata

5 recruiters
Richa  Kukar
Posted by Richa Kukar
Bengaluru (Bangalore)
6 - 10 yrs
Best in industry
SRE
Reliability engineering
Site reliability
Hadoop
HDFS
+1 more

Senior SRE - Acceldata (IC3 Level)


About the Job


You will join a team of highly skilled engineers who are responsible for delivering Acceldata’s support services. Our Site Reliability Engineers are trained to be active listeners and demonstrate empathy when customers encounter product issues. In our fun and collaborative environment, Site Reliability Engineers develop strong business, interpersonal and technical skills to deliver high-quality service to our valued customers.


When you arrive for your first day, we’ll want you to have:

  • Solid skills in troubleshooting to repair failed products or processes on a machine or a system using a logical, systematic search for the source of a problem in order to solve it, and make the product or process operational again
  • A strong ability to understand the feelings of our customers as we empathize with them on the issue at hand
  • A strong desire to increase your product and technology skillset and your confidence in supporting our products so you can help our customers succeed

In this position you will…

  • Provide support services to our Gold & Enterprise customers using our flagship Acceldata Pulse, Flow & Torch product suites. This may include assistance provided during the engineering and operations of distributed systems as well as responses for mission-critical systems and production customers.
  • Demonstrate the ability to actively listen to customers and show empathy to the customer’s business impact when they experience issues with our products
  • Participate in the queue management and coordination process by owning customer escalations and managing the unassigned queue.
  • Be involved with and work on other support-related activities: performing POCs and assisting with onboarding deployments of Acceldata and Hadoop distribution products.
  • Triage, diagnose and escalate customer inquiries when applicable during their engineering and operations efforts.
  • Collaborate and share solutions with both customers and the Internal team.
  • Investigate product related issues both for particular customers and for common trends that may arise
  • Study and understand critical system components and large cluster operations
  • Differentiate between issues that arise in operations, user code, or product
  • Coordinate enhancement and feature requests with product management and Acceldata engineering team.
  • Be flexible about working in shifts.
  • Participate in a rotational weekend on-call roster for critical support needs.
  • Participate as a designated or dedicated engineer for specific customers. Aspects of this engagement translate to building long-term successful relationships with customers, leading weekly status calls, and occasional visits to customer sites

In this position, you should have…

  • A strong desire and aptitude to become a well-rounded support professional. Acceldata Support considers the service we deliver as our core product.
  • A positive attitude towards feedback and continual improvement
  • A willingness to give direct feedback to and partner with management to improve team operations
  • A tenacity to bring calm and order to the often stressful situations of customer cases
  • A mental capability to multi-task across many customer situations simultaneously
  • Bachelor’s degree in Computer Science or Engineering or equivalent experience. Master’s degree is a plus
  • At least 2 years of experience with at least one of the following cloud platforms: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), including experience managing and supporting cloud infrastructure on any of the three platforms. Knowledge of Kubernetes and Docker is also a must.
  • Strong troubleshooting skills (for example, TCP/IP, DNS, file systems, load balancing, databases, Java)
  • Excellent communication skills in English (written and verbal)
  • Prior enterprise support experience in a technical environment strongly preferred

Strong Hands-on Experience Working With Or Supporting The Following

  • 8-12 years of experience with a highly scalable, distributed, multi-node environment (50+ nodes)
  • Hadoop operation including Zookeeper, HDFS, YARN, Hive, and related components like the Hive metastore, Cloudera Manager/Ambari, etc
  • Authentication and security configuration and tuning (KNOX, LDAP, Kerberos, SSL/TLS, second priority: SSO/OAuth/OIDC, Ranger/Sentry)
  • Java troubleshooting, e.g., collection and evaluation of jstacks, heap dumps

You might also have…

  • Linux, NFS, Windows, including application installation, scripting, basic command line
  • Docker and Kubernetes configuration and troubleshooting, including Helm charts, storage options, logging, and basic kubectl CLI
  • Experience working with scripting languages (Bash, PowerShell, Python)
  • Working knowledge of application, server, and network security management concepts
  • Familiarity with virtual machine technologies
  • Knowledge of databases like MySQL and PostgreSQL
  • Certification from any of the leading cloud providers (AWS, Azure, GCP) and/or in Kubernetes is a big plus

The right person in this role has an opportunity to make a huge impact at Acceldata and add value to our future decisions. If this position has piqued your interest and you have what we described - we invite you to apply! An adventure in data awaits.

Learn more at https://www.acceldata.io/about-us



Information Solution Provider Company
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
2 - 7 yrs
₹10L - ₹15L / yr
Spark
Scala
Hadoop
Big Data
Data engineering
+2 more

Responsibilities:

 

  • Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
  • Driving optimization, testing and tooling to improve quality.
  • Reviewing and approving high-level and detailed designs to ensure that the solution delivers to the business needs and aligns to the data & analytics architecture principles and roadmap.
  • Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
  • Following proper SDLC (Code review, sprint process).
  • Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
  • Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
  • Understanding various data security standards and using data security tools to apply and adhere to the required data controls for user access in the Hadoop platform.
  • Supporting and contributing to development guidelines and standards for data ingestion.
  • Working with a data scientist and business analytics team to assist in data ingestion and data related technical issues.
  • Designing and documenting the development & deployment flow.

 

Requirements:

 

  • Experience in developing REST API services using one of the Scala frameworks.
  • Ability to troubleshoot and optimize complex queries on the Spark platform
  • Expert in building and optimizing ‘big data’ data/ML pipelines, architectures and data sets.
  • Knowledge of modelling unstructured data into structured data designs.
  • Experience in Big Data access and storage techniques.
  • Experience in doing cost estimation based on the design and development.
  • Excellent debugging skills for the technical stack mentioned above, including analyzing server logs and application logs.
  • Highly organized, self-motivated, and proactive, with the ability to propose the best design solutions.
  • Good time management and multitasking skills to meet deadlines, working both independently and as part of a team.

 
