
50+ Apache Spark Jobs in India

Apply to 50+ Apache Spark Jobs on CutShort.io. Find your next job, effortlessly. Browse Apache Spark Jobs and apply today!

Publicis Sapient

Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will apply a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and will independently drive design discussions to ensure the overall health of the solution.

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will apply a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and will independently drive design discussions to ensure the overall health of the solution.

The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP or Azure cloud platforms is also required.


Role & Responsibilities:

Your role is focused on the design, development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation
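The batch side of the ingestion responsibility above can be sketched in a few lines. This is a toy example, not the employer's actual pipeline: the feeds and column names are invented, pandas stands in for the Spark/PySpark stack named in this listing, and the point is only the shape of the work — normalize heterogeneous sources into one tabular batch.

```python
import io
import json

import pandas as pd

# Two toy "heterogeneous sources": a CSV feed and a JSON feed.
csv_feed = io.StringIO("id,amount\n1,10\n2,20\n")
json_feed = io.StringIO(json.dumps([{"id": 3, "amount": 30}]))

# Normalize both into DataFrames and union them into one batch.
batch = pd.concat(
    [pd.read_csv(csv_feed), pd.read_json(json_feed)],
    ignore_index=True,
)

total = int(batch["amount"].sum())
print(len(batch), total)  # 3 60
```

The same pattern scales out in Spark by swapping the readers for `spark.read.csv` / `spark.read.json` and the concat for a DataFrame union.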

Experience Guidelines:

Mandatory Experience and Competencies:

1. Overall 5+ years of IT experience, with 3+ years in data-related technologies

2. Minimum 2.5 years of experience in Big Data technologies, with working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines

4. Strong experience in at least one of the programming languages Java, Scala or Python; Java preferable

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

6. Well-versed, with working knowledge of data-platform-related services on at least one cloud platform, IAM and data security


Preferred Experience and Knowledge (Good to Have):

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search and indexing, and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infrastructure provisioning on cloud, automated build and deployment pipelines, code quality

6. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
Data Streaming

Data Engineering: Senior Engineer / Manager


As Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will apply a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and will independently drive design discussions to ensure the overall health of the solution.


Must-Have Skills:


1. GCP


2. Spark Streaming: live data streaming experience is desired.


3. Any one coding language: Java / Python / Scala
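To make the "live data streaming" requirement concrete, the core idea behind a streaming aggregation — grouping events into fixed time windows — can be sketched in plain Python. This is only an illustration: the event names and window size are invented, and Spark's Structured Streaming API does the same thing at scale with watermarks, state stores and fault tolerance.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size):
    """Count events per key in fixed, non-overlapping time windows."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Integer division assigns each timestamp to exactly one window.
        windows[ts // window_size][key] += 1
    return {w: dict(counts) for w, counts in windows.items()}

# (timestamp, event-type) pairs, as a stream might deliver them.
events = [(0, "click"), (3, "view"), (5, "click"), (11, "click")]
print(tumbling_window_counts(events, window_size=10))
# {0: {'click': 2, 'view': 1}, 1: {'click': 1}}
```

In Spark the equivalent is a `groupBy(window(col("ts"), "10 seconds"), col("key")).count()` over a streaming DataFrame.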



Skills & Experience :


- Minimum 5+ years of overall experience, with at least 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required to build end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala or Python; Java preferable


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


- Well-versed, with working knowledge of data-platform-related services on GCP


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

AdElement

Posted by Sachin Bhatevara
Pune
3 - 7 yrs
₹25L - ₹40L / yr
Machine Learning (ML)
Data Science
Artificial Intelligence (AI)
Neural networks
PyTorch
+2 more

Data-driven decision-making is core to advertising technology at AdElement. We are looking for sharp, disciplined and highly quantitative machine learning / artificial intelligence engineers with big data experience and a passion for digital marketing to help drive informed decision-making. You will work with top talent and cutting-edge technology, and have a unique opportunity to turn your insights into products influencing billions. The ideal candidate will have an extensive background in distributed training frameworks, experience deploying machine learning models end to end, and some experience in data-driven decision-making for machine learning infrastructure enhancement. This is your chance to leave your legacy and be part of a highly successful and growing company.


Required Skills

- 3+ years of industry experience with Java/ Python in a programming intensive role

- 3+ years of experience with one or more of the following machine learning topics: classification, clustering, optimization, recommendation system, graph mining, deep learning

- 3+ years of industry experience with distributed computing frameworks such as Hadoop/Spark, Kubernetes ecosystem, etc

- 3+ years of industry experience with popular deep learning frameworks such as Spark MLlib, Keras, TensorFlow, PyTorch, etc.

- 3+ years of industry experience with major cloud computing services

- An effective communicator with the ability to explain technical concepts to a non-technical audience

- (Preferred) Prior experience with ads product development (e.g., DSP/ad-exchange/SSP)

- Able to lead a small team of AI/ML Engineers to achieve business objectives



Responsibilities

- Collaborate across multiple teams - Data Science, Operations & Engineering on unique machine learning system challenges at scale

- Leverage distributed training systems to build scalable machine learning pipelines, including ETL, model training and deployments, in the Real-Time Bidding space.

- Design and implement solutions to optimize distributed training execution in terms of model hyperparameter optimization, model training/inference latency and system-level bottlenecks  

- Research state-of-the-art machine learning infrastructures to improve data healthiness, model quality and state management during the lifecycle of ML models refresh.

- Optimize integration between popular machine learning libraries and cloud ML and data processing frameworks. 

- Build Deep Learning models and algorithms with optimal parallelism and performance on CPUs/ GPUs.

- Work with top management on defining team goals and objectives.
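The hyperparameter-optimization responsibility above can be illustrated with a minimal grid search. This is a sketch, not AdElement's method: the objective function is a stand-in (a real system would train and evaluate a model for each configuration, usually in parallel across workers), and the parameter names are invented.

```python
from itertools import product

def objective(lr, batch_size):
    # Stand-in for a real train-and-validate run; lower is better.
    return (lr - 0.1) ** 2 + (batch_size - 32) ** 2 / 1000

grid = {"lr": [0.01, 0.1, 1.0], "batch_size": [16, 32, 64]}

# Exhaustively score every combination and keep the best one.
best = min(product(grid["lr"], grid["batch_size"]),
           key=lambda cfg: objective(*cfg))
print(best)  # (0.1, 32)
```

Grid search is the simplest baseline; production systems typically replace the exhaustive loop with random or Bayesian search over the same interface.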


Education

- MTech or Ph.D. in Computer Science, Software Engineering, Mathematics or related fields

Career Forge

Posted by Mohammad Faiz
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
5 - 7 yrs
₹12L - ₹15L / yr
Python
Apache Spark
PySpark
Data engineering
ETL
+10 more

🚀 Exciting Opportunity: Data Engineer Position in Gurugram 🌐


Hello 


We are actively seeking a talented and experienced Data Engineer to join our dynamic team at Reality Motivational Venture in Gurugram (Gurgaon). If you're passionate about data, thrive in a collaborative environment, and possess the skills we're looking for, we want to hear from you!


Position: Data Engineer  

Location: Gurugram (Gurgaon)  

Experience: 5+ years 


Key Skills:

- Python

- Spark, PySpark

- Data Governance

- Cloud (AWS/Azure/GCP)


Main Responsibilities:

- Define and set up analytics environments for "Big Data" applications in collaboration with domain experts.

- Implement ETL processes for telemetry-based and stationary test data.

- Support in defining data governance, including data lifecycle management.

- Develop large-scale data processing engines and real-time search and analytics based on time series data.

- Ensure technical, methodological, and quality aspects.

- Support CI/CD processes.

- Foster know-how development and transfer, continuous improvement of leading technologies within Data Engineering.

- Collaborate with solution architects on the development of complex on-premise, hybrid, and cloud solution architectures.


Qualification Requirements:

- BSc, MSc, MEng, or PhD in Computer Science, Informatics/Telematics, Mathematics/Statistics, or a comparable engineering degree.

- Proficiency in Python and the PyData stack (Pandas/NumPy).

- Experience in high-level programming languages (C#/C++/Java).

- Familiarity with scalable processing environments like Dask (or Spark).

- Proficient in Linux and scripting languages (Bash Scripts).

- Experience in containerization and orchestration of containerized services (Kubernetes).

- Education in database technologies (SQL/OLAP and NoSQL).

- Interest in Big Data storage technologies (Elastic, ClickHouse).

- Familiarity with Cloud technologies (Azure, AWS, GCP).

- Fluent English communication skills (speaking and writing).

- Ability to work constructively with a global team.

- Willingness to travel for business trips during development projects.


Preferable:

- Working knowledge of vehicle architectures, communication, and components.

- Experience in additional programming languages (C#/C++/Java, R, Scala, MATLAB).

- Experience in time-series processing.


How to Apply:

Interested candidates, please share your updated CV/resume with me.


Thank you for considering this exciting opportunity.

6sense

Posted by Romesh Rawat
Remote only
9 - 15 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

About Us:

6sense is a Predictive Intelligence Engine that is reimagining how B2B companies do sales and marketing. It works with big data at scale, advanced machine learning and predictive modelling to find buyers and predict what they will purchase, when and how much.

6sense helps B2B marketing and sales organizations fully understand the complex ABM buyer journey. By combining intent signals from every channel with the industry’s most advanced AI predictive capabilities, it is finally possible to predict account demand and optimize demand generation in an ABM world. Equipped with the power of AI and the 6sense Demand Platform™, marketing and sales professionals can uncover, prioritize and engage buyers to drive more revenue.

6sense is seeking a Staff Software Engineer, Data, to become part of a team designing, developing and deploying its customer-centric applications.

We’ve more than doubled our revenue in the past five years and completed our Series E funding of $200M last year, giving us a stable foundation for growth.


Responsibilities:

1. Own critical datasets and data pipelines for product & business, and work towards direct business goals of increased data coverage, data match rates, data quality and data freshness

2. Create more value from various datasets with creative solutions, unlock more value from existing data, and help build a data moat for the company

3. Design, develop, test, deploy and maintain optimal data pipelines, and assemble large, complex data sets that meet functional and non-functional business requirements

4. Improve our current data pipelines, i.e. improve their performance and SLAs, remove redundancies, and figure out a way to test before vs. after rollout

5. Identify, design and implement process improvements in data flow across multiple stages, in collaboration with multiple cross-functional teams, e.g. automating manual processes, optimising data delivery, hand-off processes, etc.

6. Work with cross-functional stakeholders, including the Product, Data Analytics and Customer Support teams, to enable data access and related goals

7. Build for security, privacy, scalability, reliability and compliance

8. Mentor and coach other team members on scalable and extensible solution design and best coding standards

9. Help build a team and cultivate innovation by driving cross-collaboration and execution of projects across multiple teams

Requirements:

 8-10+ years of overall work experience as a Data Engineer

 Excellent analytical and problem-solving skills

 Strong experience with Big Data technologies like Apache Spark. Experience with

Hadoop, Hive, Presto would-be a plus

 Strong experience in writing complex, optimized SQL queries across large data

sets. Experience with optimizing queries and underlying storage

 Experience with Python/ Scala

 Experience with Apache Airflow or other orchestration tools

 Experience with writing Hive / Presto UDFs in Java

 Experience working on AWS cloud platform and services.

 Experience with Key Value stores or NoSQL databases would be a plus.

 Comfortable with Unix / Linux command line
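The SQL-optimization requirement can be made concrete with a small sketch using Python's built-in sqlite3 module. The table, index and data are invented for the example; on warehouse engines like Hive or Presto the same idea shows up as partitioning and predicate pushdown rather than a B-tree index, but the principle — let the engine skip rows it does not need — is identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (1, "view"), (2, "click"), (2, "click")],
)
# An index on the filtered column lets the engine avoid a full scan.
conn.execute("CREATE INDEX idx_events_kind ON events (kind)")

rows = conn.execute(
    "SELECT user_id, COUNT(*) AS clicks FROM events "
    "WHERE kind = 'click' GROUP BY user_id ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 1), (2, 2)]
```

`EXPLAIN QUERY PLAN` on the SELECT confirms whether the index is actually used — the same habit of reading query plans applies to any engine.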

Interpersonal Skills:

- You can work independently as well as part of a team.

- You take ownership of projects and drive them to conclusion.

- You’re a good communicator and are capable of not just doing the work, but also teaching others and explaining the “why” behind complicated technical decisions.

- You aren’t afraid to roll up your sleeves: this role will evolve over time, and we’ll want you to evolve with it.

Thoughtworks

Posted by Sunidhi Thakur
Bengaluru (Bangalore)
10 - 13 yrs
Best in industry
Data modeling
PySpark
Data engineering
Big Data
Hadoop
+10 more

Lead Data Engineer

 

Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.

 

Job responsibilities

 

·      You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems

·      You will partner with teammates to create complex data processing pipelines in order to solve our clients' most ambitious challenges

·      You will collaborate with Data Scientists in order to design scalable implementations of their models

·      You will pair to write clean and iterative code based on TDD

·      Leverage various continuous delivery practices to deploy, support and operate data pipelines

·      Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available

·      Develop and operate modern data architecture approaches to meet key business objectives and provide end-to-end data solutions

·      Create data models and speak to the tradeoffs of different modeling approaches

·      On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product

·      Seamlessly incorporate data quality into your day-to-day work as well as into the delivery process

·      Assure effective collaboration between Thoughtworks' and the client's teams, encouraging open communication and advocating for shared outcomes

 

Job qualifications

Technical skills

·      You are equally happy coding and leading a team to implement a solution

·      You have a track record of innovation and expertise in Data Engineering

·      You're passionate about craftsmanship and have applied your expertise across a range of industries and organizations

·      You have a deep understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop

·      You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting

·      Hands-on experience with MapR, Cloudera, Hortonworks and/or cloud-based Hadoop distributions (AWS EMR, Azure HDInsight, Qubole, etc.)

·      You are comfortable taking data-driven approaches and applying data security strategy to solve business problems

·      You're genuinely excited about data infrastructure and operations with a familiarity working in cloud environments

·      Working with data excites you: you have created Big data architecture, you can build and operate data pipelines, and maintain data storage, all within distributed systems

 

Professional skills


·      Advocate your data engineering expertise to the broader tech community outside of Thoughtworks, speaking at conferences and acting as a mentor for more junior-level data engineers

·      You're resilient and flexible in ambiguous situations and enjoy solving problems from technical and business perspectives

·      An interest in coaching others, sharing your experience and knowledge with teammates

·      You enjoy influencing others and always advocate for technical excellence while being open to change when needed

TEKsystems

Posted by Priyanka Kanwar
Gurugram
5 - 10 yrs
₹15L - ₹25L / yr
Apache Spark
Amazon Web Services (AWS)
Python
airflow
Algorithms

TOP 3 SKILLS

Python (Language)

Spark Framework

Spark Streaming

Docker / Jenkins / Spinnaker

AWS

Hive Queries

The candidate should be a good coder.

Preferred: Airflow

Must-have experience:

Python

Spark framework and streaming

Exposure to the Machine Learning lifecycle is mandatory.

Project:

This is a search-domain project. For any search activity happening on the website, this team creates the model for it: the data scientists build sorting/scoring models for each search. The team works mostly on the streaming side of data; the candidate would work extensively on Spark Streaming, and there will be a lot of work in Machine Learning.


INTERVIEW INFORMATION

3-4 rounds:

1st round based on data engineering batching experience.

2nd round based on data engineering streaming experience.

3rd round based on the ML lifecycle (the 3rd round can be a techno-functional round based on previous feedback; otherwise a 4th round will be a functional round, if required).

Codemonk

Posted by Manjunath S
Remote only
6 - 12 yrs
₹10L - ₹24L / yr
Python
Django
Flask
WSGI
ASGI
+7 more

About the Role:

Our team is responsible for building the backend components of an MLOps platform on AWS. The backend components we build are the fundamental blocks for feature engineering, feature serving, model deployment and model inference in both batch and online modes.


What you’ll do here

• Design & build backend components of our MLOps platform on AWS.

• Collaborate with geographically distributed cross-functional teams.

• Participate in the on-call rotation with the rest of the team to handle production incidents.


What you’ll need to succeed


Must have skills:

• At least 8+ years of professional backend web development experience with Python.

• Experience with web development frameworks such as Flask, Django or FastAPI.

• Experience working with WSGI & ASGI web servers such as Gunicorn, Uvicorn etc.

• Experience with concurrent programming designs such as AsyncIO.

• Experience with unit and functional testing frameworks.

• Experience with any of the public cloud platforms like AWS, Azure, GCP, preferably AWS.

• Experience with CI/CD practices, tools and frameworks.
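The AsyncIO requirement above can be illustrated with a short, self-contained sketch. The function names and delays are invented; the point is the concurrency pattern a backend service uses to overlap I/O-bound calls instead of running them one after another.

```python
import asyncio

async def fetch(name, delay):
    # Simulated I/O-bound call, e.g. a request to a downstream service.
    await asyncio.sleep(delay)
    return f"{name}:done"

async def main():
    # gather() runs the three calls concurrently; total wall time is
    # roughly the slowest delay, not the sum of all three.
    return await asyncio.gather(
        fetch("a", 0.02), fetch("b", 0.01), fetch("c", 0.02)
    )

print(asyncio.run(main()))  # ['a:done', 'b:done', 'c:done']
```

`asyncio.gather` preserves argument order in its results, which is why the output is deterministic even though "b" finishes first.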


Nice to have skills:

• Experience with Apache Kafka and developing Kafka client applications in Python.

• Experience with MLOps platforms such as AWS SageMaker, Kubeflow or MLflow.

• Experience with big data processing frameworks, preferably Apache Spark.

• Experience with containers (Docker) and container platforms like AWS ECS or AWS EKS.

• Experience with DevOps & IaC tools such as Terraform, Jenkins etc.

• Experience with various Python packaging options such as Wheel, PEX or Conda.

JK Technosoft Ltd
Posted by Nishu Gupta
Bengaluru (Bangalore)
3 - 5 yrs
₹5L - ₹15L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+13 more

Roles and Responsibilities:

  • Design, develop, and maintain the end-to-end MLOps infrastructure from the ground up, leveraging open-source systems across the entire MLOps landscape.
  • Creating pipelines for data ingestion, data transformation, building, testing, and deploying machine learning models, as well as monitoring and maintaining the performance of these models in production.
  • Managing the MLOps stack, including version control systems, continuous integration and deployment tools, containerization, orchestration, and monitoring systems.
  • Ensure that the MLOps stack is scalable, reliable, and secure.

Skills Required:

  • 3-6 years of MLOps experience
  • Preferably worked in the startup ecosystem

Primary Skills:

  • Experience with E2E MLOps systems like ClearML, Kubeflow, MLFlow etc.
  • Technical expertise in MLOps: Should have a deep understanding of the MLOps landscape and be able to leverage open-source systems to build scalable, reliable, and secure MLOps infrastructure.
  • Programming skills: Proficient in at least one programming language, such as Python, and have experience with data science libraries, such as TensorFlow, PyTorch, or Scikit-learn.
  • DevOps experience: Should have experience with DevOps tools and practices, such as Git, Docker, Kubernetes, and Jenkins.

Secondary Skills:

  • Version Control Systems (VCS) tools like Git and Subversion
  • Containerization technologies like Docker and Kubernetes
  • Cloud Platforms like AWS, Azure, and Google Cloud Platform
  • Data Preparation and Management tools like Apache Spark, Apache Hadoop, and SQL databases like PostgreSQL and MySQL
  • Machine Learning Frameworks like TensorFlow, PyTorch, and Scikit-learn
  • Monitoring and Logging tools like Prometheus, Grafana, and Elasticsearch
  • Continuous Integration and Continuous Deployment (CI/CD) tools like Jenkins, GitLab CI, and CircleCI
  • Explainability and interpretability tools like LIME and SHAP


[x]cube LABS

Posted by Krishna Kandregula
Hyderabad
2 - 6 yrs
₹8L - ₹20L / yr
ETL
Informatica
Data Warehouse (DWH)
PowerBI
DAX
+12 more
  • Create and manage ETL/ELT pipelines based on requirements.
  • Build PowerBI dashboards and manage the datasets needed.
  • Work with stakeholders to identify data structures needed for the future and perform any transformations, including aggregations.
  • Build data cubes for real-time visualisation needs and CXO dashboards.
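The transformation-and-aggregation step above can be sketched with pandas (one of the tools this listing names). The table, regions and figures are invented; the pattern — roll raw fact rows up to the grain a dashboard dataset needs — is what the responsibility describes.

```python
import pandas as pd

# Raw fact rows, as an ETL stage might receive them.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "revenue": [100, 150, 80, 120],
})

# Aggregate to the grain a dashboard dataset typically needs.
summary = sales.groupby("region", as_index=False)["revenue"].sum()
print(summary.to_dict(orient="records"))
# [{'region': 'North', 'revenue': 250}, {'region': 'South', 'revenue': 200}]
```

In a PowerBI workflow the equivalent rollup would usually live in the semantic model as a DAX measure (e.g. a SUM over revenue sliced by region), with pandas or Spark preparing the underlying dataset.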


Required Tech Skills


  • Microsoft PowerBI & DAX
  • Python, Pandas, PyArrow, Jupyter Notebooks, Apache Spark
  • Azure Synapse, Azure Databricks, Azure HDInsight, Azure Data Factory



Conviva

Posted by Adarsh Sikarwar
Bengaluru (Bangalore)
4 - 8 yrs
₹15L - ₹40L / yr
Apache Kafka
Redis
Systems design
Data Structures
Algorithms
+5 more

Have you streamed a program on Disney+, watched your favorite binge-worthy series on Peacock or cheered your favorite team on during the World Cup from one of the 20 top streaming platforms around the globe? If the answer is yes, you’ve already benefitted from Conviva technology, helping the world’s leading streaming publishers deliver exceptional streaming experiences and grow their businesses. 


Conviva is the only global streaming analytics platform for big data that collects, standardizes, and puts trillions of cross-screen, streaming data points in context, in real time. The Conviva platform provides comprehensive, continuous, census-level measurement through real-time, server side sessionization at unprecedented scale. If this sounds important, it is! We measure a global footprint of more than 500 million unique viewers in 180 countries watching 220 billion streams per year across 3 billion applications streaming on devices. With Conviva, customers get a unique level of actionability and scale from continuous streaming measurement insights and benchmarking across every stream, every screen, every second.

 

What you get to do in this role:

Work on extremely high-scale Rust web services or backend systems.

Design and develop solutions for highly scalable web and backend systems.

Proactively identify and solve performance issues.

Maintain a high bar on code quality and unit testing.

 

What you bring to the role:

5+ years of hands-on software development experience.

At least 2+ years of Rust development experience.

Knowledge of Cargo packages for Kafka, Redis, etc.

Strong CS fundamentals, including system design, data structures and algorithms.

Expertise in backend and web services development.

Good analytical and troubleshooting skills.

 

What will help you stand out:

Experience working with large scale web services and applications.

Exposure to Golang, Scala or Java

Exposure to Big data systems like Kafka, Spark, Hadoop etc.

 

Underpinning the Conviva platform is a rich history of innovation. More than 60 patents represent award-winning technologies and standards, including first-of-its-kind innovations like time-state analytics and AI-automated data modeling that surface actionable insights. By understanding real-world human experiences and having the ability to act within seconds of observation, our customers can solve business-critical issues and focus on growing their business ahead of the competition. Examples of the brands Conviva has helped fuel streaming growth for include: DAZN, Disney+, HBO, Hulu, NBCUniversal, Paramount+, Peacock, Sky, Sling TV, Univision and Warner Bros Discovery.


Privately held, Conviva is headquartered in Silicon Valley, California with offices and people around the globe. For more information, visit us at www.conviva.com. Join us to help extend our leadership position in big data streaming analytics to new audiences and markets! 

Conviva

Posted by Anusha Bondada
Bengaluru (Bangalore)
3 - 15 yrs
₹25L - ₹70L / yr
Scala
Akka
Algorithms
Data Structures
Functional programming
+6 more


 

As Conviva is expanding, we are building products providing deep insights into end user experience for our customers.

 

Platform and TLB Team

The vision for the TLB team is to build data processing software that works on terabytes of streaming data in real time. Engineer the next-gen Spark-like system for in-memory computation of large time-series datasets – both the Spark-like backend infrastructure and a library-based programming model. Build horizontally and vertically scalable systems that analyse trillions of events per day within sub-second latencies. Utilize the latest and greatest big data technologies to build solutions for use cases across multiple verticals. Lead technology innovation and advancement that will have big business impact for years to come. Be part of a worldwide team building software using the latest technologies and the best software development tools and processes.

 

What You’ll Do

This is an individual contributor position. Expectations will be on the below lines:

  • Design, build and maintain the stream processing, and time-series analysis system which is at the heart of Conviva's products
  • Responsible for the architecture of the Conviva platform
  • Build features, enhancements, new services, and bug fixing in Scala and Java on a Jenkins-based pipeline to be deployed as Docker containers on Kubernetes
  • Own the entire lifecycle of your microservice including early specs, design, technology choice, development, unit-testing, integration-testing, documentation, deployment, troubleshooting, enhancements etc.
  • Lead a team to develop a feature or parts of the product
  • Adhere to the Agile model of software development to plan, estimate, and ship per business priority

 

What you need to succeed

  • 9+ years of work experience in software development of data processing products.
  • Engineering degree in software or equivalent from a premier institute.
  • Excellent knowledge of fundamentals of Computer Science like algorithms and data structures. Hands-on with functional programming and know-how of its concepts
  • Excellent programming and debugging skills on the JVM. Proficient in writing code in Scala/Java/Rust/Haskell/Erlang that is reliable, maintainable, secure, and performant
  • Experience with big data technologies like Spark, Flink, Kafka, Druid, HDFS, etc.
  • Deep understanding of distributed systems concepts and scalability challenges including multi-threading, concurrency, sharding, partitioning, etc.
  • Experience/knowledge of Akka/Lagom framework and/or stream processing technologies like RxJava or Project Reactor will be a big plus. Knowledge of design patterns like event-streaming, CQRS and DDD to build large microservice architectures will be a big plus
  • Excellent communication skills. Willingness to work under pressure. Hunger to learn and succeed. Comfortable with ambiguity. Comfortable with complexity

 

Underpinning the Conviva platform is a rich history of innovation. More than 60 patents represent award-winning technologies and standards, including first-of-its-kind innovations like time-state analytics and AI-automated data modeling that surface actionable insights. By understanding real-world human experiences and having the ability to act within seconds of observation, our customers can solve business-critical issues and focus on growing their businesses ahead of the competition. Examples of the brands Conviva has helped fuel streaming growth for include DAZN, Disney+, HBO, Hulu, NBCUniversal, Paramount+, Peacock, Sky, Sling TV, Univision, and Warner Bros Discovery.  

Privately held, Conviva is headquartered in Silicon Valley, California with offices and people around the globe. For more information, visit us at www.conviva.com. Join us to help extend our leadership position in big data streaming analytics to new audiences and markets! 



Read more
Accolite Digital
Nitesh Parab
Posted by Nitesh Parab
Bengaluru (Bangalore), Hyderabad, Gurugram, Delhi, Noida, Ghaziabad, Faridabad
4 - 8 yrs
₹5L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
SSIS
SQL Server Integration Services (SSIS)
+10 more

Job Title: Data Engineer

Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.

Responsibilities:

  • Design, build, and maintain data pipelines to collect, store, and process data from various sources.
  • Create and manage data warehousing and data lake solutions.
  • Develop and maintain data processing and data integration tools.
  • Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
  • Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
  • Ensure data quality and integrity across all data sources.
  • Develop and implement best practices for data governance, security, and privacy.
  • Monitor data pipeline performance and errors, and troubleshoot issues as needed.
  • Stay up-to-date with emerging data technologies and best practices.
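The pipeline responsibilities above can be sketched in miniature as an extract-transform-load flow in plain Python (record fields and the quality rule are invented for illustration; production pipelines would use tools like SSIS, Informatica, or Matillion):

```python
def extract(rows):
    """Stand-in source: in practice this would read from a DB, API, or files."""
    yield from rows

def transform(records):
    """Normalize amounts and drop records that fail a basic quality gate."""
    for r in records:
        if r.get("amount") is None:   # data-quality rule: amount must exist
            continue
        yield {"id": r["id"], "amount": round(float(r["amount"]), 2)}

def load(records, sink):
    """Append cleaned records to an in-memory sink (stand-in for a warehouse)."""
    sink.extend(records)

warehouse = []
raw = [{"id": 1, "amount": "10.504"}, {"id": 2, "amount": None}]
load(transform(extract(raw)), warehouse)
print(warehouse)
# [{'id': 1, 'amount': 10.5}]
```

Using generators keeps each stage streaming, so records flow through transform and load without materializing the whole dataset.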

Requirements:

Bachelor's degree in Computer Science, Information Systems, or a related field.

Experience with ETL tools like Matillion, SSIS, Informatica

Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle.

Experience in writing complex SQL queries

Strong programming skills in languages such as Python, Java, or Scala.

Experience with data modeling, data warehousing, and data integration.

Strong problem-solving skills and ability to work independently.

Excellent communication and collaboration skills.

Familiarity with big data technologies such as Hadoop, Spark, or Kafka.

Familiarity with data warehouse/Data lake technologies like Snowflake or Databricks

Familiarity with cloud computing platforms such as AWS, Azure, or GCP.

Familiarity with Reporting tools

Teamwork / Growth Contribution

  • Helping the team conduct interviews and identify the right candidates
  • Adhering to timelines
  • Timely status communication and upfront communication of any risks
  • Teach, train, and share knowledge with peers
  • Good communication skills
  • Proven ability to take initiative and be innovative
  • Analytical mind with a problem-solving aptitude

Good to have :

Master's degree in Computer Science, Information Systems, or a related field.

Experience with NoSQL databases such as MongoDB or Cassandra.

Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.

Knowledge of machine learning and statistical modeling techniques.

If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.

Read more
Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. E-commerce in any industry is constrained by the heavy spend on physical data infrastructure, which poses a huge financial challenge.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not only provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive & Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow

● In-depth knowledge of scalable cloud architectures

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor's degree in Computer Science
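As a small illustration of the data lake experience called out above: lakes on S3 commonly organize objects with Hive-style partition paths so engines like Hive and Spark can prune by date. A minimal sketch (table and file names are hypothetical):

```python
from datetime import date

def partition_key(table, dt, filename):
    """Build a Hive-style partitioned object key: table/dt=YYYY-MM-DD/file."""
    return f"{table}/dt={dt.isoformat()}/{filename}"

print(partition_key("orders", date(2024, 1, 15), "part-0000.parquet"))
# orders/dt=2024-01-15/part-0000.parquet
```

A query filtered on `dt` then only reads the matching prefix instead of scanning the whole table.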


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers.

Read more
Cubera Tech India Pvt Ltd
Bengaluru (Bangalore), Chennai
5 - 8 yrs
Best in industry
Data engineering
Big Data
Java
Python
Hibernate (Java)
+10 more

Data Engineer- Senior

Cubera is a data company revolutionizing big data analytics and Adtech through data share value principles wherein the users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data that is entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards web3.

What are you going to do?

Design & Develop high performance and scalable solutions that meet the needs of our customers.

Work closely with Product Management, Architects, and cross-functional teams.

Build and deploy large-scale systems in Java/Python.

Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

Create data tools for analytics and data scientist team members that assist them in building and optimizing their algorithms.

Follow best practices that can be adopted in Bigdata stack.

Use your engineering experience and technical skills to drive the features and mentor the engineers.

What are we looking for ( Competencies) :

Bachelor’s degree in computer science, computer engineering, or related technical discipline.

Overall 5 to 8 years of programming experience in Java and Python, including object-oriented design.

Data handling frameworks: Should have a working knowledge of one or more data handling frameworks like- Hive, Spark, Storm, Flink, Beam, Airflow, Nifi etc.

Data Infrastructure: Should have experience in building, deploying and maintaining applications on popular cloud infrastructure like AWS, GCP etc.

Data Store: Must have expertise in one of the general-purpose NoSQL data stores like Elasticsearch, MongoDB, Redis, Redshift, etc.

Strong sense of ownership, focus on quality, responsiveness, efficiency, and innovation.

Ability to work with distributed teams in a collaborative and productive manner.

Benefits:

Competitive Salary Packages and benefits.

Collaborative, lively and an upbeat work environment with young professionals.

Job Category: Development

Job Type: Full Time

Job Location: Bangalore

 

Read more
Simpl

at Simpl

3 recruiters
Elish Ismael
Posted by Elish Ismael
Bengaluru (Bangalore)
3 - 10 yrs
₹10L - ₹50L / yr
Java
Apache Spark
Big Data
Hadoop
Apache Hive
About Simpl
The thrill of working at a start-up that is starting to scale massively is something else. Simpl (FinTech startup of the year - 2020) was formed in 2015 by Nitya Sharma, an investment banker from Wall Street, and Chaitra Chidanand, a tech executive from the Valley, when they teamed up with a very clear mission - to make money simple so that people can live well and do amazing things. Simpl is the payment platform for the mobile-first world, and we’re backed by some of the best names in fintech globally (folks who have invested in Visa, Square and Transferwise); Joe Saunders, ex-Chairman and CEO of Visa, is a board member.

Everyone at Simpl is an internal entrepreneur who is given a lot of bandwidth and resources to create the next breakthrough towards the long-term vision of “making money Simpl”. Our first product is a payment platform that lets people buy instantly, anywhere online, and pay later. In the background, Simpl uses big data for credit underwriting, risk and fraud modelling, all without any paperwork, and enables Banks and Non-Bank Financial Companies to access a whole new consumer market.

In place of traditional forms of identification and authentication, Simpl integrates deeply into merchant apps via SDKs and APIs. This allows for more sophisticated forms of authentication that take full advantage of smartphone data and processing power.

Skillset:
  • Workflow manager/scheduler like Airflow, Luigi, Oozie
  • Good handle on Python
  • ETL experience
  • Batch processing frameworks like Spark, MR/Pig
  • File formats: Parquet, JSON, XML, Thrift, Avro, Protobuf
  • Rule engine (Drools - business rule management system)
  • Distributed file systems like HDFS, NFS, AWS S3 and equivalent
  • Built/configured dashboards

Nice to have:
  • Data platform experience, e.g. building data lakes, working with near-realtime applications/frameworks like Storm, Flink, Spark
  • AWS
  • File encoding types: Thrift, Avro, Protobuf, Parquet, JSON, XML
  • Hive, HBase
Read more
xpressbees
Alfiya Khan
Posted by Alfiya Khan
Pune, Bengaluru (Bangalore)
6 - 8 yrs
₹15L - ₹25L / yr
Big Data
Data Warehouse (DWH)
Data modeling
Apache Spark
Data integration
+10 more
Company Profile
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.

Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high-quality and agile decision-making in our supply chain and logistics
workflows. You will define the way we collect and operationalize data (structured/unstructured), and
build production pipelines for our machine learning models and our (RT, NRT, Batch) reporting &
dashboarding requirements. You will use your experience with modern cloud and data frameworks to
build products (with storage and serving systems) that drive optimisation and resilience in the supply
chain via data visibility, intelligent decision-making, insights, anomaly detection and prediction.

What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correction and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database designs across platforms (MPP,
MR, Hive/Pig) which are optimal for different use cases (structured/semi-structured).
Envision & implement the optimal data modelling, physical design, and performance
optimization techniques/approaches required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.
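One of the requirements above, data freshness, can be illustrated with a minimal check (the one-hour lag threshold is an assumed example, not an XpressBees standard):

```python
from datetime import datetime, timedelta

def is_fresh(last_loaded, max_lag=timedelta(hours=1), now=None):
    """Freshness check: did the last pipeline load land within max_lag?"""
    now = now or datetime.now()
    return now - last_loaded <= max_lag

t0 = datetime(2024, 1, 1, 12, 0)
print(is_fresh(t0, now=datetime(2024, 1, 1, 12, 30)))  # True
print(is_fresh(t0, now=datetime(2024, 1, 1, 14, 0)))   # False
```

In practice such a check would run per table in the orchestrator and alert when a pipeline misses its SLA.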

Qualifications & Experience relevant for the role

• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud (AWS).

• Characteristics of a forward thinker and self-starter who flourishes with new challenges
and adapts quickly to new knowledge
• Ability to work with cross-functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
Read more
Celebal Technologies

at Celebal Technologies

2 recruiters
Payal Hasnani
Posted by Payal Hasnani
Jaipur, Noida, Gurugram, Delhi, Ghaziabad, Faridabad, Pune, Mumbai
5 - 15 yrs
₹7L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more
Job Responsibilities:

• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team

Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills

Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel

Work Environment
• Customer Office (Mumbai) / Remote Work

Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
Read more
Clairvoyant India Private Limited
Taruna Roy
Posted by Taruna Roy
Remote only
4 - 9 yrs
₹10L - ₹15L / yr
Java
Apache Spark
Spark
SQL
HiveQL
+1 more
Must-Have:
  • 5+ years of experience in software development.
  • At least 2 years of relevant work experience on large scale Data applications
  • Good attitude, strong problem-solving abilities, analytical skills, ability to take ownership as appropriate
  • Should be able to do coding, debugging, performance tuning, and deploying the apps to Prod.
  • Should have good working experience with the Hadoop ecosystem (HDFS, Hive, YARN, file formats like Avro/Parquet)
  • Kafka
  • J2EE Frameworks (Spring/Hibernate/REST)
  • Spark Streaming or any other streaming technology.
  • Java programming language is mandatory.
  • Good to have experience with Java
  • Ability to work on the sprint stories to completion along with Unit test case coverage.
  • Experience working in Agile Methodology
  • Excellent communication and coordination skills
  • Knowledgeable (and preferred hands-on) - UNIX environments, different continuous integration tools.
  • Must be able to integrate quickly into the team and work independently towards team goals
Role & Responsibilities:
  • Take the complete responsibility of the sprint stories’ execution
  • Be accountable for the delivery of the tasks in the defined timelines with good quality
  • Follow the processes for project execution and delivery.
  • Follow agile methodology
  • Work with the team lead closely and contribute to the smooth delivery of the project.
  • Understand/define the architecture and discuss the pros-cons of the same with the team
  • Involve in the brainstorming sessions and suggest improvements in the architecture/design.
  • Work with other team leads to get the architecture/design reviewed.
  • Work with the clients and counterparts (in US) of the project.
  • Keep all the stakeholders updated about the project/task status/risks/issues if there are any.
Read more
6sense

at 6sense

15 recruiters
Romesh Rawat
Posted by Romesh Rawat
Remote only
5 - 8 yrs
₹30L - ₹45L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more

About Slintel (a 6sense company) :

Slintel, a 6sense company, is the leader in capturing technographics-powered buying intent, helping companies uncover the 3% of active buyers in their target market. Slintel evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market & sales intelligence.

Slintel's customers have access to the buying patterns and contact information of more than 17 million companies and 250 million decision makers across the world.

Slintel is a fast growing B2B SaaS company in the sales and marketing tech space. We are funded by top tier VCs, and going after a billion dollar opportunity. At Slintel, we are building a sales development automation platform that can significantly improve outcomes for sales teams, while reducing the number of hours spent on research and outreach.

We are a big data company and perform deep analysis on technology buying patterns, buyer pain points to understand where buyers are in their journey. Over 100 billion data points are analyzed every week to derive recommendations on where companies should focus their marketing and sales efforts on. Third party intent signals are then clubbed with first party data from CRMs to derive meaningful recommendations on whom to target on any given day.

6sense is headquartered in San Francisco, CA and has 8 office locations across 4 countries.

6sense, an account engagement platform, secured $200 million in a Series E funding round, bringing its total valuation to $5.2 billion 10 months after its $125 million Series D round. The investment was co-led by Blue Owl and MSD Partners, among other new and existing investors.

Linkedin (Slintel) : https://www.linkedin.com/company/slintel/

Industry : Software Development

Company size : 51-200 employees (189 on LinkedIn)

Headquarters : Mountain View, California

Founded : 2016

Specialties : Technographics, lead intelligence, Sales Intelligence, Company Data, and Lead Data.

Website (Slintel) : https://www.slintel.com/slintel

Linkedin (6sense) : https://www.linkedin.com/company/6sense/

Industry : Software Development

Company size : 501-1,000 employees (937 on LinkedIn)

Headquarters : San Francisco, California

Founded : 2013

Specialties : Predictive intelligence, Predictive marketing, B2B marketing, and Predictive sales

Website (6sense) : https://6sense.com/

Acquisition News : 

https://inc42.com/buzz/us-based-based-6sense-acquires-b2b-buyer-intelligence-startup-slintel/ 

Funding Details & News :

Slintel funding : https://www.crunchbase.com/organization/slintel

6sense funding : https://www.crunchbase.com/organization/6sense

https://www.nasdaq.com/articles/ai-software-firm-6sense-valued-at-%245.2-bln-after-softbank-joins-funding-round

https://www.bloomberg.com/news/articles/2022-01-20/6sense-reaches-5-2-billion-value-with-softbank-joining-round

https://xipometer.com/en/company/6sense

Slintel & 6sense Customers :

https://www.featuredcustomers.com/vendor/slintel/customers

https://www.featuredcustomers.com/vendor/6sense/customers

About the job

Responsibilities

  • Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse
  • Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
  • Assemble large, complex data sets from third-party vendors to meet business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elastic search, MongoDB, and AWS technology
  • Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems

Requirements

  • 3+ years of experience in a Data Engineer role
  • Proficiency in Linux
  • Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, Mongo, Cassandra, and Athena
  • Must have experience with Python/ Scala
  • Must have experience with Big Data technologies like Apache Spark
  • Must have experience with Apache Airflow
  • Experience with data pipeline and ETL tools like AWS Glue
  • Experience working with AWS cloud services: EC2, S3, RDS, Redshift, and other data solutions, e.g. Databricks, Snowflake
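As a minimal illustration of the SQL query authoring listed above, using Python's built-in sqlite3 as a stand-in for MySQL/Athena (table and values are invented):

```python
import sqlite3

# In-memory SQLite stands in for MySQL/Athena purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (account TEXT, score INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("acme", 10), ("acme", 30), ("globex", 5)])

# Aggregate a per-account total, highest first.
rows = conn.execute(
    "SELECT account, SUM(score) AS total "
    "FROM events GROUP BY account ORDER BY total DESC"
).fetchall()
print(rows)
# [('acme', 40), ('globex', 5)]
```

The same GROUP BY / ORDER BY shape carries over directly to warehouse engines like Redshift or Athena.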

 

Desired Skills and Experience

Python, SQL, Scala, Spark, ETL

 

Read more
RenovITe Technologies

at RenovITe Technologies

1 recruiter
Pranjali Sinha
Posted by Pranjali Sinha
Noida
6 - 9 yrs
₹8L - ₹16L / yr
Apache Spark
Apache Kafka
Gradle
Maven
• Java 1.8+ or OpenJDK 11+
• Core Spring Framework
• Spring Boot
• Spring REST Services
• Any RDBMS
• Gradle or Maven
• JPA / Hibernate
• GIT / Bitbucket
• Apache Spark
Read more
hiring for a leading client
Agency job
via Jobaajcom by Saksham Agarwal
Bengaluru (Bangalore)
1 - 3 yrs
₹12L - ₹15L / yr
Big Data
Apache Hadoop
Apache Impala
Apache Kafka
Apache Spark
+5 more
We are seeking a self-motivated Software Engineer with hands-on experience to build sustainable data solutions, identify and address performance bottlenecks, collaborate with other team members, and implement best practices for data engineering. Our engineering process is fully agile and has a really fast release cycle, which keeps our environment very energetic and fun.

What you'll do:

Design and development of scalable applications.
Collaborate with tech leads to get maximum understanding of underlying infrastructure.
Contribute to continual improvement by suggesting improvements to the software system.
Ensure high scalability and performance
You will advocate for good, clean, well-documented and performant code; follow standards and best practices.
We'd love for you to have:

Education: Bachelor's/Master's degree in Computer Science
Experience: 1-3 years of relevant experience in BI/Big-Data with hands-on coding experience
Mandatory Skills

Strong in problem-solving
Good exposure to Big Data technologies: Hive, Hadoop, Impala, HBase, Kafka, Spark
Strong experience of Data Engineering
Able to comprehend challenges related to Database and Data Warehousing technologies and ability to understand complex design, system architecture
Experience with the software development lifecycle, design, develop, review, debug, document, and deliver (especially in a multi-location organization)
Working knowledge of Java, Python
Desired Skills

Experience with reporting tools like Tableau, QlikView
Awareness of CI-CD pipeline
Inclination to work on cloud platforms, e.g. AWS
Crisp communication skills with team members, Business owners.
Be able to work in a challenging, dynamic environment and meet tight deadlines
Read more
Propellor.ai

at Propellor.ai

5 candid answers
1 video
Anila Nair
Posted by Anila Nair
Remote only
2 - 5 yrs
₹5L - ₹15L / yr
Python
SQL
Spark
Data Science
Machine Learning (ML)
+10 more

Job Description: Data Scientist

At Propellor.ai, we derive insights that allow our clients to make scientific decisions. We believe in demanding more from the fields of Mathematics, Computer Science, and Business Logic. Combine these and we show our clients a 360-degree view of their business. In this role, the Data Scientist will be expected to work on Procurement problems along with a team based across the globe.

We are a Remote-First Company.

Read more about us here: https://www.propellor.ai/consulting


What will help you be successful in this role

  • Articulate
  • High Energy
  • Passion to learn
  • High sense of ownership
  • Ability to work in a fast-paced and deadline-driven environment
  • Loves technology
  • Highly skilled at Data Interpretation
  • Problem solver
  • Ability to narrate the story to the business stakeholders
  • Generate insights and the ability to turn them into actions and decisions

 

Skills to work in a challenging, complex project environment

  • Need you to be naturally curious and have a passion for understanding consumer behavior
  • A high level of motivation, passion, and high sense of ownership
  • Excellent communication skills needed to manage an incredibly diverse slate of work, clients, and team personalities
  • Flexibility to work on multiple projects and deadline-driven fast-paced environment
  • Ability to work in ambiguity and manage the chaos

 

Key Responsibilities

  • Analyze data to unlock insights: Ability to identify relevant insights and actions from data.  Use regression, cluster analysis, time series, etc. to explore relationships and trends in response to stakeholder questions and business challenges.   
  • Bring in experience for AI and ML:  Bring in Industry experience and apply the same to build efficient and optimal Machine Learning solutions.
  • Exploratory Data Analysis (EDA) and Generate Insights: Analyse internal and external datasets using analytical techniques, tools, and visualization methods. Ensure pre-processing/cleansing of data and evaluate data points across the enterprise landscape and/or external data points that can be leveraged in machine learning models to generate insights. 
  • DS and ML Model Identification and Training: Identity, test, and train machine learning models that need to be leveraged for business use cases. Evaluate models based on interpretability, performance, and accuracy as required. Experiment and identify features from datasets that will help influence model outputs.  Determine what models will need to be deployed, data points that need to be fed into models, and aid in the deployment and maintenance of models.


Technical Skills

An enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of them. We are open to promising candidates who are passionate about their work, are fast learners, and are team players.

  • Strong experience with machine learning and AI including regression, forecasting, time series, cluster analysis, classification, Image recognition, NLP, Text Analytics and Computer Vision.
  • Strong experience with advanced analytics tools for Object-oriented/object function scripting using languages such as Python, or similar.
  • Strong experience with popular database programming languages including SQL.
  • Strong experience in Spark/Pyspark
  • Experience in working in Databricks
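For context on the SQL skill listed above, the bread-and-butter task is aggregation over relational tables. A minimal sketch against an in-memory SQLite database (table name and figures are invented; in the role this would typically run on Spark SQL or Databricks):

```python
# Hedged sketch: a GROUP BY aggregation, the everyday shape of analytics SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 60.0)],
)

# Revenue per region, highest first
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY 2 DESC"
):
    print(region, total)  # north 180.0, then south 80.0
```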

 

What company benefits do you get when you join us?

  • Permanent Work from Home Opportunity
  • Opportunity to work with Business Decision Makers and an internationally based team
  • A work environment that offers limitless learning
  • A culture free of bureaucracy and hierarchy
  • A culture of being open, direct, and with mutual respect
  • A fun, high-caliber team that trusts you and provides the support and mentorship to help you grow
  • The opportunity to work on high-impact business problems that are already defining the future of Marketing and improving real lives

To know more about how we work: https://bit.ly/3Oy6WlE

Whom will you work with?

You will closely work with other Senior Data Scientists and Data Engineers.

Immediate to 15-day Joiners will be preferred.

 

Read more
Top 3 Fintech Startup
Agency job
via Jobdost by Sathish Kumar
Bengaluru (Bangalore)
6 - 9 yrs
₹20L - ₹30L / yr
Amazon Web Services (AWS)
PySpark
SQL
Apache Spark
Python

We are looking for an exceptionally talented Lead Data Engineer who has exposure to implementing AWS services to build data pipelines, API integrations, and data warehouse designs. A candidate with both hands-on and leadership capabilities will be ideal for this position.

 

Qualification: At least a bachelor's degree in Science, Engineering, or Applied Mathematics; a master's degree is preferred.

 

Job Responsibilities:

• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team

• Have minimum 3 years of AWS Cloud experience.

• Well versed in languages such as Python, PySpark, SQL, NodeJS, etc.

• Has extensive experience in Spark ecosystem and has worked on both real time and batch processing

• Have experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step functions, Airflow, RDS, Aurora etc.

• Experience with modern Database systems such as Redshift, Presto, Hive etc.

• Worked on building data lakes in the past on S3 or Apache Hudi

• Solid understanding of Data Warehousing Concepts

• Good to have experience on tools such as Kafka or Kinesis

• Good to have AWS Developer Associate or Solutions Architect Associate Certification

• Have experience in managing a team
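As a conceptual illustration of the extract-transform-load work described above (in this role it would run on AWS Glue/EMR with Spark; this plain-Python stand-in uses invented data and field names):

```python
# Hedged ETL sketch: extract raw records, drop malformed rows while casting
# types, then load into an aggregate. Data and schema are invented.
import csv
import io

raw = "user,amount\nalice,10\nbob,\nalice,5\n"  # "extract": pretend S3 object

def transform(rows):
    # Drop records with a missing amount, cast the rest to int
    for row in rows:
        if row["amount"]:
            yield {"user": row["user"], "amount": int(row["amount"])}

totals = {}
for rec in transform(csv.DictReader(io.StringIO(raw))):  # "load": aggregate
    totals[rec["user"]] = totals.get(rec["user"], 0) + rec["amount"]

print(totals)  # {'alice': 15} — bob's malformed row is dropped
```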

Read more
Top startup of India -  News App
Noida
2 - 5 yrs
₹20L - ₹35L / yr
Linux/Unix
Python
Hadoop
Apache Spark
MongoDB
+4 more
Responsibilities
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional
business requirements.
● Building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Maintain, organize & automate data processes for various use cases.
● Identifying trends, doing follow-up analysis, preparing visualizations.
● Creating daily, weekly and monthly reports of product KPIs.
● Create informative, actionable and repeatable reporting that highlights
relevant business trends and opportunities for improvement.

Required Skills And Experience:
● 2-5 years of work experience in data analytics- including analyzing large data sets.
● BTech in Mathematics/Computer Science
● Strong analytical, quantitative and data interpretation skills.
● Hands-on experience with Python, Apache Spark, Hadoop, NoSQL
databases (MongoDB preferred), and Linux is a must.
● Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Experience with Google Cloud Data Analytics Products such as BigQuery, Dataflow, Dataproc etc. (or similar cloud-based platforms).
● Experience working within a Linux computing environment, and use of
command-line tools including knowledge of shell/Python scripting for
automating common tasks.
● Previous experience working at startups and/or in fast-paced environments.
● Previous experience as a data engineer or in a similar role.
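The shell/Python task-automation skill listed above often amounts to small scripts like the following hedged sketch (paths and the log format are hypothetical):

```python
# Illustration of routine task automation: count ERROR lines per log file.
from pathlib import Path
import tempfile

# Set up a throwaway directory with one invented log file
tmp = Path(tempfile.mkdtemp())
(tmp / "app.log").write_text("INFO start\nERROR db down\nERROR retry\n")

# The automation itself: scan every *.log file for ERROR lines
counts = {
    p.name: sum(1 for line in p.read_text().splitlines() if "ERROR" in line)
    for p in tmp.glob("*.log")
}
print(counts)  # {'app.log': 2}
```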
Read more
Spica Systems

at Spica Systems

1 recruiter
Priyanka Bhattacharya
Posted by Priyanka Bhattacharya
Kolkata
3 - 5 yrs
₹7L - ₹12L / yr
Python
Apache Spark
We are a Silicon Valley based start-up, established in 2019, recognized as experts in building products and providing R&D and software development services in a wide range of leading-edge technologies such as LTE, 5G, Cloud Services (Public – AWS, Azure, GCP; Private – OpenStack) and Kubernetes. We have a highly scalable and secure 5G Packet Core Network, orchestrated by an ML-powered Kubernetes platform, which can be deployed in various multi-cloud modes along with a test tool. Headquartered in San Jose, California, we have our R&D centre in Sector V, Salt Lake, Kolkata.
 

Requirements:

  • Overall 3 to 5 years of experience in designing and implementing complex, large-scale software.
  • Strong Python skills are a must.
  • Experience in Apache Spark, Scala, Java and Delta Lake
  • Experience in designing and implementing templated ETL/ELT data pipelines
  • Expert-level experience in Data Pipeline Orchestration using Apache Airflow for large-scale production deployments
  • Experience in visualizing data from various tasks in the data pipeline using Apache Zeppelin/Plotly or any other visualization library.
  • Log management and log monitoring using ELK/Grafana
  • GitHub integration

 

Technology Stack: Apache Spark, Apache Airflow, Python, AWS, EC2, S3, Kubernetes, ELK, Grafana , Apache Arrow, Java
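Not Airflow itself, but a toy sketch of the core idea behind DAG orchestration mentioned above: tasks declare their dependencies and the scheduler runs them in a valid order. Task names here are invented; only Python's standard library is used:

```python
# Hedged sketch of dependency-ordered execution, the heart of what an
# orchestrator like Apache Airflow does at production scale.
from graphlib import TopologicalSorter  # Python 3.9+

# Each task maps to the set of tasks it depends on
dag = {"transform": {"extract"}, "load": {"transform"}, "report": {"load"}}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'transform', 'load', 'report']
```

Airflow adds scheduling, retries, backfills, and monitoring on top of exactly this dependency model.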

Read more
Chennai
3 - 6 yrs
₹3L - ₹8L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Data modeling
Data Analytics
+2 more

Location:  Chennai
Education: BE/BTech
Experience: Minimum 3+ years of experience as a Data Scientist/Data Engineer

Domain knowledge: Data cleaning, modelling, analytics, statistics, machine learning, AI

Requirements:

  • To be part of Digital Manufacturing and Industrie 4.0 projects across the client's group of companies
  • Design and develop AI/ML models to be deployed across factories
  • Knowledge of Hadoop, Apache Spark, MapReduce, Scala, Python programming, SQL and NoSQL databases is required
  • Should be strong in statistics, data analysis, data modelling, machine learning techniques and Neural Networks
  • Prior experience in developing AI and ML models is required
  • Experience with data from the Manufacturing Industry would be a plus

Roles and Responsibilities:

  • Develop AI and ML models for the Manufacturing Industry with a focus on Energy, Asset Performance Optimization and Logistics
  • Multitasking, good communication necessary
  • Entrepreneurial attitude

Additional Information:

  • Travel:                                  Must be willing to travel on shorter duration within India and abroad
  • Job Location:                      Chennai
  • Reporting to:                      Team Leader, Energy Management System
Read more
ATF lab

at ATF lab

1 recruiter
Priya Goyal
Posted by Priya Goyal
Agra
3 - 5 yrs
₹6L - ₹10L / yr
Java
Scala
Apache Spark
Spark
Hadoop
+1 more
Major Accountabilities

  • Collaborate with the CIO on application architecture and design of our ETL (Extract, Transform, Load) and other aspects of data pipelines. Our stack is built on top of the well-known Spark ecosystem (e.g. Scala, Python, etc.)
  • Periodically evaluate the architectural landscape for efficiencies in our data pipelines and define current state, target state architecture, and transition plans and road maps to achieve the desired architectural state
  • Conduct/lead and implement proofs of concept to prove new technologies in support of the architecture vision and guiding principles (e.g. Flink)
  • Assist in the ideation and execution of architectural principles, guidelines and technology standards that can be leveraged across the team and organization, especially around ETL & data pipelines
  • Promote consistency between all applications leveraging enterprise automation capabilities
  • Provide architectural consultation, support, mentoring, and guidance to project teams, e.g. architects, data scientists, developers, etc.
  • Collaborate with the DevOps Lead on technical features
  • Define and manage work items using Agile methodologies (Kanban, Azure Boards, etc.)
  • Lead Data Engineering efforts (e.g. Scala Spark, PySpark, etc.)

Knowledge & Experience
  • Experienced with Spark, Delta Lake, and Scala to work with petabytes of data (batch and streaming flows)
  • Knowledge of a wide variety of open source technologies including but not limited to: NiFi, Kubernetes, Docker, Hive, Oozie, YARN, Zookeeper, PostgreSQL, RabbitMQ, Elasticsearch
  • A strong understanding of AWS/Azure and/or technology as a service (IaaS, SaaS, PaaS)
  • Strong verbal and written communication skills are a must, as well as the ability to work effectively across internal and external organizations and virtual teams
  • Appreciation of building high-volume, low-latency systems for the API flow
  • Core dev skills (SOLID principles, IoC, 12-factor app, CI/CD, Git)
  • Messaging, microservice architecture, caching (Redis), containerization, performance and load testing, REST APIs
  • Knowledge of HTML, JavaScript frameworks (preferably Angular 2+), TypeScript
  • Appreciation of Python and C# .NET Core or Java
  • Appreciation of global data privacy requirements and cryptography
  • Experience in system testing and automated testing, e.g. unit tests, integration tests, mocking/stubbing
  • Relevant industry and other professional qualifications
  • Tertiary qualifications (degree level)
We are an inclusive employer and welcome applicants from all backgrounds. We pride ourselves on
our commitment to Equality and Diversity and are committed to removing barriers throughout our
hiring process.

Key Requirements

  • Extensive data engineering development experience (e.g., ETL), using well-known stacks (e.g., Scala Spark)
  • Experience in technical leadership positions (or looking to gain experience)
  • Background in software engineering
  • The ability to write technical documentation
  • Solid understanding of virtualization and/or cloud computing technologies (e.g., Docker, Kubernetes)
  • Experience in designing software solutions; enjoys UML and the odd sequence diagram
  • Experience operating within an Agile environment; ability to work independently and with minimum supervision
  • Strong project development management skills, with the ability to successfully manage and prioritize numerous time-pressured analytical projects/work tasks simultaneously
  • Able to pivot quickly and make rapid decisions based on changing needs in a fast-paced environment
  • Works constructively with teams and acts with high integrity
  • Passionate team player with an inquisitive, creative mindset and the ability to think outside the box
Read more
Remote only
5 - 10 yrs
₹10L - ₹20L / yr
Java
J2EE
Spring Boot
Hibernate (Java)
Ansible
+11 more
  • Bachelor’s or master’s degree in Computer Engineering, Computer Science, Computer Applications, Mathematics, Statistics, or related technical field. Relevant experience of at least 3 years in lieu of above if from a different stream of education.

  • Well-versed in, and with 3+ years of hands-on, demonstrable experience in:
    ▪ Stream & Batch Big Data Pipeline Processing using Apache Spark and/or Apache Flink
    ▪ Distributed Cloud-Native Computing including Serverless Functions
    ▪ Relational, Object Store, Document, Graph, etc. Database Design & Implementation
    ▪ Microservices Architecture, API Modeling, Design, & Programming

  • 3+ years of hands-on development experience in Apache Spark using Scala and/or Java.

  • Ability to write executable code for Services using Spark RDD, Spark SQL, Structured Streaming, Spark MLLib, etc. with deep technical understanding of Spark Processing Framework.

  • In-depth knowledge of standard programming languages such as Scala and/or Java.

  • 3+ years of hands-on development experience in one or more libraries & frameworks such as Apache Kafka, Akka, Apache Storm, Apache Nifi, Zookeeper, Hadoop ecosystem (i.e., HDFS, YARN, MapReduce, Oozie & Hive), etc.; extra points if you can demonstrate your knowledge with working examples.

  • 3+ years of hands-on development experience in one or more Relational and NoSQL datastores such as PostgreSQL, Cassandra, HBase, MongoDB, DynamoDB, Elastic Search, Neo4J, etc.

  • Practical knowledge of distributed systems involving partitioning, bucketing, CAP theorem, replication, horizontal scaling, etc.

  • Passion for distilling large volumes of data and analyzing performance, scalability, and capacity issues in Big Data platforms.

  • Ability to clearly distinguish system and Spark Job performances and perform spark performance tuning and resource optimization.

  • Perform benchmarking/stress tests and document the best practices for different applications.

  • Proactively work with tenants on improving the overall performance and ensure the system is resilient, and scalable.

  • Good understanding of Virtualization & Containerization; must demonstrate experience in technologies such as Kubernetes, Istio, Docker, OpenShift, Anthos, Oracle VirtualBox, Vagrant, etc.

  • Well-versed with demonstrable working experience with API Management, API Gateway, Service Mesh, Identity & Access Management, Data Protection & Encryption.

  • Hands-on, demonstrable working experience with DevOps tools and platforms viz., Jira, Git, Jenkins, code quality & security plugins, Maven, Artifactory, Terraform, Ansible/Chef/Puppet, Spinnaker, etc.

  • Well-versed in AWS and/or Azure and/or Google Cloud; must demonstrate experience in at least FIVE (5) services offered under AWS and/or Azure and/or Google Cloud in any of the categories: Compute or Storage, Database, Networking & Content Delivery, Management & Governance, Analytics, Security, Identity & Compliance; or equivalent demonstrable cloud platform experience.

  • Good understanding of Storage, Networks and Storage Networking basics which will enable you to work in a Cloud environment.

  • Good understanding of Network, Data, and Application Security basics which will enable you to work in a Cloud as well as Business Applications / API services environment.
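One of the distributed-systems basics listed above, partitioning, can be sketched in a few lines. This is a hedged, minimal illustration with arbitrary keys and partition count, not any particular system's implementation:

```python
# Hash partitioning: route each key deterministically to one of N partitions,
# the same idea Spark, Kafka, and Cassandra use to distribute data.
import hashlib

def partition(key: str, num_partitions: int) -> int:
    # Hash the key, take the first 4 bytes as an integer, then modulo N
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

keys = ["user-1", "user-2", "user-3"]
assignments = {k: partition(k, 4) for k in keys}
print(assignments)

# The essential property: the same key always lands on the same partition
assert partition("user-1", 4) == assignments["user-1"]
```

Real systems layer replication and rebalancing (e.g. consistent hashing) on top, since a plain modulo reshuffles almost every key when the partition count changes.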

Read more
American Multinational Retail Corp
Chennai
2 - 5 yrs
₹5L - ₹15L / yr
Scala
Spark
Apache Spark

Should have a passion to learn and adapt to new technologies, an understanding of how to solve/troubleshoot issues and risks, the ability to make informed decisions, and the ability to lead projects.

 

Your Qualifications

 

  • 2-5 Years’ Experience with functional programming
  • Experience with functional programming using Scala with Spark framework.
  • Strong understanding of Object-oriented programming, data structures and algorithms
  • Good experience in any of the cloud platforms (Azure, AWS, GCP) etc.,
  • Experience with distributed (multi-tiered) systems, relational databases and NoSql storage solutions
  • Desire to learn new technologies and languages
  • Participation in software design, development, and code reviews
  • High level of proficiency with Computer Science/Software Engineering knowledge and contribution to the technical skills growth of other team members


Your Responsibility

 

  • Design, build and configure applications to meet business process and application requirements
  • Proactively identify and communicate potential issues and concerns and recommend/implement alternative solutions as appropriate.
  • Troubleshooting & Optimization of existing solution

 

Provide advice on technical design to ensure solutions are forward looking and flexible for potential future requirements and business needs.
Read more
Bengaluru (Bangalore)
2 - 6 yrs
₹25L - ₹45L / yr
Data engineering
Data Analytics
Big Data
Apache Spark
airflow
+8 more
● 2+ years of experience in a Data Engineer role.
● Proficiency in Linux.
● Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
● Must have SQL knowledge and experience working with relational databases, query
authoring (SQL) as well as familiarity with databases including Mysql, Mongo, Cassandra,
and Athena.
● Must have experience with Python/Scala.
● Must have experience with Big Data technologies like Apache Spark.
● Must have experience with Apache Airflow.
● Experience with data pipelines and ETL tools like AWS Glue.
Read more
Bengaluru (Bangalore)
8 - 15 yrs
₹25L - ₹60L / yr
Data engineering
Big Data
Spark
Apache Kafka
Cassandra
+20 more
Responsibilities

● Able to contribute to the gathering of functional requirements, developing technical
specifications, and test case planning
● Demonstrating technical expertise, and solving challenging programming and design
problems
● 60% hands-on coding with architecture ownership of one or more products
● Ability to articulate architectural and design options, and educate development teams and
business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release
patches
● Mentor and guide team members
● Work cross-functionally with various Bidgely teams including product management, QA/QE,
various product lines, and/or business units to drive forward results

Requirements
● BS/MS in computer science or equivalent work experience
● 8-12 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with Big data EcoSystems.
● Past experience with Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra,
Kafka, Zookeeper
● Expertise with any of the following Object-Oriented Languages (OOD): Java/J2EE,Scala,
Python
● Ability to lead and mentor technical team members
● Expertise with the entire Software Development Life Cycle (SDLC)
● Excellent communication skills: Demonstrated ability to explain complex technical issues to
both technical and non-technical audiences
● Expertise in the Software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Business Acumen - strategic thinking & strategy development
● Experience on Cloud or AWS is preferable
● Have a good understanding and ability to develop software, prototypes, or proofs of
concepts (POC's) for various Data Engineering requirements.
● Experience with Agile Development, SCRUM, or Extreme Programming methodologies
Read more
Bengaluru (Bangalore)
5 - 8 yrs
₹20L - ₹35L / yr
Big Data
Data engineering
Big Data Engineering
Data Engineer
ETL
+5 more

Data Engineer JD:

  • Designing, developing, constructing, installing, testing and maintaining the complete data management & processing systems.
  • Building highly scalable, robust, fault-tolerant, & secure user data platform adhering to data protection laws.
  • Taking care of the complete ETL (Extract, Transform & Load) process.
  • Ensuring architecture is planned in such a way that it meets all the business requirements.
  • Exploring new ways of using existing data, to provide more insights out of it.
  • Proposing ways to improve data quality, reliability & efficiency of the whole system.
  • Creating data models to reduce system complexity and hence increase efficiency & reduce cost.
  • Introducing new data management tools & technologies into the existing system to make it more efficient.
  • Setting up monitoring and alarming on data pipeline jobs to detect failures and anomalies
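The last bullet, monitoring and alarming on pipeline jobs, often starts as a retry wrapper that records failures. A hedged sketch follows; the job, the alert sink, and the zero back-off are stand-ins for a real scheduler, pager, and delay policy:

```python
# Minimal retry-with-alerting sketch for a flaky pipeline job.
import time

def run_with_retries(job, attempts=3, alerts=None):
    for i in range(attempts):
        try:
            return job()
        except Exception as exc:
            if alerts is not None:
                alerts.append(f"attempt {i + 1} failed: {exc}")
            time.sleep(0)  # real code would back off exponentially here
    raise RuntimeError("job failed after all retries")

# An invented job that fails twice, then succeeds
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("transient error")
    return "ok"

alerts = []
print(run_with_retries(flaky_job, alerts=alerts))  # ok, with 2 alerts recorded
```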

What do we expect from you?

  • BS/MS in Computer Science or equivalent experience
  • 5 years of recent experience in Big Data Engineering.
  • Good experience in working with Hadoop and Big Data technologies like HDFS, Pig, Hive, Zookeeper, Storm, Spark, Airflow and NoSQL systems
  • Excellent programming and debugging skills in Java or Python.
  • Apache Spark, Python, and hands-on experience in deploying ML models
  • Has worked on streaming and realtime pipelines
  • Experience with Apache Kafka or has worked with any of Spark Streaming, Flume or Storm

 
Focus Area:

  • R1: Data Structures & Algorithms
  • R2: Problem Solving + Coding
  • R3: Design (LLD)
 

Read more
Slintel
Agency job
via Qrata by Prajakta Kulkarni
Bengaluru (Bangalore)
4 - 9 yrs
₹20L - ₹28L / yr
Big Data
ETL
Apache Spark
Spark
Data engineer
+5 more
Responsibilities
  • Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse.
  • Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs.
  • Assemble large, complex data sets from third-party vendors to meet business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology.
  • Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems.

Requirements
  • 5+ years of experience in a Data Engineer role.
  • Proficiency in Linux.
  • Must have SQL knowledge and experience working with relational databases, query authoring (SQL) as well as familiarity with databases including Mysql, Mongo, Cassandra, and Athena.
  • Must have experience with Python/Scala.
  • Must have experience with Big Data technologies like Apache Spark.
  • Must have experience with Apache Airflow.
  • Experience with data pipeline and ETL tools like AWS Glue.
  • Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Read more
Simplifai Cognitive Solutions Pvt Ltd
Priyanka Malani
Posted by Priyanka Malani
Pune
2 - 15 yrs
₹10L - ₹30L / yr
Spark
Big Data
Apache Spark
Python
PySpark
+1 more

We are looking for a skilled Senior/Lead Big Data Engineer to join our team. The role is part of the research and development team, where, with enthusiasm and knowledge, you will be our technical evangelist for the development of our inspection technology and products.

 

At Elop we are developing product lines for sustainable infrastructure management using our own patented technology for ultrasound scanners, combining this with other sources to provide a holistic overview of the concrete structure. At Elop we will provide you with world-class colleagues who are highly motivated to position the company as the international standard in structural health monitoring. With the right character, you will be professionally challenged and developed.

This position requires travel to Norway.

 

Elop is a sister company of Simplifai, and the two are co-located in all geographic locations.

https://elop.no/

https://www.simplifai.ai/en/


Roles and Responsibilities

  • Define technical scope and objectives through research and participation in requirements gathering and definition of processes
  • Ingest and Process data from data sources (Elop Scanner) in raw format into Big Data ecosystem
  • Realtime data feed processing using Big Data ecosystem
  • Design, review, implement and optimize data transformation processes in Big Data ecosystem
  • Test and prototype new data integration/processing tools, techniques and methodologies
  • Conversion of MATLAB code into Python/C/C++.
  • Participate in overall test planning for the application integrations, functional areas and projects.
  • Work with cross functional teams in an Agile/Scrum environment to ensure a quality product is delivered.

Desired Candidate Profile

  • Bachelor's degree in Statistics, Computer Science, or equivalent
  • 7+ years of experience in Big Data ecosystem, especially Spark, Kafka, Hadoop, HBase.
  • 7+ years of hands-on experience in Python/Scala is a must.
  • Experience in architecting the big data application is needed.
  • Excellent analytical and problem-solving skills
  • Strong understanding of data analytics and data visualization; must be able to help the development team with visualization of data
  • Experience with signal processing is a plus
  • Experience in working on client-server architecture is a plus
  • Knowledge about database technologies like RDBMS, Graph DB, Document DB, Apache Cassandra, OpenTSDB
  • Good communication skills, written and oral, in English

We can Offer

  • An everyday life with exciting and challenging tasks with the development of socially beneficial solutions
  • Be a part of the company's Research and Development team, creating unique and innovative products
  • Colleagues with world-class expertise, and an organization that has ambitions and is highly motivated to position the company as an international player in maintenance support and monitoring of critical infrastructure!
  • A good working environment with skilled and committed colleagues, and an organization with short decision paths
  • Professional challenges and development
Read more
Remote only
3 - 8 yrs
₹20L - ₹26L / yr
Airflow
Amazon Redshift
Amazon Web Services (AWS)
Java
ETL
+4 more
  • Experience with Cloud native Data tools/Services such as AWS Athena, AWS Glue, Redshift Spectrum, AWS EMR, AWS Aurora, Big Query, Big Table, S3, etc.

 

  • Strong programming skills in at least one of the following languages: Java, Scala, C++.

 

  • Familiarity with a scripting language like Python as well as Unix/Linux shells.

 

  • Comfortable with multiple AWS components including RDS, AWS Lambda, AWS Glue, AWS Athena, EMR. Equivalent tools in the GCP stack will also suffice.

 

  • Strong analytical skills and advanced SQL knowledge, indexing, query optimization techniques.

 

  • Experience implementing software around data processing, metadata management, and ETL pipeline tools like Airflow.

 

Experience with the following software/tools is highly desired:

 

  • Apache Spark, Kafka, Hive, etc.

 

  • SQL and NoSQL databases like MySQL, Postgres, DynamoDB.

 

  • Workflow management tools like Airflow.

 

  • AWS cloud services: RDS, AWS Lambda, AWS Glue, AWS Athena, EMR.

 

  • Familiarity with Spark programming paradigms (batch and stream-processing).

 

  • RESTful API services.
Read more
upGrad

at upGrad

1 video
19 recruiters
Priyanka Muralidharan
Posted by Priyanka Muralidharan
Mumbai, Bengaluru (Bangalore)
8 - 12 yrs
₹40L - ₹60L / yr
Technical Architecture
Technical architect
Java
Go Programming (Golang)
React.js
+10 more
About Us

upGrad is an online education platform building the careers of tomorrow by offering the most industry-relevant programs in an immersive learning experience. Our mission is to create a new digital-first learning experience to deliver tangible career impact to individuals at scale. upGrad currently offers programs in Data Science, Machine Learning, Product Management, Digital Marketing, and Entrepreneurship, etc. upGrad is looking for people passionate about management and education to help design learning programs for working professionals to stay sharp and stay relevant and help build the careers of tomorrow.
  • upGrad was awarded the Best Tech for Education by IAMAI for 2018-19,
  • upGrad was also ranked as one of the LinkedIn Top Startups 2018: The 25 most sought-after startups in India.
  • upGrad was earlier selected as one of the top ten most innovative companies in India by FastCompany.
  • We were also covered by the Financial Times along with other disruptors in Ed-Tech.
  • upGrad is the official education partner for Government of India - Startup India program.
  • Our program with IIIT B has been ranked #1 program in the country in the domain of Artificial Intelligence and Machine Learning.

About the Role

A highly motivated individual who has experience in architecting end-to-end web-based ecommerce/online/SaaS products and systems and bringing them to production quickly and with high quality. Able to understand expected business results and map architecture to drive the business forward. Passionate about building world-class solutions.

Role and Responsibilities

  • Work with Product Managers and Business to understand business/product requirements and vision.
  • Provide a clear architectural vision in line with business and product vision.
  • Lead a team of architects, developers, and data engineers to provide platform services to other engineering teams.
  • Provide architectural oversight to engineering teams across the organization.
  • Hands on design and development of platform services and features owned by self - this is a hands-on coding role.
  • Define guidelines for best practices covering design, unit testing, secure coding etc.
  • Ensure quality by reviewing design, code, test plans, load test plans etc. as appropriate.
  • Work closely with the QA and Support teams to track quality and proactively identify improvement opportunities.
  • Work closely with DevOps and IT to ensure highly secure and cost optimized operations in the cloud.
  • Grow technical skills in the team - identify skill gaps with plans to address them, participate in hiring, mentor other architects and engineers.
  • Support other engineers in resolving complex technical issues as a go-to person.

Skills/Experience
  • 12+ years of experience in design and development of ecommerce scale systems and highly scalable SaaS or enterprise products.
  • Extensive experience in developing extensible and scalable web applications with
    • Java, Spring Boot, Go
    • Web Services - REST, OAuth, OData
    • Database/Caching - MySQL, Cassandra, MongoDB, Memcached/Redis
    • Queue/Broker services - RabbitMQ/Kafka
    • Microservices architecture via Docker on AWS or Azure.
    • Experience with web front end technologies - HTML5, CSS3, JavaScript libraries and frameworks such as jQuery, AngularJS, React, Vue.js, Bootstrap etc.
  • Extensive experience with cloud based architectures and how to optimize design for cost.
  • Expert level understanding of secure application design practices and a working understanding of cloud infrastructure security.
  • Experience with CI/CD processes and design for testability.
  • Experience working with big data technologies such as Spark/Storm/Hadoop/Data Lake Architectures is a big plus.
  • Action and result-oriented problem-solver who works well both independently and as part of a team; able to foster and develop others' ideas as well as his/her own.
  • Ability to organize, prioritize and schedule a high workload and multiple parallel projects efficiently.
  • Excellent verbal and written communication with stakeholders in a matrixed environment.
  • Long term experience with at least one product from inception to completion and evolution of the product over multiple years.
Qualification
B.Tech/MCA (IT/Computer Science) from a premier institution (IIT/NIT/BITS) and/or a US Master's degree in Computer Science.
Read more
Happymonk AI labs

at Happymonk AI labs

8 recruiters
Agency job
via Tritech Solutions by Sushant Hiremath
Bengaluru (Bangalore)
2 - 6 yrs
₹4L - ₹11L / yr
NodeJS (Node.js)
Node
Javascript
React.js
PostgreSQL
+3 more

Work across the full stack, building highly scalable distributed solutions that enable positive user experiences and measurable business growth

Develop new features and infrastructure development in support of rapidly emerging business and project requirements

Assume leadership of new projects from conceptualization to deployment

Ensure application performance, uptime, and scale, maintaining high standards of code quality and thoughtful application design

Work with agile development methodologies, adhering to best practices and pursuing continued learning opportunities

Visualize, design, and develop creative and innovative software platforms, as we continue to experience dramatic growth in the usage and visibility of our products

Create scalable software platforms and applications, and efficient networking solutions that are unit tested, code reviewed and checked regularly for continuous integration

Examine existing systems, identifying flaws and creating solutions to improve service uptime and time-to-resolve through monitoring and automated remediation

Plan and execute full software development life cycles (SDLC) for each assigned project, adhering to company standards and expectations



Special Skills Required

Bachelor’s degree in software engineering or information technology

 

2+ years of experience engineering software and networking platforms

 

2+ years of experience in building large-scale software applications.

 

Proven ability to document design processes, including development, tests, analytics, and troubleshooting

 

Experience with rapid development cycles in a web-based environment

 

Strong scripting and test automation abilities, with the ability to drive a test-driven development model

 

Working knowledge of relational databases as well as ORM frameworks, PostgreSQL, and other SQL technologies

 

Proficiency with Javascript, Typescript, React.js, Babylon, Nodejs, HTML5, CSS3, and order management systems

 

Proven experience designing interactive applications and networking platforms

 

Web application development experience with multiple frameworks and technologies, including Hyperledger (blockchain), Spark, Kafka, Elasticsearch, Neo4j, GraphQL

 

Desire to continue to grow professional capabilities with ongoing training and educational opportunities

 

Additional Knowledge in computer vision, embedded, blockchain technologies a plus

 

Experience Designing and integrating RESTful APIs

 

Excellent debugging and optimization skills

 

Unit/integration testing experience

 

Interest in learning new tools, languages, workflows, and philosophies to grow

 

Professional certifications

 

Location

Bengaluru, Karnataka, India - 560072

Read more
Hammoq

at Hammoq

1 recruiter
Nikitha Muthuswamy
Posted by Nikitha Muthuswamy
Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)
5 - 8 yrs
₹5L - ₹15L / yr
pandas
NumPy
Data engineering
Data Engineer
Apache Spark
+6 more
  • Performs analytics to extract insights from the organization's raw historical data.
  • Generates usable training datasets for MV projects, with the help of annotators where needed.
  • Analyses user trends and identifies their biggest bottlenecks in the Hammoq workflow.
  • Tests the short- and long-term impact of productized MV models on those trends.
  • Skills: NumPy, Pandas, Apache Spark, PySpark and ETL are mandatory.
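The analytics work above largely reduces to aggregating raw event history into per-user or per-stage metrics. A minimal sketch with Pandas, using hypothetical column names (`user_id`, `stage`, `minutes_spent`) for a workflow-bottleneck query:

```python
import pandas as pd

# Hypothetical raw history of time spent per workflow stage.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2],
    "stage": ["photo", "listing", "photo", "listing", "shipping"],
    "minutes_spent": [5, 30, 4, 25, 10],
})

# Average time per stage across all users: the stage with the
# highest mean is the biggest bottleneck in the workflow.
stage_means = events.groupby("stage")["minutes_spent"].mean()
bottleneck = stage_means.idxmax()
print(bottleneck)  # → listing
```

In practice the same groupby would run as a PySpark job once the history outgrows a single machine.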
Read more
DataMetica

at DataMetica

1 video
7 recruiters
Sumangali Desai
Posted by Sumangali Desai
Pune, Hyderabad
7 - 12 yrs
₹7L - ₹20L / yr
Apache Spark
Big Data
Spark
Scala
Hadoop
+3 more
We at Datametica Solutions Private Limited are looking for a Big Data Spark Lead with a passion for the cloud and knowledge of on-premise and cloud data implementations in Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera and Hortonworks.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.

Job Description
Experience : 7+ years
Location : Pune / Hyderabad
Skills :
  • Drive and participate in requirements-gathering workshops, estimation discussions, design meetings and status review meetings
  • Participate and contribute in solution design and solution architecture for implementing Big Data projects on-premise and on the cloud
  • Hands-on technical experience in the design, coding, development and management of large Hadoop implementations
  • Proficient in SQL, Hive, Pig, Spark SQL, shell scripting, Kafka, Flume and Sqoop on large Big Data and Data Warehousing projects, with a Java, Python or Scala based Hadoop programming background
  • Proficient with development methodologies such as waterfall, agile/scrum and iterative
  • Good interpersonal skills and excellent communication skills for US- and UK-based clients

About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.

We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.

Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.


We have our own products!
Eagle –
Data warehouse Assessment & Migration Planning Product
Raven –
Automated Workload Conversion Product
Pelican -
Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
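Automated data validation of this kind typically compares row counts and order-insensitive content fingerprints between the source and target tables. Pelican's internals are not public, so the sketch below is a generic, simplified illustration of the idea:

```python
import hashlib

def table_fingerprint(rows):
    """Order-insensitive fingerprint: hash each row, sum the digests.

    Addition is commutative, so two tables holding the same rows in a
    different physical order produce the same fingerprint.
    """
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode()).digest()
        acc = (acc + int.from_bytes(digest, "big")) % (1 << 256)
    return acc

def validate_migration(source_rows, target_rows):
    """Return (row_counts_match, contents_match) for a migrated table."""
    source_rows, target_rows = list(source_rows), list(target_rows)
    counts_ok = len(source_rows) == len(target_rows)
    contents_ok = table_fingerprint(source_rows) == table_fingerprint(target_rows)
    return counts_ok, contents_ok

src = [(1, "a"), (2, "b"), (3, "c")]
tgt = [(3, "c"), (1, "a"), (2, "b")]   # same rows, different order
print(validate_migration(src, tgt))    # (True, True)
```

A real validator would also check per-column aggregates and report sampled mismatched rows, not just a pass/fail pair.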

Why join us!
Datametica is a place to innovate, bring new ideas to live and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over these years are the key factors in achieving our success.

Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy

Check out more about us on our website below!
www.datametica.com
Read more
A logistic Company
Agency job
via Anzy by Dattatraya Kolangade
Bengaluru (Bangalore)
5 - 7 yrs
₹18L - ₹25L / yr
Data engineering
ETL
SQL
Hadoop
Apache Spark
+13 more
Key responsibilities:
• Create and maintain data pipelines
• Build and deploy ETL infrastructure for optimal data delivery
• Work with various teams, including the product, design and executive teams, to troubleshoot data-related issues
• Create tools for data analysts and scientists to help them build and optimise the product
• Implement systems and processes for data access controls and guarantees
• Distill knowledge from experts in the field outside the org and optimise internal data systems
Preferred qualifications/skills:
• 5+ years of experience
• Strong analytical skills


• Degree in Computer Science, Statistics, Informatics or Information Systems
• Strong project management and organisational skills
• Experience supporting and working with cross-functional teams in a dynamic environment
• SQL guru with hands-on experience on various databases
• NoSQL databases such as Cassandra and MongoDB
• Experience with Snowflake and Redshift
• Experience with tools like Airflow and Hevo
• Experience with Hadoop, Spark, Kafka and Flink
• Programming experience in Python, Java or Scala
Read more
GitHub

at GitHub

4 recruiters
Nataliia Mediana
Posted by Nataliia Mediana
Remote only
3 - 8 yrs
$24K - $60K / yr
ETL
PySpark
Data engineering
Data engineer
athena
+9 more
We are a nascent quant hedge fund; we need to stage financial data and make it easy to run and re-run various preprocessing and ML jobs on it.
- We are looking for an experienced data engineer to join our team.
- The preprocessing involves ETL tasks using PySpark and AWS Glue, staging data in parquet format on S3, and querying it with Athena.

To succeed in this data engineering position, you should care about well-documented, testable code and data integrity. We have devops who can help with AWS permissions.
We would like to build up a consistent data lake with staged, ready-to-use data, and to build up various scripts that will serve as blueprints for various additional data ingestion and transforms.

If you enjoy setting up something which many others will rely on, and have the relevant ETL expertise, we’d like to work with you.
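In production, the staging step described above would typically be a PySpark/AWS Glue job writing date-partitioned parquet to S3 for Athena to query. A framework-free sketch of the core transform, with hypothetical field names:

```python
import csv
import io
from collections import defaultdict

RAW = """ticker,trade_date,price
AAPL,2021-03-01,121.4
AAPL,2021-03-02,125.1
MSFT,2021-03-01,232.4
"""

def stage_by_partition(raw_csv, partition_key="trade_date"):
    """Group cleaned records into partition buckets, mimicking how a
    Spark job writes one parquet prefix per partition value on S3
    (e.g. s3://lake/trades/trade_date=2021-03-01/)."""
    partitions = defaultdict(list)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        row["price"] = float(row["price"])  # type coercion during staging
        partitions[row[partition_key]].append(row)
    return dict(partitions)

staged = stage_by_partition(RAW)
print(sorted(staged))  # ['2021-03-01', '2021-03-02']
```

Partition pruning on that key is what keeps the downstream Athena queries cheap to re-run.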

Responsibilities
- Analyze and organize raw data
- Build data pipelines
- Prepare data for predictive modeling
- Explore ways to enhance data quality and reliability
- Potentially, collaborate with data scientists to support various experiments

Requirements
- Previous experience as a data engineer with the above technologies
Read more
Recko

at Recko

1 recruiter
Agency job
via Zyoin Web Private Limited by Chandrakala M
Bengaluru (Bangalore)
3 - 7 yrs
₹16L - ₹40L / yr
Big Data
Hadoop
Spark
Apache Hive
Data engineering
+6 more

Recko Inc. is looking for data engineers to join our kick-ass engineering team. We are looking for smart, dynamic individuals to connect all the pieces of the data ecosystem.

 

What are we looking for:

  1. 3+ years of development experience in at least one of MySQL, Oracle, PostgreSQL or MSSQL, and experience working with Big Data frameworks, platforms and data stores such as Hadoop, HDFS, Spark, Oozie, Hue, EMR, Scala, Hive, Glue, Kerberos, etc.

  2. Strong experience setting up data warehouses, data modeling, data wrangling and dataflow architecture on the cloud

  3. 2+ years of experience with public cloud services such as AWS, Azure or GCP, and languages like Java/Python, etc.

  4. 2+ years of development experience with Amazon Redshift, Google BigQuery or Azure data warehouse platforms preferred

  5. Knowledge of statistical analysis tools like R, SAS etc 

  6. Familiarity with any data visualization software

  7. A growth mindset and passionate about building things from the ground up and most importantly, you should be fun to work with

As a data engineer at Recko, you will:

  1. Create and maintain optimal data pipeline architecture,

  2. Assemble large, complex data sets that meet functional / non-functional business requirements.

  3. Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

  4. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.

  5. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.

  6. Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.

  7. Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.

  8. Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.

  9. Work with data and analytics experts to strive for greater functionality in our data systems.

 

About Recko: 

Recko was founded in 2017 to organise the world’s transactional information and provide intelligent applications that help finance and product teams make sense of the vast amount of data available. With the proliferation of digital transactions over the past two decades, enterprises, banks and financial institutions are finding it difficult to keep track of the money flowing across their systems. With the Recko Platform, businesses can build, integrate and adapt innovative and complex financial use cases within the organization and across external payment ecosystems with agility, confidence and at scale. Today, customer-obsessed brands such as Deliveroo, Meesho, Grofers, Dunzo and Acommerce use Recko so their finance teams can optimize resources with automation and prioritize growth over repetitive, time-consuming day-to-day operations.

 

Recko is a Series A funded startup, backed by marquee investors like Vertex Ventures, Prime Venture Partners and Locus Ventures. Traditionally enterprise software is always built around functionality. We believe software is an extension of one’s capability, and it should be delightful and fun to use.

 

Working at Recko: 

We believe that great companies are built by amazing people. At Recko, We are a group of young Engineers, Product Managers, Analysts and Business folks who are on a mission to bring consumer tech DNA to enterprise fintech applications. The current team at Recko is 60+ members strong with stellar experience across fintech, e-commerce, digital domains at companies like Flipkart, PhonePe, Ola Money, Belong, Razorpay, Grofers, Jio, Oracle etc. We are growing aggressively across verticals.

Read more
Digital Banking Firm
Agency job
via Qrata by Prajakta Kulkarni
Bengaluru (Bangalore)
5 - 10 yrs
₹20L - ₹40L / yr
Apache Kafka
Hadoop
Spark
Apache Hadoop
Big Data
+5 more
Location - Bangalore (Remote for now)
 
Designation - Sr. SDE (Platform Data Science)
 
About Platform Data Science Team

The Platform Data Science team works at the intersection of data science and engineering. Domain experts develop and advance platforms, including the data platform, the machine learning platform, and other platforms for Forecasting, Experimentation, Anomaly Detection, Conversational AI, Underwriting of Risk, Portfolio Management, Fraud Detection & Prevention and many more. We are also the Data Science and Analytics partners for Product and provide Behavioural Science insights across Jupiter.
 
About the role:

We’re looking for strong Software Engineers who can combine EMR, Redshift, Hadoop, Spark, Kafka, Elasticsearch, TensorFlow, PyTorch and other technologies to build the next-generation Data Platform, ML Platform and Experimentation Platform. If this sounds interesting, we’d love to hear from you!
This role involves designing and developing software products that impact many areas of our business. The individual in this role will help define requirements, create software designs, implement code to those specifications, provide thorough unit and integration testing, and support products while they are deployed and used by our stakeholders.

Key Responsibilities:

Participate, Own & Influence in architecting & designing of systems
Collaborate with other engineers, data scientists, product managers
Build intelligent systems that drive decisions
Build systems that enable us to perform experiments and iterate quickly
Build platforms that enable scientists to train, deploy and monitor models at scale
Build analytical systems that drives better decision making
 

Required Skills:

Programming experience with at least one modern language such as Java, Scala including object-oriented design
Experience in contributing to the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems
Bachelor’s degree in Computer Science or related field
Computer Science fundamentals in object-oriented design
Computer Science fundamentals in data structures
Computer Science fundamentals in algorithm design, problem solving, and complexity analysis
Experience in databases, analytics, big data systems or business intelligence products:
Data lake, data warehouse, ETL, ML platform
Big data tech like: Hadoop, Apache Spark
Read more
SpringML

at SpringML

1 video
2 recruiters
Kayal Vizhi
Posted by Kayal Vizhi
Hyderabad
4 - 11 yrs
₹8L - ₹20L / yr
Big Data
Hadoop
Apache Spark
Spark
Data Structures
+3 more

SpringML is looking to hire a top-notch Senior Data Engineer who is passionate about working with data and using the latest distributed frameworks to process large datasets. Your primary role will be to design and build data pipelines. You will be focused on helping client projects with data integration, data prep and implementing machine learning on datasets. In this role, you will work on some of the latest technologies, collaborate with partners on early wins, take a consultative approach with clients, interact daily with executive leadership, and help build a great company. Chosen team members will be part of the core team and play a critical role in scaling up our emerging practice.

RESPONSIBILITIES:

 

  • Ability to work as a member of a team assigned to design and implement data integration solutions.
  • Build Data pipelines using standard frameworks in Hadoop, Apache Beam and other open-source solutions.
  • Learn quickly – ability to understand and rapidly comprehend new areas – functional and technical – and apply detailed and critical thinking to customer solutions.
  • Propose design solutions and recommend best practices for large scale data analysis

 

SKILLS:

 

  • B.Tech degree in computer science, mathematics or other relevant fields.
  • 4+ years of experience in ETL, data warehousing, visualization and building data pipelines.
  • Strong programming skills: experience and expertise in one of Java, Python, Scala or C.
  • Proficient in big data/distributed computing frameworks such as Apache Spark and Kafka.
  • Experience with agile implementation methodologies.
Read more
SAP company
Mumbai, Navi Mumbai
3 - 8 yrs
₹7L - ₹13L / yr
Data engineering
Apache Kafka
Apache Spark
Hadoop
apache flink
+7 more
Build data systems and pipelines using Apache Flink (or similar) frameworks.
Understand various raw data input formats, build consumers for them on Kafka/ksqlDB, and ingest large amounts of raw data into Flink and Spark.
Conduct complex data analysis and report on results.
Build various aggregation streams for data and convert raw data into various logical processing streams.
Build algorithms to integrate multiple sources of data and create a unified data model from all the sources.
Build a unified data model on both SQL and NoSQL databases to act as the data sink.
Communicate the designs effectively with the full-stack engineering team for development.
Explore machine learning models that can be fitted on top of the data pipelines.
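At their core, the aggregation streams described above are keyed, windowed folds over an event stream. A plain-Python sketch of the logic a Flink or Spark Structured Streaming job would run (the event fields are hypothetical):

```python
from collections import defaultdict

def keyed_window_counts(events, window_secs=60):
    """Count events per (key, tumbling window), the building block of a
    keyed windowed aggregation in Flink or Spark Structured Streaming."""
    counts = defaultdict(int)
    for ts, key in events:
        # Assign each event to the tumbling window containing its timestamp.
        window_start = (ts // window_secs) * window_secs
        counts[(key, window_start)] += 1
    return dict(counts)

# (epoch_seconds, device_id) pairs, as they might arrive from Kafka.
stream = [(0, "a"), (30, "a"), (61, "a"), (61, "b")]
print(keyed_window_counts(stream))
# {('a', 0): 2, ('a', 60): 1, ('b', 60): 1}
```

Flink adds the hard parts this sketch ignores: event-time watermarks, late data, and fault-tolerant operator state.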

Mandatory Qualifications Skills:

Deep knowledge of Scala and Java programming languages is mandatory
Strong background in streaming data frameworks (Apache Flink, Apache Spark) is mandatory
Good understanding of, and hands-on skills with, streaming messaging platforms such as Kafka
Familiarity with R, C and Python is an asset
Analytical mind and business acumen with strong math skills (e.g. statistics, algebra)
Problem-solving aptitude
Excellent communication and presentation skills
Read more
Series 'A' funded Silicon Valley based BI startup
Bengaluru (Bangalore)
4 - 6 yrs
₹30L - ₹45L / yr
Data engineering
Data Engineer
Scala
Data Warehouse (DWH)
Big Data
+7 more
A leader in capturing technographics-powered buying intent, it helps companies uncover the 3% of active buyers in their target market. It evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market and sales intelligence. Its customers have access to the buying patterns and contact information of more than 17 million companies and 70 million decision makers across the world.

Role – Data Engineer

Responsibilities

• Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for the Data Lake/Data Warehouse.
• Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs.
• Assemble large, complex data sets from third-party vendors to meet business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology.
• Streamline existing, and introduce enhanced, reporting and analysis solutions that leverage complex data sources derived from multiple internal systems.

Requirements
• 5+ years of experience in a Data Engineer role.
• Proficiency in Linux.
• Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, Mongo, Cassandra, and Athena.
• Must have experience with Python/Scala.
• Must have experience with Big Data technologies like Apache Spark.
• Must have experience with Apache Airflow.
• Experience with data pipeline and ETL tools like AWS Glue.
• Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Read more
Prescience Decision Solutions
Shivakumar K
Posted by Shivakumar K
Bengaluru (Bangalore)
3 - 7 yrs
₹10L - ₹20L / yr
Big Data
ETL
Spark
Apache Kafka
Apache Spark
+4 more

The Data Engineer would be responsible for selecting and integrating the Big Data tools and frameworks required, and would implement data ingestion and ETL/ELT processes.

Required Experience, Skills and Qualifications:

  • Hands-on experience with Big Data tools/technologies like Spark, Databricks, MapReduce, Hive, HDFS.
  • Expertise and excellent understanding of the big data toolset, such as Sqoop, Spark Streaming, Kafka, NiFi.
  • Proficiency in any of Python, Scala or Java, with 4+ years’ experience.
  • Experience with cloud infrastructures like MS Azure, Data Lake, etc.
  • Good working knowledge of NoSQL DBs (Mongo, HBase, Cassandra).
Read more
Startup Focused on simplifying Buying Intent
Bengaluru (Bangalore)
4 - 9 yrs
₹28L - ₹56L / yr
Big Data
Apache Spark
Spark
Hadoop
ETL
+7 more
• 5+ years of experience in a Data Engineer role.
• Proficiency in Linux.
• Must have SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with databases including MySQL, Mongo, Cassandra, and Athena.
• Must have experience with Python/Scala.
• Must have experience with Big Data technologies like Apache Spark.
• Must have experience with Apache Airflow.
• Experience with data pipeline and ETL tools like AWS Glue.
• Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
Read more
jhjkhhk
Agency job
via CareerBabu by Tanisha Takkar
Bengaluru (Bangalore)
2 - 5 yrs
₹10L - ₹40L / yr
Apache Spark
Big Data
Java
Spring
Data Structures
+5 more
  • Owns the end-to-end implementation of the assigned data processing components/product features, i.e. design, development, deployment, and testing of the data processing components and associated flows, conforming to best coding practices

  • Creation and optimization of data engineering pipelines for analytics projects. 

  • Support data and cloud transformation initiatives 

  • Contribute to our cloud strategy based on prior experience 

  • Independently work with all stakeholders across the organization to deliver enhanced functionalities 

  • Create and maintain automated ETL processes with a special focus on data flow, error recovery, and exception handling and reporting 

  • Gather and understand data requirements, work in the team to achieve high-quality data ingestion and build systems that can process the data, transform the data 

  • Be able to comprehend the application of database index and transactions 

  • Be involved in the design and development of a Big Data predictive analytics SaaS-based customer data platform using object-oriented analysis, design and programming skills, and design patterns 

  • Implement ETL workflows for data matching, data cleansing, data integration, and management 

  • Maintain existing data pipelines, and develop new data pipeline using big data technologies 

  • Responsible for leading the effort of continuously improving reliability, scalability, and stability of microservices and platform
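The data matching and cleansing workflows above usually normalize records first and then collapse duplicates on a cleaned key. A simplified sketch with hypothetical customer fields:

```python
import re

def normalize(record):
    """Canonicalize fields so trivially different records compare equal."""
    return {
        "email": record["email"].strip().lower(),
        "name": re.sub(r"\s+", " ", record["name"].strip()).title(),
    }

def dedupe(records, key="email"):
    """Keep the first record per normalized key (a simple exact-match
    rule; real pipelines often score fuzzy matches instead)."""
    seen, out = set(), []
    for rec in map(normalize, records):
        if rec[key] not in seen:
            seen.add(rec[key])
            out.append(rec)
    return out

raw = [
    {"email": " Ada@Example.com ", "name": "ada  lovelace"},
    {"email": "ada@example.com",  "name": "Ada Lovelace"},
    {"email": "g@example.com",    "name": "grace hopper"},
]
print(dedupe(raw))  # two records: ada@example.com and g@example.com
```

Production pipelines often replace the exact-key rule with fuzzy scoring (e.g. edit distance on names) before merging records.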

Read more
Get to hear about interesting companies hiring right now
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.
Find more jobs