Data Engineer (Azure)
Posted by Siddarth Thakur
3 - 8 yrs
₹15L - ₹20L / yr
Remote only
Skills
Spark
Hadoop
Big Data
Data engineering
PySpark
Windows Azure
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
SQL
NoSQL Databases
Apache Kafka

Title: Data Engineer (Azure) (Location: Gurgaon/Hyderabad)

Salary: Competitive as per Industry Standard

We are expanding our Data Engineering Team and hiring passionate professionals with extensive knowledge and experience in building and managing large enterprise data and analytics platforms. We are looking for creative individuals with strong programming skills who can understand complex business and architectural problems and develop solutions. The individual will work closely with the rest of our data engineering and data science team to implement and manage scalable smart data lakes, data ingestion platforms, machine learning and NLP-based analytics platforms, hyper-scale processing clusters, data mining, and search engines.

What You’ll Need:

  • 3+ years of industry experience in creating and managing end-to-end data solutions, optimal data processing pipelines, and architectures dealing with large-volume big data sets of varied data types.
  • Proficiency in Python, Linux, and shell scripting.
  • Strong knowledge of working with PySpark dataframes and Pandas dataframes for writing efficient pre-processing and other data manipulation tasks.
  • Strong experience in developing the infrastructure required for data ingestion and for optimal extraction, transformation, and loading of data from a wide variety of data sources using tools like Azure Data Factory and Azure Databricks (or Jupyter notebooks/Google Colab, or other similar tools).

  • Working knowledge of GitHub or other version control tools.
  • Experience with creating RESTful web services and API platforms.
  • Work with data science and infrastructure team members to implement practical machine learning solutions and pipelines in production.

  • Experience with cloud providers like Azure/AWS/GCP.
  • Experience with SQL and NoSQL databases: MySQL, Azure Cosmos DB, HBase, MongoDB, Elasticsearch, etc.
  • Experience with stream-processing systems (Spark Streaming, Kafka, etc.) and working experience with event-driven architectures.
  • Strong analytic skills related to working with unstructured datasets.
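To make the dataframe bullet above concrete: the pre-processing described there (null handling, de-duplication, type coercion) is normally written against PySpark or Pandas dataframes. The sketch below shows the same steps using only the Python standard library so it stays self-contained; the column names and sample data are invented for illustration.

```python
import csv
import io

def preprocess(raw_csv):
    """Clean a raw CSV extract: drop rows with a missing key,
    de-duplicate on the key, and coerce the numeric column."""
    seen, clean = set(), []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        if not row.get("id"):        # null handling: key is required
            continue
        if row["id"] in seen:        # de-duplication on the key
            continue
        seen.add(row["id"])
        try:                         # type coercion with a fallback
            amount = float(row.get("amount") or 0)
        except ValueError:
            amount = 0.0
        clean.append({"id": row["id"], "amount": amount})
    return clean

sample = "id,amount\n1,10.5\n,3\n1,99\n2,abc\n"
print(preprocess(sample))  # [{'id': '1', 'amount': 10.5}, {'id': '2', 'amount': 0.0}]
```

In PySpark the equivalent would typically be a chain of `dropna`, `dropDuplicates`, and `withColumn` cast operations pushed down to the cluster.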

 

Good to have (to filter or prioritize candidates)

  • Experience with testing libraries such as pytest for writing unit tests for the developed code.
  • Knowledge of machine learning algorithms and libraries would be good to have; implementation experience would be an added advantage.
  • Knowledge and experience of data lakes, Docker, and Kubernetes would be good to have.
  • Knowledge of Azure Functions, Elasticsearch, etc. will be good to have.
  • Experience with model versioning (e.g. MLflow) and data versioning will be beneficial.
  • Experience with microservices libraries or with Python libraries such as Flask for hosting ML services and models would be great.
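On the pytest bullet above: pytest collects plain functions named `test_*` and treats each bare `assert` as a pass/fail check, so unit tests for small pipeline helpers stay short. The function under test here is made up for the example, not part of any real codebase.

```python
def normalize_ids(ids):
    """Strip whitespace, lowercase, and drop empty values -- the kind
    of small transformation worth covering with a unit test."""
    return [i.strip().lower() for i in ids if i and i.strip()]

# pytest discovers any function named test_*; each bare assert
# becomes a check in the test report.
def test_normalize_ids_strips_and_lowercases():
    assert normalize_ids([" A1 ", "b2"]) == ["a1", "b2"]

def test_normalize_ids_drops_empty_values():
    assert normalize_ids(["", "  ", "c3"]) == ["c3"]

# Also runnable without pytest installed:
test_normalize_ids_strips_and_lowercases()
test_normalize_ids_drops_empty_values()
print("all tests passed")
```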

About Scry AI

Founded: 2014
Stage: Profitable

Scry AI invents, designs, and develops cutting-edge technology-based Enterprise solutions powered by Machine Learning, Natural Language Processing, Big Data, and Computer Vision.


Scry AI is an R&D organization leading innovation in business automation technology and has been helping companies and businesses transform how they work.


Catering to core industries like Fintech, Healthcare, Communication, Mobility, and Smart Cities, Scry has invested heavily in R&D to build cutting-edge product suites that address challenges and roadblocks that plague traditional business environments.


Similar jobs

Episource
Ahamed Riaz
Posted by Ahamed Riaz
Mumbai
5 - 12 yrs
₹18L - ₹30L / yr
Big Data
Python
Amazon Web Services (AWS)
Serverless
DevOps
+4 more

ABOUT EPISOURCE:


Episource has devoted more than a decade to building solutions for risk adjustment to measure healthcare outcomes. As one of the leading companies in healthcare, we have helped numerous clients optimize their medical records, data, and analytics to enable better documentation of care for patients with chronic diseases.


The backbone of our consistent success has been our obsession with data and technology. At Episource, all of our strategic initiatives start with the question - how can data be “deployed”? Our analytics platforms and datalakes ingest huge quantities of data daily, to help our clients deliver services. We have also built our own machine learning and NLP platform to infuse added productivity and efficiency into our workflow. Combined, these build a foundation of tools and practices used by quantitative staff across the company.


What’s our poison you ask? We work with most of the popular frameworks and technologies like Spark, Airflow, Ansible, Terraform, Docker, ELK. For machine learning and NLP, we are big fans of keras, spacy, scikit-learn, pandas and numpy. AWS and serverless platforms help us stitch these together to stay ahead of the curve.


ABOUT THE ROLE:


We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP driven medical coding. Some of the problems include improving our ICD code recommendations, clinical named entity recognition, improving patient health, clinical suspecting and information extraction from clinical notes.


This is a role for highly technical data engineers who combine outstanding oral and written communication skills with the ability to code up prototypes and productionize them using a large range of tools, algorithms, and languages. Most importantly, they need the ability to autonomously plan and organize their work assignments based on high-level team goals.


You will be responsible for setting an agenda to develop and ship data-driven architectures that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company and help build a foundation of tools and practices used by quantitative staff across the company.


During the course of a typical day with our team, expect to work on one or more projects around the following:


1. Create and maintain optimal data pipeline architectures for ML

2. Develop a strong API ecosystem for ML pipelines

3. Build CI/CD pipelines for ML deployments using GitHub Actions, Travis, Terraform and Ansible

4. Design and develop distributed, high-volume, high-velocity multi-threaded event processing systems

5. Apply software engineering best practices across the development lifecycle: coding standards, code reviews, source management, build processes, testing, and operations

6. Deploy data pipelines in production using Infrastructure-as-Code platforms

7. Design scalable implementations of the models developed by our Data Science teams

8. Big data and distributed ML with PySpark on AWS EMR, and more!
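Item 4 above mentions multi-threaded event processing; the core pattern behind such systems (a thread-safe queue feeding a worker pool, with locked aggregation) can be sketched with the standard library. The event shape and counts here are invented for illustration:

```python
import queue
import threading
from collections import Counter

def process_events(events, workers=4):
    """Fan events out to a pool of worker threads and aggregate
    per-type counts safely under a lock."""
    q = queue.Queue()
    counts = Counter()
    lock = threading.Lock()

    def worker():
        while True:
            event = q.get()
            if event is None:          # sentinel: shut this worker down
                q.task_done()
                return
            with lock:                 # thread-safe aggregation
                counts[event["type"]] += 1
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for e in events:                   # producer side
        q.put(e)
    for _ in threads:                  # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return dict(counts)

events = [{"type": "claim"}] * 3 + [{"type": "chart"}] * 2
print(process_events(events))  # counts are deterministic: 3 claims, 2 charts
```

A production system would replace the in-process queue with a broker such as Kafka or Kinesis, but the backpressure and aggregation ideas are the same.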



BASIC REQUIREMENTS 


  1.  Bachelor’s degree or greater in Computer Science, IT or related fields

  2.  Minimum of 5 years of experience in cloud, DevOps, MLOps & data projects

  3. Strong experience with Bash scripting, Unix environments, and building scalable/distributed systems

  4. Experience with automation/configuration management using Ansible, Terraform, or equivalent

  5. Very strong experience with AWS and Python

  6. Experience building CI/CD systems

  7. Experience with containerization technologies like Docker, Kubernetes, ECS, EKS or equivalent

  8. Ability to build and manage application and performance monitoring processes

Kloud9 Technologies
Remote only
8 - 14 yrs
₹35L - ₹45L / yr
Google Cloud Platform (GCP)
Agile/Scrum
SQL
Python
Apache Kafka
+1 more

Senior Data Engineer


Responsibilities:

●      Clean, prepare and optimize data at scale for ingestion and consumption by machine learning models

●      Drive the implementation of new data management projects and re-structure of the current data architecture

●      Implement complex automated workflows and routines using workflow scheduling tools 

●      Build continuous integration, test-driven development and production deployment frameworks

●      Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards 

●      Anticipate, identify and solve issues concerning data management to improve data quality

●      Design and build reusable components, frameworks and libraries at scale to support machine learning products 

●      Design and implement product features in collaboration with business and Technology stakeholders 

●      Analyze and profile data for the purpose of designing scalable solutions 

●      Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues

●      Mentor and develop other data engineers in adopting best practices 

●      Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
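The workflow-scheduling responsibility above refers to tools like Airflow or Kubeflow; at their core these order tasks by dependency. A toy standard-library sketch of that dependency-ordering idea (the task names are invented) might look like:

```python
from graphlib import TopologicalSorter

# Task graph: each key lists the tasks it depends on -- the same
# dependency structure an Airflow or Kubeflow DAG encodes.
dag = {
    "extract": [],
    "clean": ["extract"],
    "features": ["clean"],
    "train": ["features"],
    "report": ["clean"],
}

def run_pipeline(dag):
    """Execute tasks in an order that respects every dependency."""
    order = list(TopologicalSorter(dag).static_order())
    for task in order:
        # A real scheduler would dispatch each task to an executor,
        # retry on failure, and record state; here we just log it.
        print("running", task)
    return order

order = run_pipeline(dag)
```

Real schedulers add retries, backfills, and distributed executors on top of exactly this topological ordering.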






Qualifications:

●      8+ years of experience developing scalable Big Data applications or solutions on distributed platforms

●      Experience with Google Cloud Platform (GCP); experience with other cloud platform tools is good to have

●      Experience working with Data warehousing tools, including DynamoDB, SQL, and Snowflake

●      Experience architecting data products in Streaming, Serverless and Microservices Architecture and platform.

●      Experience with Spark (Scala/Python/Java) and Kafka

●      Work experience using Databricks (Data Engineering and Delta Lake components)

●      Experience working with Big Data platforms, including Dataproc, Databricks, etc.

●      Experience working with distributed technology tools including Spark, Presto, Databricks, Airflow

●      Working knowledge of Data warehousing, Data modeling


●      Experience working in Agile and Scrum development process

●      Bachelor's degree in Computer Science, Information Systems, Business, or other relevant subject area


Role: Senior Data Engineer

Total No. of Years: 8+ years of relevant experience

To be onboarded by: Immediate

Notice Period:

Skills (with mandatory/desirable flag and min to max years of project experience):

• GCP exposure: Mandatory, min 3 to 7 years
• BigQuery, Dataflow, Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, Spark and PySpark: Mandatory, min 5 to 9 years
• Relational SQL: Mandatory, min 4 to 8 years
• Shell scripting language: Mandatory, min 4 to 8 years
• Python/Scala language: Mandatory, min 4 to 8 years
• Airflow/Kubeflow workflow scheduling tool: Mandatory, min 3 to 7 years
• Kubernetes: Desirable, 1 to 6 years
• Scala: Mandatory, min 2 to 6 years
• Databricks: Desirable, min 1 to 6 years
• Google Cloud Functions: Mandatory, min 2 to 6 years
• GitHub source control tool: Mandatory, min 4 to 8 years
• Machine Learning: Desirable, 1 to 6 years
• Deep Learning: Desirable, min 1 to 6 years
• Data structures and algorithms: Mandatory, min 4 to 8 years

Agiletech Info Solutions pvt ltd
Chennai
4 - 8 yrs
₹4L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
Spark
SQL
+1 more
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional/non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures and data sets.
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
• Strong analytic skills related to working with unstructured datasets.
• Experience building processes supporting data transformation, data structures, metadata, dependency and workload management.
• A successful history of manipulating, processing and extracting value from large disconnected datasets.
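As a minimal illustration of the ETL and SQL skills listed above, the round trip below extracts raw rows, pushes the transform down to the database as an aggregate query, and loads the result into a target table. It uses an in-memory SQLite database and invented table names; a production pipeline would target a warehouse such as Redshift or Snowflake.

```python
import sqlite3

# Extract: land raw rows, including one bad record with a NULL key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO raw_orders VALUES
        (1, 'south', 10.0), (2, 'south', 5.0),
        (3, 'north', 7.5), (4, NULL, 99.0);
""")

# Transform + load: filter out bad rows and aggregate per region,
# the kind of work an ETL step pushes down to the database engine.
conn.execute("""
    CREATE TABLE region_totals AS
    SELECT region, SUM(amount) AS total, COUNT(*) AS orders
    FROM raw_orders
    WHERE region IS NOT NULL
    GROUP BY region
""")

rows = conn.execute(
    "SELECT region, total, orders FROM region_totals ORDER BY region"
).fetchall()
print(rows)  # [('north', 7.5, 1), ('south', 15.0, 2)]
```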
InnovAccer
Jyoti Kaushik
Posted by Jyoti Kaushik
Noida, Bengaluru (Bangalore), Pune, Hyderabad
4 - 7 yrs
₹4L - ₹16L / yr
ETL
SQL
Data Warehouse (DWH)
Informatica
Datawarehousing
+2 more

We are looking for a Senior Data Engineer to join the Customer Innovation team, who will be responsible for acquiring, transforming, and integrating customer data onto our Data Activation Platform from customers’ clinical, claims, and other data sources. You will work closely with customers to build data and analytics solutions to support their business needs, and be the engine that powers the partnership that we build with them by delivering high-fidelity data assets.

In this role, you will work closely with our Product Managers, Data Scientists, and Software Engineers to build the solution architecture that will support customer objectives. You'll work with some of the brightest minds in the industry, work with one of the richest healthcare data sets in the world, use cutting-edge technology, and see your efforts affect products and people on a regular basis. The ideal candidate is someone that

  • Has healthcare experience and is passionate about helping heal people,
  • Loves working with data,
  • Has an obsessive focus on data quality,
  • Is comfortable with ambiguity and making decisions based on available data and reasonable assumptions,
  • Has strong data interrogation and analysis skills,
  • Defaults to written communication and delivers clean documentation, and,
  • Enjoys working with customers and problem solving for them.
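The "obsessive focus on data quality" listed above usually starts with simple batch profiling: counting missing required fields and duplicate identifiers before data is loaded. A small sketch follows; the field names are invented (loosely healthcare-flavored) and not part of any real schema:

```python
def quality_report(records, required=("patient_id", "claim_date")):
    """Profile a batch of records for common data-quality problems:
    missing required fields and duplicate identifiers."""
    missing = sum(
        1 for r in records
        if any(not r.get(field) for field in required)
    )
    ids = [r.get("patient_id") for r in records if r.get("patient_id")]
    duplicates = len(ids) - len(set(ids))
    return {"rows": len(records),
            "missing_required": missing,
            "duplicate_ids": duplicates}

batch = [
    {"patient_id": "p1", "claim_date": "2023-01-02"},
    {"patient_id": "p1", "claim_date": "2023-01-03"},
    {"patient_id": "", "claim_date": "2023-01-04"},
]
print(quality_report(batch))  # {'rows': 3, 'missing_required': 1, 'duplicate_ids': 1}
```

In practice a report like this gates the load: batches exceeding a threshold get quarantined for investigation rather than delivered to the customer.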

A day in the life at Innovaccer:

  • Define the end-to-end solution architecture for projects by mapping customers’ business and technical requirements against the suite of Innovaccer products and Solutions.
  • Measure and communicate impact to our customers.
  • Enable customers to activate data themselves using SQL, BI tools, or APIs to solve questions they have at speed.

What You Need:

  • 4+ years of experience in a Data Engineering role, and a Graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field.
  • 4+ years of experience working with relational databases like Snowflake, Redshift, or Postgres.
  • Intermediate to advanced level SQL programming skills.
  • Data Analytics and Visualization (using tools like PowerBI)
  • The ability to engage with both the business and technical teams of a client - to document and explain technical problems or concepts in a clear and concise way.
  • Ability to work in a fast-paced and agile environment.
  • Easily adapt and learn new things whether it’s a new library, framework, process, or visual design concept.

What we offer:

  • Industry certifications: We want you to be a subject matter expert in what you do. So, whether it’s our product or our domain, we’ll help you dive in and get certified.
  • Quarterly rewards and recognition programs: We foster learning and encourage people to take risks. We recognize and reward your hard work.
  • Health benefits: We cover health insurance for you and your loved ones.
  • Sabbatical policy: We encourage people to take time off and rejuvenate, learn new skills, and pursue their interests so they can generate new ideas with Innovaccer.
  • Pet-friendly office and open floor plan: No boring cubicles.
Read more
RealPage, Inc.
Agency job
via Beyond Pinks by Shailaja Maddala
Hyderabad
4 - 7 yrs
₹5L - ₹15L / yr
SQL
Data Analytics
SQL Azure
DevOps
JIRA
+2 more
• Experience with running SQL queries and data reporting
• Experience with Agile development and software such as Azure DevOps or JIRA; Product Owner certification is a plus
• Experience with global teams
• Bachelor's degree required; CS degree preferred
Cloud infrastructure solutions and support company. (SE1)
Agency job
via Multi Recruit by Ranjini A R
Pune
2 - 6 yrs
₹12L - ₹16L / yr
SQL
ETL
Data engineering
Big Data
Java
+2 more
  • Design, create, test, and maintain data pipeline architecture in collaboration with the Data Architect.
  • Build the infrastructure required for extraction, transformation, and loading of data from a wide variety of data sources using Java, SQL, and Big Data technologies.
  • Support the translation of data needs into technical system requirements. Support in building complex queries required by the product teams.
  • Build data pipelines that clean, transform, and aggregate data from disparate sources
  • Develop, maintain and optimize ETLs to increase data accuracy, data stability, data availability, and pipeline performance.
  • Engage with Product Management and Business to deploy and monitor products/services on cloud platforms.
  • Stay up-to-date with advances in data persistence and big data technologies and run pilots to design the data architecture to scale with the increased data sets of consumer experience.
  • Handle data integration, consolidation, and reconciliation activities for digital consumer / medical products.

Job Qualifications:

  • Bachelor’s or master's degree in Computer Science, Information management, Statistics or related field
  • 5+ years of experience in the Consumer or Healthcare industry in an analytical role with a focus on building data pipelines, querying data, analyzing, and clearly presenting analyses to members of the data science team.
  • Technical expertise with data models and data mining.
  • Hands-on knowledge of programming languages such as Java, Python, R, and Scala.
  • Strong knowledge of big data tools like Snowflake, AWS Redshift, Hadoop, MapReduce, etc.
  • Knowledge of tools like AWS Glue, S3, AWS EMR, streaming data pipelines, and Kafka/Kinesis is desirable.
  • Hands-on knowledge of SQL and NoSQL database design.
  • Knowledge of CI/CD for building and hosting solutions.
  • AWS certification is an added advantage.
  • Strong knowledge of visualization tools like Tableau and QlikView is an added advantage.
  • A team player capable of working and integrating across cross-functional teams for implementing project requirements. Experience in technical requirements gathering and documentation.
  • Ability to work effectively and independently in a fast-paced agile environment with tight deadlines
  • A flexible, pragmatic, and collaborative team player with the innate ability to engage with data architects, analysts, and scientists
MNC
Agency job
via Fragma Data Systems by geeti gaurav mohanty
Bengaluru (Bangalore)
3 - 5 yrs
₹6L - ₹12L / yr
Spark
Big Data
Data engineering
Hadoop
Apache Kafka
+5 more
Data Engineer

• Drive the data engineering implementation
• Strong experience in building data pipelines
• AWS stack experience is a must
• Deliver conceptual, logical and physical data models for the implementation teams
• Strong SQL is a must: advanced SQL working knowledge and experience working with a variety of relational databases, SQL query authoring
• AWS cloud data pipeline experience is a must: data pipelines and data-centric applications using distributed storage platforms like S3 and distributed processing platforms like Spark, Airflow, Kafka
• Working knowledge of AWS technologies such as S3, EC2, EMR, RDS, Lambda, Elasticsearch
• Ability to use a major programming language (e.g. Python/Java) to process data for modelling
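The last bullet, using a major programming language to process data for modelling, can be illustrated with the map/shuffle/reduce flow that engines like Spark distribute across a cluster. Here the "partitions" are just in-memory lists and the log levels are invented:

```python
from itertools import chain

# Toy partitions of a log dataset; Spark would hold these on
# different executors across the cluster.
partitions = [
    ["error", "ok", "ok"],
    ["error", "warn"],
]

def map_partition(part):
    """Map side: emit (key, 1) pairs, independently per partition."""
    return [(level, 1) for level in part]

def reduce_counts(pairs):
    """Reduce side: combine pair counts by key after the 'shuffle'."""
    out = {}
    for key, n in pairs:
        out[key] = out.get(key, 0) + n
    return out

mapped = [map_partition(p) for p in partitions]
counts = reduce_counts(chain.from_iterable(mapped))
print(counts)  # {'error': 2, 'ok': 2, 'warn': 1}
```

The same computation in PySpark would be roughly `rdd.map(lambda x: (x, 1)).reduceByKey(add)`, with the framework handling partitioning and the shuffle.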
Indium Software
Mohammed Shabeer
Posted by Mohammed Shabeer
Remote only
2 - 3 yrs
₹5L - ₹8L / yr
Data Analytics
data analyst
Apache Synapse
SQL
SAP MDG (Master Data Governance)
+1 more
The Data Analyst in the CoE will provide end-to-end solution development, working in conjunction with the Domain Leads and Technology Partners, and is responsible for the delivery of solutions and solution changes driven by the business requirements, as well as providing technical and development capabilities.
Numantra Technologies
nisha mattas
Posted by nisha mattas
Remote, Mumbai, Powai
2 - 12 yrs
₹8L - ₹18L / yr
ADF
PySpark
Jupyter Notebook
Big Data
Windows Azure
+3 more
  • Data pre-processing, data transformation, data analysis, and feature engineering
  • Performance optimization of scripts (code) and productionizing of code (SQL, Pandas, Python or PySpark, etc.)

Required skills:

  • Bachelor's degree in Computer Science, Data Science, Computer Engineering, IT or equivalent
  • Fluency in Python (Pandas), PySpark, SQL, or similar
  • Azure Data Factory experience (minimum 12 months)
  • Able to write efficient code using traditional and OO concepts and modular programming, following the SDLC process
  • Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
  • Ability to work independently, with demonstrated experience in project or program management
  • Azure experience: ability to translate data scientist code in Python and make it efficient (production-ready) for cloud deployment
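For the performance-tracing requirement above, a first step is often just instrumenting each pipeline stage with wall-clock timings so slow stages can be found by root cause analysis. A minimal decorator sketch follows; the `transform` function is an invented stand-in for a real pipeline stage:

```python
import functools
import time

def traced(fn):
    """Record wall-clock duration of every call -- a minimal stand-in
    for end-to-end performance tracing of a pipeline stage."""
    timings = []

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:  # record even when the stage raises
            timings.append(time.perf_counter() - start)

    wrapper.timings = timings  # expose collected durations
    return wrapper

@traced
def transform(rows):
    return [r * 2 for r in rows]

transform([1, 2, 3])
transform([4])
print(len(transform.timings))  # 2 calls recorded so far
```

Production tracing would export these spans to a collector (e.g. OpenTelemetry-style) instead of a list, but the instrumentation point is the same.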
 
Product Based MNC
Agency job
via I Squaresoft by Madhusudhan R
Remote, Bengaluru (Bangalore)
5 - 9 yrs
₹5L - ₹20L / yr
Apache Spark
Python
Amazon Web Services (AWS)
SQL

 

Job Description

The role requires experience with AWS as well as programming experience in Python and Spark.

Roles & Responsibilities

You Will:

  • Translate functional requirements into technical design
  • Interact with clients and internal stakeholders to understand the data and platform requirements in detail and determine the core cloud services needed to fulfil the technical design
  • Design, develop and deliver data integration interfaces in AWS
  • Design, develop and deliver data provisioning interfaces to fulfil consumption needs
  • Deliver data models on a cloud platform, e.g. AWS Redshift or SQL
  • Design, develop and deliver data integration interfaces at scale using Python/Spark
  • Automate core activities to minimize delivery lead times and improve overall quality
  • Optimize platform cost by selecting the right platform services and architecting the solution in a cost-effective manner
  • Manage code and deploy DevOps and CI/CD processes
  • Deploy logging and monitoring across the different integration points for critical alerts
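On the last bullet, deploying logging and monitoring for critical alerts: Python's standard `logging` module lets a custom handler route only critical records to an alerting channel. The logger name and messages below are illustrative; a real deployment would page or post to an incident channel rather than append to a list.

```python
import logging

alerts = []  # stand-in for an alerting channel

class AlertHandler(logging.Handler):
    """Collect records at or above this handler's level; a real
    deployment would notify an on-call channel in emit()."""
    def emit(self, record):
        alerts.append(record.getMessage())

logger = logging.getLogger("pipeline.ingest")
logger.setLevel(logging.INFO)
logger.addHandler(AlertHandler(level=logging.CRITICAL))

logger.info("batch loaded: 10000 rows")                  # routine, no alert
logger.critical("ingest failed: source bucket unreachable")  # alerts

print(alerts)  # only the critical record is routed
```

Attaching the same handler at each integration point (ingest, transform, load) gives the per-stage visibility the bullet describes.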

You Have:

  • Minimum 5 years of software development experience
  • Bachelor's and/or Master’s degree in computer science
  • Strong consulting skills in data management, including data governance, data quality, security, data integration, processing and provisioning
  • Delivered data management projects on AWS
  • Translated complex analytical requirements into technical design including data models, ETLs and Dashboards / Reports
  • Experience deploying dashboards and self-service analytics solutions on both relational and non-relational databases
  • Experience with different computing paradigms in databases such as In-Memory, Distributed, Massively Parallel Processing
  • Successfully delivered large scale data management initiatives covering Plan, Design, Build and Deploy phases leveraging different delivery methodologies including Agile
  • Strong knowledge of continuous integration, static code analysis and test-driven development
  • Experience in delivering projects in a highly collaborative delivery model with teams at onsite and offshore
  • Excellent analytical and problem-solving skills
  • Delivered change management initiatives focused on driving data platforms adoption across the enterprise
  • Strong verbal and written communications skills are a must, as well as the ability to work effectively across internal and external organizations

 
