AWS Simple Queuing Service (SQS) Jobs in Pune

Apply to 11+ AWS Simple Queuing Service (SQS) Jobs in Pune on CutShort.io. Explore the latest AWS Simple Queuing Service (SQS) Job opportunities across top companies like Google, Amazon & Adobe.

Data Engineer

at consulting & implementation services in the area of Oil & Gas, Mining and Manufacturing Industry

Agency job

via Jobdost by Sathish Kumar

Ahmedabad, Hyderabad, Pune, Delhi

5 - 7 yrs

₹18L - ₹25L / yr

AWS Lambda

AWS Simple Notification Service (SNS)

AWS Simple Queuing Service (SQS)

Python

PySpark

+9 more

Data Engineer

Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet 
Proficient in AWS S3 and data lake 
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting languages: Python and PySpark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
Develop automated data quality checks to make sure the right data enters the platform, and verify the results of the calculations
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets 
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
Key participant in regular Scrum ceremonies with the agile teams  
Proficient at developing queries, writing reports and presenting findings 
Mentor junior members and bring best industry practices

QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales) 
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
Proficient with:
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools. 
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB 
CI/CD: Jenkins, GitLab, Jira, Confluence and other related tools

Data Analyst

at MSMEx

6 recruiters

Posted by Sujata Ranjan

Remote, Mumbai, Pune

4 - 6 yrs

₹5L - ₹12L / yr

Data Analytics

Data Analysis

Data Analyst

SQL

Python

+4 more

We are looking for a Data Analyst to oversee organisational data analytics. This will require you to design and help implement the data analytics platform that will keep the organisation running. The team will be the go-to for all data needs for the app, and we are looking for a self-starter who is hands-on and yet able to abstract problems and anticipate data requirements.
This person should be a technically strong data analyst who can design and implement data systems independently. They also need to be proficient in business reporting and have a keen interest in providing the data the business needs.

Tools familiarity: SQL, Python, Mixpanel, Metabase, Google Analytics, CleverTap, App Analytics

Responsibilities

Processes and frameworks for metrics, analytics, experimentation and user insights, lead the data analytics team
Metrics alignment across teams to make them actionable and promote accountability
Data based frameworks for assessing and strengthening Product Market Fit
Identify viable growth strategies through data and experimentation
Experimentation for product optimisation and understanding user behaviour
Structured approach towards deriving user insights, answer questions using data
This person needs to work closely with the Technical and Business teams to get this implemented.

Skills

4 to 6 years in a relevant data analytics role at a product-oriented company
Highly organised, technically sound & good at communication
Ability to handle & build for cross functional data requirements / interactions with teams
Great with Python, SQL
Can build and mentor a team
Knowledge of key business metrics like cohort, engagement cohort, LTV, ROAS, ROE

Eligibility

BTech or MTech in Computer Science/Engineering from a Tier 1 or Tier 2 college

Good knowledge of data analytics and data visualization tools. A formal certification would be an added advantage.

We are more interested in what you CAN DO than your location, education, or experience levels.

Send us your code samples / GitHub profile / published articles if applicable.

SDE III Machine Learning

at MindTickle

11 recruiters

Posted by Shama Afroj

Pune, Bengaluru (Bangalore)

6 - 10 yrs

₹30L - ₹65L / yr

Machine Learning (ML)

Data Science

Natural Language Processing (NLP)

Computer Vision

recommendation algorithm

+6 more

About Us

Mindtickle provides a comprehensive, data-driven solution for sales readiness and enablement that fuels revenue growth and brand value for dozens of Fortune 500 and Global 2000 companies and hundreds of the world’s most recognized companies across technology, life sciences, financial services, manufacturing, and service sectors.

With purpose-built applications, proven methodologies, and best practices designed to drive effective sales onboarding and ongoing readiness, Mindtickle enables company leaders and sellers to continually assess, diagnose and develop the knowledge, skills, and behaviors required to engage customers and drive growth effectively. We are funded by great investors such as SoftBank, Canaan Partners, NEA, Accel Partners, and others.

Job Brief

We are looking for a rockstar researcher at the Center of Excellence for Machine Learning. You are responsible for thinking outside the box, crafting new algorithms, developing end-to-end artificial intelligence-based solutions, and rightly selecting the most appropriate architecture for the system(s), such that it suits the business needs, and achieves the desired results under given constraints.

Credibility:

You must have a proven track record in research and development with adequate publication/patenting and/or academic credentials in data science.
You have the ability to directly connect business problems to research problems along with the latest emerging technologies.

Strategic Responsibility:

To perform the following: understanding problem statements, connecting the dots between high-level business statements and deep technology algorithms, crafting new systems and methods in the space of structured data mining, natural language processing, computer vision, speech technologies, robotics or Internet of things etc.
To be responsible for end-to-end production level coding with data science and machine learning algorithms, unit and integration testing, deployment, optimization and fine-tuning of models on cloud, desktop, mobile or edge etc.
To learn in a continuous mode, upgrade and upskill along with publishing novel articles in journals and conference proceedings and/or filing patents, and be involved in evangelism activities and ecosystem development etc.
To share knowledge, mentor colleagues, partners, and customers, take sessions on artificial intelligence topics both online or in-person, participate in workshops, conferences, seminars/webinars as a speaker, instructor, demonstrator or jury member etc.
To design and develop high-volume, low-latency applications for mission-critical systems and deliver high availability and performance.
To collaborate within the product streams and team to bring best practices and leverage world-class tech stack.
To set up all the essentials (tracking / alerting) to make sure the infrastructure / software built is working as expected.
To search, collect and clean data for analysis and set up efficient storage and retrieval pipelines.

Personality:

Requires excellent communication skills – written, verbal, and presentation.
You should be a team player.
You should be positive towards problem-solving and have a very structured thought process to solve problems.
You should be agile enough to learn new technology if needed.

Qualifications:

B Tech / BS / BE / M Tech / MS / ME in CS or equivalent from Tier I / II or Top Tier Engineering Colleges and Universities.
6+ years of strong software (application or infrastructure) development experience and software engineering skills (Python, R, C, C++ / Java / Scala / Golang).
Deep expertise and practical knowledge of operating systems, MySQL and NoSQL databases (Redis / Couchbase / MongoDB / ES or any graph DB).
Good understanding of Machine Learning Algorithms, Linear Algebra and Statistics.
Working knowledge of Amazon Web Services (AWS).
Experience with Docker and Kubernetes will be a plus.
Experience with Natural Language Processing, Recommendation Systems, or Search Engines.

Our Culture

As an organization, it’s our priority to create a highly engaging and rewarding workplace. We offer tons of awesome perks, great learning opportunities & growth.

Our culture reflects the globally diverse backgrounds of our employees along with our commitment to our customers, each other, and a passion for excellence.

To know more about us, feel free to go through these videos:

1. Sales Readiness Explained: https://www.youtube.com/watch?v=XyMJj9AlNww&t=6s

2. What We Do: https://www.youtube.com/watch?v=jv3Q2XgnkBY

3. Ready to Close More Deals, Faster: https://www.youtube.com/watch?v=nB0exreVU-s

To view more videos, please access the below-mentioned link:

https://www.youtube.com/c/mindtickle/videos

Mindtickle is proud to be an Equal Opportunity Employer

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status, or any other characteristic protected by law.

Your Right to Work - In compliance with applicable laws, all persons hired will be required to verify identity and eligibility to work in the respective work locations and to complete the required employment eligibility verification document form upon hire.

Big Data Architect

at Persistent Systems

1 recruiter

Agency job

via Milestone Hr Consultancy by Haina khan

Bengaluru (Bangalore), Hyderabad, Pune

9 - 16 yrs

₹7L - ₹32L / yr

Big Data

Scala

Spark

Hadoop

Python

+1 more

Greetings,

We have an urgent requirement for the post of Big Data Architect at a reputed MNC.

Location: Pune/Nagpur, Goa, Hyderabad/Bangalore

Job Requirements:

9 years or more of total experience, preferably in the big data space.
Creating Spark applications using Scala to process data.
Experience in scheduling and troubleshooting/debugging Spark jobs in steps.
Experience in Spark job performance tuning and optimization.
Should have experience in processing data using Kafka/Python.
Should have experience in and an understanding of configuring Kafka topics to optimize performance.
Should be proficient in writing SQL queries to process data in a data warehouse.
Hands-on experience working with Linux commands to troubleshoot/debug issues and creating shell scripts to automate tasks.
Experience with AWS services such as EMR.

Kafka Developer

at DataMetica

7 recruiters

Posted by Nikita Aher

Pune, Hyderabad

3 - 12 yrs

₹5L - ₹25L / yr

Apache Kafka

Big Data

Hadoop

Apache Hive

Java

+1 more

Summary
Our Kafka developer has a combination of technical skills, communication skills and business knowledge. The developer should be able to work on multiple medium to large projects. The successful candidate will have excellent technical skills in Apache/Confluent Kafka and an enterprise data warehouse, preferably GCP BigQuery or an equivalent cloud EDW, and will be able to take oral and written business requirements and develop efficient code to meet set deliverables.

Must Have Skills

Participate in the development, enhancement and maintenance of data applications both as an individual contributor and as a lead.
Leading in the identification, isolation, resolution and communication of problems within the production environment.
Act as lead developer, applying technical skills in Apache/Confluent Kafka (preferred) or AWS Kinesis (optional), and in a cloud enterprise data warehouse: Google BigQuery (preferred), or AWS Redshift or Snowflake (optional).
Design and recommend the best approach for data movement from different sources to the cloud EDW using Apache/Confluent Kafka.
Performs independent functional and technical analysis for major projects supporting several corporate initiatives.
Communicate and work with IT partners and the user community at various levels, from senior management to developers to business SMEs, for project definition.
Works on multiple platforms and multiple projects concurrently.
Performs code and unit testing for complex scope modules, and projects
Provide expertise and hands-on experience working on Kafka Connect using Schema Registry in a very high volume environment (~900 million messages).
Provide expertise in Kafka brokers, ZooKeeper, KSQL, KStream and Kafka Control Center.
Provide expertise and hands on experience working on AvroConverters, JsonConverters, and StringConverters.
Provide expertise and hands on experience working on Kafka connectors such as MQ connectors, Elastic Search connectors, JDBC connectors, File stream connector, JMS source connectors, Tasks, Workers, converters, Transforms.
Provide expertise and hands on experience on custom connectors using the Kafka core concepts and API.
Working knowledge of Kafka REST Proxy.
Ensure optimum performance, high availability and stability of solutions.
Create topics, set up a redundancy cluster, deploy monitoring tools and alerts, and have good knowledge of best practices.
Create stubs for producers, consumers and consumer groups to help onboard applications from different languages/platforms.
Leverage Hadoop ecosystem knowledge to design and develop capabilities to deliver our solutions using Spark, Scala, Python, Hive, Kafka and other tools in the Hadoop ecosystem.
Use automation tools for provisioning, such as Jenkins, UDeploy or relevant technologies.
Ability to perform data related benchmarking, performance analysis and tuning.
Strong skills in In-memory applications, Database Design, Data Integration.

Data Engineer

at Mobile Programming LLC

34 recruiters

Posted by Apurva kalsotra

Mohali, Gurugram, Pune, Bengaluru (Bangalore), Hyderabad, Chennai

3 - 8 yrs

₹2L - ₹9L / yr

Data engineering

Data engineer

Spark

Apache Spark

Apache Kafka

+13 more

Responsibilities for Data Engineer

Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.


Director Data Science

at Health Care MNC

Agency job

via Kavayah People Consulting by Kavita Singh

Pune

12 - 24 yrs

₹35L - ₹60L / yr

Data Science

Python

C++

Java

Amazon Web Services (AWS)

+1 more

The Director for Data Science will support the building of AI products in an Agile fashion that empower healthcare payers, providers and members to quickly process medical data to make informed decisions and reduce health care costs. You will be focusing on research, development, strategy, operations, people management, and being a thought leader for team members based out of India. You should have professional healthcare experience using both structured and unstructured data to build applications. These applications include but are not limited to machine learning, artificial intelligence, optical character recognition, natural language processing, and integrating processes into the overall AI pipeline to mine healthcare and medical information with high recall and other relevant metrics. The results will be used both for real-time operational processes, with automated and human-based decision making, and to help reduce healthcare administrative costs. We work with all major cloud and big data vendors' offerings (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and support

The Director, Data Science will have the opportunity to build a team, shape team culture and operating norms as a result of the fast-paced nature of a new, high-growth organization.

• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models, especially via application of algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering for 24x7x365 product availability and scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring to data scientists and machine learning engineers as well as career development
• Meet project related team members for individual specific needs on a regular basis related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership

The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years with values exceeding $10 million, and has worked on platforms with more than 10 million member lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve real artificial intelligence and data science problems and launch related products, while managing and coordinating business process change and IT / cloud operations and meeting production-level code standards
• Ownership of key workflows in the data science life cycle, such as data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models, especially via application of algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering for 24x7x365 product availability and scalability
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, including directly making staffing decisions
• Very strong understanding of mathematical concepts including but not limited to linear algebra, advanced calculus, partial differential equations, and statistics including Bayesian approaches at master’s degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R, including strong understanding of concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architectures
• Very strong understanding and experience with traditional data science approaches like sampling techniques, feature engineering, classification and regression, SVM, trees, and model evaluation, with several projects over 3+ years
• Very strong understanding and experience in natural language processing, reasoning, and understanding, information retrieval, text mining, and search, with 3+ years of hands-on experience
• Experience developing and deploying several products in production, with experience in two or more of the following languages: Python, C++, Java, Scala
• Strong Unix/Linux background and experience with at least one of the following cloud vendors: AWS, Azure, Google
• Three plus (3+) years of hands-on experience with a MapR / Cloudera / Databricks big data platform with Spark, Hive, Kafka, etc.
• Three plus (3+) years of experience with high-performance computing like Dask, CUDA distributed GPU, TPU, etc.
• Presented at major conferences and/or published materials

Data Steward

at Infogain

Agency job

via Technogen India PvtLtd by RAHUL BATTA

NCR (Delhi | Gurgaon | Noida), Bengaluru (Bangalore), Mumbai, Pune

7 - 8 yrs

₹15L - ₹16L / yr

Data steward

MDM

Tamr

Reltio

Data engineering

+7 more

Data Steward :

The Data Steward will collaborate closely with the group's software engineering and business divisions. The Data Steward has overall accountability for the group's or division's data and reporting posture by responsibly managing data assets, data lineage, and data access, supporting sound data analysis. This role requires focus on data strategy, execution, and support for projects, programs, application enhancements, and production data fixes. The Data Steward makes well-thought-out decisions on complex or ambiguous data issues, establishes the data stewardship and information management strategy and direction for the group, and communicates effectively with individuals at various levels of the technical and business communities. This individual will become part of the corporate Data Quality and Data Management/entity resolution team supporting various systems across the board.

Primary Responsibilities:

Responsible for data quality and data accuracy across all group/division delivery initiatives.
Responsible for data analysis, data profiling, data modeling, and data mapping capabilities.
Responsible for reviewing and governing data queries and DML.
Accountable for the assessment, delivery, quality, accuracy, and tracking of any production data fixes.
Accountable for the performance, quality, and alignment to requirements for all data query design and development.
Responsible for defining standards and best practices for data analysis, modeling, and queries.
Responsible for understanding end-to-end data flows and identifying data dependencies in support of delivery, release, and change management.
Responsible for the development and maintenance of an enterprise data dictionary that is aligned to data assets and the business glossary for the group.
Responsible for the definition and maintenance of the group's data landscape, including overlays with the technology landscape, end-to-end data flow/transformations, and data lineage.
Responsible for rationalizing the group's reporting posture through the definition and maintenance of a reporting strategy and roadmap.
Partners with the data governance team to ensure data solutions adhere to the organization’s data principles and guidelines.
Owns group's data assets including reports, data warehouse, etc.
Understand customer business use cases and be able to translate them to technical specifications and vision on how to implement a solution.
Accountable for defining the performance tuning needs for all group data assets and managing the implementation of those requirements within the context of group initiatives as well as steady-state production.
Partners with others in test data management and masking strategies and the creation of a reusable test data repository.
Responsible for solving data-related issues and communicating resolutions with other solution domains.
Actively and consistently support all efforts to simplify and enhance the Clinical Trial Prediction use cases.
Apply knowledge in analytic and statistical algorithms to help customers explore methods to improve their business.
Contribute toward analytical research projects through all stages including concept formulation, determination of appropriate statistical methodology, data manipulation, research evaluation, and final research report.
Visualize and report data findings creatively in a variety of visual formats that appropriately provide insight to the stakeholders.
Achieve defined project goals within customer deadlines; proactively communicate status and escalate issues as needed.

Additional Responsibilities:

Strong understanding of the Software Development Life Cycle (SDLC) with Agile Methodologies
Knowledge and understanding of industry-standard/best practices requirements gathering methodologies.
Knowledge and understanding of Information Technology systems and software development.
Experience with data modeling and test data management tools.
Experience in data integration projects.
Good problem solving & decision-making skills.
Good communication skills within the team, site, and with the customer

Knowledge, Skills and Abilities

Technical expertise in data architecture principles and design aspects of various DBMS and reporting concepts.
Solid understanding of key DBMS platforms like SQL Server, Azure SQL
Results-oriented, diligent, and works with a sense of urgency. Assertive, responsible for his/her own work (self-directed), have a strong affinity for defining work in deliverables, and be willing to commit to deadlines.
Experience in MDM tools like MS DQ, SAS DM Studio, Tamr, Profisee, Reltio etc.
Experience in Report and Dashboard development
Statistical and Machine Learning models
Python (sklearn, numpy, pandas, gensim)
Nice to Have:
1 year of ETL experience
Natural Language Processing
Neural networks and Deep learning
Experience in Keras, TensorFlow, spaCy, NLTK, and LightGBM Python libraries

Interaction : Frequently interacts with subordinate supervisors.

Education : Bachelor’s degree, preferably in Computer Science, B.E or other quantitative field related to the area of assignment. Professional certification related to the area of assignment may be required

Experience : 7 years of Pharmaceutical /Biotech/life sciences experience, 5 years of Clinical Trials experience and knowledge, Excellent Documentation, Communication, and Presentation Skills including PowerPoint

Data Analyst

at A Product development Organisation

Agency job

via Millions Advisory by Vasuki N

Pune

5 - 8 yrs

₹10L - ₹17L / yr

Python

Big Data

Amazon Web Services (AWS)

Windows Azure

Google Cloud Platform (GCP)

+3 more

Must have 5-8 years of experience in handling data
Must have the ability to interpret large amounts of data and to multi-task
Must have strong knowledge of and experience with programming (Python), Linux/Bash scripting, and databases (SQL, etc.)
Must have strong analytical and critical thinking skills to resolve business problems using data and tech
Must have domain familiarity with and interest in cloud technologies (GCP, Microsoft Azure, AWS), open-source technologies, and enterprise technologies
Must have the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
Must have good communication skills
Working knowledge/exposure to ElasticSearch, PostgreSQL, Athena, PrestoDB, Jupyter Notebook

Python Developer

at Intentbase

1 video

1 recruiter

Posted by Nischal Vohra

Pune

2 - 5 yrs

₹5L - ₹10L / yr

Pandas

Numpy

Bash

Structured Query Language

Python

+2 more

We are an early-stage startup working in the space of analytics, big data, machine learning, data visualization on multiple platforms and SaaS. We have offices in Palo Alto and WTC, Kharadi, Pune, and count some marquee names among our customers.

We are looking for a really good Python programmer who MUST have scientific programming experience (Python, etc.). Hands-on experience with numpy and the Python scientific stack is a must.

Demonstrated ability to track and work with 100s-1000s of files and GB-TB of data.
Exposure to ML and data mining algorithms.
Need to be comfortable working in a Unix environment and with SQL.

You will be required to do the following:

Use command line tools to perform data conversion and analysis
Support other team members in retrieving and archiving experimental results
Quickly write scripts to automate routine analysis tasks
Create insightful, simple graphics to represent complex trends
Explore, design, and invent new tools and design patterns to solve complex big data problems

Experience working on a long-term, lab-based project (academic experience acceptable).

Bigdata Lead

at Saama Technologies

6 recruiters

Posted by Sandeep Chaudhary

Pune

2 - 5 yrs

₹1L - ₹18L / yr

Hadoop

Spark

Apache Hive

Apache Flume

Java

+5 more

Description

Deep experience and understanding of Apache Hadoop and surrounding technologies required; experience with Spark, Impala, Hive, Flume, Parquet and MapReduce.
Strong understanding of development languages, including Java, Python, Scala, and shell scripting.
Expertise in Apache Spark 2.x framework principles and usage.
Should be proficient in developing Spark batch and streaming jobs in Python, Scala or Java.
Should have proven experience in performance tuning of Spark applications, from both the application code and configuration perspectives.
Should be proficient in Kafka and its integration with Spark.
Should be proficient in Spark SQL and data warehousing techniques using Hive.
Should be very proficient in Unix shell scripting and in operating on Linux.
Should have knowledge of any cloud-based infrastructure.
Good experience in tuning Spark applications and performance improvements.
Strong understanding of data profiling concepts and ability to operationalize analyses into design and development activities.
Experience with best practices of software development: version control systems, automated builds, etc.
Experienced in and able to lead the following phases of the Software Development Life Cycle on any project: feasibility planning, analysis, development, integration, test and implementation.
Capable of working within the team or as an individual.
Experience creating technical documentation.

Get to hear about interesting companies hiring right now

Follow Cutshort

Why apply via Cutshort?

Connect with actual hiring teams and get their fast response. No spam.
