PySpark Lead/PySpark Dev

at Virtusa

Agency job
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr (ESOP available)
Full time
Skills
PySpark
Python
  • Minimum 1 year of relevant experience in PySpark (mandatory)
  • Hands-on experience developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus
  • Ability to play a lead role and independently manage a PySpark development team of 3-5 members
  • EMR, Python, and PySpark are mandatory.
  • Knowledge and awareness of working with cloud and big data technologies such as Apache Spark, Glue, Kafka, Kinesis, Lambda, S3, Redshift, and RDS on AWS

About Virtusa

Virtusa helps clients change, disrupt, and unlock new value that surpasses their wildest expectations: not just to reach our best, but to redefine yours.
Founded
1996
Type
Services
Size
100-1000 employees
Stage
Profitable

Similar jobs

PySpark
Data engineering
Big Data
Hadoop
Spark
SQL
Python
Microsoft SQL Server DBA
ELT
Remote only
7 - 13 yrs
₹15L - ₹35L / yr
Experience
Experience Range: 2 Years - 10 Years
Function: Information Technology
Desired Skills
Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience with SQL databases; able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture: business rules processing and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good analytical skills
Education
Education Type: Engineering
Degree / Diploma: Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering
Specialization / Subject: Any Specialisation
Job Type: Full Time
Job ID: 000018
Department: Software Development
Job posted by
Minakshi Kumari

Lead Data Scientist

at Metadata Technology North America

Agency job
via RS Consultants
Data Science
Machine Learning (ML)
Python
sagemaker
Go Programming (Golang)
Scikit-Learn
pandas
NumPy
Amazon Web Services (AWS)
Data Analytics
TensorFlow
Apache Kafka
Real time media streaming
Airflow
Remote only
8 - 16 yrs
₹20L - ₹50L / yr
Data Scientist Lead / Manager
Job Description:
We are looking for an exceptional Data Scientist Lead / Manager who is passionate about data and motivated to build large-scale machine learning solutions that make our data products shine. This person will contribute to data analytics for insight discovery and to the development of machine learning pipelines that support modeling of terabytes of daily data for various use cases.

Location: Pune (Initially remote due to COVID 19)

Looking for someone who can start immediately or within a month. Hands-on experience in Python programming (minimum 5 years) is a must.


About the Organisation :

- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.

- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom and India.

- You will gain work experience in a global environment. Our staff come from more than 16 nationalities and speak over 20 languages, and over 42% of them are multilingual.


Qualifications:
• 8+ years relevant working experience
• Master's or Bachelor's degree in computer science or engineering
• Working knowledge of Python and SQL
• Experience in time series data, data manipulation, analytics, and visualization
• Experience working with large-scale data
• Proficiency in various ML algorithms for supervised and unsupervised learning
• Experience working in an Agile/Lean model
• Experience with Java and Golang is a plus
• Experience with BI toolkits such as Tableau, Superset, QuickSight, etc. is a plus
• Exposure to building large-scale ML models using one or more modern tools and libraries such as AWS SageMaker, Spark MLlib, Dask, TensorFlow, PyTorch, Keras, GCP ML Stack
• Exposure to modern Big Data tech such as Cassandra/Scylla, Kafka, Ceph, Hadoop, Spark
• Exposure to IaaS platforms such as AWS, GCP, Azure

Typical persona: Data Science Manager/Architect
Experience: 8+ years of programming/engineering experience (with at least the last 4 years in data science at a product development company)
Type: Hands-on candidate only

Must:
a. Hands-on Python: pandas, scikit-learn
b. Working knowledge of Kafka
c. Able to carry out own tasks and help the team in resolving problems - logical or technical (25% of job)
d. Good analytical and debugging skills
e. Strong communication skills

Desired (in order of priority):
a. Go (Strong advantage)
b. Airflow (Strong advantage)
c. Familiarity and working experience with more than one type of database: relational, object, columnar, graph and other unstructured databases
d. Data structures, Algorithms
e. Experience with multi-threaded and thread sync concepts
f. AWS Sagemaker
g. Keras
Job posted by
Biswadeep RS

Data Science Software Engineer

at StatusNeo

Founded 2020  •  Products & Services  •  100-1000 employees  •  Profitable
Data Science
Machine Learning (ML)
Python
Amazon Web Services (AWS)
Windows Azure
Google Cloud Platform (GCP)
SQL
Natural Language Processing (NLP)
pandas
NumPy
Healthcare
Deep Learning
Computer Vision
Git
Bengaluru (Bangalore), Hyderabad
2 - 4 yrs
₹4L - ₹7L / yr

Responsibilities Description:

Responsible for the development and implementation of machine learning algorithms and techniques to solve business problems and optimize member experiences. Primary duties may include, but are not limited to:

  • Design machine learning projects to address specific business problems determined in consultation with business partners.
  • Work with datasets of varying size and complexity, including both structured and unstructured data.
  • Pipe and process massive data streams in distributed computing environments such as Hadoop to facilitate analysis.
  • Implement batch and real-time model scoring to drive actions.
  • Develop machine learning algorithms to build customized solutions that go beyond standard industry tools and lead to innovative solutions.
  • Develop sophisticated visualizations of analysis output for business users.

 

Experience Requirements:

BS/MA/MS/PhD in Statistics, Computer Science, Mathematics, Machine Learning, Econometrics, Physics, Biostatistics or related Quantitative disciplines. 2-4 years of experience in predictive analytics and advanced expertise with software such as Python, or any combination of education and experience which would provide an equivalent background. Experience in the healthcare sector. Experience in Deep Learning strongly preferred.

 

Required Technical Skill Set:

  • Full cycle of building machine learning solutions:
      o Understanding of a wide range of algorithms and the problems they solve
      o Data preparation and analysis
      o Model training and validation
      o Model application to the problem

  • Experience using the full range of open-source programming tools and utilities
  • Experience in working in end-to-end data science project implementation.
  • 2+ years of experience with development and deployment of Machine Learning applications
  • 2+ years of experience with NLP approaches in a production setting
  • Experience in building models using bagging and boosting algorithms
  • Exposure/experience in building Deep Learning models for NLP/Computer Vision use cases preferred
  • Ability to write efficient code with good understanding of core Data Structures/algorithms is critical
  • Strong Python skills following software engineering best practices
  • Experience in using code versioning tools like Git and Bitbucket
  • Experience in working in Agile projects
  • Comfort and familiarity with SQL and the Hadoop ecosystem of tools, including Spark
  • Experience managing big data with efficient query programs is good to have
  • Good to have experience in training ML models in tools like SageMaker, Kubeflow, etc.
  • Good to have experience in frameworks for model interpretability using libraries like LIME, SHAP, etc.
  • Experience in the healthcare sector is preferred
  • MS/M.Tech or PhD is a plus
Job posted by
Alex P

Sr Data Engineer - (Python, Pandas)

at SteelEye

Founded 2017  •  Product  •  20-100 employees  •  Raised funding
Python
ETL
Big Data
Amazon Web Services (AWS)
pandas
Bengaluru (Bangalore)
5 - 20 yrs
₹20L - ₹35L / yr

What you’ll do

  • Deliver plugins for our Python-based ETL pipelines.
  • Deliver Python microservices for provisioning and managing cloud infrastructure.
  • Implement algorithms to analyse large data sets.
  • Draft design documents that translate requirements into code.
  • Deal with challenges associated with handling large volumes of data.
  • Assume responsibilities from technical design through technical client support.
  • Manage expectations with internal stakeholders and context-switch in a fast paced environment.
  • Thrive in an environment that uses AWS and Elasticsearch extensively.
  • Keep abreast of technology and contribute to the engineering strategy.
  • Champion best development practices and provide mentorship.

What we’re looking for

  • Experience in Python 3.
  • Python libraries used for data (such as pandas, numpy).
  • AWS.
  • Elasticsearch.
  • Performance tuning.
  • Object Oriented Design and Modelling.
  • Delivering complex software, ideally in a FinTech setting.
  • CI/CD tools.
  • Knowledge of design patterns.
  • Sharp analytical and problem-solving skills.
  • Strong sense of ownership.
  • Demonstrable desire to learn and grow.
  • Excellent written and oral communication skills.
  • Mature collaboration and mentoring abilities.

About SteelEye Culture

  • Work from home until you are vaccinated against COVID-19
  • Top of the line health insurance
  • Order discounted meals every day from a dedicated portal
  • Fair and simple salary structure
  • 30+ holidays in a year
  • Fresh fruits every day
  • Centrally located. 5 mins to the nearest metro station (MG Road)
  • Measured on output and not input
Job posted by
Arjun Shivraj

Data Engineer

at Servian

Founded 2008  •  Products & Services  •  100-1000 employees  •  Raised funding
Data engineering
ETL
Data Warehouse (DWH)
Powershell
DA
SQL
Python
Cloud Computing
Data modeling
Data migration
Data Visualization
Scripting
Bengaluru (Bangalore)
2 - 8 yrs
₹10L - ₹25L / yr
Who we are
 
We are a consultant led organisation. We invest heavily in our consultants to ensure they have the technical skills and commercial acumen to be successful in their work.
 
Our consultants have a passion for data and solving complex problems. They are curious, ambitious and experts in their fields. We have developed a first-rate team, so you will be supported by and learn from the best.

About the role

  • Collaborating with a team of like-minded and experienced engineers for Tier 1 customers, you will focus on data engineering on large complex data projects. Your work will have an impact on platforms that handle crores of customers and millions of transactions daily.

  • As an engineer, you will use the latest cloud services to design and develop reusable core components and frameworks to modernise data integrations in a cloud first world and own those integrations end to end working closely with business units. You will design and build for efficiency, reliability, security and scalability. As a consultant, you will help drive a data engineering culture and advocate best practices.

Mandatory experience

    • 1-6 years of relevant experience
    • Strong SQL skills and data literacy
    • Hands-on experience designing and developing data integrations, either in ETL tools, cloud native tools or in custom software
    • Proficiency in scripting and automation (e.g. PowerShell, Bash, Python)
    • Experience in an enterprise data environment
    • Strong communication skills

Desirable experience

    • Ability to work on data architecture, data models, data migration, integration and pipelines
    • Ability to work on data platform modernisation from on-premise to cloud-native
    • Proficiency in data security best practices
    • Stakeholder management experience
    • Positive attitude with the flexibility and ability to adapt to an ever-changing technology landscape
    • Desire to gain breadth and depth of technologies to support customer's vision and project objectives

What to expect if you join Servian?

    • Learning & Development: We invest heavily in our consultants and offer internal training weekly (both technical and non-technical alike!) and abide by a ‘You Pass We Pay’ policy.
    • Career progression: We take a longer term view of every hire. We have a flat org structure and promote from within. Every hire is developed as a future leader and client adviser.
    • Variety of projects: As a consultant, you will have the opportunity to work across multiple projects across our client base significantly increasing your skills and exposure in the industry.
    • Great culture: Working on the latest Apple MacBook Pro in our custom-designed offices in the heart of leafy Jayanagar, you get a peaceful and productive work environment close to shops, parks and a metro station.
    • Professional development: We invest heavily in professional development both technically, through training and guided certification pathways, and in consulting, through workshops in client engagement and communication. Growth in our organisation happens from the growth of our people.
Job posted by
sakshi nigam

Machine Learning Architect - Deployments

at Netmeds.com

Founded 2015  •  Product  •  500-1000 employees  •  Raised funding
Machine Learning (ML)
Software deployment
CI/CD
Cloud Computing
Snow flake schema
Amazon Redshift
Big Data
Serverless
AWS Lambda
PySpark
EMR
Data storage
Google Cloud Storage
Amazon S3
Amazon Glacier
Tableau
PowerBI
Qlik
Predictive modelling
Python
Scikit-Learn
k-means clustering
Artificial Intelligence (AI)
SaaS
Chennai
5 - 10 yrs
₹10L - ₹30L / yr

We are looking for an outstanding ML Architect (Deployments) with expertise in deploying Machine Learning solutions/models into production and scaling them to serve millions of customers. The ideal candidate has an adaptable and productive working style that fits a fast-moving environment.

 

Skills:

- 5+ years deploying Machine Learning pipelines in large enterprise production systems.

- Experience developing end to end ML solutions from business hypothesis to deployment / understanding the entirety of the ML development life cycle.
- Expert in modern software development practices; solid experience using source control management (CI/CD).
- Proficient in designing relevant architecture / microservices to fulfil application integration, model monitoring, training / re-training, model management, model deployment, model experimentation/development, alert mechanisms.
- Experience with public cloud platforms (Azure, AWS, GCP).
- Serverless services like AWS Lambda, Azure Functions, and/or Cloud Functions.
- Orchestration services like Data Factory, Data Pipeline, and/or Dataflow.
- Data science workbench/managed services like Azure Machine Learning, SageMaker, and/or AI Platform.
- Data warehouse services like Snowflake, Redshift, BigQuery, and Azure SQL DW.
- Distributed computing services like PySpark, EMR, Databricks.
- Data storage services like Cloud Storage, S3, Blob Storage, S3 Glacier.
- Data visualization tools like Power BI, Tableau, QuickSight, and/or Qlik.
- Proven experience serving up predictive algorithms and analytics through batch and real-time APIs.
- Solid working experience with software engineers, data scientists, product owners, business analysts, project managers, and business stakeholders to design the holistic solution.
- Strong technical acumen around automated testing.
- Extensive background in statistical analysis and modeling (distributions, hypothesis testing, probability theory, etc.)
- Strong hands-on experience with statistical packages and ML libraries (e.g., Python scikit-learn, Spark MLlib, etc.)
- Experience in effective data exploration and visualization (e.g., Excel, Power BI, Tableau, Qlik, etc.)
- Experience in developing and debugging in one or more of the languages Java, Python.
- Ability to work in cross functional teams.
- Apply Machine Learning techniques in production including, but not limited to, neural networks, regression, decision trees, random forests, ensembles, SVM, Bayesian models, K-Means, etc.

 

Roles and Responsibilities:

Deploying ML models into production, and scaling them to serve millions of customers.

Technical solutioning skills with deep understanding of technical API integrations, AI / Data Science, BigData and public cloud architectures / deployments in a SaaS environment.

Strong stakeholder relationship management skills - able to influence and manage the expectations of senior executives.
Strong networking skills with the ability to build and maintain strong relationships with business, operations, and technology teams, both internally and externally.

Provide software design and programming support to projects.

 

 Qualifications & Experience:

Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Machine Learning Architect (Deployments) or a similar role for 5-7 years.

 

Job posted by
Vijay Hemnath
PySpark
Python
Spark
Bengaluru (Bangalore)
3 - 7 yrs
₹8L - ₹16L / yr
Roles and Responsibilities:

• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance tuning with respect to executor sizing and other environmental parameters, code optimization, partition tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must-Have Skills:

• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience with SQL databases; able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ETL architecture: business rules processing and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good analytical skills
Job posted by
Priyanka U

ML Engineer

at Global content marketplace

Agency job
via Qrata
Machine Learning (ML)
Natural Language Processing (NLP)
Python
Mumbai
4 - 8 yrs
₹20L - ₹30L / yr

We are building a global content marketplace that brings companies and content creators together to scale up content creation processes across 50+ content verticals and 150+ industries. Over the past 2.5 years, we’ve worked with companies like India Today, Amazon India, Adobe, Swiggy, Dunzo, Businessworld, Paisabazaar, IndiGo Airlines, Apollo Hospitals, Infoedge, Times Group, Digit, BookMyShow, UpGrad, Yulu, YourStory, and 350+ other brands.
Our mission is to become the world’s largest content creation and distribution platform for all kinds of content creators and brands.

 

Our Team

 

We are a 25+ member company and are scaling up rapidly in both team size and ambition.

If we were to define the kind of people and the culture we have, it would be -

a) Individuals with an Extreme Sense of Passion About Work

b) Individuals with Strong Customer and Creator Obsession

c) Individuals with Extraordinary Hustle, Perseverance & Ambition

We are on the lookout for individuals who are always open to going the extra mile and thrive in a fast-paced environment. We are strong believers in building a great, enduring company that can outlast its builders and create a massive impact on the lives of our employees, creators, and customers alike.

 

Our Investors

 

We are fortunate to be backed by some of the industry’s most prolific angel investors: Kunal Bahl and Rohit Bansal (Snapdeal founders); YourStory Media (Shradha Sharma); Dr. Saurabh Srivastava, Co-founder of IAN and NASSCOM; Slideshare co-founder Amit Ranjan; Indifi's Co-founder and CEO Alok Mittal; Sidharth Rao, Chairman of Dentsu Webchutney; Ritesh Malik, Co-founder and CEO of Innov8; Sanjay Tripathy, former CMO, HDFC Life, and CEO of Agilio Labs; Manan Maheshwari, Co-founder of WYSH; and Hemanshu Jain, Co-founder of Diabeto.
Backed by Lightspeed Venture Partners



Job Responsibilities:
● Design, develop, test, deploy, maintain and improve ML models
● Implement novel learning algorithms and recommendation engines
● Apply Data Science concepts to solve routine problems of target users
● Translate business analysis needs into well-defined machine learning problems and select appropriate models and algorithms
● Create an architecture, and implement, maintain and monitor data source pipelines that can be used across different types of data sources
● Monitor performance of the architecture and conduct optimization
● Produce clean, efficient code based on specifications
● Verify and deploy programs and systems
● Troubleshoot, debug and upgrade existing applications
● Guide junior engineers for productive contribution to the development
The ideal candidate must -

ML and NLP Engineer
● 4 or more years of experience in ML Engineering
● Proven experience in NLP
● Familiarity with generative language models such as GPT-3
● Ability to write robust code in Python
● Familiarity with ML frameworks and libraries
● Hands-on experience with AWS services like SageMaker and Personalize
● Exposure to state-of-the-art techniques in ML and NLP
● Understanding of data structures, data modeling, and software architecture
● Outstanding analytical and problem-solving skills
● Team player with the ability to work cooperatively with other engineers.
● Ability to make quick decisions in high-pressure environments with limited information.
Job posted by
Mrunal Kokate

Product Analyst - Ad Tech

at MX Player

Founded 2011  •  Product  •  500-1000 employees  •  Profitable
Python
SQL
Tableau
Mumbai, NCR (Delhi | Gurgaon | Noida)
2 - 5 yrs
Best in industry

About MX Player (Play Store link: https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad&hl=en_IN)


MX Player is the world’s #1 entertainment superapp offering 100,000+ hours of premium OTT (over the top) content spanning acclaimed MX Originals, Web Shows, TV (Live & OnDemand), movies, music videos and hyper-casual games, music streaming, short form video and more. With more than 1 billion installs worldwide – MX Player is present on 1 out of every 2 smartphones, making it the largest entertainment app/platform in the world.

 

Position : Product Analyst / Business Analyst - Ad Tech


Key Responsibilities:

 

  • Driving the collection of new data that would help build the next generation of algorithms (E.g. audience segmentation, contextual targeting)
  • Understanding user behavior and performing root-cause analysis of changes in data trends to identify corrections or propose desirable enhancements in product & across different verticals
  • Excellent problem solving skills and the ability to make sound judgments based on trade-offs for different solutions to complex problem constraints
  • Defining and monitoring KPIs for product/content/business performance and identifying ways to improve them
  • Should be a strong advocate of data driven approach and drive analytics decisions by doing user testing, data analysis, and A/B testing
  • Help in defining the analytics roadmap for the product
  • Prior knowledge and experience in ad tech industry or other advertising platforms will be preferred

Tools/ Skillset:

  • Knowledge of Google DFP (preferred)
  • SQL
  • R/Python (preferred)
  • Any BI tool such as Tableau or Sisense (preferred)
  • Go-getter attitude
  • Ability to thrive in a fast-paced, dynamic environment
  • Self-starter
Job posted by
Payal Thakker

Data Science Engineer (SDE I)

at Couture.ai

Founded 2017  •  Product  •  20-100 employees  •  Profitable
Spark
Algorithms
Data Structures
Scala
Machine Learning (ML)
Big Data
Hadoop
Python
Bengaluru (Bangalore)
1 - 3 yrs
₹12L - ₹20L / yr
Couture.ai is building a patent-pending AI platform targeted towards vertical-specific solutions. The platform is already licensed by Reliance Jio and a few European retailers to empower real-time experiences for their combined >200 million end users. For this role, a credible display of innovation in past projects (or academia) is a must. We are looking for a candidate who lives and talks data and algorithms, loves to play with big data engineering, and is hands-on with Apache Spark, Kafka, RDBMS/NoSQL databases, big data analytics, and handling Unix and production servers. A Tier-1 college background (BE from IITs, BITS Pilani, top NITs, IIITs, or MS from Stanford, Berkeley, CMU, UW–Madison) or an exceptionally bright work history is a must. Let us know if you are interested in exploring the profile further.
Job posted by
Shobhit Agarwal