Cutshort logo
MNC logo
Data Engineer (PySpark)
at MNC
Data Engineer (PySpark)
MNC's logo

Data Engineer (PySpark)

at MNC

Agency job
3 - 7 yrs
₹8L - ₹16L / yr
Bengaluru (Bangalore)
Skills
PySpark
skill iconPython
Spark
Roles and Responsibilities:

• Responsible for developing and maintaining applications with PySpark 
• Contribute to the overall design and architecture of the application developed and deployed.
• Performance Tuning wrt to executor sizing and other environmental parameters, code optimization, partitions tuning, etc.
• Interact with business users to understand requirements and troubleshoot issues.
• Implement Projects based on functional specifications.

Must-Have Skills:

• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ETL architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
Read more
Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Subodh Popalwar's profile image

Subodh Popalwar

Software Engineer, Memorres
For 2 years, I had trouble finding a company with good work culture and a role that will help me grow in my career. Soon after I started using Cutshort, I had access to information about the work culture, compensation and what each company was clearly offering.
Companies hiring on Cutshort
companies logos

About MNC

Founded
Type
Size
Stage
About
N/A
Company social profiles
N/A

Similar jobs

India's best Short Video App
Bengaluru (Bangalore)
4 - 12 yrs
₹25L - ₹50L / yr
Data engineering
Big Data
Spark
Apache Kafka
Apache Hive
+26 more
What Makes You a Great Fit for The Role?

You’re awesome at and will be responsible for
 
Extensive programming experience with cross-platform development of one of the following Java/SpringBoot, Javascript/Node.js, Express.js or Python
3-4 years of experience in big data analytics technologies like Storm, Spark/Spark streaming, Flink, AWS Kinesis, Kafka streaming, Hive, Druid, Presto, Elasticsearch, Airflow, etc.
3-4 years of experience in building high performance RPC services using different high performance paradigms: multi-threading, multi-processing, asynchronous programming (nonblocking IO), reactive programming,
3-4 years of experience working high throughput low latency databases and cache layers like MongoDB, Hbase, Cassandra, DynamoDB,, Elasticache ( Redis + Memcache )
Experience with designing and building high scale app backends and micro-services leveraging cloud native services on AWS like proxies, caches, CDNs, messaging systems, Serverless compute(e.g. lambda), monitoring and telemetry.
Strong understanding of distributed systems fundamentals around scalability, elasticity, availability, fault-tolerance.
Experience in analysing and improving the efficiency, scalability, and stability of distributed systems and backend micro services.
5-7 years of strong design/development experience in building massively large scale, high throughput low latency distributed internet systems and products.
Good experience in working with Hadoop and Big Data technologies like HDFS, Pig, Hive, Storm, HBase, Scribe, Zookeeper and NoSQL systems etc.
Agile methodologies, Sprint management, Roadmap, Mentoring, Documenting, Software architecture.
Liaison with Product Management, DevOps, QA, Client and other teams
 
Your Experience Across The Years in the Roles You’ve Played
 
Have total or more 5 - 7 years of experience with 2-3 years in a startup.
Have B.Tech or M.Tech or equivalent academic qualification from premier institute.
Experience in Product companies working on Internet-scale applications is preferred
Thoroughly aware of cloud computing infrastructure on AWS leveraging cloud native service and infrastructure services to design solutions.
Follow Cloud Native Computing Foundation leveraging mature open source projects including understanding of containerisation/Kubernetes.
 
You are passionate about learning or growing your expertise in some or all of the following
Data Pipelines
Data Warehousing
Statistics
Metrics Development
 
We Value Engineers Who Are
 
Customer-focused: We believe that doing what’s right for the creator is ultimately what will drive our business forward.
Obsessed with Quality: Your Production code just works & scales linearly
Team players. You believe that more can be achieved together. You listen to feedback and also provide supportive feedback to help others grow/improve.
Pragmatic: We do things quickly to learn what our creators desire. You know when it’s appropriate to take shortcuts that don’t sacrifice quality or maintainability.
Owners: Engineers at Chingari know how to positively impact the business.
Read more
Expand My Business
Bengaluru (Bangalore), Mumbai, Chennai
7 - 15 yrs
₹10L - ₹28L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+2 more

Design, implement, and improve the analytics platform

Implement and simplify self-service data query and analysis capabilities of the BI platform

Develop and improve the current BI architecture, emphasizing data security, data quality

and timeliness, scalability, and extensibility

Deploy and use various big data technologies and run pilots to design low latency

data architectures at scale

Collaborate with business analysts, data scientists, product managers, software development engineers,

and other BI teams to develop, implement, and validate KPIs, statistical analyses, data profiling, prediction,

forecasting, clustering, and machine learning algorithms


Educational

At Ganit we are building an elite team, ergo we are seeking candidates who possess the

following backgrounds:

7+ years relevant experience

Expert level skills writing and optimizing complex SQL

Knowledge of data warehousing concepts

Experience in data mining, profiling, and analysis

Experience with complex data modelling, ETL design, and using large databases

in a business environment

Proficiency with Linux command line and systems administration

Experience with languages like Python/Java/Scala

Experience with Big Data technologies such as Hive/Spark

Proven ability to develop unconventional solutions, sees opportunities to

innovate and leads the way

Good experience of working in cloud platforms like AWS, GCP & Azure. Having worked on

projects involving creation of data lake or data warehouse

Excellent verbal and written communication.

Proven interpersonal skills and ability to convey key insights from complex analyses in

summarized business terms. Ability to effectively communicate with multiple teams


Good to have

AWS/GCP/Azure Data Engineer Certification

Read more
Personal Care Product Manufacturing
Mumbai
3 - 8 yrs
₹12L - ₹30L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+9 more

DATA ENGINEER


Overview

They started with a singular belief - what is beautiful cannot and should not be defined in marketing meetings. It's defined by the regular people like us, our sisters, our next-door neighbours, and the friends we make on the playground and in lecture halls. That's why we stand for people-proving everything we do. From the inception of a product idea to testing the final formulations before launch, our consumers are a part of each and every process. They guide and inspire us by sharing their stories with us. They tell us not only about the product they need and the skincare issues they face but also the tales of their struggles, dreams and triumphs. Skincare goes deeper than skin. It's a form of self-care for many. Wherever someone is on this journey, we want to cheer them on through the products we make, the content we create and the conversations we have. What we wish to build is more than a brand. We want to build a community that grows and glows together - cheering each other on, sharing knowledge, and ensuring people always have access to skincare that really works.

 

Job Description:

We are seeking a skilled and motivated Data Engineer to join our team. As a Data Engineer, you will be responsible for designing, developing, and maintaining the data infrastructure and systems that enable efficient data collection, storage, processing, and analysis. You will collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to implement data pipelines and ensure the availability, reliability, and scalability of our data platform.


Responsibilities:

Design and implement scalable and robust data pipelines to collect, process, and store data from various sources.

Develop and maintain data warehouse and ETL (Extract, Transform, Load) processes for data integration and transformation.

Optimize and tune the performance of data systems to ensure efficient data processing and analysis.

Collaborate with data scientists and analysts to understand data requirements and implement solutions for data modeling and analysis.

Identify and resolve data quality issues, ensuring data accuracy, consistency, and completeness.

Implement and maintain data governance and security measures to protect sensitive data.

Monitor and troubleshoot data infrastructure, perform root cause analysis, and implement necessary fixes.

Stay up-to-date with emerging technologies and industry trends in data engineering and recommend their adoption when appropriate.


Qualifications:

Bachelor’s or higher degree in Computer Science, Information Systems, or a related field.

Proven experience as a Data Engineer or similar role, working with large-scale data processing and storage systems.

Strong programming skills in languages such as Python, Java, or Scala.

Experience with big data technologies and frameworks like Hadoop, Spark, or Kafka.

Proficiency in SQL and database management systems (e.g., MySQL, PostgreSQL, or Oracle).

Familiarity with cloud platforms like AWS, Azure, or GCP, and their data services (e.g., S3, Redshift, BigQuery).

Solid understanding of data modeling, data warehousing, and ETL principles.

Knowledge of data integration techniques and tools (e.g., Apache Nifi, Talend, or Informatica).

Strong problem-solving and analytical skills, with the ability to handle complex data challenges.

Excellent communication and collaboration skills to work effectively in a team environment.


Preferred Qualifications:

Advanced knowledge of distributed computing and parallel processing.

Experience with real-time data processing and streaming technologies (e.g., Apache Kafka, Apache Flink).

Familiarity with machine learning concepts and frameworks (e.g., TensorFlow, PyTorch).

Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).

Experience with data visualization and reporting tools (e.g., Tableau, Power BI).

Certification in relevant technologies or data engineering disciplines.



Read more
Cloudbloom Systems LLP
at Cloudbloom Systems LLP
5 recruiters
Sahil Rana
Posted by Sahil Rana
Bengaluru (Bangalore)
6 - 10 yrs
₹13L - ₹25L / yr
skill iconMachine Learning (ML)
skill iconPython
skill iconData Science
Natural Language Processing (NLP)
Computer Vision
+3 more
Description
Duties and Responsibilities:
 Research and Develop Innovative Use Cases, Solutions and Quantitative Models
 Quantitative Models in Video and Image Recognition and Signal Processing for cloudbloom’s
cross-industry business (e.g., Retail, Energy, Industry, Mobility, Smart Life and
Entertainment).
 Design, Implement and Demonstrate Proof-of-Concept and Working Proto-types
 Provide R&D support to productize research prototypes.
 Explore emerging tools, techniques, and technologies, and work with academia for cutting-
edge solutions.
 Collaborate with cross-functional teams and eco-system partners for mutual business benefit.
 Team Management Skills
Academic Qualification
 7+ years of professional hands-on work experience in data science, statistical modelling, data
engineering, and predictive analytics assignments
 Mandatory Requirements: Bachelor’s degree with STEM background (Science, Technology,
Engineering and Management) with strong quantitative flavour
 Innovative and creative in data analysis, problem solving and presentation of solutions.
 Ability to establish effective cross-functional partnerships and relationships at all levels in a
highly collaborative environment
 Strong experience in handling multi-national client engagements
 Good verbal, writing & presentation skills
Core Expertise
 Excellent understanding of basics in mathematics and statistics (such as differential
equations, linear algebra, matrix, combinatorics, probability, Bayesian statistics, eigen
vectors, Markov models, Fourier analysis).
 Building data analytics models using Python, ML libraries, Jupyter/Anaconda and Knowledge
database query languages like  SQL
 Good knowledge of machine learning methods like k-Nearest Neighbors, Naive Bayes, SVM,
Decision Forests. 
 Strong Math Skills (Multivariable Calculus and Linear Algebra) - understanding the
fundamentals of Multivariable Calculus and Linear Algebra is important as they form the basis
of a lot of predictive performance or algorithm optimization techniques. 
 Deep learning : CNN, neural Network, RNN, tensorflow, pytorch, computervision,
 Large-scale data extraction/mining, data cleansing, diagnostics, preparation for Modeling
 Good applied statistical skills, including knowledge of statistical tests, distributions,
regression, maximum likelihood estimators, Multivariate techniques & predictive modeling
cluster analysis, discriminant analysis, CHAID, logistic & multiple regression analysis
 Experience with Data Visualization Tools like Tableau, Power BI, Qlik Sense that help to
visually encode data
 Excellent Communication Skills – it is incredibly important to describe findings to a technical
and non-technical audience
 Capability for continuous learning and knowledge acquisition.
 Mentor colleagues for growth and success
 Strong Software Engineering Background
 Hands-on experience with data science tools
Read more
MNC
Bengaluru (Bangalore)
4 - 7 yrs
₹25L - ₹28L / yr
skill iconData Science
Data Scientist
skill iconR Programming
skill iconPython
SQL
  • Banking Domain
  • Assist the team in building Machine learning/AI/Analytics models on open-source stack using Python and the Azure cloud stack.
  • Be part of the internal data science team at fragma data - that provides data science consultation to large organizations such as Banks, e-commerce Cos, Social Media companies etc on their scalable AI/ML needs on the cloud and help build POCs, and develop Production ready solutions.
  • Candidates will be provided with opportunities for training and professional certifications on the job in these areas - Azure Machine learning services, Microsoft Customer Insights, Spark, Chatbots, DataBricks, NoSQL databases etc.
  • Assist the team in conducting AI demos, talks, and workshops occasionally to large audiences of senior stakeholders in the industry.
  • Work on large enterprise scale projects end-to-end, involving domain specific projects across banking, finance, ecommerce, social media etc.
  • Keen interest to learn new technologies and latest developments and apply them to projects assigned.
Desired Skills
  • Professional Hands-on coding experience in python for over 1 year for Data scientist, and over 3 years for Sr Data Scientist. 
  • This is primarily a programming/development-oriented role - hence strong programming skills in writing object-oriented and modular code in python and experience of pushing projects to production is important.
  • Strong foundational knowledge and professional experience in 
  • Machine learning, (Compulsory)
  • Deep Learning (Compulsory)
  • Strong knowledge of At least One of : Natural Language Processing or Computer Vision or Speech Processing or Business Analytics
  • Understanding of Database technologies and SQL. (Compulsory)
  • Knowledge of the following Frameworks:
  • Scikit-learn (Compulsory)
  • Keras/tensorflow/pytorch (At least one of these is Compulsory)
  • API development in python for ML models (good to have)
  • Excellent communication skills.
  • Excellent communication skills are necessary to succeed in this role, as this is a role with high external visibility, and with multiple opportunities to present data science results to a large external audience that will include external VPs, Directors, CXOs etc.  
  • Hence communication skills will be a key consideration in the selection process.
Read more
VIMANA
at VIMANA
4 recruiters
Loshy Chandran
Posted by Loshy Chandran
Remote, Chennai
2 - 5 yrs
₹10L - ₹20L / yr
Data engineering
Data Engineer
Apache Kafka
Big Data
skill iconJava
+4 more

We are looking for passionate, talented and super-smart engineers to join our product development team. If you are someone who innovates, loves solving hard problems, and enjoys end-to-end product development, then this job is for you! You will be working with some of the best developers in the industry in a self-organising, agile environment where talent is valued over job title or years of experience.

 

Responsibilities:

  • You will be involved in end-to-end development of VIMANA technology, adhering to our development practices and expected quality standards.
  • You will be part of a highly collaborative Agile team which passionately follows SAFe Agile practices, including pair-programming, PR reviews, TDD, and Continuous Integration/Delivery (CI/CD).
  • You will be working with cutting-edge technologies and tools for stream processing using Java, NodeJS and Python, using frameworks like Spring, RxJS etc.
  • You will be leveraging big data technologies like Kafka, Elasticsearch and Spark, processing more than 10 Billion events per day to build a maintainable system at scale.
  • You will be building Domain Driven APIs as part of a micro-service architecture.
  • You will be part of a DevOps culture where you will get to work with production systems, including operations, deployment, and maintenance.
  • You will have an opportunity to continuously grow and build your capabilities, learning new technologies, languages, and platforms.

 

Requirements:

  • Undergraduate degree in Computer Science or a related field, or equivalent practical experience.
  • 2 to 5 years of product development experience.
  • Experience building applications using Java, NodeJS, or Python.
  • Deep knowledge in Object-Oriented Design Principles, Data Structures, Dependency Management, and Algorithms.
  • Working knowledge of message queuing, stream processing, and highly scalable Big Data technologies.
  • Experience in working with Agile software methodologies (XP, Scrum, Kanban), TDD and Continuous Integration (CI/CD).
  • Experience using no-SQL databases like MongoDB or Elasticsearch.
  • Prior experience with container orchestrators like Kubernetes is a plus.
About VIMANA

We build products and platforms for the Industrial Internet of Things. Our technology is being used around the world in mission-critical applications - from improving the performance of manufacturing plants, to making electric vehicles safer and more efficient, to making industrial equipment smarter.

Please visit https://govimana.com/ to learn more about what we do.

Why Explore a Career at VIMANA
  • We recognize that our dedicated team members make us successful and we offer competitive salaries.
  • We are a workplace that values work-life balance, provides flexible working hours, and full time remote work options.
  • You will be part of a team that is highly motivated to learn and work on cutting edge technologies, tools, and development practices.
  • Bon Appetit! Enjoy catered breakfasts, lunches and free snacks!

VIMANA Interview Process
We usually target to complete all the interviews in a week's time and would provide prompt feedback to the candidate. As of now, all the interviews are conducted online due to covid situation.

1.Telephonic screening (30 Min )

A 30 minute telephonic interview to understand and evaluate the candidate's fit with the job role and the company.
Clarify any queries regarding the job/company.
Give an overview about further interview rounds

2. Technical Rounds

This would be deep technical round to evaluate the candidate's technical capability pertaining to the job role.

3. HR Round

Candidate's team and cultural fit will be evaluated during this round

We would proceed with releasing the offer if the candidate clears all the above rounds.

Note: In certain cases, we might schedule additional rounds if needed before releasing the offer.
Read more
Vestibulum Technologies Pvt Ltd
Srinivasan Saripalli
Posted by Srinivasan Saripalli
Bengaluru (Bangalore)
0.3 - 0.6 yrs
₹15000 - ₹25000 / mo
skill iconData Science
skill iconMachine Learning (ML)
skill iconDeep Learning
skill iconPython
Computer Vision
+3 more
Hiring for Machine Learning Deep Learning, who have build solutions for computer vision ,nlp, Tensorflow, Pytorch, Tensorflow Lite for Android phone 
Read more
Mirafra Technologies
at Mirafra Technologies
4 recruiters
Nirmala N S
Posted by Nirmala N S
Remote, Bengaluru (Bangalore)
4 - 7 yrs
₹5L - ₹18L / yr
Big Data
skill iconScala
Spark
"spark streaming"
"Hadoop
Should have experience in Big data development
Strong experience in Scala/Spark

End client: Sapient
Mode of Hiring : FTE
Notice should be less than 30days
Read more
Saama Technologies
at Saama Technologies
6 recruiters
Sandeep Chaudhary
Posted by Sandeep Chaudhary
Pune
4 - 8 yrs
₹1L - ₹16L / yr
skill iconData Science
skill iconPython
skill iconMachine Learning (ML)
Natural Language Processing (NLP)
Big Data
+2 more
Description Must have Direct Hands- on, 4 years of experience, building complex Data Science solutions Must have fundamental knowledge of Inferential Statistics Should have worked on Predictive Modelling, using Python / R Experience should include the following, File I/ O, Data Harmonization, Data Exploration Machine Learning Techniques (Supervised, Unsupervised) Multi- Dimensional Array Processing Deep Learning NLP, Image Processing Prior experience in Healthcare Domain, is a plus Experience using Big Data, is a plus Should have Excellent Analytical, Problem Solving ability. Should be able to grasp new concepts quickly Should be well familiar with Agile Project Management Methodology Should have excellent written and verbal communication skills Should be a team player with open mind
Read more
Woodcutter Film Technologies Pvt. Ltd.
Athul Krishnan
Posted by Athul Krishnan
Hyderabad
1 - 5 yrs
₹3L - ₹6L / yr
skill iconData Science
skill iconR Programming
skill iconPython
We're an early stage film-tech startup with a mission to empower filmmakers and independent content creators with data-driven decision-making tools. We're looking for a data person to join the core team. Please get in touch if you would be excited to join us on this super exciting journey of disrupting the film production and distribution business. We are currently collaborating with Rana Daggubatt's Suresh Productions, and work out of their studio in Hyderabad - so exposure and opportunities to work on real issues faced by the media industry will be in plenty.
Read more
Why apply to jobs via Cutshort
people_solving_puzzle
Personalized job matches
Stop wasting time. Get matched with jobs that meet your skills, aspirations and preferences.
people_verifying_people
Verified hiring teams
See actual hiring teams, find common social connections or connect with them directly. No 3rd party agencies here.
ai_chip
Move faster with AI
We use AI to get you faster responses, recommendations and unmatched user experience.
21,01,133
Matches delivered
37,12,187
Network size
15,000
Companies hiring
Did not find a job you were looking for?
icon
Search for relevant jobs from 10000+ companies such as Google, Amazon & Uber actively hiring on Cutshort.
companies logo
companies logo
companies logo
companies logo
companies logo
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Subodh Popalwar's profile image

Subodh Popalwar

Software Engineer, Memorres
For 2 years, I had trouble finding a company with good work culture and a role that will help me grow in my career. Soon after I started using Cutshort, I had access to information about the work culture, compensation and what each company was clearly offering.
Companies hiring on Cutshort
companies logos