5+ Apache Spark Jobs in Mumbai | Apache Spark Job openings in Mumbai
Apply to 5+ Apache Spark Jobs in Mumbai on CutShort.io. Explore the latest Apache Spark Job opportunities across top companies like Google, Amazon & Adobe.
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
upGrad is an online education platform building the careers of tomorrow by offering the most industry-relevant programs in an immersive learning experience. Our mission is to create a new digital-first learning experience to deliver tangible career impact to individuals at scale. upGrad currently offers programs in Data Science, Machine Learning, Product Management, Digital Marketing, and Entrepreneurship, etc. upGrad is looking for people passionate about management and education to help design learning programs for working professionals to stay sharp and stay relevant and help build the careers of tomorrow.
- upGrad was awarded the Best Tech for Education by IAMAI for 2018-19,
- upGrad was also ranked as one of the LinkedIn Top Startups 2018: The 25 most sought-after startups in India.
- upGrad was earlier selected as one of the top ten most innovative companies in India by FastCompany.
- We were also covered by the Financial Times along with other disruptors in Ed-Tech.
- upGrad is the official education partner for Government of India - Startup India program.
- Our program with IIIT B has been ranked #1 program in the country in the domain of Artificial Intelligence and Machine Learning.
About the Role
A highly motivated individual who has expe rience in architecting end to end web based ecommerce/online/SaaS products and systems; bringing them to production quickly and with high quality. Able to understand expected business results and map architecture to drive business forward. Passionate about building world class solutions.
Role and Responsibilities
- Work with Product Managers and Business to understand business/product requirements and vision.
- Provide a clear architectural vision in line with business and product vision.
- Lead a team of architects, developers, and data engineers to provide platform services to other engineering teams.
- Provide architectural oversight to engineering teams across the organization.
- Hands on design and development of platform services and features owned by self - this is a hands-on coding role.
- Define guidelines for best practices covering design, unit testing, secure coding etc.
- Ensure quality by reviewing design, code, test plans, load test plans etc. as appropriate.
- Work closely with the QA and Support teams to track quality and proactively identify improvement opportunities.
- Work closely with DevOps and IT to ensure highly secure and cost optimized operations in the cloud.
- Grow technical skills in the team - identify skill gaps with plans to address them, participate in hiring, mentor other architects and engineers.
- Support other engineers in resolving complex technical issues as a go-to person.
Skills/Experience
- 12+ years of experience in design and development of ecommerce scale systems and highly scalable SaaS or enterprise products.
- Extensive experience in developing extensible and scalable web applications with
- Java, Spring Boot, Go
- Web Services - REST, OAuth, OData
- Database/Caching - MySQL, Cassandra, MongoDB, Memcached/Redis
- Queue/Broker services - RabbitMQ/Kafka
- Microservices architecture via Docker on AWS or Azure.
- Experience with web front end technologies - HTML5, CSS3, JavaScript libraries and frameworks such as jQuery, AngularJS, React, Vue.js, Bootstrap etc.
- Extensive experience with cloud based architectures and how to optimize design for cost.
- Expert level understanding of secure application design practices and a working understanding of cloud infrastructure security.
- Experience with CI/CD processes and design for testability.
- Experience working with big data technologies such as Spark/Storm/Hadoop/Data Lake Architectures is a big plus.
- Action and result-oriented problem-solver who works well both independently and as part of a team; able to foster and develop others' ideas as well as his/her own.
- Ability to organize, prioritize and schedule a high workload and multiple parallel projects efficiently.
- Excellent verbal and written communication with stakeholders in a matrixed environment.
- Long term experience with at least one product from inception to completion and evolution of the product over multiple years.
B.Tech/MCA (IT/Computer Science) from a premier institution (IIT/NIT/BITS) and/or a US Master's degree in Computer Science.
Understand various raw data input formats, build consumers on Kafka/ksqldb for them and ingest large amounts of raw data into Flink and Spark.
Conduct complex data analysis and report on results.
Build various aggregation streams for data and convert raw data into various logical processing streams.
Build algorithms to integrate multiple sources of data and create a unified data model from all the sources.
Build a unified data model on both SQL and NO-SQL databases to act as data sink.
Communicate the designs effectively with the fullstack engineering team for development.
Explore machine learning models that can be fitted on top of the data pipelines.
Mandatory Qualifications Skills:
Deep knowledge of Scala and Java programming languages is mandatory
Strong background in streaming data frameworks (Apache Flink, Apache Spark) is mandatory
Good understanding and hands on skills on streaming messaging platforms such as Kafka
Familiarity with R, C and Python is an asset
Analytical mind and business acumen with strong math skills (e.g. statistics, algebra)
Problem-solving aptitude
Excellent communication and presentation skills
at Magic9 Media and Consumer Knowledge Pvt. Ltd.
Job Description
This requirement is to service our client which is a leading big data technology company that measures what viewers consume across platforms to enable marketers make better advertising decisions. We are seeking a Senior Data Operations Analyst to mine large-scale datasets for our client. Their work will have a direct impact on driving business strategies for prominent industry leaders. Self-motivation and strong communication skills are both must-haves. Ability to work in a fast-paced work environment is desired.
Problems being solved by our client:
Measure consumer usage of devices linked to the internet and home networks including computers, mobile phones, tablets, streaming sticks, smart TVs, thermostats and other appliances. There are more screens and other connected devices in homes than ever before, yet there have been major gaps in understanding how consumers interact with this technology. Our client uses a measurement technology to unravel dynamics of consumers’ interactions with multiple devices.
Duties and responsibilities:
- The successful candidate will contribute to the development of novel audience measurement and demographic inference solutions.
- Develop, implement, and support statistical or machine learning methodologies and processes.
- Build, test new features and concepts and integrate into production process
- Participate in ongoing research and evaluation of new technologies
- Exercise your experience in the development lifecycle through analysis, design, development, testing and deployment of this system
- Collaborate with teams in Software Engineering, Operations, and Product Management to deliver timely and quality data. You will be the knowledge expert, delivering quality data to our clients
Qualifications:
- 3-5 years relevant work experience in areas as outlined below
- Experience in extracting data using SQL from large databases
- Experience in writing complex ETL processes and frameworks for analytics and data management. Must have experience in working on ETL tools.
- Master’s degree or PhD in Statistics, Data Science, Economics, Operations Research, Computer Science, or a similar degree with a focus on statistical methods. A Bachelor’s degree in the same fields with significant, demonstrated professional research experience will also be considered.
- Programming experience in scientific computing language (R, Python, Julia) and the ability to interact with relational data (SQL, Apache Pig, SparkSQL). General purpose programming (Python, Scala, Java) and familiarity with Hadoop is a plus.
- Excellent verbal and written communication skills.
- Experience with TV or digital audience measurement or market research data is a plus.
- Familiarity with systems analysis or systems thinking is a plus.
- Must be comfortable with analyzing complex, high-volume and high-dimension data from varying sources
- Excellent verbal, written and computer communication skills
- Ability to engage with Senior Leaders across all functional departments
- Ability to take on new responsibilities and adapt to changes
We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP driven medical coding. Some of the problems include improving our ICD code recommendations , clinical named entity recognition and information extraction from clinical notes.
This is a role for highly technical machine learning & data engineers who combine outstanding oral and written communication skills, and the ability to code up prototypes and productionalize using a large range of tools, algorithms, and languages. Most importantly they need to have the ability to autonomously plan and organize their work assignments based on high-level team goals.
You will be responsible for setting an agenda to develop and ship machine learning models that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company, and help build a foundation of tools and practices used by quantitative staff across the company.
What you will achieve:
-
Define the research vision for data science, and oversee planning, staffing, and prioritization to make sure the team is advancing that roadmap
-
Invest in your team’s skills, tools, and processes to improve their velocity, including working with engineering counterparts to shape the roadmap for machine learning needs
-
Hire, retain, and develop talented and diverse staff through ownership of our data science hiring processes, brand, and functional leadership of data scientists
-
Evangelise machine learning and AI internally and externally, including attending conferences and being a thought leader in the space
-
Partner with the executive team and other business leaders to deliver cross-functional research work and models
Required Skills:
-
Strong background in classical machine learning and machine learning deployments is a must and preferably with 4-8 years of experience
-
Knowledge of deep learning & NLP
-
Hands-on experience in TensorFlow/PyTorch, Scikit-Learn, Python, Apache Spark & Big Data platforms to manipulate large-scale structured and unstructured datasets.
-
Experience with GPU computing is a plus.
-
Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization. This could be through technical leadership with ownership over a research agenda, or developing a team as a personnel manager in a new area at a larger company.
-
Expert-level experience with a wide range of quantitative methods that can be applied to business problems.
-
Evidence you’ve successfully been able to scope, deliver and sell your own research in a way that shifts the agenda of a large organization.
-
Excellent written and verbal communication skills on quantitative topics for a variety of audiences: product managers, designers, engineers, and business leaders.
-
Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
Qualifications
-
Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization
-
Expert-level experience with machine learning that can be applied to business problems
-
Evidence you’ve successfully been able to scope, deliver and sell your own work in a way that shifts the agenda of a large organization
-
Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
-
Degree in a field that has very applicable use of data science / statistics techniques (e.g. statistics, applied math, computer science, OR a science field with direct statistics application)
-
5+ years of industry experience in data science and machine learning, preferably at a software product company
-
3+ years of experience managing data science teams, incl. managing/grooming managers beneath you
-
3+ years of experience partnering with executive staff on data topics