- Minimum 1 years of relevant experience, in PySpark (mandatory)
- Hands on experience in development, test, deploy, maintain and improving data integration pipeline in AWS cloud environment is added plus
- Ability to play lead role and independently manage 3-5 member of Pyspark development team
- EMR ,Python and PYspark mandate.
- Knowledge and awareness working with AWS Cloud technologies like Apache Spark, , Glue, Kafka, Kinesis, and Lambda in S3, Redshift, RDS
2 Years - 10 Years
Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
|Degree / Diploma||Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering|
|Specialization / Subject||Any Specialisation|
|Job Type||Full Time|
We are looking for an exceptional Data Scientist Lead / Manager who is passionate about data and motivated to build large scale machine learning solutions to shine our data products. This person will be contributing to the analytics of data for insight discovery and development of machine learning pipeline to support modeling of terabytes of daily data for various use cases.
Location: Pune (Initially remote due to COVID 19)
*****Looking for someone who can start immediately / Within a month. Hands-on experience in Python programming (Minimum 5 Years) is a must.
About the Organisation :
- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom and India.
- You will gain work experience in a global environment. We speak over 20 different languages, from more than 16 different nationalities and over 42% of our staff are multilingual.
• 8+ years relevant working experience
• Master / Bachelors in computer science or engineering
• Working knowledge of Python and SQL
• Experience in time series data, data manipulation, analytics, and visualization
• Experience working with large-scale data
• Proficiency of various ML algorithms for supervised and unsupervised learning
• Experience working in Agile/Lean model
• Experience with Java and Golang is a plus
• Experience with BI toolkit such as Tableau, Superset, Quicksight, etc is a plus
• Exposure to building large-scale ML models using one or more of modern tools and libraries such as AWS Sagemaker, Spark ML-Lib, Dask, Tensorflow, PyTorch, Keras, GCP ML Stack
• Exposure to modern Big Data tech such as Cassandra/Scylla, Kafka, Ceph, Hadoop, Spark
• Exposure to IAAS platforms such as AWS, GCP, Azure
Typical persona: Data Science Manager/Architect
Experience: 8+ years programming/engineering experience (with at least last 4 years in Data science in a Product development company)
Type: Hands-on candidate only
a. Hands-on Python: pandas,scikit-learn
b. Working knowledge of Kafka
c. Able to carry out own tasks and help the team in resolving problems - logical or technical (25% of job)
d. Good on analytical & debugging skills
e. Strong communication skills
Desired (in order of priorities)
a.Go (Strong advantage)
b. Airflow (Strong advantage)
c. Familiarity & working experience on more than one type of database: relational, object, columnar, graph and other unstructured databases
d. Data structures, Algorithms
e. Experience with multi-threaded and thread sync concepts
f. AWS Sagemaker
Responsible for the development and implementation of machine learning algorithms and techniques to solve business problems and optimize member experiences. Primary duties may include are but not limited to: Design machine learning projects to address specific business problems determined by consultation with business partners. Work with data-sets of varying degrees of size and complexity including both structured and unstructured data. Piping and processing massive data-streams in distributed computing environments such as Hadoop to facilitate analysis. Implements batch and real-time model scoring to drive actions. Develops machine learning algorithms to build customized solutions that go beyond standard industry tools and lead to innovative solutions. Develop sophisticated visualization of analysis output for business users.
BS/MA/MS/PhD in Statistics, Computer Science, Mathematics, Machine Learning, Econometrics, Physics, Biostatistics or related Quantitative disciplines. 2-4 years of experience in predictive analytics and advanced expertise with software such as Python, or any combination of education and experience which would provide an equivalent background. Experience in the healthcare sector. Experience in Deep Learning strongly preferred.
Required Technical Skill Set:
- Full cycle of building machine learning solutions,
o Understanding of wide range of algorithms and their corresponding problems to solve
o Data preparation and analysis
o Model training and validation
o Model application to the problem
- Experience using the full open source programming tools and utilities
- Experience in working in end-to-end data science project implementation.
- 2+ years of experience with development and deployment of Machine Learning applications
- 2+ years of experience with NLP approaches in a production setting
- Experience in building models using bagging and boosting algorithms
- Exposure/experience in building Deep Learning models for NLP/Computer Vision use cases preferred
- Ability to write efficient code with good understanding of core Data Structures/algorithms is critical
- Strong python skills following software engineering best practices
- Experience in using code versioning tools like GIT, bit bucket
- Experience in working in Agile projects
- Comfort & familiarity with SQL and Hadoop ecosystem of tools including spark
- Experience managing big data with efficient query program good to have
- Good to have experience in training ML models in tools like Sage Maker, Kubeflow etc.
- Good to have experience in frameworks to depict interpretability of models using libraries like Lime, Shap etc.
- Experience with Health care sector is preferred
- MS/M.Tech or PhD is a plus
What you’ll do
- Deliver plugins for our Python-based ETL pipelines.
- Deliver Python microservices for provisioning and managing cloud infrastructure.
- Implement algorithms to analyse large data sets.
- Draft design documents that translate requirements into code.
- Deal with challenges associated with handling large volumes of data.
- Assume responsibilities from technical design through technical client support.
- Manage expectations with internal stakeholders and context-switch in a fast paced environment.
- Thrive in an environment that uses AWS and Elasticsearch extensively.
- Keep abreast of technology and contribute to the engineering strategy.
- Champion best development practices and provide mentorship.
What we’re looking for
- Experience in Python 3.
- Python libraries used for data (such as pandas, numpy).
- Performance tuning.
- Object Oriented Design and Modelling.
- Delivering complex software, ideally in a FinTech setting.
- CI/CD tools.
- Knowledge of design patterns.
- Sharp analytical and problem-solving skills.
- Strong sense of ownership.
- Demonstrable desire to learn and grow.
- Excellent written and oral communication skills.
- Mature collaboration and mentoring abilities.
About SteelEye Culture
- Work from home until you are vaccinated against COVID-19
- Top of the line health insurance • Order discounted meals every day from a dedicated portal
- Fair and simple salary structure
- 30+ holidays in a year
- Fresh fruits every day
- Centrally located. 5 mins to the nearest metro station (MG Road)
- Measured on output and not input
About the role
- Collaborating with a team of like-minded and experienced engineers for Tier 1 customers, you will focus on data engineering on large complex data projects. Your work will have an impact on platforms that handle crores of customers and millions of transactions daily.
- As an engineer, you will use the latest cloud services to design and develop reusable core components and frameworks to modernise data integrations in a cloud first world and own those integrations end to end working closely with business units. You will design and build for efficiency, reliability, security and scalability. As a consultant, you will help drive a data engineering culture and advocate best practices.
- 1-6 years of relevant experience
- Strong SQL skills and data literacy
- Hands-on experience designing and developing data integrations, either in ETL tools, cloud native tools or in custom software
- Proficiency in scripting and automation (e.g. PowerShell, Bash, Python)
- Experience in an enterprise data environment
- Strong communication skills
- Ability to work on data architecture, data models, data migration, integration and pipelines
- Ability to work on data platform modernisation from on-premise to cloud-native
- Proficiency in data security best practices
- Stakeholder management experience
- Positive attitude with the flexibility and ability to adapt to an ever-changing technology landscape
- Desire to gain breadth and depth of technologies to support customer's vision and project objectives
What to expect if you join Servian?
- Learning & Development: We invest heavily in our consultants and offer internal training weekly (both technical and non-technical alike!) and abide by a ‘You Pass We Pay” policy.
- Career progression: We take a longer term view of every hire. We have a flat org structure and promote from within. Every hire is developed as a future leader and client adviser.
- Variety of projects: As a consultant, you will have the opportunity to work across multiple projects across our client base significantly increasing your skills and exposure in the industry.
- Great culture: Working on the latest Apple MacBook pro in our custom designed offices in the heart of leafy Jayanagar, we provide a peaceful and productive work environment close to shops, parks and metro station.
- Professional development: We invest heavily in professional development both technically, through training and guided certification pathways, and in consulting, through workshops in client engagement and communication. Growth in our organisation happens from the growth of our people.
We are looking for an outstanding ML Architect (Deployments) with expertise in deploying Machine Learning solutions/models into production and scaling them to serve millions of customers. A candidate with an adaptable and productive working style which fits in a fast-moving environment.
- 5+ years deploying Machine Learning pipelines in large enterprise production systems.
- Experience developing end to end ML solutions from business hypothesis to deployment / understanding the entirety of the ML development life cycle.
- Expert in modern software development practices; solid experience using source control management (CI/CD).
- Proficient in designing relevant architecture / microservices to fulfil application integration, model monitoring, training / re-training, model management, model deployment, model experimentation/development, alert mechanisms.
- Experience with public cloud platforms (Azure, AWS, GCP).
- Serverless services like lambda, azure functions, and/or cloud functions.
- Orchestration services like data factory, data pipeline, and/or data flow.
- Data science workbench/managed services like azure machine learning, sagemaker, and/or AI platform.
- Data warehouse services like snowflake, redshift, bigquery, azure sql dw, AWS Redshift.
- Distributed computing services like Pyspark, EMR, Databricks.
- Data storage services like cloud storage, S3, blob, S3 Glacier.
- Data visualization tools like Power BI, Tableau, Quicksight, and/or Qlik.
- Proven experience serving up predictive algorithms and analytics through batch and real-time APIs.
- Solid working experience with software engineers, data scientists, product owners, business analysts, project managers, and business stakeholders to design the holistic solution.
- Strong technical acumen around automated testing.
- Extensive background in statistical analysis and modeling (distributions, hypothesis testing, probability theory, etc.)
- Strong hands-on experience with statistical packages and ML libraries (e.g., Python scikit learn, Spark MLlib, etc.)
- Experience in effective data exploration and visualization (e.g., Excel, Power BI, Tableau, Qlik, etc.)
- Experience in developing and debugging in one or more of the languages Java, Python.
- Ability to work in cross functional teams.
- Apply Machine Learning techniques in production including, but not limited to, neuralnets, regression, decision trees, random forests, ensembles, SVM, Bayesian models, K-Means, etc.
Roles and Responsibilities:
Deploying ML models into production, and scaling them to serve millions of customers.
Technical solutioning skills with deep understanding of technical API integrations, AI / Data Science, BigData and public cloud architectures / deployments in a SaaS environment.
Strong stakeholder relationship management skills - able to influence and manage the expectations of senior executives.
Strong networking skills with the ability to build and maintain strong relationships with both business, operations and technology teams internally and externally.
Provide software design and programming support to projects.
Qualifications & Experience:
Engineering and post graduate candidates, preferably in Computer Science, from premier institutions with proven work experience as a Machine Learning Architect (Deployments) or a similar role for 5-7 years.
• Responsible for developing and maintaining applications with PySpark
We are building a global content marketplace that brings companies and content
creators together to scale up content creation processes across 50+ content verticals and 150+ industries. Over the past 2.5 years, we’ve worked with companies like India Today, Amazon India, Adobe, Swiggy, Dunzo, Businessworld, Paisabazaar, IndiGo Airlines, Apollo Hospitals, Infoedge, Times Group, Digit, BookMyShow, UpGrad, Yulu, YourStory, and 350+ other brands.
Our mission is to become the world’s largest content creation and distribution platform for all kinds of content creators and brands.
We are a 25+ member company and is scaling up rapidly in both team size and our ambition.
If we were to define the kind of people and the culture we have, it would be -
a) Individuals with an Extreme Sense of Passion About Work
b) Individuals with Strong Customer and Creator Obsession
c) Individuals with Extraordinary Hustle, Perseverance & Ambition
We are on the lookout for individuals who are always open to going the extra mile and thrive in a fast-paced environment. We are strong believers in building a great, enduring
a company that can outlast its builders and create a massive impact on the lives of our
employees, creators, and customers alike.
We are fortunate to be backed by some of the industry’s most prolific angel investors - Kunal Bahl and Rohit Bansal (Snapdeal founders), YourStory Media. (Shradha Sharma); Dr. Saurabh Srivastava, Co-founder of IAN and NASSCOM; Slideshare co-founder Amit Ranjan; Indifi's Co-founder and CEO Alok Mittal; Sidharth Rao, Chairman of Dentsu Webchutney; Ritesh Malik, Co-founder and CEO of Innov8; Sanjay Tripathy, former CMO, HDFC Life, and CEO of Agilio Labs; Manan Maheshwari, Co-founder of WYSH; and Hemanshu Jain, Co-founder of Diabeto.
Backed by Lightspeed Venture Partners
● Design, develop, test, deploy, maintain and improve ML models
● Implement novel learning algorithms and recommendation engines
● Apply Data Science concepts to solve routine problems of target users
● Translates business analysis needs into well-defined machine learning problems, and
selecting appropriate models and algorithms
● Create an architecture, implement, maintain and monitor various data source pipelines
that can be used across various different types of data sources
● Monitor performance of the architecture and conduct optimization
● Produce clean, efficient code based on specifications
● Verify and deploy programs and systems
● Troubleshoot, debug and upgrade existing applications
● Guide junior engineers for productive contribution to the development
The ideal candidate must -
ML and NLP Engineer
● 4 or more years of experience in ML Engineering
● Proven experience in NLP
● Familiarity with language generative model - GPT3
● Ability to write robust code in Python
● Familiarity with ML frameworks and libraries
● Hands on experience with AWS services like Sagemaker and Personalize
● Exposure to state of the art techniques in ML and NLP
● Understanding of data structures, data modeling, and software architecture
● Outstanding analytical and problem-solving skills
● Team player, an ability to work cooperatively with the other engineers.
● Ability to make quick decisions in high-pressure environments with limited information.
About MX Player (https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad&hl=en_IN">Playstore Link)
MX Player is the world’s #1 entertainment superapp offering 100,000+ hours of premium OTT (over the top) content spanning acclaimed MX Originals, Web Shows, TV (Live & OnDemand), movies, music videos and hyper-casual games, music streaming, short form video and more. With more than 1 billion installs worldwide – MX Player is present on 1 out of every 2 smartphones, making it the largest entertainment app/platform in the world.
Position : Product Analyst / Business Analyst - Ad Tech
- Driving the collection of new data that would help build the next generation of algorithms (E.g. audience segmentation, contextual targeting)
- Understanding user behavior and performing root-cause analysis of changes in data trends to identify corrections or propose desirable enhancements in product & across different verticals
- Excellent problem solving skills and the ability to make sound judgments based on trade-offs for different solutions to complex problem constraints
- Defining and monitoring KPIs for product/content/business performance and identifying ways to improve them
- Should be a strong advocate of data driven approach and drive analytics decisions by doing user testing, data analysis, and A/B testing
- Help in defining the analytics roadmap for the product
- Prior knowledge and experience in ad tech industry or other advertising platforms will be preferred
- Knowledge of Google DFP (prefered)
- R/Python (preferred)
- Any BI Tool such as tableau, sisense (preferred)
- Go getter attitude
- Ability to thrive in a fast paced dynamic environment
- Self - Starter