- Research and develop statistical learning models for data analysis
- Collaborate with product management and engineering departments to understand company needs and devise possible solutions
- Keep up to date with the latest technology trends
- Communicate results and ideas to key decision makers
- Implement new statistical or other mathematical methodologies as needed for specific models or analysis
- Optimize joint development efforts through appropriate database use and project design
Qualifications/Requirements:
- Master's or PhD in Computer Science, Electrical Engineering, Statistics, Applied Math, or equivalent fields, with a strong mathematical background
- Excellent understanding of machine learning techniques and algorithms, including clustering, anomaly detection, optimization, neural networks, etc.
- 3+ years of experience building data science-driven solutions, including data collection, feature selection, model training, and post-deployment validation
- Strong hands-on coding skills (preferably in Python) for processing large-scale data sets and developing machine learning models
- Familiarity with one or more machine learning or statistical modeling tools such as NumPy, scikit-learn, MLlib, or TensorFlow (see the sketch after this list)
- Good team player with excellent written, verbal, and presentation communication skills
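To make the modeling expectation concrete, here is a minimal sketch of one technique the list above names (anomaly detection), using scikit-learn from the tool list; the synthetic data and every parameter are illustrative assumptions, not part of the role.

```python
# Minimal sketch: anomaly detection with scikit-learn's IsolationForest.
# The synthetic data and all parameters here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
inliers = rng.normal(loc=0.0, scale=1.0, size=(500, 4))    # typical points
outliers = rng.uniform(low=-8.0, high=8.0, size=(10, 4))   # injected anomalies
X = np.vstack([inliers, outliers])

model = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
labels = model.fit_predict(X)  # +1 = inlier, -1 = anomaly

print(f"flagged {(labels == -1).sum()} of {len(X)} points as anomalies")
```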
Desired Experience:
- Experience with AWS, S3, Flink, Spark, Kafka, Elastic Search
- Knowledge and experience with NLP technology
- Previous work in a start-up environment
The Platform Data Science team works at the intersection of data science and engineering. Domain experts develop and advance platforms, including the data platform, the machine learning platform, and platforms for Forecasting, Experimentation, Anomaly Detection, Conversational AI, Underwriting of Risk, Portfolio Management, Fraud Detection & Prevention, and many more. We are also the Data Science and Analytics partners for Product and provide Behavioural Science insights across Jupiter.
About the role:
We’re looking for strong Software Engineers who can combine EMR, Redshift, Hadoop, Spark, Kafka, Elasticsearch, TensorFlow, PyTorch, and other technologies to build the next-generation Data Platform, ML Platform, and Experimentation Platform. If this sounds interesting, we’d love to hear from you!
This role involves designing and developing software products that impact many areas of our business. The individual in this role will help define requirements, create software designs, implement code to these specifications, provide thorough unit and integration testing, and support products while they are deployed and used by our stakeholders.
Key Responsibilities:
Participate in, own, and influence the architecture and design of systems
Collaborate with other engineers, data scientists, product managers
Build intelligent systems that drive decisions
Build systems that enable us to perform experiments and iterate quickly
Build platforms that enable scientists to train, deploy and monitor models at scale
Build analytical systems that drive better decision making
Required Skills:
Programming experience with at least one modern language such as Java or Scala, including object-oriented design
Experience in contributing to the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems
Bachelor’s degree in Computer Science or related field
Computer Science fundamentals in object-oriented design
Computer Science fundamentals in data structures
Computer Science fundamentals in algorithm design, problem solving, and complexity analysis
Experience in databases, analytics, big data systems or business intelligence products:
Data lake, data warehouse, ETL, ML platform
Big data tech like: Hadoop, Apache Spark
Ever dreamed of being part of new product initiatives? Do you feel the energy and excitement of working on version 1 of a product and bringing an idea on paper to life? Do you want to work on SaaS products that could become the next Uber, Airbnb, or Flipkart? We offer you the chance to be part of a team leading the development of a SaaS product.
Our organization relies on its central engineering workforce to develop and maintain a product portfolio of several different startups. Our product portfolio grows continuously as we incubate more startups, which means that different products are very likely to use different technologies, architectures & frameworks - a fun place for smart tech lovers!
We are looking for an experienced (4-8 years) Senior Data Engineer to join one of our engineering teams at our office in Hyderabad.
What would you be doing?
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies (a minimal sketch follows this list).
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
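As a rough illustration of the extract-transform-load work described above, here is a minimal pandas sketch with a SQLite target; the file name, column names, and table are hypothetical stand-ins for the SQL and AWS ‘big data’ stack the list names.

```python
# Minimal ETL sketch: extract from CSV, transform, load into a SQL store.
# The file name, column names, and SQLite target are illustrative assumptions.
import sqlite3
import pandas as pd

# Extract: read raw events (hypothetical file and columns).
raw = pd.read_csv("events.csv", parse_dates=["event_time"])

# Transform: drop incomplete rows and aggregate to daily signup counts.
clean = raw.dropna(subset=["user_id"])
daily = (
    clean.assign(day=clean["event_time"].dt.date)
         .groupby("day", as_index=False)
         .agg(signups=("user_id", "nunique"))
)

# Load: write the curated table for downstream analytics.
with sqlite3.connect("warehouse.db") as conn:
    daily.to_sql("daily_signups", conn, if_exists="replace", index=False)
```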
Who Should Apply?
- Advanced working knowledge of SQL, experience with relational databases and query authoring, and working familiarity with a variety of databases.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes supporting data transformation, data structures, metadata, dependency, and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
About CAW Studios
CAW Studios is a Product Engineering Studio. WE BUILD TRUE PRODUCT TEAMS for our clients. Each team is a small, well-balanced group of geeks and a product manager that together produce relevant and high-quality products. We believe the product development process is broken as most development companies operate as IT Services. Unlike IT Apps, we believe product development requires Ownership, Creativity, Agility, and design that scales.
CAW Studios is a Product Engineering Company of 70+ geeks. We have built these products - Interakt, CashFlo, KaiPulse, and FastBar. And we still run their complete engineering. We are part of the engineering teams for Haptik, EmailAnalytics, GrowthZone, Reliance General Insurance, and KWE Logistics.
We are obsessed with automation, DevOps, OOPS, and SOLID. We are not into one tech stack - we are into solving problems.
Find us: https://goo.gl/maps/dvR6L26JUa42
Website: https://www.cawstudios.com/
Know More: https://www.cawstudios.com/handbook
We are looking for a technically driven MLOps Engineer for one of our premium clients.
Key Skills
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration (Azure/AWS/GCP), with strong knowledge of cloud services integration and cloud security
• Expertise setting up CI/CD processes and building and maintaining secure DevOps pipelines with at least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure DevOps), version control (Git, GitHub, GitLab), and orchestration/DAG tools (e.g., Argo, Airflow, Kubeflow)
• Hands-on Python 3 coding skills (e.g., APIs), including automated testing frameworks and libraries (e.g., pytest), Infrastructure as Code (e.g., Terraform), and Kubernetes artifacts (e.g., deployments, operators, Helm charts); see the pytest sketch after this list
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking, model governance, packaging, deployment, feature store)
• Practical knowledge of delivering and maintaining production software such as APIs and cloud infrastructure
• Knowledge of SQL (intermediate level or better preferred) and familiarity with at least one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
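As one concrete illustration of the Python 3 and automated-testing expectation above, here is a minimal pytest sketch; the function under test and its cases are hypothetical, not part of the client's codebase.

```python
# Minimal sketch: automated testing with pytest.
# The function under test and its cases are hypothetical examples.
import pytest

def normalize_region(raw: str) -> str:
    """Map free-form cloud region strings to a canonical form."""
    cleaned = raw.strip().lower().replace(" ", "-")
    if not cleaned:
        raise ValueError("empty region")
    return cleaned

@pytest.mark.parametrize(
    "raw, expected",
    [("US East 1", "us-east-1"), ("  eu-west-1 ", "eu-west-1")],
)
def test_normalize_region(raw, expected):
    assert normalize_region(raw) == expected

def test_normalize_region_rejects_empty():
    with pytest.raises(ValueError):
        normalize_region("   ")
```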
Job Title
Data Analyst
Job Brief
The successful candidate will turn data into information, information into insight and insight into business decisions.
Data Analyst Job Duties
Data analyst responsibilities include conducting full-lifecycle analysis, including requirements, activities, and design. Data analysts will develop analysis and reporting capabilities, and will monitor performance and quality-control plans to identify improvements.
Responsibilities
● Interpret data, analyze results using statistical techniques and provide ongoing reports.
● Develop and implement databases, data collection systems, data analytics and other strategies that optimize statistical efficiency and quality.
● Acquire data from primary or secondary data sources and maintain databases/data systems.
● Identify, analyze, and interpret trends or patterns in complex data sets.
● Filter and “clean” data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems (a brief cleaning sketch follows this list).
● Work with management to prioritize business and information needs.
● Locate and define new process-improvement opportunities.
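A minimal pandas sketch of the filter-and-clean responsibility above; the report file, columns, and validity rules are assumptions made only for illustration.

```python
# Minimal data-cleaning sketch in pandas; file, columns, and rules are assumed.
import pandas as pd

df = pd.read_csv("performance_report.csv")

# Standardize a categorical code column and drop rows that fail basic checks.
df["status_code"] = df["status_code"].str.strip().str.upper()
valid = df["status_code"].isin({"OK", "WARN", "FAIL"})
df = df[valid & df["response_ms"].between(0, 60_000)]

# Report what survived the filters, then persist the cleaned set.
print(f"kept {len(df)} rows after cleaning")
df.to_csv("performance_report_clean.csv", index=False)
```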
Requirements
● Proven working experience as a Data Analyst or Business Data Analyst.
● Technical expertise regarding data models, database design and development, data mining, and segmentation techniques.
● Strong knowledge of and experience with reporting packages (Business Objects, etc.), databases (SQL, etc.), and programming (XML, JavaScript, or ETL frameworks).
● Knowledge of statistics and experience using statistical packages for analyzing datasets (Excel, SPSS, SAS, etc.).
● Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
● Adept at queries, report writing, and presenting findings.
Job Location: South Delhi, New Delhi
Responsibilities:
- Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
- Driving optimization, testing and tooling to improve quality.
- Reviewing and approving high-level & detailed designs to ensure that the solution delivers on the business needs and aligns with the data & analytics architecture principles and roadmap.
- Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
- Following proper SDLC (Code review, sprint process).
- Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
- Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
- Understanding various data security standards and using appropriate security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
- Supporting and contributing to development guidelines and standards for data ingestion.
- Working with data scientists and the business analytics team to assist with data ingestion and data-related technical issues.
- Designing and documenting the development & deployment flow.
Requirements:
- Experience developing REST API services using one of the Scala frameworks.
- Ability to troubleshoot and optimize complex queries on the Spark platform (see the sketch after this list).
- Expertise in building and optimizing ‘big data’ data/ML pipelines, architectures, and data sets.
- Knowledge of modelling unstructured data into structured designs.
- Experience in Big Data access and storage techniques.
- Experience in doing cost estimation based on the design and development.
- Excellent debugging skills for the technical stack mentioned above, including analyzing server logs and application logs.
- Highly organized, self-motivated, proactive, and able to propose the best design solutions.
- Good time management and multitasking skills, working to deadlines both independently and as part of a team.
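As a small illustration of the Spark troubleshooting and optimization requirement above, here is a sketch that replaces a shuffle join with a broadcast join; the table paths and column names are hypothetical, and the paths assume an S3-configured cluster.

```python
# Minimal sketch: avoiding a shuffle join in Spark by broadcasting a small table.
# Table paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-optimization-sketch").getOrCreate()

events = spark.read.parquet("s3://bucket/events/")    # large fact table
regions = spark.read.parquet("s3://bucket/regions/")  # small dimension table

# broadcast() hints Spark to ship the small table to every executor,
# replacing an expensive shuffle join with a map-side join.
joined = events.join(broadcast(regions), on="region_id", how="left")
joined.explain()  # inspect the physical plan to confirm a BroadcastHashJoin
```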
● Research and develop advanced statistical and machine learning models for analysis of large-scale, high-dimensional data.
● Dig deeper into data, understand its characteristics, evaluate alternate models, and validate hypotheses through theoretical and empirical approaches.
● Productize proven or working models into production-quality code.
● Collaborate with product management, marketing, and engineering teams in Business Units to elicit and understand their requirements and challenges, and develop potential solutions.
● Stay current with the latest research and technology ideas; share knowledge by clearly articulating results and ideas to key decision makers.
● File patents for innovative solutions that add to the company's IP portfolio.
Requirements
● 4 to 6 years of strong experience in data mining, machine learning, and statistical analysis.
● BS/MS/PhD in Computer Science, Statistics, Applied Math, or related areas from premier institutes (only IITs / IISc / BITS / top NITs or a top US university).
● Experience productizing models into code in a fast-paced start-up environment.
● Expertise in the Python programming language and fluency in analytical tools such as MATLAB, R, Weka, etc.
● Strong intuition for data and a keen aptitude for large-scale data analysis.
● Strong communication and collaboration skills.
Responsibilities
- Develop process workflows for data preparation, modeling, and mining.
- Manage configurations to build reliable datasets for analysis.
- Troubleshoot services, system bottlenecks, and application integration.
- Design, integrate, and document technical components and dependencies of the big data platform.
- Ensure best practices that can be adopted in the Big Data stack and shared across teams.
- Design and Development of Data pipeline on AWS Cloud
- Data pipeline development using PySpark, AWS, and Python.
- Developing PySpark streaming applications (a minimal sketch follows this list).
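A minimal sketch of a PySpark Structured Streaming application of the kind listed above; the Kafka broker, topic, and S3 paths are illustrative assumptions, and the job requires the spark-sql-kafka connector on the classpath.

```python
# Minimal sketch: a PySpark Structured Streaming job reading from Kafka.
# Broker address, topic, and output paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka values arrive as bytes; cast to string before downstream parsing.
parsed = stream.select(col("value").cast("string").alias("payload"))

query = (
    parsed.writeStream.format("parquet")
    .option("path", "s3://bucket/stream-output/")
    .option("checkpointLocation", "s3://bucket/checkpoints/")
    .start()
)
query.awaitTermination()
```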
Eligibility
- Hands-on experience in Spark, Python, and Cloud
- Highly analytical and data-oriented
- Good to have - Databricks
We are looking for an exceptional Software Developer for our Data Engineering India team who can contribute to building a world-class big data engineering stack that will fuel our Analytics and Machine Learning products. This person will contribute to the architecture, operation, and enhancement of:
- Our petabyte-scale data platform, with a key focus on finding solutions that can support the Analytics and Machine Learning product roadmap. Every day, terabytes of ingested data need to be processed and made available for querying and insight extraction for various use cases.
- Our bespoke Machine Learning pipelines. This will also provide opportunities to contribute to the prototyping, building, and deployment of Machine Learning models.
About the Organisation:
- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom, and India.
- You will gain work experience in a global environment. We speak over 20 different languages, from more than 16 different nationalities and over 42% of our staff are multilingual.
Job Description
Position:
Software Developer, Data Engineering team
Location: Pune (initially 100% remote due to COVID-19 for the coming year)
You:
- Have at least 4 years' experience.
- Deep technical understanding of Java or Golang.
- Production experience with Python is a big plus and an extremely valuable supporting skill for us.
- Exposure to modern Big Data tech (Cassandra/Scylla, Kafka, Ceph, the Hadoop stack, Spark, Flume, Hive, Druid, etc.), while understanding that certain problems may require completely novel solutions.
- Exposure to one or more modern ML tech stacks (Spark MLlib, TensorFlow, Keras, GCP ML stack, AWS SageMaker) is a plus.
- Experience working in an Agile/Lean model.
- Experience supporting and troubleshooting large systems.
- Exposure to configuration management tools such as Ansible or Salt.
- Exposure to IaaS platforms such as AWS, GCP, and Azure.
- Good addition: experience working with large-scale data.
- Good addition: experience architecting, developing, and operating data warehouses, big data analytics platforms, and high-velocity data pipelines.
**** Not looking for a Big Data Developer / Hadoop Developer
Company Profile:
Easebuzz is a payment solutions company (a fintech organisation) that enables online merchants to accept, process, and disburse payments through developer-friendly APIs. We focus on building plug-and-play products, including the payment infrastructure, to solve complete business problems. It is a wonderful place where everything related to payments, lending, subscriptions, and eKYC is happening at the same time.
We have been consistently profitable and are constantly developing new, innovative products; as a result, we have grown 4x over the past year alone. We are well capitalised and recently closed a $4M fundraise in March 2021 from prominent VC firms and angel investors. The company is based out of Pune and has a total strength of 180 employees. Easebuzz’s corporate culture is tied to the vision of building a workplace that breeds open communication and minimal bureaucracy. An equal-opportunity employer, we welcome and encourage diversity in the workplace. One thing you can be sure of is that you will be surrounded by colleagues who are committed to helping each other grow.
Easebuzz Pvt. Ltd. has its presence in Pune, Bangalore, Gurugram.
Salary: As per company standards.
Designation: Data Engineer
Location: Pune
Experience with ETL, Data Modeling, and Data Architecture
Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services in combination with 3rd parties:
- Spark, EMR, DynamoDB, Redshift, Kinesis, Lambda, Glue.
Experience with AWS cloud data lake for development of real-time or near real-time use cases
Experience with messaging systems such as Kafka/Kinesis for real-time data ingestion and processing (see the ingestion sketch at the end of this list)
Build data pipeline frameworks to automate high-volume and real-time data delivery
Create prototypes and proof-of-concepts for iterative development.
Experience with NoSQL databases, such as DynamoDB, MongoDB etc
Create and maintain optimal data pipeline architecture.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Evangelize a very high standard of quality, reliability, and performance for data models and algorithms that can be streamlined into the engineering and science workflows
Build and enhance data pipeline architecture by designing and implementing data ingestion solutions.
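As a brief illustration of the real-time ingestion requirement above, here is a minimal boto3 sketch that publishes records to a Kinesis stream; the stream name, region, and payload shape are assumptions made for the example.

```python
# Minimal sketch: publishing records to a Kinesis stream with boto3 for
# real-time ingestion; stream name, region, and payload are assumptions.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="ap-south-1")

def publish_event(event: dict, stream_name: str = "clickstream") -> None:
    """Send one JSON event; the partition key controls shard routing."""
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event.get("user_id", "anonymous")),
    )

publish_event({"user_id": 42, "action": "checkout", "amount_inr": 1299})
```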
Employment Type
Full-time