- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lakes
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark
- Create and manage cloud resources in AWS
- Data ingestion from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing using Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to ensure the right data enters the platform and to verify the results of calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
About Consulting and Services company
We are looking for a technically driven "MLOps Engineer" for one of our premium clients
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration
(Azure/AWS/GCP) with strong knowledge of cloud services integration, and cloud security
• Expertise in setting up CI/CD processes and in building and maintaining secure DevOps pipelines with at
least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and
container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure
DevOps), version control (Git, GitHub, GitLab), orchestration/DAG tools (e.g., Argo, Airflow)
• Hands-on coding skills in Python 3 (e.g., APIs), including automated testing frameworks and libraries
(e.g., pytest), Infrastructure as Code (e.g., Terraform), and Kubernetes artifacts (e.g.,
deployments, operators, Helm charts)
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking,
model governance, packaging, deployment, feature store)
• Practical knowledge delivering and maintaining production software such as APIs and cloud
• Knowledge of SQL (intermediate level or more preferred) and familiarity working with at least
one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
Enterprise Minds is looking for a Data Scientist.
Strong in Python and PySpark.
Immediate joiners preferred.
- B.E. in Computer Science or equivalent.
- In-depth knowledge of machine learning algorithms and their applications including
practical experience with and theoretical understanding of algorithms for classification,
regression and clustering.
- Hands-on experience in computer vision and deep learning projects that solve real-world
problems involving vision tasks such as object detection, object tracking, instance
segmentation, activity detection, depth estimation, optical flow, multi-view geometry,
domain adaptation, etc.
- Strong understanding of modern and traditional Computer Vision Algorithms.
- Experience in one of the Deep Learning Frameworks / Networks: PyTorch, TensorFlow,
Darknet (YOLO v4, v5), U-Net, Mask R-CNN, EfficientDet, BERT, etc.
- Proficiency with CNN architectures such as ResNet, VGG, U-Net, MobileNet, pix2pix,
and CycleGAN.
- Experienced user of libraries such as OpenCV, scikit-learn, matplotlib and pandas.
- Ability to transform research articles into working solutions to solve real-world problems.
- High proficiency in Python programming.
- Familiar with software development practices/pipelines (DevOps: Kubernetes, Docker
containers, CI/CD tools).
- Strong communication skills.
Responsible for the development and implementation of machine learning algorithms and techniques to solve business problems and optimize member experiences. Primary duties may include, but are not limited to:
- Design machine learning projects to address specific business problems determined in consultation with business partners.
- Work with datasets of varying size and complexity, including both structured and unstructured data.
- Pipe and process massive data streams in distributed computing environments such as Hadoop to facilitate analysis.
- Implement batch and real-time model scoring to drive actions.
- Develop machine learning algorithms to build customized solutions that go beyond standard industry tools and lead to innovative solutions.
- Develop sophisticated visualizations of analysis output for business users.
BS/MA/MS/PhD in Statistics, Computer Science, Mathematics, Machine Learning, Econometrics, Physics, Biostatistics or a related quantitative discipline. 2-4 years of experience in predictive analytics and advanced expertise with software such as Python, or any combination of education and experience that would provide an equivalent background. Experience in the healthcare sector. Experience in deep learning strongly preferred.
Required Technical Skill Set:
- Full cycle of building machine learning solutions:
  - Understanding of a wide range of algorithms and the problems they solve
  - Data preparation and analysis
  - Model training and validation
  - Model application to the problem
- Experience using the full range of open source programming tools and utilities
- Experience working on end-to-end data science project implementations.
- 2+ years of experience with development and deployment of Machine Learning applications
- 2+ years of experience with NLP approaches in a production setting
- Experience in building models using bagging and boosting algorithms
- Exposure/experience in building Deep Learning models for NLP/Computer Vision use cases preferred
- Ability to write efficient code with a good understanding of core data structures and algorithms is critical
- Strong Python skills following software engineering best practices
- Experience using code versioning tools such as Git and Bitbucket
- Experience working on Agile projects
- Comfort and familiarity with SQL and the Hadoop ecosystem of tools, including Spark
- Good to have: experience managing big data with efficient query programs
- Good to have: experience training ML models with tools such as SageMaker, Kubeflow, etc.
- Good to have: experience with frameworks that depict model interpretability, using libraries such as LIME, SHAP, etc.
- Experience in the healthcare sector is preferred
- MS/M.Tech or PhD is a plus
- Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
- Experience in migrating on-premises data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Azure Synapse or Azure SQL Data Warehouse
- Spark on Azure, as available in HDInsight and Databricks
Tiger Analytics is a global AI & analytics consulting firm. With data and technology at the core of our solutions, we are solving some of the toughest problems out there. Our culture is modeled around expertise and mutual respect with a team-first mindset. Working at Tiger, you’ll be at the heart of this AI revolution. You’ll work with teams that push the boundaries of what is possible and build solutions that energize and inspire.
We are headquartered in Silicon Valley and have delivery centres across the globe. The role below is for our Chennai or Bangalore office, or you can choose to work remotely.
About the Role:
As an Associate Director - Data Science at Tiger Analytics, you will lead data science aspects of end-to-end client AI & analytics programs. Your role will be a combination of hands-on contribution, technical team management, and client interaction.
• Work closely with internal teams and client stakeholders to design analytical approaches to
solve business problems
• Develop and enhance solutions to a broad range of cutting-edge data analytics and machine learning
problems across a variety of industries.
• Work on various aspects of the ML ecosystem – model building, ML pipelines, logging &
versioning, documentation, scaling, deployment, monitoring, and maintenance.
• Lead a team of data scientists and engineers to embed AI and analytics into the client
business decision processes.
• High level of proficiency in a structured programming language, e.g. Python, R.
• Experience designing data science solutions to business problems
• Deep understanding of ML algorithms for common use cases in both structured and
unstructured data ecosystems.
• Comfortable with large scale data processing and distributed computing
• Excellent written and verbal communication skills
• 10+ years of experience, of which 8 years are relevant data science experience, including hands-on
Designation will be commensurate with expertise/experience. Compensation packages are among the best in the industry.