Job Description
We are looking for an experienced engineer to join our data science team and help us design, develop, and deploy machine learning models in production. You will build robust models, prepare their controlled deployment to production, and provide the means to monitor their performance and stability after release.
What You’ll Do (including but not limited to):
- Preparing datasets needed to train and validate our machine learning models
- Anticipating and building solutions for problems that interrupt availability, performance, and stability in our systems, services, and products at scale
- Defining and implementing metrics to evaluate model performance, both computational (such as CPU and memory usage) and ML-specific (such as precision, recall, and F1; a brief sketch follows this list)
- Supporting the deployment of machine learning models on our infrastructure, including containerization, instrumentation, and versioning
- Supporting the whole lifecycle of our machine learning models, including gathering data for retraining, A/B testing, and redeployments
- Developing, testing, and evaluating tools for machine learning model deployment, monitoring, and retraining
- Working closely within a distributed team to analyze and apply innovative solutions over billions of documents
- Supporting solutions ranging from rule-based systems and classical ML techniques to the latest deep learning systems
- Partnering with cross-functional team members to bring large-scale data engineering solutions to production
- Communicating your approach and results to a wider audience through presentations
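To illustrate the ML-performance metrics above, here is a minimal Python sketch using scikit-learn; the label arrays are made-up stand-ins, not output from any real model:

    # Hedged sketch: precision, recall, and F1 with scikit-learn.
    from sklearn.metrics import precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # hypothetical ground-truth labels
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical model predictions

    print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
    print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
    print("F1:       ", f1_score(y_true, y_pred))         # harmonic mean of the two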
Your Qualifications:
- Demonstrated success with machine learning in a SaaS or cloud environment, with hands-on knowledge of model creation and deployment in production at scale
- Good knowledge of traditional machine learning methods and neural networks
- Experience with practical machine learning modeling, especially time-series forecasting, analysis, and causal inference
- Experience with data mining algorithms and statistical modeling techniques for anomaly detection in time series, such as clustering, classification, ARIMA, and decision trees, is preferred (a brief ARIMA sketch follows this list)
- Ability to implement data import, cleansing and transformation functions at scale
- Fluency in Docker, Kubernetes
- Working knowledge of relational and dimensional data models with appropriate visualization techniques such as PCA.
- Solid English skills to effectively communicate with other team members
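As a sketch of the ARIMA-based anomaly detection named above, the following assumes statsmodels and a synthetic series; the model order and the 3-sigma threshold are illustrative choices, not a prescribed method:

    # Hedged sketch: residual-based anomaly detection with ARIMA (statsmodels).
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(0)
    series = np.sin(np.linspace(0, 20, 200)) + rng.normal(0, 0.1, 200)
    series[120] += 2.0  # inject a synthetic anomaly

    fit = ARIMA(series, order=(2, 0, 1)).fit()   # illustrative order
    resid = fit.resid                            # in-sample residuals
    threshold = 3 * resid.std()                  # flag points far from the fit
    print("anomalous indices:", np.where(np.abs(resid) > threshold)[0])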
Due to the nature of the role, it would also be nice if you have:
- Experience with large datasets and distributed computing, especially with the Google Cloud Platform
- Fluency in at least one deep learning framework: PyTorch, TensorFlow / Keras
- Experience with NoSQL and graph databases
- Experience working in a Colab, Jupyter, or Python notebook environment
- Some experience with monitoring, analysis, and alerting tools like New Relic, Prometheus, and the ELK stack
- Knowledge of Java, Scala, or Go
- Familiarity with Kubeflow
- Experience with transformers, for example the Hugging Face libraries (a brief sketch follows this list)
- Experience with OpenCV
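For the Hugging Face experience mentioned above, a minimal sketch of the transformers pipeline API; the default model download and the input sentence are illustrative:

    # Hedged sketch: text classification with the Hugging Face pipeline API.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # downloads a default model
    print(classifier("This deployment pipeline works beautifully."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]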
About Egnyte
In a content-critical age, Egnyte fuels business growth by enabling content-rich business processes, while also providing organizations with visibility and control over their content assets. Egnyte’s cloud-native content services platform leverages the industry’s leading content intelligence engine to deliver a simple, secure, and vendor-neutral foundation for managing enterprise content across business applications and storage repositories. More than 16,000 customers trust Egnyte to enhance employee productivity, automate data management, and reduce file-sharing cost and complexity. Investors include Google Ventures, Kleiner Perkins Caufield & Byers, and Goldman Sachs. For more information, visit www.egnyte.com.
Must Have Skills:
- Solid knowledge of DWH, ETL, and big data concepts
- Excellent SQL skills, including knowledge of SQL analytics functions (a brief sketch follows this list)
- Working experience with an ETL tool, e.g., SSIS / Informatica
- Working experience with Azure or AWS big data tools
- Experience implementing data jobs (batch / real-time streaming)
- Excellent written and verbal communication skills in English; self-motivated, with a strong sense of ownership and a readiness to learn new tools and technologies
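As an example of the SQL analytics (window) functions referenced above, a small Python sketch run against SQLite (window functions require SQLite 3.25+); the table and values are invented:

    # Hedged sketch: SQL window functions, run via sqlite3 for portability.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (region TEXT, amount INT);
        INSERT INTO sales VALUES ('east', 10), ('east', 20), ('west', 5), ('west', 15);
    """)
    rows = conn.execute("""
        SELECT region, amount,
               SUM(amount) OVER (PARTITION BY region) AS region_total,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
        FROM sales
    """).fetchall()
    for row in rows:
        print(row)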
Preferred Skills:
- Experience with PySpark / Spark SQL (a brief sketch follows this list)
- AWS Data Tools (AWS Glue, AWS Athena)
- Azure Data Tools (Azure Databricks, Azure Data Factory)
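For the PySpark / Spark SQL item above, a minimal sketch assuming pyspark is installed; the data and column names are illustrative:

    # Hedged sketch: registering a DataFrame and querying it with Spark SQL.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()
    df = spark.createDataFrame([("sku-1", 10.0), ("sku-2", 25.5)],
                               ["product_id", "price"])
    df.createOrReplaceTempView("products")
    spark.sql("SELECT product_id, price * 1.18 AS price_with_tax "
              "FROM products").show()
    spark.stop()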
Other Skills:
- Knowledge of Azure Blob Storage, Azure File Storage, AWS S3, and Elasticsearch / RediSearch
- Domain/functional knowledge (across pricing, promotions, and assortment)
- Implementation experience with a schema and data validator framework (Python / Java / SQL)
- Knowledge of DQS and MDM
Key Responsibilities:
- Independently work on ETL / DWH / big data projects
- Gather and process raw data at scale.
- Design and develop data applications using selected tools and frameworks as required and requested.
- Read, extract, transform, stage, and load data to selected tools and frameworks as required.
- Perform tasks such as writing scripts, scraping the web, calling APIs, and writing SQL queries.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyse processed data.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
Responsibility: A smart resource with excellent communication skills.
● Proficiency in Linux.
● Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
● Must have SQL knowledge and experience with relational databases and query authoring, as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena.
● Must have experience with Python/Scala.
● Must have experience with Big Data technologies like Apache Spark.
● Must have experience with Apache Airflow (a brief DAG sketch follows this list).
● Experience with data pipelines and ETL tools like AWS Glue.
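For the Apache Airflow item above, a minimal DAG sketch written against the Airflow 2.x API; the DAG id, schedule, and task body are illustrative placeholders:

    # Hedged sketch: a single-task Airflow DAG (Airflow 2.x API assumed).
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pulling raw data...")  # stand-in for a real extract step

    with DAG(
        dag_id="daily_etl_sketch",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="extract", python_callable=extract)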
Role Summary
As a Data Engineer, you will be an integral part of our Data Engineering team, supporting an event-driven serverless data engineering pipeline on the AWS cloud and assisting in the end-to-end analysis, development, and maintenance of data pipelines and systems (DataOps). You will work closely with fellow data engineers and production support to ensure the availability and reliability of data for analytics and business intelligence purposes.
Requirements:
· Around 4 years of working experience in data warehousing / BI systems.
· Strong hands-on experience with Snowflake and strong programming skills in Python (a brief sketch follows this list)
· Strong hands-on SQL skills
· Knowledge of cloud databases such as Snowflake, Redshift, Google BigQuery, RDS, etc.
· Knowledge of dbt for cloud databases
· AWS services such as SNS, SQS, ECS, Kinesis, and Lambda, plus Docker
· Solid understanding of ETL processes, and data warehousing concepts
· Familiarity with version control systems (e.g., Git / Bitbucket) and collaborative development practices in an agile framework
· Experience with scrum methodologies
· Infrastructure build tools such as CFT / Terraform are a plus.
· Knowledge of Denodo, data cataloguing tools, and data quality mechanisms is a plus.
· Strong team player with good communication skills.
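For the Snowflake-plus-Python requirement above, a minimal sketch using the official snowflake-connector-python package; the credentials are placeholders, not real values:

    # Hedged sketch: querying Snowflake from Python.
    import snowflake.connector

    conn = snowflake.connector.connect(
        user="YOUR_USER",          # placeholder credentials
        password="YOUR_PASSWORD",
        account="YOUR_ACCOUNT",
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT CURRENT_VERSION()")
        print(cur.fetchone())
    finally:
        conn.close()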
Overview: OptiSol Business Solutions
OptiSol was named to this year's Best Companies to Work For list by Great Place to Work. We are a team of 500+ Agile employees with a development center in India and global offices in the US, UK, Australia, Ireland, Sweden, and Dubai. In our 16+ years of joyful journey, we have built about 500+ digital solutions and have 200+ happy and satisfied clients across 24 countries.
Benefits of working with OptiSol:
· Great Learning & Development program
· Flextime, Work-at-Home & Hybrid Options
· A knowledgeable, high-achieving, experienced & fun team.
· Spot Awards & Recognition.
· The chance to be a part of the next success story.
· A competitive base salary.
More than just a job, we offer an opportunity to grow. Are you the one who looks to build your future and build your dream? We have the job for you, to make your dream come true.
TOP 3 SKILLS
Python (Language)
Spark Framework
Spark Streaming
Docker / Jenkins / Spinnaker
AWS
Hive Queries
He/she should be a good coder.
Preferred: Airflow
Must-have experience:
Python
Spark framework and streaming
Exposure to the machine learning lifecycle is mandatory.
Project:
This is a search-domain project: for any search activity happening on the website, this team builds the model, creating sorting/scoring models for each search. This modeling is done by the data scientists. The team works mostly on the streaming side of the data, so the candidate would work extensively on Spark Streaming, and there will be a lot of machine learning work.
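As a rough illustration of that streaming-scoring work, a minimal Spark Structured Streaming sketch; the Kafka broker, topic, and the stand-in "score" are assumptions (a real job would call the data scientists' model, and the Kafka source needs the spark-sql-kafka package on the classpath):

    # Hedged sketch: scoring search events from a Kafka stream with PySpark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, length

    spark = SparkSession.builder.appName("search-stream-sketch").getOrCreate()
    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
              .option("subscribe", "search-events")                 # assumed topic
              .load())

    scored = (events.select(col("value").cast("string").alias("query"))
              .withColumn("score", length("query")))  # toy score, not the real model

    query = scored.writeStream.format("console").start()
    query.awaitTermination()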
INTERVIEW INFORMATION
3-4 rounds.
1st round based on data engineering batch-processing experience.
2nd round based on data engineering streaming experience.
3rd round based on the ML lifecycle (the 3rd round can be a techno-functional round based on the earlier feedback; otherwise, a 4th, functional round will be held if required).
Required Experience
· 3+ years of relevant technical experience in a data analyst role
· Intermediate / expert skills with SQL and basic statistics
· Experience in advanced SQL
· Python programming- Added advantage
· Strong problem solving and structuring skills
· Automation in connecting various data sources and representing the data through dashboards
· Excellent with numbers and able to communicate data points through various reports/templates
· Ability to communicate effectively internally and outside Data Analytics team
· Proactively take up work responsibilities and handle ad hoc requests as needed
· Ability and desire to take ownership of and initiative for analysis; from requirements clarification to deliverable
· Strong technical communication skills; both written and verbal
· Ability to understand and articulate the "big picture" and simplify complex ideas
· Ability to identify and learn applicable new techniques independently as needed
· Must have worked with various Databases (Relational and Non-Relational) and ETL processes
· Must have experience in handling large volumes of data and adhering to optimization and performance standards
· Should have the ability to analyse and provide relationship views of the data from different angles
· Must have excellent Communication skills (written and oral).
· Knowing Data Science is an added advantage
Required Skills
MySQL, Advanced Excel, Tableau, reporting and dashboards, MS Office, VBA, analytical skills
Preferred Experience
· Strong understanding of relational databases such as MySQL
· Prior experience working remotely full-time
· Prior experience working in advanced SQL
· Experience with one or more BI tools, such as Superset, Tableau etc.
· High level of logical and mathematical ability in Problem Solving
Responsibilities:
- Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
- Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
- Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
- Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
- Collaborate with product development and dev ops teams in implementing the data collection and aggregation solutions
- Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
- Analyse large amounts of information to discover trends and patterns
- Mine and analyse data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
Requirements
- Bachelor’s or Master’s degree in a highly numerate discipline such as Engineering, Science, or Economics
- 2-6 years of proven experience working as a Data Engineer preferably in ecommerce/web based or consumer technologies company
- Hands-on experience working with big data tools like Hadoop, Spark, Flink, Kafka, and so on
- Good understanding of AWS ecosystem for big data analytics
- Hands on experience in creating data pipelines either using tools or by independently writing scripts
- Hands on experience in scripting languages like Python, Scala, Unix Shell scripting and so on
- Strong problem solving skills with an emphasis on product development.
- Experience using business intelligence tools (e.g., Tableau, Power BI) would be an added advantage (not mandatory)
Position Name: Software Developer
Required Experience: 3+ Years
Number of positions: 4
Qualifications: Master’s or Bachelor’s degree in Engineering, Computer Science, or equivalent (BE/BTech or MS in Computer Science).
Key Skills: Python, Django, Nginx, Linux, Sanic, Pandas, NumPy, Snowflake, SciPy, Data Visualization, Redshift, Big Data, Charting
Compensation - As per industry standards.
Joining - Immediate joining is preferable.
Required Skills:
- Strong Experience in Python and web frameworks like Django, Tornado and/or Flask
- Experience in data analytics using standard Python libraries such as Pandas, NumPy, and Matplotlib (a brief sketch follows this list)
- Conversant in implementing charts using charting libraries like Highcharts, d3.js, c3.js, and dc.js, and data visualization tools like Plotly and ggplot
- Experience handling large databases and data warehouse technologies like MongoDB, MySQL, Snowflake, and Redshift
- Experience in building APIs, Multi-threading for tasks on Linux platform
- Exposure to finance and capital markets will be an added advantage.
- Strong understanding of software design principles, algorithms, data structures, design patterns, and multithreading concepts.
- Experience building highly available distributed systems on cloud infrastructure, or exposure to the architectural patterns of large, high-scale web applications.
- Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3
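For the pandas / NumPy / Matplotlib analytics item in the list above, a minimal sketch; the DataFrame contents are invented:

    # Hedged sketch: quick analytics and a chart with pandas + Matplotlib.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.DataFrame({"month": ["Jan", "Feb", "Mar"],
                       "revenue": [120, 135, 128]})
    print(df.describe())

    df.plot(x="month", y="revenue", kind="bar", legend=False)
    plt.ylabel("revenue")
    plt.tight_layout()
    plt.savefig("revenue.png")  # save to file; no display required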
Company Description:
Reval Analytical Services is a wholly owned subsidiary of Virtua Research Inc., US. It is a financial services technology company focused on consensus analytics, peer analytics, and web-enabled information delivery. The company’s unique combination of investment research experience, modeling expertise, and software development capabilities enables it to provide industry-leading financial research tools and services for investors, analysts, and corporate management.
Website: www.virtuaresearch.com
What you will be doing:
As a part of the Global Credit Risk and Data Analytics team, this person will be responsible for carrying out analytical initiatives such as the following:
- Dive into the data and identify patterns
- Development of end-to-end Credit models and credit policy for our existing credit products
- Leverage alternate data to develop best-in-class underwriting models
- Working on Big Data to develop risk analytical solutions
- Development of Fraud models and fraud rule engine
- Collaborate with various stakeholders (e.g. tech, product) to understand and design best solutions which can be implemented
- Working on cutting-edge techniques e.g. machine learning and deep learning models
Examples of projects done in the past:
- Lazypay credit risk model using the CatBoost modelling technique; end-to-end pipeline for feature engineering and model deployment in production using Python (a brief CatBoost sketch follows this list)
- Fraud model development, deployment and rules for EMEA region
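In the spirit of the CatBoost project above, a minimal classification sketch; the features, labels, and hyperparameters are synthetic illustrations, not the actual Lazypay model:

    # Hedged sketch: a toy CatBoost credit-risk classifier.
    from catboost import CatBoostClassifier

    X = [[35, 52000, 2], [22, 18000, 5], [41, 75000, 1], [29, 30000, 4]]
    y = [0, 1, 0, 1]  # made-up default / no-default labels

    model = CatBoostClassifier(iterations=100, verbose=False)
    model.fit(X, y)
    print(model.predict_proba([[33, 45000, 3]]))  # [P(no default), P(default)]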
Basic Requirements:
- 1-3 years of work experience as a data scientist (in the credit domain)
- 2016 or 2017 batch from a premium college (e.g., B.Tech. from IITs or NITs, Economics from DSE/ISI, etc.)
- Strong problem-solving skills and the ability to understand and execute complex analyses
- Experience in at least one of R/Python/SAS, plus SQL
- Experience in the credit industry (fintech/bank)
- Familiarity with the best practices of Data Science
Add-on Skills:
- Experience in working with big data
- Solid coding practices
- Passion for building new tools/algorithms
- Experience in developing Machine Learning models