Job Description
We are looking for an experienced engineer to join our data science team and help us design, develop, and deploy machine learning models in production. You will develop robust models and prepare their deployment into production in a controlled manner, while providing appropriate means to monitor their performance and stability after deployment.
What You’ll Do (including, but not limited to):
- Preparing datasets needed to train and validate our machine learning models
- Anticipating and building solutions for problems that interrupt availability, performance, and stability in our systems, services, and products at scale
- Defining and implementing metrics to evaluate the performance of the models, both for computing performance (such as CPU & memory usage) and for ML performance (such as precision, recall, and F1)
- Supporting the deployment of machine learning models on our infrastructure, including containerization, instrumentation, and versioning
- Supporting the whole lifecycle of our machine learning models, including gathering data for retraining, A/B testing, and redeployments
- Developing, testing, and evaluating tools for deploying, monitoring, and retraining machine learning models
- Working closely within a distributed team to analyze and apply innovative solutions over billions of documents
- Supporting solutions ranging from rule-based systems and classical ML techniques to the latest deep learning systems
- Partnering with cross-functional team members to bring large scale data engineering solutions to production
- Communicating your approach and results to a wider audience through presentations
Your Qualifications:
- Demonstrated success with machine learning in a SaaS or Cloud environment, with hands-on knowledge of model creation and deployments in production at scale
- Good knowledge of traditional machine learning methods and neural networks
- Experience with practical machine learning modeling, especially on time-series forecasting, analysis, and causal inference.
- Experience with data mining algorithms and statistical modeling techniques for time-series anomaly detection (such as clustering, classification, ARIMA, and decision trees) is preferred.
- Ability to implement data import, cleansing and transformation functions at scale
- Fluency in Docker and Kubernetes
- Working knowledge of relational and dimensional data models with appropriate visualization techniques such as PCA.
- Solid English skills to effectively communicate with other team members
Given the nature of the role, it would be nice if you also have:
- Experience with large datasets and distributed computing, especially with the Google Cloud Platform
- Fluency in at least one deep learning framework: PyTorch, TensorFlow / Keras
- Experience with NoSQL and graph databases
- Experience working in a Colab, Jupyter, or Python notebook environment
- Some experience with monitoring, analysis, and alerting tools like New Relic, Prometheus, and the ELK stack
- Knowledge of the Java, Scala, or Go programming languages
- Familiarity with Kubeflow
- Experience with transformers, for example the Hugging Face libraries
- Experience with OpenCV
About Egnyte
In a content-critical age, Egnyte fuels business growth by enabling content-rich business processes, while also providing organizations with visibility and control over their content assets. Egnyte’s cloud-native content services platform leverages the industry’s leading content intelligence engine to deliver a simple, secure, and vendor-neutral foundation for managing enterprise content across business applications and storage repositories. More than 16,000 customers trust Egnyte to enhance employee productivity, automate data management, and reduce file-sharing cost and complexity. Investors include Google Ventures, Kleiner Perkins Caufield & Byers, and Goldman Sachs. For more information, visit www.egnyte.com
About Egnyte
Egnyte provides secure Enterprise File Sharing and Content Governance built from the Cloud down. Access, Share and Control 100% of your data from anywhere using any smartphone, tablet or computer.
Egnyte stores billions of files and petabytes of data, and we are looking for help to take the platform, used by millions of users, to the next level of scale. Autonomy and ownership are integral to our culture, and engineers own one or more services end to end.
We’re looking for engineers who can take a complex problem and work with product managers, DevOps, and other team members to execute end to end.
Similar jobs
- You're proficient in the latest AI/machine learning technologies
- You're proficient in GPT-3-based algorithms
- You have a passion for writing code as well as understanding and crafting the ways systems interact
- You believe in the benefits of agile processes and shipping code often
- You are pragmatic and work to coalesce requirements into reasonable solutions that provide value
Responsibilities
- Deploy well-tested, maintainable and scalable software solutions
- Take end-to-end ownership of the technology stack and product
- Collaborate with other engineers to architect scalable technical solutions
- Embrace and improve our standards and processes to reduce friction and unlock efficiency
Current Ecosystem:
ShibaSwap: https://shibaswap.com/#/
Metaverse: https://shib.io/#/
NFTs: https://opensea.io/collection/theshiboshis
Game: Shiba Eternity on iOS and Android
Job Description – Data Science
Basic Qualification:
- ME/MS from a premier institute with a background in Mechanical/Industrial/Chemical/Materials engineering.
- Strong analytical skills and the ability to apply statistical techniques to problem solving
- Expertise in algorithms, data structures and performance optimization techniques
- Proven track record of end-to-end ownership, taking an idea from incubation to market
- At least 2 years of experience in data analysis, statistical analysis, data mining, and optimization algorithms.
Responsibilities
The Data Engineer/Analyst will:
- Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
- Interact clearly with business teams, including product planning, sales, marketing, and finance, to define projects and objectives.
- Mine and analyze data from company databases to drive optimization and improvement of product and process development, marketing techniques and business strategies
- Coordinate with different R&D and Business teams to implement models and monitor outcomes.
- Mentor team members towards developing quick solutions for business impact.
- Be skilled at all stages of the analysis process, including defining key business questions; recommending measures, data sources, methodology, and study design; dataset creation; analysis execution; and interpretation, presentation, and publication of results.
- 4+ years’ experience in an MNC environment with projects involving ML, DL and/or DS
- Experience in Machine Learning, Data Mining or Machine Intelligence (Artificial Intelligence)
- Knowledge of Microsoft Azure is desired.
- Expertise in machine learning such as Classification, Data/Text Mining, NLP, Image Processing, Decision Trees, Random Forest, Neural Networks, Deep Learning Algorithms
- Proficient in Python and its various libraries such as NumPy, Matplotlib, and pandas
- Superior verbal and written communication skills, with the ability to convey rigorous mathematical concepts and considerations to business teams.
- Experience in infra development / building platforms is highly desired.
- A drive to learn and master new technologies and techniques.
Responsibilities include:
- Convert machine learning models into application programming interfaces (APIs) so that other applications can use them
- Build AI models from scratch and help the different components of the organization (such as product managers and stakeholders) understand what results they gain from the model
- Build data ingestion and data transformation infrastructure
- Automate infrastructure that the data science team uses
- Perform statistical analysis and tune the results so that the organization can make better-informed decisions
- Set up and manage AI development and product infrastructure
- Be a good team player, as coordinating with others is a must
- Banking Domain
- Assist the team in building machine learning/AI/analytics models on an open-source stack using Python and the Azure cloud stack.
- Be part of the internal data science team at fragma data, which provides data science consultation to large organizations such as banks, e-commerce companies, and social media companies on their scalable AI/ML needs on the cloud, and help build POCs and develop production-ready solutions.
- Candidates will be provided with opportunities for on-the-job training and professional certifications in these areas: Azure Machine Learning services, Microsoft Customer Insights, Spark, chatbots, Databricks, NoSQL databases, etc.
- Assist the team in conducting AI demos, talks, and workshops occasionally to large audiences of senior stakeholders in the industry.
- Work on large enterprise scale projects end-to-end, involving domain specific projects across banking, finance, ecommerce, social media etc.
- Show a keen interest in learning new technologies and the latest developments and applying them to assigned projects.
Desired Skills
- Professional hands-on coding experience in Python: over 1 year for Data Scientist, and over 3 years for Sr Data Scientist.
- This is primarily a programming/development-oriented role, so strong programming skills in writing object-oriented and modular code in Python and experience pushing projects to production are important.
- Strong foundational knowledge and professional experience in:
- Machine Learning (Compulsory)
- Deep Learning (Compulsory)
- Strong knowledge of at least one of: Natural Language Processing, Computer Vision, Speech Processing, or Business Analytics
- Understanding of database technologies and SQL (Compulsory)
- Knowledge of the following Frameworks:
- Scikit-learn (Compulsory)
- Keras/TensorFlow/PyTorch (at least one of these is Compulsory)
- API development in Python for ML models (good to have)
- Excellent communication skills are necessary to succeed in this role, which has high external visibility and offers multiple opportunities to present data science results to a large external audience that will include VPs, Directors, CXOs, etc. Communication skills will therefore be a key consideration in the selection process.
Why us?
We at Wow Labz are always looking for exciting problems to solve. Whether we’re creating new products or helping a small startup extend its reach, we build from our heart. We’re entrepreneurial and we love new ideas. We offer a fun culture with a team that cares about your development and growth.
What are we looking for?
We are looking for an expert in machine learning to help us extract maximum value from our data. You will lead all the processes from data collection, cleaning, and preprocessing to training models and deploying them to production. In this role, you should be highly analytical, with a knack for math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine learning and research.
Role & Responsibilities:
- Identify valuable data sources and automate collection processes
- Study and transform data science prototypes
- Research and implement appropriate ML algorithms and tools
- Develop machine learning applications according to requirements
- Extend existing ML libraries and frameworks
- Cross-validate models to ensure their generalizability
- Present information using data visualization techniques
- Propose solutions and strategies to business challenges
- Collaborate with engineering and product development teams
- Guide and mentor the respective teams
Desired Skills and Experience:
- Proven experience as a Machine Learning Engineer or similar role
- Demonstrable history of devising and overseeing data-centered projects
- Understanding of data structures, data modeling and software architecture
- Deep knowledge of math, probability, statistics and algorithms
- Experience with cloud platforms like AWS/Azure/GCP
- Knowledge of server configurations and maintenance.
- Knowledge of R, SQL and Python; familiarity with Scala, Java or C++ is an asset
- Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)
- Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
- Ability to select hardware to run an ML model with the required latency
- Excellent communication skills
- Ability to work in a team
- Outstanding analytical and problem-solving skills
Must have:
- Inclination towards mathematics and statistics to understand the algorithms at a deeper level
- Strong grasp of OOP concepts (Python preferred)
- Hands-on experience with Flask or Django
- Ability to learn the latest deployed models and understand their core architecture to gain breadth of expertise
Persona of the kind of people who would be a culture fit:
- You are curious and aware of the latest tech trends
- You are self-driven
- You get a kick out of leading a solution towards its completion.
- You have the capacity to foster a healthy, stimulating work environment that frequently harnesses teamwork
- You are fun to hang out with!
We are a nascent quantitative hedge fund led by an MIT PhD and Math Olympiad medallist, offering opportunities to grow with us as we build out the team. Our fund has world-class investors and big data experts as part of the GP, top-notch ML experts as advisers to the fund, and equity funding to grow the team, license data, and scale data processing.
We are interested in researching and taking live a variety of quantitative strategies based on historic and live market data, alternative datasets, social media data (both audio and video) and stock fundamental data.
You would join, and, if qualified, lead a growing team of data scientists and researchers, and be responsible for the complete lifecycle of quantitative strategy implementation and trading.
Requirements:
- At least 3 years of relevant ML experience
- Graduation date: 2018 or earlier
- 3-5 years of experience in high-level Python programming.
- Master's degree (or PhD) in a quantitative discipline such as Statistics, Mathematics, Physics, or Computer Science from a top university.
- Good knowledge of applied and theoretical statistics, linear algebra and machine learning techniques.
- Ability to leverage financial and statistical insights to research, explore and harness a large collection of quantitative strategies and financial datasets in order to build strong predictive models.
- Ownership of the research, design, development, and implementation of strategies, and effective communication with teammates
- Prior experience with, and good knowledge of, the lifecycle and pitfalls of algorithmic strategy development and modelling.
- Good practical knowledge in understanding financial statements, value investing, portfolio and risk management techniques.
- A proven ability to lead and drive innovation to solve challenges and roadblocks in project completion.
- A valid GitHub profile with some activity on it
Bonus to have:
- Experience in storing and retrieving data from large and complex time series databases
- Very good practical knowledge of time-series modelling and forecasting (ARIMA, ARCH and stochastic modelling)
- Prior experience in optimizing and backtesting quantitative strategies, doing return and risk attribution, and feature/factor evaluation.
- Knowledge of the AWS/cloud ecosystem (EC2, Lambda, EKS, SageMaker, etc.) is an added plus
- Knowledge of REST APIs and data extraction and cleaning techniques
- Experience with PySpark or any other big data programming/parallel computing framework
- Familiarity with derivatives, and knowledge of multiple asset classes in addition to equities.
- Any progress towards CFA or FRM is a bonus
- Average tenure of at least 1.5 years in a company
along with metrics to track their progress
- Managing available resources such as hardware, data, and personnel so that deadlines are met
- Analysing the ML algorithms that could be used to solve a given problem and ranking them by their success probability
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
- Verifying data quality, and/or ensuring it via data cleaning
- Supervising the data acquisition process if more data is needed
- Defining validation strategies
- Defining the pre-processing or feature engineering to be done on a given dataset
- Defining data augmentation pipelines
- Training models and tuning their hyperparameters
- Analysing the errors of the model and designing strategies to overcome them
- Deploying models to production