ML Engineer - Analyst / Senior Analyst
Design and develop machine learning and deep learning systems. Run machine learning tests and experiments, and implement appropriate ML algorithms. Work cross-functionally with Data Scientists, software application developers, and business groups to develop innovative ML models. Use Agile experience to work collaboratively with other Managers/Owners in geographically distributed teams.
- Work with Data Scientists and Business Analysts to frame problems in a business context. Assist all the processes from data collection, cleaning, and preprocessing, to training models and deploying them to production.
- Understand business objectives and develop models that help achieve them, along with metrics to track their progress.
- Explore and visualize data to gain an understanding of it, then identify differences in data distribution that could affect performance when deploying the model in the real world.
- Define validation strategies, the preprocessing or feature engineering to be done on a given dataset, and data augmentation pipelines.
- Analyze the errors of the model and design strategies to overcome them.
- Collaborate with data engineers to build data and model pipelines, manage the infrastructure and data pipelines needed to bring code to production and demonstrate end-to-end understanding of applications (including, but not limited to, the machine learning algorithms) being created.
Qualifications & Specifications
- Bachelor's degree in Engineering / Computer Science / Math / Statistics or equivalent. A Master's degree in a relevant specialization will be given first preference.
- Experience with machine learning algorithms and libraries
- Understanding of data structures, data modeling and software architecture.
- Deep knowledge of math, probability, statistics and algorithms
- Experience with machine learning platforms such as Microsoft Azure, Google Cloud, IBM Watson, and Amazon
- Big data environment: Hadoop, Spark
- Programming languages: Python, R, PySpark
- Supervised & Unsupervised machine learning: linear regression, logistic regression, k-means clustering, ensemble models, random forest, SVM, gradient boosting
- Sampling data: bagging & boosting, bootstrapping
- Neural networks: ANN, CNN, RNN, and related topics
- Deep learning: Keras, TensorFlow
- Experience with AWS SageMaker deployment and agile methodology
- Convert machine learning models into application programming interfaces (APIs) so that other applications can use them
- Build AI models from scratch and help the different components of the organization (such as product managers and stakeholders) understand what results they gain from the model
- Build data ingestion and data transformation infrastructure
- Automate infrastructure that the data science team uses
- Perform statistical analysis and tune the results so that the organization can make better-informed decisions
- Set up and manage AI development and product infrastructure
- Be a good team player, as coordinating with others is a must
Exp: 3-6 Yrs
Notice: Immediate to 15 days
- Develop advanced algorithms that solve problems of large dimensionality in a computationally efficient and statistically effective manner;
- Execute statistical and data mining techniques (e.g. hypothesis testing, machine learning and retrieval processes) on large data sets to identify trends, figures and other relevant information;
- Evaluate emerging datasets and technologies that may contribute to our analytical platform;
- Participate in development of select assets/accelerators that create scale;
- Contribute to thought leadership through research and publication support;
- Guide and mentor Associates on teams.
- 3-6 years of relevant post-collegiate work experience;
- Knowledge of big data/advanced analytics concepts and algorithms (e.g. text mining, social listening, recommender systems, predictive modeling, etc.);
- Experience with NLP and PySpark;
- Exposure to tools/platforms (e.g. Hadoop ecosystem and database systems);
- Agile project planning and project management skills;
- Relevant domain knowledge preferred (healthcare/transportation/hi-tech/insurance);
- Excellent oral and written communication skills;
- Strong attention to detail, with a research-focused mindset;
- Excellent critical thinking and problem-solving skills;
- High motivation, good work ethic and maturity.
Synapsica is a growth-stage HealthTech startup founded by alumni of IIT Kharagpur, AIIMS New Delhi, and IIM Ahmedabad. We believe healthcare needs to be transparent and objective, while being affordable. Every patient has the right to know exactly what is happening in their body and should not have to rely on cryptic two-liners given to them as a diagnosis. Towards this aim, we are building an artificial-intelligence-enabled, cloud-based platform to analyse medical images and create v2.0 of advanced radiology reporting. We are backed by Y Combinator and other investors from India, the US, and Japan. We are proud to have GE, AIIMS, and Spinal Kinetics as our partners.
Your Roles and Responsibilities
The role involves computer vision tasks including development, customization, and training of Convolutional Neural Networks (CNNs); application of ML techniques (SVM, regression, clustering, etc.); and traditional image processing (OpenCV etc.). The role is research focused and involves reading and implementing existing research papers, in-depth problem analysis, generating new ideas, and automating and optimizing key processes.
- Strong problem-solving ability
- Prior experience with Python, cuDNN, TensorFlow, PyTorch, Keras, Caffe (or similar deep learning frameworks).
- Extensive understanding of computer vision/image processing applications such as object classification, segmentation, object detection, etc.
- Ability to write custom Convolutional Neural Network architectures in PyTorch (or similar)
- Experience with GPU/DSP/other multi-core architecture programming
- Effective communication with other project members and project stakeholders
- Detail-oriented and eager to learn and acquire new skills
- Prior Project Management and Team Leadership experience
- Ability to plan work and meet deadlines
- End to end deployment of deep learning models.
Location: Chennai or Gurgaon
- Bring in industry best practices around creating and maintaining robust data pipelines for complex data projects with/without AI component
- programmatically ingesting data from several static and real-time sources (incl. web scraping)
- rendering results through dynamic interfaces incl. web / mobile / dashboard with the ability to log usage and granular user feedback
- performance tuning and optimal implementation of complex Python scripts (using Spark), SQL (using stored procedures, Hive), and NoSQL queries in a production environment
- Industrialize ML / DL solutions and deploy and manage production services; proactively handle data issues arising on live apps
- Perform ETL on large and complex datasets for AI applications; work closely with data scientists on performance optimization of large-scale ML/DL model training
- Build data tools to facilitate fast data cleaning and statistical analysis
- Ensure data architecture is secure and compliant
- Resolve issues escalated from Business and Functional areas on data quality, accuracy, and availability
- Work closely with APAC CDO and coordinate with a fully decentralized team across different locations in APAC and global HQ (Paris).
You should be
- Expert in structured and unstructured data in traditional and big data environments: Oracle / SQL Server, MongoDB, Hive / Pig, BigQuery, and Spark
- Highly proficient in Python programming in both traditional and distributed models (PySpark)
- Expert in shell scripting and writing schedulers
- Hands-on experience with Cloud: deploying complex data solutions in hybrid cloud / on-premise environments, both for data extraction/storage and computation
- Hands-on experience in deploying production apps using large volumes of data with state-of-the-art technologies like Docker, Kubernetes, and Kafka
- Strong knowledge of data security best practices
- 5+ years of experience in a data engineering role
- Science / Engineering graduate from a Tier-1 university in the country
- And most importantly, you must be a passionate coder who really cares about building apps that can help people do things better, smarter, and faster even when they sleep
● Frame ML / AI use cases that can improve the company’s product
● Implement and develop ML / AI / data-driven, rule-based algorithms as software items
● For example, building a chatbot that replies with an answer from the relevant FAQ, and reinforcing the system with a feedback loop so that the bot improves
Must have skills:
● Data extraction and ETL
● Python (numpy, pandas, comfortable with OOP)
● Knowledge of basic Machine Learning / Deep Learning / AI algorithms and ability to
● Good understanding of SDLC
● Experience deploying an ML / AI model in a mobile / web product
● Soft skills: strong communication skills and critical thinking ability
Good to have:
● Full stack development experience
B.Tech. / B.E. degree in Computer Science or equivalent software engineering
We are a nascent quantitative hedge fund led by an MIT PhD and Math Olympiad medallist, offering opportunities to grow with us as we build out the team. Our fund has world-class investors and big data experts as part of the GP and top-notch ML experts as advisers, and it has equity funding to grow the team, license data, and scale the data processing.
We are interested in researching and taking live a variety of quantitative strategies based on historic and live market data, alternative datasets, social media data (both audio and video), and stock fundamental data.
You would join, and, if qualified, lead a growing team of data scientists and researchers, and be responsible for the complete lifecycle of quantitative strategy implementation and trading.
- At least 3 years of relevant ML experience
- Graduation date: 2018 or earlier
- 3-5 years of experience in high-level Python programming.
- Master's degree (or PhD) in a quantitative discipline such as Statistics, Mathematics, Physics, or Computer Science from a top university.
- Good knowledge of applied and theoretical statistics, linear algebra and machine learning techniques.
- Ability to leverage financial and statistical insights to research, explore and harness a large collection of quantitative strategies and financial datasets in order to build strong predictive models.
- Should take ownership of the research, design, development, and implementation of strategies and communicate effectively with other teammates
- Prior experience with and good knowledge of the lifecycle and pitfalls of algorithmic strategy development and modelling.
- Good practical knowledge in understanding financial statements, value investing, portfolio and risk management techniques.
- A proven ability to lead and drive innovation to solve challenges and roadblocks in project completion.
- A valid GitHub profile with some activity on it
Bonus to have:
- Experience in storing and retrieving data from large and complex time series databases
- Very good practical knowledge of time-series modelling and forecasting (ARIMA, ARCH, and stochastic modelling)
- Prior experience in optimizing and backtesting quantitative strategies, doing return and risk attribution, and feature/factor evaluation.
- Knowledge of the AWS/Cloud ecosystem is a plus (EC2, Lambda, EKS, SageMaker, etc.)
- Knowledge of REST APIs and data extraction and cleaning techniques
- Good to have experience in PySpark or any other big data programming/parallel computing
- Familiarity with derivatives and knowledge of multiple asset classes in addition to equities.
- Any progress towards CFA or FRM is a bonus
- Average tenure of at least 1.5 years in a company