- 4+ years of experience.
- Solid understanding of Python, Java and general software development skills (source code management, debugging, testing, deployment etc.).
- Experience in working with Solr and ElasticSearch.
- Experience with NLP technologies and the handling of unstructured text.
- Detailed understanding of text pre-processing and normalisation techniques such as tokenisation, lemmatisation, stemming, POS tagging etc.
- Prior experience in implementing traditional ML solutions: classification, regression or clustering problems.
- Expertise in text analytics (Sentiment Analysis, Entity Extraction, Language Modelling) and associated sequence learning models (RNN, LSTM, GRU).
- Comfortable working with deep-learning libraries (e.g. PyTorch).
- Candidate can even be a fresher with 1 or 2 years of experience.
- Preferred colleges and universities: IIIT, BITS Pilani, top 5 local colleges.
- A Masters candidate in machine learning.
- Can source candidates from Mu Sigma and Manthan.
Similar jobs
- Experience with cloud-native data tools/services such as AWS Athena, AWS Glue, Redshift Spectrum, AWS EMR, AWS Aurora, BigQuery, Bigtable, S3, etc.
- Strong programming skills in at least one of the following languages: Java, Scala, C++.
- Familiarity with a scripting language like Python as well as Unix/Linux shells.
- Comfortable with multiple AWS components including RDS, AWS Lambda, AWS Glue, AWS Athena, EMR. Equivalent tools in the GCP stack will also suffice.
- Strong analytical skills and advanced SQL knowledge, indexing, query optimization techniques.
- Experience implementing software around data processing, metadata management, and ETL pipeline tools like Airflow.
Experience with the following software/tools is highly desired:
- Apache Spark, Kafka, Hive, etc.
- SQL and NoSQL databases like MySQL, Postgres, DynamoDB.
- Workflow management tools like Airflow.
- AWS cloud services: RDS, AWS Lambda, AWS Glue, AWS Athena, EMR.
- Familiarity with Spark programming paradigms (batch and stream-processing).
- RESTful API services.
Requirements:
● Understanding our data sets and how to bring them together.
● Working with our engineering team to support custom solutions offered to the product development.
● Filling the gap between development, engineering and data ops.
● Creating, maintaining and documenting scripts to support ongoing custom solutions.
● Excellent organizational skills, including attention to precise details
● Strong multitasking skills and ability to work in a fast-paced environment
● 5+ years experience with Python to develop scripts.
● Know your way around RESTful APIs (able to integrate with them; publishing them is not necessary).
● You are familiar with pulling and pushing files from SFTP and AWS S3.
● Experience with any Cloud solutions including GCP / AWS / OCI / Azure.
● Familiarity with SQL programming to query and transform data from relational Databases.
● Familiarity with working in Linux (and the Linux work environment).
● Excellent written and verbal communication skills
● Extracting, transforming, and loading data into internal databases and Hadoop
● Optimizing our new and existing data pipelines for speed and reliability
● Deploying product build and product improvements
● Documenting and managing multiple repositories of code
● Experience with SQL and NoSQL databases (Cassandra, MySQL)
● Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery, Redshift, Athena)
● Hands-on experience in Airflow
● Understanding of best practices and common coding patterns around storing, partitioning, warehousing and indexing of data
● Experience in reading data from Kafka topics (both live stream and offline)
● Experience in PySpark and DataFrames
Responsibilities:
You’ll be:
● Collaborating across an agile team to continuously design, iterate, and develop big data systems.
● Extracting, transforming, and loading data into internal databases.
● Optimizing our new and existing data pipelines for speed and reliability.
● Deploying new products and product improvements.
● Documenting and managing multiple repositories of code.
Role : Sr Data Scientist / Tech Lead – Data Science
Number of positions : 8
Responsibilities
- Lead a team of data scientists, machine learning engineers and big data specialists
- Be the main point of contact for the customers
- Lead data mining and collection procedures
- Ensure data quality and integrity
- Interpret and analyze data problems
- Conceive, plan and prioritize data projects
- Build analytic systems and predictive models
- Test performance of data-driven products
- Visualize data and create reports
- Experiment with new models and techniques
- Align data projects with organizational goals
Requirements (please read carefully)
- Very strong in statistics fundamentals. Not all data is Big Data. The candidate should be able to derive statistical insights from very few data points if required, using traditional statistical methods.
- MSc Statistics / PhD Statistics
- Education – no bar, but preferably from a Statistics academic background (eg MSc-Stats, MSc-Econometrics etc), given the first point
- Strong expertise in Python (any other statistical languages/tools like R, SAS, SPSS etc are just optional, but Python is absolutely essential). If the person is very strong in Python, but has almost nil knowledge in the other statistical tools, he/she will still be considered a good candidate for this role.
- Proven experience as a Data Scientist or similar role, for about 7-8 years
- Solid understanding of machine learning and AI concepts, especially with respect to choosing apt candidate algorithms for a use case, and model evaluation.
- Good expertise in writing SQL queries (should not be dependent upon anyone else for pulling in data, joining them, data wrangling etc)
- Knowledge of data management and visualization techniques --- more from a Data Science perspective.
- Should be able to grasp business problems, ask the right questions to better understand the problem in breadth and depth, design apt solutions, and explain them to the business stakeholders.
- Again, the last point above is extremely important --- should be able to identify solutions that can be explained to stakeholders, and furthermore, be able to present them in simple, direct language.
http://www.altimetrik.com
https://www.youtube.com/watch?v=3nUs4YxppNE&feature=emb_rel_end
https://www.youtube.com/watch?v=e40r6kJdC8c
closely with the Kinara management team to investigate strategically important business questions.
Lead a team through the entire analytical and machine learning model life cycle:
Define the problem statement
Build and clean datasets
Exploratory data analysis
Feature engineering
Apply ML algorithms and assess the performance
Code for deployment
Code testing and troubleshooting
Communicate Analysis to Stakeholders
Manage Data Analysts and Data Scientists
- Architecting end-to-end prediction pipelines and managing them
- Scoping projects and mentoring 2-4 people
- Owning parts of the AI and data infrastructure of the organization
- Develop state-of-the-art deep learning/classical models
- Continuously learn new skills and technologies and implement them when relevant
- Contribute to the community through open-source, blogs, etc.
- Take a number of high-quality decisions about infrastructure, pipelines, and internal tooling.
What are we looking for
- Deep understanding of core concepts
- Broader knowledge of different types of problem statements and approaches
- Great hold on Python and the standard library
- Knowledge of industry-standard tools like scikit-learn, TensorFlow/PyTorch, etc.
- Experience with at least one among Computer Vision, Forecasting, NLP, or Recommendation Systems is a must
- A get shit done attitude
- A research mindset and a creative caliber to utilize previous work to your advantage.
- A helping/mentoring first approach towards work
- 5+ years of industry experience in administering (including setting up, managing, monitoring) data processing pipelines (both streaming and batch) using frameworks such as Kafka Streams and PySpark, and streaming databases like Druid or equivalents such as Hive
- Strong industry expertise with containerization technologies including Kubernetes (EKS/AKS) and Kubeflow
- Experience with cloud platform services such as AWS, Azure or GCP, especially with EKS and Managed Kafka
- 5+ years of industry experience in Python
- Experience with popular modern web frameworks such as Spring Boot, Play Framework, or Django
- Experience with scripting languages. Python experience highly desirable. Experience in API development using Swagger
- Implementing automated testing platforms and unit tests
- Proficient understanding of code versioning tools, such as Git
- Familiarity with continuous integration tools such as Jenkins
Responsibilities
- Architect, design and implement large-scale data processing pipelines using Kafka Streams, PySpark, Fluentd and Druid
- Create custom Operators for Kubernetes, Kubeflow
- Develop data ingestion processes and ETLs
- Assist in DevOps operations
- Design and Implement APIs
- Identify performance bottlenecks and bugs, and devise solutions to these problems
- Help maintain code quality, organization, and documentation
- Communicate with stakeholders regarding various aspects of the solution.
- Mentor team members on best practices
Responsibilities:
- Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
- Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
- Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
- Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
- Collaborate with product development and DevOps teams in implementing the data collection and aggregation solutions
- Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
- Analyse large amounts of information to discover trends and patterns
- Mine and analyse data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
Requirements
- Bachelor’s or Master’s in a highly numerate discipline such as Engineering, Science or Economics
- 2-6 years of proven experience working as a Data Engineer, preferably in an e-commerce, web-based or consumer technology company
- Hands-on experience of working with different big data tools like Hadoop, Spark, Flink, Kafka and so on
- Good understanding of AWS ecosystem for big data analytics
- Hands-on experience in creating data pipelines, either using tools or by independently writing scripts
- Hands-on experience in scripting languages like Python, Scala, Unix Shell scripting and so on
- Strong problem-solving skills with an emphasis on product development.
- Experience using business intelligence tools e.g. Tableau, Power BI would be an added advantage (not mandatory)
We’re building the future of private financial markets
Traditionally a space only for the wealthy and well-connected, we believe in a future where private markets are more accessible to investors and fundraisers. By leveling the playing field we hope to create a more equitable economy, where inspiring companies are connected to inspired investors, whoever and wherever they are.
Leveraging our trusted brand, global networks and incredible team, we’re building a technology-enabled ecosystem that is as diverse and dynamic as our investor network. As we progress on this ambitious journey, we’re looking for energetic and creative people to support and leave their mark on our platform.
Before Applying
- We have big plans to disrupt the traditional fundraising process for private businesses
- You will work with a diverse team of former investment bankers, strategy consultants and business owners in developing, monitoring, and improving products to facilitate the activity of private investing
- Everything we do is focused on helping build the private capital markets for the next generation of business owners and investors
- We work really hard but play really hard as well
Job purpose
- We are looking for passionate Data Scientists with strong problem-solving skills and prior experience in building machine learning models. You should possess the ability to thrive in a fast-paced environment. As a Data Scientist, working with passionate data-driven enthusiasts, you will lead the deployment of decision sciences with advanced analytics as well as machine learning and AI capabilities to support various lines of businesses. You will also help to enable a data driven culture within the organization.
Roles and responsibilities
- Work with other Data Scientists, Data Engineers, Data Analysts, Software engineers to build and manage data products
- Work on cross-functional projects using advanced data modeling and analysis techniques to discover insights that will guide strategic decisions and uncover optimization opportunities.
- Develop an enterprise data science strategy to achieve scale, synergies, and sustainability of model deployment
- Undertake rigorous analyses of business problems on structured and unstructured data with advanced quantitative techniques.
- Apply your expertise in data science, statistical analysis, data mining and the visualisation of data to derive insights that add value to business decision making (e.g. hypothesis testing, development of MVPs, prototyping etc).
- Manage and optimize processes for data intake, validation, mining, and engineering as well as modeling, visualization and communication deliverables.
You’ll be a great fit for us if you
- Bachelor’s or Master’s in Computer Science, Statistics, Mathematics, Economics, or any other related field
- At least 3 to 5 years of hands-on experience in a Data Science role with exposure and proficiency in quantitative and statistical analysis, predictive analytics, multivariate testing and algorithm optimization for machine learning
- Deep expertise in a range of ML concepts, frameworks and techniques such as logistic regression, clustering, dimensionality reduction, recommendation systems, neural nets etc.
- Strong understanding of data infrastructure technologies (e.g. Spark, TensorFlow etc).
- Familiarity with data engineering methodologies, including SQL, ETL and experience in manipulating data sets with structured and unstructured data using Hadoop, AWS or other big data platforms.
- Highly proficient in data visualization and the use of dashboarding tools (e.g. Tableau, Matplotlib, plot.ly etc).
- Proven track record in delivering bespoke data science solutions in a cross-functional setting.
- Experience in managing a small team is preferred.
Bonus attributes
- Interested in dealing with data, including finding, and exploring more efficient ways/programs (e.g. machine learning) to collect, store, and analyse data
- Preferably have some understanding of terms in financial statements and financial ratios
- Strong problem-solving skills – able to find various ways to solve problems and decide which solution to move forward
- Ability to work under pressure and tight timings
- Team oriented, but highly independent for their own projects
- High level of organisational skills and ability to prioritize
applied research.
● Understand, apply and extend state-of-the-art NLP research to better serve our customers.
● Work closely with engineering, product, and customers to scientifically frame the business problems and come up with the underlying AI models.
● Design, implement, test, deploy, and maintain innovative data and machine learning solutions to accelerate our business.
● Think creatively to identify new opportunities and contribute to high-quality publications or patents.
Desired Qualifications and Experience
● At least 1 year of professional experience.
● Bachelor’s in Computer Science or related fields from the top colleges.
● Extensive knowledge and practical experience in one or more of the following areas: machine learning, deep learning, NLP, recommendation systems, information retrieval.
● Experience applying ML to solve complex business problems from scratch.
● Experience with Python and a deep learning framework like PyTorch/TensorFlow.
● Awareness of state-of-the-art research in the NLP community.
● Excellent verbal and written communication and presentation skills.
Responsibilities
- Own the design, development, testing, deployment, and craftsmanship of the team’s infrastructure and systems capable of handling massive amounts of requests with high reliability and scalability
- Leverage the deep and broad technical expertise to mentor engineers and provide leadership on resolving complex technology issues
- Entrepreneurial and out-of-box thinking essential for a technology startup
- Guide the team in unit-testing code for robustness, including edge cases, usability, and general reliability
Requirements
- In-depth understanding of image processing algorithms, pattern recognition methods, and rule-based classifiers
- Experience in feature extraction, object recognition and tracking, image registration, noise reduction, image calibration, and correction
- Ability to understand, optimize and debug imaging algorithms
- Understanding of and experience with the OpenCV library
- Fundamental understanding of mathematical techniques involved in ML and DL schemas (Instance-based methods, Boosting methods, PGM, Neural Networks etc.)
- Thorough understanding of state-of-the-art DL concepts (Sequence modeling, Attention, Convolution etc.) along with a knack for imagining new schemas that work for the given data.
- Understanding of engineering principles and a clear understanding of data structures and algorithms
- Experience in writing production-level code using either C++ or Java
- Experience with Python libraries such as pandas, NumPy and SciPy
- Experience with TensorFlow and scikit-learn.