Advanced degree in computer science, math, statistics, or a related discipline (master's degree required)
Extensive data modeling and data architecture skills
Programming experience in Python and R
Background in machine learning frameworks such as TensorFlow or Keras
Knowledge of Hadoop or other distributed computing systems
Experience working in an Agile environment
Advanced math skills (important): linear algebra; differential equations (ODEs and numerical); theory of statistics 1; numerical analysis 1 (numerical linear algebra) and 2 (quadrature); intermediate analysis (point-set topology)
Strong written and verbal communication skills
Hands-on experience with NLP and NLG
Experience with advanced statistical techniques and concepts (GLM/regression, random forests, boosting, trees, text mining) and with their application.
About The other Fruit
We believe that privacy and free choice are not mutually exclusive. They reinforce one another. We aim to empower the individual, thereby permitting trusted and honest control-exchange. And doing the right thing – for our team, our clients, our partners and our communities – is also the best thing for our business.
We build in-house as well as fully bespoke B2B, B2C and DFA S/P/BaaS solutions, and we are lucky enough to partner with some of the leading firms and groups from around the world. TOF®'s headquarters are in the UK, with branch locations in Singapore, Hong Kong, Thailand and Pune.
● Able to contribute to gathering functional requirements, developing technical specifications, and project & test planning
● Demonstrating technical expertise, and solving challenging programming and design problems
● Roughly 80% hands-on coding
● Generate technical documentation and PowerPoint presentations to communicate
architectural and design options, and educate development teams and business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release
● Work cross-functionally with various Bidgely teams, including product management, QA/QE, various product lines, and/or business units, to drive results forward
● BS/MS in computer science or equivalent work experience
● 2-4 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems: Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra, Kafka
● Expertise with any of the following object-oriented languages (OOD): Java/J2EE, Scala
● Strong leadership experience: Leading meetings, presenting if required
● Excellent communication skills: Demonstrated ability to explain complex technical
issues to both technical and non-technical audiences
● Expertise in the Software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Experience with cloud platforms, particularly AWS, is preferred
● Good understanding of and ability to develop software, prototypes, or proofs of concept (POCs) for various data engineering requirements.
Create data funnels to feed models with web, structured, and unstructured data
Maintain coding standards using SDLC, Git, AWS deployments, etc.
Keep abreast of developments in the field
Deploy models in production and monitor them
Document processes and logic
Take ownership of the solution from code to deployment and performance
We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in both batch and live-stream formats, transformed large volumes of daily data, built data warehouses to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
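The ingest–transform–warehouse flow described above can be sketched in miniature. This is a framework-agnostic illustration (in practice the extract step would read from S3/HDFS via Spark, and the warehouse would be something like Hive or Redshift rather than SQLite); the schema, file contents, and table name are hypothetical:

```python
import csv
import io
import sqlite3

# Extract: a small CSV batch (inlined here; in practice read from S3/HDFS)
raw = io.StringIO("user_id,amount\n1,10.5\n2,3.0\n1,4.5\n")
rows = list(csv.DictReader(raw))

# Transform: aggregate daily spend per user
totals = {}
for r in rows:
    totals[r["user_id"]] = totals.get(r["user_id"], 0.0) + float(r["amount"])

# Load: write into a warehouse table (SQLite stands in for the warehouse)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_spend (user_id TEXT PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO daily_spend VALUES (?, ?)", totals.items())
result = dict(conn.execute("SELECT user_id, total FROM daily_spend"))
```

The same extract/transform/load shape carries over to Spark jobs; only the I/O endpoints and the execution engine change.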
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You would be required to code in Scala and PySpark daily, on cloud as well as on-premises infrastructure
- Build data models to store the data in the most optimized manner
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implementing the ETL process and optimal data pipeline architecture
- Monitoring performance and advising on any necessary infrastructure changes.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peers' code and pipelines before deploying to production, checking for optimization issues and adherence to code standards
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch-processing and real-time systems using various open-source technologies such as NoSQL databases, Spark, Pig, Hive, and Apache Airflow.
- Experience implementing complex projects dealing with considerable data sizes (petabyte scale).
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering
- Expert at Python /Scala programming, especially for data engineering/ ETL purposes
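The "creation of DAGs" bullet above usually refers to orchestrators such as Apache Airflow. As a framework-free sketch, a pipeline DAG and a valid execution order (Kahn's topological sort, which Airflow-style schedulers rely on) can be modeled as follows; the task names are illustrative only:

```python
from collections import deque

# Hypothetical pipeline: each task maps to its downstream tasks
dag = {
    "extract":   ["validate", "transform"],
    "validate":  ["load"],
    "transform": ["load"],
    "load":      [],
}

def topological_order(dag):
    """Kahn's algorithm: return a valid execution order, or raise on a cycle."""
    indegree = {task: 0 for task in dag}
    for downstream in dag.values():
        for task in downstream:
            indegree[task] += 1
    ready = deque(task for task, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in dag[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(dag):
        raise ValueError("cycle detected: not a DAG")
    return order

order = topological_order(dag)
```

In Airflow the same structure is declared with operators and `>>` dependencies; the scheduler derives an equivalent ordering automatically.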
Designation - Data Scientist
Urgently required (notice period of 15 days maximum).
Experience:- 5-7 years.
Package Offered:- Rs.5,00,000/- to Rs.9,00,000/- pa.
- Identify valuable data sources and automate collection processes
- Undertake preprocessing of structured and unstructured data
- Analyze large amounts of information to discover trends and patterns
- Build predictive models and machine-learning algorithms
- Combine models through ensemble modeling
- Present information using data visualization techniques
- Propose solutions and strategies to business challenges
- Collaborate with engineering and product development teams
- Proven experience as a Data Scientist or Data Analyst
- Experience in data mining
- Understanding of machine-learning and operations research
- Knowledge of R, SQL and Python; familiarity with Scala, Java is an asset
- Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
- Analytical mind and business acumen
- Strong math skills (e.g. statistics, algebra)
- Problem-solving aptitude
- Excellent communication and presentation skills
- BSc/BA in Computer Science, Engineering or relevant field; graduate degree in Data Science or other quantitative field is preferred
- Minimum 3-4 years of experience with ETL tools, SQL, SSAS & SSIS
- Good understanding of Data Governance, including Master Data Management (MDM) and Data Quality tools and processes
- Knowledge of programming languages, e.g. JSON, Python, R
- Hands-on experience with SQL database design
- Experience working with REST APIs
- Influencing and supporting project delivery through involvement in project/sprint planning and QA
- Working experience with Azure
- Stakeholder management
- Good communication skills
Lead Machine Learning (ML)/ NLP Engineer
5 + years of experience
Contify is an AI-enabled Market and Competitive Intelligence (MCI) software platform that helps professionals make informed decisions. Its B2B SaaS platform helps leading organizations such as Ericsson, EY, Wipro, Deloitte, L&T, BCG, MetLife, etc. track information on their competitors, customers, industries, and topics of interest by continuously monitoring 500,000+ sources in real time. Contify is rapidly growing, with 185+ people across two offices in India. Contify is the winner of Frost and Sullivan's Product Innovation Award for Market and Competitive Intelligence Platforms.
We are looking for a hardworking, aspirational, and innovative engineer for the Lead ML/ NLP Engineer position. You'll build Contify's ML and NLP capabilities and help us extract value from unstructured data. Using advanced NLP, ML, and text analytics, you will develop applications that extract business insights by analyzing large amounts of unstructured text, identifying patterns, and connecting events.
You will be responsible for all the processes from data collection and pre-processing to training models and deploying them to production.
➔ Understand the business objectives; design and deploy scalable
ML models/ NLP applications to meet those objectives
➔ Use NLP techniques for text representation, semantic analysis, and information extraction to meet the business objectives efficiently, along with metrics to measure progress
➔ Extend existing ML libraries and frameworks and use effective text
representations to transform natural language into useful features
➔ Defining and supervising the data collection process, verifying data
quality, and employing data augmentation techniques
➔ Defining the preprocessing or feature engineering to be done on a given dataset
➔ Analyze the errors of the model and design strategies to overcome them
➔ Research and implement the right algorithms and tools for ML/ NLP tasks
➔ Collaborate with engineering and product development teams
➔ Represent Contify in external ML industry events and publish
thought leadership articles
Desired Skills and Experience
To succeed in this role, you should possess outstanding skills in statistical analysis, machine learning methods, and text representation.
➔ Deep understanding of text representation techniques (such as n-grams, bag of words, sentiment analysis, etc.) and statistics
➔ Hands-on experience in feature extraction techniques for text classification and topic mining
➔ Knowledge of text analytics with a strong understanding of NLP
algorithms and models (GLMs, SVM, PCA, NB, Clustering, DTs)
and their underlying computational and probabilistic statistics
◆ Word embeddings like TF-IDF, Word2Vec, GloVe, FastText, etc.
◆ Language models like BERT, GPT, RoBERTa, XLNet
◆ Neural networks like RNN, GRU, LSTM, Bi-LSTM
◆ Classification algorithms like LinearSVC, SVM, LR
◆ XGB, MultinomialNB, etc.
◆ Other algorithms: PCA, clustering methods, etc.
➔ Excellent knowledge and demonstrable experience in using NLP packages such as NLTK, Word2Vec, spaCy, Gensim, Stanford CoreNLP, and TensorFlow/ PyTorch.
➔ Experience in setting up supervised & unsupervised learning models, including data cleaning, data analytics, feature creation, model selection & ensemble methods, and performance metrics & evaluation
➔ Evaluation metrics: Root Mean Squared Error, Confusion Matrix, F-score, AUC-ROC, etc.
➔ Understanding of knowledge graphs will be a plus
➔ Education: Bachelor's or Master's in Computer Science, Mathematics, Computational Linguistics, or a similar field
➔ At least 4 years' experience building Machine Learning & NLP solutions over open-source platforms such as scikit-learn, TensorFlow, SparkML, etc.
➔ At least 2 years' experience in designing and developing
enterprise-scale NLP solutions in one or more of: Named Entity
Recognition, Document Classification, Feature Extraction, Triplet
Extraction, Clustering, Summarization, Topic Modelling, Dialog
Systems, Sentiment Analysis
➔ Self-starter who can see the big picture and prioritize work to make the largest impact on the business's and customers' vision
➔ Being a committer or a contributor to an open-source project is a plus
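The evaluation metrics named in the skills list above (confusion matrix, F-score, RMSE) can be computed by hand. A minimal pure-Python sketch with illustrative label vectors follows; in practice one would use scikit-learn's `metrics` module rather than reimplementing these:

```python
import math

def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1, "rmse": rmse}

# Illustrative predictions from a hypothetical text classifier
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
m = binary_metrics(y_true, y_pred)
```

AUC-ROC additionally needs predicted scores rather than hard labels, which is why it is usually computed by a library routine over the classifier's probability outputs.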
Contify is a people-oriented company. Emotional intelligence, therefore,
is a must. You should enjoy working in a team environment, supporting
your teammates in pursuit of our common goals, and working with your
colleagues to drive customer value. You strive to not only improve
yourself, but also those around you.
Key Responsibilities: (Data Developer: Python, Spark)
Experience: 2 to 9 yrs
Development of data platforms, integration frameworks, processes, and code.
Develop and deliver APIs in Python or Scala for Business Intelligence applications built using a range of web languages
Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.
Elaborate stories in a collaborative agile environment (SCRUM or Kanban)
Familiarity with cloud platforms like GCP, AWS or Azure.
Experience with large data volumes.
Familiarity with writing REST-based services.
Experience with distributed processing and systems
Experience with Hadoop / Spark toolsets
Experience with relational database management systems (RDBMS)
Experience with Data Flow development
Knowledge of Agile and associated development techniques
Proactively fetches information from various sources and analyzes it to better understand how the business performs and to build AI tools that automate certain processes within the company.
Roles & Responsibilities
- Develop novel computer vision/NLP algorithms
- Build large datasets that will be used to train the models
- Empirically evaluate related research works
- Train and evaluate deep learning architectures on multiple large scale datasets
- Collaborate with the rest of the research team to produce high quality research
- Manage a team of 2+ interns
- 2+years of experience in building deep learning models
- Strong basics around probability and statistics, linear algebra, data structure & algorithms
- Good knowledge of classic ML algorithms (regression, SVM, PCA, etc.) and deep learning
- Strong programming skills
Nice to have skills
- Familiarity with PyTorch
- Knowledge of SOTA techniques in NLP and Vision
- High level of responsibility and ownership for a product impacting billions of lives.
- Extremely high-quality talent to work with. Work with a global team between US / India.
- Work from anywhere anytime!
- Best-of-breed industry benefits packages.
- 3+ years of experience in Machine Learning
- Bachelor's/Master's in Computer Engineering/Science.
- Bachelor's/Master's in Engineering/Mathematics/Statistics with sound knowledge of programming and computer concepts.
- 10th and 12th academics: 70% and above.
- Strong Python/programming skills
- Good conceptual understanding of Machine Learning/Deep Learning/Natural Language Processing
- Strong verbal and written communication skills.
- Should be able to manage a team, meet project deadlines, and interface with clients.
- Should be able to work across different domains, quickly ramp up on business processes & flows, and translate business problems into data solutions