· Build data products and processes alongside the core engineering and technology team.
· Collaborate with senior data scientists to curate, wrangle, and prepare data for use in their advanced analytical models
· Integrate data from a variety of sources, assuring that they adhere to data quality and accessibility standards
· Modify and improve data engineering processes to handle ever larger, more complex, and more types of data sources and pipelines
· Use Hadoop architecture and HDFS commands to design and optimize data queries at scale
· Evaluate and experiment with novel data engineering tools and advises information technology leads and partners about new capabilities to determine optimal solutions for particular technical problems or designated use cases .
Striving for excellence is in our DNA.
We are more than just specialists; we are experts in agile software development with a keen focus on Cloud Native D3 (Digital, Data, DevSecOps. We help leading global businesses imagine, design, engineer, and deliver software and digital experiences that change the world.
Headquartered in Princeton, NJ (United States) we are multinational company that is growing fast . This role is based out of our India setup.
We believe that we are only as good as the quality of our people. Our offices are digital pods. Our clients are fortune brands. We’re always looking for the most talented and skilled teammates. Do you have it in you?
Synapsica is a series-A funded HealthTech startup founded by alumni from IIT Kharagpur, AIIMS New Delhi, and IIM Ahmedabad. We believe healthcare needs to be transparent and objective while being affordable. Every patient has the right to know exactly what is happening in their bodies and they don't have to rely on cryptic 2 liners given to them as a diagnosis.
Towards this aim, we are building an artificial intelligence enabled cloud based platform to analyse medical images and create v2.0 of advanced radiology reporting. We are backed by IvyCap, Endia Partners, YCombinator and other investors from India, US, and Japan. We are proud to have GE and The Spinal Kinetics as our partners. Here’s a small sample of what we’re building: https://www.youtube.com/watch?v=FR6a94Tqqls
Your Roles and Responsibilities
Synapsica is looking for a Principal AI Researcher to lead and drive AI based research and development efforts. Ideal candidate should have extensive experience in Computer Vision and AI Research, either through studies or industrial R&D projects and should be excited to work on advanced exploratory research and development projects in computer vision and machine learning to create the next generation of advanced radiology solutions.
The role involves computer vision tasks including development customization and training of Convolutional Neural Networks (CNNs); application of ML techniques (SVM, regression, clustering etc.), and traditional Image Processing (OpenCV, etc.). The role is research-focused and would involve going through and implementing existing research papers, deep dive of problem analysis, frequent review of results, generating new ideas, building new models from scratch, publishing papers, automating and optimizing key processes. The role will span from real-world data handling to the most advanced methods such as transfer learning, generative models, reinforcement learning, etc., with a focus on understanding quickly and experimenting even faster. Suitable candidate will collaborate closely both with the medical research team, software developers and AI research scientists. The candidate must be creative, ask questions, and be comfortable challenging the status quo. The position is based in our Bangalore office.
- Interface between product managers and engineers to design, build, and deliver AI models and capabilities for our spine products.
- Formulate and design AI capabilities of our stack with special focus on computer vision.
- Strategize end-to-end model training flow including data annotation, model experiments, model optimizations, model deployment and relevant automations
- Lead teams, engineers, and scientists to envision and build new research capabilities and ensure delivery of our product roadmap.
- Organize regular reviews and discussions.
- Keep the team up-to-date with latest industrial and research updates.
- Publish research and clinical validation papers
- 6+ years of relevant experience in solving complex real-world problems at scale using computer vision-based deep learning.
- Prior experience in leading and managing a team.
- Strong problem-solving ability
- Prior experience with Python, cuDNN, Tensorflow, PyTorch, Keras, Caffe (or similar Deep Learning frameworks).
- Extensive understanding of computer vision/image processing applications like object classification, segmentation, object detection etc
- Ability to write custom Convolutional Neural Network Architecture in Pytorch (or similar)
- Background in publishing research papers and/or patents
- Computer Vision and AI Research background in medical domain will be a plus
- Experience of GPU/DSP/other Multi-core architecture programming
- Effective communication with other project members and project stakeholders
- Detail-oriented, eager to learn, acquire new skills
- Prior Project Management and Team Leadership experience
- Ability to plan work and meet the deadline
Ganit has flipped the data science value chain as we do not start with a technique but for us, consumption comes first. With this philosophy, we have successfully scaled from being a small start-up to a 200 resource company with clients in the US, Singapore, Africa, UAE, and India.
We are looking for experienced data enthusiasts who can make the data talk to them.
- Understand business problems and translate business requirements into technical requirements.
- Conduct complex data analysis to ensure data quality & reliability i.e., make the data talk by extracting, preparing, and transforming it.
- Identify, develop and implement statistical techniques and algorithms to address business challenges and add value to the organization.
- Gather requirements and communicate findings in the form of a meaningful story with the stakeholders
- Build & implement data models using predictive modelling techniques. Interact with clients and provide support for queries and delivery adoption.
- Lead and mentor data analysts.
We are looking for someone who has:
- Apart from your love for data and ability to code even while sleeping you would need the following.
- Minimum of 02 years of experience in designing and delivery of data science solutions.
- You should have successful projects of retail/BFSI/FMCG/Manufacturing/QSR in your kitty to show-off.
- Deep understanding of various statistical techniques, mathematical models, and algorithms to start the conversation with the data in hand.
- Ability to choose the right model for the data and translate that into a code using R, Python, VBA, SQL, etc.
- Bachelors/Masters degree in Engineering/Technology or MBA from Tier-1 B School or MSc. in Statistics or Mathematics
- Predictive Modelling
- Prescriptive Modelling
- Descriptive Modelling
- Time Series
What is in it for you:
- Be a part of building the biggest brand in Data science.
- An opportunity to be a part of a young and energetic team with a strong pedigree.
- Work on awesome projects across industries and learn from the best in the industry, while growing at a hyper rate.
At Ganit, we are looking for people who love problem solving. You are encouraged to apply even if your experience does not precisely match the job description above. Your passion and skills will stand out and set you apart—especially if your career has taken some extraordinary twists and turns over the years. We welcome diverse perspectives, people who think rigorously and are not afraid to challenge assumptions in a problem. Join us and punch above your weight!
Ganit is an equal opportunity employer and is committed to providing a work environment that is free from harassment and discrimination.
All recruitment, selection procedures and decisions will reflect Ganit’s commitment to providing equal opportunity. All potential candidates will be assessed according to their skills, knowledge, qualifications, and capabilities. No regard will be given to factors such as age, gender, marital status, race, religion, physical impairment, or political opinions.
- Develop robust, scalable and maintainable machine learning models to answer business problems against large data sets.
- Build methods for document clustering, topic modeling, text classification, named entity recognition, sentiment analysis, and POS tagging.
- Perform elements of data cleaning, feature selection and feature engineering and organize experiments in conjunction with best practices.
- Benchmark, apply, and test algorithms against success metrics. Interpret the results in terms of relating those metrics to the business process.
- Work with development teams to ensure models can be implemented as part of a delivered solution replicable across many clients.
- Knowledge of Machine Learning, NLP, Document Classification, Topic Modeling and Information Extraction with a proven track record of applying them to real problems.
- Experience working with big data systems and big data concepts.
- Ability to provide clear and concise communication both with other technical teams and non-technical domain specialists.
- Strong team player; ability to provide both a strong individual contribution but also work as a team and contribute to wider goals is a must in this dynamic environment.
- Experience with noisy and/or unstructured textual data.
knowledge graph and NLP including summarization, topic modelling etc
- Strong coding ability with statistical analysis tools in Python or R, and general software development skills (source code management, debugging, testing, deployment, etc.)
- Working knowledge of various text mining algorithms and their use-cases such as keyword extraction, PLSA, LDA, HMM, CRF, deep learning & recurrent ANN, word2vec/doc2vec, Bayesian modeling.
- Strong understanding of text pre-processing and normalization techniques, such as tokenization,
- POS tagging and parsing and how they work at a low level.
- Excellent problem solving skills.
- Strong verbal and written communication skills
- Masters or higher in data mining or machine learning; or equivalent practical analytics / modelling experience
- Practical experience in using NLP related techniques and algorithms
- Experience in open source coding and communities desirable.
Able to containerize Models and associated modules and work in a Microservices environment
Who we are?
Searce is a niche’ Cloud Consulting business with futuristic tech DNA. We do new-age tech to realise the “Next” in the “Now” for our Clients. We specialise in Cloud Data Engineering, AI/Machine Learning and Advanced Cloud infra tech such as Anthos and Kubernetes. We are one of the top & the fastest growing partners for Google Cloud and AWS globally with over 2,500 clients successfully moved to cloud.
What do we believe?
- Best practices are overrated
- Implementing best practices can only make one an average .
- Honesty and Transparency
- We believe in naked truth. We do what we tell and tell what we do.
- Client Partnership
- Client - Vendor relationship: No. We partner with clients instead.
- And our sales team comprises 100% of our clients.
How do we work ?
It’s all about being Happier first. And rest follows. Searce work culture is defined by HAPPIER.
- Humble: Happy people don’t carry ego around. We listen to understand; not to respond.
- Adaptable: We are comfortable with uncertainty. And we accept changes well. As that’s what life's about.
- Positive: We are super positive about work & life in general. We love to forget and forgive. We don’t hold grudges. We don’t have time or adequate space for it.
- Passionate: We are as passionate about the great street-food vendor across the street as about Tesla’s new model and so on. Passion is what drives us to work and makes us deliver the quality we deliver.
- Innovative: Innovate or Die. We love to challenge the status quo.
- Experimental: We encourage curiosity & making mistakes.
- Responsible: Driven. Self motivated. Self governing teams. We own it.
So, what are we hunting for ?
As a Principal Data Scientist, you will help develop and enhance the algorithms and technology that powers our unique system. This role covers a wide range of challenges,
from developing new models using pre-existing components to enable current systems to
be more intelligent. You should be able to train models using existing data and use them in
most creative manner to deliver the smartest experience to customers. You will have to
develop multiple AI applications that push the threshold of intelligence in machines.
Working on multiple projects at a time, you will have to maintain a consistently high level of attention to detail while finding creative ways to provide analytical insights. You will also have to thrive in a fast, high-energy environment and should be able to balance multiple projects in real-time. The thrill of the next big challenge should drive you, and when faced with an obstacle, you should be able to find clever solutions.You must have the ability and interest to work on a range of different types of projects and business processes, and must have a background that demonstrates this ability.
Your bucket of Undertakings :
- Collaborate with team members to develop new models to be used for classification problems
- Work on software profiling, performance tuning and analysis, and other general software engineering tasks
- Use independent judgment to take existing data and build new models from it
- Collaborate and provide technical guidance and come up with new ideas,rapid prototyping and converting prototypes into scalable products
- Conduct experiments to assess the accuracy and recall of language processing modules and to study the effect of such experiences
- Lead AI R&D initiatives to include prototypes and minimum viable products
- Work closely with multiple teams on projects like Visual quality inspection, ML Ops, Conversational banking, Demand forecasting, Anomaly detection etc.
- Build reusable and scalable solutions for use across the customer base
- Prototype and demonstrate AI related products and solutions for customers
- Assist business development teams in the expansion and enhancement of a pipeline to support short- and long-range growth plans
- Identify new business opportunities and prioritize pursuits for AI
- Participate in long range strategic planning activities designed to meet the Company’s objectives and to increase its enterprise value and revenue goals
Education & Experience :
- BE/B.Tech/Masters in a quantitative field such as CS, EE, Information sciences, Statistics, Mathematics, Economics, Operations Research, or related, with focus on applied and foundational Machine Learning , AI , NLP and/or / data-driven statistical analysis & modelling
- 8+ years of Experience majorly in applying AI/ML/ NLP / deep learning / data-driven statistical analysis & modelling solutions to multiple domains, including financial engineering, financial processes a plus
- Strong, proven programming skills and with machine learning and deep learning and Big data frameworks including TensorFlow, Caffe, Spark, Hadoop. Experience with writing complex programs and implementing custom algorithms in these and other environments
- Experience beyond using open source tools as-is, and writing custom code on top of, or in addition to, existing open source frameworks
- Proven capability in demonstrating successful advanced technology solutions (either prototypes, POCs, well-cited research publications, and/or products) using ML/AI/NLP/data science in one or more domains
- Research and implement novel machine learning and statistical approaches
- Experience in data management, data analytics middleware, platforms and infrastructure, cloud and fog computing is a plus
- Excellent communication skills (oral and written) to explain complex algorithms, solutions to stakeholders across multiple disciplines, and ability to work in a diverse team
- Extensive experience with Hadoop and Machine learning algorithms
- Exposure to Deep Learning, Neural Networks, or related fields and a strong interest and desire to pursue them
- Experience in Natural Language Processing, Computer Vision, Machine Learning or Machine Intelligence (Artificial Intelligence)
- Passion for solving NLP problems
- Experience with specialized tools and project for working with natural language processing
- Knowledge of machine learning frameworks like Tensorflow, Pytorch
- Experience with software version control systems like Github
- Fast learner and be able to work independently as well as in a team environment with good written and verbal communication skills
bachelor’s degree or equivalent experience
● Knowledge of database fundamentals and fluency in advanced SQL, including concepts
such as windowing functions
● Knowledge of popular scripting languages for data processing such as Python, as well as
familiarity with common frameworks such as Pandas
● Experience building streaming ETL pipelines with tools such as Apache Flink, Apache
Beam, Google Cloud Dataflow, DBT and equivalents
● Experience building batch ETL pipelines with tools such as Apache Airflow, Spark, DBT, or
● Experience working with messaging systems such as Apache Kafka (and hosted
equivalents such as Amazon MSK), Apache Pulsar
● Familiarity with BI applications such as Tableau, Looker, or Superset
● Hands on coding experience in Java or Scala
Role and Responsibilities
- Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
- The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
- Integrate the data mining projects embedded data mining applications in the FogHorn platform (on Docker or Android).
Candidates must meet ALL of the following qualifications:
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
- Three years of industry work experience, developing data mining models which were deployed and used.
- Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
- Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks
Any of the following extra qualifications will make a candidate more competitive:
- Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
- Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Decision trees like XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
- Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains
About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.
- 2019 Edge Computing Company of the Year – Compass Intelligence
- 2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
- 2018 IoT Planforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
- 2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
- 2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
- 2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
- 2018 The 10 Most Innovative AI Solution Providers – Insights Success
- 2018 The AI 100 – CB Insights
- 2017 Cool Vendor in IoT Edge Computing – Gartner
- 2017 20 Most Promising AI Service Providers – CIO Review
Our Series A round was for $15 million. Our Series B round was for $30 million October 2017. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.
About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue generating projects for clients, such as predictive maintenance, time to failure, manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but developing vertical solution applications that run on our Lightning platform, with embedded data mining.
Our data scientists like our team because:
- We care about “best practices”
- Have a direct impact on the company’s revenue
- Give or receive mentoring as part of the collaborative process
- Questions and challenging the status quo with data is safe
- Intellectual curiosity balanced with humility
- Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning
Role and Responsibilities
- Build a low latency serving layer that powers DataWeave's Dashboards, Reports, and Analytics functionality
- Build robust RESTful APIs that serve data and insights to DataWeave and other products
- Design user interaction workflows on our products and integrating them with data APIs
- Help stabilize and scale our existing systems. Help design the next generation systems.
- Scale our back end data and analytics pipeline to handle increasingly large amounts of data.
- Work closely with the Head of Products and UX designers to understand the product vision and design philosophy
- Lead/be a part of all major tech decisions. Bring in best practices. Mentor younger team members and interns.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope.
Skills and Requirements
- 8- 15 years of experience building and scaling APIs and web applications.
- Experience building and managing large scale data/analytics systems.
- Have a strong grasp of CS fundamentals and excellent problem solving abilities. Have a good understanding of software design principles and architectural best practices.
- Be passionate about writing code and have experience coding in multiple languages, including at least one scripting language, preferably Python.
- Be able to argue convincingly why feature X of language Y rocks/sucks, or why a certain design decision is right/wrong, and so on.
- Be a self-starter—someone who thrives in fast paced environments with minimal ‘management’.
- Have experience working with multiple storage and indexing technologies such as MySQL, Redis, MongoDB, Cassandra, Elastic.
- Good knowledge (including internals) of messaging systems such as Kafka and RabbitMQ.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Exposure to one or more centralized logging, monitoring, and instrumentation tools, such as Kibana, Graylog, StatsD, Datadog etc.
- Working knowledge of building websites and apps. Good understanding of integration complexities and dependencies.
- Working knowledge linux server administration as well as the AWS ecosystem is desirable.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.