4+ BERT Jobs in Pune | BERT Job openings in Pune
We are looking for a talented and driven Data Scientist to join our growing Analytics team in India. In this role, you will work at the intersection of advanced machine learning, scalable MLOps infrastructure, and domain-specific healthcare analytics. You will collaborate closely with cross-functional teams to build, deploy, and maintain production-grade ML models that drive real-world impact in clinical trials and healthcare operations.
KEY RESPONSIBILITIES
End-to-End ML Development
• Design, build, and optimize predictive models across the full ML lifecycle—from data ingestion to model serving.
• Conduct rigorous Exploratory Data Analysis (EDA) to surface insights and drive feature engineering decisions.
• Validate model performance using appropriate statistical techniques and domain knowledge.
MLOps & Production Deployment
• Deploy, monitor, and maintain production-grade ML models using Databricks MLflow endpoints and Unity Catalog.
• Implement CI/CD pipelines for model versioning, experiment tracking, and automated retraining.
• Ensure model reliability, observability, and performance in live production environments.
Language Models & LLM Applications
• Apply transformer-based models (BERT, ClinicalBERT, Trial2Vec) for NLP tasks including classification, NER, and information extraction.
• Build and maintain vector similarity search pipelines for semantic retrieval and recommendation use cases.
• Fine-tune pre-trained models for domain-specific applications in clinical and healthcare contexts.
• Support exploratory work around LLM integration and prompt engineering for internal tooling.
Domain-Driven Analytics
• Apply advanced analytics within complex healthcare and clinical trial datasets—including patient records, trial protocols, and adverse event data.
• Translate ambiguous business problems into structured analytical frameworks with measurable outcomes.
• Partner with domain experts, product managers, and engineering teams to deliver data-driven solutions.
REQUIRED QUALIFICATIONS
Education
• Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, Bioinformatics, or a closely related field.
Experience
• 2–4 years of hands-on experience in a data science or machine learning role.
• Demonstrable experience deploying ML models in production environments (not just prototyping).
Technical Skills
• Strong proficiency in Python (pandas, NumPy, scikit-learn, PyTorch / TensorFlow).
• Experience with Databricks, MLflow (experiment tracking, model registry, endpoints), and Unity Catalog.
• Hands-on experience with BERT-family models and Hugging Face Transformers library.
• Familiarity with vector databases (e.g., FAISS, Pinecone, Weaviate) and embedding-based retrieval.
• Solid understanding of SQL and working with large structured/unstructured datasets.
• Exposure to cloud platforms (AWS / GCP / Azure) and distributed computing frameworks (Spark).
GOOD TO HAVE
• Prior experience with clinical trial data standards (CDISC, CDASH, SDTM) or healthcare ontologies (SNOMED, ICD-10).
• Familiarity with Trial2Vec or similar trial-to-vector embedding approaches.
• Experience with LLM fine-tuning, RAG pipelines, or prompt engineering in a production setting.
• Knowledge of regulatory and compliance considerations in healthcare AI (e.g., FDA guidelines, HIPAA).
• Contributions to open-source ML projects or published research.
THIS ROLE IS NOT FOR YOU IF…
• You have strong SQL/BI skills but limited hands-on ML modelling experience — or you’ve built models only in notebooks without ever deploying them to production.
• Your LLM exposure is limited to API calls and prompt engineering — with no experience fine-tuning models, working with embeddings, or building vector search pipelines.
Data Scientist
We are looking for an experienced Data Scientist to join our engineering team and help us enhance our mobile application with data. We want people who are passionate about developing ML/AI solutions, across various domains, that solve enterprise problems. We are keen on hiring someone who loves working in a fast-paced start-up environment and wants to solve challenging engineering problems.
As one of the earliest members of the engineering team, you will have the flexibility to design the models and architecture from the ground up. As with any early-stage start-up, we expect you to be comfortable wearing various hats and to be a proactive contributor in building something truly remarkable.
Responsibilities
Research, develop, and maintain machine learning and statistical models for business requirements.
Work across the spectrum of statistical modelling, including supervised, unsupervised, and deep learning techniques, to apply the right level of solution to the right problem.
Coordinate with different functional teams to monitor outcomes and refine and improve the machine learning models.
Implement models that uncover patterns and predictions, creating business value and innovation.
Identify unexplored data opportunities for the business to unlock and maximize the potential of digital data within the organization.
Develop NLP concepts and algorithms to classify and summarize structured and unstructured text data.
Qualifications
3+ years of experience solving complex business problems using machine learning.
Fluency in Python and hands-on experience with NLP and BERT are a must.
Strong analytical and critical thinking skills.
Experience in building production-quality models using state-of-the-art technologies.
Familiarity with databases like MySQL, Oracle, SQL Server, NoSQL, etc. is desirable.
Ability to collaborate on projects and work independently when required.
Previous experience in the Fintech/payments domain is a bonus.
Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, or another quantitative field from a top-tier institute.
Location: Pune
Experience: 3+ Years
Experience applying statistical methods (distribution analysis, classification, clustering, etc.).
Excellent analytical skills to mine data, develop algorithms, and analyze results to determine decisions or actions.
Good experience in data science with a focus on deep neural nets, statistics, empirical data analysis, machine learning, and Natural Language Processing.
Solid knowledge of various statistical techniques and experience using machine learning algorithms
Ability to come up with solutions to loosely defined business problems by leveraging pattern detection over potentially large datasets
Excellent relationship management skills with senior stakeholders are paramount.
Experience in practical data processing, data mining, text mining and information retrieval tasks
Location: Ahmedabad / Pune
Team: Technology
Company Profile
InFoCusp is a company working in the broad field of Computer Science, Software Engineering, and Artificial Intelligence (AI). It is headquartered in Ahmedabad, India, with a branch office in Pune.
We have worked on, and continue to work on, AI and algorithm-heavy projects with applications across the finance, healthcare, e-commerce, legal, HR/recruiting, pharmaceutical, leisure sports, and computer gaming domains. All of this is based on the core concepts of data science, computer vision, machine learning (with emphasis on deep learning), cloud computing, biomedical signal processing, text and natural language processing, distributed systems, embedded systems, and the Internet of Things.
PRIMARY RESPONSIBILITIES:
● Applying machine learning, deep learning, and signal processing on large datasets (audio, sensors, images, videos, text) to develop models.
● Architecting large scale data analytics/modeling systems.
● Designing and programming machine learning methods and integrating them into our ML framework/pipeline.
● Analyzing data collected from various sources.
● Evaluating and validating the analysis with statistical methods, and presenting the results in a lucid form to people not familiar with the domain of data science or computer science.
● Writing specifications for algorithms, reports on data analysis, and documentation of algorithms.
● Evaluating new machine learning methods and adapting them for our purposes.
● Feature engineering to add new features that improve model performance.
KNOWLEDGE AND SKILL REQUIREMENTS:
● Background and knowledge of recent advances in machine learning, deep learning, natural language processing, and/or image/signal/video processing with at least 3 years of professional work experience working on real-world data.
● Strong programming background, e.g. Python, C/C++, R, Java, and knowledge of software engineering concepts (OOP, design patterns).
● Knowledge of machine learning libraries such as TensorFlow, JAX, Keras, scikit-learn, and PyTorch.
● Excellent mathematical skills and background, e.g. accuracy, significance tests, visualization, advanced probability concepts.
● Ability to perform both independent and collaborative research.
● Excellent written and spoken communication skills.
● A proven ability to work in a cross-discipline environment in defined time frames.
● Knowledge and experience of deploying large-scale systems using distributed and cloud-based systems (Hadoop, Spark, Amazon EC2, Dataflow) is a big plus.
● Knowledge of systems engineering is a big plus.
● Some experience in project management and mentoring is also a big plus.
EDUCATION:
- B.E./B.Tech/B.S. candidates with significant prior experience in the aforementioned fields will be considered.
- M.E./M.S./M.Tech/PhD, preferably in fields related to Computer Science, with experience in machine learning, image and signal processing, or statistics.

