BERT Jobs in Delhi, NCR and Gurgaon

Apply to 2+ BERT Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest BERT Job opportunities across top companies like Google, Amazon & Adobe.

Data Scientist - Machine Learning & MLOps | Healthcare Analytics

at Sentiaflow

Posted by Sonal Agarwal

Remote, Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Pune

2 - 4 yrs

₹30L - ₹40L / yr

Databricks

MLflow

Python

BERT

Large Language Models (LLM) tuning

We are looking for a talented and driven Data Scientist to join our growing Analytics team in India. In this role, you will work at the intersection of advanced machine learning, scalable MLOps infrastructure, and domain-specific healthcare analytics. You will collaborate closely with cross-functional teams to build, deploy, and maintain production-grade ML models that drive real-world impact in clinical trials and healthcare operations.

KEY RESPONSIBILITIES

End-to-End ML Development

• Design, build, and optimize predictive models across the full ML lifecycle—from data ingestion to model serving.

• Conduct rigorous Exploratory Data Analysis (EDA) to surface insights and drive feature engineering decisions.

• Validate model performance using appropriate statistical techniques and domain knowledge.

MLOps & Production Deployment

• Deploy, monitor, and maintain production-grade ML models using Databricks MLflow endpoints and Unity Catalog.

• Implement CI/CD pipelines for model versioning, experiment tracking, and automated retraining.

• Ensure model reliability, observability, and performance in live production environments.

Language Models & LLM Applications

• Apply transformer-based models (BERT, ClinicalBERT, Trial2Vec) for NLP tasks including classification, NER, and information extraction.

• Build and maintain vector similarity search pipelines for semantic retrieval and recommendation use cases.

• Fine-tune pre-trained models for domain-specific applications in clinical and healthcare contexts.

• Support exploratory work around LLM integration and prompt engineering for internal tooling.

Domain-Driven Analytics

• Apply advanced analytics within complex healthcare and clinical trial datasets—including patient records, trial protocols, and adverse event data.

• Translate ambiguous business problems into structured analytical frameworks with measurable outcomes.

• Partner with domain experts, product managers, and engineering teams to deliver data-driven solutions.

REQUIRED QUALIFICATIONS

Education

• Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, Bioinformatics, or a closely related field.

Experience

• 2–4 years of hands-on experience in a data science or machine learning role.

• Demonstrable experience deploying ML models in production environments (not just prototyping).

Technical Skills

• Strong proficiency in Python (pandas, NumPy, scikit-learn, PyTorch / TensorFlow).

• Experience with Databricks, MLflow (experiment tracking, model registry, endpoints), and Unity Catalog.

• Hands-on experience with BERT-family models and the Hugging Face Transformers library.

• Familiarity with vector databases (e.g., FAISS, Pinecone, Weaviate) and embedding-based retrieval.

• Solid understanding of SQL and working with large structured/unstructured datasets.

• Exposure to cloud platforms (AWS / GCP / Azure) and distributed computing frameworks (Spark).

GOOD TO HAVE

• Prior experience with clinical trial data standards (CDISC, CDASH, SDTM) or healthcare ontologies (SNOMED, ICD-10).

• Familiarity with Trial2Vec or similar trial-to-vector embedding approaches.

• Experience with LLM fine-tuning, RAG pipelines, or prompt engineering in a production setting.

• Knowledge of regulatory and compliance considerations in healthcare AI (e.g., FDA guidelines, HIPAA).

• Contributions to open-source ML projects or published research.

THIS ROLE IS NOT FOR YOU IF…

• You have strong SQL/BI skills but limited hands-on ML modelling experience — or you’ve built models only in notebooks without ever deploying them to production.

• Your LLM exposure is limited to API calls and prompt engineering — with no experience fine-tuning models, working with embeddings, or building vector search pipelines.

Data Scientist

Fintech lead

Agency job

via The Hub by Sridevi Viswanathan

Gurugram, Noida

3 - 8 yrs

₹5L - ₹15L / yr

Natural Language Processing (NLP)

BERT

Machine Learning (ML)

Data Science

Python

Who we are looking for

· A Natural Language Processing (NLP) expert with strong computer science fundamentals and experience working with deep learning frameworks. You will work at the cutting edge of NLP and Machine Learning.

Roles and Responsibilities

· Work as part of a distributed team to research, build and deploy Machine Learning models for NLP.

· Mentor and coach other team members.

· Evaluate the performance of NLP models and identify ways to improve them.

· Support internal and external NLP-facing APIs.

· Keep up to date with current research on NLP, Machine Learning and Deep Learning.

Mandatory Requirements

· A bachelor's degree in any discipline, with at least 2 years of demonstrated experience as a Data Scientist.

Behavioural Skills

· Strong analytical and problem-solving capabilities.

· Proven ability to multitask and deliver results within tight time frames.

· Strong verbal and written communication skills.

· Strong listening skills and eagerness to learn.

· Strong attention to detail and the ability to work efficiently both in a team and individually.

Technical Skills

Hands-on experience with

· NLP

· Deep Learning

· Machine Learning

· Python

· BERT

Preferred Requirements

· Experience in Computer Vision.

Role: Data Scientist

Industry Type: Banking

Department: Data Science & Analytics

Employment Type: Full Time, Permanent

Role Category: Data Science & Machine Learning