- Sr. Data Engineer:
Core Skills – Data Engineering, Big Data, PySpark, Spark SQL, and Python
Candidates with a prior Palantir Foundry or clinical trial data model background are preferred
- Responsible for data engineering, Foundry data pipeline creation, Foundry analysis and reporting, Slate application development, reusable code development and management, and integrating internal or external systems with Foundry for high-quality data ingestion.
- Has a good understanding of the Foundry platform landscape and its capabilities.
- Performs the data analysis required to troubleshoot data-related issues and assists in their resolution.
- Defines company data assets (data models) and the PySpark and Spark SQL jobs that populate them.
- Designs data integrations and the data quality framework.
- Designs and implements integrations with internal and external systems and the F1 AWS platform using Foundry Data Connector or the Magritte agent.
- Collaborates with data scientists, data analysts, and technology teams to document and leverage their understanding of Foundry's integration with different data sources.
- Actively participates in agile work practices.
- Coordinates with quality engineers to ensure that all quality controls, naming conventions, and best practices have been followed.
Desired Candidate Profile:
- Strong data engineering background
- Experience with Clinical Data Model is preferred
- Experience in:
- SQL Server, Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
- Java and Groovy for our back-end applications and data integration tools
- Python for data processing and analysis
- Cloud infrastructure based on AWS EC2 and S3
- 7+ years of IT experience, including 2+ years with the Palantir Foundry platform and 4+ years with big data platforms
- 5+ years of Python and PySpark development experience
- Strong troubleshooting and problem solving skills
- BTech or master's degree in computer science or a related technical field
- Experience designing, building, and maintaining big data pipeline systems
- Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
- Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
- Hands-on in programming languages, primarily Python, R, Java, and Unix shell scripting
- Hands-on experience with the AWS / Azure cloud platform and stack
- Strong in API-based architecture and concepts; able to do a quick PoC using API integration and development
- Knowledge of machine learning and AI
- Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.
Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision
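As an illustration of the reusable, quality-focused pipeline code this role calls for, here is a minimal sketch of a data-quality gate in plain Python (the field names are hypothetical, and a real Foundry pipeline would express the same checks as PySpark transforms):

```python
# Minimal sketch: a reusable data-quality gate of the kind a Foundry
# ingestion pipeline might apply (hypothetical field names; a real
# pipeline would express these checks as PySpark transforms).

def check_quality(rows, required_fields):
    """Split rows into (valid, errors) using simple completeness rules."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            errors.append((i, missing))
        else:
            valid.append(row)
    return valid, errors

rows = [
    {"subject_id": "S001", "visit": "V1"},
    {"subject_id": "", "visit": "V2"},  # fails the completeness check
]
valid, errors = check_quality(rows, ["subject_id", "visit"])
print(errors)  # → [(1, ['subject_id'])]
```

Separating the rule logic from the data source is what makes such checks reusable across pipelines.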
Hiring developers with 3 to 6 years of experience across multiple skill combinations.
Hands-on experience with SQL, Java, and Python
Knowledge of tools such as DBT, ADF, Snowflake, and Databricks would be an added advantage for our current project
Experience with ML and AWS would be a plus
We need people who can work from our Chennai branch.
Do share your profile to gayathrirajagopalan@jmangroup.com
The data science team is responsible for solving business problems with complex data. Data complexity can be characterized in terms of volume, dimensionality, and multiple touchpoints/sources. We understand the data, ask fundamental, first-principles questions, and apply our analytical and machine learning skills to solve the problem in the best way possible.
Our ideal candidate
The role would be a client facing one, hence good communication skills are a must.
The candidate should have the ability to communicate complex models and analysis in a clear and precise manner.
The candidate would be responsible for:
- Comprehending business problems properly - what to predict, how to build DV, what value addition he/she is bringing to the client, etc.
- Understanding and analyzing large, complex, multi-dimensional datasets and building features relevant for the business
- Understanding the math behind algorithms and choosing one over another
- Understanding approaches like stacking, ensemble and applying them correctly to increase accuracy
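To give a concrete flavour of the ensembling point above, here is a minimal majority-vote sketch in plain Python (toy predictions; true stacking would instead train a meta-learner on the base models' outputs):

```python
# Minimal sketch: majority-vote ensembling of base-classifier outputs
# (toy labels; true stacking would train a meta-learner on these).
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine equal-length prediction lists by per-example majority."""
    n_examples = len(predictions_per_model[0])
    return [
        Counter(preds[i] for preds in predictions_per_model).most_common(1)[0][0]
        for i in range(n_examples)
    ]

preds = [
    [1, 0, 1, 1],  # model A
    [1, 1, 0, 1],  # model B
    [0, 0, 1, 1],  # model C
]
print(majority_vote(preds))  # → [1, 0, 1, 1]
```

Voting works when base models make uncorrelated errors; knowing when that assumption holds is part of "choosing one approach over another".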
Desired technical requirements
- Proficiency with Python and the ability to write production-ready code.
- Experience in PySpark, machine learning, and deep learning
- Big data experience (e.g., familiarity with Spark and Hadoop) is highly preferred
- Familiarity with SQL or other databases.
Title: Data Engineer (Azure) (Location: Gurgaon/Hyderabad)
Salary: Competitive as per Industry Standard
We are expanding our Data Engineering Team and hiring passionate professionals with extensive
knowledge and experience in building and managing large enterprise data and analytics platforms. We
are looking for creative individuals with strong programming skills, who can understand complex
business and architectural problems and develop solutions. The individual will work closely with the rest
of our data engineering and data science team in implementing and managing Scalable Smart Data
Lakes, Data Ingestion Platforms, Machine Learning and NLP based Analytics Platforms, Hyper-Scale
Processing Clusters, Data Mining and Search Engines.
What You’ll Need:
- 3+ years of industry experience in creating and managing end-to-end data solutions, optimal data processing pipelines, and architecture dealing with large-volume, varied big data sets.
- Proficiency in Python, Linux and shell scripting.
- Strong knowledge of working with PySpark dataframes, Pandas dataframes for writing efficient pre-processing and other data manipulation tasks.
- Strong experience in developing the infrastructure required for data ingestion and the optimal extraction, transformation, and loading of data from a wide variety of data sources using tools like Azure Data Factory and Azure Databricks (or Jupyter notebooks / Google Colab, or other similar tools).
- Working knowledge of GitHub or other version control tools.
- Experience with creating RESTful web services and API platforms.
- Work with data science and infrastructure team members to implement practical machine
learning solutions and pipelines in production.
- Experience with cloud providers like Azure/AWS/GCP.
- Experience with SQL and NoSQL databases: MySQL, Azure Cosmos DB, HBase, MongoDB, Elasticsearch, etc.
- Experience with stream-processing systems (Spark Streaming, Kafka, etc.) and working experience with event-driven architectures.
- Strong analytic skills related to working with unstructured datasets.
Good to have (to filter or prioritize candidates)
- Experience with testing libraries such as pytest for writing unit tests for the developed code.
- Knowledge of machine learning algorithms and libraries would be good to have; implementation experience would be an added advantage.
- Knowledge and experience of data lakes, Docker, and Kubernetes would be good to have.
- Knowledge of Azure Functions, Elasticsearch, etc. would be good to have.
- Experience with model versioning (MLflow) and data versioning would be beneficial.
- Experience with microservices libraries, or with Python libraries such as Flask for hosting ML services and models, would be great.
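To illustrate the pytest point above, here is a minimal sketch of unit tests for a hypothetical preprocessing helper (pytest discovers `test_*` functions and reports on their plain assert statements):

```python
# Minimal sketch: pytest-style unit tests for a hypothetical
# preprocessing helper. pytest discovers `test_*` functions and
# collects plain `assert` statements.

def normalize_record(record):
    """Lowercase keys and strip whitespace from string values."""
    return {
        key.lower().strip(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }

def test_keys_are_lowercased():
    assert "customer_id" in normalize_record({"Customer_ID": "A1"})

def test_string_values_are_stripped():
    assert normalize_record({"name": "  Ada "})["name"] == "Ada"
```

Run with `pytest <file>.py`; each failing assertion is reported individually with the values involved.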
- The Data & Analytics team is responsible for integrating new data sources and building data models, data dictionaries, and machine learning models for the Wholesale Bank.
- The goal is to design and build data products to support squads in the Wholesale Bank with business outcomes and the development of business insights. In this job family we make a distinction between Data Analysts and Data Scientists. Both scientists and analysts work with data and are expected to write queries, work with engineering teams to source the right data, perform data munging (getting data into the correct format, convenient for analysis/interpretation), and derive information from data.
- The data analyst typically works on simpler structured SQL or similar databases or with other BI tools/packages. The Data Scientists are expected to build statistical models or be hands-on in machine learning and advanced programming.
- The role of the Data Scientist is to support our corporate banking teams with insights gained from analyzing company data. The ideal candidate is adept at using large data sets to find opportunities for product and process optimization and at using models to test the effectiveness of different courses of action. They must have strong experience using a variety of data mining / data analysis methods and data tools, building and implementing models, using/creating algorithms, and creating/running simulations. They must have banking or corporate banking experience.
Experience: 6 to 10 years
- Should be comfortable delivering Wholesale Banking domain analytical solutions within an AI/ML platform
1. Work on identifying and implementing data pre-processing pipelines for various datasets (images, videos) based on problem statement
2. Experimenting with, identifying, and testing the right techniques and deep learning libraries for training models
3. Training models for inferencing on cloud and edge
4. Work on improving the accuracy of deployed models
1. Experience with state-of-the-art deep learning convolutional neural networks
2. Sound knowledge of object detection, semantic segmentation, and instance segmentation (Faster R-CNN, Single Shot Detector (SSD), Mask R-CNN, MobileNet)
3. Hands-on experience in classic Image Processing Techniques (feature engineering) using OpenCV, Pillow
4. Proficiency in Python along with OOP concepts
5. Should be comfortable building ML models on various deep learning and machine learning libraries using PyTorch
6. Work experience in a startup environment is essential
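As a small, self-contained example of the object-detection fundamentals listed above, here is a sketch of intersection-over-union (IoU), the standard metric for matching predicted boxes to ground truth (coordinates are illustrative; detectors like Faster R-CNN use it for evaluation and NMS):

```python
# Minimal sketch: intersection-over-union (IoU) between two boxes
# given as (x1, y1, x2, y2); the coordinates below are illustrative.

def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (empty if the boxes do not intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ≈ 0.143 (25 / 175)
```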
Our client is an innovative fintech company that is revolutionizing the business of short-term finance. The company is an online lending startup driven by an app-enabled technology platform that solves the funding challenges of SMEs by offering quick-turnaround, paperless business loans without collateral. It counts over 2 million small businesses across 18 cities and towns as its customers. Its founders are IIT and ISB alumni with deep experience in the fintech industry from earlier work with organizations like Axis Bank, Aditya Birla Group, Fractal Analytics, and Housing.com. It has raised funds of Rs. 100 crore from finance industry stalwarts and is growing by leaps and bounds.
- Ensuring ease of data availability, with relevant dimensions, using Business Intelligence tools.
- Providing strong reporting and analytical information support to the management team.
- Transforming raw data into essential metrics based on the needs of relevant stakeholders.
- Performing data analysis for generating reports on a periodic basis.
- Converting essential data into easy-to-reference visuals using data visualization tools (Power BI, Metabase).
- Providing recommendations to update current MIS to improve reporting efficiency and consistency.
- Bringing fresh ideas to the table and keeping a keen eye on trends in the analytics and financial services industry.
What you need to have:
- Graduate (B.Tech / B.E.) and/or MBA / PGDM, with 3+ years of work experience.
- Experience in Reporting, Data Management (SQL, MongoDB), Visualization (PowerBI, Metabase, Data studio)
- Work experience in financial services (Indian banks' / NBFCs' in-house analytics units, or fintech / analytics start-ups) would be a plus.
- Skilled at writing & optimizing large complicated SQL queries & MongoDB scripts.
- Strong knowledge of Banking/ Financial Services domain
- Experience with some of the modern relational databases
- Ability to work on multiple projects of a different nature; self-driven
- Liaising with cross-functional teams to resolve data issues and build strong reports
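To give a flavour of the reporting work described above, here is a minimal MIS-style aggregation using stdlib sqlite3 (the table, columns, and figures are hypothetical stand-ins; production reports would run against the lender's actual databases):

```python
# Minimal sketch: an MIS-style aggregation query with stdlib sqlite3.
# Table, columns, and figures are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (city TEXT, amount REAL, status TEXT)")
conn.executemany("INSERT INTO loans VALUES (?, ?, ?)", [
    ("Chennai", 50000, "active"),
    ("Chennai", 25000, "closed"),
    ("Gurgaon", 75000, "active"),
])
# Active-loan count and disbursed amount per city.
rows = conn.execute("""
    SELECT city, COUNT(*) AS n, SUM(amount) AS disbursed
    FROM loans WHERE status = 'active'
    GROUP BY city ORDER BY city
""").fetchall()
print(rows)  # → [('Chennai', 1, 50000.0), ('Gurgaon', 1, 75000.0)]
```

The same GROUP BY shape underlies most periodic MIS reports; the visualization layer (Power BI, Metabase) then sits on top of queries like this.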
We are looking for an ML engineer (Neuroscience) and would like them to
- Build ML algorithms on MRI/ Medical data sets
- Design and build intelligent agents using continual learning and deep reinforcement learning techniques
- Study and transform data science prototypes
- Design machine learning systems
- Research and implement appropriate ML algorithms and tools
- Develop machine learning applications according to requirements
- Select appropriate datasets and data representation methods
- Run machine learning tests and experiments
- Perform statistical analysis and fine-tune using test results
- Train and retrain systems when necessary
- Extend existing ML libraries and frameworks
- Keep abreast of developments in the field
- Collaborate on technical proposals to grow and define artificial intelligence research for intelligent systems
- Knowledge of MRI image processing and inferencing
- Hands-on machine learning expertise with an intensive knowledge of hyperparameter optimization, statistical assumption, and implications
- Experience in deep learning models
- Excellent Python programming skills for ML coding
- Understanding of ML integration with our software
- Good understanding of neuroscience/ clinical data
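As an illustration of the hyperparameter-optimization expertise listed above, here is a minimal exhaustive grid-search sketch in plain Python (the validation function is a hypothetical stand-in for training a model and scoring it on held-out data):

```python
# Minimal sketch: exhaustive grid search over hyperparameters.
# `validation_score` is a hypothetical stand-in for "train a model
# with these settings and evaluate it on a validation set".
from itertools import product

def validation_score(lr, depth):
    # Toy objective peaking at lr=0.1, depth=5.
    return -(lr - 0.1) ** 2 - (depth - 5) ** 2

grid = {"lr": [0.01, 0.1, 1.0], "depth": [3, 5, 7]}
best = max(product(grid["lr"], grid["depth"]),
           key=lambda params: validation_score(*params))
print(best)  # → (0.1, 5)
```

Grid search is the simplest baseline; random search or Bayesian optimization scale better as the number of hyperparameters grows.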
Bonus if you:
- Are enthusiastic about all things brain science and/or mental well-being
- Have writing experience and/or enthusiasm for it; we highly value the ability to communicate well, and we all take turns on the blog roster
- Share our core values of empathy and innovation
More about us : bit.ly/workatjarapp
Jar is seeking a talented Senior Product Analyst to join our Team. If you are intellectually curious, if you eat/sleep/drink data and are committed to translating data to insights & insights to actionable work items, want new challenges daily and impact the lives of hundreds of thousands of users, this is the role for you!
What You Will Do
- Deliver insight and analysis using statistical tools, data visualization, and business use cases with the Product and Business teams
- Understand product analytics and engagement platforms such as CleverTap, Amplitude, and Apxor
- Conduct analysis to determine new project pilot settings, new features, user behaviour, and in-app behaviour
- Build & maintain dashboards for tracking business performance and product adoption
- Assist Product Managers and Business teams in creating data-backed decisions
- Collaborate with Consumer Platform's Product and Business teams in identifying new avenues for growth and opportunities, and back their product delivery with experimentation
- Build first cut Machine Learning models based on product requirements
- Automate data extraction by creating denormalized tables
What You Will Need
- 2+ years of work experience dealing with product analytics, data, and statistics
- Expertise in SQL, with experience using data visualization and dashboarding tools (e.g. Tableau, Metabase, Google Data Studio, CleverTap, Python)
- Experience with machine learning techniques (e.g. forecasting, clustering, statistical significance testing, predictive modeling, and text mining)
- Experience in delivering products as end-to-end data solutions (from data pipelining to analysis, presentation, and scalable adoption)
- A strong business sense with the ability to transform ambiguous business and product issues into well-scoped, impactful analysis
- Strong ability to design and conduct simple experiments
- A goal-oriented, critical-thinking mindset with the ability to work equally well within a team and independently with minimal supervision
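As a sketch of the "simple experiments" skill above, here is a two-proportion z-test for an A/B experiment in stdlib Python (the conversion counts are illustrative):

```python
# Minimal sketch: two-proportion z-test for an A/B experiment
# (stdlib only; the counts below are illustrative).
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) comparing conversion rates A vs B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Variant B converts at 13.5% vs 10.0% for A, 1000 users each.
z, p_value = two_proportion_z_test(100, 1000, 135, 1000)
```

With these counts the difference is significant at the usual 5% level; in practice the sample size should be fixed before the experiment starts to avoid peeking bias.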
- Comfort with evolving culture and operating norms, given the fast-paced nature of a new, high-growth company
• 7+ years of industry experience primarily related to unstructured text data and NLP (PhD work and internships will be considered in lieu of industry experience if they are related to unstructured text, but not more than 2 years will be counted towards this requirement)
• Develop natural-language comprehension products for medical / healthcare documents to support Health business objectives and products, improve processing efficiency, and reduce overall healthcare costs
• Gather external data sets; build synthetic data and label data sets as per the needs
• Apply expert software engineering skills to build natural-language products that improve automation and user experiences, leveraging unstructured data storage, entity recognition, POS tagging, ontologies, taxonomies, data mining, information retrieval techniques, machine learning approaches, and distributed and cloud computing
• Own the natural language and text mining products, from platforms to systems for model training, versioning, deployment, storage, and testing, creating real-time feedback loops for fully automated services
• Work closely and collaborate with Data Scientists, Machine Learning engineers, IT
teams and Business stakeholders spread out across various locations in US and India
to achieve business goals
• Provide mentoring to other Data Scientists and Machine Learning Engineers
• Strong understanding of mathematical concepts including, but not limited to, linear algebra, advanced calculus, partial differential equations, and statistics
• Strong programming experience, including understanding of concepts in data structures, algorithms, compression techniques, high-performance computing, distributed computing, and various computer architectures
• Good understanding of and experience with traditional data science approaches like sampling techniques, feature engineering, classification and regression, SVMs, and tree-based models
• Additional coursework, projects, research participation, and/or publications in natural language processing, reasoning and understanding, information retrieval, text mining, search, computational linguistics, ontologies, and semantics
• Experience with developing and deploying products in production with experience
in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the following
cloud vendors like AWS, Azure, and Google for 2+ years
• Hands on experience with one or more of high-performance computing and
distributed computing like Spark, Dask, Hadoop, CUDA distributed GPU (2+ years)
• Thorough understanding of deep learning architectures and hands-on experience with one or more frameworks like TensorFlow, PyTorch, or Keras (2+ years)
• Hands-on experience with libraries and tools like spaCy, NLTK, Stanford CoreNLP, Gensim, and John Snow Labs for 5+ years
• Understand business use cases and be able to translate them to the team, with a vision of how to implement them
• Identify enhancements and build best practices that help improve the productivity of the team.
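As a small illustration of the information-retrieval and text-mining fundamentals above, here is a TF-IDF weighting sketch in plain Python (toy corpus; a production system would use libraries like spaCy or Gensim):

```python
# Minimal sketch: TF-IDF term weighting over a toy corpus
# (stdlib only; real pipelines would use spaCy, Gensim, etc.).
from collections import Counter
from math import log

docs = [
    "patient reports mild headache",
    "patient denies headache",
    "no adverse events reported",
]
tokenized = [d.split() for d in docs]
# Document frequency: in how many documents each term appears.
df = Counter(term for doc in tokenized for term in set(doc))
N = len(docs)

def tfidf(doc_tokens):
    """Weight each term by term frequency times inverse document frequency."""
    tf = Counter(doc_tokens)
    return {t: tf[t] * log(N / df[t]) for t in tf}

weights = tfidf(tokenized[0])
```

Rare terms like "mild" end up weighted higher than corpus-wide terms like "patient", which is exactly the property retrieval and text-mining systems exploit.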
We are looking for a data scientist who will help us discover the information hidden in vast amounts of data and help us make smarter decisions to deliver even better products. Your primary focus will be applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products.
- Selecting features, building and optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Extending company’s data with third party sources of information when needed
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Creating automated anomaly detection systems and constantly tracking their performance
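To give a flavour of the anomaly-detection responsibility above, here is a minimal z-score detector in stdlib Python (the threshold and data are illustrative, not tuned; production systems would add seasonality handling and performance tracking):

```python
# Minimal sketch: z-score anomaly detection (stdlib only; the
# threshold and data are illustrative, not tuned).
from statistics import mean, stdev

def find_anomalies(values, threshold=2.5):
    """Return indices of values whose |z-score| exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

data = [10, 11, 9, 10, 10, 11, 9, 10, 50, 10]
print(find_anomalies(data))  # → [8]
```

A caveat worth knowing: the outlier itself inflates the mean and standard deviation, so robust variants use the median and MAD instead.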
Skills and Qualifications
- Excellent understanding of machine learning techniques and algorithms, such as linear regression, SVMs, decision forests, LSTMs, CNNs, etc.
- Experience with Deep Learning preferred.
- Experience with common data science toolkits, such as R, NumPy, MATLAB, etc.; excellence in at least one of these is highly desirable
- Great communication skills
- Proficiency in using query languages such as SQL, Hive, Pig
- Good applied statistics skills, such as statistical testing, regression, etc.
- Good scripting and programming skills
- Data-oriented personality