- Adept at Machine learning techniques and algorithms.
Feature selection, dimensionality reduction, building and
- optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Doing ad-hoc analysis and presenting results
- Proficiency in using query languages such as N1QL, SQL
Experience with data visualization tools, such as D3.js, GGplot,
- Plotly, PyPlot, etc.
Creating automated anomaly detection systems and constant tracking
- of its performance
- Strong in Python is a must.
- Strong in Data Analysis and mining is a must
- Deep Learning, Neural Network, CNN, Image Processing (Must)
Building analytic systems - data collection, cleansing and
- integration
Experience with NoSQL databases, such as Couchbase, MongoDB,
Cassandra, HBase
About Accolite Software
Similar jobs
Requirements-
● B.Tech/Masters in Mathematics, Statistics, Computer Science or another quantitative field
● 2-3+ years of work experience in ML domain ( 2-5 years experience )
● Hands-on coding experience in Python
● Experience in machine learning techniques such as Regression, Classification,Predictive modeling, Clustering, Deep Learning stack, NLP.
● Working knowledge of Tensorflow/PyTorch
Optional Add-ons-
● Experience with distributed computing frameworks: Map/Reduce, Hadoop, Spark etc.
● Experience with databases: MongoDB
- At least 4 to 7 years of relevant experience as Big Data Engineer
- Hands-on experience in Scala or Python
- Hands-on experience on major components in Hadoop Ecosystem like HDFS, Map Reduce, Hive, Impala.
- Strong programming experience in building applications/platform using Scala or Python.
- Experienced in implementing Spark RDD Transformations, actions to implement business analysis
We are specialized in productizing solutions of new technology.
Our vision is to build engineers with entrepreneurial and leadership mindsets who can create highly impactful products and solutions using technology to deliver immense value to our clients.
We strive to develop innovation and passion into everything we do, whether it is services or products, or solutions.
- The ideal candidate is adept at using large data sets to find opportunities for product and process optimization and using models to test the effectiveness of different courses of action.
- Mine and analyze data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
- Assess the effectiveness and accuracy of new data sources and data gathering techniques.
- Develop custom data models and algorithms to apply to data sets.
- Use predictive modeling to increase and optimize customer experiences, revenue generation, ad targeting and other business outcomes.
- Develop company A/B testing framework and test model quality.
- Develop processes and tools to monitor and analyze model performance and data accuracy.
Roles & Responsibilities
- Experience using statistical languages (R, Python, SQL, etc.) to manipulate data and draw insights from large data sets.
- Experience working with and creating data architectures.
- Looking for someone with 3-7 years of experience manipulating data sets and building statistical models
- Has a Bachelor's, Master's in Computer Science or another quantitative field
- Knowledge and experience in statistical and data mining techniques :
- GLM/Regression, Random Forest, Boosting, Trees, text mining,social network analysis, etc.
- Experience querying databases and using statistical computer languages :R, Python, SQL, etc.
- Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees,neural networks, etc.
- Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.
- Experience visualizing/presenting data for stakeholders using: Periscope, Business Objects, D3, ggplot, etc.
2-5 yrs of proven experience in ML, DL, and preferably NLP.
Preferred Educational Background - B.E/B.Tech, M.S./M.Tech, Ph.D.
𝐖𝐡𝐚𝐭 𝐰𝐢𝐥𝐥 𝐲𝐨𝐮 𝐰𝐨𝐫𝐤 𝐨𝐧?
𝟏) Problem formulation and solution designing of ML/NLP applications across complex well-defined as well as open-ended healthcare problems.
2) Cutting-edge machine learning, data mining, and statistical techniques to analyse and utilise large-scale structured and unstructured clinical data.
3) End-to-end development of company proprietary AI engines - data collection, cleaning, data modelling, model training / testing, monitoring, and deployment.
4) Research and innovate novel ML algorithms and their applications suited to the problem at hand.
𝐖𝐡𝐚𝐭 𝐚𝐫𝐞 𝐰𝐞 𝐥𝐨𝐨𝐤𝐢𝐧𝐠 𝐟𝐨𝐫?
𝟏) Deeper understanding of business objectives and ability to formulate the problem as a Data Science problem.
𝟐) Solid expertise in knowledge graphs, graph neural nets, clustering, classification.
𝟑) Strong understanding of data normalization techniques, SVM, Random forest, data visualization techniques.
𝟒) Expertise in RNN, LSTM, and other neural network architectures.
𝟓) DL frameworks: Tensorflow, Pytorch, Keras
𝟔) High proficiency with standard database skills (e.g., SQL, MongoDB, Graph DB), data preparation, cleaning, and wrangling/munging.
𝟕) Comfortable with web scraping, extracting, manipulating, and analyzing complex, high-volume, high-dimensionality data from varying sources.
𝟖) Experience with deploying ML models on cloud platforms like AWS or Azure.
9) Familiarity with version control with GIT, BitBucket, SVN, or similar.
𝐖𝐡𝐲 𝐜𝐡𝐨𝐨𝐬𝐞 𝐮𝐬?
𝟏) We offer Competitive remuneration.
𝟐) We give opportunities to work on exciting and cutting-edge machine learning problems so you contribute towards transforming the healthcare industry.
𝟑) We offer flexibility to choose your tools, methods, and ways to collaborate.
𝟒) We always value and believe in new ideas and encourage creative thinking.
𝟓) We offer open culture where you will work closely with the founding team and have the chance to influence the product design and execution.
𝟔) And, of course, the thrill of being part of an early-stage startup, launching a product, and seeing it in the hands of the users.
• The incumbent should have hands on experience in data engineering and GCP data technologies.
• Should Work with client teams to design and implement modern, scalable data solutions using a range of new and emerging technologies from the Google Cloud Platform.
• Should Work with Agile and DevOps techniques and implementation approaches in the delivery.
• Showcase your GCP Data engineering experience when communicating with clients on their requirements, turning these into technical data solutions.
• Build and deliver Data solutions using GCP products and offerings.
• Have hands on Experience on Python
Experience on SQL or MySQL. Experience on Looker is an added advantage.
GCP Data Analyst profile must have below skills sets :
- Knowledge of programming languages like https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.simplilearn.com%2Ftutorials%2Fsql-tutorial%2Fhow-to-become-sql-developer&data=05%7C01%7Ca_anjali%40hcl.com%7C4ae720b3f3cc45c3e04608da3346b335%7C189de737c93a4f5a8b686f4ca9941912%7C0%7C0%7C637878675987971859%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EImfaJAD1KHOyrBQ7FkbaPl1STtfnf4QdQlbjw72%2BmE%3D&reserved=0" target="_blank">SQL, Oracle, R, MATLAB, Java and https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.simplilearn.com%2Fwhy-learn-python-a-guide-to-unlock-your-python-career-article&data=05%7C01%7Ca_anjali%40hcl.com%7C4ae720b3f3cc45c3e04608da3346b335%7C189de737c93a4f5a8b686f4ca9941912%7C0%7C0%7C637878675987971859%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Z2n1Xy%2F3YN6nQqSweU5T7EfUTa1kPAAjbCMTWxDCh%2FY%3D&reserved=0" target="_blank">Python
- Data cleansing, data visualization, data wrangling
- Data modeling , data warehouse concepts
- Adapt to Big data platform like Hadoop, Spark for stream & batch processing
- GCP (Cloud Dataproc, Cloud Dataflow, Cloud Datalab, Cloud Dataprep, BigQuery, Cloud Datastore, Cloud Datafusion, Auto ML etc)
● Good communication and collaboration skills with 4-7 years of experience.
● Ability to code and script with strong grasp of CS fundamentals, excellent problem solving abilities.
● Comfort with frequent, incremental code testing and deployment, Data management skills
● Good understanding of RDBMS
● Experience in building Data pipelines and processing large datasets .
● Knowledge of building Web Scraping and data mining is a plus.
● Working knowledge of open source tools such as mysql, Solr, ElasticSearch, Cassandra ( data stores )
would be a plus.
● Expert in Python programming
Role and responsibilities
● Inclined towards working in a start-up environment.
● Comfort with frequent, incremental code testing and deployment, Data management skills
● Design and Build robust and scalable data engineering solutions for structured and unstructured data for
delivering business insights, reporting and analytics.
● Expertise in troubleshooting, debugging, data completeness and quality issues and scaling overall
system performance.
● Build robust API ’s that powers our delivery points (Dashboards, Visualizations and other integrations).
Your mission is to help lead team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission critical problems, internally and externally. Your quest to embracing leading-edge technologies and methodologies inspires your team to follow suit.
Responsibilities and Duties :
- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data like structured, semi-structured &
unstructured. Having big data knowledge specially in Spark & Hive is highly preferred.
- Work in team and provide proactive technical oversight, advice development teams fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions
Education level :
- Bachelor's degree in Computer Science or equivalent
Experience :
- Minimum 5+ years relevant experience working on production grade projects experience in hands on, end to end software development
- Expertise in application, data and infrastructure architecture disciplines
- Expert designing data integrations using ETL and other data integration patterns
- Advanced knowledge of architecture, design and business processes
Proficiency in :
- Modern programming languages like Java, Python, Scala
- Big Data technologies Hadoop, Spark, HIVE, Kafka
- Writing decently optimized SQL queries
- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)
- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions
- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.
- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional
modeling, and Meta data modeling practices.
- Experience generating physical data models and the associated DDL from logical data models.
- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping,
and data rationalization artifacts.
- Experience enforcing data modeling standards and procedures.
- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.
- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals
Skills :
Must Know :
- Core big-data concepts
- Spark - PySpark/Scala
- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)
- Handling of various file formats
- Cloud platform - AWS/Azure/GCP
- Orchestration tool - Airflow