Job Responsibilities:
- Develop robust, scalable and maintainable machine learning models to answer business problems against large data sets.
- Build methods for document clustering, topic modeling, text classification, named entity recognition, sentiment analysis, and POS tagging.
- Perform data cleaning, feature selection and feature engineering, and organize experiments in line with best practices.
- Benchmark, apply, and test algorithms against success metrics. Interpret the results by relating those metrics to the business process.
- Work with development teams to ensure models can be implemented as part of a delivered solution replicable across many clients.
- Knowledge of Machine Learning, NLP, Document Classification, Topic Modeling and Information Extraction with a proven track record of applying them to real problems.
- Experience working with big data systems and big data concepts.
- Ability to provide clear and concise communication both with other technical teams and non-technical domain specialists.
- Strong team player: the ability both to make a strong individual contribution and to work as part of a team toward wider goals is a must in this dynamic environment.
- Experience with noisy and/or unstructured textual data.
- Knowledge of knowledge graphs and NLP, including summarization, topic modelling, etc.
- Strong coding ability with statistical analysis tools in Python or R, and general software development skills (source code management, debugging, testing, deployment, etc.)
- Working knowledge of various text mining algorithms and their use-cases such as keyword extraction, PLSA, LDA, HMM, CRF, deep learning & recurrent ANN, word2vec/doc2vec, Bayesian modeling.
- Strong understanding of text pre-processing and normalization techniques, such as tokenization, POS tagging and parsing, and how they work at a low level.
- Excellent problem solving skills.
- Strong verbal and written communication skills
- Masters or higher in data mining or machine learning; or equivalent practical analytics / modelling experience
- Practical experience in using NLP related techniques and algorithms
- Experience in open source coding and communities desirable.
- Able to containerize models and associated modules and work in a microservices environment.
About Srijan Technologies
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different sources that expose data through different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Process and transform data using various technologies such as Spark and Cloud Services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark
Roles and Responsibilities:
- Design, develop, and maintain the end-to-end MLOps infrastructure from the ground up, leveraging open-source systems across the entire MLOps landscape.
- Create pipelines for data ingestion, data transformation, and building, testing, and deploying machine learning models, and monitor and maintain the performance of these models in production.
- Manage the MLOps stack, including version control systems, continuous integration and deployment tools, containerization, orchestration, and monitoring systems.
- Ensure that the MLOps stack is scalable, reliable, and secure.
Skills Required:
- 3-6 years of MLOps experience
- Preferably worked in the startup ecosystem
Primary Skills:
- Experience with E2E MLOps systems like ClearML, Kubeflow, MLFlow etc.
- Technical expertise in MLOps: Should have a deep understanding of the MLOps landscape and be able to leverage open-source systems to build scalable, reliable, and secure MLOps infrastructure.
- Programming skills: Proficient in at least one programming language, such as Python, and have experience with data science libraries, such as TensorFlow, PyTorch, or Scikit-learn.
- DevOps experience: Should have experience with DevOps tools and practices, such as Git, Docker, Kubernetes, and Jenkins.
Secondary Skills:
- Version Control Systems (VCS) tools like Git and Subversion
- Containerization technologies like Docker and Kubernetes
- Cloud Platforms like AWS, Azure, and Google Cloud Platform
- Data Preparation and Management tools like Apache Spark, Apache Hadoop, and SQL databases like PostgreSQL and MySQL
- Machine Learning Frameworks like TensorFlow, PyTorch, and Scikit-learn
- Monitoring and Logging tools like Prometheus, Grafana, and Elasticsearch
- Continuous Integration and Continuous Deployment (CI/CD) tools like Jenkins, GitLab CI, and CircleCI
- Explainability and interpretability tools like LIME and SHAP
Data Engineer JD:
- Designing, developing, constructing, installing, testing and maintaining the complete data management & processing systems.
- Building a highly scalable, robust, fault-tolerant, & secure user data platform adhering to data protection laws.
- Taking care of the complete ETL (Extract, Transform & Load) process.
- Ensuring architecture is planned in such a way that it meets all the business requirements.
- Exploring new ways of using existing data, to provide more insights out of it.
- Proposing ways to improve data quality, reliability & efficiency of the whole system.
- Creating data models to reduce system complexity and hence increase efficiency & reduce cost.
- Introducing new data management tools & technologies into the existing system to make it more efficient.
- Setting up monitoring and alarming on data pipeline jobs to detect failures and anomalies
What do we expect from you?
- BS/MS in Computer Science or equivalent experience
- 5 years of recent experience in Big Data Engineering.
- Good experience in working with Hadoop and Big Data technologies like HDFS, Pig, Hive, Zookeeper, Storm, Spark, Airflow and NoSQL systems
- Excellent programming and debugging skills in Java or Python.
- Apache Spark and Python, with hands-on experience in deploying ML models
- Has worked on streaming and real-time pipelines
- Experience with Apache Kafka or has worked with any of Spark Streaming, Flume or Storm
Focus Area:
R1 | Data structure & Algorithms
R2 | Problem solving + Coding
R3 | Design (LLD)
- 5+ years of hands-on experience with penetration testing would be an added plus
- Strong knowledge of programming or scripting languages, such as Python, PowerShell, Bash
- Industry certifications like OSCP and AWS are highly desired for this role
- Well-rounded knowledge of security tools, software and processes
- Play a critical role as a member of the leadership team in shaping and supporting our overall company vision, day-to-day operations, and culture.
- Set the technical vision and build the technical product roadmap from launch to scale; including defining long-term goals and strategies
- Define best practices around coding methodologies, software development, and quality assurance
- Define innovative technical requirements and systems while balancing time, feasibility, cost and customer experience
- Build and support production products
- Ensure our internal processes and services comply with privacy and security regulations
- Establish a high performing, inclusive engineering culture focused on innovation, execution, growth and development
- Set a high bar for our overall engineering practices in support of our mission and goals
- Develop goals, roadmaps and delivery dates to help us scale quickly and sustainably
- Collaborate closely with Product, Business, Marketing and Data Science
- Experience with financial and transactional systems
- Experience engineering for large volumes of data at scale
- Experience with financial audit and compliance is a plus
- Experience building successful consumer-facing web and mobile apps at scale
Job Title : Analyst / Sr. Analyst – Data Science Developer - Python
Experience: 2 to 5 years
Location: Bangalore / Hyderabad / Chennai
Notice period: candidates should join within 2 months (maximum); immediate joiners preferred.
About the role:
We are looking for an Analyst / Senior Analyst who works in the analytics domain with a strong python background.
Desired Skills, Competencies & Experience:
- 2-4 years of experience working in the analytics domain with a strong Python background.
- Visualization skills in Python with plotly, matplotlib, seaborn, etc., including the ability to create customized plots using such tools.
- Ability to write effective, scalable and modular code. Should be able to quickly understand, test and debug existing Python project modules and contribute to them.
- Should be familiar with Git workflows.
Good to Have:
- Familiarity with cloud platforms like AWS, AzureML, Databricks, GCP, etc.
- Understanding of shell scripting and Python package development.
- Experience with Python data science packages like Pandas, numpy, sklearn, etc.
- ML model building and evaluation experience using sklearn.
Responsibilities for Data Engineer
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with stream-processing systems: Storm, Spark-Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
Required skill
- Around 6-8.5 years of experience, with around 4+ years in the AI / machine learning space
- Extensive experience in designing large-scale machine learning solutions for ML use cases, handling large-scale deployments, and establishing a continuous automated improvement / retraining framework.
- Strong experience in Python and Java is required.
- Hands-on experience with Scikit-learn, Pandas, NLTK
- Experience handling time-series data and associated techniques like Prophet and LSTM
- Experience in Regression, Clustering, classification algorithms
- Extensive experience in building traditional machine learning models (SVM, XGBoost, decision trees) and deep neural network models (RNN, feedforward) is required.
- Experience with AutoML tools such as TPOT or others
- Must have strong hands on experience in Deep learning frameworks like Keras, TensorFlow or PyTorch
- Knowledge of capsule networks, reinforcement learning, or SageMaker is desirable
- Understanding of the financial domain is desirable
Responsibilities
- Design and implement solutions for ML use cases
- Productionize systems and maintain them
- Lead and implement the data acquisition process for ML work
- Learn new methods and models quickly and apply them to solving use cases