Graas uses predictive AI to turbo-charge growth for eCommerce businesses. We are “Growth-as-a-Service”. Graas integrates traditional data silos and applies a machine-learning AI engine, acting as an in-house data scientist to predict trends and give real-time insights and actionable recommendations for brands. The platform can also turn insights into action by seamlessly executing these recommendations across marketplace storefronts, brand.coms, social and conversational commerce, performance marketing, inventory management, warehousing, and last-mile logistics, all of which impact a brand’s bottom line and drive profitable growth.
Roles & Responsibilities:
- Work on the implementation of real-time and batch data pipelines for disparate data sources (a batch ETL sketch follows this list).
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies.
- Build and maintain an analytics layer that utilizes the underlying data to generate dashboards and provide actionable insights.
- Identify improvement areas in the current data system and implement optimizations.
- Work on specific areas of data governance including metadata management and data quality management.
- Participate in discussions with Product Management and Business stakeholders to understand functional requirements and interact with other cross-functional teams as needed to develop, test, and release features.
- Develop Proof-of-Concepts to validate new technology solutions or advancements.
- Work in an Agile Scrum team and help with planning, scoping and creation of technical solutions for the new product capabilities, through to continuous delivery to production.
- Work on building intelligent systems using various AI/ML algorithms.
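For illustration, here is a minimal sketch of the batch side of such a pipeline: extract a landed CSV, transform with pandas, and load into a warehouse over SQLAlchemy. The connection string, file path, and table schema are invented for the example, not Graas specifics.

```python
# A minimal batch ETL sketch; paths, credentials, and schema are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@warehouse:5432/analytics")

# Extract: one day's raw orders, landed as CSV by an upstream job
orders = pd.read_csv("/data/landing/orders_2024-01-01.csv", parse_dates=["order_ts"])

# Transform: deduplicate and derive a partition-friendly date column
orders = orders.drop_duplicates(subset="order_id")
orders["order_date"] = orders["order_ts"].dt.date

# Load: append into the warehouse fact table
orders.to_sql("fact_orders", engine, schema="ecommerce",
              if_exists="append", index=False)
```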
Desired Experience/Skill:
- Must have worked on Analytics Applications involving Data Lakes, Data Warehouses and Reporting Implementations.
- Experience with private and public cloud architectures with pros/cons.
- Ability to write robust code in Python and SQL for data processing. Experience with libraries such as Pandas is a must; knowledge of a framework such as Django or Flask is a plus.
- Experience in implementing data processing pipelines using AWS services: Kinesis, Lambda, Redshift/Snowflake, RDS (a Lambda consumer sketch follows this list).
- Knowledge of Kafka and Redis is preferred.
- Experience in designing and implementing real-time and batch pipelines. Knowledge of Airflow is preferred.
- Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)
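As a hedged illustration of the Kinesis/Lambda pattern above, here is a minimal sketch of a Lambda consumer that decodes stream records and lands them in S3; the bucket name and key layout are assumptions.

```python
# A minimal AWS Lambda consumer for a Kinesis stream; bucket/key are assumptions.
import base64
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Kinesis delivers records base64-encoded inside the Lambda event payload
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        doc = json.loads(payload)
        key = f"raw/events/{record['kinesis']['sequenceNumber']}.json"
        s3.put_object(Bucket="example-data-lake", Key=key,
                      Body=json.dumps(doc).encode("utf-8"))
    return {"processed": len(event["Records"])}
```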
Similar jobs
We are looking for a technically driven "MLOps Engineer" for one of our premium clients.
Key Skills
• Excellent hands-on expert knowledge of cloud platform infrastructure and administration (Azure/AWS/GCP), with strong knowledge of cloud services integration and cloud security
• Expertise setting up CI/CD processes and building and maintaining secure DevOps pipelines with at least 2 major DevOps stacks (e.g., Azure DevOps, GitLab, Argo)
• Experience with modern development methods and tooling: containers (e.g., Docker) and container orchestration (K8s), CI/CD tools (e.g., CircleCI, Jenkins, GitHub Actions, Azure DevOps), version control (Git, GitHub, GitLab), orchestration/DAG tools (e.g., Argo, Airflow, Kubeflow)
• Hands-on Python 3 coding skills (e.g., APIs), including automated testing frameworks and libraries (e.g., pytest; see the sketch after this list), Infrastructure as Code (e.g., Terraform), and Kubernetes artifacts (e.g., deployments, operators, Helm charts)
• Experience setting up at least one contemporary MLOps tool (e.g., experiment tracking, model governance, packaging, deployment, feature store)
• Practical knowledge of delivering and maintaining production software such as APIs and cloud infrastructure
• Knowledge of SQL (intermediate level or better preferred) and familiarity with at least one common RDBMS (MySQL, Postgres, SQL Server, Oracle)
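For example, a minimal pytest sketch of the kind of automated testing referenced above; transform() here is a hypothetical stand-in for real application code.

```python
# A minimal pytest sketch; transform() is an invented function under test.
import pytest

def transform(payload: dict) -> dict:
    """Toy function under test: normalise a user record."""
    if "email" not in payload:
        raise ValueError("email is required")
    return {**payload, "email": payload["email"].lower()}

def test_transform_lowercases_email():
    assert transform({"email": "Ops@Example.COM"})["email"] == "ops@example.com"

def test_transform_rejects_missing_email():
    with pytest.raises(ValueError):
        transform({})
```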
- Sound knowledge of MongoDB as a primary skill
- Hands-on experience with MySQL as a secondary skill
- Experience with replication, sharding, and scaling (a replica-set status sketch follows this list)
- Design, install, and maintain highly available systems (including monitoring, security, backup, and performance tuning)
- Implement secure database and server installations (privileged access methodology / role-based access)
- Help the application team with query writing, performance tuning, and other day-to-day issues
- Deploy automation techniques for day-to-day operations
- Must possess good analytical and problem-solving skills
- Must be willing to work flexible hours as needed
- Scripting experience a plus
- Ability to work independently and as a member of a team
- Good verbal and written communication skills
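As a small illustration of working with replication, a sketch that checks replica-set health with pymongo; the connection string and host names are assumptions.

```python
# A minimal replica-set health check with pymongo; hosts are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://db1:27017,db2:27017/?replicaSet=rs0")

# replSetGetStatus is an admin command reporting each member's state
status = client.admin.command("replSetGetStatus")
for member in status["members"]:
    print(f"{member['name']}: {member['stateStr']}")  # e.g., PRIMARY / SECONDARY
```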
- A Data and MLOps Engineering lead with a good understanding of modern data engineering frameworks, focused on Microsoft Azure and Azure Machine Learning, its development lifecycle, and DevOps.
- Aims to solve the problems encountered when turning data transformations and data science code into production machine learning systems that deliver meaningful solutions. Some of these challenges include:
- ML orchestration: how can I automate my ML workflows across multiple environments? (A workflow-submission sketch follows this list.)
- Scalability: how can I take advantage of the huge computational power available in the cloud?
- Serving: how can I make my ML models available to make predictions reliably when needed?
- Monitoring: how can I effectively monitor my ML system in production to ensure reliability? Not just system metrics, but also insight into how my models are performing over time
- Reuse: how can I promote reuse of the artefacts built, and establish templates and patterns?
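For instance, a minimal sketch of submitting a training workflow to Azure Machine Learning, assuming the (older) v1 azureml-core SDK; the workspace config, compute target name, and train.py are illustrative assumptions.

```python
# A hypothetical sketch of automating an ML training run with azureml-core (SDK v1).
from azureml.core import Workspace, Experiment, ScriptRunConfig, Environment

ws = Workspace.from_config()                      # reads config.json for the workspace
env = Environment.from_pip_requirements(          # reproducible runtime for the job
    name="train-env", file_path="requirements.txt"
)
run_config = ScriptRunConfig(
    source_directory="src",                       # folder containing train.py
    script="train.py",
    compute_target="cpu-cluster",                 # assumed pre-provisioned compute
    environment=env,
)
run = Experiment(ws, "demand-forecast").submit(run_config)
run.wait_for_completion(show_output=True)         # block until the remote run finishes
```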
The MLOps team works closely with the ML Engineering and DevOps teams. Rather than focusing only on individual use cases, the aim is to specialise in building the platforms and tools that help adoption of MLOps across the organisation, and to develop best practices and ways of working towards a state-of-the-art MLOps capability.
A good understanding of AI/Machine Learning and software engineering best practices such as Cloud Engineering, Infrastructure-as-Code, and CI/CD is expected.
You should have excellent communication and consulting skills, while delivering innovative AI solutions on Azure.
Responsibilities will include:
- Building state-of-the-art MLOps platforms and tooling to help adoption of MLOps across the organization
- Designing cloud ML architectures and providing a roadmap for flexible patterns
- Optimizing solutions for performance and scalability
- Leading and driving the evolving best practices for MLOps
- Helping to showcase expertise and leadership in this field
Tech stack
These are some of the tools and technologies that we use day to day. Key to success will be attitude and aptitude, with a vision to build the next big thing in the AI/ML field.
- Python - including poetry for dependency management, pytest for automated testing, and fastapi for building APIs (a serving sketch follows this list)
- Microsoft Azure Platform - primarily focused on Databricks, Azure ML
- Containers
- CI/CD – Azure DevOps
- Strong programming skills in Python
- Solid understanding of cloud concepts
- Demonstrable interest in Machine Learning
- Understanding of IaC and CI/CD concepts
- Strong communication and presentation skills.
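As a flavour of the stack, here is a minimal fastapi serving sketch, assuming a scikit-learn-style model pickled to model.pkl; the path and payload shape are assumptions.

```python
# A minimal model-serving sketch with fastapi; model path and schema assumed.
import pickle
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="model-serving-sketch")

with open("model.pkl", "rb") as f:          # load the trained model once at startup
    model = pickle.load(f)

class Features(BaseModel):
    values: list[float]                      # one flat feature vector per request

@app.post("/predict")
def predict(features: Features) -> dict:
    prediction = model.predict([features.values])  # 2-D input: one row
    return {"prediction": prediction.tolist()}
```

Run locally with `uvicorn main:app` and POST a feature vector to /predict.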
Remuneration: Best in the industry
Connect: https://www.linkedin.com/in/shweta-gupta-a361511
Responsibilities:
• Designing Hive/HCatalog data models, including table definitions, file formats, and compression techniques for structured & semi-structured data processing
• Implementing Spark-based ETL frameworks (see the PySpark sketch after this list)
• Implementing big data pipelines for data ingestion, storage, processing & consumption
• Modifying the Informatica-Teradata & Unix based data pipelines
• Enhancing the Talend-Hive/Spark & Unix based data pipelines
• Developing and deploying Scala/Python based Spark jobs for ETL processing
• Applying strong SQL & DWH concepts
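A minimal PySpark sketch of such an ETL job; the lake paths, column names, and target table are illustrative assumptions.

```python
# A minimal PySpark ETL sketch; paths, columns, and table names are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").enableHiveSupport().getOrCreate()

# Extract: read semi-structured JSON landed in the data lake
raw = spark.read.json("s3://bucket/landing/orders/")

# Transform: basic cleansing and derivation
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write as partitioned Parquet into a Hive-managed table
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .format("parquet")
       .saveAsTable("analytics.orders"))
```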
Preferred Background:
• Function as an integrator between business needs and technology, helping to create solutions that meet clients’ business needs
• Lead project efforts in defining scope, planning, executing, and reporting to stakeholders on strategic initiatives
• Understanding the business’s EDW systems and creating high-level design and low-level implementation documents
• Understanding the business’s big data lake systems and creating high-level design and low-level implementation documents
• Designing big data pipelines for data ingestion, storage, processing & consumption
Company Name: Intraedge Technologies Ltd (https://intraedge.com/)
Type: Permanent, Full time
Location: Any
- A Bachelor’s degree in computer science, computer engineering, another technical discipline, or equivalent work experience
- 4+ years of software development experience
- 4+ years of experience in programming languages: Python, Spark, Scala, Hadoop, Hive
- Demonstrated experience with Agile or other rapid application development methods
- Demonstrated experience with object-oriented design and coding.
Please mail your resume to poornimak@intraedge.com along with your NP, how soon you can join, ECTC, availability for interview, and location.
Requirements:
- Overall 3 to 5 years of experience designing and implementing complex large-scale software.
- Proficiency in Python is a must.
- Experience in Apache Spark, Scala, Java and Delta Lake
- Experience in designing and implementing templated ETL/ELT data pipelines
- Expert-level experience in data pipeline orchestration using Apache Airflow for large-scale production deployments (see the DAG sketch after the technology stack below)
- Experience in visualizing data from various tasks in the data pipeline using Apache Zeppelin/Plotly or any other visualization library.
- Log management and log monitoring using ELK/Grafana
- GitHub integration
Technology Stack: Apache Spark, Apache Airflow, Python, AWS, EC2, S3, Kubernetes, ELK, Grafana, Apache Arrow, Java
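For orientation, a minimal Airflow DAG sketch of a templated ETL pipeline; the task bodies, dag_id, and schedule are illustrative assumptions.

```python
# A minimal Airflow 2.x DAG sketch for a templated ETL pipeline; all names assumed.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**context):
    print("pull raw data from the source system")   # placeholder extract step

def transform(**context):
    print("apply templated transformations")        # placeholder transform step

def load(**context):
    print("write curated data to the lake")         # placeholder load step

with DAG(
    dag_id="templated_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",            # Airflow 2.4+; schedule_interval on older versions
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3                # linear dependency chain
```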
• Responsible for developing and maintaining applications with PySpark
ML ARCHITECT
Job Overview
We are looking for an ML Architect to help us discover the information hidden in vast amounts of data and make smarter decisions that deliver even better products. Your primary focus will be applying data mining techniques, performing statistical analysis, and building high-quality prediction systems integrated with our products. Candidates must have strong experience using a variety of data mining and data analysis methods, building and implementing models, using and creating algorithms, and creating and running simulations, and must be comfortable working with a wide range of stakeholders and functional teams. The right candidate will have a passion for discovering solutions hidden in large data sets and for working with stakeholders to improve business outcomes. The role also involves automating the identification of textual data, along with its properties and structure, from various types of documents.
Responsibilities
- Selecting features, building and optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Creating automated anomaly detection systems and constantly tracking their performance (a detection sketch follows this list)
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Secure and manage GPU cluster resources for events as needed
- Write comprehensive internal feedback reports and find opportunities for improvements
- Manage GPU instances/machines to increase the performance and efficiency of the ML/DL model.
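As one hedged example of an automated anomaly detection system, a minimal scikit-learn IsolationForest sketch on synthetic data; the data and contamination rate are assumptions.

```python
# A minimal anomaly detection sketch with IsolationForest; data is synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # typical observations
outliers = rng.uniform(low=-6, high=6, size=(10, 2))     # injected anomalies
X = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.02, random_state=42).fit(X)
labels = detector.predict(X)                             # 1 = inlier, -1 = anomaly
print(f"flagged {np.sum(labels == -1)} anomalous points")
```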
Skills and Qualifications
- Strong Hands-on experience in Python Programming
- Working experience with computer vision models: object detection, image classification
- Good experience in feature extraction, feature selection techniques, and transfer learning (a transfer-learning sketch follows this list)
- Working experience building deep learning NLP models for text classification, and image analytics with CNNs, RNNs, and LSTMs
- Working experience with any of the AWS/GCP cloud platforms, with exposure to fetching data from various sources
- Good experience in exploratory data analysis, data visualisation, and other data preprocessing techniques.
- Knowledge of at least one DL framework such as TensorFlow, PyTorch, Keras, or Caffe
- Good knowledge of statistics, data distributions, and supervised and unsupervised machine learning algorithms
- Exposure to OpenCV; familiarity with GPUs and CUDA
- Experience with NVIDIA software for cluster management and provisioning, such as NVSM, DCGM, and DeepOps
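As a small illustration of transfer learning, a minimal PyTorch sketch that reuses a pretrained ResNet-18 backbone and trains only a new classification head; num_classes and the dummy batch are assumptions (the weights API requires torchvision 0.13 or newer).

```python
# A minimal transfer-learning sketch: frozen pretrained backbone, new trainable head.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5                                     # assumed target label count

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():                    # freeze the pretrained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch
images = torch.randn(8, 3, 224, 224)                # fake batch of RGB images
labels = torch.randint(0, num_classes, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```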
- We are looking for a candidate with 14+ years of experience, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with AWS cloud services: EC2, RDS, AWS SageMaker (added advantage)
- Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.