
In-depth knowledge and hands-on experience with all of the AWS services and other similar cloud services
Strong knowledge of core architectural concepts including distributed computing , scalability, availability, and performance to recommend the best backend solutions for our products
Preferred AWS Certifications:
- AWS Solutions Architect Professional/Associate AWS DevOps Engineer Professional
- AWS SysOps Administrator - Associate AWS Developer Associate
Key Skills
- ITCAN is looking for an AWS Solution Architect who will be responsible for development of scalable, optimized, and reliable backend solutions using AWS services for all our products. You will ensure that our products consume AWS services in the mast effective methods. Therefore, a commitment to collaborative problem solving, sophisticated design, and quality product is important.
Responsibilities:
- Analyse requirements and devise innovative, efficient, and cost-effective architecture using AWS components and services that ensure scalability, availability and high- performance.
- Develop automation and deployment utilities using Ruby, Bash and Shell scripting and implementing
- CI/CD pipelines using Jenkins, Code Deploy, Git, Code Pipeline, Code Commit etc. To ensure seamless deployment with no downtime.
- Redesign architectures end Lo-end seamlessly by working through major software upgrades such as Apache.
- Ensure an always-running network with the ability to set up redundant DNS systems with failover capabilities.
- Ensure the AWS services consumed are aligned with best practices to ensure higher availability and security along with optimal cost utilization.
- Using AWS-managed services, implement ELK systems end-to-end.

Similar jobs
Key Responsibilities
- Design, implement, and maintain CI/CD pipelines for backend, frontend, and mobile applications.
- Manage cloud infrastructure using AWS (EC2, Lambda, S3, VPC, RDS, CloudWatch, ECS/EKS).
- Configure and maintain Docker containers and/or Kubernetes clusters.
- Implement and maintain Infrastructure as Code (IaC) using Terraform / CloudFormation.
- Automate build, deployment, and monitoring processes.
- Manage code repositories using Git/GitHub/GitLab, enforce branching strategies.
- Implement monitoring and alerting using tools like Prometheus, Grafana, CloudWatch, ELK, Splunk.
- Ensure system scalability, reliability, and security.
- Troubleshoot production issues and perform root-cause analysis.
- Collaborate with engineering teams to improve deployment and development workflows.
- Optimize infrastructure costs and improve performance.
Required Skills & Qualifications
- 3+ years of experience in DevOps, SRE, or Cloud Engineering.
- Strong hands-on knowledge of AWS cloud services.
- Experience with Docker, containers, and orchestrators (ECS, EKS, Kubernetes).
- Strong understanding of CI/CD tools: GitHub Actions, Jenkins, GitLab CI, or AWS CodePipeline.
- Experience with Linux administration and shell scripting.
- Strong understanding of Networking, VPC, DNS, Load Balancers, Security Groups.
- Experience with monitoring/logging tools: CloudWatch, ELK, Prometheus, Grafana.
- Experience with Terraform or CloudFormation (IaC).
- Good understanding of Node.js or similar application deployments.
- Knowledge of NGINX/Apache and load balancing concepts.
- Strong problem-solving and communication skills.
Preferred/Good to Have
- Experience with Kubernetes (EKS).
- Experience with Serverless architectures (Lambda).
- Experience with Redis, MongoDB, RDS.
- Certification in AWS Solutions Architect / DevOps Engineer.
- Experience with security best practices, IAM policies, and DevSecOps.
- Understanding of cost optimization and cloud cost management.
Job Title: Lead DevOps Engineer
Experience Required: 4 to 5 years in DevOps or related fields
Employment Type: Full-time
About the Role:
We are seeking a highly skilled and experienced Lead DevOps Engineer. This role will focus on driving the design, implementation, and optimization of our CI/CD pipelines, cloud infrastructure, and operational processes. As a Lead DevOps Engineer, you will play a pivotal role in enhancing the scalability, reliability, and security of our systems while mentoring a team of DevOps engineers to achieve operational excellence.
Key Responsibilities:
Infrastructure Management: Architect, deploy, and maintain scalable, secure, and resilient cloud infrastructure (e.g., AWS, Azure, or GCP).
CI/CD Pipelines: Design and optimize CI/CD pipelines, to improve development velocity and deployment quality.
Automation: Automate repetitive tasks and workflows, such as provisioning cloud resources, configuring servers, managing deployments, and implementing infrastructure as code (IaC) using tools like Terraform, CloudFormation, or Ansible.
Monitoring & Logging: Implement robust monitoring, alerting, and logging systems for enterprise and cloud-native environments using tools like Prometheus, Grafana, ELK Stack, NewRelic or Datadog.
Security: Ensure the infrastructure adheres to security best practices, including vulnerability assessments and incident response processes.
Collaboration: Work closely with development, QA, and IT teams to align DevOps strategies with project goals.
Mentorship: Lead, mentor, and train a team of DevOps engineers to foster growth and technical expertise.
Incident Management: Oversee production system reliability, including root cause analysis and performance tuning.
Required Skills & Qualifications:
Technical Expertise:
Strong proficiency in cloud platforms like AWS, Azure, or GCP.
Advanced knowledge of containerization technologies (e.g., Docker, Kubernetes).
Expertise in IaC tools such as Terraform, CloudFormation, or Pulumi.
Hands-on experience with CI/CD tools, particularly Bitbucket Pipelines, Jenkins, GitLab CI/CD, Github Actions or CircleCI.
Proficiency in scripting languages (e.g., Python, Bash, PowerShell).
Soft Skills:
Excellent communication and leadership skills.
Strong analytical and problem-solving abilities.
Proven ability to manage and lead a team effectively.
Experience:
4 years + of experience in DevOps or Site Reliability Engineering (SRE).
4+ years + in a leadership or team lead role, with proven experience managing distributed teams, mentoring team members, and driving cross-functional collaboration.
Strong understanding of microservices, APIs, and serverless architectures.
Nice to Have:
Certifications like AWS Certified Solutions Architect, Kubernetes Administrator, or similar.
Experience with GitOps tools such as ArgoCD or Flux.
Knowledge of compliance standards (e.g., GDPR, SOC 2, ISO 27001).
Perks & Benefits:
Competitive salary and performance bonuses.
Comprehensive health insurance for you and your family.
Professional development opportunities and certifications, including sponsored certifications and access to training programs to help you grow your skills and expertise.
Flexible working hours and remote work options.
Collaborative and inclusive work culture.
Join us to build and scale world-class systems that empower innovation and deliver exceptional user experiences.
You can directly contact us: Nine three one six one two zero one three two
Job Responsibilities:
Section 1 -
- Responsible for managing and providing L1 support to Build, design, deploy and maintain the implementation of Cloud solutions on AWS.
- Implement, deploy and maintain development, staging & production environments on AWS.
- Familiar with serverless architecture and services on AWS like Lambda, Fargate, EBS, Glue, etc.
- Understanding of Infra as a code and familiar with related tools like Terraform, Ansible Cloudformation etc.
Section 2 -
- Managing the Windows and Linux machines, Kubernetes, Git, etc.
- Responsible for L1 management of Servers, Networks, Containers, Storage, and Databases services on AWS.
Section 3 -
- Timely monitoring of production workload alerts and quick addressing the issues
- Responsible for monitoring and maintaining the Backup and DR process.
Section 4 -
- Responsible for documenting the process.
- Responsible for leading cloud implementation projects with end-to-end execution.
Qualifications: Bachelors of Engineering / MCA Preferably with AWS, Cloud certification
Skills & Competencies
- Linux and Windows servers management and troubleshooting.
- AWS services experience on CloudFormation, EC2, RDS, VPC, EKS, ECS, Redshift, Glue, etc. - AWS EKS
- Kubernetes and containers knowledge
- Understanding of setting up AWS Messaging, streaming and queuing Services(MSK, Kinesis, SQS, SNS, MQ)
- Understanding of serverless architecture. - High understanding of Networking concepts
- High understanding of Serverless architecture concept - Managing to monitor and alerting systems
- Sound knowledge of Database concepts like Dataware house, Data Lake, and ETL jobs
- Good Project management skills
- Documentation skills
- Backup, and DR understanding
Soft Skills - Project management, Process Documentation
Ideal Candidate:
- AWS certification with between 2-4 years of experience with certification and project execution experience.
- Someone who is interested in building sustainable cloud architecture with automation on AWS.
- Someone who is interested in learning and being challenged on a day-to-day basis.
- Someone who can take ownership of the tasks and is willing to take the necessary action to get it done.
- Someone who is curious to analyze and solve complex problems.
- Someone who is honest with their quality of work and is comfortable with taking ownership of their success and failure, both.
Behavioral Traits
- We are looking for someone who is interested to be part of creativity and the innovation-based environment with other team members.
- We are looking for someone who understands the idea/importance of teamwork and individual ownership at the same time.
- We are looking for someone who can debate logically, respectfully disagree, and can admit if proven wrong and who can learn from their mistakes and grow quickly
What you will do
We are looking for an exceptional engineering lead to join our team. You will be responsible for building and owning the systems that would have critical impact for the business and the experience of our community from day one.
- Build and lead an agile engineering team
- Work closely with Founder on product development
- Collaborate with operations team to understand customer pain points and solve interesting problems
- Code, test, ship - manage the entire application cycle
- Build libraries and documentation for future references
- Research and develop best practices and tools to enable delivery of features
- Set up capabilities to track and report business and user metrics
- Design and improve architecture to ensure scalability
Requirements
- Proven experience at scaling tech companies, preferably in commerce or social network
- Keen to innovate, open-minded and collaborative
- Able to interpret product needs and suggest appropriate solutions
- Have led a team, also able to code hands-on
- Strong communication skills
- Strong work ethic: responsible, responsive, and detail-oriented.
Technologies we use
Go, Flutter, AWS, Google Cloud
-
Working with Ruby, Python, Perl, and Java
-
Troubleshooting and having working knowledge of various tools, open-source technologies, and cloud services.
-
Configuring and managing databases and cache layers such as MySQL, Mongo, Elasticsearch, Redis
-
Setting up all databases and for optimisations (sharding, replication, shell scripting etc)
-
Creating user, Domain handling, Service handling, Backup management, Port management, SSL services
-
Planning, testing & development of IT Infrastructure ( Server configuration and Database) and handling the technical issue related to server Docker and VM optimization
-
Demonstrate awareness of DB management, server related work, Elasticsearch.
-
Selecting and deploying appropriate CI/CD tools
-
Striving for continuous improvement and build continuous integration, continuous development, and constant deployment pipeline (CI/CD Pipeline)
-
Experience working on Linux based infrastructure
-
Awareness of critical concepts in DevOps and Agile principles
-
6-8 years of experience
Job description
The role requires you to design development pipelines from the ground up, Creation of Docker Files, design and operate highly available systems in AWS Cloud environments. Also involves Configuration Management, Web Services Architectures, DevOps Implementation, Database management, Backups, and Monitoring.
Key responsibility area
- Ensure reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creation of Bash/Python scripts for automation
- Performing root cause analysis for production errors.
Requirement
- 2 years experience as Team Lead.
- Good Command on kubernetes.
- Proficient in Linux Commands line and troubleshooting.
- Proficient in AWS Services. Deployment, Monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as cloud) such as terraform, AWS cloudformation etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as nginx, apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with Logging, Monitoring, and Alerting tools like ELK(Elastic-search, Logstash, Kibana), Nagios. Graylog, splunk Prometheus, Grafana is a plus.
Must Have:
Linux, CI/CD(Jenkin), AWS, Scripting(Bash,shell Python, Go), Ngnix, Docker.
Good to have
Configuration Management(Ansible or similar tool), Logging tool( ELK or similar), Monitoring tool(Ngios or similar), IaC(Terraform, cloudformation).About Us
We have grown over 1400% in revenues in the last year.
Interface.ai provides an Intelligent Virtual Assistant (IVA) to FIs to automate calls and customer inquiries across multiple channels and engage their customers with financial insights and upsell/cross-sell.
Our IVA is transforming financial institutions’ call centers from a cost to a revenue center.
Our core technology is built 100% in-house with several breakthroughs in Natural Language Understanding. Our parser is built based on zero-shot learning that helps us to launch industry-specific IVA that can achieve over 90% accuracy on Day-1.
We are 45 people strong with employees spread across India and US locations. Many of them come from ML teams at Apple, Microsoft, and Salesforce in the US along with enterprise architects with over 20+ years of experience building large-scale systems. Our India team consists of people from ISB, IIMs, and many who have been previously part of early-stage startups.
We are a fully remote team.
Founders come from Banking and Enterprise Technology backgrounds with previous experience scaling companies from scratch to $50M+ in revenues.
As a Site Reliability Engineer you will be in charge of:
- Designing, analyzing and troubleshooting large-scale distributed systems
- Engaging in cross-functional team discussions on design, deployment, operation, and maintenance, in a fast-moving, collaborative set up
- Building automation scripts to validate the stability, scalability, and reliability of interface.ai’s products & services as well as enhance interface.ai’s employees’ productivity
- Debugging and optimizing code and automating routine tasks
- Troubleshoot and diagnose issues (hardware or software), propose and implement solutions to ensure they occur with reduced frequency
- Perform the periodic on-call duty to handle security, availability, and reliability of interface.ai’s products
- You will follow and write good code and solid engineering practices
Requirements
You can be a great fit if you are :
- Extremely self motivated
- Ability to learn quickly
- Growth Mindset (read this if you don't know what it means - https://www.amazon.com/Mindset-Psychology-Carol-S-Dweck/dp/0345472322" target="_blank">link)
- Emotional Maturity (read this if you don't know what it means - https://medium.com/@krisgage/15-signs-of-emotional-maturity-38b1a2ab9766" target="_blank">link)
- Passionate about the possibilities at the intersection of AI + Banking
- Worked in a startup of 5 to 30 employees
- Developer with a strong interest in systems Design. You will be building, maintaining, and scaling our cloud infrastructure through software tooling and automation.
- 4-8 years of industry experience developing and troubleshooting large-scale infrastructure on the cloud
- Have a solid understanding of system availability, latency, and performance
- Strong programming skills in at least one major programming language and the ability to learn new languages as needed
- Strong System/network debugging skills
- Experience with management/automation tools such as Terraform/Puppet/Chef/SALT
- Experience with setting up production-level monitoring and telemetry
- Expertise in Container management & AWS
- Experience with kubernetes is a plus
- Experience building CI/CD pipelines
- Experience working with Web sockets, Redis, Postgres, Elastic search, Logstash
- Experience working in an agile team environment and proficient understanding of code versioning tools, such as Git.
- Ability to effectively articulate technical challenges and solutions.
- Proactive outlook for ways to make our systems more reliable
Skill: Python, Docker or Ansible , AWS
➢ Experience Building a multi-region highly available auto-scaling infrastructure that optimizes
performance and cost. plan for future infrastructure as well as Maintain & optimize existing
infrastructure.
➢ Conceptualize, architect and build automated deployment pipelines in a CI/CD environment like
Jenkins.
➢ Conceptualize, architect and build a containerized infrastructure using Docker,Mesosphere or
similar SaaS platforms.
Work with developers to institute systems, policies and workflows which allow for rollback of
deployments Triage release of applications to production environment on a daily basis.
➢ Interface with developers and triage SQL queries that need to be executed inproduction
environments.
➢ Maintain 24/7 on-call rotation to respond and support troubleshooting of issues in production.
➢ Assist the developers and on calls for other teams with post mortem, follow up and review of
issues affecting production availability.
➢ Establishing and enforcing systems monitoring tools and standards
➢ Establishing and enforcing Risk Assessment policies and standards
➢ Establishing and enforcing Escalation policies and standards








