
ketteQ is a supply chain planning and automation platform. We are looking for an experienced AWS DevOps Engineer to help manage AWS infrastructure and automation. This job comes with an attractive compensation package, plus work-from-home and flex-time benefits. You will work on projects for large global brands with a highly experienced team based in the US and India. If you are a high-energy, motivated, initiative-taking individual, this could be a fantastic opportunity for you. Candidates must meet the following requirements:
Duties & Responsibilities
- Deployment, automation, management, and maintenance of AWS cloud-based production system
- Build a deployment pipeline for AWS and Salesforce
- Design cloud infrastructure that is secure, scalable, and highly available on AWS
- Work collaboratively with software engineering to define infrastructure and deployment requirements
- Provision, configure, and maintain AWS cloud infrastructure defined as CloudFormation templates
- Ensure configuration and compliance with configuration management tools
- Administer and troubleshoot Linux-based systems
- Troubleshoot problems across a wide array of services and functional areas
- Build and maintain operational tools for deployment, monitoring, and analysis of AWS infrastructure and systems
- Perform infrastructure cost analysis and optimization
Requirements
- At least 5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, Fargate, S3, CloudFormation)
- Strong understanding of how to secure AWS environments and meet compliance requirements
- Solid foundation of networking and Linux administration
- Experience with Docker, GitHub, Jenkins, and CloudFormation, and with deploying applications on AWS
- Ability to learn/use a wide variety of open source technologies and tools
- Database experience to help with monitoring and performance; PostgreSQL experience preferred
- AWS certification preferred
Education
- Bachelor's degree in Engineering or a related field

Similar jobs
GCP Cloud Engineer:
- Proficiency in infrastructure as code (Terraform).
- Scripting and automation skills (e.g., Python, Shell). Knowledge of Python is a must.
- Collaborate with teams across the company (i.e., network, security, operations) to build complete cloud offerings.
- Design Disaster Recovery and backup strategies to meet application objectives.
- Working knowledge of Google Cloud
- Working knowledge of various tools, open-source technologies, and cloud services
- Experience working on Linux-based infrastructure.
- Excellent problem-solving and troubleshooting skills
DESIRED SKILLS AND EXPERIENCE
Strong analytical and problem-solving skills
Ability to work independently, learn quickly and be proactive
3-5 years overall and at least 1-2 years of hands-on experience in designing and managing DevOps Cloud infrastructure
Experience must include a combination of:
o Experience working with configuration management tools – Ansible, Chef, Puppet, SaltStack (expertise in at least one tool is a must)
o Ability to write and maintain code in at least one scripting language (Python preferred)
o Practical knowledge of shell scripting
o Cloud knowledge – AWS, VMware vSphere
o Good understanding and familiarity with Linux
o Networking knowledge – Firewalls, VPNs, Load Balancers
o Web/Application servers, Nginx, JVM environments
o Virtualization and containers - Xen, KVM, Qemu, Docker, Kubernetes, etc.
o Familiarity with logging systems - Logstash, Elasticsearch, Kibana
o Git, Jenkins, Jira
About Us
We have grown over 1400% in revenues in the last year.
Interface.ai provides an Intelligent Virtual Assistant (IVA) to FIs to automate calls and customer inquiries across multiple channels and engage their customers with financial insights and upsell/cross-sell.
Our IVA is transforming financial institutions’ call centers from a cost to a revenue center.
Our core technology is built 100% in-house with several breakthroughs in Natural Language Understanding. Our parser is based on zero-shot learning, which helps us launch industry-specific IVAs that can achieve over 90% accuracy on Day-1.
We are 45 people strong, with employees spread across India and US locations. Many of them come from ML teams at Apple, Microsoft, and Salesforce in the US, along with enterprise architects with 20+ years of experience building large-scale systems. Our India team consists of people from ISB, IIMs, and many who have previously been part of early-stage startups.
We are a fully remote team.
Founders come from Banking and Enterprise Technology backgrounds with previous experience scaling companies from scratch to $50M+ in revenues.
As a Site Reliability Engineer you will be in charge of:
- Designing, analyzing and troubleshooting large-scale distributed systems
- Engaging in cross-functional team discussions on design, deployment, operation, and maintenance, in a fast-moving, collaborative setup
- Building automation scripts to validate the stability, scalability, and reliability of interface.ai’s products & services as well as enhance interface.ai’s employees’ productivity
- Debugging and optimizing code and automating routine tasks
- Troubleshooting and diagnosing issues (hardware or software), then proposing and implementing solutions so that they recur less frequently
- Performing periodic on-call duty to handle the security, availability, and reliability of interface.ai’s products
- Writing good code and following solid engineering practices
Requirements
You can be a great fit if you:
- Are extremely self-motivated
- Are able to learn quickly
- Have a growth mindset (read this if you don't know what it means - https://www.amazon.com/Mindset-Psychology-Carol-S-Dweck/dp/0345472322)
- Have emotional maturity (read this if you don't know what it means - https://medium.com/@krisgage/15-signs-of-emotional-maturity-38b1a2ab9766)
- Passionate about the possibilities at the intersection of AI + Banking
- Worked in a startup of 5 to 30 employees
- Developer with a strong interest in systems design. You will be building, maintaining, and scaling our cloud infrastructure through software tooling and automation.
- 4-8 years of industry experience developing and troubleshooting large-scale infrastructure on the cloud
- Have a solid understanding of system availability, latency, and performance
- Strong programming skills in at least one major programming language and the ability to learn new languages as needed
- Strong system/network debugging skills
- Experience with management/automation tools such as Terraform/Puppet/Chef/Salt
- Experience with setting up production-level monitoring and telemetry
- Expertise in Container management & AWS
- Experience with Kubernetes is a plus
- Experience building CI/CD pipelines
- Experience working with WebSockets, Redis, Postgres, Elasticsearch, Logstash
- Experience working in an agile team environment and proficient understanding of code versioning tools, such as Git.
- Ability to effectively articulate technical challenges and solutions.
- Proactive outlook for ways to make our systems more reliable
Roles and Responsibilities
- 5 - 8 years of experience in Infrastructure setup on Cloud, Build/Release Engineering, Continuous Integration and Delivery, Configuration/Change Management.
- Good experience with Linux/Unix administration and moderate to significant experience administering relational databases such as PostgreSQL, etc.
- Experience with Docker and related tools (Cassandra, Rancher, Kubernetes etc.)
- Experience working with configuration management tools (Ansible, Chef, Puppet, Terraform, etc.) is a plus.
- Experience with cloud technologies like Azure
- Experience with monitoring and alerting (TICK, ELK, Nagios, PagerDuty)
- Experience with distributed systems and related technologies (NSQ, RabbitMQ, SQS, etc.) is a plus
- Experience with scaling data store technologies (PostgreSQL, Scylla, Redis) is a plus
- Experience with SSH Certificate Authorities and Identity Management (Netflix BLESS) is a plus
- Experience with multi-domain SSL certs and provisioning (Let's Encrypt) is a plus
- Experience with chaos engineering or similar methodologies is a plus

Total Experience: 6 – 12 Years
Required Skills and Experience
- 3+ years of relevant experience with DevOps tools such as Jenkins, Ansible, Chef, etc.
- 3+ years of experience in continuous integration/deployment and in developing software tools with Python, shell scripts, etc.
- Building and running Docker images and deployment on Amazon ECS
- Working with AWS services (EC2, S3, ELB, VPC, RDS, CloudWatch, ECS, ECR, EKS)
- Knowledge and experience working with container technologies such as Docker and Amazon ECS, EKS, Kubernetes
- Experience with source code and configuration management tools such as Git, Bitbucket, and Maven
- Ability to work with and support Linux environments (Ubuntu, Amazon Linux, CentOS)
- Knowledge and experience in cloud orchestration tools such as AWS CloudFormation, Terraform, etc.
- Experience with implementing "infrastructure as code", "pipeline as code" and "security as code" to enable continuous integration and delivery
- Understanding of IAM, RBAC, NACLs, and KMS
- Good communication skills
Good to have:
- Strong understanding of security concepts and methodologies, such as SSH, public key encryption, access credentials, and certificates, and the ability to apply them.
- Knowledge of database administration, for example with MongoDB.
- Knowledge of maintaining and using tools such as Jira, Bitbucket, Confluence.
Responsibilities
- Work with Leads and Architects to design and implement the technical infrastructure, platform, and tools that support modern best practices and make our development teams more efficient through automation, CI/CD pipelines, ease of access, and performance.
- Establish and promote DevOps thinking, guidelines, best practices, and standards.
- Contribute to architectural discussions, Agile software development process improvement, and DevOps best practices.
We are looking for an experienced software engineer with a strong background in DevOps and handling traffic & infrastructure at scale.
Responsibilities:
Work closely with product engineers to implement scalable and highly reliable systems.
Scale existing backend systems to handle ever-increasing amounts of traffic and new product requirements.
Collaborate with other developers to understand and set up the tooling needed for Continuous Integration/Delivery/
Build & operate infrastructure to support website, backend cluster, ML projects in the organization.
Monitor and track the performance and reliability of our services and software to meet promised SLAs
2+ years of experience working on distributed systems and shipping high-quality product features on schedule
Intimate knowledge of the whole web stack (Front end, APIs, database, networks etc.)
Ability to build highly scalable, robust, and fault-tolerant services and stay up-to-date with the latest architectural trends
Experience with container based deployment, microservices, in-memory caches, relational databases, key-value stores
Hands-on experience with cloud infrastructure provisioning, deployment, monitoring (we are on AWS and use ECS, RDS, ELB, EC2, ElastiCache, Elasticsearch, S3, CloudWatch)
What we are looking for
Work closely with product & engineering groups to identify and document infrastructure requirements.
Design infrastructure solutions balancing requirements, operational constraints and architecture guidelines.
Implement infrastructure including network connectivity, virtual machines and monitoring.
Implement and follow security guidelines, both policy and technical to protect our customers.
Resolve incidents as escalated from monitoring solutions and lower tiers.
Identify root cause for issues and develop long term solutions to fix recurring issues.
Ability to automate recurring tasks to increase velocity and quality.
Partner with the engineering team to build software tolerance for infrastructure failure or issues.
Research emerging technologies, trends and methodologies and enhance existing systems and processes.
Qualifications
Master’s/Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, and two years of experience in software/systems or a related area.
5+ years overall experience.
Work experience must have included:
Proven track record in deploying, configuring and maintaining Ubuntu server systems on premise and in the cloud.
Minimum of 4 years’ experience designing, implementing and troubleshooting TCP/IP networks, VPN, Load Balancers & Firewalls.
Minimum 3 years of experience working in public clouds like AWS & Azure.
Hands-on experience in any of the configuration management tools like Ansible, Chef & Puppet.
Strong in performing production operation activities.
Experience with Container & Container Orchestrator tools like Kubernetes, Docker Swarm is a plus.
Good at source code management tools like Bitbucket, Git.
Configuring and utilizing monitoring and alerting tools.
Scripting to automate infrastructure and operational processes.
Hands on work to secure networks and systems.
Sound problem resolution, judgment, negotiating and decision making skills
Ability to manage and deliver multiple project phases at the same time
Strong analytical and organizational skills
Excellent written and verbal communication skills
Interview focus areas
Networks, systems, monitoring
AWS (EC2, S3, VPC)
Problem solving, scripting, network design, systems administration and troubleshooting scenarios
Culture fit, agility, bias for action, ownership, communication






