
Site Reliability/DevOps Engineer
at Altimetrik
Platform Services Engineer
DevSecOps Engineer
- Strong Systems Experience - Linux, networking, cloud, APIs
- Scripting and Programming Languages - Shell, Python
- Strong Debugging Capability
- AWS Platform - IAM, Network, EC2, Lambda, S3, CloudWatch
- Knowledge of Terraform, Packer, Ansible, Jenkins
- Observability - Prometheus, InfluxDB, Dynatrace, Grafana, Splunk
- DevSecOps - CI/CD - Jenkins
- Microservices
- Security & Access Management
- Container Orchestration a plus - Kubernetes, Docker etc.
- Big Data Platforms knowledge - EMR, Databricks, Cloudera a plus

Required Skills
● Experience: Minimum of 5 years of professional experience in a DevOps Engineer
role
● Cloud Proficiency: Proven experience with at least one major cloud provider (AWS,
Azure, or GCP).
● Scripting & Programming: Strong scripting skills in languages such as Bash, Python,
or Go.
● IaC Tools: Hands-on experience with Terraform.
● Container Technology: Expertise in Docker and Kubernetes.
● CI/CD Tools: Proficient with CI/CD platforms like Jenkins, GitLab CI, or Travis CI.
● Configuration Management: Experience with configuration management tools like
Ansible, Chef, or Puppet.
● Version Control: Strong knowledge of Git and branching strategies.
● Problem-Solving: Excellent problem-solving abilities and a commitment to automation
and continuous improvement.
Associate Principal Engineer, Linux Administrator
Location: Bengaluru, India (Hybrid)
Employment Type: Full-time
Experience: 9-11 years
Job description
REQUIREMENTS:
- Strong experience in DevOps, Platform Engineering, and Infrastructure Automation
- Deep hands-on expertise in Linux Administration (RHEL, CentOS, Ubuntu) – OS hardening, security, patching, and performance management (Must Have)
- Strong experience with Cloud Technologies – Public & Private Cloud environments (Must Have)
- Hands-on experience with Infrastructure as Code (IaC) using Terraform (Must Have)
- Strong automation expertise using Ansible for configuration management and infrastructure provisioning (Must Have)
- Experience building and managing CI/CD pipelines and end-to-end deployment automation
- Strong experience with Kubernetes administration, orchestration, and cluster management (Must Have)
- Hands-on experience with Docker containerization and Helm package management
- Experience managing large-scale development and infrastructure environments
- Strong understanding of Networking concepts, connectivity, design, troubleshooting, and network automation
- Experience with Observability & Monitoring tools and best practices
- Experience with Proxmox virtualization platform administration and management
- Knowledge of Edge Technologies and distributed infrastructure environments
- Basic understanding and administration of Active Directory (AD)
- Experience implementing AI-driven Automation solutions and operational efficiencies
- Strong understanding of infrastructure security, compliance, and governance
- Experience working in Agile/Scrum environments
- Strong troubleshooting, analytical, and problem-solving skills
- Excellent communication and stakeholder management skills
RESPONSIBILITIES:
- Design, build, and manage scalable infrastructure platforms across cloud and on-premise environments
- Administer and maintain Linux servers including security hardening, patching, performance tuning, and troubleshooting
- Develop and manage Infrastructure as Code (IaC) solutions using Terraform
- Automate infrastructure provisioning, configuration management, and operational tasks using Ansible
- Design, implement, and maintain CI/CD pipelines for application and infrastructure deployments
- Deploy, manage, and optimize Kubernetes clusters and containerized workloads
- Manage Docker environments and Helm-based application deployments
- Design and implement network solutions ensuring security, reliability, and scalability
- Monitor infrastructure health, performance, and availability using observability and monitoring tools
- Manage and support Proxmox virtualization environments
- Implement AI-driven automation initiatives to improve operational efficiency and reduce manual effort
- Support edge infrastructure deployments and distributed computing environments
- Collaborate with development, security, and operations teams to deliver reliable platform services
- Troubleshoot production incidents and perform root cause analysis
- Define infrastructure standards, automation frameworks, and operational best practices
- Ensure high availability, scalability, security, and reliability of infrastructure platforms
- Mentor junior engineers and provide technical leadership on DevOps and platform engineering initiatives
- Participate in Agile ceremonies and contribute to continuous improvement initiatives
- Work closely with stakeholders to understand infrastructure requirements and deliver optimal solutions
Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field
Required Skills: Advanced AWS Infrastructure Expertise, CI/CD Pipeline Automation, Monitoring, Observability & Incident Management, Security, Networking & Risk Management, Infrastructure as Code & Scripting
Criteria:
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies (B2C scale preferred)
- Strong hands-on AWS expertise across core and advanced services (EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, VPC, IAM, ELB/ALB, Route53)
- Proven experience designing high-availability, fault-tolerant cloud architectures for large-scale traffic
- Strong experience building & maintaining CI/CD pipelines (Jenkins mandatory; GitHub Actions/GitLab CI a plus)
- Prior experience running production-grade microservices deployments and automated rollout strategies (Blue/Green, Canary)
- Hands-on experience with monitoring & observability tools (Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.)
- Solid hands-on experience with MongoDB in production, including performance tuning, indexing & replication
- Strong scripting skills (Bash, Shell, Python) for automation
- Hands-on experience with IaC (Terraform, CloudFormation, or Ansible)
- Deep understanding of networking fundamentals (VPC, subnets, routing, NAT, security groups)
- Strong experience in incident management, root cause analysis & production firefighting
Description
Role Overview
Company is seeking an experienced Senior DevOps Engineer to design, build, and optimize cloud infrastructure on AWS, automate CI/CD pipelines, implement monitoring and security frameworks, and proactively identify scalability challenges. This role requires someone who has hands-on experience running infrastructure at B2C product scale, ideally in media/OTT or high-traffic applications.
Key Responsibilities
1. Cloud Infrastructure — AWS (Primary Focus)
- Architect, deploy, and manage scalable infrastructure using AWS services such as EC2, ECS/EKS, Lambda, S3, CloudFront, RDS, ELB/ALB, VPC, IAM, Route53, etc.
- Optimize cloud cost, resource utilization, and performance across environments.
- Design high-availability, fault-tolerant systems for streaming workloads.
2. CI/CD Automation
- Build and maintain CI/CD pipelines using Jenkins, GitHub Actions, or GitLab CI.
- Automate deployments for microservices, mobile apps, and backend APIs.
- Implement blue/green and canary deployments for seamless production rollouts.
3. Observability & Monitoring
- Implement logging, metrics, and alerting using tools like Grafana, Prometheus, ELK, CloudWatch, New Relic, etc.
- Perform proactive performance analysis to minimize downtime and bottlenecks.
- Set up dashboards for real-time visibility into system health and user traffic spikes.
4. Security, Compliance & Risk Highlighting
- Conduct frequent risk assessments and identify vulnerabilities in:
  o Cloud architecture
  o Access policies (IAM)
  o Secrets & key management
  o Data flows & network exposure
- Implement security best practices including VPC isolation, WAF rules, firewall policies, and SSL/TLS management.
5. Scalability & Reliability Engineering
- Analyze traffic patterns for OTT-specific load variations (weekends, new releases, peak hours).
- Identify scalability gaps and propose solutions across:
  o Microservices
  o Caching layers
  o CDN distribution (CloudFront)
  o Database workloads
- Perform capacity planning and load testing to ensure readiness for 10x traffic growth.
6. Database & Storage Support
- Administer and optimize MongoDB for high-read/low-latency use cases.
- Design backup, recovery, and data replication strategies.
- Work closely with backend teams to tune query performance and indexing.
7. Automation & Infrastructure as Code
- Implement IaC using Terraform, CloudFormation, or Ansible.
- Automate repetitive infrastructure tasks to ensure consistency across environments.
Required Skills & Experience
Technical Must-Haves
- 5+ years of DevOps/SRE experience in cloud-native, product-based companies.
- Strong hands-on experience with AWS (core and advanced services).
- Expertise in Jenkins CI/CD pipelines.
- Solid background working with MongoDB in production environments.
- Good understanding of networking: VPCs, subnets, security groups, NAT, routing.
- Strong scripting experience (Bash, Python, Shell).
- Experience handling risk identification, root cause analysis, and incident management.
Nice to Have
- Experience with OTT, video streaming, media, or any content-heavy product environments.
- Familiarity with containers (Docker), orchestration (Kubernetes/EKS), and service mesh.
- Understanding of CDN, caching, and streaming pipelines.
Personality & Mindset
- Strong sense of ownership and urgency—DevOps is mission critical at OTT scale.
- Proactive problem solver with ability to think about long-term scalability.
- Comfortable working with cross-functional engineering teams.
Why Join company?
• Build and operate infrastructure powering millions of monthly users.
• Opportunity to shape DevOps culture and cloud architecture from the ground up.
• High-impact role in a fast-scaling Indian OTT product.
Key Responsibilities
DevOps Strategy & Leadership
- Define and execute the end-to-end DevOps strategy for high-frequency trading and fintech platforms.
- Lead, mentor, and scale a high-performing DevOps team focused on automation, reliability, and performance.
- Partner closely with engineering and product leaders to ensure infrastructure strategy supports business and technical goals.
CI/CD & Infrastructure Automation
- Architect, implement, and optimize enterprise-grade CI/CD pipelines for ultra-low-latency trading systems.
- Drive Infrastructure as Code (IaC) adoption using Terraform, Helm, Kubernetes, and advanced automation toolsets.
- Establish robust release management, deployment workflows, and versioning best practices for mission‑critical environments.
Cloud & On‑Prem Infrastructure Management
- Design and manage hybrid infrastructures across AWS, GCP, and on-premise data centers ensuring high availability and fault tolerance.
- Implement sophisticated networking strategies for low-latency workloads including routing optimization and performance tuning.
- Lead multi‑cloud scalability, cost optimization, and environment standardization initiatives.
Performance Monitoring & Optimization
- Oversee large-scale monitoring systems using Prometheus, Grafana, ELK, and related observability tools.
- Implement predictive alerting, automated remediation, and system‑wide health checks for zero‑downtime operations.
- Conduct root-cause analyses and performance tuning for systems processing millions of transactions per second.
Security & Compliance
- Champion DevSecOps practices and embed security across the entire development and deployment lifecycle.
- Ensure adherence to financial regulatory standards (SEBI and global frameworks) with strong audit and compliance mechanisms.
- Lead security automation efforts, vulnerability management, and advanced IAM policy implementation.
Required Skills & Qualifications
- 10+ years of DevOps experience, with 5+ years in a leadership capacity.
- Deep hands-on expertise in CI/CD tools such as Jenkins, GitLab CI/CD, and ArgoCD.
- Strong command of AWS, GCP, and hybrid cloud infrastructures.
- Expert-level knowledge of Kubernetes, Docker, and large-scale container orchestration.
- Advanced proficiency in Terraform, Helm, and overall IaC workflows.
- Strong Linux administration, networking fundamentals (TCP/IP, DNS, Firewalls), and system internals.
- Experience with monitoring and observability platforms (Prometheus, Grafana, ELK).
- Excellent scripting skills in Python, Bash, or Go for automation and tooling.
- Deep understanding of security principles, encryption, IAM, and compliance frameworks.
Good to Have
- Experience with ultra-low-latency or high-frequency trading systems.
- Knowledge of FIX protocol, FPGA acceleration, or network‑level optimizations.
- Familiarity with Redis, Nginx, or other high‑throughput systems.
- Exposure to micro‑second‑level performance tuning or network acceleration technologies.
Why Join Us?
- Be part of a team that consistently raises the bar and delivers exceptional engineering outcomes.
- A culture where innovation, ownership, and bold thinking are valued.
- Exceptional growth opportunities—ideal for someone who thrives in fast-paced, high-impact environments.
- Build systems that influence markets and redefine the fintech landscape.
This isn’t just a role—it’s a challenge, a platform, and a proving ground.
Ready to step up? Apply now.
Key Responsibilities:
- Design, implement, and maintain scalable, secure, and cost-effective infrastructure on AWS and Azure
- Set up and manage CI/CD pipelines for smooth code integration and delivery using tools like GitHub Actions, Bitbucket Runners, AWS CodeBuild/CodeDeploy, Azure DevOps, etc.
- Containerize applications using Docker and manage orchestration with Kubernetes, ECS, Fargate, AWS EKS, Azure AKS.
- Manage and monitor production deployments to ensure high availability and performance
- Implement and manage CDN solutions using AWS CloudFront and Azure Front Door for optimal content delivery and latency reduction
- Define and apply caching strategies at application, CDN, and reverse proxy layers for performance and scalability
- Set up and manage reverse proxies and Cloudflare WAF to ensure application security and performance
- Implement infrastructure as code (IaC) using Terraform, CloudFormation, or ARM templates
- Administer and optimize databases (RDS, PostgreSQL, MySQL, etc.) including backups, scaling, and monitoring
- Configure and maintain VPCs, subnets, routing, VPNs, and security groups for secure and isolated network setups
- Implement monitoring, logging, and alerting using tools like CloudWatch, Grafana, ELK, or Azure Monitor
- Collaborate with development and QA teams to align infrastructure with application needs
- Troubleshoot infrastructure and deployment issues efficiently and proactively
- Ensure cloud cost optimization and usage tracking
Required Skills & Experience:
- 3-4 years of hands-on experience in a DevOps role
- Strong expertise with both AWS and Azure cloud platforms
- Proficient in Git, branching strategies, and pull request workflows
- Deep understanding of CI/CD concepts and experience with pipeline tools
- Proficiency in Docker, container orchestration (Kubernetes, ECS/EKS/AKS)
- Good knowledge of relational databases and experience in managing DB backups, performance, and migrations
- Experience with networking concepts including VPC, subnets, firewalls, VPNs, etc.
- Experience with Infrastructure as Code tools (Terraform preferred)
- Strong working knowledge of CDN technologies: AWS CloudFront and Azure Front Door
- Understanding of caching strategies: edge caching, browser caching, API caching, and reverse proxy-level caching
- Experience with Cloudflare WAF, reverse proxy setups, SSL termination, and rate-limiting
- Familiarity with Linux system administration, scripting (Bash, Python), and automation tools
- Working knowledge of monitoring and logging tools
- Strong troubleshooting and problem-solving skills
Good to Have (Bonus Points):
- Experience with serverless architecture (e.g., AWS Lambda, Azure Functions)
- Exposure to cost monitoring tools like CloudHealth, Azure Cost Management
- Experience with compliance/security best practices (SOC2, ISO, etc.)
- Familiarity with Service Mesh (Istio, Linkerd) and API gateways
- Knowledge of Secrets Management tools (e.g., HashiCorp Vault, AWS Secrets Manager)
Interested candidates are requested to email their resumes with the subject line "Application for [Job Title]".
Only applications received via email will be reviewed. Applications through other channels will not be considered.
Job Description
The client’s department DPS (Digital People Solutions) offers a sophisticated portfolio of IT applications that provides a strong foundation for professional and efficient People & Organization (P&O) and Business Management, both globally and locally. The client is a well-known German company listed on the DAX-40 index, which comprises the 40 largest and most liquid companies on the Frankfurt Stock Exchange.
We are seeking talented DevOps Engineers with a focus on the Elastic Stack (ELK) to join our dynamic DPS team. In this role, you will be responsible for refining and advising on the further development of an existing monitoring solution based on the Elastic Stack (ELK). You will independently handle tasks related to architecture, setup, technical migration, and documentation.
The current application landscape features multiple Java web services running on JEE application servers, primarily hosted on AWS, and integrated with various systems such as SAP, other services, and external partners. DPS is committed to delivering the best digital work experience for the customer’s employees and customers alike.
Responsibilities:
Install, set up, and automate rollouts using Ansible/CloudFormation for all stages (Dev, QA, Prod) in the AWS Cloud for components such as Elasticsearch, Kibana, Metricbeat, APM Server, APM agents, and interface configuration.
Create and develop regular "Default Dashboards" for visualizing metrics from various sources like Apache Webserver, application servers and databases.
Improve and fix bugs in installation and automation routines.
Monitor CPU usage, security findings, and AWS alerts.
Develop and extend "Default Alerting" for issues like OOM errors, datasource issues, and LDAP errors.
Monitor storage space and create concepts for expanding the Elastic landscape in AWS Cloud and Elastic Cloud Enterprise (ECE).
Implement machine learning, uptime monitoring including SLA, JIRA integration, security analysis, anomaly detection, and other useful ELK Stack features.
Integrate data from AWS CloudWatch.
Document all relevant information and train involved personnel in the technologies used.
Requirements:
Experience with Elastic Stack (ELK) components and related technologies.
Proficiency in automation tools like Ansible and CloudFormation.
Strong knowledge of AWS Cloud services.
Experience in creating and managing dashboards and alerts.
Familiarity with IAM roles and rights management.
Ability to document processes and train team members.
Excellent problem-solving skills and attention to detail.
Skills & Requirements
Elastic Stack (ELK), Elasticsearch, Kibana, Logstash, Beats, APM, Ansible, CloudFormation, AWS Cloud, AWS CloudWatch, IAM roles, AWS security, Automation, Monitoring, Dashboard creation, Alerting, Anomaly detection, Machine learning integration, Uptime monitoring, JIRA integration, Apache Webserver, JEE application servers, SAP integration, Database monitoring, Troubleshooting, Performance optimization, Documentation, Training, Problem-solving, Security analysis.
Position: SDE-1 DevSecOps
Location: Pune, India
Experience Required: 0+ Years
We are looking for a DevSecOps engineer to contribute to product development, mentor team members, and devise creative solutions for customer needs. We value effective communication in person, in documentation, and in code. Ideal candidates thrive in small, collaborative teams, love making an impact, and take pride in their work with a product-focused, self-driven approach. If you're passionate about integrating security and deployment seamlessly into the development process, we want you on our team.
About FlytBase
FlytBase is a global leader in enterprise drone software automation. The FlytBase platform enables drone-in-a-box deployments across the globe and has the largest network of partners in 50+ countries.
The team comprises young engineers and designers from top-tier universities such as IIT-B, IIT-KGP, University of Maryland, Georgia Tech, COEP, SRM, and KIIT, with deep expertise in drone technology, computer science, electronics, aerospace, and robotics.
The company is headquartered in Silicon Valley, California, USA, and has R&D offices in Pune, India. Widely recognized as a pioneer in the commercial drone ecosystem, FlytBase continues to win awards globally: it was the Global Grand Champion at the ‘NTT Data Open Innovation Contest’ held in Tokyo, Japan, and received the ‘TiE50 Award’ at TiE Silicon Valley.
Role and Responsibilities:
- Participate in the creation and maintenance of CI/CD solutions and pipelines.
- Leverage Linux and shell scripting for automating security and system updates, and design secure architectures using AWS services (VPC, EC2, S3, IAM, EKS/Kubernetes) to enhance application deployment and management.
- Build and maintain secure Docker containers, manage orchestration using Kubernetes, and automate configuration management with tools like Ansible and Chef, ensuring compliance with security standards.
- Implement and manage infrastructure using Terraform, aligning with security and compliance requirements, and set up Dynatrace for advanced monitoring, alerting, and visualization of security metrics. Develop Terraform scripts to automate and optimize infrastructure provisioning and management tasks.
- Utilize Git for secure source code management and integrate continuous security practices into CI/CD pipelines, applying vulnerability scanning and automated security testing tools.
- Contribute to security assessments, including vulnerability and penetration testing and reviews against NIST, CIS AWS, NIS2, etc.
- Implement and oversee compliance processes for SOC 2, ISO 27001, and GDPR.
- Stay updated on cybersecurity trends and best practices, including knowledge of SAST and DAST tools and the OWASP Top 10.
- Automate routine tasks and create tools to improve team efficiency and system robustness.
- Contribute to disaster recovery plans and ensure robust backup systems are in place.
- Develop and enforce security policies and respond effectively to security incidents.
- Manage incident response protocols, including on-call rotations and strategic planning.
- Conduct post-incident reviews to prevent recurrence and refine the system reliability framework.
- Implement Service Level Indicators (SLIs) and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure high standards of service delivery and reliability.
Best suited for candidates who: (Skills/Experience)
- Up to 4 years of experience in a related field, with a strong emphasis on learning and execution.
- Background in IT or computer science.
- Familiarity with CI/CD tools, cloud platforms (AWS, Azure, or GCP), and programming languages like Python, JavaScript, or Ruby.
- Solid understanding of network layers and TCP/IP protocols.
- In-depth understanding of operating systems, networking, and cloud services.
- Strong problem-solving skills with a 'hacker' mindset.
- Knowledge of security principles, threat modeling, risk assessment, and vulnerability management is a plus.
- Relevant certifications (e.g., CISSP, GWAPT, OSCP) are a plus.
Compensation:
This role comes with an annual CTC that is market competitive and depends on the quality of your work experience, degree of professionalism, culture fit, and alignment with FlytBase’s long-term business strategy.
Perks:
- Fast-paced Startup culture
- Hacker mode environment
- Enthusiastic and approachable team
- Professional autonomy
- Company-wide sense of purpose
- Flexible work hours
- Informal dress code
● Auditing, monitoring, and improving existing infrastructure components of a highly available, scaled product on the cloud with Ubuntu servers
● Running daily maintenance tasks and improving them with automation where possible
● Deploying new components, servers, and other infrastructure when needed
● Coming up with innovative ways to automate tasks
● Working with telecom carriers to obtain rates and destinations and updating them regularly in the system
● Working with Docker containers, Tinc, iptables, HAProxy, etcd, MySQL, MongoDB, CouchDB, and Ansible
You would bring the following skills to our team:
● Expertise with Docker containers and their networking, Tinc, iptables, HAProxy, etcd, and Ansible
● Extensive experience with setup, maintenance, monitoring, backup, and replication of MySQL
● Expertise with Ubuntu servers, including OS- and server-level networking
● Good experience working with MongoDB and CouchDB
● Good with networking tools
● Experience with open-source server monitoring solutions like Nagios, Zabbix, etc.
● Experience working on highly scaled, distributed applications running on datacenter Ubuntu VPS instances
● Innovative, out-of-the-box thinker with multitasking skills who works efficiently in a small team
● Working knowledge of a scripting language such as Bash, Node, or Python
● Experience with calling platforms like FreeSWITCH, OpenSIPS, or Kamailio and basic knowledge of the SIP protocol would be an advantage
About RaRa Delivery
Not just a delivery company…
RaRa Delivery is revolutionising instant delivery for e-commerce in Indonesia through data-driven logistics.
RaRa Delivery is making instant and same-day deliveries scalable and cost-effective by leveraging a differentiated operating model and real-time optimisation technology. RaRa makes it possible for anyone, anywhere to get same-day delivery in Indonesia. While others are focusing on ‘one-to-one’ deliveries, the company has developed proprietary, real-time batching tech to do ‘many-to-many’ deliveries within a few hours. RaRa is already in partnership with some of the top eCommerce players in Indonesia like Blibli, Sayurbox, Kopi Kenangan and many more.
We are a distributed team with the company headquartered in Singapore 🇸🇬, core operations in Indonesia 🇮🇩, and the technology team based out of India 🇮🇳
Future of eCommerce Logistics.
- Data-driven logistics company that is bringing a same-day delivery revolution to Indonesia 🇮🇩
- Revolutionising delivery as an experience
- Empowering D2C Sellers with logistics as the core technology
- Build and maintain CI/CD tools and pipelines.
- Design and manage highly scalable, reliable, and fault-tolerant infrastructure & networking that forms the backbone of distributed systems at RaRa Delivery.
- Continuously improve code quality, product execution, and customer delight.
- Communicate, collaborate and work effectively across distributed teams in a global environment.
- Operate to strengthen teams across their product with their knowledge base
- Contribute to improving team relatedness, and help build a culture of camaraderie.
- Continuously refactor applications to ensure high-quality design
- Pair with team members on functional and non-functional requirements and spread design philosophy and goals across the team
- Excellent Bash and scripting fundamentals, with hands-on scripting experience in programming languages such as Python, Ruby, Golang, etc.
- Good understanding of distributed system fundamentals and ability to troubleshoot issues in a larger distributed infrastructure
- Working knowledge of the TCP/IP stack, internet routing, and load balancing
- Basic understanding of cluster orchestrators and schedulers (Kubernetes)
- Deep knowledge of Linux as a production environment, container technologies (e.g. Docker), Infrastructure as Code (e.g. Terraform), and K8s administration at large scale.
- Have worked on production distributed systems and have an understanding of microservices architecture, RESTful services, CI/CD.
The AWS Cloud/DevOps Engineer will work with the engineering team, focusing on AWS infrastructure and automation. A key part of the role is championing and leading infrastructure as code. The Engineer will work closely with the Manager of Operations and DevOps to build, manage, and automate our AWS infrastructure.
Duties & Responsibilities:
- Design cloud infrastructure that is secure, scalable, and highly available on AWS
- Work collaboratively with software engineering to define infrastructure and deployment requirements
- Provision, configure and maintain AWS cloud infrastructure defined as code
- Ensure configuration and compliance with configuration management tools
- Administer and troubleshoot Linux based systems
- Troubleshoot problems across a wide array of services and functional areas
- Build and maintain operational tools for deployment, monitoring, and analysis of AWS infrastructure and systems
- Perform infrastructure cost analysis and optimization
Qualifications:
- 1-5 years of experience building and maintaining AWS infrastructure (VPC, EC2, Security Groups, IAM, ECS, CodeDeploy, CloudFront, S3)
- Strong understanding of how to secure AWS environments and meet compliance requirements
- Expertise using Chef for configuration management
- Hands-on experience deploying and managing infrastructure with Terraform
- Solid foundation of networking and Linux administration
- Experience with CI/CD, Docker, GitLab, Jenkins, ELK, and deploying applications on AWS
- Ability to learn/use a wide variety of open source technologies and tools
- Strong bias for action and ownership











