- Understanding customer requirements and project KPIs
- Implementing various development, testing, automation tools, and IT infrastructure
- Planning the team structure and activities, and participating in project management activities
- Managing stakeholders and external interfaces
- Setting up tools and required infrastructure
- Defining and setting development, test, release, update, and support processes for DevOps operation
- Reviewing, verifying, and validating the software code developed in the project
- Troubleshooting and fixing code bugs
- Monitoring processes throughout the lifecycle for adherence, and updating or creating processes to drive improvement and minimize waste
- Encouraging and building automated processes wherever possible
- Identifying and deploying cybersecurity measures by continuously performing vulnerability assessment and risk management
- Incident management and root cause analysis
- Coordination and communication within the team and with customers
- Selecting and deploying appropriate CI/CD tools
- Striving for continuous improvement and building a continuous integration, continuous delivery, and continuous deployment pipeline (CI/CD pipeline)
- Mentoring and guiding the team members
- Monitoring and measuring customer experience and KPIs
- Managing periodic reporting on the progress to the management and the customer
Similar jobs
Key Responsibilities:
- Cloud Infrastructure Management: Oversee the deployment, scaling, and management of cloud infrastructure across platforms like AWS, GCP, and Azure. Ensure optimal configuration, security, and cost-effectiveness.
- Application Deployment and Maintenance: Responsible for deploying and maintaining web applications, particularly those built on Django and the MERN stack (MongoDB, Express.js, React, Node.js). This includes setting up CI/CD pipelines, monitoring performance, and troubleshooting.
- Automation and Optimization: Develop scripts and automation tools to streamline operations. Continuously seek ways to improve system efficiency and reduce downtime.
- Security Compliance: Ensure that all cloud deployments comply with relevant security standards and practices. Regularly conduct security audits and coordinate with security teams to address vulnerabilities.
- Collaboration and Support: Work closely with development teams to understand their needs and provide technical support. Act as a liaison between developers, IT staff, and management to ensure smooth operation and implementation of cloud solutions.
- Disaster Recovery and Backup: Implement and manage disaster recovery plans and backup strategies to ensure data integrity and availability.
- Performance Monitoring: Regularly monitor and report on the performance of cloud services and applications. Use data to make informed decisions about upgrades, scaling, and other changes.
Required Skills and Experience:
- Proven experience in managing cloud infrastructure on AWS, GCP, and Azure.
- Strong background in deploying and maintaining Django-based and MERN stack web applications.
- Expertise in automation tools and scripting languages.
- Solid understanding of network architecture and security protocols.
- Experience with continuous integration and deployment (CI/CD) methodologies.
- Excellent problem-solving abilities and a proactive approach to system optimization.
- Good communication skills for effective collaboration with various teams.
Desired Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- Relevant certifications in AWS, GCP, or Azure are highly desirable.
- Minimum 5 years of experience in a DevOps or similar role, with a focus on cloud computing and web application deployment.
- 2+ years work experience in a DevOps or similar role
- Knowledge of OO programming and concepts (Java, C++, C#, Python)
- A drive towards automating repetitive tasks (e.g., scripting via Bash, Python, etc.)
- Fluency in one or more scripting languages such as Python or Ruby.
- Familiarity with Microservice-based architectures
- Practical experience with Docker containerization and clustering (Kubernetes/ECS)
- In-depth, hands-on experience with Linux, networking, server, and cloud architectures.
- Experience with CI/CD and automation tools such as Azure DevOps, AWS CloudFormation, Lambda functions, Jenkins, and Ansible
- Experience with AWS, Azure, or another cloud PaaS provider.
- Solid understanding of configuration, deployment, management, and maintenance of large cloud-hosted systems; including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery
- Proficiency with source control, continuous integration, and testing pipelines
- Effective communication skills
Job Responsibilities:
- Deploy and maintain critical applications on cloud-native microservices architecture.
- Implement automation, effective monitoring, and infrastructure-as-code.
- Deploy and maintain CI/CD pipelines across multiple environments.
- Streamline the software development lifecycle by identifying pain points and productivity barriers and determining ways to resolve them.
- Analyze how customers are using the platform and help drive continuous improvement.
- Support and work alongside a cross-functional engineering team on the latest technologies.
- Iterate on best practices to increase the quality & velocity of deployments.
- Sustain and improve the process of knowledge sharing throughout the engineering team
- Identification and prioritization of technical debt that risks instability or creates wasteful operational toil.
- Own daily operational goals with the team.
Job Responsibilities:
Section 1 -
- Responsible for managing and providing L1 support to build, design, deploy, and maintain cloud solutions on AWS.
- Implement, deploy and maintain development, staging & production environments on AWS.
- Familiar with serverless architecture and services on AWS like Lambda, Fargate, EBS, Glue, etc.
- Understanding of infrastructure as code and familiarity with related tools like Terraform, Ansible, CloudFormation, etc.
Section 2 -
- Managing the Windows and Linux machines, Kubernetes, Git, etc.
- Responsible for L1 management of Servers, Networks, Containers, Storage, and Databases services on AWS.
Section 3 -
- Timely monitoring of production workload alerts and quick resolution of issues.
- Responsible for monitoring and maintaining the Backup and DR process.
Section 4 -
- Responsible for documenting the process.
- Responsible for leading cloud implementation projects with end-to-end execution.
Qualifications: Bachelor of Engineering / MCA, preferably with an AWS or cloud certification
Skills & Competencies
- Linux and Windows servers management and troubleshooting.
- AWS services experience with CloudFormation, EC2, RDS, VPC, EKS, ECS, Redshift, Glue, etc.
- AWS EKS
- Kubernetes and containers knowledge
- Understanding of setting up AWS messaging, streaming, and queuing services (MSK, Kinesis, SQS, SNS, MQ)
- Strong understanding of serverless architecture concepts
- Strong understanding of networking concepts
- Managing monitoring and alerting systems
- Sound knowledge of database concepts like data warehouses, data lakes, and ETL jobs
- Good Project management skills
- Documentation skills
- Backup, and DR understanding
Soft Skills - Project management, Process Documentation
Ideal Candidate:
- AWS certification with 2-4 years of experience, including project execution experience.
- Someone who is interested in building sustainable cloud architecture with automation on AWS.
- Someone who is interested in learning and being challenged on a day-to-day basis.
- Someone who can take ownership of the tasks and is willing to take the necessary action to get it done.
- Someone who is curious to analyze and solve complex problems.
- Someone who is honest about the quality of their work and is comfortable taking ownership of both their successes and failures.
Behavioral Traits
- We are looking for someone who is interested in being part of a creative, innovation-based environment with other team members.
- We are looking for someone who understands the idea/importance of teamwork and individual ownership at the same time.
- We are looking for someone who can debate logically, disagree respectfully, admit when proven wrong, learn from their mistakes, and grow quickly.
We are looking for an excellent, experienced person in the DevOps field. Be a part of a vibrant, rapidly growing tech enterprise with a great working environment. As a DevOps Engineer, you will be responsible for managing and building upon the infrastructure that supports our data intelligence platform. You'll also be involved in building tools and establishing processes to empower developers to deploy and release their code seamlessly.
Responsibilities
The ideal DevOps Engineer possesses a solid understanding of system internals and distributed systems.
Understanding accessibility and security compliance (Depending on the specific project)
User authentication and authorization between multiple systems, servers, and environments
Integration of multiple data sources and databases into one system
Understanding fundamental design principles behind a scalable application
Configuration management tools (Ansible/Chef/Puppet), Cloud Service Providers (AWS/DigitalOcean), Docker+Kubernetes ecosystem is a plus.
Should be able to make key decisions for our infrastructure, networking and security.
Manipulation of shell scripts during migration and DB connection.
Monitor production server health across different parameters (CPU load, physical memory, swap memory) and set up a monitoring tool such as Nagios to monitor production server health.
Create alerts and configure monitoring of specified metrics to manage cloud infrastructure efficiently.
Set up and manage VPCs and subnets; connect different zones; block suspicious IPs/subnets via ACLs.
Create and manage AMIs, snapshots, and volumes; upgrade or downgrade AWS resources (CPU, memory, EBS).
The candidate will be responsible for managing microservices at scale and maintaining the compute and storage infrastructure for various product teams.
Strong knowledge of configuration management tools like Ansible, Chef, and Puppet.
Extensive experience with change tracking tools like JIRA, log analysis, and maintaining documentation of production server error log reports.
Experienced in troubleshooting, backup, and recovery.
Excellent knowledge of cloud service providers like AWS and DigitalOcean.
Good Knowledge about Docker, Kubernetes eco-system.
Proficient understanding of code versioning tools, such as Git
Must have experience working in an automated environment.
Good knowledge of Amazon Web Services offerings such as Amazon EC2, Amazon S3 (Amazon Glacier), Amazon VPC, and Amazon CloudWatch.
Scheduling jobs using crontab; creating swap memory.
Proficient Knowledge about Access Management (IAM)
Must have expertise in Maven, Jenkins, Chef, SVN, GitHub, Tomcat, Linux, etc.
Candidate Should have good knowledge about GCP.
Educational Qualifications
B.Tech (IT) / M.Tech / MBA (IT) / BCA / MCA or any degree in a relevant field
Experience: 2-6 years
This company is a network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their cloud offering and capitalize on the cloud. We have our own IoT/AI platform and provide professional services on that platform to build custom clouds for our clients' IoT devices. We also build mobile apps and run 24x7 DevOps/site reliability engineering for our clients.
We are looking for very hands-on SRE (Site Reliability Engineering) engineers with 3 to 6 years of experience. The person will be part of a team that is responsible for designing and implementing automation from scratch for medium to large scale cloud infrastructure and providing 24x7 services to our North American / European customers. This also includes ensuring ~100% uptime for 50+ internal sites. The person is expected to deliver with both high speed and high quality as well as work for 40 hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ years of hands-on experience setting up and troubleshooting Linux environments, with the ability to write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ years of hands-on experience setting up CI/CD from scratch in Jenkins and GitLab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications (AWS, GCP, CKA, etc.) will be preferred.
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Minimum 3 years of experience as an SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy.
Location:
- Remote, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses, and other incentives.
- We don't believe in locking people in with long notice periods. You will stay here because you love the company. We have only a 15-day notice period.
Araali Networks is seeking a highly driven DevOps engineer to help streamline the release and deployment process, while assuring high quality.
Responsibilities:
Use best practices to streamline the release process
Use IaC methods to manage cloud infrastructure
Use test automation for release qualification
Skills and Qualifications:
Hands-on experience in managing release pipelines
Good understanding of public cloud infrastructure
Working knowledge of Python test frameworks like PyUnit, and automation tools like Selenium, Cucumber, etc.
Strong analytical and debugging skills
Bachelor’s degree in Computer Science
About Araali:
Araali Networks is a SaaS-based cybersecurity startup that has raised a total of $10M from well-known investors like A Capital, Firebolt Ventures, and SV Angels, and through a strategic investment by a publicly traded security company.
The company is disrupting the cloud firewall market by auto-creating nano-perimeters around every cloud app. The Araali solution enables developers to focus on features and improves the security posture through simplification and automation.
The security controls are embedded at the time of DevOps. The precision of Araali controls also helps with security operations where alerts are precise and intelligently routed to the right app team, making it actionable in real-time.
Projects you'll be working on:
- We're focused on enhancing our product for our clients and their users, as well as streamlining operations and improving our technical foundation.
- Writing scripts for procurement, configuration and deployment of instances (infrastructure automation) on GCP
- Managing Kubernetes cluster
- Manage products and services like VPC, Elasticsearch, Cloud Functions, RabbitMQ, Redis servers, Postgres infrastructure, App Engine, etc.
- Supporting developers in setting up infrastructure for services
- Manage and improve microservices infrastructure
- Managing high availability, low latency applications
- Focus on security best practices and assist in security and compliance activities
Requirements
- Minimum 3 years experience as DevOps
- Minimum 1 year of experience with Kubernetes clusters (infrastructure as code, maintenance, and scalability).
- Bash expertise; professional programming experience in Node.js or Python
- Experience with setting up, configuring and using Jenkins or any CI tools, building CI/CD pipeline
- Experience setting up microservices architecture
- Experience with package management and deployments
- Thorough understanding of networking.
- Understanding of all common services and protocols
- Experience in web server configuration, monitoring, network design and high availability
- Thorough understanding of DNS, VPN, SSL
Technologies you'll work with:
- GKE, Prometheus, Grafana, Stackdriver
- ArgoCD and GitHub Actions
- NodeJS Backend
- Postgres, ElasticSearch, Redis, RabbitMQ
- Whatever else you decide - we're constantly re-evaluating our stack and tools
- Having prior experience with the technologies is a plus, but not mandatory for skilled candidates.
Benefits
- Remote Option - You can work from a location of your choice :)
- Reimbursement of Home Office Setup
- Competitive Salary
- Friendly atmosphere
- Flexible paid vacation policy
- Cloud and virtualization-based technologies (Amazon Web Services (AWS), VMWare).
- Java Application Server Administration (WebLogic, WildFly, JBoss, Tomcat).
- Docker and Kubernetes (EKS)
- Linux/UNIX Administration (Amazon Linux and RedHat).
- Developing and supporting cloud infrastructure designs and implementations and guiding application development teams.
- Configuration Management tools (Chef, Puppet, or Ansible).
- Log aggregation tools such as Elastic and/or Splunk.
- Automate infrastructure and application deployment-related tasks using Terraform.
- Automate repetitive tasks required to maintain a secure and up-to-date operational environment.
Responsibilities
- Build and support always-available private/public cloud-based software-as-a-service (SaaS) applications.
- Build AWS or other public cloud infrastructure using Terraform.
- Deploy and manage Kubernetes (EKS) based Docker applications in AWS.
- Create custom OS images using Packer.
- Create and revise infrastructure and architectural designs and implementation plans and guide the implementation with operations.
- Liaise between application development, infrastructure support, and tools (IT Services) teams.
- Develop and document Chef recipes and/or Ansible scripts. Provide support throughout the entire deployment lifecycle (development, quality assurance, and production).
- Help developers leverage infrastructure, application, and cloud platform features and functionality, participate in code and design reviews, and support developers by building CI/CD pipelines using Bamboo, Jenkins, or Spinnaker.
- Create knowledge-sharing presentations and documentation to help developers and operations teams understand and leverage the system's capabilities.
- Learn on the job and explore new technologies with little supervision.
- Leverage scripting (BASH, Perl, Ruby, Python) to build required automation and tools on an ad-hoc basis.
Who we have in mind:
- Solid experience in building a solution on AWS or other public cloud services using Terraform.
- Excellent problem-solving skills with a desire to take on responsibility.
- Extensive knowledge in containerized application and deployment in Kubernetes
- Extensive knowledge of the Linux operating system, RHEL preferred.
- Proficiency with shell scripting.
- Experience with Java application servers.
- Experience with Git and Subversion.
- Excellent written and verbal communication skills with the ability to communicate technical issues to non-technical and technical audiences.
- Experience working in a large-scale operational environment.
- Internet and operating system security fundamentals.
- Extensive knowledge of massively scalable systems. Linux operating system/application development desirable.
- Programming in scripting languages such as Python. Other object-oriented languages (C++, Java) are a plus.
- Experience with Configuration Management Automation tools (Chef or Puppet).
- Experience with virtualization, preferably on multiple hypervisors.
- BS/MS in Computer Science or equivalent experience.
- Excellent written and verbal skills.
Education or Equivalent Experience:
- Bachelor's degree or equivalent education in related fields
- Certificates of training in associated fields/equipment
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CI/CD
- Observability (Prometheus/New Relic/ELK)
- Cloud Platforms (AWS/Azure)
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services and infrastructure, and working with stakeholders to fix them.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
● Extensive experience in DNS, TCP/IP, UDP, gRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as Code tools such as Terraform, and Configuration Management Tools such as Chef, Puppet, Saltstack, Ansible.
● Expertise of Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker, Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike, Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platform tech stacks like Hadoop, Hive, Presto, etc. is a plus.