Description
Do you dream about code every night? If so, we’d love to talk to you about a new product that we’re making to enable delightful testing experiences at scale for development teams who build modern software solutions.
What You'll Do
Troubleshooting and analyzing technical issues raised by internal and external users.
Working with Monitoring tools like Prometheus / Nagios / Zabbix.
Developing automation in one or more technologies such as Terraform, Ansible, Cloud Formation, Puppet, Chef will be preferred.
Monitor infrastructure alerts and take proactive action to avoid downtime and customer impacts.
Working closely with the cross-functional teams to resolve issues.
Test, build, design, deployment, and ability to maintain continuous integration and continuous delivery process using tools like Jenkins, maven Git, etc.
Work in close coordination with the development and operations team such that the application is in line with performance according to the customer's expectations.
What you should have
Bachelor’s or Master’s degree in computer science or any related field.
3 - 6 years of experience in Linux / Unix, cloud computing techniques.
Familiar with working on cloud and datacenter for enterprise customers.
Hands-on experience on Linux / Windows / Mac OS’s and Batch/Apple/Bash scripting.
Experience with various databases such as MongoDB, PostgreSQL, MySQL, MSSQL.
Familiar with AWS technologies like EC2, S3, Lambda, IAM, etc.
Must know how to choose the best tools and technologies which best fit the business needs.
Experience in developing and maintaining CI/CD processes using tools like Git, GitHub, Jenkins etc.
Excellent organizational skills to adapt to a constantly changing technical environment

About LambdaTest
About
Perform manual or automated cross browser testing on 2000+ browsers online. Deploy and scale faster with the most powerful cross browser testing tool online.
Tech stack
Connect with the team
Similar jobs
Please Apply - https://zrec.in/7EYKe?source=CareerSite
About Us
Infra360 Solutions is a services company specializing in Cloud, DevSecOps, Security, and Observability solutions. We help technology companies adapt DevOps culture in their organization by focusing on long-term DevOps roadmap. We focus on identifying technical and cultural issues in the journey of successfully implementing the DevOps practices in the organization and work with respective teams to fix issues to increase overall productivity. We also do training sessions for the developers and make them realize the importance of DevOps. We provide these services - DevOps, DevSecOps, FinOps, Cost Optimizations, CI/CD, Observability, Cloud Security, Containerization, Cloud Migration, Site Reliability, Performance Optimizations, SIEM and SecOps, Serverless automation, Well-Architected Review, MLOps, Governance, Risk & Compliance. We do assessments of technology architecture, security, governance, compliance, and DevOps maturity model for any technology company and help them optimize their cloud cost, streamline their technology architecture, and set up processes to improve the availability and reliability of their website and applications. We set up tools for monitoring, logging, and observability. We focus on bringing the DevOps culture to the organization to improve its efficiency and delivery.
Job Description
Job Title: Senior DevOps Engineer / SRE
Department: Technology
Location: Gurgaon
Work Mode: On-site
Working Hours: 10 AM - 7 PM
Terms: Permanent
Experience: 4-6 years
Education: B.Tech/MCA
Notice Period: Immediately
About Us
At Infra360.io, we are a next-generation cloud consulting and services company committed to delivering comprehensive, 360-degree solutions for cloud, infrastructure, DevOps, and security. We partner with clients to transform and optimize their technology landscape, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
Role Summary
We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems. The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability. This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP, Azure), along with scripting skills in Python or Go. The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (Datadog, New Relic, OpenTelemetry).
Strong communication, incident management skills, and a collaborative approach are essential. Experience in team leadership and multi-client engagement is a plus.
Ideal Candidate Profile
- Solid 4-6 years of experience as an SRE and DevOps with a proven track record of handling large-scale production environments
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Strong Hands-on experience with managing Large Scale Production Systems
- Strong Production Troubleshooting Skills and handling high-pressure situations.
- Strong Experience with Databases (PostgreSQL, MongoDB, ElasticSearch, Kafka)
- Worked on making production systems more Scalable, Highly Available and Fault-tolerant
- Hands-on experience with ELK or other logging and observability tools
- Hands-on experience with Prometheus, Grafana & Alertmanager and on-call processes like Pagerduty
- Problem-Solving Mindset
- Strong with skills - K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure etc
- Good with Python/Go Scripting Automation
- Strong with fundamentals like DNS, Networking, Linux
- Experience with APM tools like - Newrelic, Datadog, OpenTelemetry
- Good experience with Incident Response, Incident Management, Writing detailed RCAs
- Experience with Applications best practices in making apps more reliable and fault-tolerant
- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.
- Able to manage multiple clients and take ownership of client issues.
- Experience with Git and coding best practices
Good to have
- Team-leading Experience
- Multiple Client Handling
- Requirements gathering from clients
- Good Communication
Key Responsibilities
- Design and Development:
- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.
- Collaborate with product and engineering teams to translate business requirements into technical specifications.
- Write clean, maintainable, and efficient code, following best practices and coding standards.
- Cloud Infrastructure:
- Develop and optimise cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).
- Implement and manage CI/CD pipelines for automated deployment and testing.
- Ensure the security, reliability, and performance of cloud infrastructure.
- Technical Leadership:
- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.
- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.
- Lead technical discussions and contribute to architectural decisions.
- Problem Solving and Troubleshooting:
- Identify, diagnose, and resolve complex software and infrastructure issues.
- Perform root cause analysis for production incidents and implement preventative measures.
- Continuous Improvement:
- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.
- Contribute to the continuous improvement of development processes, tools, and methodologies.
- Drive innovation by experimenting with new technologies and solutions to enhance the platform.
- Collaboration:
- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.
- Communicate effectively with stakeholders, including technical and non-technical team members.
- Client Interaction & Management:
- Will serve as a direct point of contact for multiple clients.
- Able to handle the unique technical needs and challenges of two or more clients concurrently.
- Involve both direct interaction with clients and internal team coordination.
- Production Systems Management:
- Must have extensive experience in managing, monitoring, and debugging production environments.
- Will work on troubleshooting complex issues and ensure that production systems are running smoothly with minimal downtime.
Classplus is India's largest B2B ed-tech start-up, enabling 1 Lac+ educators and content creators to create their digital identity with their own branded apps. Starting in 2018, we have grown more than 10x in the last year, into India's fastest-growing video learning platform.
Over the years, marquee investors like Tiger Global, Surge, GSV Ventures, Blume, Falcon, Capital, RTP Global, and Chimera Ventures have supported our vision. Thanks to our awesome and dedicated team, we achieved a major milestone in March this year when we secured a “Series-D” funding.
Now as we go global, we are super excited to have new folks on board who can take the rocketship higher🚀. Do you think you have what it takes to help us achieve this? Find Out Below!
What will you do?
• Define the overall process, which includes building a team for DevOps activities and ensuring that infrastructure changes are reviewed from an architecture and security perspective
• Create standardized tooling and templates for development teams to create CI/CD pipelines
• Ensure infrastructure is created and maintained using terraform
• Work with various stakeholders to design and implement infrastructure changes to support new feature sets in various product lines.
• Maintain transparency and clear visibility of costs associated with various product verticals, environments and work with stakeholders to plan for optimization and implementation
• Spearhead continuous experimenting and innovating initiatives to optimize the infrastructure in terms of uptime, availability, latency and costs
You should apply, if you
1. Are a seasoned Veteran: Have managed infrastructure at scale running web apps, microservices, and data pipelines using tools and languages like JavaScript(NodeJS), Go, Python, Java, Erlang, Elixir, C++ or Ruby (experience in any one of them is enough)
2. Are a Mr. Perfectionist: You have a strong bias for automation and taking the time to think about the right way to solve a problem versus quick fixes or band-aids.
3. Bring your A-Game: Have hands-on experience and ability to design/implement infrastructure with GCP services like Compute, Database, Storage, Load Balancers, API Gateway, Service Mesh, Firewalls, Message Brokers, Monitoring, Logging and experience in setting up backups, patching and DR planning
4. Are up with the times: Have expertise in one or more cloud platforms (Amazon WebServices or Google Cloud Platform or Microsoft Azure), and have experience in creating and managing infrastructure completely through Terraform kind of tool
5. Have it all on your fingertips: Have experience building CI/CD pipeline using Jenkins, Docker for applications majorly running on Kubernetes. Hands-on experience in managing and troubleshooting applications running on K8s
6. Have nailed the data storage game: Good knowledge of Relational and NoSQL databases (MySQL,Mongo, BigQuery, Cassandra…)
7. Bring that extra zing: Have the ability to program/script is and strong fundamentals in Linux and Networking.
8. Know your toys: Have a good understanding of Microservices architecture, Big Data technologies and experience with highly available distributed systems, scaling data store technologies, and creating multi-tenant and self hosted environments, that’s a plus
Being Part of the Clan
At Classplus, you’re not an “employee” but a part of our “Clan”. So, you can forget about being bound by the clock as long as you’re crushing it workwise😎. Add to that some passionate people working with and around you, and what you get is the perfect work vibe you’ve been looking for!
It doesn’t matter how long your journey has been or your position in the hierarchy (we don’t do Sirs and Ma’ams); you’ll be heard, appreciated, and rewarded. One can say, we have a special place in our hearts for the Doers! ✊🏼❤️
Are you a go-getter with the chops to nail what you do? Then this is the place for you.
DevOps Engineer
Job Description:
The position requires a broad set of technical and interpersonal skills that includes deployment technologies, monitoring and scripting from networking to infrastructure. Well versed in troubleshooting Prod issues and should be able to drive till the RCA.
Skills:
- Manage VMs across multiple datacenters and AWS to support dev/test and production workloads.
- Strong hands-on over Ansible is preferred
- Strong knowledge and hands-on experience in Kubernetes Architecture and administration.
- Should have core knowledge in Linux and System operations.
- Proactively and reactively resolve incidents as escalated from monitoring solutions and end users.
- Conduct and automate audits for network and systems infrastructure.
- Do software deployments, per documented processes, with no impact to customers.
- Follow existing devops processes while having flexibility to create and tweak processes to gain efficiency.
- Troubleshoot connectivity problems across network, systems or applications.
- Follow security guidelines, both policy and technical to protect our customers.
- Ability to automate recurring tasks to increase velocity and quality.
- Should have worked on any one of the Database (Postgres/Mongo/Cockroach/Cassandra)
- Should have knowledge and hands-on experience in managing ELK clusters.
- Scripting Knowledge in Shell/Python is added advantage.
- Hands-on Experience over K8s based Microservice Architecture is added advantage.
- Working on scalability, maintainability and reliability of company's products.
- Working with clients to solve their day-to-day challenges, moving manual processes to automation.
- Keeping systems reliable and gauging the effort it takes to reach there.
- Understanding Juxtapose tools and technologies to choose x over y.
- Understanding Infrastructure as a Code and applying software design principles to it.
- Automating tedious work using your favourite scripting languages.
- Taking code from the local system to production by implementing Continuous Integration and Delivery principles.
What you need to have:
- Worked with any one of the programming languages like Go, Python, Java, Ruby.
- Work experience with public cloud providers like AWS, GCP or Azure.
- Understanding of Linux systems and Containers
- Meticulous in creating and following runbooks and checklists
- Microservices experience and use of orchestration tools like Kubernetes/Nomad.
- Understanding of Computer Networking fundamentals like TCP, UDP.
- Strong bash scripting skills.
Requirements
Location – Pune
Experience - 1.5 to 3 YR
Payroll: Direct with Client
Salary Range: 3 to 5 Lacs (depending on existing)
Role and Responsibility
• Good understanding and Experience on AWS CloudWatch for ES2, Amazon Web Services, and Resources, and other sources.
• Collect and Store logs
• Monitor and Store Logs
• Log Analyze
• Configure Alarm
• Configure Dashboard
• Preparation and following of SOP's, Documentation.
• Good understanding AWS in DevOps.
• Experience with AWS services ( EC2, ECS, CloudWatch, VPC, Networking )
• Experience with a variety of infrastructure, application, and log monitoring tools ~ Prometheus, Grafana,
• Familiarity with Docker, Linux, and Linux security
• Knowledge and experience with container-based architectures like Docker
• Experience on performing troubleshooting on AWS service.
• Experience in configuring services in AWS like EC2, S3, ECS
• Experience with Linux system administration and engineering skills on Cloud infrastructure
• Knowledge of Load Balancers, Firewalls, and network switching components
• Knowledge of Internet-based technologies - TCP/IP, DNS, HTTP, SMTP & Networking concepts
• Knowledge of security best practices
• Comfortable 24x7 supporting Production environments
• Strong communication skills

- He has to perform architectural analysis, and he should know how to design enterprise-level systems.
- He should know how to design and simulate tools for the perfect delivery of systems.
- He should know how to design, develop, and maintain systems, processes, procedures to deliver a high-quality service design.
- He has to work with other members of a team and other departments to establish healthy communication and information flow.
- He should know how to deliver a high-performing solution architecture that can support the development efforts of a business.
- He has to plan, design, and configure the most typical business solutions as needed.
- He has to prepare technical documents and other presentations for multiple solutions areas.
- He has to be sure that the best practices for configuration management are carried our as it was needed.
- He has to work on customer specifications, analyze them, and conduct the best product recommendations associated with the platform
Requirements
- AWS Solution Architect 9-10 Years
- Responsible for managing applications on public cloud (AWS) infrastructure.
- Responsible for larger migrations of applications from VM to cloud/cloud-native.
- Responsible for setting up monitoring for cloud/cloud-native-based infrastructure and applications.
- MUST: AWS Solution Architect Professional certification.
BENEFITS: -
Competitive salary and stock options -
Premium medical benefits
LOCATION: - Remote (India)
EDUCATION AND EXPERIENCE -
Degree in computer science, software engineering, or a related field -
At least 3-5 years of professional work experience as a
DevOps / test automation / deployment engineer -
Experience in agile software development methodologies
JOB RESPONSIBILITIES: -
Design and development of scalable software test framework to automate test procedures - Perform health checks of existing sites periodically (manually and also developing a automation pipeline) - Manage per site software releases -
Generate release notes, user guides, technical documentation -
Perform upgrades/ downgrades (patch management) -
Generate and Manage configurations -
Suggest process improvements by interfacing with product owner -
File process/installation/deployment related issues & tracking them -
Verify customer-reported issues and translate them into technical tasks for development
REQUIREMENTS:
Following are must to have requirements, -
Automation frameworks like Selenium etc. -
Scripting languages like Perl, Python etc. -
Linux systems -
Version control system like git -
Issue tracking system like Jira -
Networking protocols -
Cloud infrastructure -
Ability to work individually and as part of a team with a sense of urgency -
Excellent communication skills in written and verbal English -
Great attention to detail
PREFERRED:
Following are good to have requirements, -
Knowledge in any of the programming languages like C, C++ etc.
Experience managing cloud-based (e.g., AWS, Google Cloud, etc.) and in-house server infrastructure -
Familiarity with machine learning / artificial intelligence infrastructure -
Experience in data visualization and statistics -
Basic knowledge of hardware infrastructure containing routers, switches and others -
Familiarity with web & data securityDesign and development of scalable software test framework to automate test
2. Extensive expertise in the below in AWS Development.
3. Amazon Dynamo Db, Amazon RDS , Amazon APIs. AWS Elastic Beanstalk, and AWS Cloud Formation.
4. Lambda, Kinesis. CodeCommit ,CodePipeline.
5. Leveraging AWS SDKs to interact with AWS services from the application.
6. Writing code that optimizes performance of AWS services used by the application.
7. Developing with Restful API interfaces.
8. Code-level application security (IAM roles, credentials, encryption, etc.).
9. Programming Language Python or .NET. Programming with AWS APIs.
10. General troubleshooting and debugging.






