
Roles and Responsibilities:
• Gather and analyse cloud infrastructure requirements
• Automate system tasks and infrastructure using a scripting language (Shell/Python/Ruby
preferred), configuration management tools (Ansible/Puppet/Chef), service registry and
discovery tools (Consul, Vault, etc.), infrastructure orchestration tools (Terraform,
CloudFormation), and automated imaging tools (Packer)
• Support existing infrastructure, analyse problem areas and come up with solutions
• An eye for monitoring – the candidate should be able to look at complex infrastructure and be
able to figure out what to monitor and how.
• Work with the Engineering team on infrastructure and network automation needs
• Deploy infrastructure as code and automate as much as possible
• Manage a team of DevOps engineers
Desired Profile:
• Understanding of provisioning of Bare Metal and Virtual Machines
• Working knowledge of configuration management tools like Ansible/Chef/Puppet, and of Redfish
• Experience in scripting languages like Ruby/Python/Shell
• Working knowledge of IP networking, VPNs, DNS, load balancing, firewalling & IPS concepts
• Strong Linux/Unix administration skills.
• Self-starter who can implement with minimal guidance
• Hands-on experience setting up CI/CD from scratch in Jenkins
• Experience managing Kubernetes (K8s) infrastructure

About CodeCraft Technologies Private Limited
CodeCraft Technologies is a digital transformation company offering mobility & cloud solutions along with design and consultancy services.
With CodeCraft you get a chance to work on cutting-edge technologies.
You will be working on challenging projects in the cybersecurity, IoT, and energy domains.
We have an open and transparent work culture.
You will be working with one of the finest design teams.

Senior Cloud Engineer Job Description
Position Title: Senior Cloud Engineer -- AWS [LONG-TERM CONTRACT POSITION]
Location: Remote [REQUIRES WORKING IN CST TIME ZONE]
Position Overview
The Senior Cloud Engineer will play a critical role in designing, deploying, and managing scalable, secure, and highly available cloud infrastructure across multiple platforms (AWS, Azure, Google Cloud). This role requires deep technical expertise, leadership in cloud
strategy, and hands-on experience with automation, DevOps practices, and cloud-native technologies. The ideal candidate will work collaboratively with cross-functional teams to deliver robust cloud solutions, drive best practices, and support business objectives
through innovative cloud engineering.
Key Responsibilities
Design, implement, and maintain cloud infrastructure and services, ensuring high availability, performance, and security across multi-cloud environments (AWS, Azure, GCP)
Develop and manage Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, and Ansible for automated provisioning and configuration
Lead the adoption and optimization of DevOps methodologies, including CI/CD pipelines, automated testing, and deployment processes
Collaborate with software engineers, architects, and stakeholders to architect cloud-native solutions that meet business and technical requirements
Monitor, troubleshoot, and optimize cloud systems for cost, performance, and reliability, using cloud monitoring and logging tools
Ensure cloud environments adhere to security best practices, compliance standards, and governance policies, including identity and access management, encryption, and vulnerability management
Mentor and guide junior engineers, sharing knowledge and fostering a culture of continuous improvement and innovation
Participate in on-call rotation and provide escalation support for critical cloud infrastructure issues
Document cloud architectures, processes, and procedures to ensure knowledge transfer and operational excellence
Stay current with emerging cloud technologies, trends, and best practices.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field, or equivalent work experience
- 6–10 years of experience in cloud engineering or related roles, with a proven track record in large-scale cloud environments
- Deep expertise in at least one major cloud platform (AWS, Azure, Google Cloud) and experience in multi-cloud environments
- Strong programming and scripting skills (Python, Bash, PowerShell, etc.) for automation and cloud service integration
- Proficiency with DevOps tools and practices, including CI/CD (Jenkins, GitLab CI), containerization (Docker, Kubernetes), and configuration management (Ansible, Chef)
- Solid understanding of networking concepts (VPC, VPN, DNS, firewalls, load balancers), system administration (Linux/Windows), and cloud storage solutions
- Experience with cloud security, governance, and compliance frameworks
- Excellent analytical, troubleshooting, and root cause analysis skills
- Strong communication and collaboration abilities, with experience working in agile, interdisciplinary teams
- Ability to work independently, manage multiple priorities, and lead complex projects to completion
Preferred Qualifications
- Relevant cloud certifications (e.g., AWS Certified Solutions Architect, AWS DevOps Engineer, Microsoft AZ-300/400/500, Google Professional Cloud Architect)
- Experience with cloud cost optimization and FinOps practices
- Familiarity with monitoring/logging tools (CloudWatch, Kibana, Logstash, Datadog, etc.)
- Exposure to cloud database technologies (SQL, NoSQL, managed database services)
- Knowledge of cloud migration strategies and hybrid cloud architectures

Location: Bangalore
Experience: 2–5 years
Type: Full-time | On-site
Start: Immediate
Why this role exists
Most systems don’t fail because of one big outage.
They fail because reliability is treated as an afterthought.
Right now, uptime depends too much on individual heroics.
That doesn’t scale.
This role exists to build a reliability system where:
- Uptime is predictable
- Failures are contained
- Escalations don’t depend on leadership
What you’ll do
You will not just monitor systems.
You will own reliability as a product.
1. Drive uptime to production-grade reliability
- Improve system uptime to 99.9% customer-facing SLA within 4 months
- Define and track:
  - SLAs / SLOs / error budgets
- Ensure reliability is measured from the customer’s perspective, not internal metrics
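To make the 99.9% target concrete, the error budget is the downtime the SLO still permits within a measurement window. The sketch below assumes a 30-day window; the function names and the window length are illustrative choices, not something the role description specifies.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime, in minutes, that an availability SLO allows in the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - slo_percent) / 100.0


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    budget = error_budget_minutes(slo_percent, window_days)
    if budget <= 0:
        # A 100% SLO leaves no budget to spend, so the ratio is undefined.
        raise ValueError("SLO must be below 100% to have an error budget")
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # Allowed downtime for the 99.9% target over the assumed 30-day window.
    print(round(error_budget_minutes(99.9), 1))
    # Share of the budget left after 10.8 minutes of customer-facing downtime.
    print(round(budget_remaining(99.9, downtime_minutes=10.8), 2))
```

Over 30 days, 99.9% availability leaves about 43 minutes of allowable downtime, which is why the downtime fed into this calculation should be what customers actually experienced.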
2. Build incident response as a system
- Set up a 24/7 incident response rotation across 3 engineers
- Eliminate dependency on leadership (no single escalation point)
- Define:
  - Incident severity levels
  - Response playbooks
  - Escalation protocols
- Ensure fast detection → containment → resolution
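One possible shape for a three-engineer rotation with no single escalation point is sketched below. The engineer names, severity labels, and acknowledgement timeouts are all hypothetical placeholders; the posting does not define them.

```python
from datetime import date, timedelta

# Hypothetical names and thresholds, used only for illustration.
ENGINEERS = ["eng_a", "eng_b", "eng_c"]
ACK_TIMEOUT_MINUTES = {"SEV1": 5, "SEV2": 15, "SEV3": 60}


def weekly_rotation(engineers, start, weeks):
    """Return (week_start, escalation_chain) pairs, rotating the chain each week."""
    schedule = []
    for week in range(weeks):
        shift = week % len(engineers)
        # The first name is the primary on call; the rest are the fallbacks, in order.
        chain = engineers[shift:] + engineers[:shift]
        schedule.append((start + timedelta(weeks=week), chain))
    return schedule


def current_responder(chain, severity, minutes_unacknowledged):
    """Pick who should be paged, moving down the chain after each missed ack timeout."""
    timeout = ACK_TIMEOUT_MINUTES[severity]
    step = int(minutes_unacknowledged // timeout)
    # Stay on the last engineer in the chain rather than escalating outside the rotation.
    return chain[min(step, len(chain) - 1)]


if __name__ == "__main__":
    for week_start, chain in weekly_rotation(ENGINEERS, date(2024, 1, 1), 3):
        print(week_start.isoformat(), chain)
```

Because the chain rotates weekly, every engineer takes the primary slot in turn, and escalation always stays within the rotation instead of falling back to leadership.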
3. Contain and fix erratic system behavior
- Identify and resolve:
  - Latency spikes
  - Downtime incidents
  - Integration failures
- Build guardrails to prevent recurrence
- Focus on root cause elimination, not temporary fixes
4. Create continuous reliability feedback loops
- Work closely with engineering teams to:
  - Surface recurring failure patterns
  - Improve build quality
  - Reduce production bugs
- Ensure learnings from incidents directly improve future releases
5. Improve observability and monitoring
- Build dashboards and alerts for:
  - System health
  - Performance metrics
  - Failure signals
- Ensure issues are detected before customers report them
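A minimal sketch of an alert check on performance and failure signals follows. The 500 ms latency limit and 1% error-rate limit are assumed values for illustration, and a real setup would more likely express these rules in a monitoring tool than in application code.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def alert_reasons(latencies_ms, errors, requests,
                  p95_limit_ms=500, error_rate_limit=0.01):
    """Return the list of breached signals for one evaluation window."""
    reasons = []
    if latencies_ms and percentile(latencies_ms, 95) > p95_limit_ms:
        reasons.append("latency")
    # Skip the error-rate check when the window saw no traffic.
    if requests > 0 and errors / requests > error_rate_limit:
        reasons.append("error_rate")
    return reasons


if __name__ == "__main__":
    window = [100, 200, 300, 400, 1000]
    print(alert_reasons(window, errors=0, requests=len(window)))
```

Checking a high percentile rather than the mean keeps a handful of slow requests from being averaged away, which matters when the goal is to catch issues before customers report them.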
6. Reduce operational fragility
- Remove single points of failure (people, systems, workflows)
- Improve system resilience across:
  - Deployments
  - Integrations
  - Runtime environments
What success looks like
- Uptime reaches 99.9%+ reliably
- Incidents are:
  - Detected early
  - Contained quickly
  - Resolved permanently
- No dependency on a single individual for escalation
- System behavior becomes predictable and stable
- Engineering teams ship with higher reliability confidence
Who you are
- You have 2-5 years of experience in SRE / DevOps / backend systems
- You have worked on production systems with real uptime expectations
- You think in:
  - Systems
  - Failure modes
  - Trade-offs
- You are comfortable debugging live, high-pressure environments
What will make you stand out
- Experience with:
  - Distributed systems
  - Cloud infrastructure (AWS / Azure / GCP)
  - Monitoring & alerting tools
- Have built or improved:
  - Incident response systems
  - Reliability frameworks
- Strong debugging skills across:
  - Infra
  - Application
  - Integrations
Compensation
₹60,000/month (fixed)
(Aligned with role scope and impact expectations)
Why join
- You will define reliability standards for a production AI platform
- Your work directly impacts:
  - Customer trust
  - Product performance
  - Enterprise readiness
- You will move the system from reactive → predictable
What this role is not
- Not just monitoring dashboards
- Not limited to handling tickets
- Not dependent on escalation to leadership
What this role is
- A builder of reliability systems
- A guardian of uptime and performance
- A multiplier of engineering quality
One question to self-evaluate
Can you build a system where downtime is rare, predictable, and never dependent on a single person?
Managing cloud-based serverless infrastructure on AWS and GCP (Firebase) with IaC
(Terraform, CloudFormation, etc.)
Deploying and maintaining products, services, and network components with a focus
on security, reliability, and zero downtime
Automating and streamlining existing processes to aid the development team
Working with the development team to create ephemeral environments, simplifying
the development lifecycle
Driving forward our blockchain infrastructure by creating and managing validators for
a wide variety of new and existing blockchains
Requirements:
1-3+ years in a SRE / DevOps / DevSecOps or Infrastructure Engineering role
Strong working knowledge of Amazon Web Services (AWS) or GCP or similar cloud
ecosystem
Experience working with declarative Infrastructure-as-Code frameworks (Terraform,
CloudFormation)
Experience with containerization technologies and tools (Docker, Kubernetes), CI/CD
pipelines and Linux/Unix administration
Bonus points if you know more about crypto, staking, DeFi, proof-of-stake,
validators, delegations
Benefits:
Competitive CTC on par with market along with ESOPs/Tokens

We are now seeking a talented and motivated individual to contribute to our product in the cloud data
protection space. The ability to clearly comprehend customer needs in a cloud environment, excellent
troubleshooting skills, and the ability to focus on problem resolution until completion are required.
Responsibilities Include:
Review proposed feature requirements
Create test plan and test cases
Analyze performance, and diagnose and troubleshoot issues
Enter and track defects
Interact with customers, partners, and development teams
Research customer issues and product initiatives
Provide input for service documentation
Required Skills:
Bachelor's degree in Computer Science, Information Systems or related discipline
3+ years of experience, including Software as a Service and/or DevOps engineering experience
Experience with AWS services like VPC, EC2, RDS, SES, ECS, Lambda, S3, ELB
Experience with technologies such as REST, Angular, Messaging, Databases, etc.
Strong troubleshooting skills and issue isolation skills
Possess excellent communication skills (written and verbal English)
Must be able to work as an individual contributor within a team
Ability to think outside the box
Experience in configuring infrastructure
Knowledge of CI / CD
Desirable skills:
Programming skills in scripting languages (e.g., Python, Bash)
Knowledge of Linux administration
Knowledge of testing tools/frameworks: TestNG, Selenium, etc.
Knowledge of Identity and Security
DevOps Engineer
The DevOps team is one of the core technology teams of Lumiq.ai and is responsible for managing network activities, automating Cloud setups and application deployments. The team also interacts with our customers to work out solutions. If you are someone who is always pondering how to make things better, how technologies can interact, how various tools, technologies, and concepts can help a customer or how you can use various technologies to improve user experience, then Lumiq is the place of opportunities.
Job Description
- Explore the newest innovations in scalable and distributed systems.
- Help design the architecture of the project, solutions to existing problems, and future improvements.
- Make the cloud infrastructure and services smart by implementing automation and trigger-based solutions.
- Interact with Data Engineers and Application Engineers to create continuous integration and deployment frameworks and pipelines.
- Play around with large clusters on different clouds to tune your jobs or to learn.
- Research new technologies, prove the concepts, and plan how to integrate or update them.
- Take part in discussions on other projects to learn or to help.
Responsibilities
- 2+ years of experience as a DevOps Engineer.
- You understand networking, from physical networking to software-defined networking.
- You like containers and open-source orchestration systems like Kubernetes and Mesos.
- You have experience securing systems by creating robust access policies and enforcing network restrictions.
- You know how applications work, which is very important for designing distributed systems.
- You have experience with open-source projects and have discussed shortcomings or problems with the community on several occasions.
- You understand that provisioning a Virtual Machine is not DevOps.
- You know you are not a SysAdmin but a DevOps Engineer, the person who develops the operations that let the system run efficiently and at scale.
- You have exposure to private clouds, subnets, VPNs, peering, and load balancers, and have worked with them.
- You check logs before screaming about an error.
- Multiple screens make you more efficient.
- You are a doer who doesn’t say the word impossible.
- You understand the value of documenting your work.
- You understand the Big Data ecosystem and how you can leverage the cloud for it.
- You know these buddies - #airflow, #aws, #azure, #gcloud, #docker, #kubernetes, #mesos, #acs
- Responsible for the entire infrastructure including Production (both bare metal and AWS).
- Manage and maintain the production systems and operations including SysAdmin, DB activities.
- Improve tools and processes, automate manual efforts, and maintain the health of the system.
- Champion best practices, CI-CD, Metrics Driven Development
- Optimise the company's computing architecture
- Conduct systems tests for security, performance, and availability
- Maintain security of the system
- Develop and maintain design and troubleshooting documentation
- 7+ years of experience in DevOps/Technical Operations
- Extensive experience with scripting languages such as Shell, Python, etc.
- Experience in developing and maintaining CI/CD processes for SaaS applications using tools such as Jenkins
- Hands-on experience using configuration management tools such as Puppet, SaltStack, Ansible, etc.
- Hands-on experience building and managing VMs and containers using tools such as Kubernetes, Docker, etc.
- Hands-on experience building, designing, and maintaining cloud-based applications with AWS, Azure, GCP, etc.
- Knowledge of Databases (MySQL, NoSQL)
- Knowledge of security/ethical hacking
- Have experience with ElasticSearch, Kibana, LogStash
- Have experience with Cassandra, Hadoop, or Spark
- Have experience with Mongo, Hive
Hi,
Greetings from ToppersEdge.com India Pvt Ltd
We have job openings for our Client. Kindly find the details below:
Work Location: Bengaluru (remote at present; candidates should relocate to Bangalore later on).
Shift Timings – general shift
Job Type – Permanent Position
Experience – 3-7 years
Candidate should be from Product Based Company only
Job Description
We are looking to expand our DevOps team. This team is responsible for writing scripts to set up infrastructure to support 24x7 availability of the Netradyne services. The team is also responsible for setting up monitoring and alerting, and for troubleshooting any issues reported in multiple environments. The team is responsible for triaging production issues and providing an appropriate and timely response to customers.
Requirements
- B Tech/M Tech/MS in Computer Science or a related field from a reputed university.
- Total industry experience of around 3-7 years.
- Programming experience in Python, Ruby, Perl or equivalent is a must.
- Good knowledge and experience of configuration management tools (such as Ansible).
- Good knowledge and experience of provisioning tools (such as Terraform).
- Good knowledge and experience with AWS.
- Experience with setting up CI/CD pipelines.
- Experience, in individual capacity, managing multiple live SaaS applications with high volume, high load, low-latency and high availability (24x7).
- Experience setting up web servers like Apache, application servers like Tomcat/WebSphere, and databases (RDBMS and NoSQL).
- Good knowledge of UNIX (Linux) administration tools.
- Good knowledge of security best practices and of relevant tools (firewalls, VPNs, etc.).
- Good knowledge of networking concepts and UNIX administration tools.
- Ability to troubleshoot issues quickly is required.
MTX Group Inc. is seeking a motivated Lead DevOps Engineer to join our team. MTX Group Inc. is a global implementation partner enabling organizations to become fit enterprises. MTX provides expertise across various platforms and technologies, including Google Cloud, Salesforce, artificial intelligence/machine learning, data integration, data governance, data quality, analytics, visualization and mobile technology. MTX’s very own Artificial Intelligence platform, Maverick, enables clients to accelerate processes and critical decisions by leveraging a Cognitive Decision Engine, a collection of purpose-built Artificial Neural Networks designed to leverage the power of Machine Learning. The Maverick Platform includes Smart Asset Detection and Monitoring, Chatbot Services, and Document Verification, to name a few.
Responsibilities:
- Be responsible for software releases, configuration, monitoring and support of production system components and infrastructure.
- Troubleshoot technical or functional issues in a complex environment to provide timely resolution, with various applications and platforms that are global.
- Bring experience on Google Cloud Platform.
- Write scripts and automation tools in languages such as Bash/Python/Ruby/Golang.
- Configure and manage data sources like PostgreSQL, MySQL, Mongo, Elasticsearch, Redis, Cassandra, Hadoop, etc.
- Build automation and tooling around Google Cloud Platform using technologies such as Anthos, Kubernetes, Terraform, Google Deployment Manager, Helm, Cloud Build etc.
- Bring a passion to stay on top of DevOps trends, experiment with and learn new CI/CD technologies.
- Work with users to understand and gather their needs in our catalogue, then participate in the required development.
- Manage several streams of work concurrently
- Understand how various systems work
- Understand how IT operations are managed
What you will bring:
- 5 years of work experience as a DevOps Engineer.
- Must possess ample knowledge and experience in system automation, deployment, and implementation.
- Must possess experience in using Linux, Jenkins, and ample experience in configuring and automating the monitoring tools.
- Experience in the software development process and with tools and languages such as SaaS, Python, Java, MongoDB, Shell scripting, MySQL, and Git.
- Knowledge in handling distributed data systems. Examples: Elasticsearch, Cassandra, Hadoop, and others.
What we offer:
- Group Medical Insurance (Family Floater Plan - Self + Spouse + 2 Dependent Children)
  - Sum Insured: INR 5,00,000/-
  - Maternity cover up to two children
  - Inclusive of COVID-19 Coverage
  - Cashless & Reimbursement facility
  - Access to free online doctor consultation
- Personal Accident Policy (Disability Insurance)
  - Sum Insured: INR 25,00,000/- Per Employee
  - Accidental Death and Permanent Total Disability is covered up to 100% of Sum Insured
  - Permanent Partial Disability is covered as per the scale of benefits decided by the Insurer
  - Temporary Total Disability is covered
- An option of Paytm Food Wallet (up to Rs. 2500) as a tax saver benefit
- Monthly Internet Reimbursement of up to Rs. 1,000
- Opportunity to pursue Executive Programs/ courses at top universities globally
- Professional Development opportunities through various MTX sponsored certifications on multiple technology stacks including Salesforce, Google Cloud, Amazon & others
*******************
- 2+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
- 2+ years of hands-on experience as a DevOps engineer
- Strong leadership, communication and interpersonal skills geared to getting things done
- Developing themselves and the talent within their charge – fostering and creating opportunity for the team
- Strong understanding of SRE concepts and the DevOps culture. Set the direction and strategy for your team, and help shape the overall SRE program for the company
- Be able to lead the resolution of complicated technical issues and communicate status updates/RCA to management and customers.
- Own site stability, performance, capacity planning, DevOps recruitment.










