Talented DevOps Supervisor with wide skills in programing and support


About Transportation | Warehouse Optimization
Similar jobs
About PGAGI:
At PGAGI, we believe in a future where AI and human intelligence coexist in harmony, creating a world that is smarter, faster, and better. We are not just building AI; we are shaping a future where AI is a fundamental and positive force for businesses, societies, and the planet.
Position Overview:
PGAGI Consultancy Pvt. Ltd. is seeking a proactive and motivated DevOps Intern with around 3-6 months of hands-on experience to support our AI model deployment and infrastructure initiatives. This role is ideal for someone looking to deepen their expertise in DevOps practices tailored to AI/ML environments, including CI/CD automation, cloud infrastructure, containerization, and monitoring.
Key Responsibilities:
AI Model Deployment & Integration
- Assist in containerizing and deploying AI/ML models into production using Docker.
- Support integration of models into existing systems and APIs.
Infrastructure Management
- Help manage cloud and on-premise environments to ensure scalability and consistency.
- Work with Kubernetes for orchestration and environment scaling.
CI/CD Pipeline Automation
- Collaborate on building and maintaining automated CI/CD pipelines (e.g., GitHub Actions, Jenkins).
- Implement basic automated testing and rollback mechanisms.
Hosting & Web Environment Management
- Assist in managing hosting platforms, web servers, and CDN configurations.
- Support DNS, load balancer setups, and ensure high availability of web services.
Monitoring, Logging & Optimization
- Set up and maintain monitoring/logging tools like Prometheus and Grafana.
- Participate in troubleshooting and resolving performance bottlenecks.
Security & Compliance
- Apply basic DevSecOps practices including security scans and access control implementations.
- Follow security and compliance checklists under supervision.
Cost & Resource Management
- Monitor resource usage and suggest cost optimization strategies in cloud environments.
Documentation
- Maintain accurate documentation for deployment processes and incident responses.
Continuous Learning & Innovation
- Suggest improvements to workflows and tools.
- Stay updated with the latest DevOps and AI infrastructure trends.
Requirements:
- Around 6 months of experience in a DevOps or related technical role (internship or professional).
- Basic understanding of Docker, Kubernetes, and CI/CD tools like GitHub Actions or Jenkins.
- Familiarity with cloud platforms (AWS, GCP, or Azure) and monitoring tools (e.g., Prometheus, Grafana).
- Exposure to scripting languages (e.g., Bash, Python) is a plus.
- Strong problem-solving skills and eagerness to learn.
- Good communication and documentation abilities.
Compensation
- Joining Bonus: INR 2,500 one-time bonus upon joining.
- Monthly Stipend: Base stipend of INR 8,000 per month, with the potential to increase up to INR 20,000 based on performance evaluations.
- Performance-Based Pay Scale: Eligibility for monthly performance-based bonuses, rewarding exceptional project contributions and teamwork.
- Additional Benefits: Access to professional development opportunities, including workshops, tech talks, and mentoring sessions.
Ready to kick-start your DevOps journey in a dynamic AI-driven environment? Apply now
#Devops #Docker #Kubernetes #DevOpsIntern
Electrum is looking for an experienced and proficient DevOps Engineer. This role will provide you with an opportunity to explore what’s possible in a collaborative and innovative work environment. If your goal is to work with a team of talented professionals that is keenly focused on solving complex business problems and supporting product innovation with technology, you might be our new DevOps Engineer. With this position, you will be involved in building out systems for our rapidly expanding team, enabling the whole engineering group to operate more effectively and iterate at top speed in an open, collaborative environment. The ideal candidate will have a solid background in software engineering and a vivid experience in deploying product updates, identifying production issues, and implementing integrations. The ideal candidate has proven capabilities and experience in risk-taking, is willing to take up challenges, and is a strong believer in efficiency and innovation with exceptional communication and documentation skills.
YOU WILL:
- Plan for future infrastructure as well as maintain & optimize the existing infrastructure.
- Conceptualize, architect, and build:
- 1. Automated deployment pipelines in a CI/CD environment like Jenkins;
- 2. Infrastructure using Docker, Kubernetes, and other serverless platforms;
- 3. Secured network utilizing VPCs with inputs from the security team.
- Work with developers & QA team to institute a policy of Continuous Integration with Automated testing Architect, build and manage dashboards to provide visibility into delivery, production application functional, and performance status.
- Work with developers to institute systems, policies, and workflows which allow for a rollback of deployments.
- Triage release of applications/ Hotfixes to the production environment on a daily basis.
- Interface with developers and triage SQL queries that need to be executed in production environments.
- Maintain 24/7 on-call rotation to respond and support troubleshooting of issues in production.
- Assist the developers and on calls for other teams with a postmortem, follow up and review of issues affecting production availability.
- Scale Electum platform to handle millions of requests concurrently.
- Reduce Mean Time To Recovery (MTTR), enable High Availability and Disaster Recovery
PREREQUISITES:
- Bachelor’s degree in engineering, computer science, or related field, or equivalent work experience.
- Minimum of six years of hands-on experience in software development and DevOps, specifically managing AWS Infrastructures such as EC2s, RDS, Elastic cache, S3, IAM, cloud trail, and other services provided by AWS.
- At least 2 years of experience in building and owning serverless infrastructure.
- At least 2 years of scripting experience in Python (Preferable) and Shell Web Application Deployment Systems Continuous Integration tools (Ansible).
- Experience building a multi-region highly available auto-scaling infrastructure that optimizes performance and cost.
- Experience in automating the provisioning of AWS infrastructure as well as automation of routine maintenance tasks.
- Must have prior experience automating deployments to production and lower environments.
- Worked on providing solutions for major automation with scripts or infrastructure.
- Experience with APM tools such as DataDog and log management tools.
- Experience in designing and implementing Essential Functions System Architecture Process; establishing and enforcing Network Security Policy (AWS VPC, Security Group) & ACLs.
- Experience establishing and enforcing:
- 1. System monitoring tools and standards
- 2. Risk Assessment policies and standards
- 3. Escalation policies and standards
- Excellent DevOps engineering, team management, and collaboration skills.
- Advanced knowledge of programming languages such as Python and writing code and scripts.
- Experience or knowledge in - Application Performance Monitoring (APM), and prior experience as an open-source contributor will be preferred.
About Us -Celebal Technologies is a premier software services company in the field of Data Science, Big Data and Enterprise Cloud. Celebal Technologies helps you to discover the competitive advantage by employing intelligent data solutions using cutting-edge technology solutions that can bring massive value to your organization. The core offerings are around "Data to Intelligence", wherein we leverage data to extract intelligence and patterns thereby facilitating smarter and quicker decision making for clients. With Celebal Technologies, who understands the core value of modern analytics over the enterprise, we help the business in improving business intelligence and more data-driven in architecting solutions.
Key Responsibilities
• As a part of the DevOps team, you will be responsible for configuration, optimization, documentation, and support of the CI/CD components.
• Creating and managing build and release pipelines with Azure DevOps and Jenkins.
• Assist in planning and reviewing application architecture and design to promote an efficient deployment process.
• Troubleshoot server performance issues & handle the continuous integration system.
• Automate infrastructure provisioning using ARM Templates and Terraform.
• Monitor and Support deployment, Cloud-based and On-premises Infrastructure.
• Diagnose and develop root cause solutions for failures and performance issues in the production environment.
• Deploy and manage Infrastructure for production applications
• Configure security best practices for application and infrastructure
Essential Requirements
• Good hands-on experience with cloud platforms like Azure, AWS & GCP. (Preferably Azure)
• Strong knowledge of CI/CD principles.
• Strong work experience with CI/CD implementation tools like Azure DevOps, Team city, Octopus Deploy, AWS Code Deploy, and Jenkins.
• Experience of writing automation scripts with PowerShell, Bash, Python, etc.
• GitHub, JIRA, Confluence, and Continuous Integration (CI) system.
• Understanding of secure DevOps practices
Good to Have -
• Knowledge of scripting languages such as PowerShell, Bash
• Experience with project management and workflow tools such as Agile, Jira, Scrum/Kanban, etc.
• Experience with Build technologies and cloud services. (Jenkins, TeamCity, Azure DevOps, Bamboo, AWS Code Deploy)
• Strong communication skills and ability to explain protocol and processes with team and management.
• Must be able to handle multiple tasks and adapt to a constantly changing environment.
• Must have a good understanding of SDLC.
• Knowledge of Linux, Windows server, Monitoring tools, and Shell scripting.
• Self-motivated; demonstrating the ability to achieve in technologies with minimal supervision.
• Organized, flexible, and analytical ability to solve problems creatively.
Description
DevOps Engineer / SRE
- Understanding of maintenance of existing systems (Virtual machines), Linux stack
- Experience running, operating and maintainence of Kubernetes pods
- Strong Scripting skills
- Experience in AWS
- Knowledge of configuring/optimizing open source tools like Kafka, etc.
- Strong automation maintenance - ability to identify opportunities to speed up build and deploy process with strong validation and automation
- Optimizing and standardizing monitoring, alerting.
- Experience in Google cloud platform
- Experience/ Knowledge in Python will be an added advantage
- Experience on Monitoring Tools like Jenkins, Kubernetes ,Nagios,Terraform etc
Responsibilities
● Be a hands-on engineer, ensure frameworks/infrastructure built is well designed,
scalable & are of high quality.
● Build and/or operate platforms that are highly available, elastic, scalable, operable and
observable.
● Build/Adapt and implement tools that empower the TI AI engineering teams to
self-manage the infrastructure and services owned by them.
● You will identify, articulate, and lead various long-term tech vision, strategies,
cross-cutting initiatives and architecture redesigns.
● Design systems and make decisions that will keep pace with the rapid growth of TI AI.
● Document your work and decision-making processes, and lead presentations and
discussions in a way that is easy for others to understand.
● Available for on-call during emergencies to handle and resolve problems in a quick and
efficient manner.
Requirements
● 2+ years of Hands-on experience as a DevOps / Infrastructure engineer with AWS and
Kubernetes or similar infrastructure platforms. (preferably AWS)
● Hands-on with DevOps principles and practices ( Everything-as-a-code, CI/CD, Test
everything, proactive monitoring etc).
● Experience in building and operating distributed systems.
● Understanding of operating systems, virtualization, containerization and networks
preferable
● Hands-on coding on any of the languages like Python or GoLang.
● Familiarity with software engineering practices including unit testing, code reviews, and
design documentation.
● Strong debugging and problem-solving skills Curiosity about how things work and love to
share that knowledge with others.
Benefits :
● Work with a world class team working on a very forward looking problem
● Competitive Pay
● Flat hierarchy
● Health insurance for the family
Senior DevOps Engineer (8-12 yrs Exp)
Job Description:
We are looking for an experienced and enthusiastic DevOps Engineer. As our new DevOps
Engineer, you will be in charge of the specification and documentation of the new project
features. In addition, you will be developing new features and writing scripts for automation
using Java/BitBucket/Python/Bash.
Roles and Responsibilities:
• Deploy updates and fixes
• Utilize various open source technologies
• Need to have hands on experience on automation tools like Docker / Jenkins /
Puppet etc.
• Build independent web based tools, micro-services and solutions
• Write scripts and automation using Java/BitBucket/Python/Bash.
• Configure and manage data sources like MySQL, Mongo, Elastic search, Redis etc
• Understand how various systems work
• Manage code deployments, fixes, updates and related processes.
• Understand how IT operations are managed
• Work with CI and CD tools, and source control such as GIT and SVN.
• Experience with project management and workflow tools such as Agile, Redmine,
WorkFront, Scrum/Kanban/SAFe, etc.
• Build tools to reduce occurrences of errors and improve customer experience
• Develop software to integrate with internal back-end systems
• Perform root cause analysis for production errors
• Design procedures for system troubleshooting and maintenance
Requirements:
• More than six years of experience in a DevOps Engineer role (or similar role);
experience in software development and infrastructure development is a mandatory.
• Bachelor’s degree or higher in engineering or related field
• Proficiency in deploying and maintaining web applications
• Ability to construct and execute network, server, and application status monitoring
• Knowledge of software automation production systems, including code deployment
• Working knowledge of software development methodologies
• Previous experience with high-performance and high-availability open source web
technologies
• Strong experience with Linux-based infrastructures, Linux/Unix administration, and
AWS.
• Strong communication skills and ability to explain protocol and processes with team
and management.
• Solid team player.
Role Purpose:
As a DevOps , You should be strong in both the Dev and Ops part of DevOps. We are looking for someone who has a deep understanding of systems architecture, understands core CS concepts well, and is able to reason about system behaviour rather than merely working with the toolset of the day. We believe that only such a person will be able to set a compelling direction for the team and excite those around them.
If you are someone who fits the description above, you will find that the rewards are well worth the high bar. Being one of the early hires of the Bangalore office, you will have a significant impact on the culture and the team; you will work with a set of energetic and hungry peers who will challenge you, and you will have considerable international exposure and opportunity for impact across departments.
Responsibilities
- Deployment, management, and administration of web services in a public cloud environment
- Design and develop solutions for deploying highly secure, highly available, performant and scalable services in elastically provisioned environments
- Design and develop continuous integration and continuous deployment solutions from development through production
- Own all operational aspects of running web services including automation, monitoring and alerting, reliability and performance
- Have direct impact on running a business by thinking about innovative solutions to operational problems
- Drive solutions and communication for production impacting incidents
- Running technical projects and being responsible for project-level deliveries
- Partner well with engineering and business teams across continents
Required Qualifications
- Bachelor’s or advanced degree in Computer Science or closely related field
- 4 - 6 years professional experience in DevOps, with at least 1/2 years in Linux / Unix
- Very strong in core CS concepts around operating systems, networks, and systems architecture including web services
- Strong scripting experience in Python and Bash
- Deep experience administering, running and deploying AWS based services
- Solid experience with Terraform, Packer and Docker or their equivalents
- Knowledge of security protocols and certificate infrastructure.
- Strong debugging, troubleshooting, and problem solving skills
- Broad experience with cloud hosted applications including virtualization platforms, relational and non relational data stores, reverse proxies, and orchestration platforms
- Curiosity, continuous learning and drive to continually raise the bar
- Strong partnering and communication skills
Preferred Qualifications
- Past experience as a senior developer or application architect strongly preferred.
- Experience building continuous integration and continuous deployment pipelines
- Experience with Zookeeper, Consul, HAProxy, ELK-Stack, Kafka, PostgreSQL.
- Experience working with, and preferably designing, a system compliant to any security framework (PCI DSS, ISO 27000, HIPPA, SOC 2, ...)
- Experience with AWS orchestration services such as ECS and EKS.
- Experience working with AWS ML pipeline services like AWS Sagemak
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CICD
- Observability(Prometheus/New Relic/ELK)
- Cloud Platforms ( AWS/ Azure )
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services, infrastructure and working with stake owners to fix it.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and good understanding of the various linux kernel subsystems (memory, storage, network etc).
● Extensive experience in DNS, TCP/IP, UDP, GRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as a Code tools such as Terraform etc.. and Configuration Management Tools such as Chef, Puppet, Saltstack, Ansible.
● Expertise of Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker,
Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike,
Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platforms tech stacks like Hadoop, Hive, Presto etc is a plus
Your Role:
- Serve as a primary point responsible for the overall health, performance, and capacity of one or more of our Internet-facing services
- Gain deep knowledge of our complex applications
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment
- Should be able to lead a team of smart engineers
- Should be able to strategically guide the team to greater automation adoption
Must Have:
- Experience Building/managing DevOps/SRE teams
- Strong in troubleshooting/debugging Systems, Network and Applications
- Strong in Unix/Linux operating systems and Networking
- Working knowledge of Open source technologies in Monitoring, Deployment and incident management
Good to Have:
- Minimum 3+ years of team management experience
- Experience in Containers and orchestration layers like Kubernetes, Mesos/Marathon
- Proven experience in programming & diagnostics in any languages like Go, Python, Java
- Experience in NoSQL/SQL technologies like Cassandra/MySQL/CouchBase etc.
- Experience in BigData technologies like Kafka/Hadoop/Airflow/Spark
- Is a die-hard sports fan

Skill: Python, Docker or Ansible , AWS
➢ Experience Building a multi-region highly available auto-scaling infrastructure that optimizes
performance and cost. plan for future infrastructure as well as Maintain & optimize existing
infrastructure.
➢ Conceptualize, architect and build automated deployment pipelines in a CI/CD environment like
Jenkins.
➢ Conceptualize, architect and build a containerized infrastructure using Docker,Mesosphere or
similar SaaS platforms.
Work with developers to institute systems, policies and workflows which allow for rollback of
deployments Triage release of applications to production environment on a daily basis.
➢ Interface with developers and triage SQL queries that need to be executed inproduction
environments.
➢ Maintain 24/7 on-call rotation to respond and support troubleshooting of issues in production.
➢ Assist the developers and on calls for other teams with post mortem, follow up and review of
issues affecting production availability.
➢ Establishing and enforcing systems monitoring tools and standards
➢ Establishing and enforcing Risk Assessment policies and standards
➢ Establishing and enforcing Escalation policies and standards

