• Bachelor or Master Degree in Computer Science, Software Engineering from a reputed
University.
• 5 - 8 Years of experience in building scalable, secure and compliant systems.
• More than 2 years of experience in working with GCP deployment for millions of daily visitors
• 5+ years hosting experience in a large heavy-traffic environment
• 5+ years production application support experience in a high uptime environment
• Software development and monitoring knowledge with Automated builds
• Technology:
o Cloud: AWS or Google Cloud
o Source Control: Gitlab or Bitbucket or Github
o Container Concepts: Docker, Microservices
o Continuous Integration: Jenkins, Bamboos
o Infrastructure Automation: Puppet, Chef or Ansible
o Deployment Automation: Jenkins, VSTS or Octopus Deploy
o Orchestration: Kubernets, Mesos, Swarm
o Automation: Node JS or Python
o Linux environment network administration, DNS, firewall and security management
• Ability to be adapt to the startup culture, handle multiple competing priorities, meet
deadlines and troubleshoot problems.
About Modistabox
Similar jobs
About us:
HappyFox is a software-as-a-service (SaaS) support platform. We offer an enterprise-grade help desk ticketing system and intuitively designed live chat software.
We serve over 12,000 companies in 70+ countries. HappyFox is used by companies that span across education, media, e-commerce, retail, information technology, manufacturing, non-profit, government and many other verticals that have an internal or external support function.
To know more, Visit! - https://www.happyfox.com/
Responsibilities:
- Build and scale production infrastructure in AWS for the HappyFox platform and its products.
- Research, Build/Implement systems, services and tooling to improve uptime, reliability and maintainability of our backend infrastructure. And to meet our internal SLOs and customer-facing SLAs.
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts or building infrastructure tools using Python/Ruby/Bash/Golang
- Implement consistent observability, deployment and IaC setups
- Patch production systems to fix security/performance issues
- Actively respond to escalations/incidents in the production environment from customers or the support team
- Mentor other Infrastructure engineers, review their work and continuously ship improvements to production infrastructure.
- Build and manage development infrastructure, and CI/CD pipelines for our teams to ship & test code faster.
- Participate in infrastructure security audits
Requirements:
- At least 5 years of experience in handling/building Production environments in AWS.
- At least 2 years of programming experience in building API/backend services for customer-facing applications in production.
- Demonstrable knowledge of TCP/IP, HTTP and DNS fundamentals.
- Experience in deploying and managing production Python/NodeJS/Golang applications to AWS EC2, ECS or EKS.
- Proficient in containerised environments such as Docker, Docker Compose, Kubernetes
- Proficient in managing/patching servers with Unix-based operating systems like Ubuntu Linux.
- Proficient in writing automation scripts using any scripting language such as Python, Ruby, Bash etc.,
- Experience in setting up and managing test/staging environments, and CI/CD pipelines.
- Experience in IaC tools such as Terraform or AWS CDK
- Passion for making systems reliable, maintainable, scalable and secure.
- Excellent verbal and written communication skills to address, escalate and express technical ideas clearly
- Bonus points – if you have experience with Nginx, Postgres, Redis, and Mongo systems in production.
Required qualifications and must have skills
-
5+ years of experience managing a team of 5+ infrastructure software engineers
-
5+ years of experience in building and scaling technical infrastructure
-
5+ years of experience in delivering software
-
Experience leading by influence in multi-team, cross-functional projects
-
Demonstrated experience recruiting and managing technical teams, including performance management and managing engineers
-
Experience with cloud service providers such as AWS, GCP, or Azure
-
Experience with containerization technologies such as Kubernetes and Docker
Nice to have Skills
-
Experience with Hadoop, Hive and Presto
-
Application/infrastructure benchmarking and optimization
-
Familiarity with modern CI/CD practices
-
Familiarity with reliability best practices
Job Brief:
We are looking for candidates that have experience in development and have performed CI/CD based projects. Should have a good hands-on Jenkins Master-Slave architecture, used AWS native services like CodeCommit, CodeBuild, CodeDeploy and CodePipeline. Should have experience in setting up cross platform CI/CD pipelines which can be across different cloud platforms or on-premise and cloud platform.
Job Location:
Pune.
Job Description:
- Hands on with AWS (Amazon Web Services) Cloud with DevOps services and CloudFormation.
- Experience interacting with customer.
- Excellent communication.
- Hands-on in creating and managing Jenkins job, Groovy scripting.
- Experience in setting up Cloud Agnostic and Cloud Native CI/CD Pipelines.
- Experience in Maven.
- Experience in scripting languages like Bash, Powershell, Python.
- Experience in automation tools like Terraform, Ansible, Chef, Puppet.
- Excellent troubleshooting skills.
- Experience in Docker and Kuberneties with creating docker files.
- Hands on with version control systems like GitHub, Gitlab, TFS, BitBucket, etc.
Job description
The role requires you to design development pipelines from the ground up, Creation of Docker Files, design and operate highly available systems in AWS Cloud environments. Also involves Configuration Management, Web Services Architectures, DevOps Implementation, Database management, Backups, and Monitoring.
Key responsibility area
- Ensure reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creation of Bash/Python scripts for automation
- Performing root cause analysis for production errors.
Requirement
- 2 years experience as Team Lead.
- Good Command on kubernetes.
- Proficient in Linux Commands line and troubleshooting.
- Proficient in AWS Services. Deployment, Monitoring and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as cloud) such as terraform, AWS cloudformation etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as nginx, apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with Logging, Monitoring, and Alerting tools like ELK(Elastic-search, Logstash, Kibana), Nagios. Graylog, splunk Prometheus, Grafana is a plus.
Must Have:
Linux, CI/CD(Jenkin), AWS, Scripting(Bash,shell Python, Go), Ngnix, Docker.
Good to have
Configuration Management(Ansible or similar tool), Logging tool( ELK or similar), Monitoring tool(Ngios or similar), IaC(Terraform, cloudformation).
- Design, Develop, deploy, and run operations of infrastructure services in the Acqueon AWS cloud environment
- Manage uptime of Infra & SaaS Application
- Implement application performance monitoring to ensure platform uptime and performance
- Building scripts for operational automation and incident response
- Handle schedule and processes surrounding cloud application deployment
- Define, measure, and meet key operational metrics including performance, incidents and chronic problems, capacity, and availability
- Lead the deployment, monitoring, maintenance, and support of operating systems (Windows, Linux)
- Build out lifecycle processes to mitigate risk and ensure platforms remain current, in accordance with industry standard methodologies
- Run incident resolution within the environment, facilitating teamwork with other departments as required
- Automate the deployment of new software to cloud environment in coordination with DevOps engineers
- Work closely with Presales, understand customer requirement to deploy in Production
- Lead and mentor a team of operations engineers
- Drive the strategy to evolve and modernize existing tools and processes to enable highly secure and scalable operations
- AWS infrastructure management, provisioning, cost management and planning
- Prepare RCA incident reports for internal and external customers
- Participate in product engineering meetings to ensure product features and patches comply with cloud deployment standards
- Troubleshoot and analyse performance issues and customer reported incidents working to restore services within the SLA
- Monthly SLA Performance reports
As a Cloud Operations Manager in Acqueon you will need….
- 8 years’ progressive experience managing IT infrastructure and global cloud environments such as AWS, GCP (must)
- 3-5 years management experience leading a Cloud Operations / Site Reliability / Production Engineering team working with globally distributed teams in a fast-paced environment
- 3-5 years’ experience in IAC (Terraform, K8)
- 3+ years end-to-end incident management experience
- Experience with communicating and presenting to all stakeholders
- Experience with Cloud Security compliance and audits
- Detail-oriented. The ideal candidate is one who naturally digs as deep as they need to understand the why
- Knowledge on GCP will be added advantage
- Manage and monitor customer instances for uptime and reliability
- Staff scheduling and planning to ensure 24x7x365 coverage for cloud operations
- Customer facing, excellent communication skills, team management, troubleshooting
About Us
We have grown over 1400% in revenues in the last year.
Interface.ai provides an Intelligent Virtual Assistant (IVA) to FIs to automate calls and customer inquiries across multiple channels and engage their customers with financial insights and upsell/cross-sell.
Our IVA is transforming financial institutions’ call centers from a cost to a revenue center.
Our core technology is built 100% in-house with several breakthroughs in Natural Language Understanding. Our parser is built based on zero-shot learning that helps us to launch industry-specific IVA that can achieve over 90% accuracy on Day-1.
We are 45 people strong with employees spread across India and US locations. Many of them come from ML teams at Apple, Microsoft, and Salesforce in the US along with enterprise architects with over 20+ years of experience building large-scale systems. Our India team consists of people from ISB, IIMs, and many who have been previously part of early-stage startups.
We are a fully remote team.
Founders come from Banking and Enterprise Technology backgrounds with previous experience scaling companies from scratch to $50M+ in revenues.
As a Site Reliability Engineer you will be in charge of:
- Designing, analyzing and troubleshooting large-scale distributed systems
- Engaging in cross-functional team discussions on design, deployment, operation, and maintenance, in a fast-moving, collaborative set up
- Building automation scripts to validate the stability, scalability, and reliability of interface.ai’s products & services as well as enhance interface.ai’s employees’ productivity
- Debugging and optimizing code and automating routine tasks
- Troubleshoot and diagnose issues (hardware or software), propose and implement solutions to ensure they occur with reduced frequency
- Perform the periodic on-call duty to handle security, availability, and reliability of interface.ai’s products
- You will follow and write good code and solid engineering practices
Requirements
You can be a great fit if you are :
- Extremely self motivated
- Ability to learn quickly
- Growth Mindset (read this if you don't know what it means - https://www.amazon.com/Mindset-Psychology-Carol-S-Dweck/dp/0345472322" target="_blank">link)
- Emotional Maturity (read this if you don't know what it means - https://medium.com/@krisgage/15-signs-of-emotional-maturity-38b1a2ab9766" target="_blank">link)
- Passionate about the possibilities at the intersection of AI + Banking
- Worked in a startup of 5 to 30 employees
- Developer with a strong interest in systems Design. You will be building, maintaining, and scaling our cloud infrastructure through software tooling and automation.
- 4-8 years of industry experience developing and troubleshooting large-scale infrastructure on the cloud
- Have a solid understanding of system availability, latency, and performance
- Strong programming skills in at least one major programming language and the ability to learn new languages as needed
- Strong System/network debugging skills
- Experience with management/automation tools such as Terraform/Puppet/Chef/SALT
- Experience with setting up production-level monitoring and telemetry
- Expertise in Container management & AWS
- Experience with kubernetes is a plus
- Experience building CI/CD pipelines
- Experience working with Web sockets, Redis, Postgres, Elastic search, Logstash
- Experience working in an agile team environment and proficient understanding of code versioning tools, such as Git.
- Ability to effectively articulate technical challenges and solutions.
- Proactive outlook for ways to make our systems more reliable
Responsibilities
- Building and maintenance of resilient and scalable production infrastructure
- Improvement of monitoring systems
- Creation and support of development automation processes (CI / CD)
- Participation in infrastructure development
- Detection of problems in architecture and proposing of solutions for solving them
- Creation of tasks for system improvements for system scalability, performance and monitoring
- Analysis of product requirements in the aspect of devops
- Managing a team of DevOps, control of task deliveries
- Incident analysis and fixing
Technology stack
Linux, Bash, Salt/Ansible, LXC, libvirt, IPsec, VXLAN, Open vSwitch, OpenVPN, OSPF, BIRD, Cisco NX-OS, Multicast, PIM, LVM, software RAID, LUKS, PostgreSQL, nginx, haproxy, Prometheus, Grafana, Zabbix, GitLab, Capistrano
Skills and Experience
- Understanding of the distributed systems principles
- Understanding of principles for building a resistant network infrastructure
- Experience of Ubuntu Linux administration (Debian-like will be a plus)
- Strong knowledge of Bash
- Experience of working with LXC-containers
- Understanding and experience with infrastructure as a code approach
- Experience of development idempotent Ansible roles
- Experience with relational databases (PostgeSQL), ability to create simple SQL queries
- Experience with git
- Experience with monitoring and metric collect systems (Prometheus, Grafana, Zabbix)
- Understanding of dynamic routing (OSPF)
Preferred experience
- Experience of working with highload zero-downtown environments
- Experience of coding on Python
- Experience of working with IPsec, VXLAN, Open vSwitch
- Knowledge and experience of working with network equipment Cisco
- Experience of working with Cisco NX-OS
- Knowledge of principles of multicast protocols IGMP, PIM
- Experience of setting multicast on Cisco equipment
- Experience of working with Solarflare Onload
- Experience administering Atlassian products
Minimum 4 years exp
Skillsets:
- Build automation/CI: Jenkins
- Secure repositories: Artifactory, Nexus
- Build technologies: Maven, Gradle
- Development Languages: Python, Java, C#, Node, Angular, React/Redux
- SCM systems: Git, Github, Bitbucket
- Code Quality: Fisheye, Crucible, SonarQube
- Configuration Management: Packer, Ansible, Puppet, Chef
- Deployment: uDeploy, XLDeploy
- Containerization: Kubernetes, Docker, PCF, OpenShift
- Automation frameworks: Selenium, TestNG, Robot
- Work Management: JAMA, Jira
- Strong problem solving skills, Good verbal and written communication skills
- Good knowledge of Linux environment: RedHat etc.
- Good in shell scripting
- Good to have Cloud Technology : AWS, GCP and Azure
Opening for a Java Developer with Devops experience
Experience required: 5 yrs to 10 yrs
Essential Required Skills:
Familiarity with Version Control such as GitHub, BitBucket
- Java programmer(Liferay, Alfresco will add plus point)
- AWS
- OPs(ansible, apache, python, terraform)
- Effective communication skills
- An analytical bent of mind and problem-solving aptitude
- Good time management skills
- Curiosity for learning
- Patience
Roles & Responsibilities:
- Candidate with good hand on exposure on AWS, Cloud, Devops, Ansible, Docker, Jekins.
- Strong proficiency in Linux, Open Source, Web based and Cloud based environments (ability to use open source technologies and tools)
- Strong scripting and automation (bash, Perl, common Linux utils), strong grasp of automation tools a plus.
- Strong debugging skills (OS, scripting, Web based technologies), SQL, Java and Database concepts are a plus
- Apache, nginx, git, svn, GNU tools
- Must have exposure on Grep, awk, sed, Git, svn
- Scripting (bash, python)
- API related skills (REST, and any other like google, aws, atlassian)
- Web based technology
- Strong Unix Skills
- Java programmer, Coding (Springboot, Microservices, Liferay, Alfresco will add plus point)
- Proficient in AWS
- Ops (ansible, apache, python, terraform)
Benefits
- Cash Rewards & Recognition on Monthly Basis
- Work-Life Balance (Flexible Working Hours)
- Five-Day Work Week