|
Job Title: |
Senior Cloud Infrastructure Engineer (AWS) |
||
|
Department & Team |
Technology |
Location: |
India /UK / Ukraine |
|
Reporting To: |
Infrastructure Services Manager |
|
Role Purpose: |
|
The purpose of the role is to ensure high systems availability across a multi-cloud environment, enabling the business to continue meeting its objectives.
This role will be mostly AWS / Linux focused but will include a requirement to understand comparative solutions in Azure.
Desire to maintain full hands-on status but to add Team Lead responsibilities in future
Client’s cloud strategy is based around a dual vendor solutioning model, utilising AWS and Azure services. This enables us to access more technologies and helps mitigate risks across our infrastructure.
The Infrastructure Services Team is responsible for the delivery and support of all infrastructure used by Client twenty-four hours a day, seven days a week. The team’s primary function is to install, maintain, and implement all infrastructure-based systems, both On Premise and Cloud Hosted. The Infrastructure Services group already consists of three teams:
1. Network Services Team – Responsible for IP Network and its associated components 2. Platform Services Team – Responsible for Server and Storage systems 3. Database Services Team – Responsible for all Databases
This role will report directly into the Infrastructure Services Manager and will have responsibility for the day to day running of the multi-cloud environment, as well as playing a key part in designing best practise solutions. It will enable the Client business to achieve its stated objectives by playing a key role in the Infrastructure Services Team to achieve world class benchmarks of customer service and support.
|
|
Responsibilities: |
|
Operations · Deliver end to end technical and user support across all platforms (On-premise, Azure, AWS) · Day to day, fully hands-on OS management responsibilities (Windows and Linux operating systems) · Ensure robust server patching schedules are in place and meticulously followed to help reduce security related incidents. · Contribute to continuous improvement efforts around cost optimisation, security enhancement, performance optimisation, operational efficiency and innovation. · Take an ownership role in delivering technical projects, ensuring best practise methods are followed. · Design and deliver solutions around the concept of “Planning for Failure”. Ensure all solutions are deployed to withstand system / AZ failure. · Work closely with Cloud Architects / Infrastructure Services Manager to identify and eliminate “waste” across cloud platforms. · Assist several internal DevOps teams with day to day running of pipeline management and drive standardisation where possible. · Ensure all Client data in all forms are backed up in a cost-efficient way. · Use the appropriate monitoring tools to ensure all cloud / on-premise services are continuously monitored. · Drive utilisation of most efficient methods of resource deployment (Terraform, CloudFormation, Bootstrap) · Drive the adoption, across the business, of serverless / open source / cloud native technologies where applicable. · Ensure system documentation remains up to date and designed according to AWS/Azure best practise templates. · Participate in detailed architectural discussions, calling on internal/external subject matter experts as needed, to ensure solutions are designed for successful deployment. · Take part in regular discussions with business executives to translate their needs into technical and operational plans. · Engaging with vendors regularly in terms of verifying solutions and troubleshooting issues. · Designing and delivering technology workshops to other departments in the business. · Takes initiatives for improvement of service delivery. · Ensure that Client delivers a service that resonates with customer’s expectations, which sets Client apart from its competitors. · Help design necessary infrastructure and processes to support the recovery of critical technology and systems in line with contingency plans for the business. · Continually assess working practices and review these with a view to improving quality and reducing costs. · Champions the new technology case and ensure new technologies are investigated and proposals put forward regarding suitability and benefit. · Motivate and inspire the rest of the infrastructure team and undertake necessary steps to raise competence and capability as required. · Help develop a culture of ownership and quality throughout the Infrastructure Services team.
|
|
Skills & Experience: |
|
· AWS Certified Solutions Architect – Professional - REQUIRED · Microsoft Azure Fundamentals AZ-900 – REQUIRED AS MINIMUM AZURE CERT · Red Hat Certified Engineer (RHCE ) - REQUIRED · Must be able to demonstrate working knowledge of designing, implementing and maintaining best practise AWS solutions. (To lesser extend Azure) · Proven examples of ownership of large AWS project implementations in Enterprise settings. · Experience managing the monitoring of infrastructure / applications using tools including CloudWatch, Solarwinds, New Relic, etc. · Must have practical working knowledge of driving cost optimisation, security enhancement and performance optimisation. · Solid understanding and experience of transitioning IaaS solutions to serverless technology · Must have working production knowledge of deploying infrastructure as code using Terraform. · Need to be able to demonstrate security best-practise when designing solutions in AWS. · Working knowledge around optimising network traffic performance an delivering high availability while keeping a check on costs. · Working experience of ‘On Premise to Cloud’ migrations · Experience of Data Centre technology infrastructure development and management · Must have experience working in a DevOps environment · Good working knowledge around WAN connectivity and how this interacts with the various entry point options into AWS, Azure, etc. · Working knowledge of Server and Storage Devices · Working knowledge of MySQL and SQL Server / Cloud native databases (RDS / Aurora) · Experience of Carrier Grade Networking - On Prem and Cloud · Experience in virtualisation technologies · Experience in ITIL and Project management · Providing senior support to the Service Delivery team. · Good understanding of new and emerging technologies · Excellent presentation skills to both an internal and external audience · The ability to share your specific expertise to the rest of the Technology group · Experience with MVNO or Network Operations background from within the Telecoms industry. (Optional) · Working knowledge of one or more European languages (Optional)
|
|
Behavioural Fit: |
|
· Professional appearance and manner · High personal drive; results oriented; makes things happen; “can do attitude” · Can work and adapt within a highly dynamic and growing environment · Team Player; effective at building close working relationships with others · Effectively manages diversity within the workplace · Strong focus on service delivery and the needs and satisfaction of internal clients · Able to see issues from a global, regional and corporate perspective · Able to effectively plan and manage large projects · Excellent communication skills and interpersonal skills at all levels · Strong analytical, presentation and training skills · Innovative and creative · Demonstrates technical leadership · Visionary and strategic view of technology enablers (creative and innovative) · High verbal and written communication ability, able to influence effectively at all levels · Possesses technical expertise and knowledge to lead by example and input into technical debates · Depth and breadth of experience in infrastructure technologies · Enterprise mentality and global mindset · Sense of humour
|
|
Role Key Performance Indicators: |
|
· Design and deliver repeatable, best in class, cloud solutions. · Pro-actively monitor service quality and take action to scale operational services, in line with business growth. · Generate operating efficiencies, to be agreed with Infrastructure Services Manager. · Establish a “best in sector” level of operational service delivery and insight. · Help create an effective team. |

About Jobdost
About
Connect with the team
Company social profiles
Similar jobs
Key Qualifications :
- At least 2 years of hands-on experience with cloud infrastructure on AWS or GCP
- Exposure to configuration management and orchestration tools at scale (e.g. Terraform, Ansible, Packer)
- Knowledge in DevOps tools (e.g. Jenkins, Groovy, and Gradle)
- Familiarity with monitoring and alerting tools(e.g. CloudWatch, ELK stack, Prometheus)
- Proven ability to work independently or as an integral member of a team
Preferable Skills :
- Familiarity with standard IT security practices such as encryption, credentials and key management
- Proven ability to acquire various coding languages (Java, Python- ) to support DevOps operation and cloud transformation
- Familiarity in web standards (e.g. REST APIs, web security mechanisms)
- Multi-cloud management experience with GCP / Azure
- Experience in performance tuning, services outage management and troubleshooting
Preferred Education & Experience: •
Bachelor’s or master’s degree in Computer Engineering,
Computer Science, Computer Applications, Mathematics, Statistics or related technical field or
equivalent practical experience. Relevant experience of at least 3 years in lieu of above if from a different stream of education.
• Well-versed in DevOps principals & practices and hands-on DevOps
tool-chain integration experience: Release Orchestration & Automation, Source Code & Build
Management, Code Quality & Security Management, Behavior Driven Development, Test Driven
Development, Continuous Integration, Continuous Delivery, Continuous Deployment, and
Operational Monitoring & Management; extra points if you can demonstrate your knowledge with
working examples.
• Hands-on experience with demonstrable working experience with DevOps tools
and platforms viz., Slack, Jira, GIT, Jenkins, Code Quality & Security Plugins, Maven, Artifactory,
Terraform, Ansible/Chef/Puppet, Spinnaker, Tekton, StackStorm, Prometheus, Grafana, ELK,
PagerDuty, VictorOps, etc.
• Well-versed in Virtualization & Containerization; must demonstrate
experience in technologies such as Kubernetes, Istio, Docker, OpenShift, Anthos, Oracle VirtualBox,
Vagrant, etc.
• Well-versed in AWS and/or Azure or and/or Google Cloud; must demonstrate
experience in at least FIVE (5) services offered under AWS and/or Azure or and/or Google Cloud in
any categories: Compute or Storage, Database, Networking & Content Delivery, Management &
Governance, Analytics, Security, Identity, & Compliance (or) equivalent demonstratable Cloud
Platform experience.
• Well-versed with demonstrable working experience with API Management,
API Gateway, Service Mesh, Identity & Access Management, Data Protection & Encryption, tools &
platforms.
• Hands-on programming experience in either core Java and/or Python and/or JavaScript
and/or Scala; freshers passing out of college or lateral movers into IT must be able to code in
languages they have studied.
• Well-versed with Storage, Networks and Storage Networking basics
which will enable you to work in a Cloud environment.
• Well-versed with Network, Data, and
Application Security basics which will enable you to work in a Cloud as well as Business
Applications / API services environment.
• Extra points if you are certified in AWS and/or Azure
and/or Google Cloud.
Required Experience: 5+ Years
Job Location: Remote/Pune
Objectives :
- Building and setting up new development tools and infrastructure
- Working on ways to automate and improve development and release processes
- Testing code written by others and analyzing results
- Ensuring that systems are safe and secure against cybersecurity threats
- Identifying technical problems and developing software updates and ‘fixes’
- Working with software developers and software engineers to ensure that development follows established processes and works as intended
- Planning out projects and being involved in project management decisions
Daily and Monthly Responsibilities :
- Deploy updates and fixes
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Skills and Qualifications :
- Degree in Computer Science or Software Engineering or BSc in Computer Science, Engineering or relevant field
- 3+ years of experience as a DevOps Engineer or similar software engineering role
- Proficient with git and git workflows
- Good logical skills and knowledge of programming concepts(OOPS,Data Structures)
- Working knowledge of databases and SQL
- Problem-solving attitude
- Collaborative team spirit
- Hands on experience in AWS provisioning of AWS services like EC2, S3,EBS, AMI, VPC, ELB, RDS, Auto scaling groups, Cloud Formation.
- Good experience on Build and release process and extensively involved in the CICD using
Jenkins
- Experienced on configuration management tools like Ansible.
- Designing, implementing and supporting fully automated Jenkins CI/CD
- Extensively worked on Jenkins for continuous Integration and for end to end Automation for all Builds and Deployments.
- Proficient with Docker based container deployments to create shelf environments for dev teams and containerization of environment delivery for releases.
- Experience working on Docker hub, creating Docker images and handling multiple images primarily for middleware installations and domain configuration.
- Good knowledge in version control system in Git and GitHub.
- Good experience in build tools
- Implemented CI/CD pipeline using Jenkins, Ansible, Docker, Kubernetes ,YAML and Manifest
A network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their Cloud Offering, and Capitalize on Cloud. We have our own IOT/AI platform and we provide professional services on that platform to build custom clouds for their IOT devices. We also build mobile apps, run 24x7 devops/site reliability engineering for our clients.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting/setting up of the Linux environment, who can write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy..
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We dont believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15 days notice period.
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CICD
- Observability(Prometheus/New Relic/ELK)
- Cloud Platforms ( AWS/ Azure )
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services, infrastructure and working with stake owners to fix it.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and good understanding of the various linux kernel subsystems (memory, storage, network etc).
● Extensive experience in DNS, TCP/IP, UDP, GRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as a Code tools such as Terraform etc.. and Configuration Management Tools such as Chef, Puppet, Saltstack, Ansible.
● Expertise of Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker,
Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike,
Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platforms tech stacks like Hadoop, Hive, Presto etc is a plus









