- JD: • 10+ years of overall industry experience
• 5+ years of cloud experience
• 2+ years of architect experience
• Varied background preferred between systems and development
o Experience working with applications, not pure infra experience
• Azure experience – strong background using Azure for application migrations
• Terraform experience – should mention automation technologies in job experience
• Hands on experience delivering in the cloud
• Must have job experience designing solutions for customers
• IaaS cloud architect: workload migrations to AWS and/or Azure
• Security architecture considerations experience
• CI/CD experience
• Proven track record of application migrations.
Similar jobs
Responsibilities
Provisioning and de-provisioning AWS accounts for internal customers
Work alongside systems and development teams to support the transition and operation of client websites/applications in and out of AWS.
Deploying, managing, and operating AWS environments
Identifying appropriate use of AWS operational best practices
Estimating AWS costs and identifying operational cost control mechanisms
Keep technical documentation up to date
Proactively keep up to date on AWS services and developments
Create (where appropriate) automation, in order to streamline provisioning and de-provisioning processes
Lead certain data/service migration projects
Job Requirements
Experience provisioning, operating, and maintaining systems running on AWS
Experience with Azure/AWS.
Capabilities to provide AWS operations and deployment guidance and best practices throughout the lifecycle of a project
Experience with application/data migration to/from AWS
Experience with NGINX and the HTTP protocol.
Experience with configuration and management software such as Git
Strong analytical and problem-solving skills
Deployment experience using common AWS technologies like VPC, regionally distributed EC2 instances, Docker, and more.
Ability to work in a collaborative environment
Detail-oriented, strong work ethic and high standard of excellence
A fast learner and achiever who sets high personal goals
Must be able to work on multiple projects and consistently meet project deadlines
Now, more than ever, the Toast team is committed to our customers. We’re taking steps to help restaurants navigate these unprecedented times with technology, resources, and community. Our focus is on building a restaurant platform that helps restaurants adapt, take control, and get back to what they do best: building the businesses they love. And because our technology is purpose-built for restaurants by restaurant people, restaurants can trust that we’ll deliver on their needs for today while investing in experiences that will power their restaurant of the future.
At Toast, our Site Reliability Engineers (SREs) are responsible for keeping all customer-facing services and other Toast production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound software engineering principles, operational discipline, and mature automation to our environments and our codebase. Our decisions are based on instrumentation and continuous observability, as well as predictions and capacity planning.
About this roll* (Responsibilities)
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplift
- Balance feature development speed and reliability with well-defined service level objectives
Troubleshooting and Supporting Escalations:
- Diagnose performance bottlenecks and implement optimizations across infrastructure, databases, web, and mobile applications
- Implement strategies to increase system reliability and performance through on-call rotation and process optimization
- Perform and run blameless RCAs on incidents and outages aggressively, looking for answers that will prevent the incident from ever happening again
Do you have the right ingredients? (Requirements)
- Extensive industry experience with 7+ years in SRE and/or DevOps roles
- Polyglot technologist/generalist with a thirst for learning
- Deep understanding of cloud and microservice architecture and the JVM
- Experience with tools such as APM, Terraform, Ansible, GitHub, Jenkins, and Docker
- Experience developing software or software projects in at least four languages, ideally including two of Go, Python, and Java
- Experience with cloud computing technologies (AWS cloud provider preferred)
Bread puns are encouraged but not required
Roles and Responsibilities:
• Gather and analyse cloud infrastructure requirements
• Automating system tasks and infrastructure using a scripting language (Shell/Python/Ruby preferred), with configuration management tools (Ansible/Puppet/Chef), service registry and discovery tools (Consul and Vault, etc.), infrastructure orchestration tools (Terraform, CloudFormation), and automated imaging tools (Packer)
• Support existing infrastructure, analyse problem areas and come up with solutions
• An eye for monitoring – the candidate should be able to look at complex infrastructure and figure out what to monitor and how.
• Work with the Engineering team on infrastructure/network automation needs.
• Deploy infrastructure as code and automate as much as possible
• Manage a team of DevOps engineers
Desired Profile:
• Understanding of provisioning of Bare Metal and Virtual Machines
• Working knowledge of Configuration management tools like Ansible/ Chef/ Puppet, Redfish.
• Experience in scripting languages like Ruby/ Python/ Shell Scripting
• Working knowledge of IP networking, VPNs, DNS, load balancing, firewalling & IPS concepts
• Strong Linux/Unix administration skills.
• Self-starter who can implement with minimal guidance
• Hands-on experience setting up CI/CD from scratch in Jenkins
• Experience with Managing K8s infrastructure
4-6 years of application development with design, development, implementation, and support experience, including the following:
o C#
o JavaScript
o HTML
o SQL
o Messaging/RabbitMQ
o Asynchronous communication patterns
Experience with Visual Studio and Git
A working understanding of build and release automation, preferably with Azure DevOps
Excellent understanding of object-oriented concepts and the .NET Framework
Experience in creating reusable libraries in C#
Ability to troubleshoot and isolate/solve complex bugs, connectivity issues, or OS-related issues
Ability to write complex SQL queries and stored procedures in Oracle and/or MS SQL
Proven ability to use design patterns to accomplish scalable architecture
Understanding of event-driven architecture
Experience with message brokers such as RabbitMQ
Experience in the development of REST APIs
Understanding of basic steps of an Agile SDLC
Excellent communication (both written and verbal) and interpersonal skills
Demonstrated accountability and ownership of assigned tasks
Demonstrated leadership and ability to work as a leader on large and complex tasks
DevOps Architect
Experience: 10-12+ years of relevant DevOps experience
Locations: Bangalore, Chennai, Pune, Hyderabad, Jaipur.
Qualification:
• Bachelor's or advanced degree in Computer Science, Software Engineering, or equivalent is required.
• Certifications in specific areas are desired
Technical Skillset (skill - proficiency level):
- Build tools (Ant or Maven) - Expert
- CI/CD tool (Jenkins or GitHub CI/CD) - Expert
- Cloud DevOps (AWS CodeBuild, CodeDeploy, CodePipeline, etc.) or Azure DevOps - Expert
- Infrastructure As Code (Terraform, Helm charts etc.) - Expert
- Containerization (Docker, Docker Registry) - Expert
- Scripting (Linux) - Expert
- Cluster deployment (Kubernetes) & maintenance - Expert
- Programming (Java) - Intermediate
- Application types for DevOps (streaming such as Spark and Kafka, big data such as Hadoop, etc.) - Expert
- Artifactory (JFrog) - Expert
- Monitoring & Reporting (Prometheus, Grafana, PagerDuty etc.) - Expert
- Ansible, MySQL, PostgreSQL - Intermediate
• Source Control (like Git, Bitbucket, SVN, VSTS, etc.)
• Continuous Integration (like Jenkins, Bamboo, VSTS )
• Infrastructure Automation (like Puppet, Chef, Ansible)
• Deployment Automation & Orchestration (like Jenkins, VSTS, Octopus Deploy)
• Container Concepts (Docker)
• Orchestration (Kubernetes, Mesos, Swarm)
• Cloud (like AWS, Azure, Google Cloud, OpenStack)
Roles and Responsibilities
• DevOps architect should automate the process with proper tools.
• Developing appropriate DevOps channels throughout the organization.
• Evaluating, implementing and streamlining DevOps practices.
• Establishing a continuous build environment to accelerate software deployment and development processes.
• Engineering general and effective processes.
• Helping operations and development teams solve their problems.
• Supervising, examining, and handling technical operations.
• Providing DevOps processes and operations.
• Ability to lead teams.
• Must possess excellent automation skills and the ability to drive initiatives to automate processes.
• Building strong cross-functional leadership skills and working together with the operations and engineering teams to make sure that systems are scalable and secure.
• Excellent knowledge of software development and software testing methodologies along with configuration management practices in Unix and Linux-based environment.
• Possess sound knowledge of cloud-based environments.
• Experience in handling automated deployment CI/CD tools.
• Must possess excellent knowledge of infrastructure automation tools (Ansible, Chef, and Puppet).
• Hands-on experience working with Amazon Web Services (AWS).
• Must have strong expertise in operating Linux/Unix environments and scripting languages like Python, Perl, and Shell.
• Ability to review deployment and delivery pipelines, i.e., implement initiatives to minimize chances of failure, identify bottlenecks, and troubleshoot issues.
• Previous experience in implementing continuous delivery and DevOps solutions.
• Experience in designing and building solutions to move data and process it.
• Must possess expertise in any of the coding languages depending on the nature of the job.
• Experience with containers and container orchestration tools (AKS, EKS, OpenShift, Kubernetes, etc)
• Experience with version control systems is a must (Git is an advantage)
• Belief in "Infrastructure as Code" (IaC), including experience with open-source tools such as Terraform
• Treats best practices for security as a requirement, not an afterthought
• Extensive experience with version control systems like GitLab and their use in release management, branching, merging, and integration strategies
• Experience working with Agile software development methodologies
• Proven ability to work on cross-functional Agile teams
• Mentor other engineers in best practices to improve their skills
• Creating suitable DevOps channels across the organization.
• Designing efficient practices.
• Delivering comprehensive best practices.
• Managing and reviewing technical operations.
• Ability to work independently and as part of a team.
• Exceptional communication skills, knowledge of the latest industry trends, and a highly innovative mindset
Hands on experience in:
- Deploying, managing, securing, and patching enterprise applications at large scale in the cloud, preferably AWS.
- Experience leading End-to-end DevOps projects with modern tools encompassing both Applications and Infrastructure
- AWS CodeDeploy, CodeBuild, Jenkins, SonarQube.
- Incident management and root cause analysis.
- Strong understanding of immutable infrastructure and infrastructure as code (IaC) concepts. Participate in capacity planning and provisioning of new resources. Importing already deployed infrastructure into IaC.
- Utilizing AWS cloud services such as EC2, S3, IAM, Route 53, RDS, VPC, NAT/Internet Gateway, Lambda, load balancers, CloudWatch, and API Gateway.
- Managing multi-cluster container environments on AWS ECS (ECS with EC2 and Fargate, with service discovery using Route 53)
- Monitoring/analytics tools like Nagios/Datadog and logging tools like Logstash/Sumo Logic
- Simple Notification Service (SNS)
- Version Control System: Git, Gitlab, Bitbucket
- Participate in Security Audit of Cloud Infrastructure.
- Exceptional documentation and communication skills.
- Willing to work in shifts
- Knowledge of Akamai is a plus.
- Microsoft Azure is a plus.
- Adobe AEM is a plus.
- AWS Certified DevOps Engineer - Professional certification is a plus
- 7+ years of experience in System Administration, Networking, Automation, Monitoring
- Excellent problem solving, analytical skills and technical troubleshooting skills
- Experience managing systems deployed in public cloud platforms (Microsoft Azure, AWS or Google Cloud)
- Experience implementing and maintaining CI/CD pipelines (Jenkins, Concourse, etc.)
- Linux experience across distributions such as Ubuntu, Red Hat, and CentOS (sysadmin, Bash scripting)
- Experience setting up monitoring (Datadog, Splunk, etc.)
- Experience in Infrastructure Automation tools like Terraform
- Experience with Kubernetes package managers such as Helm
- Experience with databases and data storage (Oracle, MongoDB, PostgreSQL, ELK stack)
- Experience with Docker
- Experience with orchestration technologies (Kubernetes or DC/OS)
- Familiar with Agile Software Development
- Automate deployments of infrastructure components and repetitive tasks.
- Drive changes strictly via the infrastructure-as-code methodology.
- Promote the use of source control for all changes including application and system-level changes.
- Design and implement systems that recover automatically after failure events.
- Participate in system sizing and capacity planning of various components.
- Create and maintain technical documents such as installation/upgrade MOPs.
- Coordinate & collaborate with internal teams to facilitate installation & upgrades of systems.
- Support 24x7 availability for corporate sites & tools.
- Participate in rotating on-call schedules.
- Actively involved in researching, evaluating & selecting new tools & technologies.
- Cloud computing – AWS, OCI, OpenStack
- Automation/Configuration management tools such as Terraform & Chef
- Atlassian tools administration (JIRA, Confluence, Bamboo, Bitbucket)
- Scripting languages - Ruby, Python, Bash
- Systems administration experience – Linux (Red Hat), Mac, Windows
- SCM systems - Git
- Build tools - Maven, Gradle, Ant, Make
- Networking concepts - TCP/IP, Load balancing, Firewall
- High-Availability, Redundancy & Failover concepts
- SQL scripting & queries - DML, DDL, stored procedures
- Decisive and ability to work under pressure
- Prioritizing workload and multi-tasking ability
- Excellent written and verbal communication skills
- Database systems – Postgres, Oracle, or other RDBMS
- Mac automation tools - JAMF or other
- Atlassian Datacenter products
- Project management skills
Qualifications
- 3+ years of hands-on experience in the field or related area
- Requires MS or BS in Computer Science or equivalent field
- Develop and maintain IaC using Terraform and Ansible
- Draft design documents that translate requirements into code.
- Deal with challenges associated with scale.
- Assume responsibilities from technical design through technical client support.
- Manage expectations with internal stakeholders and context-switch in a fast paced environment.
- Thrive in an environment that uses Elasticsearch extensively.
- Keep abreast of technology and contribute to the engineering strategy.
- Champion best development practices and provide mentorship
An AWS Certified Engineer with strong skills in
- Terraform
- Ansible
- *nix and shell scripting
- Elasticsearch
- CircleCI
- CloudFormation
- Python
- Packer
- Docker
- Prometheus and Grafana
- Challenges of scale
- Production support
- Sharp analytical and problem-solving skills.
- Strong sense of ownership.
- Demonstrable desire to learn and grow.
- Excellent written and oral communication skills.
- Mature collaboration and mentoring abilities.
At Karza Technologies, we take pride in building one of the most comprehensive digital onboarding and due-diligence platforms by profiling millions of entities and trillions of associations among them, using data collated from more than 700 publicly available government sources. Operating primarily in the B2B fintech enterprise space, we are headquartered in Lower Parel, Mumbai, with a 100+ strong workforce. We are furthering the cause of Digital India by providing the entire BFSI ecosystem with tech products and services that aid in onboarding customers, automating processes, and mitigating risks seamlessly, in real time, and at a fraction of the current cost.
A few recognitions:
- Recognized by LinkedIn as one of the top 25 startups in India to work with in 2019
- Winner of HDFC Bank's Digital Innovation Summit 2020
- Super Winners (Won every category) at Tecnoviti 2020 by Banking Frontiers
- Winner of Amazon AI Award 2019 for Fintech
- Winner of FinTech Spot Pitches at Fintegrate Zone 2018 held at BSE
- Winner of FinShare 2018 challenge held by ShareKhan
- Only startup in Yes Bank Global Fintech Accelerator to win the account during the Cohort
- 2nd place in the Citi India FinTech Challenge 2018 by Citibank
- Top 3 in Viacom18's Startup Engagement Programme VStEP
What your average day would look like:
- Deploy and maintain mission-critical information extraction, analysis, and management systems
- Manage low cost, scalable streaming data pipelines
- Provide direct and responsive support for urgent production issues
- Contribute ideas towards secure and reliable Cloud architecture
- Use open source technologies and tools to accomplish specific use cases encountered within the project
- Use coding languages or scripting methodologies to solve automation problems
- Collaborate with others on the project to brainstorm about the best way to tackle a complex infrastructure, security, or deployment problem
- Identify processes and practices to streamline development and deployment, minimizing downtime and turnaround time
What you need to work with us:
- Proficiency in at least one of the general-purpose programming languages like Python, Java, etc.
- Experience in managing IaaS and PaaS components on popular public cloud service providers like AWS, Azure, GCP, etc.
- Proficiency in Unix Operating systems and comfortable with Networking concepts
- Experience with developing/deploying a scalable system
- Experience with distributed databases and message queues (like Cassandra, Elasticsearch, MongoDB, Kafka, etc.)
- Experience in managing Hadoop clusters
- Understanding of containers and experience managing them in production using container orchestration services.
- Solid understanding of data structures and algorithms.
- Applied exposure to continuous delivery pipelines (CI/CD).
- Keen interest and proven track record in automation and cost optimization.
Experience:
- 1-4 years of relevant experience
- BE in Computer Science / Information Technology