A network of the world's best developers - full-time, long-term remote software jobs with better compensation and career growth. We enable our clients to accelerate their Cloud Offering, and Capitalize on Cloud. We have our own IOT/AI platform and we provide professional services on that platform to build custom clouds for their IOT devices. We also build mobile apps, run 24x7 devops/site reliability engineering for our clients.
This person MUST have:
- B.E Computer Science or equivalent
- 2+ Years of hands-on experience troubleshooting/setting up of the Linux environment, who can write shell scripts for any given requirement.
- 1+ Years of hands-on experience setting up/configuring AWS or GCP services from SCRATCH and maintaining them.
- 1+ Years of hands-on experience setting up/configuring Kubernetes & EKS and ensuring high availability of container orchestration.
- 1+ Years of hands-on experience setting up CICD from SCRATCH in Jenkins & Gitlab.
- Experience configuring/maintaining one monitoring tool.
- Excellent verbal & written communication skills.
- Candidates with certifications - AWS, GCP, CKA, etc will be preferred
- Hands-on experience with databases (Cassandra, MongoDB, MySQL, RDS).
Experience:
- Min 3 years of experience as SRE automation engineer building, running, and maintaining production sites. Not looking for candidates who have experience only as L1/L2 or Build & Deploy..
Location:
- Remotely, anywhere in India
Timings:
- The person is expected to deliver with both high speed and high quality as well as work for 40 Hours per week (~6.5 hours per day, 6 days per week) in shifts which will rotate every month.
Position:
- Full time/Direct
- We have great benefits such as PF, medical insurance, 12 annual company holidays, 12 PTO leaves per year, annual increments, Diwali bonus, spot bonuses and other incentives etc.
- We dont believe in locking in people with large notice periods. You will stay here because you love the company. We have only a 15 days notice period.
Similar jobs
Responsibilities
Provisioning and de-provisioning AWS accounts for internal customers
Work alongside systems and development teams to support the transition and operation of client websites/applications in and out of AWS.
Deploying, managing, and operating AWS environments
Identifying appropriate use of AWS operational best practices
Estimating AWS costs and identifying operational cost control mechanisms
Keep technical documentation up to date
Proactively keep up to date on AWS services and developments
Create (where appropriate) automation, in order to streamline provisioning and de-provisioning processes
Lead certain data/service migration projects
Job Requirements
Experience provisioning, operating, and maintaining systems running on AWS
Experience with Azure/AWS.
Capabilities to provide AWS operations and deployment guidance and best practices throughout the lifecycle of a project
Experience with application/data migration to/from AWS
Experience with NGINX and the HTTP protocol.
Experience with configuration and management software such as GIT Strong analytical and problem-solving skills
Deployment experience using common AWS technologies like VPC, and regionally distributed EC2 instances, Docker, and more.
Ability to work in a collaborative environment
Detail-oriented, strong work ethic and high standard of excellence
A fast learner, the Achiever, sets high personal goals
Must be able to work on multiple projects and consistently meet project deadlines
2. Kubernetes Engineer
DevOps Systems Engineer with experience in Docker Containers, Docker Swarm, Docker Compose, Ansible, Jenkins and other tools. As part of this role she/he must work with the container best practices in design, development and implementation.
At Least 4 Years of experience in DevOps with knowledge of:
- Docker
- Docker Cloud / Containerization
- DevOps Best Practices
- Distributed Applications
- Deployment Architecture
- Atleast AWS Experience
- Exposure to Kubernetes / Serverless Architecture
Skills:
- 3-7+ years of experience in DevOps Engineering
- Strong experience with Docker Containers, Implementing Docker Containers, Container Clustering
- Experience with Docker Swarm, Docker Compose, Docker Engine
- Experience with Provisioning and managing VM's - Virtual Machines
- Experience / strong knowledge with Network Topologies, Network Research,
- Jenkins, BitBucket, Jira
- Ansible or other Automation Configuration Management System tools
- Scripting & Programming using languages such as BASH, Perl, Python, AWK, SED, PHP, Shell
- Linux Systems Administration -: Redhat
Additional Preference:
Security, SSL configuration, Best Practices
You will be responsible for:
- Managing all DevOps and infrastructure for Sizzle
- We have both cloud and on-premise servers
- Work closely with all AI and backend engineers on processing requirements and managing both development and production requirements
- Optimize the pipeline to ensure ultra fast processing
- Work closely with management team on infrastructure upgrades
You should have the following qualities:
- 3+ years of experience in DevOps, and CI/CD
- Deep experience in: Gitlab, Gitops, Ansible, Docker, Grafana, Prometheus
- Strong background in Linux system administration
- Deep expertise with AI/ML pipeline processing, especially with GPU processing. This doesn’t need to include model training, data gathering, etc. We’re looking more for experience on model deployment, and inferencing tasks at scale
- Deep expertise in Python including multiprocessing / multithreaded applications
- Performance profiling including memory, CPU, GPU profiling
- Error handling and building robust scripts that will be expected to run for weeks to months at a time
- Deploying to production servers and monitoring and maintaining the scripts
- DB integration including pymongo and sqlalchemy (we have MongoDB and PostgreSQL databases on our backend)
- Expertise in Docker-based virtualization including - creating & maintaining custom Docker images, deployment of Docker images on cloud and on-premise services, monitoring of production Docker images with robust error handling
- Expertise in AWS infrastructure, networking, availability
Optional but beneficial to have:
- Experience with running Nvidia GPU / CUDA-based tasks
- Experience with image processing in python (e.g. openCV, Pillow, etc)
- Experience with PostgreSQL and MongoDB (Or SQL familiarity)
- Excited about working in a fast-changing startup environment
- Willingness to learn rapidly on the job, try different things, and deliver results
- Bachelors or Masters degree in computer science or related field
- Ideally a gamer or someone interested in watching gaming content online
Skills:
DevOps, Ansible, CI/CD, GitLab, GitOps, Docker, Python, AWS, GCP, Grafana, Prometheus, python, sqlalchemy, Linux / Ubuntu system administration
Seniority: We are looking for a mid to senior level engineer
Salary: Will be commensurate with experience.
Who Should Apply:
If you have the right experience, regardless of your seniority, please apply.
Work Experience: 3 years to 6 years
Skills We Require:- Dev Ops, AWS Admin, terraform, Infrastructure as a Code
SUMMARY:-
- Implement integrations requested by customers
- Deploy updates and fixes
- Provide Level 2 technical support
- Build tools to reduce occurrences of errors and improve customer experience
- Develop software to integrate with internal back-end systems
- Perform root cause analysis for production errors
- Investigate and resolve technical issues
- Develop scripts to automate visualization
- Design procedures for system troubleshooting and maintenance
Have good hands on experience on Dev Ops, AWS Admin, terraform, Infrastructure as a Code
Have knowledge on EC2, Lambda, S3, ELB, VPC, IAM, Cloud Watch, Centos, Server Hardening
Ability to understand business requirements and translate them into technical requirements
A knack for benchmarking and optimizationJOB RESPONSIBILITIES:
- Responsible for design, implementation, and continuous improvement on automated CI/CD infrastructure
- Displays technical leadership and oversight of implementation and deployment planning, system integration, ongoing data validation processes, quality assurance, delivery, operations, and sustainability of technical solutions
- Responsible for designing topology to meet requirements for uptime, availability, scalability, robustness, fault tolerance & security
- Implement proactive measures for automated detection and resolution of recurring operational issues
- Lead operational support team manage incidents, document root cause and tracking preventive measures
- Identifying and deploying cybersecurity measures by continuously validating/fixing vulnerability assessment reports and risk management
- Responsible for the design and development of tools, installation procedures
- Develops and maintains accurate estimates, timelines, project plans, and status reports
- Organize and maintain packaging and deployment of various internal modules and third-party vendor libraries
- Responsible for the employment, timely performance evaluation, counselling, employee development, and discipline of assigned employees.
- Participates in calls and meetings with customers, vendors, and internal teams on regular basis.
- Perform infrastructure cost analysis and optimization
SKILLS & ABILITIES
Experience: Minimum of 10 years of experience with good technical knowledge regarding build, release, and systems engineering
Technical Skills:
- Experience with DevOps toolchains such as Docker, Rancher, Kubernetes, Bitbucket
- Experience with Apache, Nginx, Tomcat, Prometheus ,Grafana
- Ability to learn/use a wide variety of open-source technologies and tools
- Sound understanding of cloud technologies preferably AWS technologies
- Linux, Windows, Scripting, Configuration Management, Build and Release Engineering
- 6 years of experience in DevOps practices, with a good understanding of DevOps and Agile principles
- Good scripting skills (Python/Perl/Ruby/Bash)
- Experience with standard continuous integration tools Jenkins/Bitbucket Pipelines
- Work on software configuration management systems (Puppet/Chef/Salt/Ansible)
- Microsoft Office Suite (Word, Excel, PowerPoint, Visio, Outlook) and other business productivity tools
- Working knowledge on HSM and PKI (Good to have)
Location:
- Bangalore
Experience:
- 10 + Years.
Experience of Linux
Experience using Python or Shell scripting (for Automation)
Hands-on experience with Implementation of CI/CD Processes
Experience working with one cloud platforms (AWS or Azure or Google)
Experience working with configuration management tools such as Ansible & Chef
Experience working with Containerization tool Docker.
Experience working with Container Orchestration tool Kubernetes.
Experience in source Control Management including SVN and/or Bitbucket
& GitHub
Experience with setup & management of monitoring tools like Nagios, Sensu & Prometheus or any other popular tools
Hands-on experience in Linux, Scripting Language & AWS is mandatory
Troubleshoot and Triage development, Production issues
Role and responsibilities
- Expertise in AWS (Most typical Services),Docker & Kubernetes.
- Strong Scripting knowledge, Strong Devops Automation, Good at Linux
- Hands on with CI/CD (CircleCI preferred but any CI/CD tool will do). Strong Understanding of GitHub
- Strong understanding of AWS networking and. Strong with Security & Certificates.
Nice-to-have skills
- Involved in Product Engineering
We are looking for an experienced DevOps engineer that will help our team establish DevOps practice. You will work closely with the technical lead to identify and establish DevOps practices in the company. You will also help us build scalable, efficient cloud infrastructure. You’ll implement monitoring for automated system health checks. Lastly, you’ll build our CI pipeline, and train and guide the team in DevOps practices.
Responsibilities
- Deployment, automation, management, and maintenance of production systems.
- Ensuring availability, performance, security, and scalability of production systems.
- Evaluation of new technology alternatives and vendor products.
- System troubleshooting and problem resolution across various application domains and platforms.
- Providing recommendations for architecture and process improvements.
- Definition and deployment of systems for metrics, logging, and monitoring on the AWS
platform.
- Manage the establishment and configuration of SaaS infrastructure in an agile way
by storing infrastructure as code and employing automated configuration management tools with a goal to be able to re-provision environments at any point in time.
- Be accountable for proper backup and disaster recovery procedures.
- Drive operational cost reductions through service optimizations and demand-based
auto-scaling.
- Have on-call responsibilities.
- Perform root cause analysis for production errors
- Uses open source technologies and tools to accomplish specific use cases encountered
within the project.
- Uses coding languages or scripting methodologies to solve a problem with a custom workflow.
Requirements
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Prior experience as a software developer in a couple of high-level programming
languages.
- Extensive experience in any Javascript-based framework since we will be deploying services to NodeJS on AWS Lambda (Serverless)
- Strong Linux system administration background.
- Ability to present and communicate the architecture in a visual form.
- Strong knowledge of AWS (e.g. IAM, EC2, VPC, ELB, ALB, Autoscaling, Lambda, NAT
gateway, DynamoDB)
- Experience maintaining and deploying highly-available, fault-tolerant systems at scale (~
1 Lakh users a day)
- A drive towards automating repetitive tasks (e.g. scripting via Bash, Python, Ruby, etc)
- Expertise with Git
- Experience implementing CI/CD (e.g. Jenkins, TravisCI)
- Strong experience with databases such as MySQL, NoSQL, Elasticsearch, Redis and/or
Mongo.
- Stellar troubleshooting skills with the ability to spot issues before they become problems.
- Current with industry trends, IT ops and industry best practices, and able to identify the
ones we should implement.
- Time and project management skills, with the capability to prioritize and multitask as
needed.