
General Description:
Owns all technical aspects of software development for assigned applications.
Participates in the design and development of systems & application programs.
Functions as Senior member of an agile team and helps drive consistent development practices – tools, common components, and documentation.
Required Skills:
In depth experience configuring and administering EKS clusters in AWS.
In depth experience in configuring **DataDog** in AWS environments especially in **EKS**
In depth understanding of OpenTelemetry and configuration of **OpenTelemetry Collectors**
In depth knowledge of observability concepts and strong troubleshooting experience.
Experience in implementing comprehensive monitoring and logging solutions in AWS using **CloudWatch**.
Experience in **Terraform** and Infrastructure as code.
Experience in **Helm**
Strong scripting skills in Shell and/or python.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Must have a good understanding of cloud concepts (Storage /compute/network).
Experience in Collaborating with several cross functional teams to architect observability pipelines for various GCP services like GKE, cloud run Big Query etc.
Experience with Git and GitHub.
Proficient in developing and maintaining technical documentation, ADRs, and runbooks.

Similar jobs
Description
Do you dream about code every night? If so, we’d love to talk to you about a new product that we’re making to enable delightful testing experiences at scale for development teams who build modern software solutions.
What You'll Do
Troubleshooting and analyzing technical issues raised by internal and external users.
Working with Monitoring tools like Prometheus / Nagios / Zabbix.
Developing automation in one or more technologies such as Terraform, Ansible, Cloud Formation, Puppet, Chef will be preferred.
Monitor infrastructure alerts and take proactive action to avoid downtime and customer impacts.
Working closely with the cross-functional teams to resolve issues.
Test, build, design, deployment, and ability to maintain continuous integration and continuous delivery process using tools like Jenkins, maven Git, etc.
Work in close coordination with the development and operations team such that the application is in line with performance according to the customer's expectations.
What you should have
Bachelor’s or Master’s degree in computer science or any related field.
3 - 6 years of experience in Linux / Unix, cloud computing techniques.
Familiar with working on cloud and datacenter for enterprise customers.
Hands-on experience on Linux / Windows / Mac OS’s and Batch/Apple/Bash scripting.
Experience with various databases such as MongoDB, PostgreSQL, MySQL, MSSQL.
Familiar with AWS technologies like EC2, S3, Lambda, IAM, etc.
Must know how to choose the best tools and technologies which best fit the business needs.
Experience in developing and maintaining CI/CD processes using tools like Git, GitHub, Jenkins etc.
Excellent organizational skills to adapt to a constantly changing technical environment

- Proficiency in Python , Django and Other Allied Frameworks;
- Expert in designing UI/UX interfaces;
- Expert in testing, troubleshooting, debugging and problem solving;
- Basic knowledge of SEO;
- Good communication;
- Team building and good acumen;
- Ability to perform;
- Continuous learning
We will share our workload as a team and we expect you to work on a broad range of tasks. Here’s are some of the things you might have to do on any given day:
- Developing APIs and endpoints for deployments of our product
- Infrastructure Development such as building databases, creating and maintaining automated jobs
- Build out the back-end to deploy and scale our product
- Build POCs for client deployments
- Document your code, write test cases, etc.
A.P.T Portfolio, a high frequency trading firm that specialises in Quantitative Trading & Investment Strategies.Founded in November 2009, it has been a major liquidity provider in global Stock markets.
As a manager, you would be incharge of managing the devops team and your remit shall include the following
- Private Cloud - Design & maintain a high performance and reliable network architecture to support HPC applications
- Scheduling Tool - Implement and maintain a HPC scheduling technology like Kubernetes, Hadoop YARN Mesos, HTCondor or Nomad for processing & scheduling analytical jobs. Implement controls which allow analytical jobs to seamlessly utilize ideal capacity on the private cloud.
- Security - Implementing best security practices and implementing data isolation policy between different divisions internally.
- Capacity Sizing - Monitor private cloud usage and share details with different teams. Plan capacity enhancements on a quarterly basis.
- Storage solution - Optimize storage solutions like NetApp, EMC, Quobyte for analytical jobs. Monitor their performance on a daily basis to identify issues early.
- NFS - Implement and optimize latest version of NFS for our use case.
- Public Cloud - Drive AWS/Google-Cloud utilization in the firm for increasing efficiency, improving collaboration and for reducing cost. Maintain the environment for our existing use cases. Further explore potential areas of using public cloud within the firm.
- BackUps - Identify and automate back up of all crucial data/binary/code etc in a secured manner at such duration warranted by the use case. Ensure that recovery from back-up is tested and seamless.
- Access Control - Maintain password less access control and improve security over time. Minimize failures for automated job due to unsuccessful logins.
- Operating System -Plan, test and roll out new operating system for all production, simulation and desktop environments. Work closely with developers to highlight new performance enhancements capabilities of new versions.
- Configuration management -Work closely with DevOps/ development team to freeze configurations/playbook for various teams & internal applications. Deploy and maintain standard tools such as Ansible, Puppet, chef etc for the same.
- Data Storage & Security Planning - Maintain a tight control of root access on various devices. Ensure root access is rolled back as soon the desired objective is achieved.
- Audit access logs on devices. Use third party tools to put in a monitoring mechanism for early detection of any suspicious activity.
- Maintaining all third party tools used for development and collaboration - This shall include maintaining a fault tolerant environment for GIT/Perforce, productivity tools such as Slack/Microsoft team, build tools like Jenkins/Bamboo etc
Qualifications
- Bachelors or Masters Level Degree, preferably in CSE/IT
- 10+ years of relevant experience in sys-admin function
- Must have strong knowledge of IT Infrastructure, Linux, Networking and grid.
- Must have strong grasp of automation & Data management tools.
- Efficient in scripting languages and python
Desirables
- Professional attitude, co-operative and mature approach to work, must be focused, structured and well considered, troubleshooting skills.
- Exhibit a high level of individual initiative and ownership, effectively collaborate with other team members.
APT Portfolio is an equal opportunity employer
Roles and Responsibilities
● Managing Availability, Performance, Capacity of infrastructure and applications.
● Building and implementing observability for applications health/performance/capacity.
● Optimizing On-call rotations and processes.
● Documenting “tribal” knowledge.
● Managing Infra-platforms like
- Mesos/Kubernetes
- CICD
- Observability(Prometheus/New Relic/ELK)
- Cloud Platforms ( AWS/ Azure )
- Databases
- Data Platforms Infrastructure
● Providing help in onboarding new services with the production readiness review process.
● Providing reports on services SLO/Error Budgets/Alerts and Operational Overhead.
● Working with Dev and Product teams to define SLO/Error Budgets/Alerts.
● Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
● Identifying observability gaps in product services, infrastructure and working with stake owners to fix it.
● Managing Outages and doing detailed RCA with developers and identifying ways to avoid that situation.
● Managing/Automating upgrades of the infrastructure services.
● Automate toil work.
Experience & Skills
● 3+ Years of experience as an SRE/DevOps/Infrastructure Engineer on large scale microservices and infrastructure.
● A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
● A deep understanding of computer science, software development, and networking principles.
● Demonstrated experience with languages, such as Python, Java, Golang etc.
● Extensive experience with Linux administration and good understanding of the various linux kernel subsystems (memory, storage, network etc).
● Extensive experience in DNS, TCP/IP, UDP, GRPC, Routing and Load Balancing.
● Expertise in GitOps, Infrastructure as a Code tools such as Terraform etc.. and Configuration Management Tools such as Chef, Puppet, Saltstack, Ansible.
● Expertise of Amazon Web Services (AWS) and/or other relevant Cloud Infrastructure solutions like Microsoft Azure or Google Cloud.
● Experience in building CI/CD solutions with tools such as Jenkins, GitLab, Spinnaker, Argo etc.
● Experience in managing and deploying containerized environments using Docker,
Mesos/Kubernetes is a plus.
● Experience with multiple datastores is a plus (MySQL, PostgreSQL, Aerospike,
Couchbase, Scylla, Cassandra, Elasticsearch).
● Experience with data platforms tech stacks like Hadoop, Hive, Presto etc is a plus

- He has to perform architectural analysis, and he should know how to design enterprise-level systems.
- He should know how to design and simulate tools for the perfect delivery of systems.
- He should know how to design, develop, and maintain systems, processes, procedures to deliver a high-quality service design.
- He has to work with other members of a team and other departments to establish healthy communication and information flow.
- He should know how to deliver a high-performing solution architecture that can support the development efforts of a business.
- He has to plan, design, and configure the most typical business solutions as needed.
- He has to prepare technical documents and other presentations for multiple solutions areas.
- He has to be sure that the best practices for configuration management are carried our as it was needed.
- He has to work on customer specifications, analyze them, and conduct the best product recommendations associated with the platform
Requirements
- AWS Solution Architect 9-10 Years
- Responsible for managing applications on public cloud (AWS) infrastructure.
- Responsible for larger migrations of applications from VM to cloud/cloud-native.
- Responsible for setting up monitoring for cloud/cloud-native-based infrastructure and applications.
- MUST: AWS Solution Architect Professional certification.
What you do :
- Developing automation for the various deployments core to our business
- Documenting run books for various processes / improving knowledge bases
- Identifying technical issues, communicating and recommending solutions
- Miscellaneous support (user account, VPN, network, etc)
- Develop continuous integration / deployment strategies
- Production systems deployment/monitoring/optimization
-
Management of staging/development environments
What you know :
- Ability to work with a wide variety of open source technologies and tools
- Ability to code/script (Python, Ruby, Bash)
- Experience with systems and IT operations
- Comfortable with frequent incremental code testing and deployment
- Strong grasp of automation tools (Chef, Packer, Ansible, or others)
- Experience with cloud infrastructure and bare-metal systems
- Experience optimizing infrastructure for high availability and low latencies
- Experience with instrumenting systems for monitoring and reporting purposes
- Well versed in software configuration management systems (git, others)
- Experience with cloud providers (AWS or other) and tailoring apps for cloud deployment
-
Data management skills
Education :
- Degree in Computer Engineering or Computer Science
- 1-3 years of equivalent experience in DevOps roles.
- Work conducted is focused on business outcomes
- Can work in an environment with a high level of autonomy (at the individual and team level)
-
Comfortable working in an open, collaborative environment, reaching across functional.
Our Offering :
- True start-up experience - no bureaucracy and a ton of tough decisions that have a real impact on the business from day one.
-
The camaraderie of an amazingly talented team that is working tirelessly to build a great OS for India and surrounding markets.
Perks :
- Awesome benefits, social gatherings, etc.
- Work with intelligent, fun and interesting people in a dynamic start-up environment.

- Work with developers to build out CI/CD pipelines, enable self-service build tools and reusable deployment jobs. Find, explore, and advocate for new technologies for enterprise use.
- Automate the provisioning of environments
- Promote new DevOps tools to simplify the build process and entire Continuous Delivery.
- Manage a Continuous Integration and Deployment environment.
- Coordinate and scale the evolving build and cloud deployment systems across all product development teams.
- Work independently, with, and across teams. Establishing smooth running. environments are paramount to your success, and happiness
- Encourage innovation, implementation of cutting-edge technologies, inclusion, outside-of-the[1]box thinking, teamwork, self-organization, and diversity.
Technical Skills
- Experience with AWS multi-region/multi-AZ deployed systems, auto-scaling of EC2 instances, CloudFormation, ELBs, VPCs, CloudWatch, SNS, SQS, S3, Route53, RDS, IAM roles, security groups, cloud watch
- Experience in Data Visualization and Monitoring tools such as Grafana and Kibana
- Experienced in Build and CI/CD/CT technologies like GitHub, Chef, Artifactory, Hudson/Jenkins
- Experience with log collection, filter creation, and analysis, builds, and performance monitoring/tuning of infrastructure.
- Automate the provisioning of environments pulling strings with Puppet, cooking up some recipes with Chef, or through Ansible, and the deployment of those environments using containers, like Docker or Rocket: (have at least some configuration management tool through some version control).
Qualifications:
- B.E/ B.Tech/ M.C.A in Computer Science, Electronics and Communication Engineering, Electronics and Electrical Engineering.
- Minimum 60% in Graduation and Post-Graduation.
- Good verbal and written communication skills
Objectives of this Role
Improve reliability, quality, and time-to-market of our suite of software solutions
- Run the production environment by monitoring availability and taking a holistic view of system health
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer - needs, and innovating to continually improve
- Provide primary operational support and engineering for multiple large distributed software applications
- Participate in system design consulting, platform management, and capacity planning
- Languages: Python, Java, Ruby DSL, Bash
- Databases : MySQL, Cassandra , Elastic Search
- Deployment: AWS CloudFormation
Essential Criteria:
- 8 or more years administrating production Linux systems in a 24x7 environment
- 3 or more years’ experience in a DevOps/ SRE role as an engineer or technical lead
- At least 1 year of team leadership experience
- Significant knowledge of Amazon Web Services (CLI/APIs, EC2, EBS, S3, VPCs, IAM, AWS Lambda)
- Experience deploying services into containerized orchestration environments such as Kubernetes
- Experience with infrastructure automation tools like CloudFormation, Terraform, etc.
- Experience with at least one of Python, Bash, Ruby, or equivalent
- Experience creating and managing CI/CD pipeline like Jenkins or Spinnaker
- Familiar with version control using Git
- Solid understanding of common security principles
Nice to Have:
- Preference for hands on experience with Serverless Architecture, Kubernetes and Docker
- Strong experience with open-source configuration management tools
- Managing distributed systems spanning multiple AWS regions / data-centers
- Experience with bootstrapping solutions
- Open source contributor
- We’re committed to client success: There are over 6,200 brand and retail websites in the Bazaarvoice network. Our clients represent some of the world’s leading companies across a wide range of industries including retail, apparel, automotive, consumer electronics and travel.
- We’re leaders in consumer-generated content: Each month, more than one billion consumers view and share authentic consumer-generated content, such as ratings and reviews, curated photos, social posts and videos, about products in our network. Thousands upon thousands or reviews are added to the Bazaarvoice network everyday.
- Our network delivers: Network analytics provide insights that help marketers and advertisers provide more engaging experiences that drive brand awareness, consideration, sales, and loyalty.
- We’re a great place to work: We pride ourselves on our unique culture. Join a company that values passion, innovation, authenticity, generosity, respect, teamwork, and performance.





