
Acceldata is creating the Data observability space. We make it possible for data-driven enterprises to effectively monitor, discover, and validate Data platforms at Petabyte scale. Our customers are Fortune 500 companies including Asia's largest telecom company, a unicorn fintech startup of India, and many more. We are lean, hungry, customer-obsessed, and growing fast. Our Solutions team values productivity, integrity, and pragmatism. We provide a flexible, remote-friendly work environment.
We are building software that can provide insights into companies' data operations and allows them to focus on delivering data reliably with speed and effectiveness. Join us in building an industry-leading data observability platform that focuses on ensuring data reliability from every spectrum (compute, data and pipeline) of a cloud or on-premise data platform.
Position Summary-
This role will support the customer implementation of a data quality and reliability product. The candidate is expected to install the product in the client environment, manage proof of concepts with prospects, and become a product expert and troubleshoot post installation, production issues. The role will have significant interaction with the client data engineering team and the candidate is expected to have good communication skills.
Required experience
- 6-7 years experience providing engineering support to data domain/pipelines/data engineers.
- Experience in troubleshooting data issues, analyzing end to end data pipelines and in working with users in resolving issues
- Experience setting up enterprise security solutions including setting up active directories, firewalls, SSL certificates, Kerberos KDC servers, etc.
- Basic understanding of SQL
- Experience working with technologies like S3, Kubernetes experience preferred.
- Databricks/Hadoop/Kafka experience preferred but not required

About Acceldata
About
Acceldata is the company that built the leading Multidimensional Data Observability Cloud. This cloud was designed to help data-driven organizations achieve agility in innovation, operational excellence, and enhanced returns on data investment. Embedded analytics and artificial intelligence technologies are becoming more reliant on contemporary organizations to fuel their business operations and choices.
The data observability technologies offered by Acceldata improve the performance of embedded artificial intelligence and analytics workloads by providing purpose-built monitoring and analytics. The first Data Observability Cloud is presently being developed by Acceldata for cloud data warehouses and hybrid data lakes. Acceldata makes it easy for businesses to expand their pipelines to meet the requirements of modern business, regardless of whether they are operating in a platform or cloud environment. Data Observability Cloud by Acceldata provides on-demand operational information to support analytics data workloads and embedded artificial intelligence.
Connect with the team
Company social profiles
Similar jobs
environments: AWS / Azure / GCP
• Must have strong work experience (2 + years) developing IaC (i.e. Terraform)
• Must have strong work experience in Ansible development and deployment.
• Bachelor’s degree with a background in math will be a PLUS.
• Must have 8+ years experience with a mix of Linux and Window systems in a medium to large business
environment.
• Must have command level fluency and shell scripting experience in a mix of Linux and Windows
environments.
•
• Must enjoy the experience of working in small, fast-paced teams
• Identify opportunities for improvement in existing process and automate the process using Ansible Flows.
• Fine tune performance and operation issues that arise with Automation flows.
• Experience administering container management systems like Kubernetes would be plus.
• Certification with Red Hat or any other Linux variant will be a BIG PLUS.
• Fluent in the use of Microsoft Office Applications (Outlook / Word / Excel).
• Possess a strong aptitude towards automating and timely completion of standard/routine tasks.
• Experience with automation and configuration control systems like Puppet or Chef is a plus.
• Experience with Docker, Kubernetes (or container orchestration equivalent) is nice to have
- 2+ years work experience in a DevOps or similar role
- Knowledge of OO programming and concepts (Java, C++, C#, Python)
- A drive towards automating repetitive tasks (e.g., scripting via Bash, Python, etc)
- Fluency in one or more scripting languages such as Python or Ruby.
- Familiarity with Microservice-based architectures
- Practical experience with Docker containerization and clustering (Kubernetes/ECS)
- In-depth, hands-on experience with Linux, networking, server, and cloud architectures.
- Experience with CI/CD tools Azure DevOps, AWS cloud formation, Lamda functions, Jenkins, and Ansible
- Experience with AWS, Azure, or another cloud PaaS provider.
- Solid understanding of configuration, deployment, management, and maintenance of large cloud-hosted systems; including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery
- Proficiency with source control, continuous integration, and testing pipelines
- Effective communication skills
Job Responsibilities:
- Deploy and maintain critical applications on cloud-native microservices architecture.
- Implement automation, effective monitoring, and infrastructure-as-code.
- Deploy and maintain CI/CD pipelines across multiple environments.
- Streamline the software development lifecycle by identifying pain points and productivity barriers and determining ways to resolve them.
- Analyze how customers are using the platform and help drive continuous improvement.
- Support and work alongside a cross-functional engineering team on the latest technologies.
- Iterate on best practices to increase the quality & velocity of deployments.
- Sustain and improve the process of knowledge sharing throughout the engineering team
- Identification and prioritization of technical debt that risks instability or creates wasteful operational toil.
- Own daily operational goals with the team.
We are looking for a Senior Platform Engineer responsible for handling our GCP/AWS clouds. The candidate will be responsible for automating the deployment of cloud infrastructure and services to support application development and hosting (architecting, engineering, deploying, and operationally managing the underlying logical and physical cloud computing infrastructure).
Job Description:
● Collaborate with teams to build and deliver solutions implementing serverless, microservice-based, IaaS, PaaS, and containerized architectures in GCP/AWS environments.
●Responsible for deploying highly complex, distributed transaction processing systems.
● Work on continuous improvement of the products through innovation and learning. Someone with a knack for benchmarking and optimization
● Hiring, developing, and cultivating a high and reliable cloud support team ● Building and operating complex CI/CD pipelines at scale
● Work with GCP Services, Private Service Connect, Cloud Run, Cloud Functions, Pub/Sub, Cloud Storage, Networking
● Collaborate with Product Management and Product Engineering teams to drive excellence in Google Cloud products and features.
● Ensures efficient data storage and processing functions by company security policies and best practices in cloud security.
● Ensuring scaled database setup/monitoring with near zero downtime
- 3 - 6 years of software development, and operations experience deploying and maintaining multi-tiered infrastructure and applications at scale.
- Design cloud infrastructure that is secure, scalable, and highly available on AWS
- Experience managing any distributed NoSQL system (Kafka/Cassandra/etc.)
- Experience with Containers, Microservices, deployment and service orchestration using Kubernetes, EKS (preferred), AKS or GKE.
- Strong scripting language knowledge, such as Python, Shell
- Experience and a deep understanding of Kubernetes.
- Experience in Continuous Integration and Delivery.
- Work collaboratively with software engineers to define infrastructure and deployment requirements
- Provision, configure and maintain AWS cloud infrastructure
- Ensure configuration and compliance with configuration management tools
- Administer and troubleshoot Linux-based systems
- Troubleshoot problems across a wide array of services and functional areas
- Build and maintain operational tools for deployment, monitoring, and analysis of AWS infrastructure and systems
- Perform infrastructure cost analysis and optimization
- AWS
- Docker
- Kubernetes
- Envoy
- Istio
- Jenkins
- Cloud Security & SIEM stacks
- Terraform
Requirements
Role : SRE
Experience : 4 - 8 Years
- Experience in building, deploying and operating cloud solutions on Kubernetes
- Strong expertise administrating and scaling Kubernetes on bare metal and CKA preferred
- Expertise on K8s Interfaces CNI, CSI, CRI and Service meshe
- Hands-on experience as a DevOps or Automation development
- Demonstrable knowledge of TCP/IP, Linux operating system internals, filesystems, disk/storage technologies and storage protocols.
- Experience working with Helm Charts and building out Infrastructure As Code (IaC)
- Experience in writing software to automate orchestration tasks at scale; we commonly use Python, Go, and Shell scripting
- Knowledge of systems (Linux, GNU tooling), networking (OSI model, DNS, routing) and virtualization vs containerization
- Expertise in CI/CD tooling for cloud-based applications specifically Terraform / CloudFormation, Jenkins and Git
- Architected CNF Orchestration with Kubernetes
- Strong understanding of the principles of 12-factor apps and modern containerized microservices
- Plan for reliability by designing systems to work across our multi-region and multi-cloud environments
- Experience developing and using Application & Integration stacks/tools such as Kafka, Spring Cloud, Apache Camel, Kubernetes, Docker, Redis, Knative, and NoSQL
Experience and Education
• Bachelor’s degree in engineering or equivalent.
Work experience
• 4+ years of infrastructure and operations management
Experience at a global scale.
• 4+ years of experience in operations management, including monitoring, configuration management, automation, backup, and recovery.
• Broad experience in the data center, networking, storage, server, Linux, and cloud technologies.
• Broad knowledge of release engineering: build, integration, deployment, and provisioning, including familiarity with different upgrade models.
• Demonstratable experience with executing, or being involved of, a complete end-to-end project lifecycle.
Skills
• Excellent communication and teamwork skills – both oral and written.
• Skilled at collaborating effectively with both Operations and Engineering teams.
• Process and documentation oriented.
• Attention to details. Excellent problem-solving skills.
• Ability to simplify complex situations and lead calmly through periods of crisis.
• Experience implementing and optimizing operational processes.
• Ability to lead small teams: provide technical direction, prioritize tasks to achieve goals, identify dependencies, report on progress.
Technical Skills
• Strong fluency in Linux environments is a must.
• Good SQL skills.
• Demonstratable scripting/programming skills (bash, python, ruby, or go) and the ability to develop custom tool integrations between multiple systems using their published API’s / CLI’s.
• L3, load balancer, routing, and VPN configuration.
• Kubernetes configuration and management.
• Expertise using version control systems such as Git.
• Configuration and maintenance of database technologies such as Cassandra, MariaDB, Elastic.
• Designing and configuration of open-source monitoring systems such as Nagios, Grafana, or Prometheus.
• Designing and configuration of log pipeline technologies such as ELK (Elastic Search Logstash Kibana), FluentD, GROK, rsyslog, Google Stackdriver.
• Using and writing modules for Infrastructure as Code tools such as Ansible, Terraform, helm, customize.
• Strong understanding of virtualization and containerization technologies such as VMware, Docker, and Kubernetes.
• Specific experience with Google Cloud Platform or Amazon EC2 deployments and virtual machines.c
- Development and maintenance of Continuous Integration System on JENKINS.
- Build management for the planned major/minor releases
- Release process management and maintenance
- Enhancement and development of build/release system features.
Required Qualifications:
- 2 - 3 years relevant work experience in Jenkins / Scripting / C / Linux
- Expertise in scripting languages like a shell, python, etc
- Work experience in handling Make/CMake build systems
- Expertise in GIT source revision control
- Experience with Yocto build systems and recipes


