
Senior Platform & Site Reliability Engineer
Location: Remote
Employment Type: Contract
The Role
This role carries full architectural and operational ownership of the platform layer across a growing SaaS portfolio. The Cloud Architect owns AWS infrastructure standards — VPCs, account structures, networking, and compute design. Everything outside that lane is yours: the CI/CD platform, the observability and reliability stack, the event streaming infrastructure, the deployment pipelines, and the incident engineering model.
Architectural decisions are yours to make and defend, standards are yours to define and enforce, and the reliability of 20+ enterprise SaaS products depends on what you and your team build.
This is an AI-native engineering organisation. Where it is practical and safe to do so, you are expected to use automation and AI-assisted tooling to reduce toil — in CI/CD triage, infrastructure provisioning, observability workflows, and acquisition onboarding. The expectation is not to replace engineering judgement with automation, but to free it up for the problems that genuinely require it.
The Scale You Will Operate At
The portfolio consists of 20+ live, enterprise-grade SaaS solutions running concurrently. Each product serves enterprise customers and processes millions to billions of real-time requests. The architecture is serious: event streaming for real-time data pipelines, batch processing workloads running alongside live transaction flows, and multi-tenant enterprise-grade reliability expectations across every product.
You will design and operate the platform infrastructure that underpins all of it — scaling horizontally as each new acquisition joins the portfolio, without proportionally scaling cost, complexity, or headcount.
What You Will Own
Platform Architecture
- Full architectural ownership of the non-AWS toolchain: CI/CD, observability, event streaming, automation, secrets, and deployment infrastructure
- Define, build, and enforce platform standards across portfolio products
- Terraform IaC for all infrastructure — nothing provisioned manually, everything versioned and reviewed
- Self-service developer platform so product teams ship without waiting on platform
Event Streaming & Pipeline Infrastructure
- Own the event streaming architecture, operational standards, and health monitoring across all products using real-time pipelines
- Design and maintain batch processing infrastructure alongside live event flows
- Ensure pipeline reliability, throughput, and cost are actively managed at scale
CI/CD & Deployment
- Build and maintain CI/CD pipelines (GitHub Actions) across all portfolio products
- Automate triage and retry logic for known failure classes — flaky tests, dependency timeouts, OOM kills — so engineers are only paged for genuinely novel failures
- Deployment standards: release management, rollback mechanisms, canary and blue-green patterns where justified
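The triage-and-retry idea in the CI/CD list above can be sketched roughly as follows. This is a minimal illustration, not a description of any existing pipeline: the log patterns, the `MAX_RETRIES` budget, and the `triage` function are all hypothetical, and real failure signatures would depend on the test runner and runtime in use.

```python
import re
from dataclasses import dataclass

# Hypothetical log signatures for the known failure classes; real patterns
# depend on the test runner, package manager, and container runtime.
KNOWN_FAILURE_PATTERNS = {
    "flaky_test": re.compile(r"flaky test", re.IGNORECASE),
    "dependency_timeout": re.compile(
        r"ETIMEDOUT|connection timed out|read timed out", re.IGNORECASE
    ),
    "oom_kill": re.compile(r"OOMKilled|out of memory|exit code 137", re.IGNORECASE),
}

MAX_RETRIES = 2  # assumed retry budget per job


@dataclass(frozen=True)
class TriageDecision:
    failure_class: str  # matched class name, or "novel"
    action: str         # "retry" or "page"


def triage(log_text: str, retries_so_far: int) -> TriageDecision:
    """Classify a failed job's log and decide whether to retry or page."""
    for name, pattern in KNOWN_FAILURE_PATTERNS.items():
        if pattern.search(log_text):
            # Known class: retry until the budget is spent, then page.
            action = "retry" if retries_so_far < MAX_RETRIES else "page"
            return TriageDecision(name, action)
    # Anything unrecognised is treated as novel and goes to a human.
    return TriageDecision("novel", "page")
```

In this sketch, a known failure class is retried only while the retry budget lasts, so a persistent "known" failure still reaches an engineer.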
Observability & Reliability
- Own the full observability stack: Grafana, Prometheus, and Loki across all products
- SLOs and error budgets defined per product; reliability tracked consistently
- Build alerting that correlates signals and surfaces diagnostic context alongside notifications — so on-call engineers arrive at an incident with hypotheses, not a blank screen
- Incident response: on-call design, escalation playbooks, post-mortem facilitation
- Automated remediation scoped to safe, idempotent actions — container restarts, ECS task scaling, known rollback patterns; novel or ambiguous failures escalate to a human with full context attached
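As a worked illustration of the SLO and error-budget item above, the remaining budget for a request-based SLO can be computed as below. This is a sketch under stated assumptions: the function name, the request-count formulation, and the example figures are illustrative and not taken from the posting.

```python
def error_budget_remaining(
    slo_target: float, total_requests: int, failed_requests: int
) -> float:
    """Return the fraction of the error budget still unspent in a window.

    slo_target is the success-rate objective, e.g. 0.999 for 99.9%.
    The result is 1.0 when nothing has been spent and goes negative
    once the budget is overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests == 0:
        return 1.0  # no traffic, so no budget consumed
    # Failures the SLO tolerates over this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures
```

For example, with a 99.9% target over 1,000,000 requests the budget is about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget.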
Acquisition Onboarding
- Platform audit and gap analysis for every new acquisition — assessing CI/CD maturity, IaC coverage, observability gaps, and security posture
- Migration plan and execution for each portfolio company joining the platform — sequenced to avoid disrupting live operations
- Target: full platform integration within a defined window per acquisition
A Note on Automation
Where automation is safe and failure modes are well understood — routine provisioning, known CI/CD failure classes, secrets rotation, cost anomaly flagging — aggressive automation is expected. Where automation would act on ambiguous signals or carry significant blast radius, human judgement stays in the loop. The goal is to reduce toil on solved problems, not to automate decisions that require engineering expertise.
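One way to picture this policy is a small gate that lets automation act only on allowlisted actions with a confident diagnosis and a small blast radius. The sketch below is hypothetical throughout: the action identifiers, the `Proposal` fields, and the `MAX_AUTO_BLAST_RADIUS` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    AUTO_REMEDIATE = "auto_remediate"
    ESCALATE = "escalate"


# Hypothetical identifiers for safe, idempotent actions.
SAFE_ACTIONS = frozenset(
    {"restart_container", "scale_ecs_tasks", "rollback_known_release"}
)

MAX_AUTO_BLAST_RADIUS = 1  # assumed limit on products one action may touch


@dataclass(frozen=True)
class Proposal:
    action: str
    diagnosis_confident: bool  # False when the signals are ambiguous
    affected_products: int     # crude proxy for blast radius


def gate(proposal: Proposal) -> Outcome:
    """Allow automation only when every safety condition holds."""
    if (
        proposal.action in SAFE_ACTIONS
        and proposal.diagnosis_confident
        and proposal.affected_products <= MAX_AUTO_BLAST_RADIUS
    ):
        return Outcome.AUTO_REMEDIATE
    # Ambiguous, unknown, or wide-reaching: keep a human in the loop.
    return Outcome.ESCALATE
```

The default is to escalate, so any condition the gate does not recognise falls to a human rather than to automation.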
Platform Stack
Area | Stack / Standard
IaC | Terraform OSS / OpenTofu
CI/CD | GitHub Actions
Event Streaming | Architecture and tooling chosen for the workload
Observability | Grafana, Prometheus, Loki
Log Management | AWS CloudWatch, Grafana Loki
Incident Management | OpsGenie (startup tier) or Better Uptime
Secrets | AWS Secrets Manager / HashiCorp Vault OSS
Containers | ECS (default), EKS only where justified
Cost Monitoring | AWS Cost Explorer with custom dashboards
What We’re Looking For
- 8–12 years in platform engineering, DevOps, or SRE — with clear evidence of increasing ownership over time
- Strong Terraform depth across multi-environment, multi-account setups
- CI/CD ownership across a multi-product environment with GitHub Actions
- Experience with event streaming infrastructure at production scale — design, operations, reliability, and cost management
- Hands-on Grafana, Prometheus, and Loki in production
- AWS operational depth: ECS, EKS, RDS, IAM, VPC, CloudWatch, Cost Explorer
- SRE fundamentals: SLOs, error budgets, on-call design, post-mortem culture
- Acquisition or greenfield platform integration experience strongly preferred
How You Work
- Comfortable operating across multiple products simultaneously — context-switching without dropping standards
- Cost-efficiency instinct — you optimise spend as a habit, not as a project
- You treat automation as a tool for eliminating toil, not a substitute for engineering judgement
- You document decisions, enforce standards through code, and build platforms that other engineers find intuitive to use
Why This Role
The platform function is being built from the ground up. You will have architectural ownership of the entire non-AWS platform layer across a growing portfolio of enterprise SaaS products, with the freedom — and responsibility — to build the reliability and delivery culture of the organisation.
This is not a role that inherits someone else’s decisions and maintains them. Every major architectural choice is still to be made. If you want to build something that lasts and that other engineers depend on, this is the role.

About FAiHr
We are building the Operating System for Talent. At FAIHR, we believe the talent market has a clarity problem. People struggle to understand their strengths and career direction, while organizations rely on signals that reveal only a fraction of a person’s true potential. Through ReflectEngine™, our reflection-aware AI, we help individuals gain clarity about how they think, work, and grow, and help organizations uncover potential beyond keywords and resumes.
With FAIHR OS™, we bring together career clarity for individuals and intelligence for organizations in one unified platform. By combining verified data with behavioral and growth insights, we enable people to communicate their potential with confidence and help companies make more informed talent decisions. We are building the clarity layer the talent ecosystem has been missing.
Similar jobs
What you’ll do
- Build and scale backend services and APIs using Python
- Work on cross-language integrations (Python ↔ PHP)
- Develop frontend features using React (Angular is a plus)
- Deploy, monitor, and manage applications on AWS
- Own features end-to-end: development, performance, and reliability
- Collaborate closely with product, QA, and engineering teams
Tech Stack
- Backend: Python (working knowledge of PHP is a strong plus)
- Frontend: React (Angular is a plus)
- Cloud: AWS
- Version Control: Git / GitHub
Experience
- 5–10 years of professional software development experience
- Strong hands-on experience with Python
- Hands-on experience deploying and managing applications on AWS
- Working knowledge of modern frontend frameworks
lead and optimize paid media campaigns across online marketplaces including Amazon, Flipkart, and Myntra. You will be responsible for building and scaling performance marketing strategies across Meta, Google, and marketplace ad platforms to drive traffic, conversions, and revenue growth.
Key Responsibilities:
1. Marketplace Advertising:
- Manage and optimize performance marketing campaigns on Amazon (AMS), Flipkart Ads, and Myntra Ads.
- Work with marketplace teams and internal category managers to allocate budgets based on SKU performance, seasonal demand, and brand objectives.
- Leverage keyword strategies, sponsored ads, and DSPs to grow product visibility and improve organic rankings.
- Monitor daily performance and adjust bids, keywords, and targeting to meet campaign KPIs.
- Collaborate with design and content teams to develop high-performing ad creatives and landing pages.
- Own A/B testing strategy across channels (copy, creative, audience, placements).
2. Budgeting & Reporting:
- Develop and manage monthly media plans and performance forecasts.
- Maintain strict control over campaign budgets and ensure efficient allocation across channels.
- Provide weekly/monthly reporting on KPIs like ROAS, CPA, CTR, CVR, etc. with actionable insights.
- Identify growth opportunities through data-driven insights and industry benchmarking.
Key Requirements:
- 4–6 years of hands-on experience in performance marketing for e-commerce marketplaces.
- In-depth knowledge of Amazon Advertising Console, Flipkart Ads Manager, and Myntra Ads/MPP.
- Strong analytical and data interpretation skills with proficiency in Excel, and dashboard tools like Looker Studio / Power BI.
- Understanding of marketing attribution, customer journey mapping, and conversion rate optimization.
- Experience working with fashion, apparel, or lifestyle categories is highly preferred.
- Ability to collaborate cross-functionally and manage agency or in-house creative resources.
Preferred Qualifications:
- Bachelor’s degree in Marketing, Business, or related field.
- Master's in Business Administration
- 4–6 years of marketing experience
Candidate MUST HAVE product-based company experience and a minimum of 3 years of experience in DevOps.
What you will do (or learn) :
1. Build our application stack on AWS. Infrastructure as code (read Terraform)
2. Build state-of-the-art CI/CD pipelines.
3. Manage data warehouses and data pipelines.
4. Work on infrastructure and data security.
5. Build a state-of-the-art log management system and the tooling around it.
6. Monitoring and alerting system.
What do we expect from you?
1. 3 to 10 years of experience with DevOps or SRE principles.
2. Good fundamentals of database management and other distributed systems management.
3. Experience in infrastructure as code or other configuration management systems.
4. Experience in scripting languages (such as Bash, Python, Go, etc.)
5. Good understanding of Linux systems
6. Strong debugging and troubleshooting skills
7. Experience in tooling around monitoring, CI/CD, log management systems.

- We will provide the best hike on the candidate's current or offered CTC.
- We will offer up to 25–30 LPA.
Our client is a call management solutions company that helps small to mid-sized businesses manage customer calls and queries through its virtual call center. It is an AI- and cloud-based call operating facility that is both affordable and feature-optimized. Its advanced features, such as call recording, IVR, toll-free numbers, and call tracking, are built on automation and enhance call-handling quality and process for each client according to their requirements. The company serves over 6,000 business clients, including large accounts like Flipkart and Uber.
- Being involved in Configuration Management, Web Services Architectures, DevOps Implementation, Build & Release Management, Database management, Backups, and Monitoring.
- Ensuring reliable operation of CI/CD pipelines
- Orchestrate the provisioning, load balancing, configuration, monitoring and billing of resources in the cloud environment in a highly automated manner
- Logging, metrics and alerting management.
- Creating Dockerfiles
- Creating Bash/Python scripts for automation.
- Performing root cause analysis for production errors.
What you need to have:
- Proficient in the Linux command line and troubleshooting.
- Proficient in AWS services: deploying, monitoring, and troubleshooting applications in AWS.
- Hands-on experience with CI tooling preferably with Jenkins.
- Proficient in deployment using Ansible.
- Knowledge of infrastructure management tools (Infrastructure as Code) such as Terraform, AWS CloudFormation, etc.
- Proficient in deployment of applications behind load balancers and proxy servers such as Nginx and Apache.
- Scripting languages: Bash, Python, Groovy.
- Experience with logging, monitoring, and alerting tools such as ELK (Elasticsearch, Logstash, Kibana), Nagios, Graylog, Splunk, Prometheus, and Grafana is a plus.
1. Strong fundamentals in OOP concepts, exception handling, coding standards, and logging
2. Creating custom, general use modules and components which extend the elements and modules of core Angular.
3. Creating configuration, build, and test scripts for Continuous Integration environments
4. Communicating with external web services and processing data
5. Experience with offline storage, threading, and performance tuning
6. Review code and maintain the code quality and suggest best practices
7. Knowledge and experience on data science and programming languages
8. Demonstrable abilities to optimize code. Strong analytical skills for effective problem solving
About MoEngage
MoEngage is a fast-paced startup that helps companies run smart marketing campaigns to reach their customers. We are a leading Marketing Technology Stack provider that is helping brands redefine their customer engagement in the mobile era. Brands use MoEngage to drive long-term, personalised and context-based engagement across channels to help achieve increased customer retention as well as customer LTV. Sitting at the confluence of diverse technologies like Artificial Intelligence, Big Data, Web & Mobile platforms, MoEngage technology analyses billions of data points generated by customers and their devices to predict customer behavior and build marketing campaigns that proactively engage users.
In just four years since inception, MoEngage has come to work with leading brands across e-commerce, entertainment, travel, publishing and banking domains, among others. With marquee clients like Vodafone, Oyo, Airtel, and McAfee, MoEngage has 125+ paying customers among enterprise and internet companies in India, the US, South East Asia, and the EU. With a global presence spanning 35 countries, MoEngage has offices in San Francisco, Berlin, Jakarta, and Bengaluru.
Today, MoEngage is an industry pioneer in the space and engages more than 350M devices. This includes approximately 40B events tracked per month and 30B+ messages sent to millions of users across the globe.
As part of the Engineering team at MoEngage, here are some things you can expect:
- Take ownership and be responsible for what you build - no micro management
- Work with A players (some of the best talent in the country), and expedite your learning curve and career growth
- Make in India and build for the world at scale of 350M active users, which no other internet company in the country has seen
- Learn together from different teams on how they scale to millions of users and billions of messages.
- Explore the latest in topics like Data Pipeline, MongoDB, ElasticSearch, Kafka, Spark, Samza and share with the team
and more importantly have fun while you work on scaling MoEngage.
About InApps team
The in-app team is responsible for effectively delivering contextual information that helps companies cross-sell and up-sell on specific workflows triggered by actions that application users perform. As a member of the in-app team, you will develop high-performance systems that deliver contextual campaigns in real time. In addition to real-time campaign delivery, you will design a flexible platform that provides customised experiences for application users through web personalization, which allows companies to present unique and personalized experiences in their applications.
- Scaling campaign delivery with personalized content to 500M unique users within 1 sec.
- Rich campaign content delivery keeping user experience native to mobile and web applications.
Skill Requirements
- Proven experience in handling large infrastructure and distributed systems
- Proven experience in managing high performing engineering teams
- Proven experience with at least one cloud computing platform: GCP, Azure, or AWS
- Hands-on experience with Java or Python technologies and frameworks
- Familiarity with ElasticSearch, MongoDB is a plus
- Liaise with Product Management, DevOps, QA and other teams
- Performance management, Sprint management, Roadmap, Hiring, Onboarding, Mentoring, Costing, Documenting
At MoEngage, we are passionate about our team and technology - see below to know more about us and technology.
Life @MoEngage: https://twitter.com/hashtag/lifeatmoengage?f=tweets&vertical=default
Tech @MoEngage: https://twitter.com/hashtag/techatmoengage?f=tweets&vertical=default
Scale @MoEngage: https://www.moengage.com/blog/techatmoengage-reddit-has-330m-monthly-active-users-so-do-we/
We handle more than a billion messages every day. Rest assured, you will be surrounded by really smart and passionate people as we scale much further to build a world-class technology team.
About Us.
We are a fast growing Chennai-based startup (with venture funding)
We are here for the long run and are led by strong engineers with
significant experience at top tier firms like McKinsey, Oracle, Morgan
Stanley, and authors of multiple Java Standards
We're building an awesome enterprise product that
is already transforming businesses across the globe
Our clients include prominent organizations like Swiggy who use us every
day!
We have raised venture investment, so this role will not vanish in a few
months
Our mentors include the Vice Chairman of HCL Technologies, Chairman of
Singapore Airlines, Senior leaders from LinkedIn etc.
Other background information
We're looking for strong, passionate developers who want to join our
team and grow
We move fast, and will make you an offer in a few days for the right
person
The job location will be in Chennai (it is the SaaS capital of India)
Please only apply if you are open to moving to Chennai (or are already
here)
This won't be a 10 - 6 job, but you will be given amazing responsibilities
You will learn a ton, have a lot of flexibility, and have fun while doing it!
Responsibilities and qualifications
5+ years of Android development experience
Will own several parts of our tech stack, but primarily our Android app
Build new functionality to the Android app and regularly re-architect it to
keep up with latest technologies (e.g., Kotlin)
Work on new tech that we are already a leading user of (e.g., Google
Firebase)
Work closely with the backend team to construct creative solutions
Optimize, improve efficiency, scalability, stability of application
MUST BE reliable, and be able to communicate clearly
MUST BE able to own and deliver their own work within deadlines,
professionally
Should be passionate about building a strong engineering culture
Bonus Points for experience building high-scale applications, SDKs and other
web technologies (like JavaScript).
Job Perks
Daily breakfast
Friday team lunches
Macbook
Nice calm work environment, with new furniture
Potential for stock options (in lakhs)
Stipend to attend conferences
Potential travel to Singapore and other client locations
Notice Period: Not more than 45 days.
Android experience : 5 to 10 years mandatory.
CTC: No bar for right candidate.
Candidates comfortable relocating to Chennai can apply.
- Manage sales operations in assigned district to achieve revenue goals.
- Supervise sales team members (the BSMs) on a daily basis and provide guidance whenever needed.
- Identify skill gaps and conduct training for the sales team.
- Work with team to implement new sales techniques to obtain profits.
- Assist in employee recruitment, promotion, retention and termination activities.
- Conduct employee performance evaluation and provide feedback for improvements.
- Contact potential customers and identify new business opportunities.
- Stay abreast of customer needs, market trends and competitors.
- Maintain clear and complete sales reports for management review.
- Build strong relationships with customers for business growth.
- Analyze sales performances and recommend improvements.
- Ensure that sales team follows company policies and procedures at all times.
- Develop promotional programs to increase sales and revenue.
- Plan and coordinate sales activities for assigned projects.






