

Discovered Labs
https://discoveredlabs.com
Jobs at Discovered Labs
Senior Data Engineer
About Discovered Labs
At Discovered Labs we work with $10M - $50M ARR companies to help them get more leads, users and customers from Google, Bing and AI assistants such as ChatGPT, Claude and Perplexity.
We approach marketing the way engineers approach systems: data in, insights out, feedback loops everywhere. Every decision traces back to measurable outcomes. Every workflow is designed to eliminate manual bottlenecks and compound over time.
High-level overview of our approach:
- Data-driven automation: We treat marketing programs like products. We instrument everything, automate the repetitive, and focus human effort on high-leverage problems.
- First principles thinking: We don't copy what others do. We understand the underlying mechanics of how search and AI systems work, then build solutions from that foundation.
- Full-stack ownership: SEO and AEO rarely work as isolated tasks. We work across the entire funnel and multiple surface areas to ensure we own the outcome and clients win.
The Team
We're a deeply technical team building the SpaceX of the AEO & SEO space. You'll work alongside engineers who have built fraud engines powering Stripe, Plaid, and Coinbase; developed self-driving car systems at Aurora; and conducted AI research at Stanford. We don't have layers of management. You'll work directly with founders who can go deep on architecture, code, and product.
This Role
Own the data infrastructure behind automated reporting, AI visibility monitoring, competitive intelligence, and proactive alerting across a growing multi-tenant client base.
The hard problem is operational complexity rather than petabyte-scale volume. Many clients, each with multiple data sources, different schemas, different API rate limits, different failure modes, different freshness requirements. When one client's pipeline breaks, it can't take down everyone else's. Fault isolation, graceful degradation, and per-tenant reliability are built in from the start.
This is largely greenfield. You'll be building out monitoring, observability, data quality layers, and pipeline orchestration.
You report to the CTO and work closely with product engineers who build the features that consume your data layer. You'll define interfaces and data contracts together. There's no platform team. You own your infrastructure, your CI, and your monitoring.
What You'll Do
- Multi-tenant data infrastructure. Ingestion, validation, and transformation across multiple data sources. Fault isolation, schema variation, and graceful upstream failure handling.
- Third-party API integration. Most of our data comes from external APIs with their own auth flows, rate limits, pagination quirks, and breaking changes. You'll build robust, resilient connectors that handle all of this gracefully across many client accounts.
- Data quality systems. Automated checks on distributions, volumes, null rates, and freshness. Statistical validation, not just schema validation. Bad data doesn't make it downstream.
- Data observability. Freshness monitoring, volume anomaly detection, schema drift detection, lineage tracking, blast radius analysis. You know the difference between "the code ran" and "the data is correct."
- Alerting design. Not just dashboards. Threshold tuning, noise reduction, avoiding alert fatigue. Mean time to detection is a core metric for this role.
- Freshness SLAs. Define them per source, build infrastructure to meet them, alert before they breach.
- Event-driven trigger infrastructure. Surface performance changes, quality regressions, and freshness violations as events for downstream systems.
- Entity data models. Design schemas for client, competitor, and content entities. Own schema evolution and backward compatibility.
- Operational environment. CI/CD, containers, deployment pipelines, credential management. Every deploy passes CI before production.
The Ideal Person for This Role
- A builder who ships. You care about getting working systems into production, not endless planning or polish. You've built data infrastructure people actually rely on.
- An operator, not just an architect. You don't just design systems, you run them. You find satisfaction in making things reliable, not just making them work once.
- An owner. You take responsibility for outcomes, not just tasks. When a pipeline you built breaks at 3am, you fix it and make sure it doesn't break again.
- Humble and curious. You acknowledge what you don't know, ask good questions, and genuinely want to learn. You take feedback as a gift, not a threat.
- A first-principles thinker. You understand why things work, not just how. You can go five levels deep on schema decisions, validation strategies, and architecture tradeoffs.
- Always improving. You're not satisfied with "good enough." You actively seek ways to get better at your craft and make systems better over time.
Requirements
- 4+ years in data engineering, platform engineering, or infrastructure-heavy backend work.
- Python, SQL, pipeline orchestration (Airflow, Dagster, Prefect, or similar).
- Event-driven architectures or real-time data processing.
- Third-party API integration. You've built resilient connectors against external APIs with rate limits, auth flows, pagination, and breaking changes. Not just calling endpoints, but handling the full operational reality.
- Pipeline fundamentals. Idempotent pipelines, backfill strategies, and schema evolution handled gracefully in production.
- Data quality systems in production. Automated checks on distributions, volumes, freshness, null rates. Not a one-off notebook.
- Data observability. Freshness monitoring, anomaly detection, lineage tracking, blast radius analysis.
- Alerting design. Threshold tuning, noise reduction, escalation paths. You've thought about false positives as much as missed detections.
- Own your infrastructure. Containers, CI/CD, deployment pipelines, monitoring, credential management. No platform team to hand off to.
- Multi-tenant or multi-client data systems. Tenant isolation, per-client configuration, and operational overhead at scale.
- APIs or service layers for data exposure. You've built interfaces that other systems consume, not just internal scripts.
- Collaborative. You'll work closely with product engineers to define data contracts and interfaces. You document decisions, write clear specs, and communicate tradeoffs clearly in writing.
Preferred Qualifications
- Experience with marketing or analytics data (GA4, GSC, SEO tools)
- Prior experience at a fast-moving startup
What's in It for You
- Fully remote position
- Work directly with the CTO on high-impact projects
- High ownership and autonomy. No micromanagement.
- First-hand exposure to cutting-edge AI and search technology
- Your work will directly impact the performance of well-known companies ($10M+ ARR)
- Join a fast-growing company at the intersection of AI and marketing
Our Hiring Process
- Application
- Take-Home Project
- Technical Deep Dive
- Leadership Interview
- Reference Checks
Please apply here:
tinyurl[dot]com/ysk8w2eu





