Based on the Creative Mobile Technologies revenue anomaly investigation, here are the critical data assets that need enhanced testing coverage, organized by your strategic testing framework:
raw_nyc_taxi (Highest Priority)Why This Needs Better Testing:
Existing Tests: table_stats (SYNQ)
Recommended Tests:
-- Schema & Content Checks
- not_null: [vendor_id, total_amount, pickup_datetime, dropoff_datetime]
- accepted_values: vendor_id in [1, 2] (known valid vendors)
- positive_values: [total_amount, fare_amount, trip_distance]
- date_range: pickup_datetime < dropoff_datetime
-- Volume & Freshness Monitoring
- anomaly_detection: daily trip counts by vendor_id
- freshness: data arrives within 24 hours of trip date
- completeness: vendor_id distribution stays within historical rangesBusiness Impact: Prevents $37M+ revenue calculation errors from propagating
vendor_codes Seed TableWhy This Needs Better Testing:
Existing Tests: relationships (dbt), not_null vendor_name (dbt)
Recommended Tests:
-- Critical Mapping Validation
- unique: vendor_id
- not_null: [vendor_id, vendor_name]
- accepted_values: vendor_id in [1, 2]
- regression_test: vendor_name mappings match historical snapshotss_nyc_taxi (Critical - Business Logic Layer)Why This Asset is Undertested:
Existing Tests: relationships vendor_id/ratecode_id/payment_type (dbt), table_stats (SYNQ), custom SQL test (SYNQ)
Recommended Tests:
-- Test New Logic Introduced
- expression_is_true: "adjusted_total_amount >= 0"
- custom_sql: "SELECT COUNT(*) FROM {{ ref('s_nyc_taxi') }}
WHERE adjusted_total_amount = 0 AND total_amount > 0"
-- Alert if revenue is being artificially zeroed
-- Business Rule Documentation
- data_test: flag_business_logic_modifications
description: "Alert when revenue adjustment logic is active"nyc_taxi_coded (Join Integrity)Why Testing is Needed:
Existing Tests: inherited relationships (dbt)
Recommended Tests:
-- Guard Against Fan-out
- cardinality_equality: source row count = output row count
- unique_combination: [vendor_id, pickup_datetime, trip_distance]
- not_null: vendor_name (ensure join succeeded)vendor_results (Revenue Aggregation)Why This Needs Integration Testing:
Existing Tests: revenue anomaly monitor (SYNQ), inherited relationships (dbt)
Recommended Tests:
-- Integration-Style Business Logic
- expression_is_true: "market_share_trips <= 100"
- expression_is_true: "market_share_revenue <= 100"
- expression_is_true: "total_revenue >= 0"
-- Historical Consistency
- regression_test: daily_vendor_revenue_snapshot
description: "Compare daily totals to historical ranges"
-- Cross-Vendor Validation
- custom_sql: "SELECT day FROM {{ ref('vendor_results') }}
GROUP BY day
HAVING SUM(market_share_revenue) NOT BETWEEN 95 AND 105"Why Monitor Testing is Critical:
Existing Tests: custom_numeric revenue monitor (SYNQ)
Recommended Enhancements:
Current Gap: No visibility into business logic changes affecting revenue
Recommended Monitoring:
# dbt meta configuration
models:
s_nyc_taxi:
meta:
business_logic_active: true
revenue_impact: high
alert_on_modification: trueraw_nyc_taxi vendor_id validation and volume monitorings_nyc_taxi business logic documentation and modification alertsvendor_results negative revenue and market share validationvendor_codes regression testing for mapping changesnyc_taxi_coded join cardinality protectionHigh-Impact, Low-Cost Tests:
Medium-Impact, Medium-Cost Tests:
Low-Impact, High-Cost Tests to Avoid:
Bottom Line: Focus testing budget on source data validation and business logic change detection rather than comprehensive model coverage. The Creative Mobile Technologies issue would have been prevented by a single test flagging revenue modification logic in s_nyc_taxi.
| Asset | Current Tests & Monitors | Business Impact | Critical Gaps | Priority |
|---|---|---|---|---|
| raw_nyc_taxi (Source Layer) | • SYNQ table_stats monitor • Basic volume/freshness only | CRITICAL Single point of failure for $37M+ revenue | • No vendor_id validation • No revenue component checks • No date logic validation • No vendor-specific volume monitoring | 🔴 High |
| s_nyc_taxi (Staging - Business Logic) | • 3 dbt relationship tests (vendor_id, ratecode_id, payment_type) • SYNQ table_stats monitor • 1 custom SQL test (unknown) | CRITICAL Contains revenue modification logic | • No business logic change alerts • No revenue adjustment validation • No documentation of CASE logic | 🔴 High |
| vendor_codes (Reference Data) | • 1 dbt relationship test • 1 not_null test (vendor_name) | HIGH Controls revenue attribution | • No regression testing • No uniqueness on vendor_id • No accepted values validation | 🟡 Medium |
| vendor_results (Data Product) | • SYNQ revenue anomaly monitor ✅ • Inherited relationship checks | HIGH Direct business reporting | • No market share validation • No negative revenue checks • No cross-vendor consistency | 🟡 Medium |
| revenue_per_distance (Analytics) | • 13 dbt tests (extensive) • not_null, expression_is_true, range checks • Status: 111K+ test failures | MEDIUM Analytical reporting | • Over-testing causing noise • Unrealistic constraints • Test fatigue from failures | 🟢 Low |
| borough_popularity (Analytics) | • 14 dbt tests (comprehensive) • accepted_values, positive_value • SYNQ volume monitor by borough • Status: Some failing tests | MEDIUM Geographic analysis | • Well-monitored overall • Some constraint tuning needed | 🟢 Low |
| Layer | Current State | Recommendation | Business Justification |
|---|---|---|---|
| Source Layer | Under-tested | Add vendor_id validation, revenue checks | Prevents 80% of downstream issues |
| Staging Layer | Missing business logic alerts | Add modification monitoring | Would have prevented $600K+ revenue issue |
| Reference Data | Basic coverage | Add regression testing | Ensures revenue attribution accuracy |
| Data Products | Anomaly detection working | Enhance with business context | Reduces false positive investigations |
| Analytics | Over-tested | Reduce redundant checks | Focus budget on higher-impact areas |
✅ What Worked:
❌ What Failed:
🎯 Strategic Finding:
Current testing is inverted - extensive coverage on final outputs but minimal validation where business logic lives. One targeted test in s_nyc_taxi would have prevented the entire Creative Mobile Technologies incident.
Bottom Line: The current testing strategy is inverted - heavy testing on final outputs but minimal validation at source and transformation layers where business logic lives. The Creative Mobile Technologies issue would have been prevented by a single test flagging revenue modification logic in s_nyc_taxi, while the extensive testing on revenue_per_distance (13 tests) didn't prevent any business-critical issues.