Acceptable Use & Query Governance
BigData-ETL processes petabytes of geospatial data. To ensure fair access and system stability for all tenants, we enforce strict limits on compute and API usage.
API Rate Limits
We utilize a Token Bucket algorithm to throttle requests. Limits are applied per `api_key`, and your remaining allowance is returned in every response via headers such as `X-RateLimit-Remaining`.
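For intuition, the enforcement model can be sketched as follows. This is an illustrative token bucket only, not the service's actual implementation; `rate` and `capacity` are hypothetical parameters.

```python
import time

class TokenBucket:
    """Illustrative token bucket: refills `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)   # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1            # spend one token for this request
            return True
        return False                    # bucket empty -> request would be throttled
```

A burst of requests drains the bucket quickly; sustained traffic is capped at the refill rate, which is why short bursts succeed but hammering the API does not.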
429 Handling: If you receive a 429 Too Many Requests response, you must implement an exponential backoff strategy. Clients that continue to hammer the API while rate-limited may be temporarily blacklisted.
Warehouse Query Guidelines
Our data lake tables (specifically `events_silver` and `telemetry_history`) are massive. Scanning them without filters is prohibitively expensive and slow. The Query Optimizer (Gatekeeper) will automatically reject queries that violate the following rules:
- Partition Pruning Required: All queries against history tables must filter by `event_date` or `partition_year_month`.
- No Open-Ended Joins: Joins between large tables must include equality predicates on the join keys. Cross-joins are blocked.
- Limit Clause: Ad-hoc queries via the SQL Editor must include a `LIMIT` clause (max 10,000 rows).
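A query that satisfies all three rules might look like this (the column names `shipment_id` and `sensor_reading`, and the partition literal, are illustrative only):

```sql
SELECT e.shipment_id, t.sensor_reading
FROM events_silver e
JOIN telemetry_history t
  ON e.shipment_id = t.shipment_id               -- equality predicate on the join key
WHERE e.event_date >= DATE_SUB(CURRENT_DATE(), 7) -- partition pruning on events_silver
  AND t.partition_year_month = '2024-05'          -- partition pruning on telemetry_history
LIMIT 10000                                       -- required for ad-hoc SQL Editor queries
```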
The “Penalty Box”
Users who repeatedly submit queries that scan > 1 TB of data without valid filters will be automatically moved to the `Low_Priority` warehouse pool. This pool has reduced CPU allocation and no concurrency guarantees. To return to the standard pool, you must complete the “Query Optimization” training module.
Common Anti-Patterns
Anti-pattern:

```sql
SELECT * FROM events_silver WHERE event_type = 'shipment.arrived'
```

Why it fails: with no partition filter, this triggers a full table scan of 5 years of history to find arrival events. It will time out.

Corrected:

```sql
SELECT * FROM events_silver
WHERE event_date >= DATE_SUB(CURRENT_DATE(), 7)
  AND event_type = 'shipment.arrived'
```

Why it works: the `event_date` filter allows the engine to skip roughly 99% of the files (partition pruning).