Structuring the World’s Supply Chain Data
We are the Data Platform Engineering team. We build the digital infrastructure that turns chaotic GPS signals into predictable logistics decisions.
Our Mission
Global logistics is fragmented. A single shipment might pass through a manufacturer’s TMS, a broker’s load board, a carrier’s dispatch system, and a driver’s mobile app. Each system speaks a different language (EDI 214, CSV, JSON, XML).
Our mission is not just to “move data,” but to standardize reality. We ingest millions of events per hour, normalize them into a coherent Event Model, and serve them to downstream applications that route trucks, pay drivers, and predict delays.
Engineering Values
Idempotency is King
Networks fail. Webhooks get retried. EDI files get re-sent. Our pipelines handle duplicate events gracefully: every message carries a unique Event ID and a State Hash, so processing the same message twice never results in double-billing or corrupted metrics.
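The idea can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the `processed` dict stands in for a durable key-value store, and the event shape (`event_id`, `charge_cents`) is hypothetical.

```python
import hashlib
import json

# In-memory stand-in for a durable store: event_id -> state hash already applied.
processed = {}

def state_hash(event: dict) -> str:
    """Stable hash of the payload, with key order normalized."""
    return hashlib.sha256(json.dumps(event, sort_keys=True).encode()).hexdigest()

def apply_event(event: dict, ledger: list) -> bool:
    """Apply the event exactly once. Returns True only if state changed."""
    h = state_hash(event)
    if processed.get(event["event_id"]) == h:
        return False  # duplicate delivery: same ID, same payload -> no-op
    processed[event["event_id"]] = h
    ledger.append(event)  # e.g. a billing line item
    return True

ledger = []
evt = {"event_id": "evt-1", "shipment": "S-42", "charge_cents": 500}
apply_event(evt, ledger)
apply_event(evt, ledger)  # retried webhook: silently ignored
```

Because the dedup key includes the state hash and not just the ID, a *re-sent* message is a no-op, while a *corrected* message with the same ID but different payload can still be detected and handled deliberately.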
Schema Evolution, Not Revolution
We maintain backward compatibility. A shipment API contract is a promise. We use Schema Registry to enforce validation, ensuring that a change in the upstream ELD provider format doesn’t break the downstream billing dashboard.
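A compatibility gate along these lines is what the registry enforces for us. The sketch below is a simplified stand-in (plain dicts instead of real Avro/Protobuf schemas, and hand-rolled rules rather than actual Schema Registry compatibility modes): existing fields must keep their type, and new fields must be optional.

```python
# Hedged sketch of a backward-compatibility check. Schemas are modeled as
# field -> {"type": ..., "required": bool}; a real pipeline would delegate
# this to the Schema Registry's compatibility checker.
def is_backward_compatible(old: dict, new: dict) -> bool:
    # Existing fields must survive with the same type; dropping or
    # retyping one breaks every downstream reader.
    for field, spec in old.items():
        if field not in new or new[field]["type"] != spec["type"]:
            return False
    # Newly added fields must be optional so existing producers stay valid.
    return all(not spec.get("required", False)
               for field, spec in new.items() if field not in old)

v1 = {"shipment_id": {"type": "string", "required": True}}
v2 = {**v1, "eta": {"type": "timestamp", "required": False}}  # additive, OK
v3 = {"shipment_id": {"type": "int", "required": True}}       # type change, rejected
```

Running this check in CI against every proposed contract change is how "evolution, not revolution" becomes enforceable rather than aspirational.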
Visibility > Complexity
We prefer simple, observable pipelines over complex “black boxes.” If a truck is shown in the wrong location, a Data Engineer should be able to trace the lineage back to the raw GPS ping in under 5 minutes using our trace headers.
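The trace-header idea can be illustrated with a toy two-stage pipeline. Everything here is hypothetical naming (`trace_id`, `lineage`, the stage functions); the point is only that the identifier minted at ingestion rides along unchanged, so any curated record can be walked back to its raw ping.

```python
import uuid

def ingest_ping(lat: float, lon: float) -> dict:
    """Raw ingestion mints the trace_id that every later stage must carry."""
    return {"trace_id": str(uuid.uuid4()), "lat": lat, "lon": lon, "stage": "raw"}

def normalize(ping: dict) -> dict:
    """Each stage copies trace_id forward and appends itself to the lineage."""
    return {**ping, "stage": "silver", "lineage": [ping["stage"], "silver"]}

raw = ingest_ping(41.88, -87.63)
curated = normalize(raw)
# curated["trace_id"] == raw["trace_id"], so a wrong location in the
# curated table can be traced straight back to the original GPS ping.
```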
The Stack
We operate a modern Lakehouse architecture.
- Compute: Spark (Databricks) & Flink for streaming.
- Storage: Delta Lake on S3.
- Orchestration: Airflow.
- Quality: Great Expectations.
- IaC: Terraform.
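To give a flavor of the quality layer, here is the kind of expectation we enforce on GPS batches, written as plain Python rather than actual Great Expectations syntax (the field names and thresholds are illustrative, not our real suite):

```python
# Toy data-quality gate: validate a batch of GPS rows and report failures.
# In production this lives in Great Expectations; this is a plain-Python sketch.
def check_gps_batch(rows: list) -> list:
    failures = []
    for i, row in enumerate(rows):
        if not (-90 <= row["lat"] <= 90 and -180 <= row["lon"] <= 180):
            failures.append(f"row {i}: lat/lon out of range")
        if row.get("shipment_id") is None:
            failures.append(f"row {i}: missing shipment_id")
    return failures

batch = [
    {"lat": 41.9, "lon": -87.6, "shipment_id": "S-1"},   # clean
    {"lat": 120.0, "lon": 0.0, "shipment_id": None},      # two violations
]
```

A batch that produces a non-empty failure list is quarantined before it can reach the silver tables, which keeps bad pings out of routing and billing.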
Team Topology
We are organized into “Stream-Aligned Teams.” The Ingestion Team focuses on high-throughput connectivity with carriers. The Data Modeling Team curates the silver/gold tables. The Platform Team builds the underlying tooling (CI/CD, Observability) that empowers the other teams to move fast.