realtime-cdc-pipeline-docker
An end-to-end real-time data pipeline that uses Debezium (CDC) to stream changes from PostgreSQL to Kafka, processes them with Apache Spark Structured Streaming, and sinks the results into ClickHouse for analytics. The pipeline is orchestrated by Airflow and fully containerized with Docker Compose.
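
To make the data flow more concrete, here is a minimal sketch of the processing step: flattening one Debezium change event, as it would arrive on a Kafka topic, into a row that could be written to ClickHouse. It is an illustration in plain Python, not this repository's Spark job. The function name `flatten_change_event`, the `_op` and `_ts_ms` columns, and the sample `id`/`status` fields are hypothetical. The envelope fields `op`, `before`, `after`, and `ts_ms` follow Debezium's standard change event format.

```python
import json
from typing import Any, Dict, Optional

# Debezium operation codes: c = create, u = update, d = delete, r = snapshot read.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def flatten_change_event(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Turn one Debezium JSON message into a flat row (hypothetical helper).

    Returns None for messages that carry no row data, such as tombstones.
    """
    if not raw:
        return None  # Kafka tombstone: a null value that follows a delete
    event = json.loads(raw)
    if not isinstance(event, dict):
        return None
    # When the JSON converter includes schemas, the envelope sits under "payload".
    payload = event.get("payload", event)
    if not isinstance(payload, dict):
        return None
    op = payload.get("op")
    if op not in OP_NAMES:
        return None
    # Deletes carry the last known row state in "before"; other ops use "after".
    row = payload.get("before") if op == "d" else payload.get("after")
    if row is None:
        return None
    # "_op" and "_ts_ms" are assumed metadata columns for the analytics table.
    return {**row, "_op": OP_NAMES[op], "_ts_ms": payload.get("ts_ms")}


if __name__ == "__main__":
    # Hypothetical update event for a row with "id" and "status" columns.
    sample = json.dumps({
        "payload": {
            "op": "u",
            "before": {"id": 1, "status": "new"},
            "after": {"id": 1, "status": "paid"},
            "ts_ms": 1700000000000,
        }
    })
    print(flatten_change_event(sample))
```

In a Spark Structured Streaming job the same logic would typically be expressed with `from_json` and column expressions over the Kafka `value` column. Keeping the operation type and timestamp on each row is one way to let the ClickHouse side decide how to handle updates and deletes.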