Live Twitter sentiment analysis using Python, Apache Spark Streaming, Kafka, NLTK, and SocketIO
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Open-source IoT Platform - Device management, data collection, processing and visualization.
A beginner's guide to big data :star:
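Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization" is welcome.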
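[Big Tech Interview Column] A technical guide for Java programmers, covering interview questions, system architecture, career tips, mainstream middleware, and more, to help you become a better version of yourself!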
NLTK Source
CMAK is a tool for managing Apache Kafka clusters
Open-Source Web UI for Apache Kafka Management
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!