Python clone of Spark, a MapReduce-like framework in Python
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
A beginner's guide to big data :star:
A curated list of awesome big data frameworks, resources and other awesomeness.
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Focused on big data learning and interview preparation; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second
Enterprise job scheduling middleware with distributed computing ability.
High-performance distributed object storage that is faster than MinIO