The translated data: FlexFlow is an open-source distributed deep learning framework that offers low-latency, high-performance LLM model services through speculative inference and tree-based parallel decoding technologies. It supports both data parallelism and model parallelism training, mixed-precision training, and common deep learning models. FlexFlow can be deployed in multi-GPU environments, providing both Python and C++ APIs, and supports the import of mainstream deep learning frameworks.