count-tokens-hf-datasets

Public

This project shows how to compute the total number of training tokens in a large text dataset from Hugging Face Datasets using Apache Beam and Dataflow.
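The counting idea can be sketched in plain Python as a per-shard count followed by a global sum, which is roughly the shape a distributed pipeline would take. This is a minimal standard-library stand-in, not the project's Apache Beam code: the whitespace tokenizer, the function names (`tokenize`, `count_tokens_in_shard`, `total_tokens`), and the sample shards are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    # Assumption: whitespace splitting stands in for the real tokenizer.
    # A subword tokenizer would produce different counts.
    return text.split()


def count_tokens_in_shard(shard: Iterable[str]) -> int:
    # "Map" stage: each worker produces a partial count for its shard.
    return sum(len(tokenize(text)) for text in shard)


def total_tokens(shards: List[List[str]], workers: int = 4) -> int:
    # "Combine" stage: sum the partial counts from all shards.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_tokens_in_shard, shards))


# Hypothetical sample data: two shards of raw text records.
shards = [
    ["the quick brown fox", "jumps over"],
    ["the lazy dog"],
]
print(total_tokens(shards))
```

Because each shard is counted independently and the partial counts are combined by addition, the same structure should carry over to a distributed runner, where the per-shard step runs on separate workers.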

Created: 2022-06-10T11:25:54
Updated: 2025-06-11T20:40:26
Stars: 27
Stars Increase: 0