Unsupervised-Thai-Document-Clustering-with-Sanook-news
Public
An unsupervised model for clustering Thai news. It uses TF-IDF and SimCSE-WangchanBERTa, weighted by the number of named entities, as the vector representation, and k-means as the clustering model.
Topics: document-clustering, huggingface-transformers, k-means-clustering, name-entity-recognition, nlp-machine-learning, sentence-embeddings, thai-nlp
Created: 2022-07-24T18:52:02
Updated: 2023-07-21T15:26:46
Stars: 1
Stars Increase: 0
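
The description mentions entity-weighted vectors fed into k-means. The following is a minimal sketch of one way that idea could work, not the repository's actual code. It assumes that each document is a list of sentence embeddings (toy 2-D vectors here, standing in for SimCSE-WangchanBERTa output), that each sentence has a precomputed named-entity count, and that the weight is `1 + count`. The function names `weighted_document_vector` and `kmeans` are hypothetical, and the k-means loop is a plain standard-library version written for illustration.

```python
import math
import random


def weighted_document_vector(sentence_vectors, entity_counts):
    """Average sentence vectors, weighting each by 1 + its named-entity count.

    The "1 +" term is an assumption of this sketch: it keeps sentences
    without any named entity from being dropped entirely.
    """
    weights = [1 + count for count in entity_counts]
    total = sum(weights)
    dim = len(sentence_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vectors)) / total
        for i in range(dim)
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) with Euclidean distance."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data: (sentence embeddings, named-entity count per sentence) for 4 documents.
docs = [
    ([[1.0, 0.0], [0.8, 0.2]], [2, 0]),
    ([[0.9, 0.1]], [1]),
    ([[0.0, 1.0], [0.2, 0.8]], [0, 3]),
    ([[0.1, 0.9]], [0]),
]
doc_vectors = [weighted_document_vector(vecs, counts) for vecs, counts in docs]
labels, centroids = kmeans(doc_vectors, k=2)
print(labels)
```

In a real pipeline the toy vectors would be replaced by model embeddings and the entity counts by the output of a Thai named-entity recognizer. A library implementation of k-means would normally be preferable to this hand-written loop.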