Moonshot
$200
Input tokens / million
Output tokens / million
131
Context length
amd
PARD is a high-performance speculative decoding method that converts autoregressive draft models into parallel draft models at low cost, significantly accelerating large language model inference.