Amazon plans to launch an AI content market, allowing publishers to directly sell content copyrights to tech companies, aiming to resolve copyright disputes over training data for large models and promote standardized content licensing.
Blogger Tim reviews ByteDance's AI video model Seedance 2.0, praising its generation accuracy but noting ethical concerns: it can render blind spots the camera never observed and clone voices without authorization, raising industry worries about AI training data sources and privacy.
Microsoft launches a 'Publisher Content Market' to create a transparent AI content licensing platform, providing legally licensed data for AI training and addressing copyright disputes.
OpenAI is seeking alternative AI computing solutions outside of NVIDIA, as it is dissatisfied with the response speed of NVIDIA's latest chips in the inference stage. The company found that hardware speed has become a bottleneck in complex interactions such as code generation, so its strategic focus is shifting from model training to inference optimization.
An AI-driven sales training platform that includes role-playing, real-time coaching, and call analysis to help teams close more orders.
Professional AI tool capable of generating images and videos, and training custom characters.
An AI agent framework for marketing and sales training that requires no prompts and has no learning curve.
PitchFit offers AI-driven startup analysis and personalized training tools to help turn business ideas into reality.
Model API pricing (input/output in $ per 1M tokens; context length in K tokens; values lost in extraction are marked "—"):
Alibaba:  input $6, output $24, context 256
Moonshot: input $4, output $16, context —
OpenAI:   input $0.63, output $3.15, context 131
ChatGLM:  input —, output —, context 128
iFlytek:  input $0.5, output —, context —
Baidu:    input —, output —, context 32
Tencent:  input —, output —, context 28
Huawei:   input —, output —, context —
Google:   input —, output —, context —
prithivMLmods
CodeV is a 7-billion-parameter vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct. Through two-stage training of supervised fine-tuning (SFT) and reinforcement learning (RL) with tool-aware policy optimization (TAPO), it aims to achieve reliable and interpretable visual reasoning. It represents visual tools as executable Python code and uses a reward mechanism to enforce consistency between tool usage and question evidence, addressing tool invocations that are irrelevant to the evidence even when answer accuracy is high.
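The tool-consistency idea can be sketched as a toy reward function; the function, its fields, and the bonus weighting below are hypothetical illustrations, not CodeV's published TAPO implementation:

```python
# Hypothetical sketch of a TAPO-style reward (not CodeV's actual code):
# full credit only when the final answer is correct AND the generated tool
# code actually references the visual evidence it should use.
def tapo_reward(answer, gold, tool_code, evidence_regions, bonus=0.25):
    correct = 1.0 if answer == gold else 0.0
    # A tool call is "consistent" if the emitted Python mentions at least one
    # evidence region; accurate answers with irrelevant tool calls lose credit.
    consistent = any(region in tool_code for region in evidence_regions)
    if correct and consistent:
        return correct + bonus
    if correct:
        return correct - bonus
    return 0.0

grounded = tapo_reward("cat", "cat", "crop(image, box_3)", ["box_3"])
ungrounded = tapo_reward("cat", "cat", "identity(image)", ["box_3"])
```

Separating the correctness term from the consistency bonus is what lets the policy be penalized for gratuitous tool calls even when the answer is right.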
GuangyuanSD
Z-Image-Re-Turbo is a text-to-image generation model re-accelerated from the Z-Image-De-Turbo model. It aims to balance training convenience with inference speed: it restores fast generation close to the original Turbo model while keeping Z-Image-De-Turbo's training-friendly properties, making it compatible with the large number of trained LoRA models in the Z-Image ecosystem.
open-thoughts
OpenThinker-Agent-v1-SFT is an agent model obtained through supervised fine-tuning (SFT) based on Qwen/Qwen3-8B. It is the first-stage model of the complete training process (SFT + RL) of OpenThinker-Agent-v1, specifically optimized for agent tasks (such as terminal operations and code repair).
PrimeIntellect
INTELLECT-3 is a Mixture of Experts (MoE) model with 106 billion parameters, trained through large-scale reinforcement learning. It demonstrates excellent performance in mathematics, coding, and reasoning benchmark tests. The model, training framework, and environment are all open-sourced under a permissive license.
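As a rough illustration of the Mixture-of-Experts idea behind such models, a sketch of top-k expert routing follows; the router scores, expert count, and toy scalar "experts" are invented for the example and do not reflect INTELLECT-3's actual architecture:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_scores, top_k=2):
    # Keep only the top_k experts for this token; renormalize their gates.
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([router_scores[i] for i in chosen])
    # Output is the gate-weighted sum of the selected experts' outputs.
    return sum(g * experts[i](x) for g, i in zip(gates, chosen))

# Toy scalar experts standing in for feed-forward blocks.
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
out = moe_forward(3.0, experts, router_scores=[0.1, 2.0, -1.0], top_k=2)
```

Because only top_k experts run per token, a 106B-parameter MoE activates far fewer parameters per forward pass than a dense model of the same size.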
Shawon16
This is a video understanding model fine-tuned on an unknown dataset based on the VideoMAE-base architecture, specifically designed for sign language recognition tasks. The model achieved an accuracy of 18.64% after 20 training epochs.
Gjm1234
Wan2.2 is a major upgrade version of the basic video model, focusing on integrating innovative technologies such as effective MoE architecture, efficient training strategies, and multimodal fusion into the video diffusion model, bringing more powerful and efficient solutions to the field of video generation.
TeichAI
This model is obtained through distillation training using the high-reasoning-difficulty Gemini 3 Pro preview dataset based on the Qwen3-4B-Thinking-2507 base model. It focuses on improving complex reasoning abilities in the fields of coding and science. Through training with specific datasets, it aims to efficiently transfer the reasoning abilities of large models (such as Gemini 3 Pro) to a smaller-scale model.
MCG-NJU
SteadyDancer is a powerful animation framework based on the image-to-video paradigm, specifically designed to generate high-fidelity and temporally coherent human body animations. This framework effectively solves the identity drift problem in traditional methods through a robust first frame retention mechanism, performs excellently in visual quality and controllability, and significantly reduces the training resource requirements.
Clemylia
Gheya-1 is a new-generation base language model in the LES-IA-ETOILES ecosystem, with 202 million parameters. It is an upgraded version of the older Small-lamina series, designed for professional fine-tuning and trained with targeted data in artificial intelligence, professional language models, and biology.
ExaltedSlayer
Gemma 3 is a lightweight open-source multimodal model launched by Google. This version is an instruction-tuned, quantization-aware-trained model with 12B parameters, converted to the MXFP4 format for the MLX framework. It supports text and image input with text output, offers a 128K context window, and covers over 140 languages.
This is a video understanding model fine-tuned on an unknown dataset based on the MCG-NJU/videomae-base model. After 20 epochs of training, it achieved an accuracy of 13.31% on the evaluation set. This model is specifically optimized for video analysis tasks.
Charlotte-AMY is a fine-tuned small language model developed by Clemylia, with 51 million parameters, focusing on the fields of hope, friendship, ethics, and support. This model adheres to the concept of 'training quality is better than the number of parameters' and performs excellently in semantic clarity and coherence, providing high-quality ethical consultation and emotional support services.
VibeThinker-1.5B is a 1.5-billion-parameter dense language model launched by Weibo AI. It is fine-tuned based on Qwen2.5-Math-1.5B and is specifically designed for mathematical and algorithmic coding problems. Trained using the 'Spectrum to Signal Principle' framework, it outperforms larger models in multiple math competition tests. The training cost is approximately $7,800, and it supports an output of up to about 40k tokens.
allenai
Olmo 3 is a new generation of language model family developed by the Allen Institute for AI, including 7B and 32B instruction and thinking variants. This model performs excellently in long-chain thinking and can significantly improve the performance of reasoning tasks such as mathematics and coding. All code, checkpoints, and training details will be made public to promote the development of language model science.
Olmo 3 is a series of language models developed by the Allen Institute for AI, available at two scales, 7B and 32B, each with Instruct and Think variants. The models perform excellently in long-chain thinking and can effectively improve performance on reasoning tasks such as mathematics and coding. They adopt a multi-stage training method, including supervised fine-tuning, direct preference optimization, and reinforcement learning with verifiable rewards.
Olmo-3-7B-Think-DPO is a 7B parameter language model developed by the Allen Institute for AI. It has the ability of long-chain thinking and performs excellently in reasoning tasks such as mathematics and coding. This model has undergone multi-stage training including supervised fine-tuning, direct preference optimization, and reinforcement learning based on verifiable rewards, and is designed specifically for research and educational purposes.
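The direct preference optimization stage mentioned above trains on chosen/rejected response pairs; a minimal sketch of the per-pair DPO loss, with placeholder log-probabilities (in practice these come from the policy and a frozen reference model), looks like this:

```python
import math

# Sketch of the DPO objective for one preference pair; the log-probability
# values used below are illustrative placeholders, not model outputs.
def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Implicit reward of each response: beta * (policy logp - reference logp).
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # Negative log-sigmoid of the margin: small when the chosen response is
    # clearly preferred by the policy relative to the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

easy = dpo_loss(-2.0, -9.0, -5.0, -6.0)  # policy widens the chosen/rejected gap
hard = dpo_loss(-5.0, -4.0, -5.0, -6.0)  # policy prefers the rejected answer
```

The loss shrinks as the policy increases the chosen-vs-rejected margin beyond the reference model's, which is what steers the SFT checkpoint toward preferred reasoning traces.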
mradermacher
This project provides a static quantized version of the Qwen-4B-Instruct-2507-Self-correct model, supporting tasks such as text generation, bias mitigation, and self-correction. Based on the Qwen-4B architecture, the model has undergone instruction fine-tuning and self-correction training, offering multiple quantized versions to meet different hardware requirements.
sensenova
SenseNova-SI is a series of spatial intelligence enhancement models built on multimodal foundation models. Through careful training with 8 million samples of data, it has achieved excellent performance in multiple spatial intelligence benchmark tests while maintaining strong general multimodal understanding capabilities.
This model is based on Qwen3-4B-Thinking-2507 and fine-tuned on 1000 examples of GPT-5-Codex. It focuses on text generation tasks and uses Unsloth technology to achieve a 2-fold increase in training speed.
SenseNova-SI is a series of spatial intelligence enhancement models built on mature multimodal foundation models. Through training with carefully curated 8 million data samples, it demonstrates excellent performance in multiple spatial intelligence benchmark tests while maintaining strong general multimodal understanding ability.
The Linear Regression MCP project demonstrates an end-to-end machine learning workflow using Claude and the Model Context Protocol (MCP), including data preprocessing, model training, and evaluation.
A comprehensive chess analysis MCP server that integrates Stockfish engine evaluation, thematic analysis, opening databases, puzzle training, and game visualization, providing advanced chess analysis and game improvement functions.
This project is an MCP server for managing memory text files, helping AI models like Claude maintain context between conversations. It provides functions to add, search, delete, and list memories, supporting exact matching operations based on substrings. It is designed to store memories in simple text files, similar to ChatGPT's memory mechanism, and triggers memory storage through prompts and training.
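Assuming a one-memory-per-line plain text file (the listing does not specify the actual storage format), the four operations can be sketched as:

```python
import tempfile
from pathlib import Path

# Illustrative sketch of add/search/delete/list over a plain text file;
# the class name and file layout are assumptions, not the project's API.
class MemoryFile:
    def __init__(self, path):
        self.path = Path(path)
        self.path.touch(exist_ok=True)

    def list(self):
        return [line for line in self.path.read_text().splitlines() if line]

    def add(self, memory):
        with self.path.open("a") as f:
            f.write(memory + "\n")

    def search(self, substring):
        # Exact substring matching, mirroring the listing's description.
        return [m for m in self.list() if substring in m]

    def delete(self, substring):
        kept = [m for m in self.list() if substring not in m]
        self.path.write_text("".join(m + "\n" for m in kept))
        return len(kept)

# Demo against a throwaway file:
store = MemoryFile(Path(tempfile.mkdtemp()) / "memories.txt")
store.add("user prefers dark mode")
store.add("project uses Python 3.12")
matches = store.search("Python")
```

Keeping memories as plain lines makes the store trivially inspectable and diffable, at the cost of only supporting substring (not semantic) retrieval.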
A TypeScript server that connects Hevy fitness data with language models and provides tools for fitness history, training progress, and personal records through the MCP protocol.
The Unsloth MCP Server is a server for efficiently fine-tuning large language models. Through optimized algorithms and 4-bit quantization, it achieves a 2x increase in training speed and an 80% reduction in VRAM usage, supporting multiple mainstream models.
This is an MCP server that provides a standardized interface for Scikit-learn models, supporting functions such as model training, evaluation, data preprocessing, and persistence.
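The standardized-interface idea can be illustrated with a toy tool dispatcher; the tool names and payload shape below are assumptions, and a trivial mean predictor stands in for a real Scikit-learn estimator:

```python
# Hypothetical sketch of a named-tool interface: each capability is a tool
# taking a JSON-style payload. The "model" is a stand-in mean predictor;
# the actual server wraps Scikit-learn estimators.
MODELS = {}

def tool_train(payload):
    y = payload["y"]
    MODELS[payload["model_id"]] = sum(y) / len(y)  # "fit": store the mean
    return {"status": "trained"}

def tool_evaluate(payload):
    pred = MODELS[payload["model_id"]]
    y = payload["y"]
    mse = sum((v - pred) ** 2 for v in y) / len(y)
    return {"mse": mse}

TOOLS = {"train": tool_train, "evaluate": tool_evaluate}

def dispatch(tool_name, payload):
    # An MCP server routes tool calls by name much like this lookup.
    return TOOLS[tool_name](payload)

dispatch("train", {"model_id": "m1", "y": [1.0, 2.0, 3.0]})
result = dispatch("evaluate", {"model_id": "m1", "y": [1.0, 2.0, 3.0]})
```

The registry keeps each capability (training, evaluation, preprocessing, persistence) addressable by name, which is what lets an agent discover and invoke them uniformly.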
An intelligent e-commerce dialogue agent system based on reinforcement learning, integrating ontology reasoning, business toolchains, dialogue memory, and a Gradio interface. It realizes closed-loop learning from data to training and then to deployment through the Stable Baselines3 PPO algorithm, and can autonomously optimize the decision-making strategy of the shopping assistant.
This project demonstrates the training of a linear regression model using Claude and the Model Context Protocol (MCP) for an end-to-end machine learning workflow. Users only need to upload a CSV dataset, and the system can automatically complete the entire process of data preprocessing, model training, and evaluation (RMSE calculation).
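A minimal sketch of that fit-then-score pipeline, using closed-form one-variable least squares in place of the MCP-driven workflow (the toy dataset is invented for illustration):

```python
import math

# Sketch of the workflow the project automates: fit a linear model, then
# report RMSE on the same data. One-variable least squares in closed form.
def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def rmse(xs, ys, slope, intercept):
    # Root mean squared error of the fitted line's predictions.
    return math.sqrt(sum((y - (slope * x + intercept)) ** 2
                         for x, y in zip(xs, ys)) / len(xs))

xs, ys = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.0, 8.1]
slope, intercept = fit_linear(xs, ys)
error = rmse(xs, ys, slope, intercept)
```

In the project itself these steps run from an uploaded CSV via MCP tool calls rather than hand-written code, but the fit/score contract is the same.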
An MCP server that exposes the PyTorch Lightning framework to tools, agents, and orchestration systems through structured APIs, supporting functions such as training, inspection, validation, testing, prediction, and model checkpoint management.
LudusMCP is a model context protocol server for managing the Ludus laboratory environment based on natural language commands. It provides functions for deploying, configuring, and managing virtualized training environments and supports connecting to the Ludus server through WireGuard VPN or SSH tunnel.
The MCP tool is a tool for managing model context in GitHub repositories, supporting version tracking, dataset management, performance recording, and documentation of training configurations.
This project is an integration tool between the Strava API and the Model Context Protocol (MCP) SDK, used to analyze training data and provide personalized recommendations. It supports functions such as training activity analysis, automatic token update, and API request rate limit.
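Client-side rate limiting of the kind described can be sketched with a sliding window; the limits used below are illustrative, not Strava's published quota:

```python
from collections import deque

# Sketch of a sliding-window rate limiter; the clock is injected so the
# behavior is deterministic and testable.
class SlidingWindowLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Drop request timestamps that fell out of the window, then check
        # whether there is budget left for one more request.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10)
```

A wrapper like this lets the integration queue or reject calls before the remote API returns a 429, keeping automatic token refresh and analysis requests within quota.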
A training project for building 100 MCP servers.
A Model Context Protocol server that provides access to the Whoop API, supporting queries for health data such as workout cycles, recovery status, and training loads.
Implementation of the MCP server for the Hevy API, providing a toolset for interacting with the Hevy fitness platform, including functions such as training record and routine management.
This project is an integration solution of the Strava API and the Model Context Protocol (MCP) SDK, used to analyze training data and provide personalized recommendations.
An MCP service that helps technical coaches create structured learning hour courses. It generates 60-minute technical training courses with code examples and interactive whiteboards through the 4C learning model.
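A 60-minute plan in the 4C model (Connections, Concepts, Concrete Practice, Conclusions) could be generated along these lines; the minute split is an illustrative default, not the service's actual template:

```python
# Sketch of a 4C learning-hour plan generator; the phase weights are
# assumptions chosen to sum to a full hour, not the project's real defaults.
def build_learning_hour(topic, minutes=60):
    weights = {
        "Connections": 0.10,        # activate prior knowledge
        "Concepts": 0.25,           # short demo with code examples
        "Concrete Practice": 0.50,  # hands-on pairing at the whiteboard
        "Conclusions": 0.15,        # retro and takeaways
    }
    phases = [(phase, round(minutes * w)) for phase, w in weights.items()]
    return {"topic": topic, "phases": phases}

plan = build_learning_hour("Refactoring to pure functions")
```

Weighting Concrete Practice heaviest reflects the 4C model's emphasis on learners doing the work rather than watching the coach.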
This project is a research codebase for automated medical coding, providing code for training and evaluating medical coding models on the MIMIC-III and MIMIC-IV datasets, including implementations of multiple models and new dataset splits.
AWorld is a multi-agent system framework that aims to bridge the gap between theoretical MAS capabilities and practical applications, providing a complete solution from single-agent to multi-agent collaboration/competition. The project supports scenarios such as browser/mobile operations and GAIA benchmark testing, adopts a client-server architecture, integrates a rich toolchain, and includes performance evaluation and training functions.
This project provides an interface for AI assistants to access Haskell documentation. By retrieving authoritative documentation on Hackage in real time, it solves the problem of insufficient training data for AI in the Haskell field and improves the accuracy of code generation and explanation.