PPO-Algorithms

Public

Experiments with the three PPO algorithms (PPO, clipped PPO, and PPO with a KL penalty) proposed by John Schulman et al., run on the 'CartPole-v1' environment.
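To show how the three variants named above differ, here is a minimal, framework-free sketch of their per-batch losses. It is not the repository's code (which, per its tags, uses PyTorch). It assumes that "PPO" means the plain, unclipped surrogate objective, and that the KL term is estimated from sampled actions. The function name `surrogate_losses` and the parameters `clip_eps` and `kl_beta` are hypothetical names chosen for illustration.

```python
import math


def surrogate_losses(logp_new, logp_old, advantages, clip_eps=0.2, kl_beta=1.0):
    """Return (unclipped, clipped, kl_penalized) losses averaged over a batch.

    Hypothetical helper for illustration only.
    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    n = len(advantages)
    unclipped = clipped = kl = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        unclipped += ratio * adv
        # Clipped variant: take the pessimistic minimum of the raw and the
        # clipped surrogate, so large policy changes earn no extra objective.
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        clipped += min(ratio * adv, clipped_ratio * adv)
        # Sample-based estimate of KL(pi_old || pi_new); an assumption here,
        # since the exact KL needs the full action distributions.
        kl += lp_old - lp_new
    # Losses are negated objectives, so that a minimizer can be applied.
    loss_unclipped = -unclipped / n
    loss_clipped = -clipped / n
    loss_kl = -(unclipped / n - kl_beta * kl / n)
    return loss_unclipped, loss_clipped, loss_kl


if __name__ == "__main__":
    # Toy batch: the second action became twice as likely under the new policy.
    losses = surrogate_losses(
        logp_new=[-0.7, math.log(0.5)],
        logp_old=[-0.7, math.log(0.25)],
        advantages=[1.0, 1.0],
    )
    print(losses)
```

With a positive advantage, the clipped loss stops rewarding the policy once the ratio exceeds `1 + clip_eps`, whereas the KL-penalized loss keeps the raw surrogate and subtracts a penalty scaled by `kl_beta`.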

cartpole-v1 clipped kullback-leibler-divergence ppo pytorch reinforcement-learning

Created: 2021-11-13T02:51:04

Updated: 2024-09-14T02:46:45


Related projects

Lobe Chat

Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Plugins/Artifacts) and Thinking. One-click FREE deployment of your private ChatGPT / Claude / DeepSeek application.

62731

4 months ago

+24 today

Gitea

bitbucket

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD

49195

3 months ago

+20 today

Anything Llm

agent-framework-javascript

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, and more.

45619

3 months ago

+36 today

Unsloth

Hot

deepseek

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory!

40967

3 months ago

+65 today

Chatgpt On Wechat

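A chatbot built on large language models that supports integration with WeChat Official Accounts, WeCom apps, Feishu, DingTalk, and more. It can use GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building a customized enterprise customer-service assistant on top of a private knowledge base.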

37840

3 months ago

+21 today

V

compiler

Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io

36472

2 years ago

+2 today

Pytorch Image Models

augmix

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

34532

3 months ago

+10 today

Rufus

bios

The Reliable USB Formatting Utility

31823

3 months ago

+12 today

1Panel

1panel

1Panel offers an intuitive web interface for managing websites, files, containers, databases and LLMs within a Linux server.

29190

3 months ago

+37 today

LibreChat

Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.

27066

3 months ago

+35 today
