PPO-Algorithms

Public

Experiments of the three PPO-Algorithms (PPO, clipped PPO, PPO with KL-penalty) proposed by John Schulman et al. on the 'Cartpole-v1' environment.

cartpole-v1 clipped kullback-leibler-divergence ppo pytorch reinforcement-learning

Creat：2021-11-13T02:51:04

Update：2024-09-14T02:46:45

Stars

Stars Increase

Related projects

Lobe Chat

Hot

? Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Plugins/Artifacts) and Thinking. One-click FREE deployment of your private ChatGPT/ Claude / DeepSeek application.

68780

1年前

+205today

Gitea

Hot

bitbucket

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD

52405

1年前

+74today

Anything Llm

Hot

agent-framework-javascript

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, and more.

51978

1年前

+113today

Unsloth

Hot

deepseek

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! ?

49117

1年前

+115today

Chatgpt On Wechat

Hot

基于大模型搭建的聊天机器人，同时支持微信公众号、企业微信应用、飞书、钉钉等接入，可选择GPT3.5/GPT-4o/GPT-o1/ DeepSeek/Claude/文心一言/讯飞星火/通义千问/ Gemini/GLM-4/Claude/Kimi/LinkAI，能处理文本、语音和图片，访问操作系统和互联网，支持基于自有知识库进行定制企业智能客服。

40008

1年前

+71today

V

compiler

Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io

37112

3年前

+19today

Pytorch Image Models

augmix

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

35969

1年前

+24today

Rufus

bios

The Reliable USB Formatting Utility

33918

1年前

+43today

1Panel

Hot

1panel

? 1Panel offers an intuitive web interface for managing websites, files, containers, databases and LLMs within a Linux server.

32349

1年前

+76today

LibreChat

Hot

Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.

32260

1年前

+81today

Latest AI News

AI Daily Brief

AI Product Finder

AI Product Rankings

AI Product Submit

AI Tools Directory

GEO Brand Visibility

AI Visibility Audit

AI Search Visibility Checker

GEO Ranking Monitor

AI Conversation Insight

GEO Promotion Link Detection

GEO Ranking Optimization System

GEO Ranking Optimization

MCP Servers

MCP Client

MCP Case Tutorials

MCP Ranking

MCP Service Submission

MCP Playground

MCP Inspector

LLM API Hub

AI Models Finder

Model Providers

LLM Leaderboard

LLM API Proxy Checker

Compare LLMs

LLM Cost Calculator

LLM Arena

AI Model Compatibility Checker

AI Deployment Calculator