habr_ru
Aug 10
GSPO (Qwen RL Algorithm by Alibaba Cloud)
http://habr.com/ru/articles/935800
#Qwen
#Alibaba
#GSPO
#GRPO
#reinforcement-learning