Chapter 67. GRPO (Group Relative Policy Optimization)