[ Argilla ] Curate High-Quality Dataset
LLM/HuggingFace-Learn
2025. 8. 16. 20:59
[ Build Reasoning Model ] Fine-tune a model with GRPO
LLM/HuggingFace-Learn
2025. 7. 29. 17:00
[ Build Reasoning Model ] Implementing GRPO in TRL
LLM/HuggingFace-Learn
2025. 7. 29. 09:20
[ Build Reasoning Model ] Understanding the DeepSeek R1 Paper
LLM/HuggingFace-Learn
2025. 7. 28. 13:55
[ Soft prompts ] Prompt tuning
LLM/HuggingFace-Learn
2025. 7. 25. 12:30
[ Build Reasoning Model ] What is Reinforcement Learning?
LLM/HuggingFace-Learn
2025. 7. 24. 21:00
[ Supervised Fine-Tuning ] Evaluation
LLM/HuggingFace-Learn
2025. 7. 24. 20:00
[ Supervised Fine-Tuning ] LoRA
LLM/HuggingFace-Learn
2025. 7. 24. 19:40
[ Supervised Fine-Tuning ] Supervised Fine-Tuning (SFT)
LLM/HuggingFace-Learn
2025. 7. 24. 19:20
[ Supervised Fine-Tuning ] Chat Templates
LLM/HuggingFace-Learn
2025. 7. 24. 19:00