Long context GPT-OSS fine-tuning (unsloth.ai)

4 points | by danielhanchen 15 hours ago

1 comment

danielhanchen 15 hours ago
Hey HN! Just sharing some work we did to make gpt-oss fine-tuning use O(N) rather than O(N^2) VRAM in the context length N, via Flex Attention, plus some bug fixes :)
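
For a rough sense of what O(N) vs O(N^2) means here, this is a back-of-the-envelope sketch, not the actual Unsloth implementation. It compares the bytes needed to materialize a full N x N attention score matrix with the bytes a tiled, fused kernel keeps per row (the output plus running softmax max and sum). The function names are hypothetical, and the head count, head dimension, and 2-byte element size are illustrative assumptions, not gpt-oss's real configuration.

```python
def naive_score_bytes(seq_len, num_heads=1, bytes_per_elem=2):
    """Bytes to hold the full seq_len x seq_len score matrix for every head."""
    return num_heads * seq_len * seq_len * bytes_per_elem


def tiled_state_bytes(seq_len, num_heads=1, head_dim=64, bytes_per_elem=2):
    """Bytes for the per-row output plus two running softmax statistics (max, sum)."""
    return num_heads * seq_len * (head_dim + 2) * bytes_per_elem


# Assumed, illustrative context lengths; one head, 2-byte elements.
for n in (4_096, 16_384, 65_536):
    naive = naive_score_bytes(n)
    tiled = tiled_state_bytes(n)
    print(f"N={n:>6}: naive {naive / 2**20:10.1f} MiB, tiled {tiled / 2**20:8.1f} MiB")
```

Doubling N quadruples the first number but only doubles the second, which is why the quadratic term dominates VRAM at long context.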