MIT introduces Self-Distillation Fine-Tuning to reduce catastrophic forgetting; it uses student-teacher demonstrations and requires 2.5x the compute.
NEW YORK — As a man wearing a neon-blue jellyfish hat fought off draping tentacles to scroll through his phone and find the latest message from his personal AI assistant, three people wearing Pegasus ...