• naeap@sopuli.xyz
    13 days ago

    I was mostly interested in the distilling part, but in the video they just press a button and are talking to an LLM right afterwards.

    I’m really not an expert, but distilling is usually a time-consuming process that transfers “knowledge” from a larger network to a smaller one, so we get roughly the same results without the bulk.
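For reference, a minimal sketch of the loss that classic knowledge distillation minimizes, assuming the common soft-target formulation (temperature-scaled softmax plus KL divergence). The function names, the temperature value, and the example logits are illustrative assumptions and are not taken from the video or the linked article. In real training this loss is evaluated over a large dataset for many steps, which is where the time goes.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    Hypothetical helper for illustration only: the student is trained to
    match the teacher's softened output distribution.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Made-up logits for a three-class toy example.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
print(distillation_loss(teacher, student))
```

The loss is zero only when the student reproduces the teacher's softened distribution, so an optimizer that reduces it pulls the smaller network toward the larger one's behavior.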

    But I just don’t see that happening in the video, even though it’s supposed to be a “how to distill” video.

    To be honest, I’m pretty naive here and may well be wrong, but that just isn’t how I understood distilling.

    • voracitude@lemmy.world
      13 days ago

      Oh, right, there’s the issue. It’s not a “how to distill” video. The video has a description of what’s going on right under the player: “Demo showcasing DeepSeek R1 Qwen 1.5 Q4 K M model running on an AMD Ryzen™ HX 370 series processor in real time”

      The team releasing this already did the distillation for us; what follows the video are instructions on how to run these new distilled models on your AMD system, not how to distill the models yourself.

      • naeap@sopuli.xyz
        13 days ago

        Ok, well, that’s quite anticlimactic…

        Ok, maybe the performance of running models locally is still nice on their chips.

        Thanks for clarifying; their title got my hopes up for something else.