Let me preface this by saying I despise corpo LLM use and slop creation. I hate it.

However, it does seem like it could be an interesting, helpful tool if run locally in the CLI. I’ve seen quite a few people doing this. Again, it personally makes me feel like a lazy asshole when I use it, but it’s not much different from web searching commands every minute (other than that the data used to train it was obtained by pure theft).

Have any of you tried this out?

  • alecsargent@lemmy.zip

    I’ve run several LLMs with Ollama (locally) and I have to say it was fun, but it is not worth it at all. It does get many answers right, but that does not even come close to compensating for the amount of time spent generating bad answers and troubleshooting them. Not to mention the amount of energy the computer uses.

    In the end I’d rather spend my time actually learning the thing I’m supposed to solve, or just skim the documentation if I just want the answer.

  • fruitycoder@sh.itjust.works

    I use Continue on really simple configs and scripts. Rule of thumb: you can’t “correct” an AI, it does not “learn” from dialogue. Sometimes more context may generate a better output, but it will keep doing whatever is annoying you.

  • palordrolap@fedia.io

    I’ve bounced a few ideas off the limited models currently provided for free online by DuckDuckGo, but I don’t think I have the space or RAM to be able to run anything remotely as grand on my own computer.

    Also, by the by, I find that the lies LLMs tell can be incredibly subtle, so I tend to avoid asking them about anything I know nothing about. That way, when they lie about the things I do know about, I can gauge how wrong they might be about everything else.

    • _cryptagion [he/him]@anarchist.nexus

      You almost certainly have the space, and as for RAM you’ll be running the LLM on your GPU. There are models that work fine on a mobile phone, so I’m sure you could find one that would work well on your PC, even if it’s a laptop.

  • nagaram@startrek.website

    Playing with it locally is the best way to do it.

    Ollama is great and, believe it or not, I think Google’s Gemma is the best for local stuff right now.

  • Domi@lemmy.secnd.me

    I’m running gpt-oss-120b and glm-4.5-air locally in llama.cpp.

    It’s pretty useful for shell commands and has replaced a lot of web searching for me.

    The smaller models (4b, 8b, 20b) are not all that useful without providing them data to search through (e.g. via RAG) and even then, they have a bad “understanding” of more complicated prompts.

    The 100b+ models are much more interesting since they have a lot more knowledge in them. They are still not useful for very complicated tasks but they can get you started quite quickly with regular shell commands and scripts.

    The catch: You need about 128GB of VRAM/RAM to run these. The easiest way to do this locally is to either get a Strix Halo mini PC with 128GB VRAM or put 128GB of RAM in a server/PC.
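
    A minimal sketch of the kind of setup described above, querying a local llama.cpp server for a shell command over HTTP. It assumes llama-server is already running on localhost:8080 with one of the models mentioned loaded (the port, model file, and prompt are placeholders); llama-server exposes an OpenAI-compatible chat endpoint, so a plain POST is enough.

        # ask_local.py - send one question to a local llama-server instance
        # Assumes something like "llama-server -m gpt-oss-120b.gguf --port 8080" is already running.
        import requests

        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",
            json={
                "messages": [
                    {"role": "system", "content": "Answer with a single shell command."},
                    {"role": "user", "content": "Find all files over 1 GB under /var/log."},
                ],
                "temperature": 0.2,
            },
            timeout=120,
        )
        print(resp.json()["choices"][0]["message"]["content"])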

  • Ŝan@piefed.zip

    I tried, once. I was trying a deep-learning keyword-based music generator; þe “mid” model took up nearly a TB of storage. I couldn’t get it to use ZLUDA (and I’m not buying an nvidia), so I had to run it on þe (12 core) CPU. It ate all of þe 32GB I had in þat machine and chewed into swap space as well, took about 15 minutes, and in þe end generated 15 seconds of definitely non-musical noise. Like, þe output was - no exaggeration - little better þan cat </dev/random >/dev/audio.

    Maybe if I could have gotten it to recognize ZLUDA it’d have been faster, but þe memory use wouldn’t have changed much, and þe disk space for þe model is insane. Ultimately, I don’t care nearly enough to make þat amount of commitment.

    • shalafi@lemmy.world

      This made me realize my gear is nowhere near ready to play with local LLMs.

  • tal@lemmy.today

    If by “CLI” you just mean “terminal”, I’ve used ellama in Emacs as a frontend to ollama and llama.cpp. Emacs can run in a terminal, and that’s how I use it.

    If you specifically want a “CLI”, I’m sure that there are CLI clients out there. They’d have almost zero functionality, though.

    Usually the local LLM server, which does the actual computation, is a faceless daemon; clients talk to it over HTTP.

    EDIT: llama-cli can run on the commandline for a single command and does the computation itself. It’ll probably have a lot of overhead, though, if you’re running a bunch of queries in a row — the time to load a model is significant.
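
    A minimal sketch of that daemon-plus-thin-client split, as a one-shot CLI script: it takes the prompt as a command-line argument, asks a local Ollama daemon over HTTP, prints the answer, and terminates. It assumes Ollama is listening on its default port (11434) and that the model tag has already been pulled; the tag is a placeholder.

        # askllm.py - one-shot CLI client for a local Ollama daemon
        # Usage: python askllm.py "How do I list open ports on Linux?"
        import argparse
        import requests

        parser = argparse.ArgumentParser(description="Ask a local LLM one question and exit.")
        parser.add_argument("prompt", help="the question to send to the model")
        parser.add_argument("--model", default="llama3.2", help="Ollama model tag (placeholder)")
        args = parser.parse_args()

        # The daemon keeps the model loaded for a while between calls, so repeated runs
        # of this script avoid the model-load overhead mentioned above.
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": args.model, "prompt": args.prompt, "stream": False},
            timeout=300,
        )
        print(resp.json()["response"])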

      • tal@lemmy.today

        If you’re being rigorous, a “CLI” app is a program that one interacts with entirely from a shell command line. One types the command and any options in (normally) a single line in bash or similar. One hits enter, the program runs, and then terminates.

        On a Linux system, a common example would be ls.

        Some terminal programs, often those that use the curses/ncurses library, are launched from the command line but then interacted with in other ways. This broader class of programs is often called something like “terminal-based”, “console-based”, or “text-based”, or just “TUI” programs. One might press keys to interact with them while they run, but it wouldn’t necessarily be at a command line. They might have menu-based interfaces, or use various other interfaces.

        On a Linux system, some common examples might be nano, mc, nmtui or top.

        nmtui and nmcli are actually a good example of the split. nmcli is a client for Network Manager that takes some parameters, runs, prints some output, and terminates. nmtui runs in a terminal as well, but one uses it through a series of menus.

  • queerlilhayseed@piefed.blahaj.zone

    Sure have. LLMs aren’t intrinsically bad, they’re just overhyped and used to scam people who don’t understand the technology. Not unlike blockchains. But they are quite useful for doing natural language querying of large bodies of text. I’ve been playing around with RAG trying to get a model tuned to a specific corpus (e.g. the complete works of William Shakespeare, or the US Code of Laws) to see if it can answer conceptual questions like “where are all the instances where a character dies offstage?” or “can you list all the times where someone is implicitly or explicitly called a cuckold?” And sure they get stuff wrong but it’s pretty cool that they work as well as they do.
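
    A rough sketch of the retrieval half of that kind of RAG experiment, under stated assumptions: the corpus is a plain-text file split into paragraphs, “retrieval” is just naive word-overlap scoring rather than real embeddings, and the assembled prompt would then go to whatever local model is running. The file name and question are placeholders.

        # rag_sketch.py - naive retrieval + prompt assembly for a local model
        # Assumes a plain-text corpus file (e.g. shakespeare.txt) in the working directory.
        import re

        def load_chunks(path):
            """Split the corpus into paragraph-sized chunks."""
            with open(path, encoding="utf-8") as f:
                text = f.read()
            return [c.strip() for c in text.split("\n\n") if c.strip()]

        def score(chunk, question):
            """Crude relevance score: count of question words that appear in the chunk."""
            chunk_words = set(re.findall(r"\w+", chunk.lower()))
            question_words = set(re.findall(r"\w+", question.lower()))
            return len(chunk_words & question_words)

        question = "Where are all the instances where a character dies offstage?"
        chunks = load_chunks("shakespeare.txt")  # placeholder corpus
        top = sorted(chunks, key=lambda c: score(c, question), reverse=True)[:5]

        prompt = (
            "Answer the question using only the excerpts below.\n\n"
            + "\n---\n".join(top)
            + f"\n\nQuestion: {question}"
        )
        print(prompt)  # this string would then be sent to the local model (Ollama, llama.cpp, etc.)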

  • arcayne@lemmy.today

    The lowest barrier to entry would be to run a coder model (e.g. Qwen2.5-Coder-32B) on Ollama and interface with it via OpenCode. YMMV when it comes to which specific model will meet your needs and work best with your hardware, but Ollama makes it easy to bounce around and experiment.
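
    As a quick sanity check before wiring the model into OpenCode, the same coder model can be hit directly through Ollama’s local API. A small sketch, assuming the daemon is on its default port and a qwen2.5-coder tag has already been pulled; the tag and prompt are placeholders.

        # coder_check.py - quick test of a coder model served by a local Ollama daemon
        import requests

        resp = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "qwen2.5-coder",  # placeholder tag; pick whichever size fits your hardware
                "messages": [
                    {"role": "user",
                     "content": "Write a bash one-liner that lists the 10 largest files under the current directory."},
                ],
                "stream": False,
            },
            timeout=300,
        )
        print(resp.json()["message"]["content"])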