@TheOtherJake

TheOtherJake@beehaw.org · 1 year ago

Gimp

TheOtherJake@beehaw.org · 1 year ago

Oobabooga is the main GUI used to interact with models.

https://github.com/oobabooga/text-generation-webui

FYI, you need to find checkpoint models yourself. Among the available chat models, naming can be ambiguous for a few reasons I’m not going to ramble about here. The main source of models is Hugging Face. Start with this model (or get the censored version):

https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GGML

First, let’s break down the title.

This is a model based on Meta’s Llama 2.
This is not “FOSS” in the GPL/MIT sense. The model has a license that is quite broad in scope, with the key stipulation being that products or services with more than 700 million monthly active users need a separate license from Meta.
Next, it was quantized by a popular user going by “The Bloke.” I have no idea who this is IRL but I imagine this is a pseudonym or corporate alias given how much content is uploaded by this account on HF.
This model has 7 billion parameters and is fine-tuned for chat applications.
This is uncensored, meaning it will respond to most inputs as best it can. It can get NSFW, or talk about almost anything. In practice there are still some minor biases that are likely just overarching morality inherent to the datasets used, or they might be coded somewhere obscure.
Last part of the title is that this is a GGML model. This means it can run on CPU or GPU or a split between the two.
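
As a rough illustration of the breakdown above, here is a minimal Python sketch that splits this one repo name into its parts. The function name `describe_model` and the splitting rules are assumptions made for this example only. Model naming on Hugging Face is not standardized, so this will not work for every upload.

```python
import re


def describe_model(repo_id: str) -> dict:
    """Split a repo id like 'TheBloke/llama2_7b_chat_uncensored-GGML' into parts.

    Hypothetical helper: the rules below only match this naming pattern.
    """
    # Everything before the slash is the uploading account.
    uploader, _, name = repo_id.partition("/")
    # In this example the file format is the suffix after the last hyphen.
    base, _, file_format = name.rpartition("-")
    tokens = base.lower().split("_")
    # A token such as '7b' or '13b' gives the parameter count.
    size = next((t for t in tokens if re.fullmatch(r"\d+b", t)), None)
    return {
        "uploader": uploader,
        "base_model": tokens[0],
        "parameters": size,
        "chat_tuned": "chat" in tokens,
        "uncensored": "uncensored" in tokens,
        "format": file_format,
    }


info = describe_model("TheBloke/llama2_7b_chat_uncensored-GGML")
for key, value in info.items():
    print(f"{key}: {value}")
```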

As for options on the landing page or “model card”:

You need to get one of the older style models that have “q(number)” as the quantization type. Do not get the ones that say “qK” as these won’t work with the llama.cpp file you will get with Oobabooga.
Look at the guide at the bottom of the model card where it tells you how much RAM you need for each quantization type. If you have an Nvidia GPU with the CUDA API, enabling GPU layers makes the model run faster, and with quite a bit less system memory than what is stated on the model card.
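
For a back-of-the-envelope feel for those RAM numbers, the weights take roughly parameters times bits per weight divided by 8 bytes. The sketch below is an approximation under stated assumptions, not what the model card or llama.cpp computes: it ignores the extra scale data stored with each quantized block and the memory used for context, and it assumes every layer is the same size when splitting between CPU and GPU. The function names are made up for this example, and the 32-layer figure is the layer count of Llama 2 7B.

```python
def estimate_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def split_by_layers(weights_gb: float, total_layers: int, gpu_layers: int):
    """Return (system_ram_gb, gpu_vram_gb), assuming equally sized layers."""
    gpu_layers = max(0, min(gpu_layers, total_layers))
    on_gpu = weights_gb * gpu_layers / total_layers
    return weights_gb - on_gpu, on_gpu


for bits in (4, 5, 8):
    size = estimate_weights_gb(7e9, bits)
    print(f"7B at {bits}-bit: ~{size:.1f} GB of weights")

# Offloading half of an assumed 32 layers moves about half the weights to VRAM.
cpu_gb, gpu_gb = split_by_layers(estimate_weights_gb(7e9, 4), 32, 16)
print(f"system RAM ~{cpu_gb:.2f} GB, GPU VRAM ~{gpu_gb:.2f} GB")
```

Treat the table on the model card as the real reference; actual files come out somewhat larger than this estimate.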

The 7B models are about like having a conversation with your average teenager. Asking technical questions yielded around 50% accuracy in my experience. A 13B model got around 80% accuracy. The 30B WizardLM is around 90-95%. I’m still working on trying to get a 70B running on my computer. A lot of the larger models require compiling tools from source. They won’t work directly with Oobabooga.

TheOtherJake@beehaw.org · 1 year ago

Worse yet. Install a whitelist firewall or have a look at the connections required to access Discord. You will immediately stop using it. It involves dozens of undocumented raw IP address connections and weird ports. Top this off by telling me what their business model is and how they are profitable. They provide no documentation whatsoever about what they are doing and why. The best explanation anyone has ever given me when asked why they use discord is, ‘because everyone else is doing it.’ That is idiotic nonsense.

TheOtherJake@beehaw.org · 1 year ago

Hey, I meant to mention this earlier but got sidetracked and forgot. Most of the dozen or so consumer-grade routers I have hacked around with seem to have less-than-optimal placement of the DC step-down converters around the processor and radio circuit blocks. I mean, the layouts appear to be optimised for radio as the primary design constraint, not for what is best for DC converter operation. They tend to place electrolytic capacitors in close proximity to circuit blocks that get quite warm. I can’t say how often capacitors are creating problems, but it would not surprise me if this is the cause of many issues for many people after a year or two. I can say that I had problems with a cable-company-provided modem a few years ago. It had an obvious leaky cap and several that were around 25% out of spec, along with a couple of identical parts that were around 5% out, as I would expect from my typical shelf stock. Replacing all of them fixed the modem.
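
To make the "25% out" versus "5% out" comparison concrete, here is a small Python sketch of the arithmetic. The 470 µF nominal value and the measured readings are made-up numbers for illustration, and the ±20% default is an assumption based on the tolerance commonly printed on aluminum electrolytics; check the marking or datasheet for the actual part.

```python
def deviation_percent(measured: float, nominal: float) -> float:
    """Signed deviation of a measured value from nominal, in percent."""
    return (measured - nominal) / nominal * 100


def out_of_tolerance(measured: float, nominal: float,
                     tolerance_percent: float = 20.0) -> bool:
    """True if the measured value falls outside +/- tolerance_percent."""
    return abs(deviation_percent(measured, nominal)) > tolerance_percent


# Hypothetical readings for capacitors marked 470 uF.
for measured_uf in (352.5, 446.5):
    dev = deviation_percent(measured_uf, 470.0)
    verdict = "replace" if out_of_tolerance(measured_uf, 470.0) else "ok"
    print(f"{measured_uf} uF: {dev:+.1f}% -> {verdict}")
```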

TheOtherJake@beehaw.org · 1 year ago

I prefer to run hardware supported by OpenWRT or DD-WRT. These have monitoring and firewall options under access control.

If you are not the type to flash your own hardware, pcWRT might be an option. It is a small business consisting of a dude in Texas who created a simplified front end for OpenWRT. You just have to trust him, which I haven’t had a problem with, and is probably better than trusting whatever underpaid person has access to similar interfaces for whatever commercial vendor you choose. He has a well-secured SSH connection used to send out occasional updates for the device automatically. His setup does not give you access to the underlying OpenWRT system behind his front end, but with a USB-to-serial converter and a port on the board you can access OpenWRT in a terminal. I have it set up to log any activity and have never had any issues. I’m no expert, but I did install Gentoo once.

https://shop.pcwrt.com/collections/all

No affiliation/not an affiliate link. Beware that some people pushing his stuff are doing an affiliate deal. Also, while his stuff is nice and relatively simple, it had more value in the past when OpenWRT was much harder to set up on your own. OpenWRT is open source but the pcWRT frontend is not.