Cryptography nerd

  • 0 Posts
  • 26 Comments
Joined 11 months ago
Cake day: August 16th, 2023





  • A) I’ve not yet seen evidence to the contrary

    B) you do know there are a lot of different definitions of average, right? The centerpoint of multiple vectors is one kind of average. The median of online writing is an average. The most common vocabulary, the most common sentence structure, the most common formulation of replies, etc. all form averages within their respective problem spaces. The model displays these properties because it has seen them so often in samples, and then it blends them.
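
    As a rough Python sketch of what I mean (toy data, not from any real corpus), here are a few of those notions of "average" side by side:

```python
import statistics

# Several different notions of "average" over toy data.

# 1) Centerpoint (centroid) of multiple vectors: the component-wise mean.
vectors = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
centroid = tuple(sum(component) / len(vectors) for component in zip(*vectors))
print(centroid)  # (3.0, 2.0)

# 2) Median: the middle value, e.g. of sentence lengths in a corpus.
sentence_lengths = [7, 7, 7, 12, 15, 30]
print(statistics.median(sentence_lengths))  # 9.5

# 3) Mode: the most common item, e.g. the most common word.
words = ["the", "cat", "the", "dog", "the"]
print(statistics.mode(words))  # the
```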





  • Its training and fine-tuning include a lot of specific instructions about what it can and can’t do, and if something sounds like something it shouldn’t try, then it will refuse. Spitting out unbiased random numbers is something it’s inherently bad at by virtue of being a neural network architecture. Not sure if OpenAI specifically has included an instruction about it being bad at randomness, though.

    While randomness is injected when the model generates a reply, the model doesn’t have raw access to those random numbers and can’t feed them forward. Instead, the randomness just makes it somewhat more likely to pick numbers it has seen less often.
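
    Here’s a minimal Python sketch of that setup (the logits are made up, not from any real model): the network only produces a probability distribution over tokens, and the random draw happens afterwards, in the sampler.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw model scores into a probability distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the digit tokens "0".."9". If "7" showed up more
# often as a "random number" in the training text, its score is higher, so
# the model's "random" digit is biased toward 7.
logits = [1.0, 1.0, 1.0, 1.2, 1.1, 1.3, 1.1, 2.5, 1.0, 1.0]
probs = softmax(logits)

# The randomness enters here, in the sampler. The network itself never
# sees this random draw, so it can't carry it forward into later tokens.
digit = random.choices(range(10), weights=probs, k=1)[0]
print(f"P(7) = {probs[7]:.2f}, sampled digit: {digit}")
```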


  • The TLDR is that pathways between nodes corresponding to frequently seen patterns (stereotypical sentences) get strengthened more than others, and it therefore becomes more likely that those pathways get activated over others when you give the model a prompt. These strengths correspond to probabilities (see the toy sketch after this comment).

    Have you seen how often they’ll sign a requested text with a name placeholder? Have you seen the typical grammar they use? The way they write is a hybridization of the most common types of text they have seen in samples, weighted by occurrence (which is a statistical property).

    It’s like how mixing dog breeds often results in something which doesn’t look exactly like either breed but which has features from every breed. GPT/LLM models mix in stuff like academic writing, redditisms and stackoverflowisms, quoraisms, linkedin-postings, etc. You get this specific dryish text full of hedging language and mixed types of formalisms, a certain answer structure, etc.
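
    To make the "strengths correspond to probabilities" point concrete, here’s a toy count-based next-word model in Python (the corpus lines are invented; real models learn weights by gradient descent rather than counting, but the effect on output frequencies is similar):

```python
from collections import Counter

# Toy count-based next-word model. Continuations seen more often in the
# samples get proportionally higher probability -- the "strengthened
# pathway" idea in miniature.
corpus = [
    "best regards , [ name ]",
    "best regards , [ name ]",
    "best regards , john",
    "kind regards , [ name ]",
]

pair_counts = Counter()
for line in corpus:
    words = line.split()
    for a, b in zip(words, words[1:]):
        pair_counts[(a, b)] += 1

def next_word_probs(word):
    """Probability of each word that follows `word`, from raw counts."""
    followers = {b: n for (a, b), n in pair_counts.items() if a == word}
    total = sum(followers.values())
    return {b: n / total for b, n in followers.items()}

# The name placeholder follows the comma 3 times out of 4, so the
# stereotypical pattern dominates the output distribution.
print(next_word_probs(","))  # {'[': 0.75, 'john': 0.25}
```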







  • To boot a normal OS, sure, but anything small enough to fit in registers/cache could do without RAM. That’s still some form of working memory though, so it’s probably not what they meant.

    You could build something RAM-less if you only need the thing to process real-time events, like signal processing that works in a single pass (also see: tons of FPGA and DSP applications).
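
    As a sketch of the single-pass idea (Python here just for illustration; on real hardware this state would sit in registers): the entire working state is one scalar, with no buffer of past samples.

```python
# Single-pass stream processing with constant working state: the whole
# "memory" is one scalar per filter, which on a microcontroller or FPGA
# could live entirely in registers -- no RAM buffer of past samples.

def ema_stream(samples, alpha=0.1):
    """Exponential moving average over a sample stream, one pass."""
    y = 0.0  # the only state carried between samples
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        yield y

# Hypothetical real-time input; each sample is processed and then forgotten.
for out in ema_stream([0.0, 1.0, 1.0, 1.0, 0.0]):
    print(f"{out:.3f}")
```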