Lightcones?
My favorite thing here is pointing out that Heisenberg uncertainty should influence gravitational waves and definitely influences light cones
You don’t need it to be an average of the real world to be an average. I can calculate as many average values as I want from entirely fictional worlds. It’s still a type of model which favors what it sees often over what it sees rarely. That’s a form of embedded probability, corresponding to a form of average.
Text explaining why the neural network representation of common features (typically with weighted proportionality to their occurrence) does not meet the definition of a mathematical average. Does it not favor common response patterns?
A) I’ve not yet seen evidence to the contrary
B) you do know there are a lot of different definitions of average, right? The centerpoint of multiple vectors is one kind of average. The median of online writing is an average. The most common vocabulary, the most common sentence structure, the most common formulation of replies, etc, those all form averages within their respective problem spaces. It displays these properties because it has seen them so often in samples, and then it blends them.
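To make that concrete, here's a toy sketch of a few of those different "averages" (the data is made up for illustration, not from anything in this thread):

```python
import statistics

# Hypothetical toy data: three 2-D vectors standing in for embeddings.
vectors = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]

# Centerpoint (centroid) of multiple vectors: one kind of average.
centroid = tuple(sum(c) / len(vectors) for c in zip(*vectors))

# Median of a set of values: another kind of average.
lengths = [120, 85, 240, 95, 300]  # e.g. word counts of posts
median_length = statistics.median(lengths)

# "Most common" (the mode) is yet another: the value seen most often wins.
words = ["the", "a", "the", "of", "the"]
most_common = statistics.mode(words)

print(centroid)       # (3.0, 2.0)
print(median_length)  # 120
print(most_common)    # the
```

Each one is a different way of collapsing many samples into a single "typical" value, which is the sense of "average" being argued here.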
Funny how this one has less detail and less expressions despite the more complex prompt.
As long as you can correctly model the target behavior in a sufficiently complete way, and capture all necessary context in the inputs!
Not an LLM specifically; in particular, the lack of backtracking, the network depth limits, and the interconnectivity limits set hard limits on capabilities.
https://www.lesswrong.com/posts/XNBZPbxyYhmoqD87F/llms-and-computation-complexity
https://garymarcus.substack.com/p/math-is-hard-if-you-are-an-llm-and
https://arxiv.org/abs/2401.11817
Humans have a completely different memory model and, in large part, a very different way of linking together learned concepts to form their world view and to develop interdisciplinary skills, allowing us to solve many kinds of highly complex tasks as long as we can keep enough of it in our memory.
Its training and fine-tuning include a lot of specific instructions about what it can and can’t do, and if something sounds like something it shouldn’t try then it will refuse. Spitting out unbiased random numbers is something it’s effectively incapable of by virtue of being a neural network architecture. Not sure if OpenAI specifically has included an instruction about it being bad at randomness though.
While the model is fed randomness when you prompt it, it doesn’t have raw access to those random numbers and can’t feed them forward. Instead it interprets them, which mostly just nudges it toward numbers it has seen less often.
The TLDR is that pathways between nodes corresponding to frequently seen patterns (stereotypical sentences) get strengthened more than others, so those pathways become more likely to activate than others when you give the model a prompt. These strengths correspond to probabilities.
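A minimal sketch of that idea, with made-up numbers standing in for learned pathway strengths (the tokens and values are hypothetical, not from any real model):

```python
import math
import random

# Hypothetical learned strengths for continuations of "the cat ...":
# frequently seen patterns end up with stronger pathways.
strengths = {"sat": 50.0, "ran": 30.0, "quantized": 1.0}

# A softmax turns strengths into a probability distribution, so the
# heavily reinforced pathway gets proportionally higher probability.
scale = 10.0  # arbitrary temperature-like scaling for the toy example
total = sum(math.exp(s / scale) for s in strengths.values())
probs = {tok: math.exp(s / scale) / total for tok, s in strengths.items()}

# Sampling from that distribution then favors the stereotypical
# continuation over the rare one, most of the time.
token = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs)
```

The point isn't that an LLM literally stores a dict of counts; it's that the learned weights play the same role, turning "seen often" into "sampled often".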
Have you seen how often they’ll sign a requested text with a name placeholder? Have you seen the typical grammar they use? The way they write is a hybridization of the most common types of text they have seen in samples, weighted by occurrence (which is a statistical property).
It’s like how mixing dog breeds often results in something which doesn’t look exactly like either breed but which has features from every breed. GPT/LLM models mix in stuff like academic writing, redditisms and stackoverflowisms, quoraisms, linkedin-postings, etc. You get this specific dryish text full of hedging language and mixed types of formalisms, a certain answer structure, etc.
If there’s no averaging, why do they repeat stereotypes so often?
The crawling isn’t illegal, what you do with the data might be
Too late, take a look at teletext and RDS for radio, and also literally the very first cable-free TV remote controls
CUDA is an Nvidia-specific method for using a graphics card to do computation (not just graphics), like physics simulations.
Translation layers would let software designed for other graphics cards work with CUDA, or let CUDA software work on other graphics cards
With WebUSB (supported in Chrome) and the possibility of building web applications to control physical devices, there are definitely some web developers who can claim to be proper engineers even under the strict definitions
To boot a normal OS sure, but anything small enough to fit in registers/cache could do without RAM. That’s still some form of working memory though, so it’s probably not what they meant.
You could build something RAM-less if you only need the thing to process real-time events, like some signal processing done in a single pass (see also: tons of FPGA and DSP applications)
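A single-pass example in the same spirit: an exponential moving average filter that handles each sample as it arrives and keeps only one value of state, the kind of job that fits in registers with no RAM buffer (the filter choice and smoothing factor are my own illustration, not from the thread):

```python
def ema_filter(samples, alpha=0.25):
    """One-pass exponential moving average: O(1) state, no buffering.

    'alpha' is a hypothetical smoothing factor chosen for the example.
    Each incoming sample updates a single running value and is never
    stored, so the whole filter needs only one word of working memory.
    """
    state = None
    for x in samples:
        state = x if state is None else alpha * x + (1 - alpha) * state
        yield state

# A step in the input signal gets smoothed gradually in a single pass.
signal = [0.0, 0.0, 4.0, 4.0, 4.0]
smoothed = list(ema_filter(signal))
print(smoothed)  # [0.0, 0.0, 1.0, 1.75, 2.3125]
```

The same structure is what many DSP pipelines and FPGA designs implement in hardware: data streams through, state stays tiny.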
Warehouse management?
All of those qualifications require that you can handle a cascade of requests and manage tables
I did realize that that was also a joke, I still wanted to point it out
Non-integers certainly aren’t even or odd, so yes?
Const