From Scratch: Programming a Generative Pre-trained Transformer
-
I'm a slow demo server. Please be patient: I am a minimal (i.e. very slow) server, intended only as a demo.
-
Pre-training is the first stage of training that a Generative Pre-trained
Transformer undergoes: the model is built up from scratch by learning to predict the next token in its training text.
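The next-token objective can be sketched in a few lines: slide a fixed-size window over the text and pair each context with the character that follows it. This is character-level, as with Tiny Shakespeare; the sample text and block size below are assumptions for illustration, not the demo's actual values.

```python
# Minimal sketch of next-token-prediction training pairs,
# the objective used in GPT pre-training (character-level).
# block_size is a hypothetical hyperparameter, not the demo's actual value.

text = "To be, or not to be"
block_size = 8  # context length (assumed for illustration)

# Each training example pairs a context with the character that follows it.
pairs = []
for i in range(len(text) - block_size):
    context = text[i : i + block_size]
    target = text[i + block_size]
    pairs.append((context, target))

print(pairs[0])  # → ('To be, o', 'r')
```

During pre-training, the model sees the context and is scored (via cross-entropy loss) on how much probability it assigns to the true next character.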
-
This stage of training was completed on the Tiny Shakespeare dataset, using Python 3 and the PyTorch 2.1 framework.
-
Under the hood, PyTorch runs as a CLI script that returns JSON to the PHP process that invokes it.
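A sketch of that arrangement on the Python side: the script takes a prompt as a command-line argument and prints a single JSON object to stdout for PHP to decode. The function and field names here are assumptions for illustration, not the demo's actual code.

```python
import json
import sys

def generate_text(prompt: str) -> str:
    # Placeholder for the model's sampling loop (hypothetical);
    # the real script would load the trained model and sample from it.
    return prompt + "..."

if __name__ == "__main__":
    prompt = sys.argv[1] if len(sys.argv) > 1 else ""
    # PHP invokes this script and decodes the printed JSON.
    print(json.dumps({"response": generate_text(prompt)}))
```

On the PHP side, something like `json_decode(shell_exec("python3 generate.py " . escapeshellarg($prompt)), true)` would capture the script's stdout and turn it back into an array (again, a sketch of the pattern rather than the demo's exact code).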
-
This model has a very modest parameter count: 315,457.
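For a sense of where a number of that size comes from, here is a rough parameter-count formula for a small character-level GPT. Every hyperparameter below is a hypothetical chosen for illustration; the arithmetic shows the shape of the count, and deliberately does not try to reproduce the demo's exact 315,457.

```python
# Rough parameter count for a small character-level GPT.
# All hyperparameters are hypothetical, chosen only for illustration.
vocab_size = 65      # distinct characters in Tiny Shakespeare
d_model = 64         # embedding width (assumed)
n_layer = 4          # number of transformer blocks (assumed)
block_size = 64      # context length (assumed)

embed = vocab_size * d_model + block_size * d_model  # token + position embeddings
attn = 4 * d_model * d_model                         # Q, K, V, and output projections
mlp = 2 * d_model * (4 * d_model)                    # two linear layers, 4x expansion
per_block = attn + mlp
lm_head = d_model * vocab_size                       # final projection to vocabulary

total = embed + n_layer * per_block + lm_head
print(total)  # → 209024
```

The attention and MLP weight matrices of the transformer blocks dominate the count; biases and layer-norm parameters (omitted here) add only a small correction.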
-
By comparison, even the "small" variants of modern language models typically have at least 7 billion parameters.
-
Given so few parameters, the model performs remarkably well.
"Though this be madness, yet there is method in't": this is 100% generated by the mini GPT each time you click.
The Response (Takes approx. 1 Minute…): "As a Language Model, I require GPU(s) and this Demo Server has none":