Bigram-Level Language Model: Nano_GPT

This model is trained on a Shakespearean dataset using a decoder-only transformer architecture, implemented with the PyTorch framework, to generate random Shakespearean text. The project is for educational purposes and aims to give a deep look into the inner workings of the transformer architecture used in GPT-3.5 and other LLMs. A minimal PyTorch sketch consistent with the specifications below follows the list.

Model Specifications:

  • Parameters: 408,897
  • Training dataset: a 1.06 MB text file
  • Context length (used for predictions in the self-attention blocks): 32
  • Attention heads (per multi-head self-attention block): 16
  • Layers: 8 (decoder blocks)
  • Learning rate: 0.02
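
The sketch below is not the repository's actual code; it is a minimal decoder-only model in the nanoGPT style, wired up with the hyperparameters listed above. The embedding width (n_embd = 64) and the character-level vocabulary size (65) are assumptions, chosen because together with the listed values they reproduce the 408,897-parameter count.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hyperparameters from the spec list; n_embd and vocab_size are assumptions
# (the README does not state them).
block_size = 32   # context length
n_head     = 16   # attention heads per decoder block
n_layer    = 8    # decoder blocks
n_embd     = 64   # ASSUMED embedding width
vocab_size = 65   # ASSUMED character-level vocabulary size

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a lower-triangular (causal) mask."""
    def __init__(self):
        super().__init__()
        self.qkv = nn.Linear(n_embd, 3 * n_embd, bias=False)  # fused Q, K, V
        self.proj = nn.Linear(n_embd, n_embd)
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(n_embd, dim=-1)
        # (B, T, C) -> (B, n_head, T, head_size)
        q = q.view(B, T, n_head, C // n_head).transpose(1, 2)
        k = k.view(B, T, n_head, C // n_head).transpose(1, 2)
        v = v.view(B, T, n_head, C // n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) * (C // n_head) ** -0.5
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v        # weighted sum of values
        return self.proj(y.transpose(1, 2).reshape(B, T, C))

class Block(nn.Module):
    """Pre-norm decoder block: attention, then a position-wise MLP."""
    def __init__(self):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(n_embd), nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention()
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))        # residual around each sublayer
        return x + self.ffwd(self.ln2(x))

class NanoGPT(nn.Module):
    """Token + position embeddings, a stack of decoder blocks, and an LM head."""
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Sequential(*[Block() for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):                   # idx: (B, T) token ids
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        return self.head(self.ln_f(self.blocks(x)))

model = NanoGPT()
print(sum(p.numel() for p in model.parameters()))  # 408897 under these assumptions
```

Under these assumed values the printed parameter total comes out to exactly 408,897, which matches the spec list; the fused Q/K/V projection used here is parameter-equivalent to the separate per-head linears in the nanoGPT lecture code.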
