Releases: smpanaro/more-ane-transformers
v0-2023-October-31
gpt2 model family, 2-4x faster than the prior release.
- gpt2 now uses KV caching for faster generation
- all models generate multiple tokens per second (to get the fastest speeds, see the instructions in SETUP.md)
- iOS 16+/macOS 13+ now required
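For anyone curious what KV caching buys: at each generation step, the new token's key/value projections are appended to a cache so attention reuses past work instead of recomputing it for the whole sequence. A minimal numpy sketch of the idea (single head, no masking, hypothetical names; not the repo's Core ML implementation):

```python
import numpy as np

def new_cache(d):
    # Empty cache for keys and values (illustrative, single attention head).
    return {"k": np.empty((0, d)), "v": np.empty((0, d))}

def attend_step(q, k_new, v_new, cache):
    # Append this step's key/value so past projections are never recomputed.
    cache["k"] = np.vstack([cache["k"], k_new[None]])
    cache["v"] = np.vstack([cache["v"], v_new[None]])
    scores = cache["k"] @ q / np.sqrt(q.shape[-1])  # attend over all cached keys
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ cache["v"]
```

Each step is O(sequence length) instead of O(sequence length squared), which is where the generation speedup comes from.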
gpt2-xl is split into multiple files due to GitHub's file size limits. Download both parts and decompress them like so:
cat gpt2-xl.mlpackage.tar.gz.* | tar -xzvf -
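If you want to see the whole split-and-rejoin roundtrip end to end, here is a self-contained sketch with throwaway file names (not the actual model files):

```shell
#!/bin/sh
set -e
# Build a small tarball standing in for the model package.
mkdir -p demo-pkg
echo "model weights placeholder" > demo-pkg/weights.txt
tar -czf demo.tar.gz demo-pkg
# Split it into 1 KiB parts, like the release assets.
split -b 1k demo.tar.gz demo.tar.gz.
rm demo.tar.gz
rm -r demo-pkg
# Rejoin the parts and extract, same shape as the release command above.
cat demo.tar.gz.* | tar -xzf -
```

`cat` concatenates the parts in suffix order, so the pipe hands `tar` the original byte stream.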
v0-2023-May-29
pythia model family, up to the 2.8B variant. specifically:
- pythia-70m
- pythia-160m
- pythia-410m
- pythia-1b
- pythia-1.4b
- pythia-2.8b (requires an M2 to run on the Neural Engine)
The larger models are split into multiple files due to GitHub's file size limits. Download all the parts and decompress them like so:
cat pythia-1.4b.tar.gz.* | tar -xzvf -
cat pythia-2.8b.tar.gz.* | tar -xzvf -
v0-2023-April-02
gpt2 model family, but much faster.
- all models generate text ~2x as fast
- the xl model now runs on the Neural Engine (5.5s on CPU only → 450ms on the Neural Engine only)
- models are compiled and cached automatically on the first run of generate, giving an order-of-magnitude faster time to first token on subsequent runs (1.5 minutes → 2.5s for xl)
note: gpt2-xl is too big for GitHub, so you will need to download both files and then join them back into a single zip like so:
zip -F gpt2-xl-split.mlpackage.zip --out gpt2-xl.mlpackage.zip
note 2: I accidentally made xl require macOS 13. I think it might not actually need that; let me know if you want to try it without.
v0-2023-march-27
gpt2 model family (base, medium, large, xl) converted to Core ML
note: gpt2-xl is too big for GitHub, so you will need to download both files and then join them back into a single zip like so:
zip -F gpt2-xl-split.mlpackage.zip --out gpt2-xl.mlpackage.zip