Using the generate.py script, the Pythia-2.8b mlpackage from the GitHub release runs on the ANE with either --compute_unit="All" or --compute_unit="CPUAndANE". However, if I convert Pythia-2.8b myself with convert.py, the resulting mlpackage never uses the ANE: with --compute_unit="All" it runs on CPU and GPU, and with --compute_unit="CPUAndANE" it runs on CPU only. Pythia-410m behaves differently: both the mlpackage downloaded from the GitHub release and the one converted with convert.py run on the ANE.
By the way, Pythia-6.9b converts successfully with convert.py and works well with generate.py and --compute_unit="CPUAndGPU", but it does not use the ANE either.
You can tell that it worked by using Show Package Contents (in Finder) and checking that the Data > com.apple.CoreML > weights folder contains many files (one per chunk).
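If you'd rather script that check, here's a minimal sketch — the package name is hypothetical, so point it at your own converted model:

```python
from pathlib import Path

# Hypothetical package name - adjust to your converted model.
pkg = Path("Pythia-2.8b.mlpackage")
weights = pkg / "Data" / "com.apple.CoreML" / "weights"

if weights.is_dir():
    files = [p for p in weights.iterdir() if p.is_file()]
    print(f"{len(files)} weight files (expect roughly one per chunk)")
else:
    print("weights folder not found - the package may not be chunked")
```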
That should allow you to recreate the 2.8b model that runs on ANE. Two more things that might be helpful:
Measuring ANE
I'm not sure how you're checking whether the model runs on the ANE, but I would recommend using the --wait flag and attaching the Core ML template from Instruments. Xcode really struggles with these larger models.
For the chunked models you should see one "Neural Engine Prediction" block per chunk -- it will be obvious if some chunks run on the ANE and some do not. (This screenshot is not of a chunked model.) There will be a tiny gap between blocks where execution falls back to the CPU, but it should be very small.
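If you're driving predictions from your own script rather than generate.py, the --wait idea is easy to replicate: pause before the first prediction so Instruments can attach. A sketch (model loading and the predict call are omitted):

```python
import os
import sys

# Print the PID so you can attach the Instruments Core ML template,
# then pause until you have started recording.
msg = f"PID {os.getpid()}: attach Instruments (Core ML template), then press Enter"
print(msg)
if sys.stdin.isatty():
    input()

# ... load the mlpackage and call predict() here; each chunk should
# appear as its own "Neural Engine Prediction" block in the trace ...
```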
6.9b Model
I only have an M1, but I think there's a chance you can get 6.9b running on the M2's ANE. You will definitely need to use the chunk_model and make_pipeline tools. I would start with 670 for the chunk size (like 2.8b) and try smaller if that doesn't work. Let me know if you try it -- I'd be happy to help figure out how to get it working!
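For a rough sense of what that chunk size means for 6.9b, here's a back-of-envelope sketch. It assumes float16 weights (~2 bytes per parameter) and that the chunk size is measured in MB -- both assumptions, so treat the result as an estimate only:

```python
import math

params = 6.9e9          # Pythia-6.9b parameter count (approximate)
bytes_per_param = 2     # assuming float16 weights
chunk_mb = 670          # the suggested starting chunk size

total_mb = params * bytes_per_param / 2**20
chunks = math.ceil(total_mb / chunk_mb)
print(f"~{total_mb:.0f} MB of weights -> about {chunks} chunks at {chunk_mb} MB each")
```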
Sorry for the slow response and also that all of this is missing from the documentation.
Thanks for all your work on this.
My testing machine is an M2 Max with 64GB of memory.