Issues with Creating Updatable Neural Network Model in Core ML using coremltools (8.0) #2383

Open
McWare opened this issue Nov 2, 2024 · 1 comment
Labels: bug (Unexpected behaviour that should be corrected), NN backend only (Affects only the NN backend, not the MIL backend)

Comments

McWare commented Nov 2, 2024

🐞Describing the bug

I am encountering persistent issues while trying to create an updatable neural network model in Core ML using the coremltools library. My goal is to develop a model that supports on-device training and updates, but despite following the documented procedures for marking layers as updatable and adding a categorical cross-entropy loss layer, the model consistently fails to be recognized as updatable.

Stack Trace

0 1 WilfriedBernard_19480107 2024-10-04 14:47:49.550 ... 30.188882 Abgebrochen Verbessert
1 2 WilfriedBernard_19480107 2024-10-04 14:46:57.455 ... 30.358008 Abgebrochen Verbessert
2 3 WilfriedBernard_19480107 2024-10-06 12:55:45.496 ... 10.179011 Abgebrochen Verbessert
3 4 WilfriedBernard_19480107 2024-10-06 16:32:36.915 ... 10.248076 Abgebrochen Verbessert
4 5 WilfriedBernard_19480107 2024-10-06 22:04:08.139 ... 89.004450 Abgebrochen Verschlechtert

[5 rows x 11 columns]
Date converted to DateTime.
Date features extracted.
Features and target values extracted.
Features sample: [[ 0. 15.3258797 0. 0. 0. 30.18888199
4. 0. ]
[ 0. 15.87591185 0. 0. 0. 30.35800803
4. 0. ]
[ 1.7913756 12.65480479 62. 62. 0. 10.17901099
6. 1. ]
[ 1.82481292 15.24270069 62. 67. 5. 10.24807596
6. 2. ]
[ 7.7663508 15.24471629 68. 74. 6. 89.00444996
6. 2. ]]
Target values converted to numeric labels: [1 1 1 1 2]
Class labels: ['Unverändert', 'Verbessert', 'Verschlechtert']
Input size: 8, number of classes: 3
Neural network builder created.
First layer (fc1) added.
Activation layer (relu1) added.
Second layer (fc2) added.
Softmax layer added.
Class labels set.
Layer Name: fc1, Type: innerProduct
Layer Name: relu1, Type: activation
Layer Name: fc2, Type: innerProduct
Layer Name: softmax, Type: softmax
Updatable layers: ['fc1', 'fc2']
Model marked as updatable.
Name: fc2 (Type: innerProduct)
Input blobs: ['relu1_output']
Output blobs: ['fc2_output']
Name: fc1 (Type: innerProduct)
Input blobs: ['input_1']
Output blobs: ['fc1_output']
Now adding input output_true as target for categorical cross-entropy loss layer.
Loss function set.
SGD optimizer set.
Number of epochs set.
First training input added: name: "input_1"
type {
multiArrayType {
shape: 8
dataType: DOUBLE
}
}

Second training input added: name: "output_true"
type {
multiArrayType {
shape: 1
dataType: INT32
}
}

Training inputs defined:
Name: input_1, Type: multiArrayType
Name: classLabel, Type: stringType
Name: input_1, Type: multiArrayType
Name: output_true, Type: multiArrayType
spec.neuralNetwork has layers: []
The model contains NO update parameters.
Updatable model saved as 'UpdatableTrainingEffectModel.mlmodel'.

To Reproduce

1. Softmax Layer Handling:
   • I initially included a Softmax layer in the PyTorch model for training purposes but removed it during export to Core ML, relying instead on adding a Softmax layer in Core ML.
   • Despite attempts to add the Softmax layer only in Core ML, the Core ML specification sometimes ends up with an unexpected second Softmax layer (softmaxND), which complicates the setup of the categorical cross-entropy loss layer.
2. Marking Layers as Updatable:
   • I used NeuralNetworkBuilder.make_updatable with specific layer names (e.g., linear_0, linear_1).
   • Although the builder logs indicate that these layers are marked as updatable, the resulting model does not include the necessary update parameters and is not recognized as updatable.
3. Categorical Cross-Entropy Loss:
   • When I set up the categorical cross-entropy loss with the Softmax output as its input, Core ML sometimes raises an error stating that the input must be a Softmax layer output, even though the correct layer is specified. A minimal, self-contained sketch of this setup follows the list.
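For reference, here is a minimal, self-contained sketch of the Softmax + cross-entropy-loss setup from step 3, reduced to a toy 4-feature / 3-class network so the loss wiring can be looked at in isolation. The layer names, class labels, and the 'MinimalUpdatable.mlmodel' file name are placeholders, not the ones from my real data:

import numpy as np
import coremltools as ct
from coremltools.models.neural_network import NeuralNetworkBuilder, SgdParams

# Toy 4-feature / 3-class classifier; all names here are placeholders.
builder = NeuralNetworkBuilder(
    input_features=[('input_1', ct.models.datatypes.Array(4))],
    output_features=[('output', ct.models.datatypes.Array(3))],
    mode='classifier')
builder.add_inner_product(name='fc1', W=np.random.rand(3, 4), b=np.random.rand(3),
                          input_channels=4, output_channels=3, has_bias=True,
                          input_name='input_1', output_name='fc1_output')
builder.add_softmax(name='softmax', input_name='fc1_output', output_name='output')
builder.set_class_labels(['a', 'b', 'c'])

# Mark the weights as trainable, then point the loss at the softmax output name.
builder.make_updatable(['fc1'])
builder.set_categorical_cross_entropy_loss(name='lossLayer', input='output')
builder.set_sgd_optimizer(SgdParams(lr=0.01, batch=8))
builder.set_epochs(5)

# 'MinimalUpdatable.mlmodel' is a placeholder file name.
ct.models.MLModel(builder.spec).save('MinimalUpdatable.mlmodel')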
Full reproduction script:
import coremltools as ct
from coremltools.models.neural_network import NeuralNetworkBuilder, SgdParams
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from tkinter import Tk
from tkinter.filedialog import askopenfilename
from coremltools.proto import FeatureTypes_pb2  # Import the missing FeatureTypes_pb2

def test_updatable_model(model):
    spec = model.get_spec()

    # Check whether training inputs were defined
    if len(spec.description.trainingInput) > 0:
        print("Training inputs defined:")
        for inp in spec.description.trainingInput:
            print(f"Name: {inp.name}, Type: {inp.type.WhichOneof('Type')}")

        # Check whether the layers are marked as updatable
        if hasattr(spec.neuralNetwork, 'layers'):
            print(f"spec.neuralNetwork has layers: {spec.neuralNetwork.layers}")
            for layer in spec.neuralNetwork.layers:
                if layer.isUpdatable:
                    print(f"Layer {layer.name} is updatable.")
                else:
                    print(f"Layer {layer.name} is not updatable.")

        # Test whether the model has the required update capabilities
        if spec.neuralNetwork.HasField("updateParams"):
            print("The model contains update parameters and is updatable.")
        else:
            print("The model contains NO update parameters.")
    else:
        print("No training inputs defined; the model is not updatable.")

# Function to select a CSV file
def select_file():
    Tk().withdraw()  # Hide the main Tkinter window
    file_path = askopenfilename(filetypes=[("CSV Files", "*.csv")])
    return file_path

# Get the CSV file path
csv_file_path = select_file()

# Load the CSV data into a pandas DataFrame
data = pd.read_csv(csv_file_path)
print(f"CSV data loaded: {data.head()}")

# Convert 'Date' to datetime format
data['Date'] = pd.to_datetime(data['Date'])
print("Date converted to DateTime.")

# Extract features from the date
data['DayOfWeek'] = data['Date'].dt.dayofweek  # Monday=0, Sunday=6
data['ElapsedDays'] = (data['Date'] - data['Date'].min()).dt.days  # Days since the first session
print("Date features extracted.")

# Extract features and target values
features = data[['AvgSway', 'MaxSway', 'StartHR', 'EndHR', 'HRDelta', 'Duration', 'DayOfWeek', 'ElapsedDays']].values
target = data['Label'].values
print("Features and target values extracted.")
print(f"Features sample: {features[:5]}")

# Convert target values to numeric labels
label_encoder = LabelEncoder()
target_encoded = label_encoder.fit_transform(target)
print(f"Target values converted to numeric labels: {target_encoded[:5]}")

# Define the class labels explicitly as a list of strings
class_labels = list(map(str, label_encoder.classes_))
print(f"Class labels: {class_labels}")

# Build the neural network model
input_size = features.shape[1]
num_classes = len(np.unique(target_encoded))
print(f"Input size: {input_size}, number of classes: {num_classes}")

# Create the neural network builder
builder = NeuralNetworkBuilder(input_features=[('input_1', ct.models.datatypes.Array(input_size))],
                               output_features=[('output', ct.models.datatypes.Array(num_classes))],
                               mode="classifier")

print("Neural Network Builder erstellt.")

# Add the first layer (fc1)
W_fc1 = np.random.rand(64, input_size)
b_fc1 = np.random.rand(64)
builder.add_inner_product(
    name='fc1',
    W=W_fc1,
    b=b_fc1,
    input_channels=input_size,
    output_channels=64,
    has_bias=True,
    input_name='input_1',
    output_name='fc1_output')
print("Erste Schicht (fc1) hinzugefügt.")

# Add the activation layer (ReLU)
builder.add_activation(
    name='relu1',
    non_linearity='RELU',
    input_name='fc1_output',
    output_name='relu1_output')
print("Aktivierungsschicht (relu1) hinzugefügt.")

# Add the second layer (fc2)
W_fc2 = np.random.rand(num_classes, 64)
b_fc2 = np.random.rand(num_classes)
builder.add_inner_product(
    name='fc2',
    W=W_fc2,
    b=b_fc2,
    input_channels=64,
    output_channels=num_classes,
    has_bias=True,
    input_name='relu1_output',
    output_name='fc2_output')
print("Zweite Schicht (fc2) hinzugefügt.")

# Add the softmax layer
builder.add_softmax(
    name='softmax',
    input_name='fc2_output',
    output_name='output')
print("Softmax-Schicht hinzugefügt.")

# Set the class labels
builder.set_class_labels(class_labels)
print("Class labels set.")

# Check the layers and test for updatability
for layer in builder.nn_spec.layers:
    print(f"Layer Name: {layer.name}, Type: {layer.WhichOneof('layer')}")
    
    
# Mark the layers as updatable
updatable_layers = ['fc1', 'fc2']
builder.make_updatable(updatable_layers)
print(f"Updatable layers: {updatable_layers}")
print("Model marked as updatable.")

builder.inspect_updatable_layers()

# Set the loss function (the input must be the output of the softmax layer)
builder.set_categorical_cross_entropy_loss(name='lossLayer', input='output')
print("Loss function set.")

# Set the optimizer
builder.set_sgd_optimizer(SgdParams(lr=0.01, batch=32))
print("SGD optimizer set.")

# Set the number of epochs
builder.set_epochs(10)
print("Number of epochs set.")

# Define the training inputs
training_input = builder.spec.description.trainingInput.add()
training_input.name = "input_1"
training_input.type.multiArrayType.shape.extend([input_size])
training_input.type.multiArrayType.dataType = FeatureTypes_pb2.ArrayFeatureType.DOUBLE  # uses FeatureTypes_pb2 imported above
print(f"First training input added: {training_input}")

# Define the target values (labels)
training_output = builder.spec.description.trainingInput.add()
training_output.name = "output_true"
training_output.type.multiArrayType.shape.extend([1])
training_output.type.multiArrayType.dataType = FeatureTypes_pb2.ArrayFeatureType.INT32
print(f"Second training input added: {training_output}")


# Save the updatable model
mlmodel_updatable = ct.models.MLModel(builder.spec)
test_updatable_model(mlmodel_updatable)

mlmodel_updatable.save('UpdatableTrainingEffectModel.mlmodel')
print("Updatable Modell als 'UpdatableTrainingEffectModel.mlmodel' gespeichert.")

System environment:
 - coremltools version: 8.0
 - OS (e.g. MacOS version or Linux type): macOS Sequoia v15.1 (24B83)
 - Any other relevant version information (e.g. PyTorch or TensorFlow version):
 Package                 Version
----------------------- -----------
absl-py                 2.1.0
astunparse              1.6.3
attrs                   24.2.0
cachetools              5.5.0
cattrs                  24.1.2
certifi                 2024.8.30
charset-normalizer      3.4.0
click                   8.1.7
coremltools             8.0
filelock                3.16.1
flatbuffers             24.3.25
fsspec                  2024.10.0
gast                    0.4.0
google-auth             2.35.0
google-auth-oauthlib    0.4.6
google-pasta            0.2.0
grpcio                  1.67.0
h5py                    3.12.1
idna                    3.10
jax                     0.4.30
jaxlib                  0.4.30
Jinja2                  3.1.4
joblib                  1.4.2
keras                   2.12.0
libclang                18.1.1
Markdown                3.7
markdown-it-py          3.0.0
MarkupSafe              3.0.2
mdurl                   0.1.2
ml-dtypes               0.3.2
mpmath                  1.3.0
namex                   0.0.8
networkx                3.4.2
numpy                   1.23.5
oauthlib                3.2.2
onnx                    1.17.0
onnx-coreml             1.3
opt_einsum              3.4.0
optree                  0.13.0
packaging               24.1
pandas                  2.2.3
pillow                  11.0.0
pip                     24.2
protobuf                4.25.5
pyaml                   24.9.0
pyasn1                  0.6.1
pyasn1_modules          0.4.1
Pygments                2.18.0
python-dateutil         2.9.0.post0
pytz                    2024.2
PyYAML                  6.0.2
requests                2.32.3
requests-oauthlib       2.0.0
rich                    13.9.3
rsa                     4.9
scikit-learn            1.5.2
scipy                   1.14.1
setuptools              75.2.0
six                     1.16.0
sympy                   1.13.1
tensorboard             2.12.0
tensorboard-data-server 0.7.2
tensorboard-plugin-wit  1.8.1
tensorflow-estimator    2.12.0
tensorflow-macos        2.12.0
tensorflow-metal        1.1.0
termcolor               2.5.0
threadpoolctl           3.5.0
torch                   2.4.0
torchaudio              2.4.0
torchvision             0.19.0
tqdm                    4.66.5
typing                  3.7.4.3
typing_extensions       4.12.2
tzdata                  2024.2
urllib3                 2.2.3
Werkzeug                3.0.4
wheel                   0.44.0
wrapt                   1.14.1

Additional context
Key Questions:

1. Layer Recognition in Core ML:
   • Are there specific limitations in coremltools regarding which types of layers can be marked as updatable? For instance, are innerProduct layers or specific layer names required for updatable layers to be recognized correctly?
2. Softmax and Cross-Entropy Integration:
   • Is there a recommended way to ensure that Core ML consistently recognizes a single, correctly configured Softmax layer when it is added to the network manually? Should the Softmax layer be explicitly present in the PyTorch model, or can it reliably be added only in Core ML?
3. Ensuring Update Parameters:
   • Are there additional steps or configurations needed in coremltools to ensure the model is saved with all required update parameters, particularly after marking layers as updatable and adding the loss function? See the diagnostic sketch after this list.
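For question 3, this is the kind of check I run on the saved .mlmodel. It is only a sketch and assumes the standard Core ML protobuf layout, in particular that a builder created with mode="classifier" may serialize the network under spec.neuralNetworkClassifier rather than spec.neuralNetwork:

import coremltools as ct

# Diagnostic sketch: inspect whichever network field the spec actually populates.
spec = ct.models.MLModel('UpdatableTrainingEffectModel.mlmodel').get_spec()
network_kind = spec.WhichOneof('Type')   # e.g. 'neuralNetwork' or 'neuralNetworkClassifier'
network = getattr(spec, network_kind)

print(f"Model type: {network_kind}, spec.isUpdatable: {spec.isUpdatable}")
for layer in network.layers:
    print(f"Layer {layer.name}: isUpdatable = {layer.isUpdatable}")
print(f"Has updateParams: {network.HasField('updateParams')}")

If the layers and updateParams turn out to live under neuralNetworkClassifier, that might explain why the test_updatable_model check above, which only looks at spec.neuralNetwork, reports no update parameters.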
McWare added the bug label Nov 2, 2024

McWare commented Nov 2, 2024

Please contact me at [email protected]. Thank you.

TobyRoseman added the NN backend only label Nov 4, 2024