This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

MissingShapeCalculator error when using convert_coreml on pipeline with categorical features #582

Open
gitDawn opened this issue Jul 6, 2020 · 0 comments

gitDawn commented Jul 6, 2020

I'm trying to convert a Pipeline that one-hot encodes a categorical column with pd.get_dummies. Fitting the Pipeline works fine, but converting the fitted Pipeline raises an error.

I've attached the code and the error output below.
Can someone explain what I am missing here?

P.S. I don't mind using another method for the categorical transform in the pipeline, as long as the conversion works.

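from sklearn.preprocessing import OneHotEncoder
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn import svm
from winmltools import convert_coreml
import copy
import numpy as np
import pandas as pd
from IPython.display import display
# Assumption: convert_sklearn and FloatTensorType are the skl2onnx ones,
# as the sklearn-onnx error message below suggests.
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
# https://github.com/pandas-dev/pandas/issues/8918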

class MyEncoder(TransformerMixin):

    def __init__(self, columns=None):
        self.columns = columns

    def transform(self, X, y=None, **kwargs):
        # builtin float replaces np.float, which was removed in NumPy 1.24
        return pd.get_dummies(X, dtype=float, columns=['ID'])

    def fit(self, X, y=None, **kwargs):
        return self

# data
X = pd.DataFrame([[100, 1.1, 3.1], [200, 4.1, 5.1], [100, 4.1, 2.1]], columns=['ID', 'X1', 'X2'])
Y = pd.Series([3, 2, 4])

# check transform (all OK)
df = MyEncoder().transform(X)
display(df)

# create pipeline
pipe = Pipeline( steps=[('categorical', MyEncoder()), ('classifier', svm.SVR())] )
print(type(pipe), MyEncoder().transform(X).dtypes, '\n')

# convert to ONNX

# converting a bare SVR works
svm_toy = svm.SVR()
svm_toy.fit(X, Y)
initial_type = [('X', FloatTensorType([None, X.shape[1]]))]
onx = convert_sklearn(svm_toy, initial_types=initial_type)

# converting the pipeline goes wrong...
pipe_toy = copy.deepcopy(pipe).fit(X, Y)  # the pipeline fitting goes OK
initial_type = [('X', FloatTensorType([None, X.shape[1]]))]
onx = convert_sklearn(pipe_toy, initial_types=initial_type)  # the conversion fails

The error I'm getting is:

MissingShapeCalculator: Unable to find a shape calculator for type ''.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter
implemented in sklearn-onnx. If the converted is implemented
in another library, you need to register
the converted so that it can be used by sklearn-onnx (function
update_registered_converter). If the model is not yet covered
by sklearn-onnx, you may raise an issue to
https://github.com/onnx/sklearn-onnx/issues
to get the converter implemented or even contribute to the
project. If the model is a custom model, a new converter must
be implemented. Examples can be found in the gallery.
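
The message above describes a failed lookup: the converter looks for a shape calculator registered for each step's type, and a custom class such as MyEncoder has none. The following is a simplified, hypothetical model of that kind of lookup in plain Python. It is not sklearn-onnx's actual code; the registry dictionary, register_shape_calculator, output_shape, and KnownRegressor are names invented for illustration, and the exception class here only borrows the name from the error message.

```python
# Simplified model of a type-keyed shape-calculator registry.
# Everything here is hypothetical and only mirrors the behaviour the
# error message describes; it is not the sklearn-onnx implementation.


class MissingShapeCalculator(RuntimeError):
    """Raised when no shape calculator is registered for a model type."""


# Maps a model class to a function that computes the output shape.
_shape_calculators = {}


def register_shape_calculator(model_type, fn):
    """Register fn(model, input_shape) -> output_shape for model_type."""
    _shape_calculators[model_type] = fn


def output_shape(model, input_shape):
    """Look up the calculator by the model's exact type and apply it."""
    try:
        fn = _shape_calculators[type(model)]
    except KeyError:
        raise MissingShapeCalculator(
            f"Unable to find a shape calculator for type '{type(model)}'."
        ) from None
    return fn(model, input_shape)


class KnownRegressor:
    """Stands in for a model the library already covers."""


class MyEncoder:
    """Stands in for a custom transformer nobody registered."""


# A regressor maps N rows of any width to N rows of one output.
register_shape_calculator(KnownRegressor, lambda model, shape: (shape[0], 1))

print(output_shape(KnownRegressor(), (None, 3)))  # (None, 1)

try:
    output_shape(MyEncoder(), (None, 3))
except MissingShapeCalculator as exc:
    print(exc)
```

In this model, the fit step never consults the registry, which is consistent with the pipeline fitting fine and only the conversion failing.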

gitDawn changed the title from "convert_coreml for pipeline with categorical features implemented with pandas get_dummies" to "MissingShapeCalculator error when using convert_coreml on pipeline with categorical features" on Jul 6, 2020
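
One possible direction for the "other methods" mentioned in the question is to replace the pd.get_dummies transformer with scikit-learn's built-in OneHotEncoder inside a ColumnTransformer, since built-in transformers are the ones a converter is most likely to cover. The sketch below fits such a pipeline on the same toy data. The convert_pipeline helper is hypothetical and is not called here. It assumes skl2onnx is installed and that one named input per DataFrame column is the right way to describe the inputs. Whether the conversion succeeds may depend on the installed skl2onnx version.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVR

# Same toy data as in the report.
X = pd.DataFrame([[100, 1.1, 3.1], [200, 4.1, 5.1], [100, 4.1, 2.1]],
                 columns=['ID', 'X1', 'X2'])
Y = pd.Series([3, 2, 4])

# One-hot encode 'ID' with a built-in transformer; pass X1 and X2 through.
preprocess = ColumnTransformer(
    transformers=[('categorical', OneHotEncoder(), ['ID'])],
    remainder='passthrough',
)
pipe = Pipeline(steps=[('preprocess', preprocess), ('classifier', SVR())])
pipe.fit(X, Y)
predictions = pipe.predict(X)
print(predictions.shape)


def convert_pipeline(fitted_pipe):
    """Hypothetical helper: convert the fitted pipeline with skl2onnx."""
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType, Int64TensorType

    # Assumption: one named input per DataFrame column, because the
    # ColumnTransformer selects its columns by name.
    initial_types = [
        ('ID', Int64TensorType([None, 1])),
        ('X1', FloatTensorType([None, 1])),
        ('X2', FloatTensorType([None, 1])),
    ]
    return convert_sklearn(fitted_pipe, initial_types=initial_types)
```

Calling convert_pipeline(pipe) would then attempt the conversion. Unlike pd.get_dummies, OneHotEncoder learns its categories during fit, so the encoded width stays fixed at prediction time.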