Any help for minirocket on UEA multivariate time series classification #31

Open
cc19860606 opened this issue May 17, 2024 · 1 comment


cc19860606 commented May 17, 2024

Hello, are there any reported results for MiniRocket on the UEA multivariate time series classification archive? @angus924
I used minirocket_multivariate on the PenDigits dataset from the UEA multivariate archive, but X_training_transform contains NaN values.
The results on UEA are also poor compared with the results of minirocket_dv on UCRArchive_2018. Can you give me some suggestions?
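Code:
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from minirocket_multivariate import fit, transform  # assumed import; the post only names the module

# X_training, Y_training, X_test: PenDigits arrays loaded earlier (not shown)
parameters = fit(X_training, num_features=10_000)
X_training_transform = transform(X_training, parameters)
print("X_training_transform:", X_training_transform)
print("type(X_training_transform):", type(X_training_transform))
print("X_training_transform.shape:", X_training_transform.shape)
print("np.isnan(X_training_transform).any():", np.isnan(X_training_transform).any())
# normalize=True is accepted by the older scikit-learn used here; it was removed in scikit-learn 1.2
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10), normalize=True)
classifier.fit(X_training_transform, Y_training)  # raises the ValueError shown below
X_test_transform = transform(X_test, parameters)
predictions = classifier.predict(X_test_transform)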
Report:
last_X_training.shape: (7494, 2, 8)
last_X_test.shape: (3498, 2, 8)
last_Y_training.shape: (7494,)
last_Y_test.shape: (3498,)
X_training_transform: [[0. 0. 0. ... 0.625 0.875 0.375]
[0. 0. 0. ... 0.625 1. 0.125]
[0. 0. 0. ... 0.375 0.625 0.25 ]
...
[0. 0. 0. ... 0.375 0.875 0.125]
[0. 0. 0. ... 0.25 1. 0.125]
[0. 0. 0. ... 0.5 0.875 0.125]]
type(X_training_transform): <class 'numpy.ndarray'>
X_training_transform.shape: (7494, 9996)
np.isnan(X_training_transform).any(): True
Traceback (most recent call last):
  File "cc-test.py", line 68, in <module>
    classifier.fit(X_training_transform, Y_training)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/linear_model/_ridge.py", line 1943, in fit
    multi_output=True, y_numeric=False)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/newhome/chenc/miniforge3/envs/AIcocahing/lib/python3.6/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

angus924 (Owner) commented May 21, 2024

Hi @cc19860606,

Thanks for your question, sorry for the slow reply.

> has any result report on minirocket on UEA multivariate time series classification archive?

Good question, it doesn't look like it. I can't find any.

It looks like there are some results here, from this paper. (I haven't checked any of this, so I can't say if it's correct or not, but maybe it's useful.)

> there is NaN in X_training_transform

I think what might be happening is this: the input time series are of length 8, but MiniRocket needs time series of at least length 9. So I think you will have to pad the time series by putting zeros at the start and/or end of the series. I think at the moment some kind of strange undefined behaviour is occurring.
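As a rough sketch of the zero-padding idea (this is not part of the MiniRocket code; the helper name `pad_to_min_length` and the choice to split the padding between the start and end are assumptions for illustration), the input could be padded along the time axis before calling `fit` and `transform`. It assumes the input has shape (instances, channels, length), as in the report above:

```python
import numpy as np

MIN_LENGTH = 9  # minimum series length mentioned in the reply above


def pad_to_min_length(X, min_length=MIN_LENGTH):
    """Zero-pad the last (time) axis of X so that it has at least min_length steps.

    X is assumed to have shape (n_instances, n_channels, length).
    """
    X = np.asarray(X, dtype=np.float32)
    deficit = min_length - X.shape[-1]
    if deficit <= 0:
        return X  # already long enough, nothing to do
    left = deficit // 2        # zeros added at the start of each series
    right = deficit - left     # zeros added at the end of each series
    return np.pad(X, ((0, 0), (0, 0), (left, right)),
                  mode="constant", constant_values=0.0)


# Toy stand-in for the PenDigits arrays: 4 instances, 2 channels, length 8
X_demo = np.arange(4 * 2 * 8, dtype=np.float32).reshape(4, 2, 8)
X_padded = pad_to_min_length(X_demo)
print(X_padded.shape)
```

The same padding would need to be applied to both the training and the test set, so that `transform` sees series of the same length as `fit` did.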

Let me know if padding with zeros fixes the problem. If not, I can try to figure out what else might not be working.
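One way to check whether the problem is gone is to count the non-finite values in the transformed features before passing them to the classifier. The helper below is a hypothetical illustration (`summarize_non_finite` is not a MiniRocket or scikit-learn function); it only uses NumPy and is shown on a small toy array:

```python
import numpy as np


def summarize_non_finite(features):
    """Report how many rows and columns of a 2-D feature matrix contain NaN or inf."""
    features = np.asarray(features)
    bad = ~np.isfinite(features)
    return {
        "any_non_finite": bool(bad.any()),
        "n_bad_rows": int(bad.any(axis=1).sum()),
        "n_bad_columns": int(bad.any(axis=0).sum()),
    }


# Toy feature matrix with a single NaN, standing in for X_training_transform
demo = np.array([[0.0, 0.5, np.nan],
                 [0.0, 0.25, 1.0]], dtype=np.float32)
summary = summarize_non_finite(demo)
print(summary)
```

If only a few columns are affected, that suggests specific features are going wrong; if every row is affected, the input itself is more likely the cause.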

Thanks very much.

Best,

Angus
