Releases: jrzaurin/pytorch-widedeep
TabNet and v1 ready
This release represents a major step forward for the library in terms of functionality and flexibility:
- Ported TabNet from the fantastic implementation by the team at dreamquark-ai.
- Callbacks are now more flexible and save more information.
- The `save` method in the `Trainer` is more flexible and transparent.
- The library has been extensively tested via experiments against `LightGBM` (see here).
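To illustrate the callback mechanism in general terms, here is a minimal pure-Python sketch of the pattern (hypothetical names, not the library's actual API): a trainer notifies registered callbacks at fixed points of the training loop, and each callback can record whatever information it needs.

```python
# Minimal sketch of the callback pattern (hypothetical classes, not the
# library's actual API): the trainer calls hooks on every registered
# callback, and a History-style callback stores per-epoch information.

class Callback:
    """Base class: subclasses override only the hooks they care about."""
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass

class History(Callback):
    """Stores the logs passed at the end of every epoch."""
    def __init__(self):
        self.epoch_logs = []
    def on_epoch_end(self, epoch, logs=None):
        self.epoch_logs.append(dict(logs or {}))

class MiniTrainer:
    def __init__(self, callbacks):
        self.callbacks = callbacks
    def fit(self, n_epochs):
        for epoch in range(n_epochs):
            for cb in self.callbacks:
                cb.on_epoch_begin(epoch)
            logs = {"loss": 1.0 / (epoch + 1)}  # stand-in for a real training step
            for cb in self.callbacks:
                cb.on_epoch_end(epoch, logs)

history = History()
MiniTrainer([history]).fit(3)
print(len(history.epoch_logs))  # 3
```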
v0.4.8: WideDeep with the TabTransformer
This release represents an almost-complete refactor of the previous version, and I consider the code in this version well tested and production-ready. The main reason why this release is not v1 is that I want to use it with a few more datasets first, while at the same time making the version public to see whether others use it. I also want the changes between this last beta and v1 to be minor.
This version is not backwards compatible (at all).
These are some of the structural changes:
- Building the model and training the model are now completely decoupled.
- Added the `TabTransformer` as a potential `deeptabular` component.
- Renamed many of the parameters so that they are consistent between models.
- Added the possibility of customising almost every single component: model components, losses, metrics and callbacks.
- Added R2 metrics for regression problems
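For reference, the R2 (coefficient of determination) metric for regression is defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. A pure-Python sketch of the formula (not the library's implementation):

```python
# R2 = 1 - SS_res / SS_tot, sketched in pure Python (illustrative only,
# not the library's implementation of the metric).

def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0 (perfect fit)
print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0 (predicting the mean)
```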
v0.4.7: individual components can run independently and image treatment replicates that of PyTorch
The treatment of the image datasets in `WideDeepDataset` replicates that of PyTorch. In particular, this source code:
```python
if isinstance(pic, np.ndarray):
    # handle numpy array
    if pic.ndim == 2:
        pic = pic[:, :, None]
```
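A quick self-contained illustration of what that snippet does for a grayscale image: a 2-D `(H, W)` array gets an explicit channel axis appended, so downstream code can always assume an `(H, W, C)` layout.

```python
# Illustration of the grayscale case handled above: `pic[:, :, None]`
# appends a channel axis to a 2-D array.
import numpy as np

pic = np.zeros((28, 28), dtype=np.uint8)  # grayscale image, ndim == 2
if isinstance(pic, np.ndarray) and pic.ndim == 2:
    pic = pic[:, :, None]                 # now shaped (H, W, 1)

print(pic.shape)  # (28, 28, 1)
```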
In addition, I have added the possibility of using each of the model components in isolation and independently. That is, one could now use the `wide`, `deepdense` (either `DeepDense` or `DeepDenseResnet`), `deeptext` and `deepimage` components independently.
v0.4.6: Added `DeepDenseResnet` and increased code coverage
As suggested in issue #26, I have added the possibility for the `deepdense` component (the one that receives the embeddings from the categorical columns together with the continuous columns) to be a series of dense ResNet blocks. This is all available via the class `DeepDenseResnet`, which is used identically to before:
```python
deepdense = DeepDenseResnet(...)
model = WideDeep(wide=wide, deepdense=deepdense)
```
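Conceptually, what distinguishes a residual dense block from a plain dense layer is that the block's output is the input plus a learned transformation of it, `out = relu(x + F(x))`. A hypothetical pure-Python sketch of that idea (not the library's actual code):

```python
# Conceptual sketch of a dense ResNet block (hypothetical, pure Python):
# the input is added back to the output of a fully connected layer
# before the activation, out = relu(x + F(x)).

def relu(v):
    return [max(0.0, x) for x in v]

def dense(x, weight, bias):
    # one fully connected layer: y_i = sum_j w_ij * x_j + b_i
    return [sum(w * xj for w, xj in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def residual_dense_block(x, weight, bias):
    return relu([xi + fi for xi, fi in zip(x, dense(x, weight, bias))])

# with identity weights and zero bias, the block doubles the input,
# then applies the ReLU
x = [1.0, -2.0]
w = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
print(residual_dense_block(x, w, b))  # [2.0, 0.0]
```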
In addition, code coverage has increased to 91%.
v0.4.5: Faster, memory efficient Wide component
Version 0.4.5 includes a new implementation of the `Wide` linear component via an Embedding layer. Previous versions implemented this component using a Linear layer that received one-hot encoded features. For large datasets this was slow and not memory efficient (see #18). Therefore, we decided to replace that implementation with an Embedding layer that receives label encoded features. Note that although the two implementations are equivalent, the latter is faster and significantly more memory efficient.
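The equivalence is easy to see in miniature: multiplying a one-hot vector by a weight matrix simply selects one row of that matrix, which is exactly what an embedding lookup on the label encoded feature does, without ever materialising the one-hot vector. A pure-Python sketch (illustrative only):

```python
# Why the two implementations are equivalent: a one-hot matmul selects
# one row of the weight matrix, which is what an embedding lookup does
# directly on the label encoded feature.

weights = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # vocab_size x embed_dim

def linear_one_hot(label, weights):
    # Linear layer applied to a one-hot encoded feature
    one_hot = [1.0 if i == label else 0.0 for i in range(len(weights))]
    dim = len(weights[0])
    return [sum(one_hot[i] * weights[i][j] for i in range(len(weights)))
            for j in range(dim)]

def embedding_lookup(label, weights):
    # Embedding layer applied to the label encoded feature
    return weights[label]  # direct row selection, no one-hot needed

print(linear_one_hot(1, weights))    # [0.3, 0.4]
print(embedding_lookup(1, weights))  # [0.3, 0.4]
```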
Note also that the printed loss in the case of regression is no longer RMSE but MSE. This is done for consistency with the metrics saved in the `History` callback.
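For reference, RMSE is just the square root of MSE, so the change only affects what is printed, not what is optimised:

```python
# RMSE is the square root of MSE; printing one vs the other is purely
# a reporting choice.
import math

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true, y_pred = [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]
print(mse(y_true, y_pred))             # 2/3
print(math.sqrt(mse(y_true, y_pred)))  # sqrt(2/3)
```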
NOTE: this does not change anything in terms of how one uses the package. `pytorch-widedeep` can be used in the exact same way as previous versions. However, since the model components have changed, models generated with previous versions are not compatible with this version.
v0.4.2: Added more metrics
Added Precision, Recall, FBetaScore and Fscore.
The metrics now available are: Accuracy, Precision, Recall, FBetaScore and Fscore.
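For reference, the F-beta score combines precision and recall with a weight `beta` on recall; `beta == 1` recovers the familiar F1 score. A pure-Python sketch of the formula (not the library's implementation):

```python
# F-beta = (1 + beta^2) * P * R / (beta^2 * P + R), sketched in pure
# Python (illustrative only, not the library's metric classes).

def fbeta(precision, recall, beta=1.0):
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(fbeta(0.5, 0.5))  # 0.5 -- F1 of equal precision and recall
```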
v0.4.1: Added Docs
Added documentation, improved code quality and fixed a bug related to the Focal Loss.