* Fix sweep to keep the best model and add best_score of the first model
* Improve documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Remove wrong changes
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
README.md: 1 addition & 1 deletion
PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike. The core principles behind the design of the library are:
docs/models.md: 21 additions & 21 deletions

While there are separate config classes for each model, all of them share a few parameters:
- `learning_rate`: float: The learning rate of the model. Defaults to 1e-3.
- `loss`: Optional\[str\]: The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure of what you are doing, leave it at MSELoss (or L1Loss) for regression and CrossEntropyLoss for classification.
- `metrics`: Optional\[List\[str\]\]: The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in `torchmetrics`. By default, it is `accuracy` for classification and `mean_squared_error` for regression.
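As an illustrative sketch of how these shared parameters are usually passed, here is one possible config; the choice of `CategoryEmbeddingModelConfig` and the specific values are assumptions for the example, not part of the changed docs.

```python
from pytorch_tabular.models import CategoryEmbeddingModelConfig

# A minimal sketch showing the shared ModelConfig parameters discussed above
# (model choice and values are illustrative).
model_config = CategoryEmbeddingModelConfig(
    task="classification",      # "regression" or "classification"
    learning_rate=1e-3,         # the default learning rate
    loss="CrossEntropyLoss",    # any loss from torch.nn, referenced by name
    metrics=["accuracy"],       # functional metric names from torchmetrics
)
```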
That's it, that's the most basic necessity. All the rest is intelligently inferred.

The Adam optimizer and the `learning_rate` of 1e-3 is a default set in PyTorch Tabular. It's a rule of thumb that works in most cases and a good starting point that has worked well empirically. If you want to change the learning rate (which is a pretty important hyperparameter), this is where you should do it. There is also an automatic way to derive a good learning rate, which we will talk about in the TrainerConfig. In that case, PyTorch Tabular will ignore the learning rate set through this parameter.
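As a small sketch of that automatic route (assuming the `auto_lr_find` flag of `TrainerConfig`):

```python
from pytorch_tabular.config import TrainerConfig

# Sketch: let PyTorch Tabular search for a good learning rate; in this case
# the learning_rate set in the model config is ignored.
trainer_config = TrainerConfig(
    auto_lr_find=True,
    max_epochs=100,
)
```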
Another key component of the model is the `loss`. PyTorch Tabular can use any loss function from standard PyTorch ([`torch.nn`](https://pytorch.org/docs/stable/nn.html#loss-functions)) through this config. By default, it is set to `MSELoss` for regression and `CrossEntropyLoss` for classification, which work well for those use cases and are the most popular loss functions used. If you want to use something else specifically, like `L1Loss`, you just need to mention it in the `loss` parameter:

```python
loss = "L1Loss"
```
PyTorch Tabular also accepts custom loss functions (which are drop-in replacements for the standard loss functions) through the `fit` method in the `TabularModel`.
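For instance, a custom drop-in loss might be passed at fit time roughly like this; the sketch assumes the `loss` argument of `fit`, an existing `TabularModel` instance named `tabular_model`, and dataframes `train`/`val`, and the class weights are purely illustrative.

```python
import torch
import torch.nn as nn

# Sketch: pass a drop-in replacement loss to `fit` (illustrative weights).
custom_loss = nn.CrossEntropyLoss(weight=torch.tensor([0.3, 0.7]))

tabular_model.fit(train=train, validation=val, loss=custom_loss)
```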
!!! warning
All the parameters have intelligent default values. Let's look at a few of them:

- `use_batch_norm`: bool: Flag to include a BatchNorm layer after each Linear Layer + Dropout. Defaults to `False`
- `dropout`: float: The probability of an element to be zeroed. This applies to all the linear layers. Defaults to `0.0`

**For a complete list of parameters refer to the API Docs**
### Gated Adaptive Network for Deep Automated Learning of Features (GANDALF)
All the parameters have been set to recommended values from the paper. Let's look at a few of them:

GANDALF can be considered a lighter and more performant Gated Additive Tree Ensemble (GATE). For most purposes, GANDALF is a better choice than GATE.

**For a complete list of parameters refer to the API Docs**
[pytorch_tabular.models.GANDALFConfig][]
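A minimal sketch of selecting GANDALF, assuming only the required `task` argument and leaving everything else at the paper-recommended defaults:

```python
from pytorch_tabular.models import GANDALFConfig

# Sketch: use GANDALF as a drop-in alternative to GATE with default settings.
model_config = GANDALFConfig(task="classification")
```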
All the parameters have been set to recommended values from the paper. Let's look at a few of them:

- `share_head_weights`: bool: If True, we will share the weights between the heads. Defaults to True

**For a complete list of parameters refer to the API Docs**
[Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data](https://arxiv.org/abs/1909.06312) is a model presented in ICLR 2020 and, according to the authors, has beaten well-tuned Gradient Boosting models on many datasets. It uses a neural equivalent of Oblivious Trees (the kind of trees CatBoost uses) as the basic building blocks of the architecture. You can use it by choosing `NodeConfig`.

The basic block, or a "layer", looks something like below (from the paper):

All the parameters have been set to recommended values from the paper. Let's look at a few of them:

- `num_layers`: int: Number of Oblivious Decision Tree Layers in the Dense Architecture. Defaults to `1`
- `num_trees`: int: Number of Oblivious Decision Trees in each layer. Defaults to `2048`
- `depth`: int: The depth of the individual Oblivious Decision Trees. Parameters increase exponentially with the increase in depth. Defaults to `6`
- `choice_function`: str: Generates a sparse probability distribution to be used as feature weights (aka soft feature selection). Choices are: `entmax15`, `sparsemax`. Defaults to `entmax15`
- `bin_function`: str: Generates a sparse probability distribution to be used as tree leaf weights. Choices are: `entmoid15`, `sparsemoid`. Defaults to `entmoid15`
- `additional_tree_output_dim`: int: The additional output dimensions which are only used to pass through the different layers of the architecture. Only the first `output_dim` outputs will be used for prediction. Defaults to `3`
- `input_dropout`: float: Dropout which is applied to the input to the different layers in the Dense Architecture. The probability of an element to be zeroed. Defaults to `0.0`

**For a complete list of parameters refer to the API Docs**
[pytorch_tabular.models.NodeConfig][]
!!! note
    The NODE model has a lot of parameters and therefore takes up a lot of memory. Smaller batch sizes (like 64 or 128) make the model manageable on a smaller GPU (~4GB).
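Putting the note above into practice, a NODE configuration paired with a smaller batch size might look like the sketch below; the specific values simply restate the documented defaults and are illustrative.

```python
from pytorch_tabular.config import TrainerConfig
from pytorch_tabular.models import NodeConfig

# Sketch: NODE with the documented defaults, paired with a smaller batch size
# as suggested in the note above.
model_config = NodeConfig(
    task="classification",
    num_layers=1,
    num_trees=2048,
    depth=6,
)
trainer_config = TrainerConfig(batch_size=128)
```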
### TabNet
[TabNet: Attentive Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442) is another model coming out of Google Research which uses Sparse Attention in multiple steps of decision making to model the output. You can use it by choosing `TabNetModelConfig`.

The architecture is as shown below (from the paper):
All the parameters have been set to recommended values from the paper. Let's look at a few of them:

- `n_d`: int: Dimension of the prediction layer (usually between 4 and 64). Defaults to `8`
- `n_a`: int: Dimension of the attention layer (usually between 4 and 64). Defaults to `8`
- `n_steps`: int: Number of successive steps in the network (usually between 3 and 10). Defaults to `3`
- `n_independent`: int: Number of independent GLU layers in each GLU block. Defaults to `2`
- `n_shared`: int: Number of shared GLU layers in each GLU block. Defaults to `2`
- `virtual_batch_size`: int: Batch size for Ghost Batch Normalization. BatchNorm on large batches sometimes does not do very well, so Ghost Batch Normalization, which does batch normalization in smaller virtual batches, is implemented in TabNet. Defaults to `128`

**For a complete list of parameters refer to the API Docs**
[pytorch_tabular.models.TabNetModelConfig][]
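A short sketch of a TabNet config with the parameters discussed above spelled out explicitly; the values simply restate the documented defaults and can be omitted to keep them.

```python
from pytorch_tabular.models import TabNetModelConfig

# Sketch: TabNet with the documented defaults made explicit.
model_config = TabNetModelConfig(
    task="classification",
    n_d=8,
    n_a=8,
    n_steps=3,
    virtual_batch_size=128,
)
```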
### Automatic Feature Interaction Learning via Self-Attentive Neural Networks (AutoInt)
All the parameters have been set to recommended values from the paper. Let's look at a few of them:

- `num_heads`: int: The number of heads in the Multi-Headed Attention layer. Defaults to 2
- `num_attn_blocks`: int: The number of layers of stacked Multi-Headed Attention layers. Defaults to 3

**For a complete list of parameters refer to the API Docs**
[pytorch_tabular.models.AutoIntConfig][]
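A short sketch of an AutoInt config with the attention-related parameters above; the values are illustrative.

```python
from pytorch_tabular.models import AutoIntConfig

# Sketch: AutoInt with the attention parameters discussed above.
model_config = AutoIntConfig(
    task="regression",
    num_heads=2,
    num_attn_blocks=3,
)
```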
### DANETs: Deep Abstract Networks for Tabular Data Classification and Regression
All the parameters have been set to recommended values from the paper. Let's look at them:

- `n_layers`: int: Number of Blocks in the DANet. Each Block has 2 Abstlay Blocks. Defaults to 8
- `abstlay_dim_1`: int: The dimension for the intermediate output in the first ABSTLAY layer in a Block. Defaults to 32
- `abstlay_dim_2`: int: The dimension for the intermediate output in the second ABSTLAY layer in a Block. If None, it will be twice abstlay_dim_1. Defaults to None
- `k`: int: The number of feature groups in the ABSTLAY layer. Defaults to 5
- `dropout_rate`: float: Dropout to be applied in the Block. Defaults to 0.1

**For a complete list of parameters refer to the API Docs**
[pytorch_tabular.models.DANetConfig][]
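A short sketch of a DANet config using the parameters above; the values restate the documented defaults and are illustrative.

```python
from pytorch_tabular.models import DANetConfig

# Sketch: DANet with the documented defaults made explicit.
model_config = DANetConfig(
    task="classification",
    n_layers=8,
    abstlay_dim_1=32,
    k=5,
    dropout_rate=0.1,
)
```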
## Implementing New Architectures
In addition to the model, you will also need to define a config. Configs are Python dataclasses.

**Key things to note:**
1. All the different parameters in the different configs (like TrainerConfig, OptimizerConfig, etc.) are all available in `config` before calling `super()` and in `self.hparams` after.
1. The input batch at the `forward` method is a dictionary with keys `continuous` and `categorical`.
1. In the `_build_network` method, save every component that you want to access in `forward` to `self` (see the sketch below).
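To make those notes concrete, here is a rough skeleton of a custom model. It is only a sketch based on the points above: the `BaseModel` import path, the `continuous_dim`/`output_dim` hyperparameters, and the `{"logits": ...}` return convention are assumptions, not a verbatim copy of the library's template, and the architecture itself is hypothetical.

```python
import torch.nn as nn
from pytorch_tabular.models import BaseModel  # assumed import path


class MyShallowNet(BaseModel):  # hypothetical example architecture
    def __init__(self, config, **kwargs):
        # Everything from the merged configs is available in `config` here
        # and in `self.hparams` after calling `super().__init__`.
        super().__init__(config=config, **kwargs)

    def _build_network(self):
        # Save every component needed in `forward` to `self`.
        # `continuous_dim` and `output_dim` are assumed to be inferred into hparams.
        self.linear = nn.Linear(self.hparams.continuous_dim, self.hparams.output_dim)

    def forward(self, x):
        # The input batch is a dictionary with keys `continuous` and `categorical`.
        logits = self.linear(x["continuous"])
        return {"logits": logits}  # assumed output convention
```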