davidrsch · davidrsch · Sep 9, 2025 · Sep 9, 2025
diff --git a/_pkgdown.yml b/_pkgdown.yml
@@ -21,6 +21,7 @@ guides:
       - functional_api
       - workflows_sequential
       - workflows_functional
+      - tuning_fit_compile_args
 
 # examples:
 
@@ -88,6 +89,9 @@ navbar:
           href: articles/workflows_sequential.html
         - text: "Functional API"
           href: articles/workflows_functional.html
+        - text: "Tuning"
+        - text: "Tuning Fit and Compile Arguments"
+          href: articles/tuning_fit_compile_args.html
     github:
       icon: fa-github
       href: https://github.com/davidrsch/kerasnip
diff --git a/vignettes/tuning_fit_compile_args.Rmd b/vignettes/tuning_fit_compile_args.Rmd
@@ -0,0 +1,194 @@
+---
+title: "Tuning Fit and Compile Arguments"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Tuning Fit and Compile Arguments}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  eval = reticulate::py_module_available("keras")
+)
+# Suppress verbose Keras output for the vignette
+options(keras.fit_verbose = 0)
+set.seed(123)
+```
+
+## Introduction
+
+While `kerasnip` makes it easy to tune the architecture of a Keras model (e.g., the number of layers or the number of units in a layer), it is often just as important to tune the parameters that control the training process itself. `kerasnip` exposes these parameters through special `fit_*` and `compile_*` arguments in the model specification.
+
+This vignette provides a comprehensive example of how to tune these arguments within a `tidymodels` workflow. We will tune:
+
+*   **`fit_epochs`**: The number of training epochs.
+*   **`fit_batch_size`**: The number of samples per gradient update.
+*   **`compile_optimizer`**: The optimization algorithm (e.g., "adam", "sgd").
+*   **`compile_loss`**: The loss function used for training.
+*   **`learn_rate`**: The learning rate for the optimizer.
+
+## Setup
+
+First, we load the necessary packages.
+
+```{r load-packages}
+library(kerasnip)
+library(tidymodels)
+library(keras3)
+```
+
+## Data Preparation
+
+We will use the classic `iris` dataset for this example. It's a simple, small dataset, which is ideal for demonstrating the tuning process without long training times.
+
+```{r data-prep}
+# Split data into training and testing sets
+set.seed(123)
+iris_split <- initial_split(iris, prop = 0.8, strata = Species)
+iris_train <- training(iris_split)
+iris_test <- testing(iris_split)
+
+# Create cross-validation folds for tuning
+iris_folds <- vfold_cv(iris_train, v = 3, strata = Species)
+```
+
+## Define a `kerasnip` Model
+
+We'll create a very simple sequential model with a single dense layer. This keeps the focus on tuning the `fit_*` and `compile_*` arguments rather than the model architecture.
+
+```{r define-kerasnip-model}
+# Define layer blocks
+input_block <- function(model, input_shape) {
+  keras_model_sequential(input_shape = input_shape)
+}
+dense_block <- function(model, units = 10) {
+  model |> layer_dense(units = units, activation = "relu")
+}
+output_block <- function(model, num_classes) {
+  model |> layer_dense(units = num_classes, activation = "softmax")
+}
+
+# Create the kerasnip model specification function
+create_keras_sequential_spec(
+  model_name = "iris_mlp",
+  layer_blocks = list(
+    input = input_block,
+    dense = dense_block,
+    output = output_block
+  ),
+  mode = "classification"
+)
+```
+
+## Define the Tunable Specification
+
+Now, we create an instance of our `iris_mlp` model. We set the arguments we want to optimize to `tune()`.
+
+```{r define-tune-spec}
+# Define the tunable model specification
+tune_spec <- iris_mlp(
+  dense_units = 16, # Keep architecture fixed for this example
+  fit_epochs = tune(),
+  fit_batch_size = tune(),
+  compile_optimizer = tune(),
+  compile_loss = tune(),
+  learn_rate = tune()
+) |>
+  set_engine("keras")
+
+print(tune_spec)
+```
+
+## Create Workflow and Tuning Grid
+
+Next, we create a `workflow` and define the search space for our hyperparameters using `dials`. `kerasnip` provides special `dials` parameter functions for `optimizer` and `loss`.
+
+```{r create-workflow-grid}
+# Create a simple recipe
+iris_recipe <- recipe(Species ~ ., data = iris_train) |>
+  step_normalize(all_numeric_predictors())
+
+# Create the workflow
+tune_wf <- workflow() |>
+  add_recipe(iris_recipe) |>
+  add_model(tune_spec)
+
+# Define the tuning grid
+params <- extract_parameter_set_dials(tune_wf) |>
+  update(
+    fit_epochs = epochs(c(10, 30)),
+    fit_batch_size = batch_size(c(16, 64), trans = NULL),
+    compile_optimizer = optimizer_function(values = c("adam", "sgd", "rmsprop")),
+    compile_loss = loss_function_keras(values = c("categorical_crossentropy", "kl_divergence")),
+    learn_rate = learn_rate(c(0.001, 0.01), trans = NULL)
+  )
+
+set.seed(456)
+tuning_grid <- grid_regular(params, levels = 2)
+
+tuning_grid
+```
+
+## Tune the Model
+
+With the workflow and grid defined, we can now run the hyperparameter tuning using `tune_grid()`.
+
+```{r tune-model, cache=TRUE}
+tune_res <- tune_grid(
+  tune_wf,
+  resamples = iris_folds,
+  grid = tuning_grid,
+  metrics = metric_set(accuracy, roc_auc),
+  control = control_grid(save_pred = FALSE, save_workflow = TRUE, verbose = FALSE)
+)
+```
+
+## Inspect the Results
+
+Let's examine the results to see how the different combinations of fitting and compilation parameters performed.
+
+```{r inspect-results}
+# Show the best performing models based on accuracy
+show_best(tune_res, metric = "accuracy")
+
+# Plot the results
+autoplot(tune_res) + theme_minimal()
+
+# Select the best hyperparameters
+best_params <- select_best(tune_res, metric = "accuracy")
+print(best_params)
+```
+
+The results show that `tune` has successfully explored different optimizers, loss functions, learning rates, epochs, and batch sizes, identifying the combination that yields the best accuracy.
+
+## Finalize and Fit
+
+Finally, we finalize our workflow with the best-performing hyperparameters and fit the model one last time on the full training dataset.
+
+```{r finalize-fit}
+# Finalize the workflow
+final_wf <- finalize_workflow(tune_wf, best_params)
+
+# Fit the final model
+final_fit <- fit(final_wf, data = iris_train)
+
+print(final_fit)
+```
+
+We can now use this `final_fit` object to make predictions on the test set.
+
+```{r predict}
+# Make predictions
+predictions <- predict(final_fit, new_data = iris_test)
+
+# Evaluate performance
+bind_cols(predictions, iris_test) |>
+  accuracy(truth = Species, estimate = .pred_class)
+```
+
+## Conclusion
+
+This vignette demonstrated how to tune the crucial `fit_*` and `compile_*` arguments of a Keras model within the `tidymodels` framework using `kerasnip`. By exposing these as tunable parameters, `kerasnip` gives you full control over the training process, allowing you to optimize not just the model's architecture, but also how it learns.