
Conversation

**yaugenst-flex** (Collaborator) commented Nov 13, 2025

Greptile Overview

Greptile Summary

This PR significantly improves the autograd plugin README with a comprehensive rewrite that makes the documentation more accessible and user-friendly. The update reorganizes content into logical sections, adds practical examples, expands coverage of supported components, and provides clearer migration guidance from the deprecated JAX-based adjoint plugin.

Key Improvements

  • Better structure: Content flows from introduction → how it works → basic workflow → capabilities → advanced features → best practices
  • Expanded tables: Comprehensive coverage of differentiable geometry, materials (including all dispersive models), and monitors
  • Practical examples: Added complete code example showing optimization loop from start to finish
  • Runtime controls: New section documenting local_gradient, max_num_adjoint_per_fwd, and other configuration options
  • Advanced toolkit: Detailed coverage of topology optimization tools (filters, projections, penalties) and differentiable primitives
  • Migration guide: Clearer instructions for users transitioning from the old JAX plugin
  • Mermaid diagram: Visual representation of forward/adjoint gradient flow

The documentation now provides a much more complete reference for users performing inverse design and optimization with Tidy3D.

Confidence Score: 5/5

  • This documentation-only PR is safe to merge with no risk
  • This is a pure documentation update with no code changes. The README improvements enhance user experience by providing clearer explanations, better organization, and more comprehensive coverage of the autograd plugin's features. All referenced functions and components were verified to exist in the codebase.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| `tidy3d/plugins/autograd/README.md` | 5/5 | Comprehensive rewrite of autograd documentation with improved structure, added examples, and better explanation of features |

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant ObjectiveFn as Objective Function
    participant Autograd as autograd Library
    participant TD as td.web.run()
    participant FDTD as FDTD Solver
    participant Adjoint as Adjoint Engine

    User->>ObjectiveFn: Call with design params
    ObjectiveFn->>TD: Build Simulation & run
    TD->>FDTD: Execute forward simulation
    FDTD-->>TD: Return SimulationData + traced fields
    TD-->>ObjectiveFn: SimulationData
    ObjectiveFn->>ObjectiveFn: Post-process to scalar
    ObjectiveFn-->>Autograd: Return scalar objective

    Note over User,Adjoint: Backward Pass (Gradient Calculation)

    User->>Autograd: Request gradient
    Autograd->>ObjectiveFn: Backpropagate
    ObjectiveFn->>TD: Custom VJP rule triggered
    TD->>Adjoint: Setup adjoint simulations
    Adjoint->>FDTD: Execute adjoint simulation(s)
    FDTD-->>Adjoint: Adjoint fields
    Adjoint->>Adjoint: Compute gradients from forward + adjoint fields
    Adjoint-->>TD: Gradients w.r.t. design params
    TD-->>ObjectiveFn: Gradients
    ObjectiveFn-->>Autograd: Gradients
    Autograd-->>User: Gradient array
```

yaugenst-flex self-assigned this Nov 13, 2025
yaugenst-flex force-pushed the FXC-4119-update-autograd-plugin-readme branch from 8e585bb to 8706f49 on November 13, 2025 at 10:56
yaugenst-flex marked this pull request as ready for review on November 13, 2025 at 10:57
**greptile-apps** (bot) left a comment:

1 file reviewed, no comments

**github-actions** (Contributor) commented:

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

No lines with coverage information in this diff.

```python
structures=[structure],
...
)
```
The gradient calculation is performed efficiently using the **adjoint method**, which requires only one additional simulation per gradient evaluation, regardless of the number of design parameters. This makes it feasible to optimize devices with thousands of parameters.
**Contributor** commented:


This might be too high-level a place to mention it, but is it worth saying that the number of adjoint simulations run in practice is sometimes more than one?


* **Geometry + Material coverage**: Optimize standard geometries (including `PolySlab` sidewall angles) and dispersive media without custom wrappers.
* **Topology-friendly workflows**: `CustomMedium` plus the plugin’s filters/projections let you impose fabrication constraints while staying differentiable.
* **Broadband + adjoint throttling**: A single broadband source can drive gradients; adjoint jobs are auto-grouped and limited by `max_num_adjoint_per_fwd`.
**Contributor** commented:


Possibly this is the place to mention more explicitly that broadband objective functions sometimes need more than one adjoint source? Maybe there could be a small section explaining how simulations get broken up for the adjoint, with a few examples, so that a user could structure their optimization around how many simulations they want to budget per iteration (e.g., reducing the number of frequency points in field monitors during the early stages of optimization).


Converting code from the `adjoint` plugin to the native autograd support is straightforward.
* Use `autograd.numpy` for every array operation in your objective; mixing standard NumPy silently drops gradients.
* Narrow the monitors to the frequencies that actually enter the objective or `max_num_adjoint_per_fwd` will balloon and block the run.
**Contributor** commented:


Not totally sure, but I thought this wouldn't be required. Additional frequencies in the monitors will increase the data size for the adjoint field/permittivity monitors in the forward run, but as I understand it, the number of adjoint simulations and the data sizes for the adjoint run should depend only on the frequencies that are used in the objective function.

```python
objective_value = mode_power.sel(mode_index=0).sum()
```
* **`local_gradient`**: Pass `local_gradient=True` to `td.web.run` / `td.web.run_async` (or set `config.adjoint.local_gradient`) to download the forward and adjoint field data. This is required if you rely on other `config.adjoint.*` overrides (grid spacing, gradient precision, etc.), because remote/server-side gradients ignore those settings.
When enabled, Tidy3D attaches the adjoint monitors up front (via `_with_adjoint_monitors`) so the forward run exports all fields needed for the backward pass, increasing monitor count, runtime, and download size. Ensure the directory pointed to by `config.adjoint.local_adjoint_dir` has sufficient space.
* **Adjoint batch safety (`max_num_adjoint_per_fwd`)**: Each forward simulation can spawn at most `max_num_adjoint_per_fwd` adjoint solves (defaults to `config.adjoint.max_adjoint_per_fwd = 10`). Increase the argument if your objective touches many monitors or broadband field data; otherwise the run will raise an error before launching excessive jobs.
**Contributor** commented:


Does this config apply only to local gradients, or to remote gradients as well?

This can cause unnecessary data usage during the forward pass, especially if the monitors contain many frequencies that are not relevant for the objective function (i.e., they are not being differentiated w.r.t.).
To avoid this, restrict the frequencies in the monitors only to the ones that are relevant for differentiation during optimization.
* **Use `autograd.numpy`**: Always import `autograd.numpy as anp` and use it for all numerical operations within your objective function.
* **Extract Raw Data**: Before performing numerical operations on `xarray.DataArray` objects from `SimulationData` (e.g., `sim_data["monitor"].amps`), extract the raw numpy array using the `.values` or `.data` attribute. This avoids potential issues with metadata interfering with `autograd`.
**Contributor** commented:


I remember a discussion where we wanted to have people use `.data` instead of `.values`. I can't remember the reason, and maybe it's no longer relevant, but if it is, maybe we can say here that the best practice is to just use `.data`?

```python
intensity = anp.sum(anp.abs(Ex_data)**2)
```
* **Use `GeometryGroup`**: To optimize more than 500 structures, group them into a single `GeometryGroup` if they share the same medium.
* **Set `background_medium`**: When optimizing a structure's shape within another structure, set `Structure.background_medium` to ensure correct gradient calculation at the material interface.
**Contributor** commented:


Trying to remember why this is needed. In `PolySlab` right now, it seems like we compute `eps_in` and `eps_out` properly with `eps_no` and `eps_inf`. Was this a holdover from `Box`? If so, I can double-check, but since it shares the `compute_derivatives` path now, this might not be necessary anymore.

**groberts-flex** (Contributor) left a comment:

Overall this sounds really good, thanks for the update! Left a few comments/questions.

yaugenst-flex force-pushed the FXC-4119-update-autograd-plugin-readme branch from 769bf2a to 79920b1 on November 19, 2025 at 07:24