docs: update autograd plugin readme #2999
base: develop
8e585bb to 8706f49
1 file reviewed, no comments
Diff Coverage
Diff: origin/develop...HEAD, staged and unstaged changes
No lines with coverage information in this diff.
8706f49 to 769bf2a
| structures=[structure], | ||
| ... | ||
| ) | ||
| The gradient calculation is performed efficiently using the **adjoint method**, which requires only one additional simulation per gradient evaluation, regardless of the number of design parameters. This makes it feasible to optimize devices with thousands of parameters. |
This might be too high-level a place to mention it, but is it worth saying that the number of adjoint simulations run in practice is sometimes more than 1?
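As a toy illustration of the adjoint claim in the quoted README line, the sketch below differentiates J = c·x subject to A(p)x = b, where A(p) adds the design parameters p to the diagonal of a fixed tridiagonal matrix. This is a small linear-algebra stand-in, not Tidy3D code, and the names `system_matrix`, `adjoint_gradient`, and `finite_difference_gradient` are hypothetical. One forward solve plus one adjoint solve (Aᵀλ = c) gives every component dJ/dp_i = -λ_i x_i, whereas finite differences need one extra solve per parameter.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def system_matrix(p):
    """Hypothetical 'physics': tridiagonal matrix with p added to the diagonal."""
    n = len(p)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0 + p[i]
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def adjoint_gradient(p, b, c):
    """Objective and full gradient from two solves, independent of len(p)."""
    A = system_matrix(p)
    x = solve(A, b)                                # forward solve
    lam = solve([list(col) for col in zip(*A)], c) # adjoint solve: A^T lam = c
    grad = [-lam[i] * x[i] for i in range(len(p))] # dJ/dp_i = -lam_i * x_i
    return dot(c, x), grad


def finite_difference_gradient(p, b, c, h=1e-6):
    """Reference gradient: needs len(p) + 1 solves."""
    base = dot(c, solve(system_matrix(p), b))
    grad = []
    for i in range(len(p)):
        q = p[:]
        q[i] += h
        grad.append((dot(c, solve(system_matrix(q), b)) - base) / h)
    return grad


p = [0.1, 0.2, 0.3, 0.4, 0.5]
b = [1.0] * 5
c = [0.0, 0.0, 1.0, 0.0, 0.0]
J, grad_adjoint = adjoint_gradient(p, b, c)
grad_fd = finite_difference_gradient(p, b, c)
print(J, grad_adjoint, grad_fd)
```

The sketch covers a single scalar objective with one adjoint right-hand side. As the comment above notes, a real run can require more than one adjoint simulation.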
| * **Geometry + Material coverage**: Optimize standard geometries (including `PolySlab` sidewall angles) and dispersive media without custom wrappers. | ||
| * **Topology-friendly workflows**: `CustomMedium` plus the plugin’s filters/projections let you impose fabrication constraints while staying differentiable. | ||
| * **Broadband + adjoint throttling**: A single broadband source can drive gradients; adjoint jobs are auto-grouped and limited by `max_num_adjoint_per_fwd`. |
Possibly this is the place to mention more explicitly that broadband objective functions sometimes need more than one adjoint source? Maybe there could be a small section that explains how simulations get broken up for the adjoint run, with a few examples, so that a user could structure their optimization around how many simulations they want to budget for each iteration (for example, they may want to reduce the number of frequency points when using field monitors during the early stages of optimization).
| Converting code from the `adjoint` plugin to the native autograd support is straightforward. | ||
| * Use `autograd.numpy` for every array operation in your objective; mixing standard NumPy silently drops gradients. | ||
| * Narrow the monitors to the frequencies that actually enter the objective or `max_num_adjoint_per_fwd` will balloon and block the run. |
Not totally sure, but I thought that this wouldn't be required. Additional frequencies in the monitors will increase the data size of the adjoint field/permittivity monitors in the forward run, but the number of adjoint simulations and the data sizes for the adjoint run should depend only on the frequencies that are used in the objective function, as I understand it.
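On the first quoted bullet (use `autograd.numpy` for every array operation), the following toy may help show what "silently drops gradients" can look like. It is an analogy only: it uses a hand-written forward-mode `Dual` number rather than autograd's reverse-mode tracing, and `Dual`, `traced_sin`, `objective_traced`, and `objective_untraced` are all hypothetical names. The point is that a function unaware of the tracer (here `math.sin`) converts the traced value to a plain float without raising, so the result is a wrong derivative rather than an error.

```python
import math


class Dual:
    """Toy traced number: a value together with its derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule.
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__

    def __float__(self):
        # Lets untraced functions accept a Dual, discarding the derivative.
        return float(self.value)


def traced_sin(x):
    """Tracer-aware sine (stands in for a function from the traced namespace)."""
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def objective_traced(x):
    return traced_sin(x) * x  # derivative: sin(x) + x*cos(x)


def objective_untraced(x):
    # math.sin() calls float(x): no error, but the cos(x) term is lost.
    return math.sin(x) * x


x = Dual(2.0, 1.0)  # seed dx/dx = 1
print(objective_traced(x).deriv)    # sin(2) + 2*cos(2)
print(objective_untraced(x).deriv)  # sin(2) only: silently wrong
```

Both calls return the same objective value; only the derivative differs, which is why this kind of mistake is easy to miss.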
| objective_value = mode_power.sel(mode_index=0).sum() | ||
| * **`local_gradient`**: Pass `local_gradient=True` to `td.web.run` / `td.web.run_async` (or set `config.adjoint.local_gradient`) to download the forward and adjoint field data. This is required if you rely on other `config.adjoint.*` overrides (grid spacing, gradient precision, etc.), because remote/server-side gradients ignore those settings. | ||
| When enabled, Tidy3D attaches the adjoint monitors up front (via `_with_adjoint_monitors`) so the forward run exports all fields needed for the backward pass, increasing monitor count, runtime, and download size. Ensure the directory pointed to by `config.adjoint.local_adjoint_dir` has sufficient space. | ||
| * **Adjoint batch safety (`max_num_adjoint_per_fwd`)**: Each forward simulation can spawn at most `max_num_adjoint_per_fwd` adjoint solves (defaults to `config.adjoint.max_adjoint_per_fwd = 10`). Increase the argument if your objective touches many monitors or broadband field data; otherwise the run will raise an error before launching excessive jobs. |
Does this config apply only to local gradients, or to remote gradients as well?
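To make the budgeting question concrete, here is a minimal arithmetic sketch of per-iteration simulation cost under a per-forward cap. It does not model how Tidy3D actually groups adjoint sources: the adjoint count for each forward simulation is an input the user supplies, and the helper names `simulations_per_iteration` and `total_simulations` are hypothetical. Only the default cap of 10 and the raise-before-launch behaviour are taken from the quoted README text.

```python
def simulations_per_iteration(adjoint_counts, max_num_adjoint_per_fwd=10):
    """Forward plus adjoint simulations for one gradient evaluation.

    adjoint_counts: assumed number of adjoint solves needed by each
    forward simulation (one list entry per forward simulation).
    """
    for index, count in enumerate(adjoint_counts):
        if count > max_num_adjoint_per_fwd:
            # Mirrors the described behaviour: fail before launching jobs.
            raise ValueError(
                f"forward simulation {index} needs {count} adjoint solves, "
                f"exceeding the cap of {max_num_adjoint_per_fwd}"
            )
    return len(adjoint_counts) + sum(adjoint_counts)


def total_simulations(adjoint_counts, num_iterations, max_num_adjoint_per_fwd=10):
    """Simulation budget for a whole optimization run."""
    per_iteration = simulations_per_iteration(adjoint_counts, max_num_adjoint_per_fwd)
    return num_iterations * per_iteration


# Two forward simulations needing 1 and 3 adjoint solves, over 50 iterations.
print(simulations_per_iteration([1, 3]))
print(total_simulations([1, 3], num_iterations=50))
```

A user could rerun this with a smaller assumed adjoint count (for example, fewer objective frequencies early in the optimization) to see how the budget changes.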
| This can cause unnecessary data usage during the forward pass, especially if the monitors contain many frequencies that are not relevant for the objective function (i.e., they are not being differentiated w.r.t.). | ||
| To avoid this, restrict the frequencies in the monitors only to the ones that are relevant for differentiation during optimization. | ||
| * **Use `autograd.numpy`**: Always import `autograd.numpy as anp` and use it for all numerical operations within your objective function. | ||
| * **Extract Raw Data**: Before performing numerical operations on `xarray.DataArray` objects from `SimulationData` (e.g., `sim_data["monitor"].amps`), extract the raw numpy array using the `.values` or `.data` attribute. This avoids potential issues with metadata interfering with `autograd`. |
I remember a discussion where we wanted to have people use `.data` instead of `.values`. I can't remember the reason, and maybe it's not relevant anymore, but if it is, maybe we can say that best practice is to just use `.data` here?
| intensity = anp.sum(anp.abs(Ex_data)**2) | ||
| ``` | ||
| * **Use `GeometryGroup`**: To optimize more than 500 structures, group them into a single `GeometryGroup` if they share the same medium. | ||
| * **Set `background_medium`**: When optimizing a structure's shape within another structure, set `Structure.background_medium` to ensure correct gradient calculation at the material interface. |
I'm trying to remember why this is needed. In `PolySlab` right now, it seems like we compute `eps_in` and `eps_out` properly with `eps_no` and `eps_inf`. Was this a holdover from `Box`? If so, I can double check, but since it shares the `compute_derivatives` path now, this might not be necessary anymore.
groberts-flex left a comment
Overall this sounds really good, thanks for the update! I left a few comments/questions.
769bf2a to 79920b1
Greptile Overview
Greptile Summary
This PR significantly improves the autograd plugin README with a comprehensive rewrite that makes the documentation more accessible and user-friendly. The update reorganizes content into logical sections, adds practical examples, expands coverage of supported components, and provides clearer migration guidance from the deprecated JAX-based adjoint plugin.
Key Improvements
`local_gradient`, `max_num_adjoint_per_fwd`, and other configuration options

The documentation now provides a much more complete reference for users performing inverse design and optimization with Tidy3D.
Confidence Score: 5/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
    participant User
    participant ObjectiveFn as Objective Function
    participant Autograd as autograd Library
    participant TD as td.web.run()
    participant FDTD as FDTD Solver
    participant Adjoint as Adjoint Engine
    User->>ObjectiveFn: Call with design params
    ObjectiveFn->>TD: Build Simulation & run
    TD->>FDTD: Execute forward simulation
    FDTD-->>TD: Return SimulationData + traced fields
    TD-->>ObjectiveFn: SimulationData
    ObjectiveFn->>ObjectiveFn: Post-process to scalar
    ObjectiveFn-->>Autograd: Return scalar objective
    Note over User,Adjoint: Backward Pass (Gradient Calculation)
    User->>Autograd: Request gradient
    Autograd->>ObjectiveFn: Backpropagate
    ObjectiveFn->>TD: Custom VJP rule triggered
    TD->>Adjoint: Setup adjoint simulations
    Adjoint->>FDTD: Execute adjoint simulation(s)
    FDTD-->>Adjoint: Adjoint fields
    Adjoint->>Adjoint: Compute gradients from forward + adjoint fields
    Adjoint-->>TD: Gradients w.r.t. design params
    TD-->>ObjectiveFn: Gradients
    ObjectiveFn-->>Autograd: Gradients
    Autograd-->>User: Gradient array