\maketitle

% \section{Notation}
% \todo[inline]{Is this necessary?}

\section{Runge-Kutta Methods}
Runge-Kutta methods can numerically solve differential-algebraic equations
(DAEs) that are written in the form
\begin{equation}
  M \dv{u}{t} = f(u, t),\quad u(a)=u_a \in \mathbb{R}^m, \quad t\in [a, b].
\end{equation}
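For concreteness, the classic Robertson problem fits this form with a singular
mass matrix $M$. The following sketch uses OrdinaryDiffEq.jl's mass-matrix
interface; the solver choice and tolerances here are illustrative, not
prescriptive:
\begin{verbatim}
using OrdinaryDiffEq

# Robertson kinetics: two differential equations plus one algebraic
# conservation law, encoded by a zero row in the mass matrix.
function rober!(du, u, p, t)
    y1, y2, y3 = u
    du[1] = -0.04y1 + 1.0e4 * y2 * y3
    du[2] = 0.04y1 - 1.0e4 * y2 * y3 - 3.0e7 * y2^2
    du[3] = y1 + y2 + y3 - 1.0          # algebraic constraint row
end

M = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 0.0]   # singular mass matrix
f = ODEFunction(rober!, mass_matrix = M)
prob = ODEProblem(f, [1.0, 0.0, 0.0], (0.0, 1.0e5))
sol = solve(prob, RadauIIA5(), reltol = 1e-8, abstol = 1e-8)
\end{verbatim}
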
  \end{align}
\end{definition}

We observe that when $a_{ij} = 0$ for each $j \ge i$, \cref{eq:rk_lin} can be
computed without solving a system of algebraic equations; we call methods with
this property \emph{explicit} and \emph{implicit} otherwise.

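For example, the explicit Euler method ($s = 1$, $a_{11} = 0$, $b_1 = 1$)
computes its single stage directly, whereas the implicit Euler method ($s = 1$,
$a_{11} = b_1 = 1$) must solve the algebraic system
\begin{equation}
  M z_1 = h f(u_n + z_1, t + h)
\end{equation}
for $z_1$ at every step.
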
We should note that solving for $z_i$ is much preferred over solving for
$g_i$, as the $z_i$ have a smaller magnitude. A method is stiffly accurate,
i.e.\ its stability function satisfies $R(\infty) = 0$, when the matrix
$\bm{A}$ has full rank and $a_{si} = b_i$ for all $i$. Hence, for such methods
$\bm{b}^T\bm{A}^{-1}$ gives the last row of the identity matrix $I_s$. We can
use this condition to further simplify \cref{eq:rk_sum} (by a slight abuse of
notation):
\begin{equation} \label{eq:rk_sim}
  u_{n+1} = u_n + \bm{b}^T\bm{k} = u_n +
  \bm{b}^T\underbrace{\bm{A}^{-1}\bm{z}}_\text{\cref{eq:rk_lin}} = u_n + z_s.
\end{equation}
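As a concrete check of this property, the following sketch verifies that the
2-stage Radau IIA method (order 3, with the standard coefficients) is stiffly
accurate and that $\bm{b}^T\bm{A}^{-1}$ recovers the last row of $I_s$:
\begin{verbatim}
using LinearAlgebra

# 2-stage Radau IIA (order 3) Butcher coefficients.
A = [5/12 -1/12; 3/4 1/4]
b = [3/4, 1/4]

@assert A[end, :] == b      # stiffly accurate: last row of A equals b

b' * inv(A)                 # ≈ [0.0 1.0], the last row of I_2
\end{verbatim}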

\section{Solving Nonlinear Systems from Implicit Runge-Kutta Methods}
\begin{equation}
  G(\bm{z}) = (I_s \otimes M) \bm{z} - h(\bm{A}\otimes I_m) \tilde{\bm{f}}(\bm{z}) \qq{and}
  \tilde{\bm{f}}(\bm{z})_i = f(u_n+z_i, t+c_i h)
\end{equation}
The purpose of introducing a computationally expensive nonlinear system solving
step is to combat extreme stiffness. The Jacobian matrix arising from
\cref{eq:rk_lin} is ill-conditioned due to stiffness. Thus, we must use Newton
iteration to ensure stability, since fixed-point iteration only converges for
contracting maps, which greatly limits the step size.

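To make the last point precise (a standard argument, assuming $M = I_m$ for
simplicity): fixed-point iteration applied to \cref{eq:rk_lin} repeatedly
evaluates the map
\begin{equation}
  \Phi(\bm{z}) = h(\bm{A}\otimes I_m)\tilde{\bm{f}}(\bm{z}),
\end{equation}
which is contracting only when $hL\norm{\bm{A}} < 1$, where $L$ is a Lipschitz
constant of $f$. Stiff problems have $L \gg 1$, so the admissible $h$ would be
as small as that of an explicit method, defeating the purpose of going
implicit.
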
Astute readers may notice that the Jacobian
\begin{equation}
  \tilde{\bm{J}}_{ij} = \pdv{\tilde{\bm{f}}_i}{\bm{z}_j} = \pdv{f(u_n + z_i, t+c_ih)}{u}
\end{equation}
requires us to compute the Jacobian of $f$ at $s$ many points, which can be
very expensive. We can approximate it by
\begin{equation}
  \tilde{\bm{J}}_{ij} \approx J = \pdv{f(u_n, t)}{u}.
\end{equation}
\end{align}
where $\bm{z}^k$ is the approximation to $\bm{z}$ at the $k$-th iteration.

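Putting the pieces together, a minimal self-contained sketch of the simplified
Newton iteration might look as follows. This is an illustrative dense
implementation, not OrdinaryDiffEq.jl's actual code, and the function names
and convergence check are placeholders:
\begin{verbatim}
using LinearAlgebra

# Simplified Newton for G(z) = 0 with the Jacobian of f frozen at (u_n, t).
function simplified_newton(f, J, M, A, c, u, t, h; maxiters = 10, tol = 1e-8)
    s, m = length(c), length(u)
    Is, Im = Matrix(1.0I, s, s), Matrix(1.0I, m, m)
    W = kron(Is, M) - h * kron(A, J)      # frozen Newton matrix
    Wfact = lu(W)                         # factorize once, reuse each iteration
    z = zeros(s * m)
    for _ in 1:maxiters
        stages = [f(u .+ z[(i-1)*m+1:i*m], t + c[i] * h) for i in 1:s]
        G = kron(Is, M) * z - h * kron(A, Im) * reduce(vcat, stages)
        dz = -(Wfact \ G)
        z .+= dz
        norm(dz) < tol && break
    end
    return z
end
\end{verbatim}
Factorizing $W$ once and reusing it across iterations (and, ideally, across
steps) is where simplified Newton saves work relative to full Newton.
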
\subsection{Change of Basis}
\todo[inline]{discuss SDIRK}In his Radau IIA implementation, Hairer
left-multiplies \cref{eq:newton_1} by $(h\bm{A})^{-1} \otimes I_m$ to exploit
the structure of the iteration matrix \cite{hairer1999stiff}, so we have
\begin{align}
  ((h\bm{A})^{-1} \otimes I_m)(I_s \otimes M) &= (h\bm{A})^{-1} \otimes M \\
  ((h\bm{A})^{-1} \otimes I_m)(h\bm{A}\otimes J) &= I_s\otimes J \\
  (V^{-1}\otimes I_m)\tilde{\bm{f}}((V\otimes I_m)\bm{w}^k) \\
  \bm{w}^{k+1} &= \bm{w}^{k} + \Delta \bm{w}^k.
\end{align}
People usually call the matrix $W=(h^{-1} \Lambda \otimes M - I_s\otimes J)$
the iteration matrix or the $W$-matrix.

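To illustrate the payoff of the change of basis, the sketch below builds $W$
in the eigenbasis of $\bm{A}^{-1}$, where it is block diagonal; the toy $M$,
$J$, and $h$ are placeholders:
\begin{verbatim}
using LinearAlgebra

A = [5/12 -1/12; 3/4 1/4]            # 2-stage Radau IIA
lam, V = eigen(inv(A))               # A^{-1} = V * Diagonal(lam) * inv(V)
Lam = Diagonal(lam)                  # complex pair 2 ± sqrt(2)im

m = 3
M = Matrix(1.0I, m, m)               # toy mass matrix
J = Diagonal([-1.0, -10.0, -1000.0]) # toy stiff Jacobian
h = 0.01

# Block diagonal: the i-th m-by-m block is lam[i]/h * M - J, so each
# block can be factorized independently (in complex arithmetic).
W = kron(Lam, M) / h - kron(Matrix(1.0I, 2, 2), J)
\end{verbatim}
Since complex-conjugate eigenvalues give complex-conjugate blocks, only one
block per conjugate pair needs to be factorized.
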
\subsection{Stopping Criteria}
Note that throughout this subsection, $\norm{\cdot}$ denotes the norm used by
the time-stepping error estimate. We make this choice because we need to
ensure that a result accepted as convergent by the nonlinear solver does not
introduce step rejections.

There are two approaches to estimating the error of a nonlinear solver: by the
displacement $\Delta\bm{z}^k$ or by the residual $G(\bm{z}^k)$. The residual
behaves like the error scaled by the Lipschitz constant of $G$. Stiff
equations have a large Lipschitz constant, and this constant is not known a
priori, which makes the residual test unreliable. Hence, we focus on the
analysis of the displacement.

Simplified Newton iteration converges linearly, so we can model the convergence
process as
\begin{equation}
  \norm{\Delta\bm{z}^{k+1}} \le \theta \norm{\Delta\bm{z}^{k}}.
\end{equation}
The convergence rate at the $k$-th iteration, $\theta_k$, can be estimated by
\begin{equation}
  \theta_k = \frac{\norm{\Delta\bm{z}^{k}}}{\norm{\Delta\bm{z}^{k-1}}},\quad k\ge 1.
\end{equation}
Notice we have the relation
\begin{equation}
  \bm{z} - \bm{z}^{k+1} = \sum_{i=0}^\infty \Delta\bm{z}^{k+i+1}.
\end{equation}
If $\theta<1$, by the triangle inequality, we then have
\begin{equation}
  \norm{\bm{z} - \bm{z}^{k+1}} \le
  \norm{\Delta\bm{z}^{k+1}}\sum_{i=0}^\infty \theta^i \le \theta
  \norm{\Delta\bm{z}^{k}}\sum_{i=0}^\infty \theta^i = \frac{\theta}{1-\theta}
  \norm{\Delta\bm{z}^{k}}.
\end{equation}
To ensure that the nonlinear solver error does not cause a step rejection, we
introduce a safety factor $\kappa = 1/10$. Our first convergence criterion is
\begin{equation}
  \eta_k \norm{\Delta\bm{z}^k} \le \kappa, \qq{if}
  k \ge 1 \text{ and } \theta_k < 1, \qq{where} \eta_k=\frac{\theta_k}{1-\theta_k}.
\end{equation}
One major drawback of this convergence criterion is that we can only check it
after two iterations. To cover the case of convergence in the first iteration,
we need to define $\eta_0$. It is reasonable to believe that the convergence
rate remains relatively constant with the same $W$-matrix, so if $W$ is reused
\todo[inline]{Add the reuse logic section} from the previous nonlinear solve,
then we define
\begin{equation}
  \eta_0 = \eta_{\text{old}},
\end{equation}
where $\eta_{\text{old}}$ is the final $\eta_k$ from the previous nonlinear
solve; otherwise we define
\begin{equation}
  \eta_0 = \max(\eta_{\text{old}}, \text{eps}(\text{relative tolerance}))^{0.8}.
\end{equation}
In the first iteration, we also check
\begin{equation}
  \norm{\Delta\bm{z}^{1}} < 10^{-5}
\end{equation}
for convergence. \todo[inline]{OrdinaryDiffEq.jl also checks
\texttt{iszero(ndz)}. It seems redundant in hindsight.}

Moreover, we need to define the divergence criteria. Clearly, we want to keep
$\theta$ small, so the first criterion is
\begin{equation}
  \theta_k > 2.
\end{equation}
Also, the algorithm is deemed divergent if the maximum number of iterations is
reached without convergence. A subtler divergence criterion is that no
convergence is predicted by extrapolating $\norm{\Delta\bm{z}^k}$ to the final
allowed iteration $k_{\max}$, i.e.
\begin{equation}
  \frac{\theta_k^{k_{\max}-k}}{1-\theta_k} \norm{\Delta\bm{z}^k} > \kappa.
\end{equation}
\todo[inline]{OrdinaryDiffEq.jl doesn't actually check this condition anymore.}

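The decision logic described in this subsection can be summarized by the
following schematic sketch. It is simplified and is not OrdinaryDiffEq.jl's
actual code; in particular, it includes the extrapolation test mentioned in
the todo above:
\begin{verbatim}
# Per-iteration decision for simplified Newton, following the criteria
# above.  ndz = ||dz^k||, ndz_prev = ||dz^{k-1}||, kappa = 1/10.
function check_convergence(ndz, ndz_prev, eta, k, kmax; kappa = 1 / 10)
    if k == 1
        # eta was seeded from the previous solve (eta_old); also accept
        # immediately if the first displacement is already tiny.
        ndz < 1e-5 && return :converged, eta
        eta * ndz <= kappa && return :converged, eta
        return :continue, eta
    end
    theta = ndz / ndz_prev                    # convergence rate estimate
    theta > 2 && return :diverged, eta        # clearly diverging
    if theta < 1
        eta = theta / (1 - theta)
        eta * ndz <= kappa && return :converged, eta
        # No convergence predicted even after k_max iterations.
        theta^(kmax - k) / (1 - theta) * ndz > kappa && return :diverged, eta
    end
    k == kmax && return :diverged, eta        # out of iterations
    return :continue, eta
end
\end{verbatim}
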
\subsection{$W$-matrix reuse}

\section{Step size control}
\subsection{Standard (Integral) control}
\subsection{Predictive (modified PI) control}
\subsection{Smooth Error Estimation}

\nocite{hairer2010solving}

\bibliography{reference.bib}
\bibliographystyle{siam}