Commit 3c7a24c — Add "Stopping Criteria"
1 parent 58d1bee

2 files changed: +114 −18 lines

firk/main.tex

Lines changed: 104 additions & 18 deletions
@@ -25,12 +25,12 @@
 
 \maketitle
 
-\section{Notation}
-\todo{Is this necessary?}
+% \section{Notation}
+% \todo[inline]{Is this necessary?}
 
 \section{Runge-Kutta Methods}
-Runge-Kutta methods can numerically solve differential-algebraic equations (DAEs)
-that are written in the form of
+Runge-Kutta methods can numerically solve differential-algebraic equations
+(DAEs) that are written in the form of
 \begin{equation}
 M \dv{u}{t} = f(u, t),\quad u(a)=u_a \in \mathbb{R}^m, \quad t\in [a, b].
 \end{equation}
@@ -50,16 +50,16 @@ \section{Runge-Kutta Methods}
 \end{align}
 \end{definition}
 
-We observe that when $a_{ij} = 0$ for $i\ge j$, \cref{eq:rk_lin} can be computed
-without solving a system of algebraic equations, and we call methods with this
-property \emph{explicit} otherwise \emph{implicit}.
+We observe that when $a_{ij} = 0$ for each $i\ge j$, \cref{eq:rk_lin} can be
+computed without solving a system of algebraic equations; we call methods with
+this property \emph{explicit}, and \emph{implicit} otherwise.
 
 We should note that solving for $z_i$ is much preferred over $g_i$, as the
 $z_i$ have a smaller magnitude. A method is stiffly accurate, i.e. the
 stability function $R(\infty) = 0$, when the matrix $\bm{A}$ has full rank and
 $a_{si} = b_i$.
-Hence, $\bm{b}^T\bm{A}^{-1}$ gives the last row of the identity matrix $I_s$. We
-can use this condition to further simplify \cref{eq:rk_sum} (by slight abuse of
-notation):
+Hence, for such methods $\bm{b}^T\bm{A}^{-1}$ gives the last row of the identity
+matrix $I_s$. We can use this condition to further simplify \cref{eq:rk_sum} (by
+slight abuse of notation):
 \begin{equation} \label{eq:rk_sim}
 u_{n+1} = u_n + \bm{b}^T\bm{k} = u_n +
 \bm{b}^T\underbrace{\bm{A}^{-1}\bm{z}}_\text{\cref{eq:rk_lin}} = u_n +
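As a quick numerical sanity check (an editorial aside, not part of the commit), the two-stage Radau IIA coefficients, a standard stiffly accurate method, confirm that $\bm{b}^T\bm{A}^{-1}$ is the last row of $I_s$, so the sum collapses to the last stage increment:

```python
import numpy as np

# Two-stage Radau IIA (order 3): stiffly accurate, a_{si} = b_i.
A = np.array([[5/12, -1/12],
              [3/4,   1/4 ]])
b = np.array([3/4, 1/4])

# b^T A^{-1} should be the last row of the 2x2 identity matrix.
print(b @ np.linalg.inv(A))  # → [0. 1.]
```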
@@ -73,18 +73,18 @@ \section{Solving Nonlinear Systems from Implicit Runge-Kutta Methods}
 G(\bm{z}) = (I_s \otimes M) \bm{z} - h(\bm{A}\otimes I_m) \tilde{\bm{f}}(\bm{z}) \qq{and}
 \tilde{f}(\bm{z})_i = f(u_n+z_i, t+c_i h)
 \end{equation}
-The purpose of introducing a computationally expensive nonlinear
-system solving step is to combat extreme stiffness. Hence, the Jacobian matrix
-arising from \cref{eq:rk_lin} is ill-conditioned. We must use Newton
+The purpose of introducing a computationally expensive nonlinear system solving
+step is to combat extreme stiffness. The Jacobian matrix arising from
+\cref{eq:rk_lin} is ill-conditioned due to stiffness. Thus, we must use Newton
 iteration to ensure stability, since fixed-point iteration only converges for
 contracting maps, which greatly limits the step size.
 
 Astute readers may notice that the Jacobian
 \begin{equation}
 \tilde{\bm{J}}_{ij} = \pdv{\tilde{\bm{f}}_i}{\bm{z}_j} = \pdv{f(u_n + z_i, t+c_ih)}{u}
 \end{equation}
-requires us to compute the Jacobian of $f$ at $s$ many different points, which
-is very expensive. We can approximate it by
+requires us to compute the Jacobian of $f$ at $s$ different points, which can be
+very expensive. We can approximate it by
 \begin{equation}
 \tilde{\bm{J}}_{ij} \approx J = \pdv{f(u_n, t)}{u}.
 \end{equation}
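A minimal sketch (editorial, with illustrative function and argument names that are not the paper's or any library's API) of one simplified Newton step for $G(\bm{z})$, using the single frozen Jacobian $J$ for every stage:

```python
import numpy as np

def simplified_newton_step(z, un, tn, h, c, A, f, J, M):
    """One simplified Newton step for G(z) = (I_s ⊗ M) z - h (A ⊗ I_m) f~(z).

    Uses one frozen Jacobian J ≈ df/du(u_n, t_n) for all s stages, so the
    Newton matrix I_s ⊗ M - h A ⊗ J is formed and factored only once.
    All names here are illustrative assumptions.
    """
    s, m = len(c), len(un)
    # Stage-wise evaluation f~(z)_i = f(u_n + z_i, t_n + c_i h), stacked.
    ftil = np.concatenate([f(un + zi, tn + ci * h)
                           for zi, ci in zip(np.split(z, s), c)])
    G = np.kron(np.eye(s), M) @ z - h * np.kron(A, np.eye(m)) @ ftil
    N = np.kron(np.eye(s), M) - h * np.kron(A, J)  # frozen-Jacobian Newton matrix
    return z - np.linalg.solve(N, G)
```

For a linear right-hand side the frozen Jacobian is exact, so a single step lands on the solution of the stage equations.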
@@ -96,9 +96,10 @@ \section{Solving Nonlinear Systems from Implicit Runge-Kutta Methods}
 \end{align}
 where $\bm{z}^k$ is the approximation to $\bm{z}$ at the $k$-th iteration.
 
-In Hairer's Radau IIA implementation, he left multiplies \cref{eq:newton_1} by
-$(hA)^{-1} \otimes I_m$ to exploit the structure of the iteration matrix
-\cite{hairer1999stiff}, so we have
+\subsection{Change of Basis}
+\todo[inline]{discuss SDIRK}Hairer's Radau IIA implementation left-multiplies
+\cref{eq:newton_1} by $(hA)^{-1} \otimes I_m$ to exploit the structure of the
+iteration matrix \cite{hairer1999stiff}, so we have
 \begin{align}
 ((hA)^{-1} \otimes I_m)(I_s \otimes M) &= (hA)^{-1} \otimes M \\
 ((hA)^{-1} \otimes I_m)(h\bm{A}\otimes J) &= I_s\otimes J \\
@@ -134,6 +135,91 @@ \section{Solving Nonlinear Systems from Implicit Runge-Kutta Methods}
 (V^{-1}\otimes I_m)\tilde{\bm{f}}((V\otimes I_m)\bm{w}^k) \\
 \bm{w}^{k+1} &= \bm{w}^{k} + \Delta \bm{w}^k.
 \end{align}
+The matrix $W=(h^{-1} \Lambda \otimes M - I_s\otimes J)$ is usually called the
+iteration matrix or the $W$-matrix.
+
142+
Note that throughout this subsection, $\norm{\cdot}$ denotes the norm that is
143+
used by the time-stepping error estimate. We are using this choice because we
144+
need to make sure that convergent results from a nonlinear solver does not
145+
introduce step rejections.
146+
147+
There are two approaches to estimate the error of a nonlinear solver, by the
148+
displacement $\Delta \bm{z}^k$ or by the residual $G(\bm{z}^k)$. The residual
149+
behaves like the error scaled by the Lipschitz constant of $G$. Stiff equations
150+
have a large Lipschitz constant. However, this constant is not known a prior.
151+
This makes the residual test unreliable. Hence, we are going to focus on the
152+
analysis of the displacement.
153+
154+
Simplified Newton iteration converges linearly, so we can model the convergence
155+
process as
156+
\begin{equation}
157+
\norm{\Delta \bm{z}^{k+1}} \le \theta \norm{\Delta \bm{z}^{k}}.
158+
\end{equation}
159+
The convergence rate at $k$-th iteration $\theta_k$ can be estimated by
160+
\begin{equation}
161+
\theta_k = \frac{\norm{\Delta\bm{z}^{k}}}{\norm{\Delta\bm{z}^{k-1}}},\quad k\ge 1.
162+
\end{equation}
163+
Notice we have the relation
164+
\begin{equation}
165+
\bm{z}^{k+1} - \bm{z} = \sum_{i=0}^\infty \Delta\bm{z}^{k+i+1}.
166+
\end{equation}
167+
If $\theta<1$, by the triangle inequality, we then have
168+
\begin{equation}
169+
\norm{\bm{z}^{k+1} - \bm{z}^{k}} \le
170+
\norm{\Delta\bm{z}^{k+1}}\sum_{i=0}^\infty \theta^i \le \theta
171+
\norm{\Delta\bm{z}^{k}}\sum_{i=0}^\infty \theta^i = \frac{\theta}{1-\theta}
172+
\norm{\Delta\bm{z}^{k}}.
173+
\end{equation}
174+
To ensure nonlinear solver error does not cause that step rejection, we need a
175+
safety factor $\kappa = 1/10$. Our first convergence criterion is
176+
\begin{equation}
177+
\eta_k \norm{\Delta\bm{z}^k} \le \kappa, \qq{if}
178+
k \ge 1 \text{ and } \theta \le 1, ~ \eta_k=\frac{\theta_k}{1-\theta_k}.
179+
\end{equation}
180+
One major drawback with this convergence criterion is that we can only check it
181+
after two iterations. To cover the case of convergence in the first iteration,
182+
we need to define $\eta_0$. It is reasonable to believe that the convergence
183+
rate remains relatively constant with the same $W$-matrix, so if $W$ is reused
184+
\todo[inline]{Add the reuse logic section} from the previous nonlinear solve,
185+
then we define
186+
\begin{equation}
187+
\eta_0 = \eta_{\text{old}},
188+
\end{equation}
189+
where $\eta_{\text{old}}$ is the finial $\eta_k$ from the previous nonlinear
190+
solve, otherwise we define
191+
\begin{equation}
192+
\eta_0 = \max(\eta_{\text{old}}, ~\text{eps}(\text{relative
193+
tolerance}))^{0.8}.
194+
\end{equation}
195+
In the first iteration, we also check
196+
\begin{equation}
197+
\Delta\bm{z}^{1} < 10^{-5}
198+
\end{equation}
199+
for convergence. \todo[inline]{OrdinaryDiffEq.jl also checks
200+
\texttt{iszero(ndz)}. It seems redundant in hindsight.}
201+
202+
Moreover, we need to define the divergence criteria. It is obvious that we want
203+
to limit $\theta$ to be small. Therefore, the first criterion is
204+
\begin{equation}
205+
\theta_k > 2.
206+
\end{equation}
207+
Also, the algorithm diverges if the max number of iterations is reached without
208+
convergence. A subtler criterion for divergence is: no convergence is predicted
209+
by extrapolating to $\norm{\Delta\bm{z}^k_{\max}}$, e.g.
210+
\begin{equation}
211+
\frac{\theta_k^{k_{\max}-k}}{1-\theta_k} \norm{\Delta\bm{z}^k} > \kappa.
212+
\end{equation}
213+
\todo[inline]{OrdinaryDiffEq.jl doesn't actually check this condition anymore.}
214+
215+
+\subsection{$W$-matrix reuse}
+
+\section{Step size control}
+\subsection{Standard (Integral) control}
+\subsection{Predictive (modified PI) control}
+\subsection{Smooth Error Estimation}
+
+\nocite{hairer2010solving}
 
 \bibliography{reference.bib}
 \bibliographystyle{siam}

firk/reference.bib

Lines changed: 10 additions & 0 deletions
@@ -8,3 +8,13 @@ @article{hairer1999stiff
   year={1999},
   publisher={Elsevier}
 }
+
+@book{hairer2010solving,
+  title={Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems},
+  author={Hairer, E. and Wanner, G.},
+  isbn={9783642052217},
+  series={Springer Series in Computational Mathematics},
+  url={https://books.google.com/books?id=g-jvCAAAQBAJ},
+  year={2010},
+  publisher={Springer Berlin Heidelberg}
+}
