Conformal mappings and the Phragmen-Lindelöf Theorem

I would like to go back to the Phragmen-Lindelöf theorem that I presented in a previous post. Let us recall the result. In the following, we write, for all z\in \mathbb{C}, z=x+iy, with x and y real numbers. The open set \Omega is defined as

\displaystyle \Omega=\{z\in \mathbb{C}\,;\,-\frac{\pi}{2}<y<\frac{\pi}{2}\},

\overline{\Omega} denotes the closure of \Omega in \mathbb{C}, and \partial \Omega=\overline{\Omega}\setminus\Omega its boundary. I write \mathcal{H}(\Omega) for the set of all holomorphic functions on \Omega and \mathcal{C}(\overline{\Omega}) for the set of all continuous functions on \overline{\Omega}.

Theorem (Phragmen-Lindelöf)

Let f be a function in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}) such that

\displaystyle \left|f\left(x\pm i\frac{\pi}{2}\right)\right|\le 1

for all x\in \mathbb{R}, and let us assume that there exist real constants A and \alpha <1 such that

\displaystyle |f(z)|\le \exp(A\exp(\alpha|x|))

for all z \in \Omega. Then |f(z)|\le 1 for all z \in \Omega, and, if there exists z_0\in \Omega such that |f(z_0)|=1, f is a constant.

The Phragmen-Lindelöf method can be adapted to prove results of this type on various domains by constructing suitable families of functions (g_{\varepsilon})_{\varepsilon>0} (see this same previous post for context). There is however another way to obtain a similar result for another domain. If we can find a holomorphic change of variable, that is to say a conformal mapping, that maps the domain \Omega in the theorem to the domain that we are considering, we obtain the Phragmen-Lindelöf result on this last domain. Of course, the growth condition will be modified by the mapping. Let us give several examples. In the following, the original complex variable will be denoted by z=x+iy as before and the new variable by w=u+iv.

General horizontal strip

Let a and b be real numbers such that a<b and let \Omega(a,b) be the strip

\{z\in \mathbb{C}\,;\,a<y<b\}.

Proposition 1

Let f be a function in \mathcal{H}(\Omega(a,b))\cap\mathcal{C}(\overline{\Omega(a,b)}) such that |f(z)|\le 1 for z\in \partial \Omega(a,b), and let us assume that there exist real constants A and \alpha<\frac{\pi}{b-a} such that

\displaystyle |f(z)|\le \exp\left(A\exp\left(\alpha|x|\right)\right)

for all z \in \Omega(a,b).
Then |f(z)|\le 1 for all z\in \Omega(a,b), and, if there exists z_0\in \Omega(a,b) such that |f(z_0)|=1, f is a constant.

Proof. Let us define \varphi:\mathbb{C}\to \mathbb{C} by

\displaystyle \varphi(w)=\frac{b-a}{\pi}w+i\frac{a+b}{2}.

It is obviously a holomorphic change of variable (it is even affine). We have \varphi(\Omega)=\Omega(a,b). Let us write g=f\circ \varphi. The function g is in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}), and |g(w)|\le 1 for all w \in \partial \Omega. Let us now consider w\in \Omega and z=\varphi(w). We have, for the real part of z,

\displaystyle x=\frac{b-a}{\pi}u.

Since we have

\displaystyle  |f(z)|\le \exp\left(A\exp\left(\alpha|x|\right)\right),

we obtain

\displaystyle |g(w)|\le \exp\left(A\exp\left(\frac{\alpha(b-a)}{\pi}|u|\right)\right).

Since \frac{\alpha(b-a)}{\pi}<1, the Phragmen-Lindelöf Theorem yields the desired result. QED.

General vertical strip

Let a and b be real numbers such that a<b, and let \Pi(a,b) be the strip

\{z\in \mathbb{C}\,;\,a<x<b\}.

Proposition 2
Let f be a function in \mathcal{H}(\Pi(a,b))\cap\mathcal{C}(\overline{\Pi(a,b)}) such that |f(z)|\le 1 for z\in \partial \Pi(a,b), and let us assume that there exist real constants A and \alpha<\frac{\pi}{b-a} such that

\displaystyle |f(z)|\le \exp\left(A\exp\left(\alpha|y|\right)\right)

for all z \in \Pi(a,b).
Then |f(z)|\le 1 for all z\in \Pi(a,b), and, if there exists z_0\in \Pi(a,b) such that |f(z_0)|=1, f is a constant.

Proof. We define \varphi:\mathbb{C}\to \mathbb{C} by \varphi(w)=-iw. The function \varphi is a holomorphic change of variable that maps \Omega(a,b) to \Pi(a,b). The function g=f\circ \varphi satisfies the hypotheses of Proposition 1, which yields the desired result. QED.

Sector

Let \theta_1 be a number in ]-\pi,\pi] and \theta_2 another number such that 0<\theta_2-\theta_1<2\pi.
We define the open sector \Sigma(\theta_1,\theta_2) by

\displaystyle \Sigma(\theta_1,\theta_2)=\left\{re^{i\theta}\,;\,r>0\mbox{ and } \theta_1<\theta<\theta_2\right\},

and we set

\displaystyle \Sigma'(\theta_1,\theta_2)=\left\{re^{i\theta}\,;\,r>0\mbox{ and } \theta_1\le\theta\le\theta_2\right\},

the closure of \Sigma(\theta_1,\theta_2) with the origin removed.

Proposition 3

Let f be a function in \mathcal{H}(\Sigma(\theta_1,\theta_2))\cap\mathcal{C}(\Sigma'(\theta_1,\theta_2)) such that |f\left(re^{i\theta_1}\right)|\le 1 and |f\left(re^{i\theta_2}\right)|\le 1 for all r>0. Let us assume that there exist real constants A and \alpha<\frac{\pi}{\theta_2-\theta_1} such that

\displaystyle |f(z)|\le \exp\left(A|z|^{\alpha}\right)

if z \in \Sigma(\theta_1,\theta_2) with |z|\ge 1 and

\displaystyle |f(z)|\le \exp\left(\frac{A}{|z|^{\alpha}}\right)

if z \in \Sigma(\theta_1,\theta_2) with |z|\le 1.
Then |f(z)|\le 1 for all z\in \Sigma(\theta_1,\theta_2), and, if there exists z_0\in \Sigma(\theta_1,\theta_2) such that |f(z_0)|=1, f is a constant.

Let us note that this proposition implies the following weaker statement, which is often referred to as the Phragmen-Lindelöf principle.

Corollary

Let f be a function in \mathcal{H}(\Sigma(\theta_1,\theta_2))\cap\mathcal{C}(\overline{\Sigma(\theta_1,\theta_2)}) such that |f\left(re^{i\theta_1}\right)|\le 1 and |f\left(re^{i\theta_2}\right)|\le 1 for all r>0. Let us assume that there exist real constants A and \alpha<\frac{\pi}{\theta_2-\theta_1} such that

\displaystyle |f(z)|\le \exp\left(A|z|^{\alpha}\right)

for all z \in \Sigma(\theta_1,\theta_2).
Then |f(z)|\le 1 for all z\in \Sigma(\theta_1,\theta_2), and, if there exists z_0\in \Sigma(\theta_1,\theta_2) such that |f(z_0)|=1, f is a constant.

Proof. Since 0<\theta_2-\theta_1<2\pi, the exponential function is a bijection from \Omega(\theta_1,\theta_2) to \Sigma(\theta_1,\theta_2), and furthermore \exp\left(\overline{\Omega(\theta_1,\theta_2)}\right)=\Sigma'(\theta_1,\theta_2). We set g=f\circ \exp, so that g belongs to \mathcal{H}(\Omega(\theta_1,\theta_2))\cap\mathcal{C}(\overline{\Omega(\theta_1,\theta_2)}) and |g(w)|\le 1 for all w\in \partial\Omega(\theta_1,\theta_2). Let us consider w\in \Omega(\theta_1,\theta_2) and z=\exp(w). We have
|z|=e^u.

If u\ge 0, |z|\ge 1, and since

\displaystyle |f(z)|\le \exp\left(A|z|^{\alpha}\right),

we obtain

\displaystyle |g(w)|\le \exp(A\exp(\alpha u)).

If u\le 0, |z|\le 1, and since

\displaystyle |f(z)|\le \exp\left(A|z|^{-\alpha}\right),

we obtain

\displaystyle |g(w)|\le \exp(A\exp(-\alpha u)).

In both cases, we have

\displaystyle |g(w)|\le \exp(A\exp(\alpha|u|)),

and we can apply Proposition 1. QED.
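
Let me make the most classical special case explicit (this remark is mine, it is not in the statement above): for the open right half-plane \Sigma\left(-\frac{\pi}{2},\frac{\pi}{2}\right), we have \theta_2-\theta_1=\pi, and the Corollary says that if f is holomorphic on the half-plane, continuous on its closure, satisfies |f(z)|\le 1 on the imaginary axis and |f(z)|\le \exp\left(A|z|^{\alpha}\right) for some \alpha<1, then |f(z)|\le 1 on the whole half-plane. The function f(z)=\exp(z) shows that \alpha=1 is not allowed: it has modulus 1 on the imaginary axis and satisfies |f(z)|\le \exp(|z|), but it is unbounded on the half-plane.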

One last remark: in my previous post, I presented several results which give a more precise form of the maximum principle. This allows us to see which part of the boundary has the most weight in controlling the modulus at a given interior point. I stated the results for bounded holomorphic functions, but they are not limited to them. Indeed, we can first apply one of the results in this post to show that a holomorphic function that does not grow too fast at infinity is bounded, and then apply the corresponding result in the previous post.

Three-curves theorems

This post deals with a family of theorems that refine and extend the usual maximum modulus principle for analytic functions. Contrary to my previous post on the Phragmen-Lindelöf principle, we are less interested in the fact that the maximum modulus principle is extended to unbounded domains (although this is indeed the case in several of the following examples) than in giving a quantitative form, which will allow us to say that the parts of the boundary that are closer create a stronger constraint on the modulus of the function. In the following, z=x+iy denotes a complex number, with x and y two real numbers, its real and imaginary parts. I use as a reference Real and Complex Analysis, second edition, by Walter Rudin.

The three-lines theorem

Let \Pi be the open vertical strip in the complex plane defined by

\displaystyle \Pi=\left\{z \in \mathbb{C}\,;\, 0<x<1 \right\}.

Theorem 1

Let f be a function that is holomorphic in \Pi, continuous on \overline{\Pi} and bounded on \overline{\Pi}. For x\in [0,1], we set

\displaystyle M(x)=\sup_{y\in \mathbb{R}} |f(x+iy)|.

Then

\displaystyle M(x)\le M(0)^{1-x}M(1)^x.

Let us note that Theorem 1 states that x \mapsto \log(M(x)) is a convex function on [0,1]. To prove this, we will use the following Lemma.

Lemma 1

Let f be a function that is holomorphic in \Pi, continuous on \overline{\Pi} and bounded on \overline{\Pi}. If |f(z)|\le 1 for all z \in \partial\Pi, then |f(z)|\le 1 for all z \in \Pi.

Proof. The above Lemma is a direct consequence of a Phragmen-Lindelöf type result on the domain \Pi. However, the hypothesis that f is bounded allows us to give a simpler proof, taken from Rudin (Theorem 12.8).

Let us consider z_0\in \Pi. For any \varepsilon>0, we define the function

\displaystyle g_{\varepsilon}(z)=\frac{1}{1+\varepsilon z}.

For all z\in \overline{\Pi}, \mbox{Re}(1+\varepsilon z)=1+\varepsilon x\ge 1, therefore |1+\varepsilon z|\ge 1 and |g_{\varepsilon}(z)|\le 1. On the other hand, for all z \in \overline{\Pi}, \mbox{Im}(1+\varepsilon z)=\varepsilon y, so that |1+\varepsilon z|\ge \varepsilon|y| and thus
|g_{\varepsilon}(z)|\le \frac{1}{\varepsilon |y|}.
Let B=\sup_{z\in\overline{\Pi}}|f(z)|, which is finite since f is bounded. We choose R such that |\mbox{Im}(z_0)|<R and \varepsilon R\ge B. Let us consider the open rectangle \Pi_{R} defined by

\displaystyle \Pi_{R}=\left\{z\in \mathbb{C}\,;\,0<x<1 \mbox{ and } -R<y<R \right\}.

The function z \mapsto f(z)g_{\varepsilon}(z) is holomorphic in \Pi_{R}, continuous on \overline{\Pi_{R}}, and, for all z \in \partial \Pi_{R}, |f(z)g_{\varepsilon}(z)|\le 1 (on the vertical sides because |f(z)|\le 1 and |g_{\varepsilon}(z)|\le 1, on the horizontal sides because |f(z)g_{\varepsilon}(z)|\le B/(\varepsilon R)\le 1). Since z_0 is in \Pi_{R}, we have, according to the maximum modulus principle, |f(z_0)g_{\varepsilon}(z_0)|\le 1. The inequality holds for arbitrary \varepsilon>0, and g_{\varepsilon}(z_0) tends to 1 as \varepsilon tends to 0. Therefore, we obtain |f(z_0)|\le 1. QED.

Let us now prove the general case.

First, let us treat the case when M(0)=0, that is to say when f is identically 0 on the imaginary axis. We will show that in this case, f is identically 0 on \Pi, and the inequality is then trivial (by the way, this is an answer to Exercise 7, Chapter 12 in Rudin).

Let us consider the function \varphi:\mathbb{C} \to \mathbb{C} defined by \varphi(w)=-iw. We denote by \Omega^{+} the horizontal strip defined by

\displaystyle \Omega^{+}=\left\{z \in \mathbb{C}\,;\, 0<y<1\right\}.

We have \varphi(\Omega^{+})=\Pi. We now set g=f\circ \varphi. The function g is continuous on \overline{\Omega^{+}}, holomorphic in \Omega^{+}, and is identically 0 on the real axis. According to the Schwarz reflection principle (see for instance Rudin, Theorem 11.17), there exists a function G, holomorphic on the strip

\Omega=\left\{z \in \mathbb{C}\,;\, -1<y<1\right\}

such that G(z)=g(z) for all z \in \Omega^{+}. We know (by taking the limit of G(z) as \mbox{Im}(z) tends to 0) that G is identically 0 on the real axis. By the isolated zeros theorem, G is identically 0 on \Omega, and in particular g is identically 0 on \Omega^{+}, and thus f is identically 0 on \Pi. If M(1)=0, we obtain that f is identically 0 on \Pi by a symmetry argument: considering z\mapsto f(1-z) brings us back to the previous case.

We now assume that M(0)>0 and M(1)>0. We set g(z)=M(0)^{1-z}M(1)^{z}. The function g is entire. We have

\displaystyle \left|M(0)^{1-z}M(1)^{z}\right|=M(0)^{1-x}M(1)^{x}

for all z \in \mathbb{C}; in particular, for z\in \overline{\Pi} (where 0\le x\le 1), |g(z)|\ge \min(M(0),M(1))>0. This implies that g has no zero and that \frac{1}{g} is bounded on \overline{\Pi}. Furthermore, for all y \in \mathbb{R}, |g(iy)|=M(0) and |g(1+iy)|=M(1). This implies that |f(z)/g(z)|\le 1 for all z \in \partial \Pi. According to Lemma 1, we have

\displaystyle \frac{|f(z)|}{|g(z)|}\le 1

for all z \in \Pi. Thus if x \in ]0,1[, we have, for all y \in \mathbb{R},

\displaystyle |f(x+iy)|\le M(0)^{1-x}M(1)^{x}.

This gives us the desired result.
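
Let us also remark (a sanity check that I add here) that the bound in Theorem 1 cannot be improved: if M_0 and M_1 are two positive numbers, the function f(z)=M_0^{1-z}M_1^{z}, of the same kind as the function g used in the proof, satisfies

\displaystyle M(x)=\sup_{y\in \mathbb{R}}|f(x+iy)|=M_0^{1-x}M_1^{x},

so that the inequality of Theorem 1 is an equality for this function.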

From Theorem 1, we obtain a more general three-lines theorem simply by using an affine change of variable (actually, we already used a linear one to apply the Schwarz reflection principle). Let a and b be real numbers with a<b and let \Pi(a,b) be the open vertical strip defined by

\displaystyle \Pi(a,b)=\left\{z\in \mathbb{C}\,;\, a<x<b\right\}.

Theorem 2

Let f be a function that is holomorphic in \Pi(a,b), continuous on \overline{\Pi(a,b)} and bounded on \overline{\Pi(a,b)}. For x\in [a,b], we set

\displaystyle M(x)=\sup_{y\in\mathbb{R}}|f(x+iy)|.

We have

\displaystyle M(x)\le M(a)^{\frac{b-x}{b-a }}M(b)^{\frac{x-a}{b-a}}.

Proof. We define \varphi(w)=(b-a)w+a. We have \varphi(\Pi)=\Pi(a,b). We set g=f\circ \varphi and apply Theorem 1 to g. QED.
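
To spell out the bookkeeping behind this short proof (these details are implicit above): with z=\varphi(w)=(b-a)w+a we have x=(b-a)u+a, that is to say u=\frac{x-a}{b-a}, and the boundary lines u=0 and u=1 of \Pi correspond to the lines x=a and x=b, so that \sup_{v\in\mathbb{R}}|g(iv)|=M(a) and \sup_{v\in\mathbb{R}}|g(1+iv)|=M(b). Theorem 1 applied to g therefore gives

\displaystyle |f(x+iy)|\le M(a)^{1-u}M(b)^{u}=M(a)^{\frac{b-x}{b-a }}M(b)^{\frac{x-a}{b-a}}.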

In the same way, we can formulate and prove a theorem for horizontal strips. Let a and b be real numbers with a<b and let \Omega(a,b) be the open horizontal strip defined by

\displaystyle \Omega(a,b)=\left\{z\in \mathbb{C}\,;\, a<y<b\right\}.

Theorem 3

Let f be a function that is holomorphic in \Omega(a,b), continuous on \overline{\Omega(a,b)} and bounded on \overline{\Omega(a,b)}. For y\in [a,b], we set

\displaystyle M(y)=\sup_{x\in\mathbb{R}}|f(x+iy)|.

We have

\displaystyle M(y)\le M(a)^{\frac{b-y}{b-a }}M(b)^{\frac{y-a}{b-a}}.

Proof. We use the change of variable z=iw and apply Theorem 2. QED.

The three-circles theorem

The theorem is due to Jacques Hadamard. There exist several proofs; here we will deduce it from Theorem 2 (see Rudin, Chapter 12, Exercise 8).

We consider 0<r_1<r_2 and we denote by A(r_1,r_2) the open annulus defined by

\displaystyle A(r_1,r_2)=\{z\in \mathbb{C}\,;\,r_1<|z|<r_2\}.

Theorem 4

Let f be a function that is holomorphic in A(r_1,r_2) and continuous on \overline{A(r_1,r_2)}. For r \in [r_1,r_2], we set

\displaystyle M(r)=\max_{\theta\in]-\pi,\pi]}|f(re^{i\theta})|.

Then, we have

\displaystyle M(r)\le M(r_1)^{\frac{\log(r_2)-\log(r)}{\log(r_2)-\log(r_1)}}M(r_2)^{\frac{\log(r)-\log(r_1)}{\log(r_2)-\log(r_1)}}.

Let us note that in this case, f is clearly bounded on \overline{A(r_1,r_2)}, since it is continuous and \overline{A(r_1,r_2)} is compact. Furthermore, the maximum modulus principle for a holomorphic function on a bounded domain tells us that the maximum of |f| is reached on one of the boundary circles \{z\,;\,|z|=r_1\} and \{z\,;\,|z|=r_2\}. The improvement resides in the fact that Theorem 4 is quantitative: it tells us for instance that if we are closer to the inner circle, the maximum of the modulus of f on this circle has more weight in controlling the modulus of f.

Let us also note that Theorem 4 can be expressed as the fact that \log(M(r)) is a convex function of \log(r). Indeed, it was first stated in this form by Hadamard.

Proof. The exponential function maps the open vertical strip \Pi(\log(r_1),\log(r_2)) to A(r_1,r_2) and its closure \overline{\Pi(\log(r_1),\log(r_2))} to \overline{A(r_1,r_2)} (the mapping is not one-to-one, but it doesn’t matter here). We set g=f\circ \exp and apply Theorem 2 to g. QED.
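
As a quick sanity check (mine, not Rudin's), take f(z)=z, so that M(r)=r. Taking logarithms,

\displaystyle \frac{\log(r_2)-\log(r)}{\log(r_2)-\log(r_1)}\,\log(r_1)+\frac{\log(r)-\log(r_1)}{\log(r_2)-\log(r_1)}\,\log(r_2)=\log(r),

so the inequality of Theorem 4 is an equality in this case; the same computation works for f(z)=z^n with n\in \mathbb{Z}.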

The three-rays theorem

I haven't found this last result stated anywhere in quite this way, but it is a straightforward transposition of Theorem 3. Let \theta_1 be a number in ]-\pi,\pi] and \theta_2 another number such that 0<\theta_2-\theta_1<2\pi.
We define the open sector \Sigma(\theta_1,\theta_2) by

\displaystyle \Sigma(\theta_1,\theta_2)=\left\{re^{i\theta}\,;\,r>0\mbox{ and } \theta_1<\theta<\theta_2\right\}.

Theorem 5

Let f be a function that is holomorphic in \Sigma(\theta_1,\theta_2), continuous and bounded on \overline{\Sigma(\theta_1,\theta_2)}. For \theta\in[\theta_1,\theta_2], we set

\displaystyle M(\theta)=\sup_{r>0}|f(re^{i\theta})|.

Then, we have

\displaystyle M(\theta)\le M(\theta_1)^{\frac{\theta_2-\theta}{\theta_2-\theta_1}}M(\theta_2)^{\frac{\theta-\theta_1}{\theta_2-\theta_1}}.

Proof. The exponential function sends the horizontal strip \Omega(\theta_1,\theta_2) to the open sector \Sigma(\theta_1,\theta_2). The closure \overline{\Omega(\theta_1,\theta_2)} of this strip is sent to the closure of the sector minus the origin. We set g=f\circ \exp and apply Theorem 3 to g. QED.

The Phragmen-Lindelöf method

A friend told me some time ago about a family of theorems that extend the maximum principle for holomorphic functions to unbounded domains. I have read about it, and I want to record some results with their proof. My main reference is the book Real and Complex Analysis, by Walter Rudin (I use the second edition).

The maximum principle for holomorphic functions

In the following, \Omega denotes an open set in the complex plane \mathbb{C}, \overline{\Omega} denotes the closure of \Omega in \mathbb{C}, and \partial \Omega=\overline{\Omega}\setminus\Omega its boundary. I denote by \mathcal{H}(\Omega) the set of all holomorphic functions on \Omega and by \mathcal{C}(\overline{\Omega}) the set of all continuous functions on \overline{\Omega}. I will use the notation z=x+iy, where x and y are real numbers, the real and imaginary parts of z. Let us recall the maximum principle.

Theorem 1

Let us assume that \Omega is bounded and that f belongs to \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}). Then,

\displaystyle  \sup_{z \in \Omega}|f(z)|\le \max_{z \in \partial\Omega}|f(z)|.

Furthermore, if there exists z_0\in \Omega such that |f(z_0)|=\max_{z \in \partial\Omega}|f(z)|, the function f is a constant.

For a proof, see for instance Chapters 10 and 12 of Rudin’s book. As Rudin points out, this principle does not hold if we do not assume that \Omega is bounded. He gives the following counter example:

\displaystyle \Omega=\{z\in \mathbb{C}\,;\,-\frac{\pi}{2}<y<\frac{\pi}{2}\}

and

\displaystyle f(z)=\exp(\exp(z)).

The function f is entire, so we obviously have f\in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}). We also have

\displaystyle f\left(x\pm i\frac{\pi}{2}\right)=\exp\left(\pm i \exp(x)\right),

and thus

\displaystyle \max_{z \in \partial\Omega}|f(z)|=1.

On the other hand,

\displaystyle f(x)=\exp(\exp(x)),

so that

\displaystyle \lim_{x\to +\infty}f(x)=+\infty,

and therefore

\displaystyle \sup_{z \in \Omega}|f(z)|=+\infty.

The maximum principle therefore does not hold in this example.

An example of the method

In this section, as before,

\displaystyle\Omega=\{z\in \mathbb{C}\,;\,-\frac{\pi}{2}<y<\frac{\pi}{2}\}.

In the counter-example of the preceding section, the modulus of the function f grows very rapidly when x tends to +\infty. We will now show that if we prevent |f(z)| from growing too rapidly when z tends to \infty, the maximum principle holds.

To simplify our statement, let us consider a function f\in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}) such that \left|f\left(x\pm i\frac{\pi}{2}\right)\right|\le 1 for any x\in \mathbb{R}. Informally, the Phragmen-Lindelöf result states that if we know a priori that |f(z)| does not grow too rapidly when z tends to \infty, then in fact |f(z)|\le 1 for all z\in \Omega. Furthermore, if there exists z_0 in \Omega such that |f(z_0)|=1, then f is a constant.

Since the precise formulation will be a bit technical, I will first try to give an outline of the method. The basic idea is to apply the standard maximum principle on a bounded domain, and to use a family of auxiliary functions (g_{\varepsilon})_{\varepsilon>0}. The trick is to build (g_{\varepsilon})_{\varepsilon>0} such that

  1. g_{\varepsilon}\in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega});
  2. \left|g_{\varepsilon}\left(x\pm i\frac{\pi}{2}\right)\right|\le1;
  3. g_{\varepsilon}(z) tends to 1 when \varepsilon tends to 0 for any z \in \Omega;
  4. g_{\varepsilon}(z)f(z) tends to 0 when z tends to \infty, which is allowed by the a priori condition on the growth of f(z).

Then, we obtain the desired result by applying the maximum principle to the function z\mapsto f(z)g_{\varepsilon}(z) on the bounded rectangle

\displaystyle \Omega_{R}=\left\{z \in \mathbb{C}\,;\, -R<x<R \mbox{ and } -\frac{\pi}{2}<y<\frac{\pi}{2}\right\},

and by letting \varepsilon tend to 0 and R tend to +\infty.

Let us now give a precise statement and a proof. It will in particular show that the counter-example quoted above is in some sense optimal.

Theorem 2

Let f be a function in \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}) such that
\displaystyle \left|f\left(x\pm i\frac{\pi}{2}\right)\right|\le 1
for all x\in \mathbb{R}, and let us assume that there exist real constants A and \alpha <1 such that
\displaystyle |f(z)|\le \exp(A\exp(\alpha|x|))
for all z \in \Omega. Then |f(z)|\le 1 for all z \in \Omega, and, if there exists z_0\in \Omega such that |f(z_0)|=1, f is a constant.

Proof. Let us first pick z_0\in \Omega. We want to prove that |f(z_0)|\le 1. In accordance with the method outlined above, let us choose some \beta such that \max(\alpha,0)<\beta<1. For all \varepsilon>0, let us set

\displaystyle g_{\varepsilon}(z)=\exp(-\varepsilon\cosh(\beta z)).

For the moment, the parameter \varepsilon is fixed. The function g_{\varepsilon} is entire and therefore belongs to \mathcal{H}(\Omega)\cap\mathcal{C}(\overline{\Omega}). Furthermore, we have

\displaystyle|g_{\varepsilon}(z)|=\exp(-\varepsilon\mbox{Re}(\cosh(\beta z)))=\exp(-\varepsilon\cosh(\beta x)\cos(\beta y)).

This implies

\displaystyle|g_{\varepsilon}(z)|\le \exp(-\delta \exp(\beta|x|))

with

\displaystyle\delta=\frac{\varepsilon}{2}\cos\left(\frac{\beta\pi}{2}\right)>0.

This implies that

\displaystyle\left|f\left(x\pm i\frac{\pi}{2}\right)g_{\varepsilon}\left(x\pm i\frac{\pi}{2}\right)\right|\le 1

for all x \in \mathbb{R}, and that

\displaystyle|f(z)g_{\varepsilon}(z)|\le \exp\left(A\exp(\alpha|x|)-\delta\exp(\beta|x|)\right)

for all z \in \Omega.
Let us now chose R>0 large enough so that R> |z_0| and

\displaystyle \exp\left(A\exp(\alpha R)-\delta\exp(\beta R)\right)\le 1.

Such an R exists since \beta>\max(\alpha,0). Of course, it will depend on \varepsilon. The bounded rectangle \Omega_{R} is defined as above. The function z \mapsto f(z)g_{\varepsilon}(z) is in \mathcal{H}(\Omega_{R})\cap\mathcal{C}(\overline{\Omega_{R}}) and |f(z)g_{\varepsilon}(z)|\le 1 for all z in \partial \Omega_{R}, the boundary of \Omega_{R}. According to the maximum principle (Theorem 1), since z_0\in \Omega_{R}, |f(z_0)g_{\varepsilon}(z_0)|\le 1. This last inequality has been proved for an arbitrary \varepsilon, and g_{\varepsilon}(z_0)\to 1 when \varepsilon \to 0, therefore |f(z_0)|\le 1. This allows us to conclude that |f(z)|\le 1 for all z \in \Omega.

Let us now assume that there exists z_0\in \Omega such that |f(z_0)|=1. Let us pick some R>0 such that R>|\mbox{Re}(z_0)|. Using our previous notation we have z_0\in \Omega_{R}, and, since |f(z)|\le 1 for all z\in \overline{\Omega}, we have in particular |f(z)|\le 1 for all z\in \partial \Omega_R. According to the maximum principle, this implies that f is a constant on \Omega_{R}. Since \Omega_{R} is an open subset of \Omega, the principle of analytic continuation tells us that f is a constant on \Omega. This concludes the proof.

Let us note that if \alpha\ge 1, the theorem is false, as shown by the counter-example of the first section. The method is obviously very flexible. With little change in the proof, we can obtain results dealing with a domain that is bounded on one side, or with a function that satisfies the growth inequality only for some values of the real part. To be more explicit, I will state two results, but we can imagine many more. In the following, we consider \Omega', the open half-strip defined by

\displaystyle \Omega'=\{z\in \mathbb{C}\,;\,0<x\mbox{ and } -\frac{\pi}{2}<y<\frac{\pi}{2}\}.

Proposition 3

Let f be a function in \mathcal{H}(\Omega')\cap\mathcal{C}(\overline{\Omega'}) such that
|f(z)|\le 1 for all z \in \partial \Omega',
and let us assume that there exist real constants A and \alpha<1 such that

\displaystyle |f(z)|\le \exp(A\exp(\alpha x))

for all z \in \Omega'. Then |f(z)|\le 1 for all z \in \Omega', and, if there exists z_0\in \Omega' such that |f(z_0)|=1, f is a constant.

Proposition 4

Let f be a function in \mathcal{H}(\Omega')\cap\mathcal{C}(\overline{\Omega'}) such that |f(z)|\le 1 for all z \in \partial \Omega', and let us assume that there exist real constants A and \alpha<1, and a sequence (R_n) of positive numbers tending to +\infty such that, for all n \in \mathbb{N},

\displaystyle |f(z)|\le \exp(A\exp(\alpha R_n))

for all z \in \Omega' such that x=R_n. Then |f(z)|\le 1 for all z \in \Omega', and, if there exists z_0\in \Omega' such that |f(z_0)|=1, f is a constant.

Some Point-set Topology: Part 2, Separation Axioms

The general structure of a topological space is quite nice, but it is often not sufficient in practice. We often have to consider more specific types of topological spaces, which satisfy additional separation axioms. In this section, we will present several such axioms, ordered from least to most stringent. They basically tell us how effective the open sets are at separating points or sets.

Definition 1.
Let (X,\tau) be a topological space.

  • (T_0) It is called T_0 (or Kolmogorov) if, for any two points x and y in X with x\neq y, there exists an open set U such that x\in U and y \notin U, or there exists an open set V such that x\notin V and y \in V.
  • (T_1) It is called T_1 (or Fréchet) if, for any two points x and y in X with x\neq y, there exists an open set U such that x\in U and y \notin U, and there exists an open set V such that x\notin V and y \in V.
  • (T_2) It is called T_2 (or Hausdorff) if, for any two points x and y in X with x\neq y, there exist open sets U and V such that x \in U, y \in V, and U\cap V=\emptyset.
  • (T_3) It is called T_3 if it is both T_1 and regular, that is to say, for any closed set C and any point x \notin C, there exist open sets U and V such that x \in U, C\subset V and U\cap V=\emptyset.
  • (T_{3\frac{1}{2}}) It is called T_{3\frac{1}{2}} (or Tychonoff) if it is both T_1 and completely regular, that is to say, for any closed set C and any point x \notin C, there exists a continuous function f:X\to [0,1] such that f(x)=0 and f\equiv 1 on C.
  • (T_4) It is called T_4 if it is both T_1 and normal, that is to say, for any closed sets C and D there exist open sets U and V such that C\subset U, D \subset V, and U\cap V=\emptyset.
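
For instance (an example I add for illustration): on an infinite set X, the cofinite topology, in which the open sets are \emptyset and the complements of the finite subsets of X, is T_1, since every singleton is closed, but it is not T_2, since any two nonempty open sets intersect (their complements are finite).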

The most important property is by far T_2: in T_2 spaces (which I will call Hausdorff spaces from now on), a convergent sequence has a unique limit. A topological space X is T_1 if, and only if, for all x\in X, \{x\} (the singleton containing x) is closed. A moment of reflection shows that a T_{3\frac{1}{2}} topological space is T_3 (take U=f^{-1}([0,1/2[) and V=f^{-1}(]1/2,1])). Clearly, a T_3 space is T_2, a T_2 space is T_1, and a T_1 space is T_0. It is also clear that a T_4 space is T_3, but it is not obvious that a T_4 space should be T_{3\frac{1}{2}}. This is indeed the case, and can be proved with the help of the following result.

Lemma 1. (Urysohn’s Lemma) If X is normal, if C and U are subsets of X with C closed, U open, and C\subset U, then there exists a continuous function f:X\to [0,1] such that f\equiv 0 on C and f\equiv 1 on X\setminus U.

This is very far from obvious. I have just learned about it, and I will try to convey my understanding of the proof (such as it is). Urysohn’s Lemma essentially tells us that we can separate closed sets using a continuous function. Let us try to do it on an example. We take as a topological space the segment [0,1], with its topology as a subspace of the real line \mathbb{R} equipped with the usual topology (the one I described at the beginning of the section on topological spaces). We take our closed set to be \{0\} and our open set to be [0,1[ (recall that we use the subspace topology). The function defined by f(x)=x then satisfies the three conditions we require: it is continuous, f(0)=0 and f(1)=1. To be more general, we have to define this function in such a way that an analogous definition could be found in a general normal space. It is reasonable (though I admit not obvious) to think that a definition of f(x) using a countable family of conditions on x would be suited to this task.

Let us consider the set
\displaystyle \mathbb{D}=\left\{\frac{j}{2^n}; n\in \mathbb{N}^*\mbox{ and } j\in\{1,\dots, 2^n-1\}\right\}.
We call it the set of dyadic numbers. Those are the numbers in ]0,1[ that can be written with a finite number of binary digits. For any r \in \mathbb{D}, we set U_r=[0,r[. Now (U_r)_{r\in \mathbb{D}} is a family of open sets of [0,1] such that:

  1. for all r\in \mathbb{D}, \{0\} \subset U_r;
  2. for all r\in \mathbb{D}, \overline{U}_r\subset [0,1[;
  3. for all r and s in \mathbb{D} such that r<s, \overline{U}_r\subset U_s;
  4. [0,1[=\bigcup_{r\in \mathbb{D}}U_r.

We can now define the function f using the family (U_r)_{r\in \mathbb{D}}:
\displaystyle f(x)=\left\{\begin{array}{ll}  1 &\mbox{ if } x=1;\\  \inf\{r \in \mathbb{D}; x \in U_r\} &\mbox{ if } x\in [0,1[.\\  \end{array}\right.
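In this example, the definition gives back the function from the discussion above (a verification I add here): for x\in [0,1[,
\displaystyle f(x)=\inf\{r \in \mathbb{D}\,;\, x \in [0,r[\}=\inf\{r \in \mathbb{D}\,;\, r>x\}=x,
since \mathbb{D} is dense in ]0,1[, and f(1)=1 by definition.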
This definition can be generalized.

Lemma 2.  Let X be a topological space, C a closed set and U an open set in X. Let us assume that (U_r)_{r \in \mathbb{D}} is a family of open sets in X satisfying:

  1.  for all r\in \mathbb{D}, C \subset U_r;
  2. for all r\in \mathbb{D}, \overline{U}_r\subset U;
  3. for all r and s in \mathbb{D} such that r<s, \overline{U}_r\subset U_s;
  4. U=\bigcup_{r\in \mathbb{D}}U_r.

Let us define the function f: X\to [0,1] by
\displaystyle f(x)=\left\{\begin{array}{ll}  1 &\mbox{ if } x\in X\setminus U;\\  \inf\{r \in \mathbb{D}; x \in U_r\} &\mbox{ if } x\in U.\\  \end{array}\right.
Then f is continuous and f\equiv 0 on C.

Proof. Since C\subset U_{r} for all r \in \mathbb{D} and \inf \mathbb{D}=0, we have f\equiv 0 on C. Let us note (this will be used repeatedly ) that if x\in U_r, f(x) \le r, and if x\notin U_r, r\le f(x). It remains to prove that f is continuous. According to the statement at the end of the first section, it is enough to show that f is continuous at every point of X. Let x be a point of X and N a neighborhood of f(x) in [0,1].
Case 1: x\in X\setminus U. In that case f(x)=1. We recall that the intervals ]a-\varepsilon,a+\varepsilon[ (with a \in \mathbb{R} and \varepsilon>0) form a basis for the topological space \mathbb{R}. According to the definition of the subspace topology, there exists therefore \varepsilon>0 such that ]1-\varepsilon,1]\subset N. There also exists r\in \mathbb{D} such that r\in ]1-\varepsilon,1[. Then X\setminus \overline{U}_r is an open set containing x. For all y \in X\setminus \overline{U}_r, we have f(y)\ge r according to the remark at the beginning, and thus f(y)\in N.
Case 2: x \in U with f(x)=0. With the same arguments as in the preceding case, there exists \varepsilon>0 such that [0,\varepsilon[\subset N. There exists r \in \mathbb{D} such that r\in ]0,\varepsilon[. Then U_r is an open set containing x. For y\in U_r, we have f(y)\le r, and therefore f(y)\in N.
Case 3: x\in U with f(x)>0. Note that f(x)<1: by hypothesis 4, x belongs to some U_r with r<1, so that f(x)\le r<1. With the same arguments as before, there exists \varepsilon>0 such that ]f(x)-\varepsilon,f(x)+\varepsilon[\,\cap\,[0,1]\subset N. There also exist r and s in \mathbb{D} such that
\displaystyle f(x)-\varepsilon<r<f(x)<s<f(x)+\varepsilon.
Then U_s\setminus \overline{U}_r is an open set containing x such that f(y)\in N for all y\in U_s\setminus \overline{U}_r.
We have shown that in all cases there is a neighborhood N' of x such that f(N')\subset N. This shows that f is continuous at x and concludes the proof.

Let us turn to the proof of Urysohn’s Lemma. For n\in \mathbb{N}^*, we denote by \mathbb{D}_n the set
\displaystyle \left\{\frac{j}{2^n};j\in\{1,\dots,2^n-1\}\right\},
that is to say the set of numbers in ]0,1[ that have at most n binary digits. We have \mathbb{D}_n\subset\mathbb{D}_{n+1} and \mathbb{D}=\bigcup_{n=1}^{\infty}\mathbb{D}_n. Let us fix a closed set C and an open set U in X such that C\subset U. We will construct recursively a family of open sets satisfying the hypotheses in Lemma 2.

Since X is normal, there exist two open sets U_{\frac{1}{2}} and V_1 such that C\subset U_{\frac{1}{2}}, X\setminus U \subset V_1 and U_{\frac{1}{2}}\cap V_1=\emptyset. In particular, U_{\frac{1}{2}} is contained in the closed set X\setminus V_1 and thus \overline{U}_{\frac{1}{2}}\subset X\setminus V_1\subset U. We have “constructed” the open set indexed by the element in \mathbb{D}_1=\{1/2\}.

Let us now assume that, for some n\in\mathbb{N}^*, we have a family of open sets (U_r)_{r\in \mathbb{D}_n} satisfying hypotheses 1, 2 and 3 in Lemma 2 (with \mathbb{D}_n in place of \mathbb{D}). Let us now pick r \in \mathbb{D}_{n+1}\setminus \mathbb{D}_{n}.
Case 1: r=1/2^{n+1}. Then the closed sets C and X\setminus U_{\frac{1}{2^n}} are disjoint, and, proceeding as in the first step, we find an open set U_{\frac{1}{2^{n+1}}} such that
C\subset U_{\frac{1}{2^{n+1}}} and \overline{U}_{\frac{1}{2^{n+1}}}\subset U_{\frac{1}{2^n}}.
Case 2: r=(2j+1)/2^{n+1} for some j \in \{1,\dots, 2^n-2\}. Then the closed sets \overline{U}_{\frac{j}{2^n}} and X\setminus U_{\frac{j+1}{2^n}} are disjoint, and, proceeding as before, we find an open set U_r such that \overline{U}_{\frac{j}{2^n}}\subset U_r and \overline{U}_r \subset U_{\frac{j+1}{2^n}}.
Case 3: r=(2^{n+1}-1)/2^{n+1}. Then the closed sets \overline{U}_{\frac{2^n-1}{2^n}} and X\setminus U are disjoint and, proceeding as before, we find an open set U_r such that \overline{U}_{\frac{2^n-1}{2^n}} \subset U_r and \overline{U}_r \subset U.

We can therefore build recursively a family of open sets (U_{r})_{r\in \mathbb{D}} satisfying hypotheses 1, 2 and 3 of Lemma 2. Hypothesis 4 need not hold for U itself, but it does hold if we replace U by the open set U'=\bigcup_{r\in \mathbb{D}}U_r (hypotheses 1, 2 and 3 are preserved, since \overline{U}_r\subset U_s\subset U' for any s\in\mathbb{D} with s>r). Applying Lemma 2 with U' in place of U, we obtain a continuous function f:X\to [0,1] such that f\equiv 0 on C and f\equiv 1 on X\setminus U', hence in particular f\equiv 1 on X\setminus U. This proves Urysohn's Lemma.

We can now prove that a T_4 space is T_{3\frac{1}{2}} by applying Urysohn’s Lemma to the closed set \{x\} and the open set X\setminus C.

Some Point-set Topology: Part 1

Statement of intent

I have picked up differential geometry and topology where I left it three years ago. It keeps popping up in my research, and I have decided that I should study it seriously, with a precise treatment of the foundational material. I will record my progress here, in order to motivate me and to force me to be as complete and precise as I can. For the moment, I am using Foundations of Differentiable Manifolds and Lie Groups, by Frank W. Warner, and Topology and Geometry, by Glen E. Bredon. I also use two French books, Cours de Topologie, by Gustave Choquet, and Introduction aux Variétés Différentielles, by Jacques Lafontaine.

I will start with a review of some point-set topology. I will not attempt to be complete on this topic; I am mainly interested in learning the concepts that I might need to study manifolds. I will however try to define the objects that I use and motivate these definitions.

 

Topological spaces

When we first study functions of one or several real variables, we quickly define continuity and prove results dealing with continuous functions, such as the intermediate value theorem or the extreme value theorem. Continuous functions are the basic objects of analysis. If we go further in our study, we encounter functions that map the elements of a set to the elements of another set, where these sets need not be subsets of the real line \mathbb{R} or of the d-dimensional Euclidean space \mathbb{R}^d. If we want to keep talking about continuous functions, we need to specify the simplest mathematical structure that allows us to do so.

Let us stay on the real line for the moment. We have learned in our calculus lectures to call continuous at x_0 a function f:\mathbb{R}\to \mathbb{R} that satisfies the following property:
for all \varepsilon>0, there exists some \delta>0 such that, if x is in ]x_0-\delta,x_0+\delta[, its image f(x) is in ]f(x_0)-\varepsilon,f(x_0)+\varepsilon[. We say that f:\mathbb{R}\to\mathbb{R} is continuous when it is continuous at every point x\in\mathbb{R}. This definition does not lend itself well to generalization, mostly because the definition of the intervals relies on the fact that real numbers can be ordered. But it can be reformulated in a much more powerful way, at the price of increased abstraction.

Let us say that a subset U\subset \mathbb{R} is open if it satisfies the following property: for each x in U, there exists a real number \varepsilon>0 such that the interval ]x-\varepsilon, x+\varepsilon[\subset U. It is then easy to see that, with this definition,

  1.  the whole real line \mathbb{R} and the empty set \emptyset are open;
  2. the union of a collection of open sets is open;
  3. the intersection of a finite number of open sets is open.

By playing around a bit with this definition, we can also see that f:\mathbb{R}\to \mathbb{R} is continuous if, and only if, the preimage f^{-1}(U) of any open set U \subset \mathbb{R} is open.

In the above paragraph, the intervals (and thus the order relation on \mathbb{R}) were used to define the open sets, but after that we were able to define the continuity of a function purely in terms of open sets. This suggests that, to talk about continuity on sets that are more general than \mathbb{R}, it is enough to specify the open subsets. More precisely, we call topological space a set X, together with a collection \tau of subsets of X such that:

  1. X and \emptyset are in \tau;
  2. if (U_i)_{i\in I} is a family of subsets in \tau, \bigcup_{i\in I}U_i\in \tau;
  3. if (U_i)_{i\in I} is a finite family of subsets in \tau, \bigcap_{i\in I}U_i\in \tau.

In the second axiom, the index set I need not be finite, or even countable. We say that the subsets of X that are in \tau are open, and that the subsets of X whose complement is in \tau are closed. Now, if (X_1,\tau_1) and (X_2,\tau_2) are two topological spaces, we say that a function f:X_1\to X_2 is continuous if, for all U\in \tau_2, the preimage f^{-1}(U) is in \tau_1.

If (X,\tau) is a topological space and Y\subset X, we define a topology on Y by specifying the open sets in the following way: a set V\subset Y is open when there exists U \in \tau such that V=U\cap Y (Exercise: Show that this indeed defines a topology). Of course U need not be unique. The set Y with this topology is called a topological subspace of (X,\tau).

We often need to define a topology without specifying all the open sets. With that in mind, we call \beta \subset \tau a basis for (X,\tau) when any open set is the union of sets in \beta. For instance, by the very definition of open sets in \mathbb{R} used at the beginning, the collection containing all the sets ]x-\varepsilon,x+\varepsilon[ with x\in \mathbb{R} and \varepsilon>0 is a basis.

Finally, if x\in X, we often want to restrict our attention to points that are “close” to x. We say that a set N (not necessarily open) is a neighborhood of x when there exists U\in \tau such that x\in U \subset N. We say that a collection \beta_x of subsets of X is a neighborhood basis at x if each member of \beta_x is a neighborhood of x, and if, for any neighborhood N of x, there exists N'\in \beta_x such that N'\subset N. For instance, if x \in \mathbb{R}, the intervals ]x-\varepsilon, x+\varepsilon[, with \varepsilon>0, are a neighborhood basis at x for the topology defined at the beginning.
This allows us to give a general definition of the concept of continuity at a point, from which we started. Let (X_1,\tau_1) and (X_2,\tau_2) be topological spaces. We say that a function
f:X_1\to X_2 is continuous at x if, for every neighborhood N_2 of f(x), there exists a neighborhood N_1 of x such that f(N_1)\subset N_2. As an exercise, the reader should check the following result, which shows that everything is consistent.

Proposition
The function f:X_1\to X_2 is continuous at every point of X_1 if, and only if, it is continuous (in the sense that the preimage of any open set is an open set).

Free fall

This is just elementary calculus, but it is harder than I thought at first glance. A friend told me that there is no formula expressing the altitude of a falling body as a function of time if you use the inverse square law for the force, rather than making the approximation of a constant gravitational field. As far as I am able to tell, he is right. One can however compute some things.

We study the motion of a mass m, moving along an axis, whose position is denoted by x(t). At the time t=0, the mass is at rest in the position x(0)=x_0. There is a mass M, at the position x=0, that does not move and attracts m by a force f given by the inverse square law:
\displaystyle f=-\frac{GMm}{x^2},
G being the gravitational constant.

Let us write the conservation of the mechanical energy for the mass m: at any time t\ge 0,
\displaystyle \frac{1}{2}mx'(t)^2-\frac{GMm}{x(t)}=-\frac{GMm}{x_0}.
We consider a falling body, therefore x'(t)\le 0, and thus
x'(t)=-\sqrt{2GM}\sqrt{\frac{1}{x(t)}-\frac{1}{x_0}},
This yields
-\frac{x'(t)}{\sqrt{\frac{1}{x(t)}-\frac{1}{x_0}}}=\sqrt{2GM},
and therefore
\displaystyle  -\frac{x'(t)x(t)}{\sqrt{x(t)(x_0-x(t))}}=k, (1)
with
\displaystyle k=\sqrt{\frac{2GM}{x_0}}
To find t as a function of x(t), we compute
\displaystyle \int\frac{x\,dx}{\sqrt{x(x_0-x)}}.
We have
\displaystyle \frac{x\,dx}{\sqrt{x(x_0-x)}}=\frac{-(-2x+x_0)}{2\sqrt{x(x_0-x)}}+\frac{x_0}{2\sqrt{x(x_0-x)}},
and therefore
\displaystyle \int\frac{x\,dx}{\sqrt{x(x_0-x)}}=-\sqrt{x(x_0-x)}+\frac{x_0}{2}\int \frac{dx}{\sqrt{x(x_0-x)}}.
We make the change of variable x=\frac{x_0}{1+u^2}, with u\ge 0, which gives
\displaystyle dx=-\frac{2x_0u\,du}{(1+u^2)^2}
and
\displaystyle \sqrt{x(x_0-x)}=\frac{x_0\,u}{1+u^2}.
We obtain
\displaystyle \int\frac{x\,dx}{\sqrt{x(x_0-x)}}=-\sqrt{x(x_0-x)}-\frac{x_0}{2}\int_{u=\sqrt{\frac{x_{0}}{x}-1}} \frac{2\,du}{1+u^2}
and finally
\displaystyle \int\frac{x\,dx}{\sqrt{x(x_0-x)}}=-\sqrt{x(x_0-x)}-x_0\arctan\left(\sqrt{\frac{x_0}{x}-1}\right)+C.
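As a check (a verification of mine, not part of the original computation), we can differentiate the right-hand side:
\displaystyle \frac{d}{dx}\left(-\sqrt{x(x_0-x)}\right)=-\frac{x_0-2x}{2\sqrt{x(x_0-x)}},\qquad \frac{d}{dx}\left(-x_0\arctan\left(\sqrt{\frac{x_0}{x}-1}\right)\right)=\frac{x_0}{2\sqrt{x(x_0-x)}},
and the sum is indeed \frac{x}{\sqrt{x(x_0-x)}}.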
We now integrate  (1) between 0 and t:
\displaystyle  k\,t=\sqrt{x(x_0-x)}+x_0\arctan\left(\sqrt{\frac{x_0}{x}-1}\right). (2)
This gives us the duration of the fall as a function of the position of the falling mass m.

We can find another formula for the antiderivative. Let us make the change of variable x=x_0\frac{1+\cos(u)}{2}=x_0\cos^2\left(\frac{u}{2}\right). Then
dx=-x_0\sin\left(\frac{u}{2}\right)\cos\left(\frac{u}{2}\right)du, and
\displaystyle \sqrt{x(x_0-x)}=x_0\sqrt{\cos^2\left(\frac{u}{2}\right)\left(1-\cos^2\left(\frac{u}{2}\right)\right)}=x_0\sin\left(\frac{u}{2}\right)\cos\left(\frac{u}{2}\right),
u being taken between 0 and \pi. We obtain
\displaystyle\int\frac{x\,dx}{\sqrt{x(x_0-x)}}=-\sqrt{x(x_0-x)}-\frac{x_0}{2}\int_{u=2\arccos\left(\sqrt{\frac{x}{x_0}}\right)} du,
and therefore
\displaystyle  k\,t=\sqrt{x(x_0-x)}+x_0\arccos\left(\sqrt{\frac{x}{x_0}}\right), (3)
which is the answer that can be found in Wikipedia. Taking x=0, we obtain the free-fall time t_f, the time it takes for the falling mass to reach the center of attraction:
\displaystyle t_f=\frac{\pi}{2}\sqrt{\frac{x_0^{3}}{2GM}}.

How do we find these changes of variable? (This is the only tricky part in these calculations.) Well, if we set y=\sqrt{x(x_0-x)}, the points (x,y), with x in [0,x_0], are on the circle of radius x_0/2 centered at (x_0/2,0). Now if (x(u),y(u)) is a parametric representation of this circle, the change of variable x=x(u) allows us to write
\displaystyle \int \frac{dx}{\sqrt{x(x_0-x)}}=\int_{u=x^{-1}(x)}\frac{x'(u)\,du}{y(u)}.
In the first change of variable, we used a parametrization of the circle by rational functions:
\displaystyle\left\{\begin{array}{lcl}  x(u)&=&\frac{x_0}{1+u^2};\\  y(u)&=&\frac{x_0\,u}{1+u^2}.  \end{array}  \right.
In the second, we used a parametrization by trigonometric functions:
\displaystyle\left\{\begin{array}{lcl}  x(u)&=&x_0\frac{1+\cos(u)}{2};\\  y(u)&=&\frac{x_0}{2}\sin(u).  \end{array}  \right.
Both are natural choices (the second maybe more so than the first), and both produce a simple result.

If we set
\displaystyle F(\xi)=\xi\sqrt{1-\xi^2}+\arccos(\xi),
we obtain
\displaystyle k\,t=x_0\,F\left(\sqrt{\frac{x}{x_0}}\right)
and thus
\displaystyle  x(t)=x_0\,\left(F^{-1}\right)^2\left(\frac{k\,t}{x_0}\right). (4)
Indeed, F is continuous on [0,1], differentiable on [0,1[, and
\displaystyle F'(\xi)=\sqrt{1-\xi^2}-\frac{\xi^2}{\sqrt{1-\xi^2}}-\frac{1}{\sqrt{1-\xi^2}}=\frac{-2\,\xi^2}{\sqrt{1-\xi^2}}.
Therefore F is decreasing, and thus a bijection from [0,1] to [0,\pi/2]. Its inverse F^{-1} is continuous on [0,\pi/2] and differentiable on ]0,\pi/2[.

On the other hand, if we set
\displaystyle G(\xi)=\frac{\xi}{1+\xi^2}+\arctan(\xi),
we obtain
k\,t=x_0\,G\left(\sqrt{\frac{x_0}{x}-1}\right)
and thus
\displaystyle x(t)=\frac{x_0}{1+\left(G^{-1}\right)^{2}\left(\frac{k\,t}{x_0}\right)}.
The function G is continuous and differentiable on \mathbb{R} and
G'(\xi)=\frac{2}{(1+\xi^2)^2}.
It is therefore a diffeomorphism from \mathbb{R} to ]-\pi/2,\pi/2[. Let us denote by g the inverse G^{-1}. It satisfies the differential equation
\displaystyle g'(\zeta)=\frac{1}{2}+g(\zeta)^2+\frac{1}{2}g(\zeta)^4,
with g(0)=0. In particular, g'(0)=1/2. This, taken with the expression of x(t) in terms of G^{-1} above, implies that
\displaystyle x(t)=x_0\left(1-\frac{k^2t^2}{4x_0^2}+O\left(\frac{k^3t^3}{x_0^3}\right)\right)
for t close to 0. We have approximately
\displaystyle x(t)\simeq x_0-\frac{1}{2}\gamma_0t^2,
with
\displaystyle \gamma_0=\frac{GM}{x_0^2}.
The right-hand side is the result that we would obtain for a constant gravitational field \gamma_0.
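
For completeness, here is the small computation behind this identification (it is implicit above):
\displaystyle x_0\,\frac{k^2t^2}{4x_0^2}=\frac{k^2t^2}{4x_0}=\frac{2GM}{x_0}\cdot\frac{t^2}{4x_0}=\frac{GM}{2x_0^2}\,t^2=\frac{1}{2}\gamma_0t^2.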

We have
\displaystyle \lim_{\zeta\to \pi/2}g(\zeta)=+\infty.
Since
\displaystyle G(\xi)=\frac{\xi}{1+\xi^2}+\arctan(\xi)=\frac{\pi}{2}+\frac{1/\xi}{1+(1/\xi)^2}-\arctan\left(\frac{1}{\xi}\right)\quad\mbox{for } \xi>0,
we have
\displaystyle \zeta=\frac{\pi}{2}+\frac{1/g(\zeta)}{1+(1/g(\zeta))^2}-\arctan\left(\frac{1}{g(\zeta)}\right)=\frac{\pi}{2}+\frac{1}{g(\zeta)}-\frac{1}{g(\zeta)^3}-\frac{1}{g(\zeta)}+\frac{1}{3\,g(\zeta)^3}+o\left(\frac{1}{g(\zeta)^3}\right),
so that \zeta-\frac{\pi}{2}\sim -\frac{2}{3\,g(\zeta)^3},
and therefore
\displaystyle g(\zeta)\sim \left(\frac{3}{2}\left(\frac{\pi}{2}-\zeta\right)\right)^{-1/3}.
Therefore, when t tends to t_f,
\displaystyle x(t)\sim x_0\left(\frac{3}{2}\left(\frac{\pi}{2}-\frac{k\,t}{x_0}\right)\right)^{2/3}=\left(\frac{9GM}{2}\right)^{1/3}(t_f-t)^{2/3}.
Using the differential equation, we also find
\displaystyle  x'(t)\sim-\sqrt{2GM}\left(\frac{9GM}{2}\right)^{-1/6}(t_f-t)^{-1/3}\\  =-\frac{2}{3}\left(\frac{9}{2}\right)^{1/3}(GM)^{1/3}(t_f-t)^{-1/3},
which is what we would obtain by formally differentiating the equivalent.

The free-fall time can be rewritten
\displaystyle t_f=\frac{\pi}{2}\sqrt{\frac{x^{2}}{2GM}}\,\frac{x_0}{x}\sqrt{x_0}=\frac{\pi x_0}{2x}\sqrt{\frac{x_0}{2\gamma}}
with \gamma the acceleration of gravity at the distance x from the center of attraction. How long does it take to fall from the altitude of the Moon? We take x to be the radius of the Earth, and x_0 is the distance from the Earth to the Moon. Wikipedia gives
x_0=384399 kilometers, x=6378 kilometers and \gamma=9.780327 meters per second squared. We find roughly 116 hours, which is to say four days and twenty hours.
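
As a quick numerical check (a small script of my own; the variable names are mine), the following Python lines reproduce this figure from t_f=\frac{\pi}{2}\sqrt{\frac{x_0^{3}}{2GM}}, writing GM=\gamma x^2:

import math

gamma = 9.780327           # m/s^2, acceleration of gravity at the Earth's surface
x = 6378e3                 # m, radius of the Earth
x0 = 384399e3              # m, Earth-Moon distance
GM = gamma * x**2          # gravitational parameter of the Earth
t_f = (math.pi / 2) * math.sqrt(x0**3 / (2 * GM))
print(t_f / 3600)          # about 116.6 (hours)

This gives about 116.6 hours, consistent with the four days and twenty hours above.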

The Wikipedia article I linked above gives two references for the free fall:

1. From Moon-fall to motions under inverse square laws, by S. K. Foong, European Journal of Physics, 2008 29: 987–1003;
2. Radial motion of two mutually attracting particles, by Carl E. Mungan, The Physics Teacher, 2009, 47: 502-507.

I skimmed the first article (unfortunately it is behind a paywall, but I could get access with my university subscription). The formula for x \ll x_0 and the numerical value for the free fall time from the orbit of the moon agree with what I found. There is also a series expansion, and probably other stuff. I will read in detail later.

Sum of subspaces

In connection with my previous post, I just read about a criterion that determines when the sum of two closed subspaces of a Banach space is closed. This is explained for instance in Haïm Brezis’s textbook on Functional Analysis, but I assume it can be found in a lot of other places.

Proposition

Let E be a Banach space and F and G two closed subspaces of E. The two following statements are equivalent.

  1.     F+G is a closed subspace of E.
  2.     There exists a constant C such that for each z \in F+G, there exist x \in F and y \in G with \|x\|\le C\|z\| and \|y\|\le C\|z\| satisfying z=x+y.
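
To get a feeling for statement 2, here is a finite-dimensional illustration of my own: in E=\mathbb{R}^2 with the Euclidean norm, take F=\mathbb{R}(1,0) and G=\mathbb{R}(\cos\theta,\sin\theta) with 0<\theta<\frac{\pi}{2}. Any z=(z_1,z_2) decomposes as z=x+y with y=\frac{z_2}{\sin\theta}(\cos\theta,\sin\theta)\in G and x=z-y\in F, so that \|y\|\le \frac{\|z\|}{\sin\theta} and \|x\|\le \left(1+\frac{1}{\sin\theta}\right)\|z\|. The constant C blows up as \theta tends to 0, that is to say as the two subspaces get closer to each other: statement 2 quantifies, in some sense, the angle between F and G.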

Proof

Let us first prove that 2 implies 1. Let (z_n) be a sequence of vectors in F+G such that z_n \to z. The sequence (z_n) is a Cauchy sequence. The following trick is often useful, since it allows us to use absolutely convergent series (which, in a Banach space, are convergent).

Lemma 1

There exists a subsequence (z_{n_k}) of (z_n) such that  \|z_{n_{k+1}}-z_{n_k}\|\le 2^{-k}.

Proof of Lemma 1

We prove it recursively. Since (z_n) is a Cauchy sequence, there exists an integer N such that for all m and n greater than N, \|z_n-z_m\|\le 1/2. We set n_1=N. Now there is an integer M, that we can choose strictly greater than n_1, such that for all m and n greater than M, \|z_n-z_m\|\le 1/4. We set n_2=M. Since n_2>n_1, we have, by the initial choice of n_1, \|z_{n_2}-z_{n_1}\|\le 1/2. We can now pick n_3>n_2 such that \|z_n-z_{n_3}\|\le 1/8 for all n\ge n_3, and it will automatically satisfy \|z_{n_3}-z_{n_2}\|\le 1/4. If we keep on, we obtain the desired subsequence z_{n_1}, z_{n_2},z_{n_3},z_{n_4},\dots.

In the rest of the proof, we assume that such a subsequence has been chosen. We will make an abuse of notation by denoting this subsequence by (z_n). We therefore assume in the rest of the proof that \|z_{n+1}-z_{n}\|\le 2^{-n}.

Lemma 2

There are sequences (x_n) and (y_n) of vectors in F and G respectively such that x_n+y_n=z_n, \|x_{n+1}-x_n\| \le C\|z_{n+1}-z_{n}\| and \|y_{n+1}-y_n\| \le C\|z_{n+1}-z_{n}\|.

Proof of Lemma 2

We first pick any x_1 \in F and y_1 \in G such that x_1+y_1=z_1. Now, according to 2, there exist \tilde{x}\in F and \tilde{y} \in G such that \tilde{x}+\tilde{y}=z_2-z_1, \|\tilde{x}\| \le C\|z_2-z_1\| and \|\tilde{y}\| \le C\|z_2-z_1\|. We then set x_2=x_1+\tilde{x} and y_2=y_1+\tilde{y}. The other terms of the sequences are built recursively in the same manner.

The series \sum(x_{n+1}-x_n) is absolutely convergent (since \|x_{n+1}-x_n\|\le C\,2^{-n}) and therefore convergent. The sequence (x_n) is therefore convergent. Let us call x its limit. Since F is closed, x belongs to F. In the same way, (y_n) converges to some y in G. By passing to the limit in x_n+y_n=z_n, we get x+y=z and therefore z\in F+G. We have proved that F+G is closed (this took longer than I thought it would).

Now for the converse. If 1 holds, the spaces F, G, and F+G are closed subspaces of E and therefore Banach spaces. We consider the Cartesian product F\times G, equipped with the norm \|(x,y)\|=\max(\|x\|,\|y\|). It is also a Banach space. We consider the mapping S:(x,y)\to x+y from  F\times G to F+G.  It is clearly linear, bounded, and surjective. Then, we bring out the big gun, namely the Banach-Schauder theorem.

Theorem

A bounded linear mapping between Banach spaces that is surjective is an open mapping (the direct image of any open set is an open set).

Applying this theorem to S, we obtain that there is some \varepsilon>0 such that B(0,\varepsilon)\cap (F+G) \subset S((B(0,1)\cap F)\times (B(0,1)\cap G)), where B(x,R) denotes the open ball in E of radius R and center x. Now, let us pick any z \in F+G, z \neq 0. The vector z'=\frac{\varepsilon}{2\|z\|}z belongs to B(0,\varepsilon)\cap (F+G), therefore there exist x' \in B(0,1)\cap F and y'\in B(0,1)\cap G such that z'=x'+y'. By setting x=\frac{2\|z\|}{\varepsilon}x' and y=\frac{2\|z\|}{\varepsilon}y', we see that we have proved 2 with C=\frac{2}{\varepsilon}.

Of course, this criterion cries out for a reformulation in terms of quotient topology. I will do it at some point, but I have procrastinated enough this  week. Time to go to bed.