Delta function
is defined such that this relation holds:
(1)
No such function exists, but one can find many sequences “converging” to a delta function:
(2)
more precisely:
(3)
one example of such a sequence is:

It’s clear that (3) holds for any well behaved function
.
Some mathematicians like to say that it’s incorrect to use such a notation when
in fact the integral (1) doesn’t “exist”, but we will not follow
their approach, because it is not important if something “exists” or not,
but rather if it is clear what we mean by our notation: (1) is a
shorthand for (3) and (2) gets a mathematically rigorous
meaning when you integrate both sides and use (1) to arrive at
(3). Thus one uses the relations (1), (2),
(3) to derive all properties of the delta function.
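The limiting procedure behind (2) and (3) can be tried out symbolically. The sketch below assumes a normalized Gaussian of width epsilon as the sequence and a polynomial as the well behaved function; this particular sequence and the names `delta_eps`, `f` and `smeared` are illustrative assumptions, not necessarily the example used above:

```python
from sympy import exp, integrate, limit, oo, pi, simplify, sqrt, symbols

x = symbols("x", real=True)
eps = symbols("epsilon", positive=True)

# Assumed sequence: a normalized Gaussian that narrows as eps -> 0.
delta_eps = exp(-x**2 / (2 * eps**2)) / (eps * sqrt(2 * pi))

# A well behaved test function with f(0) = 2.
f = 2 + 3 * x + x**2

# Integrate at finite eps first, then take the limit eps -> 0.
smeared = simplify(integrate((delta_eps * f).expand(), (x, -oo, oo)))
value_at_zero = limit(smeared, eps, 0)

print(smeared)        # still depends on eps
print(value_at_zero)  # the limit, to be compared with f(0)
```

The order of operations matters: the integral is done at finite epsilon and only then is the limit taken, which is exactly the meaning the text assigns to the shorthand (1).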
Let’s give an example. Let
be the unit vector in 3D and we can label it using spherical coordinates
. We can also express it in cartesian coordinates as
.
(4)
Expressing
as a function of
and
we have
(5)
Expressing (4) in spherical coordinates we get

and comparing to (5) we finally get

In exactly the same manner we get

See also (6) for an example of how to deal with more complex expressions involving the delta function like
.
When integrating over a finite interval, this formula is very useful:

in other words, the integral vanishes unless
. In the limit
and
we get:

Some mathematicians prefer to use distributions and a dedicated mathematical notation for them. I think this makes things less clear, but it is still important to understand, so the notation is explained in this section. I discourage using it, though, and suggest using only the physical notation as explained below. The math notation below is put into quotation marks, so that it’s not confused with the physical notation.
The distribution is a functional and each function
can be identified
with a distribution
that it generates using this definition (
is a test function):

besides that, one can also define distributions that can’t be identified with regular functions, one example is a delta distribution (Dirac delta function):

The last integral is not used in mathematics; in physics, on the other hand, the
first expression (
) is not used, so
always means
that you have to integrate it, as explained in the previous section, so it
behaves like a regular function (except that such a function doesn’t exist and
the precise mathematical meaning is only after you integrate it, or through the
identification above with distributions).
One then defines common operations by letting them act on the generating function, observing the pattern, and then defining the operation the same way for all distributions. For example, differentiation:

so:

Multiplication:

so:

Fourier transform:
$$
\begin{aligned}
\text{“}FT_f(\varphi)\text{”} = \text{“}T_{Ff}(\varphi)\text{”}
&= \int F(f)\,\varphi\,\mathrm{d}x \\
&= \int\left[\int e^{-ikx} f(k)\,\mathrm{d}k\right]\varphi(x)\,\mathrm{d}x \\
&= \int f(k)\left[\int e^{-ikx}\varphi(x)\,\mathrm{d}x\right]\mathrm{d}k \\
&= \int f(x)\left[\int e^{-ikx}\varphi(k)\,\mathrm{d}k\right]\mathrm{d}x \\
&= \int f\,F(\varphi)\,\mathrm{d}x = \text{“}T_f(F\varphi)\text{”}
\end{aligned}
$$
so:

But as you can see, the notation is just making things more complex, since it’s enough to just work with the integrals and forget about the rest. One can then even omit the integrals, with the understanding that they are implicit.
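Working directly with the integrals is also how computer algebra systems treat the delta function. As a small illustration (the test function `cos(x)` and the point `x = 1` are arbitrary choices made for this sketch), SymPy's `DiracDelta` and `Heaviside` behave as follows:

```python
from sympy import DiracDelta, Heaviside, cos, diff, integrate, oo, symbols

x = symbols("x", real=True)

# Sifting property: the integral picks out the value of cos(x) at x = 1.
sifted = integrate(DiracDelta(x - 1) * cos(x), (x, -oo, oo))

# The derivative of the step function is the delta function.
step_derivative = diff(Heaviside(x), x)

print(sifted)
print(step_derivative)
```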
Some more examples:

Proof of
:

Proof of
:

Proof of
:

Variations and functional derivatives are generalizations of differentials and partial derivatives to functionals. It is important to master this subject just like regular differentials/derivatives in calculus.
Let’s first review differentials and derivatives of functions of one variable.
We will use an approach that directly generalizes to multivariable functions
and functionals.
The differential
is defined as:

The last equality follows from the fact that the limit is a linear function of
:

Where we used the substitution
.
We define the derivative
as:

To get a formula for
, we set
and get:

Using the formulas above we get an equivalent expression for the differential:

So we get a general formula (the analogy of which we will use later):

The variable
can be treated as a function (a very simple one):

So we define
as:

As such,
can have two meanings: either
(a finite
change in the variable
) or a differential (if
depends on another
variable, thanks to the chain rule everything will work).
With this understanding,
for all calculations, we only need the following two formulas —
the definition of the differential (using a limit):

and the definition of the derivative (using the differential):

where
is either a differential or a finite change in the variable
.
If for example
is a function of
then in the above
is a
differential and we get:

Thanks to the chain rule, this can also be written as:

and so the notation is consistent.
Let’s have
. The function
assigns a number to
each
. We define a differential of
in the direction of
as:

The last equality follows from the fact that
is a
linear function of
. We define the partial derivative
of
with respect to
as the
-th component of
the vector
:

This also gives a formula for computing
: we set
and

The usual way to define partial derivatives is to use the last formula as the
definition, but here this formula is a consequence of our definition in terms
of the components of
.
Every variable can be treated as a function (a very simple one):

and so we define

and thus we write
and
and

So
has two meanings — it’s either
(a
finite change in the independent variable
) or a differential,
depending on the context. The above is a detailed explanation why things
are defined the way they are and what the exact meaning is. With this
understanding, the only things that are actually needed for any calculations
are the following – the definition of a differential:

Only a regular derivative (defined in the previous section) is needed for this definition. The definition of a partial derivative (and a gradient):

And finally the understanding that
means
either
or a differential depending on the context.
That’s all there is to it.
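The two formulas can be checked on a concrete function. The sketch below uses an arbitrarily chosen two-variable function and direction components `h1`, `h2` (all names here are assumptions made for illustration) and compares the differential computed from its definition with the gradient contracted with the direction:

```python
from sympy import diff, expand, sin, symbols

x1, x2, h1, h2, eps = symbols("x1 x2 h1 h2 epsilon", real=True)

# An arbitrary smooth function of two variables.
f = x1**2 * x2 + sin(x2)

# Definition of the differential: d/d(eps) f(x + eps*h) at eps = 0.
shifted = f.subs({x1: x1 + eps * h1, x2: x2 + eps * h2}, simultaneous=True)
df = diff(shifted, eps).subs(eps, 0)

# Partial derivatives contracted with the direction h.
grad_dot_h = diff(f, x1) * h1 + diff(f, x2) * h2

print(expand(df - grad_dot_h))  # the two expressions should agree
```

Only an ordinary derivative with respect to `eps` is used to obtain `df`, which mirrors the remark that nothing beyond a regular derivative is needed for the definition.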
Let’s now define functional derivatives and variations.
A functional
assigns a number to each function
. The variation is defined as
$$
\delta F[f]\equiv\left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}F[f+\varepsilon h]\right|_{\varepsilon=0}
=\lim_{\varepsilon\to0}\frac{F[f+\varepsilon h]-F[f]}{\varepsilon}
=\int a(x)h(x)\,\mathrm{d}x
$$
We define
as

This also gives a formula for computing
: we set
and
$$
\begin{aligned}
\frac{\delta F}{\delta f(x)} = a(x) = \int a(y)\delta(x-y)\,\mathrm{d}y
&= \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}F[f(y)+\varepsilon\delta(x-y)]\right|_{\varepsilon=0} \\
&= \lim_{\varepsilon\to0}\frac{F[f(y)+\varepsilon\delta(x-y)]-F[f(y)]}{\varepsilon}
\end{aligned}
$$
Sometimes the functional derivative is defined using the last formula, here this formula just follows from our definition. Every function can be treated as a functional (although a very simple one):
$$
f(x)=G[f]=\int f(y)\delta(x-y)\,\mathrm{d}y
$$
and so we define
$$
\delta f\equiv\delta G[f]
=\left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}G[f(x)+\varepsilon h(x)]\right|_{\varepsilon=0}
=\left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\bigl(f(x)+\varepsilon h(x)\bigr)\right|_{\varepsilon=0}
=h(x)
$$
thus we write
and
$$
\delta F[f]=\int \frac{\delta F}{\delta f(x)}\,\delta f(x)\,\mathrm{d}x
$$
so
has two meanings: it’s either
(a finite change in the function
) or a variation
of a functional, depending on the context.
It is completely analogous to
. Let’s summarize the only formulas needed
in actual calculations – the definition of a variation (using a regular
derivative):
$$
\delta F[f] = \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}F[f+\varepsilon\,\delta f]\right|_{\varepsilon=0}
$$
the definition of the functional derivative:
$$
\delta F[f]=\int \frac{\delta F}{\delta f(x)}\,\delta f(x)\,\mathrm{d}x
$$
and the understanding that
means either
or a variation.
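These two formulas can be exercised on a simple functional. The sketch below assumes the functional F[f] = ∫₀¹ f(x)² dx, whose functional derivative is 2 f(x), together with an arbitrary function `f` and an arbitrary change `delta_f`; all of these are illustrative choices, not taken from the text:

```python
from sympy import diff, integrate, symbols

x, eps = symbols("x epsilon", real=True)

def F(g):
    """Assumed example functional: F[g] = integral of g(x)^2 over [0, 1]."""
    return integrate(g**2, (x, 0, 1))

f = x**2               # the function at which we vary
delta_f = x * (1 - x)  # the change in the function

# Definition of the variation: d/d(eps) F[f + eps*delta_f] at eps = 0.
variation = diff(F(f + eps * delta_f), eps).subs(eps, 0)

# The same number from the functional derivative 2 f(x).
functional_derivative = 2 * f
via_derivative = integrate(functional_derivative * delta_f, (x, 0, 1))

print(variation, via_derivative)
```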
The correspondence between the finite and infinite dimensional case can be summarized as:
$$
\begin{aligned}
f(x_i) \quad&\Longleftrightarrow\quad F[f] \\
\mathrm{d}f=0 \quad&\Longleftrightarrow\quad \delta F=0 \\
\frac{\partial f}{\partial x_i}=0 \quad&\Longleftrightarrow\quad \frac{\delta F}{\delta f(x)}=0 \\
f \quad&\Longleftrightarrow\quad F \\
x_i \quad&\Longleftrightarrow\quad f(x) \\
x \quad&\Longleftrightarrow\quad f \\
i \quad&\Longleftrightarrow\quad x
\end{aligned}
$$
More generally,
-variation can be applied to any function
which contains the function
being varied, you just need to replace
by
and apply
to the whole
, for example (here
and
):

This notation allows very convenient computations, as shown in the following examples. First, when computing a variation of some integral, we can interchange
and
:
$$
F[f]=\int K(x) f(x)\,\mathrm{d}x
$$


In the expression
we must understand from the context if we are treating it as a functional of
or
. In our case it’s a functional of
, so we have
.
A few more examples (notice that one can do each calculation either in terms of the functional derivative or the variation, and the variation version is usually simpler):






The last equality follows from
(any antisymmetrical part of a
would not contribute to the symmetrical integration).
Another example is the derivation of Euler-Lagrange equations for the
Lagrangian density
:
$$
\begin{aligned}
0 = \delta I &= \delta \int \mathcal{L}\,\mathrm{d}^4x^\mu
= \int \delta\mathcal{L}\,\mathrm{d}^4x^\mu \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\eta_\rho}\,\delta\eta_\rho
+ \frac{\partial\mathcal{L}}{\partial(\partial_\nu\eta_\rho)}\,\delta(\partial_\nu\eta_\rho)\right]\mathrm{d}^4x^\mu \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\eta_\rho}\,\delta\eta_\rho
+ \frac{\partial\mathcal{L}}{\partial(\partial_\nu\eta_\rho)}\,\partial_\nu(\delta\eta_\rho)\right]\mathrm{d}^4x^\mu \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\eta_\rho}\,\delta\eta_\rho
- \partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu\eta_\rho)}\right)\delta\eta_\rho\right]\mathrm{d}^4x^\mu
+ \int \partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu\eta_\rho)}\,\delta\eta_\rho\right)\mathrm{d}^4x^\mu \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\eta_\rho}
- \partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu\eta_\rho)}\right)\right]\delta\eta_\rho\,\mathrm{d}^4x^\mu
\end{aligned}
$$
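The bracket in the last line of the derivation above is what SymPy's `euler_equations` helper computes. The sketch below assumes a much simpler setting than the general one in the text: a single scalar field `u(t, x)` in one time and one space dimension with the Lagrangian density (u_t² − u_x²)/2. The field, its name and the density are all illustrative assumptions:

```python
from sympy import Function, simplify, symbols
from sympy.calculus.euler import euler_equations

t, x = symbols("t x", real=True)
u = Function("u")(t, x)

# Assumed Lagrangian density of a free scalar field in 1+1 dimensions.
lagrangian_density = u.diff(t) ** 2 / 2 - u.diff(x) ** 2 / 2

# One Euler-Lagrange equation per field; here there is a single field.
(field_equation,) = euler_equations(lagrangian_density, u, [t, x])

# dL/du - d_t(dL/du_t) - d_x(dL/du_x) written out by hand for comparison.
by_hand = -u.diff(t, 2) + u.diff(x, 2)
residual = simplify(field_equation.lhs - field_equation.rhs - by_hand)

print(field_equation)  # the wave equation
```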
Another example:


One might think that the above calculation is incorrect, because
is undefined. In case of
such problems the above notation automatically implies working with some
sequence
(for example
) and taking the limit
:


(6)
As you can see, we got the same result, with the same rigor, but using an obfuscating notation. That’s why such obvious manipulations with
are tacitly implied.
Another example with a metric as a function of coordinates
:

And an example of varying with respect to a metric:

Another example (varying energy functional):
$$
E[\rho] = 4\pi\int \frac{a\,\rho(r)}{b + r_s(r)}\, r^2\,\mathrm{d}r
$$

$$
r_s(r) = \left(\frac{3}{4\pi(-\rho)}\right)^{1\over 3}
$$

$$
\frac{\mathrm{d}r_s}{\mathrm{d}\rho}
= \frac{1}{3}\left(\frac{3}{4\pi(-\rho)}\right)^{-{2\over 3}}\frac{3}{4\pi\rho^2}
= -\frac{1}{3\rho}\left(\frac{3}{4\pi(-\rho)}\right)^{1\over 3}
= -\frac{r_s}{3\rho}
$$

$$
\begin{aligned}
\delta E[\rho] &= 4\pi\,\delta\int \frac{a\rho}{b + r_s}\, r^2\,\mathrm{d}r \\
&= 4\pi\int\left(\frac{a\,\delta\rho}{b + r_s}
- \frac{a\rho}{(b + r_s)^2}\,\delta r_s\right) r^2\,\mathrm{d}r \\
&= 4\pi\int\left(\frac{a\,\delta\rho}{b + r_s}
- \frac{a\rho}{(b + r_s)^2}\left(-\frac{r_s}{3\rho}\right)\delta\rho\right) r^2\,\mathrm{d}r \\
&= 4\pi\int\left(\frac{a}{b + r_s}
+ \frac{1}{3}\frac{a\,r_s}{(b + r_s)^2}\right)(\delta\rho)\, r^2\,\mathrm{d}r
\end{aligned}
$$

$$
\frac{\delta E[\rho]}{\delta\rho}
= 4\pi r^2\left(\frac{a}{b + r_s}
+ \frac{1}{3}\frac{a\,r_s}{(b + r_s)^2}\right)
$$
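The two key steps of this example, the derivative of r_s and the bracket in the final result, can be cross-checked with SymPy. The sketch assumes ρ < 0 (so that −ρ is positive) and works with the positive symbol `n = -rho`; this substitution, the sample values and the variable names are assumptions made only for the check:

```python
from sympy import Rational, diff, pi, simplify, symbols

n, a, b = symbols("n a b", positive=True)  # n stands for -rho > 0

r_s = (3 / (4 * pi * n)) ** Rational(1, 3)

# d r_s / d rho = -d r_s / d n, to be compared with -r_s/(3 rho) = r_s/(3 n).
drs_drho = -diff(r_s, n)
drs_gap = simplify(drs_drho - r_s / (3 * n))

# Derivative of the integrand a*rho/(b + r_s) with respect to rho ...
integrand = a * (-n) / (b + r_s)
d_integrand = -diff(integrand, n)

# ... compared with the bracket in the final formula.
claimed = a / (b + r_s) + Rational(1, 3) * a * r_s / (b + r_s) ** 2

# Numerical comparison at arbitrary sample values.
sample = {n: 0.7, a: 1.3, b: 2.1}
numeric_gap = abs((d_integrand - claimed).subs(sample).evalf())

print(drs_gap, numeric_gap)
```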
Another example (Hartree energy):
$$
E[n] = \frac{1}{2}\int \frac{n({\bf r}')\, n({\bf r}'')}{|{\bf r}' - {\bf r}''|}\,\mathrm{d}^3 r'\,\mathrm{d}^3 r''
$$

$$
\begin{aligned}
\delta E[n] &= \frac{1}{2}\,\delta\int \frac{n({\bf r}')\, n({\bf r}'')}{|{\bf r}' - {\bf r}''|}\,\mathrm{d}^3 r'\,\mathrm{d}^3 r'' \\
&= \frac{1}{2}\int \frac{(\delta n({\bf r}'))\, n({\bf r}'') + n({\bf r}')\,(\delta n({\bf r}''))}{|{\bf r}' - {\bf r}''|}\,\mathrm{d}^3 r'\,\mathrm{d}^3 r'' \\
&= \int \frac{n({\bf r}')}{|{\bf r}' - {\bf r}''|}\,(\delta n({\bf r}''))\,\mathrm{d}^3 r'\,\mathrm{d}^3 r'' \\
&= \int \frac{n({\bf r}')}{|{\bf r} - {\bf r}'|}\,(\delta n({\bf r}))\,\mathrm{d}^3 r'\,\mathrm{d}^3 r
\end{aligned}
$$

$$
\frac{\delta E[n]}{\delta n({\bf r})}
= \int \frac{n({\bf r}')}{|{\bf r} - {\bf r}'|}\,\mathrm{d}^3 r'
$$
The Dirac notation allows a very compact and powerful way of writing equations that describe a function expansion into a basis, both discrete (e.g. a Fourier series expansion) and continuous (e.g. a Fourier transform), and related things. The notation is designed so that it is very easy to remember and it guides you to write the correct equation.
Let’s have a function
. We define

The following equation

then becomes

and thus we can interpret
as a vector,
as a basis and
as the coefficients in the basis expansion:

That’s all there is to it. Take the above rules as the operational definition
of the Dirac notation. It’s like with the delta function - written alone it
doesn’t have any meaning, but there are clear and non-ambiguous rules to
convert any expression with
to an expression which even mathematicians
understand (i.e. integrating, applying test functions and using other relations
to get rid of all
symbols in the expression – but the result is
usually much more complicated than the original formula). It’s the same with
the ket
: written alone it doesn’t have any meaning, but you can
always use the above rules to get an expression that makes sense to everyone
(i.e. attaching any bra to the left and rewriting all brackets
with their equivalent expressions) – but it will be more complex and harder to
remember and – that is important – less general.
Now, let’s look at the spherical harmonics:

on the unit sphere, we have


thus

and from (?) we get

now

from (?) we get

so we have

so
forms an orthonormal basis. Any function defined on the sphere
can be written using this basis:

where

If we have a function
in 3D, we can write it as a function of
and
and expand only with respect to the variable
:

In Dirac notation we are doing the following: we decompose the space into the angular and radial part

and write

where

Let’s calculate 

so

We must stress that
only acts in the
space (not the
space) which means that

and
leaves
intact. Similarly,

is a unity in the
space only (i.e. on the unit sphere).
Let’s rewrite the equation (?):

Using the completeness relation (?):


we can now derive a very important formula true for every function
:


where

or written explicitly
(7)
A function of several variables
is
homogeneous of degree
if

By differentiating with respect to
:

and setting
we get the so called Euler equation:

in 3D this can also be written as:

The function
is homogeneous of degree 1, because:

and the Euler equation is:

or

Which is true.
The function
is homogeneous of degree -1, because:

and the Euler equation is:

or

Which is true.
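Both checks can be repeated with SymPy. The sketch below assumes that the two functions are r = sqrt(x² + y² + z²) (degree 1) and 1/r (degree −1), which matches the stated degrees but is an assumption about the formulas; `euler_residual` is a hypothetical helper that returns x·∇f − n f:

```python
from sympy import simplify, sqrt, symbols

x, y, z = symbols("x y z", real=True)

def euler_residual(f, degree):
    """Left side minus right side of the Euler equation x.grad(f) = n*f."""
    x_dot_grad = x * f.diff(x) + y * f.diff(y) + z * f.diff(z)
    return simplify(x_dot_grad - degree * f)

r = sqrt(x**2 + y**2 + z**2)

residual_r = euler_residual(r, 1)        # r is homogeneous of degree 1
residual_inv = euler_residual(1 / r, -1) # 1/r is homogeneous of degree -1
wrong_degree = euler_residual(r, 2)      # a wrong degree leaves a remainder

print(residual_r, residual_inv, wrong_degree)
```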
Green functions are an excellent tool for working with a solution to any ODE or PDE. In this text we explain how it works and then show how one can calculate them using FEM.
Let’s put any ODE or PDE in the form:
(8)
Here
is a differential operator and
can have any dimension, e.g. 1D
(ODE), 2D, 3D or more (PDE). Then we can express the solution as
(9)
where
is a Green function, that needs to satisfy the equation:
(10)
Remember that
acts on
only, so we can check that (9)
indeed solves the PDE (8):

The equation (10) doesn’t determine the Green function uniquely,
because one can add to it any solution of the homogeneous equation
.
We can use this freedom to solve (10) for any boundary condition.
So we prescribe a boundary condition
and find the Green function (by solving (10)) that satisfies the
boundary condition. It can be shown that
determined from
(9) then also needs to satisfy the same boundary condition.
We write the equation for Green functions at two different points
and
:

and multiply the first equation by
, second by
:

subtract them and integrate over
:

Assuming that the operator
is Hermitian, we get:

So the Green function is symmetric for Hermitian operators
.
Poisson equation:

We calculate the Green function using the Fourier transform:

Check:

Then:

The Green function can also be written using
and
:

Let’s write
and
using the Heaviside step function:

and:

Then we can differentiate:

Given:
(11)
The Green function is

Let’s differentiate:

and

So we get:

So
from (11) is a solution to the radial Poisson
equation:


with boundary conditions
.
We use the Fourier transform:

Check:

The general solution of the homogeneous equation is:

so the general Green function is:

Satisfying the boundary conditions (for all
):

we get:

and:

and

To show that this really works, let’s take for example
. Then

We can use SymPy to evaluate the integrals:
In [1]: u = -cos(x)*integrate(3*sin(2*y)*sin(y), (y, 0, x)) - \
   ...:     sin(x)*integrate(3*sin(2*y)*cos(y), (y, x, pi/2))

In [2]: u
Out[2]:
-(cos(x)*sin(2*x) - 2*cos(2*x)*sin(x))*cos(x) - (sin(x)*sin(2*x)
+ 2*cos(x)*cos(2*x))*sin(x)

In [3]: simplify(u)
Out[3]:
     2                  2
- cos (x)*sin(2*x) - sin (x)*sin(2*x)

In [4]: trigsimp(_)
Out[4]: -sin(2*x)
And we get

We can easily check that
:
>>> u = -sin(2*x)
>>> u.diff(x, 2) + u
3*sin(2*x)
and since
, we have verified that
is the correct
solution.
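The same integral can be wrapped in a small function and tried on other right-hand sides. The sketch below reads the Green function formula off the SymPy session above; `green_solve` and `check` are hypothetical helper names, and the right-hand side f = 1 is an extra example chosen for illustration:

```python
from sympy import cos, integrate, pi, simplify, sin, symbols

x, y = symbols("x y", real=True)

def green_solve(f):
    """Solve u'' + u = f with u(0) = u(pi/2) = 0; f is an expression in y."""
    left = integrate(sin(y) * f, (y, 0, x))
    right = integrate(cos(y) * f, (y, x, pi / 2))
    return -cos(x) * left - sin(x) * right

def check(u, f_of_x):
    """Return the ODE residual and the two boundary values of u."""
    residual = simplify(u.diff(x, 2) + u - f_of_x)
    return residual, simplify(u.subs(x, 0)), simplify(u.subs(x, pi / 2))

u_sin = green_solve(3 * sin(2 * y))  # the example from the text
u_one = green_solve(1)               # constant right-hand side

print(simplify(u_one))
print(check(u_one, 1))
```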
Let’s show it on the Laplace equation. We want to solve:

We will treat
as a parameter, so we define
:

We set
on the boundary and we get:

So we choose
and then solve for
using FEM and we get the
Green function
for all
and one particular
. We can then
evaluate the integral (9) numerically – one would have to use FEM
for all
that are needed in the integral, so that is not efficient, but it
should work. One will then be able to play with Green functions and be able to
calculate them numerically for any boundary condition (which is not possible
analytically).
For
and
integers, the binomial coefficients are defined by:

For
real, one just uses the second formula as a definition:

Example I:

Example II:

The binomial formula is for
integer:

and for
real and
this can be generalized to:

Example: (for
)

so:

Another example:

where we used:

and

The
are Legendre Polynomials.
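The generalized binomial coefficients and the appearance of the Legendre polynomials can both be checked by comparing Taylor coefficients. The sketch below assumes the exponent 1/2 for the binomial series and the standard generating function 1/sqrt(1 − 2xt + t²) for the Legendre polynomials; these choices and the helper `taylor_coeffs` are illustrative assumptions, not necessarily the examples used above:

```python
from sympy import Rational, binomial, diff, expand, factorial, legendre, sqrt, symbols

x, t = symbols("x t")

def taylor_coeffs(expr, var, count):
    """First `count` Taylor coefficients of expr around var = 0."""
    coeffs = []
    term = expr
    for k in range(count):
        coeffs.append(term.subs(var, 0) / factorial(k))
        term = diff(term, var)
    return coeffs

# (1 + x)^alpha = sum_k binomial(alpha, k) x^k for real alpha and |x| < 1.
alpha = Rational(1, 2)
series_coeffs = taylor_coeffs((1 + x) ** alpha, x, 6)
binomial_coeffs = [binomial(alpha, k) for k in range(6)]

# 1/sqrt(1 - 2 x t + t^2) = sum_l P_l(x) t^l.
generating = 1 / sqrt(1 - 2 * x * t + t**2)
legendre_gaps = [
    expand(c - legendre(l, x))
    for l, c in enumerate(taylor_coeffs(generating, t, 5))
]

print(binomial_coeffs)
print(legendre_gaps)
```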
Triangle inequality (condition) means that none of the three
quantities
,
,
is greater than the sum of the other two:
(12)
This is equivalent to just one equation:
(13)
we can do any permutation of the symbols, i.e. the above equation is equivalent to any of these:

So instead of stating the three inequalities (12) it is more convenient to just write (13), using any permutation that we like.
To show that (12) implies (13) we rewrite (12):

so

and we get (13).
To show that (13) implies (12) we rewrite
(13) for
first:

so:

rearranging:

since
is positive, if
then also
and we get
(12). Finally, for
:

so:

rearranging:

since
is positive, if
then also
and we get
(12).
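The equivalence can also be confirmed by brute force. The sketch below assumes that (12) consists of the three inequalities a ≤ b + c, b ≤ a + c, c ≤ a + b and that (13) reads |a − b| ≤ c ≤ a + b, for non-negative quantities; the function names and the tested range (integers and half-integers up to 5) are illustrative choices:

```python
from fractions import Fraction
from itertools import product

def three_inequalities(a, b, c):
    """None of the three quantities exceeds the sum of the other two."""
    return a <= b + c and b <= a + c and c <= a + b

def single_condition(a, b, c):
    """The compact form |a - b| <= c <= a + b."""
    return abs(a - b) <= c <= a + b

# Non-negative integers and half-integers 0, 1/2, 1, ..., 5.
values = [Fraction(k, 2) for k in range(11)]

mismatches = [
    (a, b, c)
    for a, b, c in product(values, repeat=3)
    if three_inequalities(a, b, c) != single_condition(a, b, c)
]

print(len(mismatches))  # number of triples where the two forms disagree
```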
The Gamma function
is defined by the following properties
for
:
(14)
(15)
(16)
It can be shown that this determines the function uniquely for
(this is
called the Bohr-Mollerup theorem) and then it can be extended analytically to
the whole complex plane.
The most common formula for
that satisfies (14),
(15) and (16)
is:
(17)
It satisfies (14) because:
$$
\Gamma(1)
= \int_0^\infty t^{1-1} e^{-t}\,\mathrm{d}t
= \int_0^\infty e^{-t}\,\mathrm{d}t
= [-e^{-t}]_0^\infty
= 1
$$
It satisfies (15) by integrating by parts:
$$
\Gamma(z)
= \int_0^\infty t^{z-1} e^{-t}\,\mathrm{d}t
= (z-1)\int_0^\infty t^{z-2} e^{-t}\,\mathrm{d}t - [t^{z-1}e^{-t}]_0^\infty
= (z-1)\Gamma(z-1)
$$
Finally it satisfies (16) by verifying the convex condition
directly (
and
):

And thus (17) uniquely determines the Gamma function.
We can use (17) to calculate
:

From this and the definition of the Gamma function we get
for integer
:
(18)
and
(19)
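These values can be reproduced with SymPy. The sketch below evaluates the integral (17) for two sample arguments and compares SymPy's built-in `gamma` with the standard closed forms Γ(n+1) = n! and Γ(n+1/2) = (2n)! √π / (4ⁿ n!); that these are the formulas meant by (18) and (19) is an assumption, and `gamma_integral` is a hypothetical helper name:

```python
from sympy import Rational, exp, factorial, gamma, integrate, oo, pi, sqrt, symbols

t = symbols("t", positive=True)

def gamma_integral(z):
    """Evaluate the integral representation (17) for a given z > 0."""
    return integrate(t ** (z - 1) * exp(-t), (t, 0, oo))

half_value = gamma_integral(Rational(1, 2))
three_value = gamma_integral(3)

# Integer arguments: Gamma(n + 1) = n!
integer_ok = all(gamma(n + 1) == factorial(n) for n in range(8))

# Half-integer arguments: Gamma(n + 1/2) = (2n)! sqrt(pi) / (4^n n!)
half_integer_ok = all(
    gamma(n + Rational(1, 2))
    == Rational(factorial(2 * n), 4**n * factorial(n)) * sqrt(pi)
    for n in range(8)
)

print(half_value, three_value, integer_ok, half_integer_ok)
```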