Python/numba package for evaluating and transforming Wigner $\mathfrak{D}$ matrices and spin-weighted spherical harmonics directly in terms of quaternions, and in more standard forms
Project maintained by moble
The Wigner $\mathfrak{D}$ matrices allow us to represent the rotation group by means of finite-dimensional matrices. They reduce to spherical functions in special cases, which further allows us to derive the rotation of those spherical functions—as described in more detail on this page. They are derived in terms of a particular split of the quaternion group into two parts.
Explicitly, a quaternion $\quat{Q}$ can be expressed in terms of two
complex numbers $\quat{Q}_a = \quat{Q}_1 + i\, \quat{Q}_z$ and
$\quat{Q}_b = \quat{Q}_y + i\, \quat{Q}_x$.^{1} This is only
important because it allows us to verify the multiplication law
\begin{align}
\label{eq:QuaternionComponentProducts}
(\quat{P}\,\quat{Q})_a &= \quat{P}_a\, \quat{Q}_a - \co{\quat{P}}_b\, \quat{Q}_b, \\\
(\quat{P}\,\quat{Q})_b &= \quat{P}_b\, \quat{Q}_a + \co{\quat{P}}_a\, \quat{Q}_b.
\end{align}
Given a rotor $\rotor{R}$, these two complex numbers are the
quantities actually used in computing the $\mathfrak{D}$ matrix
elements. Note that there is a series of basic choices leading to
both the decomposition of a quaternion into two complex numbers, and
the product law. None of these choices is set in stone; we just
choose something with an eye to the desired result.
The following is essentially the same as Wigner’s original derivation, but with more elegance, and more sensitivity to numerical issues and special cases. This version of the derivation comes from a paper I wrote a couple years ago, and is the source of the code used in this module.
The basic idea of the derivation is to construct a $(2\ell+1)$-dimensional vector space of homogeneous polynomials in these complex numbers $\quat{Q}_a$ and $\quat{Q}_b$. To make that a little more concrete, the basis of this vector space is \begin{equation*} \label{WignerBasisComponent} \mathbf{e}_{(m)}(\quat{Q}) \defined \frac{\quat{Q}_{a}^{\ell+m}\, \quat{Q}_{b}^{\ell-m}} {\sqrt{ (\ell+m)!\, (\ell-m)! }}. \end{equation*} Here, $m$ ranges from $-\ell$ to $\ell$ in steps of $1$, but $\ell$ (and hence $m$) can be a half-integer. Now, the key idea is that a rotation of $\quat{Q}$ by some new rotor $\quat{R}$ gives us a new vector basis, which we can represent in terms of the basis shown above. In fact, we get a matrix transforming one set of basis vectors to another. We’ll write this as \begin{equation*} \mathbf{e}_{(m’)}(\quat{R}\, \quat{Q}) = \sum_{m} \mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})\, \mathbf{e}_{(m)}(\quat{Q}). \end{equation*}
So now, we’ve defined the $\mathfrak{D}$ matrices. But we can also plug $\quat{R}\, \quat{Q}$ into the original expression for $\mathbf{e}$, and figure out what $\mathfrak{D}$ should actually be. We’ll have to use the expressions for $(\quat{R}\, \quat{Q})_a$ and $(\quat{R}\, \quat{Q})_b$ given above, and we’ll find that we have polynomials with terms given as sums of two different things.
This brings us to the first fork in the road. If either $\quat{R}_a$
or $\quat{R}_b$ is tiny (i.e., the absolute value is numerically
small), we can (and in fact must) treat these as special cases.
First, if $\lvert \quat{R}_a \rvert \lesssim 10^{-15}$, for example,
we can just ignore it; since $\lvert \quat{R} \rvert=1$ (within
numerical precision), we are assured that $\lvert \quat{R}_b \rvert
\approx 1$. Thus, we get
\begin{align*}
\mathbf{e}_{(m’)}(\quat{R}\, \quat{Q})
&\approx \frac{ (- \co{\quat{R}}_{b}\, \quat{Q}_{b})^{\ell+m’}\,
(\quat{R}_{b}\, \quat{Q}_{a})^{\ell-m’} } { \sqrt{
(\ell+m’)!\, (\ell-m’)! } }, \\\
&\approx (- \co{\quat{R}}_{b})^{\ell+m’}\,
(\quat{R}_{b})^{\ell-m’}\, \mathbf{e}_{(-m’)}(\quat{Q}).
\end{align*}
In this case, it’s not hard to see that the expression for the
$\mathfrak{D}$ matrix is
\begin{equation}
\label{eq:D_RaApprox0}
\mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})
= (-1)^{\ell+m’}\, \quat{R}_b^{-2m’} \delta_{-m’,m}
= (-1)^{\ell-m}\, \quat{R}_b^{2m} \delta_{-m’,m}.
\end{equation}
In the same way, we can calculate this for $\lvert \quat{R}_b \rvert
\lesssim 10^{-15}$:
\begin{equation}
\label{eq:D_RbApprox0}
\mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})
= \quat{R}_a^{2m’} \delta_{m’,m}
= \quat{R}_a^{2m} \delta_{m’,m}.
\end{equation}
Now, the other fork in that road is the general case, when both
components have larger magnitudes. We have powers of the sum of those
terms in Eq. \eqref{eq:QuaternionComponentProducts}. This leads us to
use (two applications of) the
binomial expansion.
After a little simplification, we can
express the result as
\begin{multline}
\label{eq:DAnalytically}
\mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})
= \sum_{\rho} \binom{\ell+m’} {\rho}\, \binom{\ell-m’}
{\ell-\rho-m}\, (-1)^{\rho}\, \\\\ \times \quat{R}_{a}^{\ell+m’-\rho}\,
\co{\quat{R}}_{a}^{\ell-\rho-m}\,
\quat{R}_{b}^{\rho-m’+m}\, \co{\quat{R}}_{b}^{\rho}\,
\sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\,
(\ell-m’)! } }.
\end{multline}
It turns out that this expression is not the best way to implement the
calculation in the code. The reason is that we would need to take a
bunch of exponents of complex numbers, and there’s the possibility
that the sum would cancel out to give a very small number, which would
be polluted with roundoff, etc. So we manipulate it to put it in a
better form. For example, we can simplify the above as
\begin{align}
\nonumber
\mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})
&=
\sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\, (\ell-m’)! } }\,
\quat{R}_{a}^{\ell+m’}\,
\co{\quat{R}}_{a}^{\ell-m}\,
\quat{R}_{b}^{-m’+m} \\\
\nonumber
&\qquad
\times \sum_{\rho}
\binom{\ell+m’} {\rho}\, \binom{\ell-m’} {\ell-\rho-m}\,
(-1)^{\rho}\,
\quat{R}_{a}^{-\rho}\,
\co{\quat{R}}_{a}^{-\rho}\,
\quat{R}_{b}^{\rho}\,
\co{\quat{R}}_{b}^{\rho}, \\\
\nonumber
&=
\sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\, (\ell-m’)! } }\,
\lvert \quat{R}_{a} \rvert^{2\ell-2m}\,
\quat{R}_{a}^{m’+m}\,
\quat{R}_{b}^{-m’+m} \\\
\label{eq:D_RaGeqRb}
&\qquad
\times \sum_{\rho}
\binom{\ell+m’} {\rho}\, \binom{\ell-m’} {\ell-\rho-m}\,
\left( - \frac{\lvert \quat{R}_{b} \rvert^2}
{\lvert \quat{R}_{a} \rvert^2} \right)^{\rho}.
\end{align}
Typically, this last expression can be evaluated pretty efficiently.
But to make it really fast, accurate, and robust, we have to use some tricks. First of all, the various combinatorical factors are expensive to re-evaluate each time. The obvious way to handle this is just to pre-compute them, and have a function returning the appropriate values with some simple indexing tricks. These functions are implemented in this module’s initialization code.
Now, that sum is essentially a polynomial, and the best way to evaluate a polynomial uses Horner form — which is both faster and more accurate than the naive approach. Also, since the coefficients involve factorials of the summation index, we can factor out the $\rho_{\text{min}}$ binomials, and be left with just a few factorials, which we can then evaluate more efficiently. This also allows us to pull out the lowest-order coefficient of that polynomial: $(-\lvert R_b \rvert / \lvert R_a \rvert)^{2 \rho_{\mathrm{min}}}$. We will see momentarily that being able to do this is very important.
That deals very nicely with the sum, but we also need to deal with the
factor in front of the sum. And there are some fairly subtle
complications to deal with when evaluating that factor. First of all,
complex exponentials are slow. Second, there are cases where those
terms can become really huge, only to be multiplied by something
really tiny, leaving us reasonable values in the neighborhood of 1.
But those exponents can be very large; if we’re looking for matrix
elements for $\ell=32$, the exponents will range from $-64$ all the
way up to $64$. And since the largest and smallest numbers python can
represent are in the neighborhood of $10^{323}$ and $10^{-308}$,
respectively, we can easily get nan
or 0
for calculations that
should actually give us something that is neither of those values —
for example, if $\lvert R_b \rvert \approx 10^{-6}$ and $-m’=m=-32$.
But these terms only appear due to our separation of powers into the
sum. If we now bring out the lowest factor from the sum, we can make
those exponents disappear. The problem now is to cancel the exponents
in a mixture of exponents of complex numbers and exponents of real
numbers, and to do so efficiently.
The solution to this problem is simple and, miraculously, makes the
code 2.3 times faster! We decompose $R_a$ and $R_b$ into magnitude
and phase, calculate the exponents to which those must be raised, and
then produce the result as a complex number from the separate
magnitude and phase. There are fast functions in the standard python
module cmath
that do these operations, and they are directly
supported in numba
since version 0.15, so this is quite a simple
solution to implement, also. The calculation of this prefactor is
given in this module
here.
But this doesn’t solve all our problems. The same issues can creep up if $\lvert R_a \rvert$ is small (though it requires a smaller magnitude, and occurs for a smaller subset of $m’,m$ values). In this case, it would be better if we could find an expression that sort of reverses the roles of $a$ and $b$. It turns out that this isn’t hard. In deriving Eq. \eqref{eq:DAnalytically}, a choice was made regarding the summation variable. We can simply transform that summation variable as $\rho \mapsto \ell-m-\rho$, and obtain \begin{multline*} \mathfrak{D}^{(\ell)}_{m’,m}(\quat{R}) = \sum_{\rho} \binom{\ell+m’} {\ell-m-\rho}\, \binom{\ell-m’} {\rho}\, (-1)^{\ell-m-\rho}\, \\\\ \times \quat{R}_{a}^{m’+m+\rho}\, \co{\quat{R}}_{a}^{\rho}\, \quat{R}_{b}^{\ell-m’-\rho}\, \co{\quat{R}}_{b}^{\ell-m-\rho}\, \sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\, (\ell-m’)! } }. \end{multline*} It’s interesting to note the symmetry with the earlier version of this equation; we’ve essentially just exchanged the labels $a$ and $b$, while also reversing the sign of $m’$, and multiplying by an overall factor of $(-1)^{\ell+m}$.
In any case, we can apply the same simplification to this expression
as before:
\begin{align}
\nonumber
\mathfrak{D}^{(\ell)}_{m’,m}(\quat{R})
&=
(-1)^{\ell-m}
\sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\, (\ell-m’)! } }
\quat{R}_{a}^{m’+m}\,
\quat{R}_{b}^{\ell-m’}\,
\co{\quat{R}}_{b}^{\ell-m}
\\\
\nonumber
&\qquad
\times \sum_{\rho}
\binom{\ell+m’} {\ell-m-\rho}\, \binom{\ell-m’} {\rho}\,
(-1)^\rho\,
\quat{R}_{a}^{\rho}\,
\co{\quat{R}}_{a}^{\rho}\,
\quat{R}_{b}^{-\rho}\,
\co{\quat{R}}_{b}^{-\rho}
\\\
\nonumber
&=
(-1)^{\ell-m}
\sqrt{ \frac{ (\ell+m)!\, (\ell-m)! } { (\ell+m’)!\, (\ell-m’)! } }
\quat{R}_{a}^{m’+m}\,
\quat{R}_{b}^{m-m’}\,
\lvert \quat{R}_{b} \rvert^{2\ell-2m}
\\\
\label{eq:D_RaLeqRb}
&\qquad
\times \sum_{\rho}
\binom{\ell+m’} {\ell-m-\rho}\, \binom{\ell-m’} {\rho}\,
\left( - \frac{ \lvert \quat{R}_{a} \rvert^2 }
{ \lvert \quat{R}_{b} \vert^2 } \right)^{\rho}
\end{align}
And again, we evaluate this cleverly, as above.
So we get four branches in our logic, with a different expression for $\mathfrak{D}$ in each branch:
Note that these expressions are valid even for half-integer values of $\ell$, noting that if $\ell$ is half-integer, then so must $m’$ and $m$ be. However, in the interests of fast implementation, the code in this module assumes integer values. (It would be simple for me to implement the more general case. If this is a functionality you need, please feel free to open an issue on this module’s github page to request it.)
I hope I don’t have to repeat my utter disdain for the use of Euler
angles. However, it is important to make contact with other
literature to be able to compare conventions. As noted
previously, the rotation performed by the
set $(\alpha, \beta, \gamma)$ of Euler angles (using conventions to
agree with
Wikipedia’s page
on $\mathfrak{D}$ matrices) can be written in quaternion form as
\begin{align*}
\quat{R}_{(\alpha, \beta, \gamma)} &= e^{\alpha\, \basis{z}/2}\,
e^{\beta\, \basis{y}/2}\, e^{\gamma\, \basis{z}/2}, \\\
&= \left(
\cos \frac{\alpha}{2}\, \cos \frac{\beta}{2}\, \cos \frac{\gamma}{2}
-\sin \frac{\alpha}{2}\, \cos \frac{\beta}{2}\, \sin \frac{\gamma}{2}
\right) \\\
&\qquad + \basis{x} \left(
\cos \frac{\alpha}{2}\, \sin \frac{\beta}{2}\, \sin \frac{\gamma}{2}
-\sin \frac{\alpha}{2}\, \sin \frac{\beta}{2}\, \cos \frac{\gamma}{2}
\right) \\\
&\qquad + \basis{y} \left(
\cos \frac{\alpha}{2}\, \sin \frac{\beta}{2}\, \cos \frac{\gamma}{2}
+\sin \frac{\alpha}{2}\, \sin \frac{\beta}{2}\, \sin \frac{\gamma}{2}
\right) \\\
&\qquad + \basis{z} \left(
\sin \frac{\alpha}{2}\, \cos \frac{\beta}{2}\, \cos \frac{\gamma}{2}
+\cos \frac{\alpha}{2}\, \cos \frac{\beta}{2}\, \sin \frac{\gamma}{2}
\right).
\end{align*}
Taking the complex components of this, we have
\begin{equation*}
\quat{R}_a = e^{i\,\alpha/2}\, \cos\frac{\beta}{2}\, e^{i\,\gamma/2},
\qquad
\quat{R}_b = e^{-i\,\alpha/2}\, \sin\frac{\beta}{2}\, e^{i\,\gamma/2}.
\end{equation*}
We can plug these values into, e.g., Eq. \eqref{eq:D_RaGeqRb}, and
get the standard, hideous, reprehensible form of the $\mathfrak{D}$
matrices in terms of Euler angles.
Again, of course, this is not a form that should be used for calculations, but could be useful for comparing conventions with antiquated research.
This, of course, is not a productive way of thinking about quaternions, but it is a very useful way of calculating with quaternions, since complex numbers are already built in to many languages. That is, this decomposition into two complex components is something that the user probably does not need to worry about. It is, however, an isomorphism between quaternions and the usual (awful) presentation of Pauli spinors as two-component complex vectors, which is closer to the flavor of Wigner’s original derivation. It is also how the code is actually implemented. ↩