Introduction:
Given the usefulness of partial derivatives for closed-loop control, it is natural to ask how large branching structures in the brain and other biological systems might compute derivatives. After some reflection I realised that an important result in complex analysis due to Cauchy, the Cauchy Integral Formula, may be used to compute derivatives with a simple forward propagation of signals using a Monte Carlo method.
In this article I introduce Cauchy's formula, explain how it has a natural interpretation as an expected value, and demonstrate its reliability using concrete applications. These include gradient descent (also discovered by Cauchy), computing partial derivatives, and simulating a spring pendulum using the Euler-Lagrange equations.
Towards the end of the article, in the discussion section, I also address more conceptual issues and provide a natural physical interpretation of complex numbers in terms of waves.
Note: While reading this article, you may find it helpful to experiment with the functions in the accompanying Jupyter notebook.
Cauchy’s Integral Formula for derivatives:
Derivation of the formula:
The Cauchy Integral Formula for the first derivative is defined as follows for functions $f: U \to \mathbb{C}$, where $f$ is assumed to be differentiable on an open set $U \subseteq \mathbb{C}$ and $\gamma$ is a simple, closed, piecewise-smooth and positively oriented curve in $U$:

\begin{equation} f'(z_0) = \frac{1}{2\pi i} \oint_{\gamma} \frac{f(z)}{(z-z_0)^2} dz \end{equation}
An important consequence of this definition is that (1) is applicable to any holomorphic function, which includes polynomials with complex coefficients, trigonometric functions and hyperbolic functions. The keyword here is polynomials: due to Taylor's theorem, any sufficiently smooth real-valued function may be locally approximated by polynomials, so Cauchy's method is applicable, at least approximately, to a very broad class of differentiable functions.
In order to apply (1) numerically it is convenient to derive a simpler formulation using the unit circle as the boundary. By letting $\gamma(\theta) = z_0 + e^{i\theta}$ with $\theta \in [0, 2\pi]$, so that $dz = ie^{i\theta} d\theta$ and $(z-z_0)^2 = e^{2i\theta}$, we obtain:

\begin{equation} \forall z_0 \in A, f'(z_0) = \frac{1}{2\pi} \int_{0}^{2\pi} f(z_0+e^{i\theta}) \cdot e^{-i\theta} d\theta \end{equation}
and if we are mainly interested in functions of a real variable then $z_0 = x \in \mathbb{R}$ and the region $A$ swept out by these contours is of the form:

\begin{equation} A = \{x + e^{i\theta}: x, \theta \in \mathbb{R}\} \end{equation}
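As a quick consistency check of (2), consider the polynomial $f(z) = z^2$, a worked example I have added here. Expanding the integrand gives:

\begin{equation*} \frac{1}{2\pi} \int_{0}^{2\pi} (z_0 + e^{i\theta})^2 \cdot e^{-i\theta} d\theta = \frac{1}{2\pi} \int_{0}^{2\pi} \left( z_0^2 e^{-i\theta} + 2z_0 + e^{i\theta} \right) d\theta = 2z_0 \end{equation*}

since the oscillatory terms $e^{\pm i\theta}$ integrate to zero over a full period, and $2z_0$ is indeed the derivative of $z^2$.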
Deterministic computation of derivatives:
Given (2), if $\delta$ denotes the discretisation step in the discrete case, then $N = \text{round}(2\pi/\delta)$ samples $\theta_n = n \cdot \delta$ are taken, and we may implement this integration procedure in Julia as follows:
function nabla(f, x::Float64, delta::Float64)
    N = round(Int, 2*pi/delta)
    thetas = collect(1:N)*delta
    ## collect arguments and rotations:
    rotations = map(theta -> exp(-im*theta), thetas)
    arguments = x .+ conj.(rotations)
    ## calculate expectation:
    expectation = 1.0/N*real(sum(map(f, arguments).*rotations))
    return expectation
end
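As a quick sanity check (my usage example, not from the original text), we may compare nabla against a derivative we know in closed form:

## the derivative of x^2 at x = 3.0 should be 6.0:
delta = 2*pi/100
nabla(x -> x^2, 3.0, delta) ## returns approximately 6.0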
If you are familiar with Julia you may notice that, as with any finite summation, the result is invariant to the order of the operations. This means the sum may be evaluated in a tree-like structure where a large number of local computations occur in parallel. However, given that most biological networks are inherently noisy, some scientists may object that this program is deterministic.
In the next section I explain why this is generally a non-issue. On the contrary, random sampling using intrinsic noise allows cheap, fast and reliable Monte Carlo estimates of an integral.
Cauchy’s formula as a Monte Carlo method:
Interpretation as an Expected Value:
If we define the real-valued function:
\begin{equation} g: \theta \mapsto \text{Re}(f(x+e^{i\theta})\cdot e^{-i\theta}) \end{equation}
then, by (2), $\text{Re}(f'(x))$ is the average value of $g$ over $[0,2\pi]$, so the intermediate-value theorem yields:

\begin{equation} \forall x \in A, \exists \theta^* \in [0,2\pi]: g(\theta^*) = \text{Re}(f'(x)) \end{equation}
It follows that we may interpret $\text{Re}(f'(x)) = \int_{0}^{2\pi} g(\theta) \cdot \frac{1}{2\pi} d\theta$ as an expected value, where the factor $\frac{1}{2\pi}$ is the value of the probability density function of a continuous uniform distribution on $[0,2\pi]$. This may serve as a basis for cheap and stochastic gradient estimates using Monte Carlo methods.
Monte Carlo estimates of the gradient:
It is worth noting that (2) isn't a high-dimensional integral. Nevertheless, a biological network may be inherently noisy and face severe computational constraints, so a Monte Carlo method may be both useful and a lot more biologically plausible.
Using half the number of samples, this may be implemented in Julia simply by adding one line:
function mc_nabla(f, x::Float64, delta::Float64)
    N = round(Int, 2*pi/delta)
    ## sample with only half the number of points:
    sample = rand(1:N, round(Int, N/2))
    thetas = sample*delta
    ## collect arguments and rotations:
    rotations = map(theta -> exp(-im*theta), thetas)
    arguments = x .+ conj.(rotations)
    ## calculate expectation:
    expectation = 2.0/N*real(sum(map(f, arguments).*rotations))
    return expectation
end
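Since mc_nabla samples randomly, repeated calls give slightly different estimates. A brief usage sketch, which I have added and which assumes the functions defined above:

## five noisy estimates of the derivative of sin at 1.0, i.e. cos(1.0) ≈ 0.5403:
delta = 2*pi/100
estimates = [mc_nabla(sin, 1.0, delta) for i in 1:5]
println(estimates)
## deterministic reference value:
println(nabla(sin, 1.0, delta))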
I'd like to add that by sampling randomly using intrinsic noise we take into account the epistemological uncertainty of the biological network, which doesn't generally know how to sample optimally for arbitrary functions. So if we assume bounded computational resources, this would generally be a better procedure to follow even in the absence of intrinsic noise.
Practical Applications:
Performing gradient descent:
Given the great importance of gradient descent in machine learning, the reader might wonder whether Cauchy’s definition of the complex derivative may be used to perform gradient descent. The answer is yes:
function gradient_descent(f, x_p::Float64, alpha::Float64, delta::Float64)
    ## 100 steps of gradient descent, where delta is the step size used by nabla:
    for i = 1:100
        x_n = x_p - alpha*nabla(f, x_p, delta)
        x_p = x_n
    end
    return x_p
end
…which may be tested on concrete problems such as this one:
## function:
g(x) = (x-1)^2 + (x-2)^4 + (x-3)^6
## initial value:
x_p = 5.0
## learning rate:
alpha = 0.01
## integration step for nabla:
delta = 2*pi/100
x_min = gradient_descent(g, x_p, alpha, delta)
The reader should find a minimum at approximately $x \approx 2.17$, which Wolfram Alpha agrees with. A good sign.
Computing Partial Derivatives:
We may also use Cauchy’s formula to compute partial derivatives of functions of several variables such as:
\begin{equation} q(x,y,z) = x + y^2 + \cos(z) \end{equation}
Using the Kronecker delta, $\delta_{ij}$, which equals $1$ when $i = j$ and $0$ otherwise, we may hold all but one variable of $q$ constant and thereby construct three functions of a single variable. I could explain how to proceed but I think this is better left as an exercise for the reader; one possible solution is sketched below.
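Here is a minimal sketch of that solution, assuming the nabla function defined earlier; the helper names kd and partial are mine, not the author's:

## q as defined above, taking a vector argument:
q(v) = v[1] + v[2]^2 + cos(v[3])

## Kronecker delta as an indicator vector:
kd(i, n) = [j == i for j in 1:n]

## partial derivative of q with respect to coordinate i at the point v:
function partial(q, v::Vector{Float64}, i::Int, delta::Float64)
    ## freeze every coordinate except i, which becomes the single variable:
    slice = z -> q(v.*(ones(length(v)) - kd(i, length(v))) .+ z*kd(i, length(v)))
    return nabla(slice, v[i], delta)
end

## example: the partial derivative with respect to y at (1.0, 2.0, pi) should be 2y = 4.0:
partial(q, [1.0, 2.0, float(pi)], 2, 2*pi/100)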
In any case, we are now ready to consider the more interesting challenge of simulating dynamical systems, which is very useful for an organism capable of counterfactual reasoning.
Simulating a Spring Pendulum using the Euler-Lagrange equations:
Let's consider a pendulum with mass $m$ on the end of a spring with stiffness $k$ and equilibrium length $l$. By doing a drawing, you may discover the auxiliary variable $x$, the extension of the spring beyond its equilibrium length:

\begin{equation} x = \Delta l \end{equation}
After a bit more reflection, you should find the following expressions for the kinetic energy $T$ and the potential energy $V$:
\begin{equation} T(\dot{x},\dot{\theta}) = \frac{1}{2}m(\dot{x}^2+(l+x)^2\dot{\theta}^2) \end{equation}
\begin{equation} V(x,\theta) = -mg(l+x)\cos(\theta) + \frac{1}{2}kx^2 \end{equation}
We may then define the Lagrangian:
\begin{equation} \mathcal{L}(x,\theta,\dot{x},\dot{\theta}) = T-V = \frac{1}{2}m(\dot{x}^2+(l+x)^2\dot{\theta}^2)+mg(l+x)\cos(\theta) - \frac{1}{2}kx^2 \end{equation}
which may be defined in Julia:
## The Lagrangian, where we assume the length of the unextended spring is 1.0 m,
## the mass is 1.0 kg and the stiffness is 1.0 N/m; the state is X = [x, theta, dx, dtheta]:
L(X) = 0.5*(X[3]^2 + (1.0 + X[1])^2*X[4]^2) + 9.8*(1.0 + X[1])*cos(X[2]) - 0.5*X[1]^2
Given initial conditions, we may simulate this pendulum system using the Euler-Lagrange equations:
\begin{equation} \frac{\partial \mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{x}} \end{equation}
\begin{equation} \frac{\partial \mathcal{L}}{\partial \theta} = \frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{\theta}} \end{equation}
where $\frac{\partial \mathcal{L}}{\partial x}$ and $\frac{\partial \mathcal{L}}{\partial \theta}$ describe generalised forces acting on the system with respect to $x$ and $\theta$ respectively.
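For this particular Lagrangian, the left-hand sides may also be written in closed form (a worked step I have added for reference; the simulation below estimates these partial derivatives numerically instead):

\begin{equation*} \frac{\partial \mathcal{L}}{\partial x} = m(l+x)\dot{\theta}^2 + mg\cos(\theta) - kx \end{equation*}

\begin{equation*} \frac{\partial \mathcal{L}}{\partial \theta} = -mg(l+x)\sin(\theta) \end{equation*}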
The full simulation in Julia for 100 time-steps using the leapfrog method may then be defined:
## The Lagrangian, with unextended spring length 1.0 m, mass 1.0 kg and
## stiffness 1.0 N/m, where X = [x, theta, dx, dtheta]:
L(X) = 0.5*(X[3]^2 + (1.0 + X[1])^2*X[4]^2) + 9.8*(1.0 + X[1])*cos(X[2]) - 0.5*X[1]^2

## kronecker delta, which will come in handy:
kd(i, n) = [j == i for j in 1:n]

## integration step for nabla and the simulation time-step:
delta = 2*pi/100
dt = 0.01

## All four variables will be stored in this array:
Z = zeros(100, 4)

## The initial conditions:
Z[1,:] = [1.0, pi/4, 0.0, 0.0]

## simulate the pendulum system:
for i = 1:99
    ## single-variable slices of the Lagrangian in x and theta:
    dd_x     = z -> L(Z[i,:].*(ones(4) - kd(1,4)) .+ z*kd(1,4))
    dd_theta = z -> L(Z[i,:].*(ones(4) - kd(2,4)) .+ z*kd(2,4))
    ## update position, treating the generalised force as the acceleration
    ## (NB: for theta this neglects the (l+x)^2 inertia factor in d/dt(dL/dtheta')):
    Z[i+1,1] = Z[i,1] + Z[i,3]*dt + 0.5*nabla(dd_x, Z[i,1], delta)*dt^2
    ## update angle:
    Z[i+1,2] = Z[i,2] + Z[i,4]*dt + 0.5*nabla(dd_theta, Z[i,2], delta)*dt^2
    ## update dx and dtheta using the average of the old and new forces:
    Z[i+1,3] = Z[i,3] + 0.5*(nabla(dd_x, Z[i,1], delta) + nabla(dd_x, Z[i+1,1], delta))*dt
    Z[i+1,4] = Z[i,4] + 0.5*(nabla(dd_theta, Z[i,2], delta) + nabla(dd_theta, Z[i+1,2], delta))*dt
end
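As a rough diagnostic, which I have added and which assumes the Z array produced above, we may track the total energy $E = T + V$ along the trajectory; the smaller its drift, the better the scheme is behaving:

## total energy with the same constants as the Lagrangian above:
E(X) = 0.5*(X[3]^2 + (1.0 + X[1])^2*X[4]^2) - 9.8*(1.0 + X[1])*cos(X[2]) + 0.5*X[1]^2
energies = [E(Z[i,:]) for i in 1:100]
println(maximum(energies) - minimum(energies))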
Discussion:
At this point the reader might still have questions. I can't address all of them here, but I will try to address the physical interpretation of complex numbers.
It's worth noting that complex numbers, $z = r \cdot e^{i\theta}$, have both a magnitude $r$ and an angle $\theta$. Therefore multiplication by a complex number has a geometric interpretation as re-scaling by $r$ and rotation by $\theta$. This leads to a natural physical interpretation.
For any $z = r \cdot e^{i\theta}$ we may define the frequency $\nu = \frac{\theta}{2\pi}$, which specifies a sampling rate and may be physically interpreted as the propagation of a wave whose amplitude corresponds to $r$. So complex numbers may be represented in biological networks through the propagation of waves.
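For instance, multiplying $z = r \cdot e^{i\theta}$ by $w = s \cdot e^{i\phi}$ yields $z \cdot w = rs \cdot e^{i(\theta + \phi)}$: amplitudes multiply and phases add, exactly as they do when waves are composed.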
References:
- L.D. Landau & E.M. Lifshitz. Mechanics (Volume 1 of A Course of Theoretical Physics). Pergamon Press, 1969.
- W. Rudin. Real and Complex Analysis. McGraw-Hill, 3rd ed., 1986.
- Aidan Rocke. AutoDiff (2020). GitHub repository, https://github.com/AidanRocke/AutoDiff
- Aidan Rocke. Twitter thread. https://twitter.com/bayesianbrain/status/1202650626653597698
- Duncan E. Donohue & Giorgio A. Ascoli. A Comparative Computer Simulation of Dendritic Morphology. PLOS Computational Biology, June 6, 2008.
- Michael London & Michael Häusser. Dendritic Computation. Annual Review of Neuroscience, 2005, 28:503-532.
- Warren S. McCulloch & Walter Pitts. A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, Vol. 52, No. 1/2, pp. 99-115, 1990.