Optimal Portfolio Weights Under CRRA Utility

Summary

This series of essays explores the optimization of portfolio weights to maximize a Constant Relative Risk Aversion (CRRA) utility function over an agent's wealth. We use classic stochastic calculus techniques to model price processes as Geometric Brownian Motion (GBM). In Part I, we derive the optimal allocation between two risky assets and find our solution is an extension of the famous Merton Share. In Part II, we extend the analysis to three assets and then to an n-asset model. In Part III, we examine what happens when we change the numeraire from cash to a risky asset.


Part I: The Binary Asset Model

Model Definition

Suppose we have a universe of two stocks, $A$ and $B$, modeled as independent GBM processes with parameters $\mu_A$, $\mu_B$, $\sigma_A$, and $\sigma_B$. We also define $\lambda_A$ and $\lambda_B$ to be our portfolio weights for assets $A$ and $B$, such that $\lambda_A + \lambda_B = 1$. Finally, we have a CRRA utility function over possible wealth states $W$ such that $U(W) = \frac{W^{1-\gamma} - 1}{1 - \gamma}$, where $\gamma > 0$ (with $\gamma \neq 1$) is our relative risk aversion parameter. In what follows, we find the optimal portfolio weights $\lambda$ which maximize the expected utility of our future wealth.


Deriving a Closed Form Expected Utility Function

We intend to find the portfolio allocation $[\lambda_A, \lambda_B]$ which maximizes the expected utility of our wealth in the next period, that is: $\underset{\lambda_A, \lambda_B}{\max} \, E[U(W_{t+dt})].$

Incorporating our CRRA utility function yields $E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right]$.

We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$ and $\lambda_B$:

$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}}.$$

Similarly, we note that each stock's price follows a GBM, defined by the stochastic differential equations (SDEs)

$$\begin{align*} dS_{A,t} &= \mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t} \\ dS_{B,t} &= \mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}, \end{align*}$$

where $N_{i,t}$ is our notation for the Wiener process driving asset $i$ at time $t$.

Substituting both individual asset processes into our wealth SDE yields

$$dW_t = \lambda_A \frac{W_t}{S_{A,t}} (\mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t}) + \lambda_B \frac{W_t}{S_{B,t}} (\mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}),$$

which we can simplify to

$$dW_t = W_t \left[ \lambda_A (\mu_A \, dt + \sigma_A \, dN_{A,t}) + \lambda_B (\mu_B \, dt + \sigma_B \, dN_{B,t}) \right].$$
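To make the wealth dynamic concrete, here is a minimal simulation sketch. It assumes a simple Euler discretization with a fixed step `dt`, rebalancing to constant weights every step; the function name, parameter values, and seed are illustrative choices, not part of the derivation.

```python
import math
import random


def simulate_wealth_path(w0, lam_a, mu, sigma, dt, n_steps, seed=0):
    """Euler steps of dW = W * sum_i lam_i (mu_i dt + sigma_i dN_i) for two independent assets."""
    rng = random.Random(seed)
    lam = (lam_a, 1.0 - lam_a)  # weights are constrained to sum to one
    path = [w0]
    for _ in range(n_steps):
        x = 0.0
        for l, m, s in zip(lam, mu, sigma):
            dn = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Wiener increment, Normal(0, dt)
            x += l * (m * dt + s * dn)
        path.append(path[-1] * (1.0 + x))  # W_{t+dt} = W_t (1 + x)
    return path


# Hypothetical parameters: one year of daily steps.
path = simulate_wealth_path(
    w0=100.0, lam_a=0.6, mu=(0.08, 0.05), sigma=(0.2, 0.3), dt=1 / 252, n_steps=252
)
print(f"terminal wealth after one simulated year: {path[-1]:.2f}")
```

The quantity `x` in the loop is the one-step portfolio return, which is the same $x$ that the Taylor expansion later in this section is applied to.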

Now, we can substitute this into our expected utility equation:

$$E[U(W_{t+dt})] = E\left[\frac{(W_t + W_t (\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})))^{1-\gamma}-1}{1-\gamma}\right].$$

Because the constants $-1$ and $1-\gamma$ pass through the expectation, this simplifies to:

$$\frac{E\left[(W_t + W_t (\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})))^{1-\gamma}\right] - 1}{1-\gamma}.$$

Since $W_t$ is known at time $t$, we can pull $W_t^{1-\gamma}$ out of the expectation, yielding

$$\frac{W_t^{1-\gamma} E\left[\left(1 + \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})\right)^{1-\gamma}\right] - 1}{1-\gamma}.$$

We now consider the second-order Taylor series expansion of $(1 + x)^{1-\gamma}$ around $x = 0$, because we know $x$ will be very small since we're dealing with an infinitesimally small time increment $dt$. In this case, $x=\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})$.

We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots$$

and we are careful to include the second-order term, which carries our volatility parameters.

With $f(x) = (1+x)^{1-\gamma}$ and $a = 0$, we have $f'(0) = 1-\gamma$ and $f''(0) = (1-\gamma)(-\gamma)$. Taking expectations term by term, this implies that $E[(1 + x)^{1-\gamma}] \approx 1 + (1 - \gamma)E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2]$.
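As a quick numerical sanity check on this approximation, the sketch below compares $(1+x)^{1-\gamma}$ with its second-order expansion for shrinking $x$. The function names and the value $\gamma = 3$ are illustrative assumptions.

```python
def crra_power(x, gamma):
    """Exact value of (1 + x)^(1 - gamma)."""
    return (1.0 + x) ** (1.0 - gamma)


def crra_power_taylor2(x, gamma):
    """Second-order Taylor approximation of (1 + x)^(1 - gamma) around x = 0."""
    return 1.0 + (1.0 - gamma) * x + 0.5 * (1.0 - gamma) * (-gamma) * x * x


gamma = 3.0  # hypothetical risk aversion
for x in (0.1, 0.01, 0.001):
    err = abs(crra_power(x, gamma) - crra_power_taylor2(x, gamma))
    print(f"x={x:<6} approximation error={err:.3e}")
```

The error shrinks roughly like $x^3$, which is why truncating after the second-order term is reasonable for a small time step.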

Using $x=\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})$, we see that $E[x] = (\lambda_A \mu_A + \lambda_B \mu_B)dt$ because $E[dN_{i,t}]=0$.

Furthermore, to solve for $E[x^2]$, we substitute in $x$ which gives us $E[x^2] = E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2].$

From here, we expand out the expression to

$$E[x^2] = E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2 + 2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t}) (\mu_B dt + \sigma_B dN_{B,t}) + (\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2].$$

We now have to do some algebra to untangle this a bit further by expanding each of these terms.

$$(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2 = \lambda_A^2 \mu_A^2 dt^2 + 2\lambda_A^2 \mu_A \sigma_A dt \, dN_{A,t} + \lambda_A^2 \sigma_A^2 dN_{A,t}^2$$

$$(\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2 = \lambda_B^2 \mu_B^2 dt^2 + 2\lambda_B^2 \mu_B \sigma_B dt \, dN_{B,t} + \lambda_B^2 \sigma_B^2 dN_{B,t}^2$$

$$2\lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t}) = 2\lambda_A \lambda_B \mu_A \mu_B dt^2 + 2\lambda_A \lambda_B \mu_A \sigma_B dt \, dN_{B,t} + 2\lambda_A \lambda_B \mu_B \sigma_A dt \, dN_{A,t} + 2\lambda_A \lambda_B \sigma_A \sigma_B dN_{A,t} dN_{B,t}$$

At this point, some properties of Brownian motion come to our aid, particularly that $E[dN_{i,t}]=0$, $E[dN_{i,t}^2]=dt$, and $E[dN_{A,t} dN_{B,t}]=0$ (since $A$ and $B$ are independent processes). Terms in $dt^2$ are of higher order than $dt$ and can be dropped.

Taking expectations, we can make the following simplifications:

$$E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2] = \lambda_A^2 \sigma_A^2 dt$$

$$E[(\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2] = \lambda_B^2 \sigma_B^2 dt$$

$$E[2\lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t})] = 0$$

Putting it all together, the simplified expression is $E[x^2] = \lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt$.
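The two moments can be spot-checked by simulation. The following is a rough Monte Carlo sketch, assuming a small but finite `dt` in place of the infinitesimal increment; the weights, drifts, volatilities, sample size, and seed are all hypothetical.

```python
import math
import random


def sample_moments(lam, mu, sigma, dt, n_samples=200_000, seed=1):
    """Monte Carlo estimates of E[x] and E[x^2] for x = sum_i lam_i (mu_i dt + sigma_i dN_i)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    total, total_sq = 0.0, 0.0
    for _ in range(n_samples):
        # Independent Wiener increments: dN_i = sqrt(dt) * Z_i with Z_i standard normal.
        x = sum(
            l * (m * dt + s * sqrt_dt * rng.gauss(0.0, 1.0))
            for l, m, s in zip(lam, mu, sigma)
        )
        total += x
        total_sq += x * x
    return total / n_samples, total_sq / n_samples


lam, mu, sigma, dt = (0.5, 0.5), (0.08, 0.05), (0.2, 0.3), 0.01
mean_x, mean_x2 = sample_moments(lam, mu, sigma, dt)

predicted_mean = sum(l * m for l, m in zip(lam, mu)) * dt
predicted_x2 = sum((l * s) ** 2 for l, s in zip(lam, sigma)) * dt
print(f"E[x]:   simulated {mean_x:.6f}, predicted {predicted_mean:.6f}")
print(f"E[x^2]: simulated {mean_x2:.6f}, predicted {predicted_x2:.6f}")
```

With a finite step, the simulated $E[x^2]$ also contains the $dt^2$ term that the derivation drops, so the match is only approximate.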

Returning back to the earlier expectation we've been trying to solve with these new results in hand, we see that

$$E[(1 + x)^{1-\gamma}] = 1 + (1 - \gamma)(\lambda_A \mu_A + \lambda_B \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt).$$

This means we can now write our full expected utility maximization equation as:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left(1 + (1 - \gamma)(\lambda_A \mu_A + \lambda_B \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2)dt\right) - 1}{1-\gamma}$$

Quick aside: my hunch is that if we were to extend to a model with $n$ independent assets, this becomes

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left(1 + (1 - \gamma)\left(\sum_{i=1}^{n} \lambda_i \mu_i\right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left(\sum_{i=1}^{n} \lambda_i^2 \sigma_i^2\right)dt\right) - 1}{1-\gamma}$$
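The conjectured $n$-asset expression is easy to write as a function, which makes it simple to compare against the two-asset case. This is a sketch under the same assumptions as the text (independent assets, leading order in `dt`); the function name and the sample numbers are hypothetical.

```python
def expected_utility_one_step(w, lam, mu, sigma, gamma, dt):
    """Leading-order E[U(W_{t+dt})] for independent GBM assets with weights lam."""
    drift = sum(l * m for l, m in zip(lam, mu))                # sum_i lam_i mu_i
    variance = sum((l * s) ** 2 for l, s in zip(lam, sigma))   # sum_i lam_i^2 sigma_i^2
    bracket = (
        1.0
        + (1.0 - gamma) * drift * dt
        + 0.5 * (1.0 - gamma) * (-gamma) * variance * dt
    )
    return (w ** (1.0 - gamma) * bracket - 1.0) / (1.0 - gamma)


# A third asset with zero weight should not change the two-asset value.
two = expected_utility_one_step(2.0, (0.6, 0.4), (0.08, 0.05), (0.2, 0.3), 3.0, 0.01)
three = expected_utility_one_step(
    2.0, (0.6, 0.4, 0.0), (0.08, 0.05, 0.03), (0.2, 0.3, 0.1), 3.0, 0.01
)
print(two, three)
```

Setting `dt=0` recovers the current utility $U(W_t)$, which is a useful consistency check on the formula.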

Now, because we're in a dual-asset model where the weights sum to 100%, $\lambda_B$ is determined to be $1-\lambda_A$. We can substitute this into the expression above, which gives us:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left[1 + (1 - \gamma)(\lambda_A \mu_A + (1-\lambda_A) \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 + (1-\lambda_A)^2 \sigma_B^2)dt\right] - 1}{1-\gamma}$$

Optimizing Portfolio Weights to Maximize Expected Utility

In order to maximize $E[U(W_{t+dt})]$, we can follow the classic method of differentiating the function with respect to $\lambda_A$ and setting this derivative equal to zero.

First, we notice that $W_t^{1-\gamma}$ and $\frac{1}{1 - \gamma}$ are constants with respect to $\lambda_A$, so we can focus on differentiating only the bracketed expression, which we denote as $f(\lambda_A)$:

$$f(\lambda_A) = 1 + (1 - \gamma)(\lambda_A \mu_A + (1 - \lambda_A) \mu_B)dt + \frac{(1-\gamma)(-\gamma)}{2} \left( \lambda_A^2 \sigma_A^2 + (1 - \lambda_A)^2 \sigma_B^2 \right) dt.$$

Now, we differentiate $f(\lambda_A)$ with respect to $\lambda_A$:

$$\frac{df}{d\lambda_A} = (1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A \sigma_A^2 - (1 - \lambda_A) \sigma_B^2 \right) dt.$$

This simplifies to:

$$\frac{df}{d\lambda_A} = (1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2 \right) dt.$$
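The derivative can be checked against a central finite difference. The sketch below assumes hypothetical parameter values and passes variances rather than volatilities (`var_a` stands for $\sigma_A^2$, `var_b` for $\sigma_B^2$); because $f$ is quadratic in $\lambda_A$, the central difference should agree up to rounding.

```python
def f(lam_a, mu_a, mu_b, var_a, var_b, gamma, dt):
    """Bracketed expression f(lambda_A) with lambda_B = 1 - lambda_A."""
    lam_b = 1.0 - lam_a
    drift = lam_a * mu_a + lam_b * mu_b
    variance = lam_a**2 * var_a + lam_b**2 * var_b
    return 1.0 + (1.0 - gamma) * drift * dt + 0.5 * (1.0 - gamma) * (-gamma) * variance * dt


def df_dlam_a(lam_a, mu_a, mu_b, var_a, var_b, gamma, dt):
    """Analytic derivative of f with respect to lambda_A, in its simplified form."""
    return (
        (1.0 - gamma) * (mu_a - mu_b) * dt
        + (1.0 - gamma) * (-gamma) * (lam_a * (var_a + var_b) - var_b) * dt
    )


params = dict(mu_a=0.08, mu_b=0.05, var_a=0.04, var_b=0.09, gamma=3.0, dt=0.01)
lam, h = 0.3, 1e-4
numeric = (f(lam + h, **params) - f(lam - h, **params)) / (2 * h)
analytic = df_dlam_a(lam, **params)
print(f"numeric={numeric:.8f} analytic={analytic:.8f}")
```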

To find the optimal $\lambda_A$, set $\frac{df}{d\lambda_A} = 0$ and solve for $\lambda_A$:

$$(1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2 \right) dt = 0.$$

We can see that $(1 - \gamma) dt$ is a common factor in both terms. Since $dt$ is an infinitesimal time increment (which importantly is not zero) and $\gamma \neq 1$, we can simplify the equation by dividing through by $(1 - \gamma) dt$:

$$\mu_A - \mu_B - \gamma (\lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2) = 0.$$

Now we rearrange the equation to solve for $\lambda_A$:

$$\gamma \lambda_A (\sigma_A^2 + \sigma_B^2) = \mu_A - \mu_B + \gamma \sigma_B^2$$

$$\lambda_A (\sigma_A^2 + \sigma_B^2) = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma}$$

From this we have found our optimal $\lambda_A$:

$$\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$

This stationary point is a maximum: $\frac{d^2 f}{d\lambda_A^2} = (1-\gamma)(-\gamma)(\sigma_A^2 + \sigma_B^2)\,dt$, and expected utility scales $f$ by $\frac{W_t^{1-\gamma}}{1-\gamma}$, so the second derivative of $E[U(W_{t+dt})]$ is $-\gamma W_t^{1-\gamma} (\sigma_A^2 + \sigma_B^2)\,dt < 0$ for $\gamma > 0$.

The optimal $\lambda_B$ follows easily:

$$\lambda_B = 1 - \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$
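The closed-form weights translate directly into code. The sketch below also evaluates the $\lambda$-dependent part of expected utility, $\lambda_A \mu_A + \lambda_B \mu_B - \frac{\gamma}{2}(\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2)$, to confirm that nearby weights do worse. Function names and parameter values are hypothetical.

```python
def optimal_weights_two_assets(mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Closed-form optimal weights for two independent GBM assets under CRRA utility."""
    denom = gamma * (sigma_a**2 + sigma_b**2)
    lam_a = (mu_a - mu_b + gamma * sigma_b**2) / denom
    return lam_a, 1.0 - lam_a


def growth_objective(lam_a, mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Part of E[U] that depends on the weights, per unit of W^(1-gamma) dt."""
    lam_b = 1.0 - lam_a
    drift = lam_a * mu_a + lam_b * mu_b
    variance = lam_a**2 * sigma_a**2 + lam_b**2 * sigma_b**2
    return drift - 0.5 * gamma * variance


args = dict(mu_a=0.08, mu_b=0.05, sigma_a=0.2, sigma_b=0.3, gamma=2.0)
lam_a, lam_b = optimal_weights_two_assets(**args)
print(f"lambda_A={lam_a:.4f}, lambda_B={lam_b:.4f}")

# Perturbing the weight in either direction should lower the objective.
best = growth_objective(lam_a, **args)
print(best >= growth_objective(lam_a + 0.01, **args))
print(best >= growth_objective(lam_a - 0.01, **args))
```

Note that the more volatile asset can still receive a positive weight even with a lower drift, because holding two independent assets diversifies the portfolio.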


Conclusion

Given two assets modeled as independent GBM processes, wealth $W_t$, and a CRRA utility function, we have found that the optimal allocation to asset $A$ is $\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$ and the optimal allocation to asset $B$ is $\lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$.

You might have noticed that the portfolio allocations $\lambda_A$ and $\lambda_B$ don't have a subscript $t$. This is because, given that the parameters of the price processes are constant and that our risk aversion parameter $\gamma$ does not change, they are time and wealth independent! This implies a constant fractional allocation to each stock in our portfolio.

Notably, in the case where $B$ is a risk-free investment, implying that $\sigma_B^2=0$, our optimal allocation reduces to $\lambda_A = \frac{\mu_A - \mu_B}{\gamma \sigma_A^2}$, which is the famous Merton Share!
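The reduction to the Merton Share can be confirmed numerically by setting the second asset's volatility to zero. In this sketch `r` plays the role of $\mu_B$ for the risk-free asset; the function names and numbers are illustrative assumptions.

```python
def lambda_a_two_risky(mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Optimal weight on asset A when both assets follow independent GBMs."""
    return (mu_a - mu_b + gamma * sigma_b**2) / (gamma * (sigma_a**2 + sigma_b**2))


def merton_share(mu, r, sigma, gamma):
    """Classic Merton Share: excess return over gamma times variance."""
    return (mu - r) / (gamma * sigma**2)


mu_a, r, sigma_a, gamma = 0.08, 0.03, 0.2, 2.0
two_asset = lambda_a_two_risky(mu_a, r, sigma_a, 0.0, gamma)  # sigma_B = 0
merton = merton_share(mu_a, r, sigma_a, gamma)
print(two_asset, merton)
```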

While all models are lossy, I take issue with the idea of a risk-free rate. In particular, the real returns on a nation's treasuries are sensitive to interest rate changes, inflation, and currency fluctuations. It also shouldn't be overlooked that big debt crises occur fairly regularly and nations do default. With this in mind, I think extending the dual-asset Merton Share model to two risky assets is an improvement toward realism.


Part II: Extension to Three Assets

In Part I, we derived the optimal allocations under an independent binary asset model where the two stocks follow geometric Brownian motion processes. We now extend the analysis to three assets and then to an n-asset model.

Model Definition

We have three assets $A$, $B$, and $C$ whose prices follow independent GBM processes with parameters $(\mu_A, \sigma_A)$, $(\mu_B, \sigma_B)$, and $(\mu_C, \sigma_C)$, respectively. We allocate our wealth $W$ between $A$, $B$, and $C$ in proportion $\lambda_A$, $\lambda_B$, and $\lambda_C$, respectively, such that $\lambda_A + \lambda_B + \lambda_C = 1$. We again maintain a Constant Relative Risk Aversion (CRRA) utility function $U(W) = \frac{W^{1-\gamma} - 1}{1 - \gamma}$ where $W_t$ is our wealth at time $t$ and $\gamma$ is our relative risk aversion parameter.

Deriving Our Expected Utility Function

We intend to find the portfolio allocation $[\lambda_A, \lambda_B, \lambda_C]$ which maximizes the expected utility of our wealth in the next period, that is: $\underset{\lambda_A, \lambda_B, \lambda_C}{\max} \, E[U(W_{t+dt})].$

After incorporating our CRRA utility function, we see $E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right]$.

We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$, $\lambda_B$, and $\lambda_C$:

$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}} + \lambda_C W_t \frac{dS_{C,t}}{S_{C,t}}.$$

Similarly, we note that each stock's price follows a GBM, defined by the stochastic differential equations (SDEs)

$$\begin{align*} dS_{A,t} &= \mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t} \\ dS_{B,t} &= \mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t} \\ dS_{C,t} &= \mu_C S_{C,t} \, dt + \sigma_C S_{C,t} \, dN_{C,t}, \end{align*}$$

where $N_{i,t}$ is our notation for the Wiener process driving asset $i$ at time $t$.

Substituting all three individual asset processes into our wealth SDE yields

$$dW_t = \lambda_A \frac{W_t}{S_{A,t}} (\mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t}) + \lambda_B \frac{W_t}{S_{B,t}} (\mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}) + \lambda_C \frac{W_t}{S_{C,t}} (\mu_C S_{C,t} \, dt + \sigma_C S_{C,t} \, dN_{C,t}),$$

which we can simplify to

$$dW_t = \lambda_A (\mu_A W_t \, dt + \sigma_A W_t \, dN_{A,t}) + \lambda_B (\mu_B W_t \, dt + \sigma_B W_t \, dN_{B,t}) + \lambda_C (\mu_C W_t \, dt + \sigma_C W_t \, dN_{C,t}),$$

and then to

$$dW_t = W_t \left[ \lambda_A (\mu_A \, dt + \sigma_A \, dN_{A,t}) + \lambda_B (\mu_B \, dt + \sigma_B \, dN_{B,t}) + \lambda_C (\mu_C \, dt + \sigma_C \, dN_{C,t}) \right].$$

Now we can substitute this into our expected utility equation:

$$E[U(W_{t+dt})] = E\left[ \frac {(W_t + W_t ( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) ))^{1-\gamma}-1} {1-\gamma} \right]$$

Because the constants $-1$ and $1-\gamma$ pass through the expectation, this simplifies to:

$$\frac {E\left[(W_t + W_t ( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) ))^{1-\gamma}\right] - 1} {1-\gamma}.$$

Since $W_t$ is known at time $t$, we can pull $W_t^{1-\gamma}$ out of the expectation, yielding

$$\frac {W_t^{1-\gamma} E\left[\left(1 + \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) \right)^{1-\gamma}\right] - 1} {1-\gamma}.$$

We now consider the second-order Taylor series expansion of $(1 + x)^{1-\gamma}$ around $x = 0$, because we know $x$ will be very small since we're dealing with an infinitesimally small time increment $dt$. In this case, $x= \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t})$.

We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots$$

and we are careful to include the second-order term, which carries our volatility parameters.

This implies that $E[(1 + x)^{1-\gamma}] \approx 1 + (1 - \gamma)E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2]$.

Using $x= \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t})$, we see that $E[x] = (\lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C)dt$ because $E[dN_{i,t}]=0$.

Furthermore, to solve for $E[x^2]$, we substitute in $x$ which gives us

$$E[x^2] = E[( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) )^2].$$

Though it seems a bit unwieldy, we can simplify this expression by first expanding the square and then using the properties of Wiener processes, notably that $E[dN_{i,t}] = 0$, $E[dN_{i,t}^2] = dt$, and $E[dN_{i,t} \, dN_{j,t}] = 0$ for $i \neq j$ (since the assets are independent).

Expanding the square gives:

$$\begin{align*} &\lambda_A^2 (\mu_A dt + \sigma_A dN_{A,t})^2 + \lambda_B^2 (\mu_B dt + \sigma_B dN_{B,t})^2 + \lambda_C^2 (\mu_C dt + \sigma_C dN_{C,t})^2 \\ &+ 2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t}) \\ &+ 2 \lambda_A \lambda_C (\mu_A dt + \sigma_A dN_{A,t})(\mu_C dt + \sigma_C dN_{C,t}) \\ &+ 2 \lambda_B \lambda_C (\mu_B dt + \sigma_B dN_{B,t})(\mu_C dt + \sigma_C dN_{C,t}) \end{align*}$$

We can now simplify each term by considering the properties of $dN_{i,t}$ noted before:

For terms like $\lambda_A^2 (\mu_A dt + \sigma_A dN_{A,t})^2$, the expansion gives $\lambda_A^2 \mu_A^2 dt^2 + 2 \lambda_A^2 \mu_A \sigma_A dt \, dN_{A,t} + \lambda_A^2 \sigma_A^2 dN_{A,t}^2$. When taking the expected value of this, the $dt \, dN_{A,t}$ term disappears, the $dt^2$ term is of higher order and is dropped, and $dN_{A,t}^2$ becomes $dt$, leaving $\lambda_A^2 \sigma_A^2 dt$.

The cross terms like $2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t})$ expand out to

$$2 \lambda_A \lambda_B \mu_A \mu_B dt^2 + 2 \lambda_A \lambda_B \mu_A \sigma_B dt \, dN_{B,t} + 2 \lambda_A \lambda_B \sigma_A \mu_B dt \, dN_{A,t} + 2 \lambda_A \lambda_B \sigma_A \sigma_B dN_{A,t} \, dN_{B,t}.$$

Each term in this expression vanishes in expectation to order $dt$: the $dt^2$ term is of higher order, the $dt \, dN_{i,t}$ terms vanish because $E[dN_{i,t}]=0$, and the last term vanishes because $E[dN_{A,t} \, dN_{B,t}]=0$ for independent processes.

After applying these simplifications, the expected value expression becomes:

$$E\left[\left( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) \right)^2\right] = (\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2)dt.$$

Returning back to the earlier expectation we've been trying to solve with these new results in hand, we see that

$$E[(1 + x)^{1-\gamma}] = 1 + (1 - \gamma)( \lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C )dt + \frac{(1 - \gamma)(-\gamma)}{2} ( \lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt + \lambda_C^2 \sigma_C^2 dt ).$$

This means we can now write our full expected utility maximization equation as:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left[1 + (1 - \gamma)( \lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C )dt + \frac{(1 - \gamma)(-\gamma)}{2} ( \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2 )dt\right] - 1}{1-\gamma}$$

We make one final adjustment by using our $\lambda_A + \lambda_B + \lambda_C = 1$ constraint to remove one degree of freedom from our model.

We first substitute $\lambda_C = 1 - \lambda_A - \lambda_B$ into the linear term: $\lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C = \lambda_A \mu_A + \lambda_B \mu_B + (1 - \lambda_A - \lambda_B) \mu_C = \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C$. This also makes intuitive sense. Initially the expression was the sum of each allocation percentage times the expected return of that investment, which is the expected return of our portfolio. The final expression is the same expected return, except we can conceptualize it as 100% of our portfolio returning $\mu_C$, and then for each non-$C$ asset we compute how much more or less we'd make on that fraction of our portfolio against a $C$-based benchmark.

Now we need to handle the quadratic term, $\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2$. We substitute out $\lambda_C$, which yields $\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + (1 - \lambda_A - \lambda_B)^2 \sigma_C^2$. We note that generally $(1 - \sum_{i=1}^{n} x_i)^2 = 1 - 2 \sum_{i=1}^{n} x_i + 2 \sum_{1 \leq i < j \leq n} x_ix_j + \sum_{i=1}^{n} x_i^2$. This generalization will help us when we extend to the n-asset framework, but we can use it in our three-asset model too.

Now we expand and simplify the quadratic term for $\lambda_C$:

$$\begin{align*} &= \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + ( 1 - 2\lambda_A - 2\lambda_B + 2\lambda_A\lambda_B + \lambda_A^2 + \lambda_B^2 ) \sigma_C^2 \\ &= \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + 2\lambda_A\lambda_B \sigma_C^2 + \lambda_A^2 \sigma_C^2 + \lambda_B^2 \sigma_C^2 \\ &= \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \end{align*}$$
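Expansions like this are easy to get wrong by a sign, so a numerical spot check is worthwhile. The sketch below tests both the general identity for $(1 - \sum_i x_i)^2$ and the final three-asset form at arbitrary, hypothetical values; `var_a`, `var_b`, and `var_c` stand for $\sigma_A^2$, $\sigma_B^2$, and $\sigma_C^2$.

```python
from itertools import combinations


def square_of_one_minus_sum(xs):
    """Left-hand side: (1 - sum_i x_i)^2."""
    return (1.0 - sum(xs)) ** 2


def expanded_square(xs):
    """Right-hand side: 1 - 2 sum x_i + 2 sum_{i<j} x_i x_j + sum x_i^2."""
    pair_sum = sum(a * b for a, b in combinations(xs, 2))
    return 1.0 - 2.0 * sum(xs) + 2.0 * pair_sum + sum(x * x for x in xs)


def quadratic_term_direct(lam_a, lam_b, var_a, var_b, var_c):
    """Quadratic term with lambda_C = 1 - lambda_A - lambda_B substituted in."""
    lam_c = 1.0 - lam_a - lam_b
    return lam_a**2 * var_a + lam_b**2 * var_b + lam_c**2 * var_c


def quadratic_term_expanded(lam_a, lam_b, var_a, var_b, var_c):
    """Final grouped form of the quadratic term."""
    return (
        lam_a**2 * (var_a + var_c)
        + lam_b**2 * (var_b + var_c)
        + 2.0 * lam_a * lam_b * var_c
        - 2.0 * lam_a * var_c
        - 2.0 * lam_b * var_c
        + var_c
    )


xs = [0.2, -0.4, 0.7, 0.1]
print(square_of_one_minus_sum(xs), expanded_square(xs))
print(quadratic_term_direct(0.3, 0.5, 0.04, 0.09, 0.01),
      quadratic_term_expanded(0.3, 0.5, 0.04, 0.09, 0.01))
```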

We can now substitute these simplified expressions back into the original equation:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left( 1 + (1 - \gamma) \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C\right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) - 1}{1-\gamma}$$

This expression for our expected utility incorporates all of the constraints of our model now that $\lambda_C$ has been eliminated and replaced by solely $\lambda_A$ and $\lambda_B$. Crucially, this removes a degree of freedom from our model and leaves an unconstrained problem in two variables, whose first-order conditions form a linear system we can solve.

Optimizing Portfolio Weights to Maximize Expected Utility (Take Three)

We know that the maximium of E[U(Wt+dt)]E[U(W_{t+dt})] w.r.t. our λi\lambda_is will have a tangent plane with zero gradient in the λA\lambda_A and λB\lambda_B directions. That is, dE[U(Wt+dt)]dλi=0\frac{dE[U(W_{t+dt})]}{d\lambda_i}=0 for i{1,2}i \in \{1,2\}. From this we will get a system of equations which we can then solve to get our optimal portfolio allocations. We start by solving for ddλAE[U(Wt+dt)]\frac{d}{d\lambda_A}E[U(W_{t+dt})].

ddλAE[U(Wt+dt)]=ddλA(Wt1γ(1+(1γ)(λA(μAμC)+λB(μBμC)+μC)dt+(1γ)(γ)2(λA2(σA2+σC2)+λB2(σB2+σC2)+2λAλBσC22λAσC22λBσC2+σC2)dt)11γ)\frac{d}{d\lambda_A} E[U(W_{t+dt})] = \frac{d}{d\lambda_A} \left( \frac{ W_t^{1-\gamma} \left( 1 + (1 - \gamma) \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C \right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) - 1 }{ 1-\gamma } \right) =Wt1γ1γddλA(1+(1γ)(λA(μAμC)+λB(μBμC)+μC)dt+(1γ)(γ)2(λA2(σA2+σC2)+λB2(σB2+σC2)+2λAλBσC22λAσC22λBσC2+σC2)dt)= \frac{W_t^{1-\gamma}}{1-\gamma} \frac{d}{d\lambda_A} \left( 1 + (1 - \gamma) \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C \right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) =Wt1γ1γddλA((1γ)(λA(μAμC)+λB(μBμC)+μC)dt)+ddλA((1γ)(γ)2(λA2(σA2+σC2)+λB2(σB2+σC2)+2λAλBσC22λAσC22λBσC2+σC2)dt)= \frac{W_t^{1-\gamma}}{1-\gamma} \frac{d}{d\lambda_A} \left( (1 - \gamma) \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C \right)dt \right) + \frac{d}{d\lambda_A} \left( \frac{(1 - \gamma)(-\gamma)}{2} \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) =Wt1γddλA((λA(μAμC)+λB(μBμC)+μC)dt)Wt1γγ2ddλA((λA2(σA2+σC2)+λB2(σB2+σC2)+2λAλBσC22λAσC22λBσC2+σC2)dt)= W_t^{1-\gamma} \frac{d}{d\lambda_A} \left( \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C \right)dt \right) - \frac{W_t^{1-\gamma} \gamma}{2} \frac{d}{d\lambda_A} \left( \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B 
\sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) =Wt1γ(ddλA(λA(μAμC)dt)+ddλA(λB(μBμC)dt)+ddλA(μCdt))Wt1γγ2(ddλA(λA2(σA2+σC2)dt)+ddλA(λB2(σB2+σC2)dt)+ddλA(2λAλBσC2dt)ddλA(2λAσC2dt)ddλA(2λBσC2dt)+ddλA(σC2dt))= W_t^{1-\gamma} \left( \frac{d}{d\lambda_A} (\lambda_A (\mu_A - \mu_C) dt) + \frac{d}{d\lambda_A} (\lambda_B (\mu_B - \mu_C) dt) + \frac{d}{d\lambda_A} (\mu_C dt) \right) - \frac{W_t^{1-\gamma} \gamma}{2} \left( \frac{d}{d\lambda_A} (\lambda_A^2 (\sigma_A^2 + \sigma_C^2) dt) + \frac{d}{d\lambda_A} (\lambda_B^2 (\sigma_B^2 + \sigma_C^2) dt) + \frac{d}{d\lambda_A} (2\lambda_A\lambda_B \sigma_C^2 dt) - \frac{d}{d\lambda_A} (2\lambda_A \sigma_C^2 dt) - \frac{d}{d\lambda_A} (2\lambda_B \sigma_C^2 dt) + \frac{d}{d\lambda_A} (\sigma_C^2 dt) \right) =Wt1γ((μAμC)dt+dλBdλA(μBμC)dt)Wt1γγ2(2λA(σA2+σC2)dt+2λB(σB2+σC2)dλBdλAdt+2σC2(λAdλBdλA+λB)dt2σC2dt2σC2dλBdλAdt)= W_t^{1-\gamma} \left( (\mu_A - \mu_C) dt + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C) dt \right) - \frac{W_t^{1-\gamma} \gamma}{2} \left( 2 \lambda_A (\sigma_A^2 + \sigma_C^2) dt + 2 \lambda_B (\sigma_B^2 + \sigma_C^2) \frac{d \lambda_B}{d\lambda_A} dt + 2 \sigma_C^2 (\lambda_A \frac{d \lambda_B}{d\lambda_A} + \lambda_B) dt - 2 \sigma_C^2 dt - 2 \sigma_C^2 \frac{d\lambda_B}{d\lambda_A} dt \right) =Wt1γdt((μAμC)+dλBdλA(μBμC)γλA(σA2+σC2)γλB(σB2+σC2)dλBdλAγσC2λAdλBdλAγσC2λB+γσC2+γσC2dλBdλA)= W_t^{1-\gamma} dt \left( (\mu_A - \mu_C) + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C) - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) \frac{d \lambda_B}{d\lambda_A} - \gamma \sigma_C^2 \lambda_A \frac{d \lambda_B}{d\lambda_A} - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + \gamma \sigma_C^2 \frac{d\lambda_B}{d\lambda_A} \right) =Wt1γdt(μAμCγλA(σA2+σC2)γσC2λB+γσC2+dλBdλA(μBμCγλB(σB2+σC2)γσC2λA+γσC2))= W_t^{1-\gamma} dt \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + 
\frac{d \lambda_B}{d\lambda_A} \left(\mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2\right) \right)

We can apply a symmetry argument to find dE[U]dλB\frac{dE[U]}{d\lambda_B} because λB\lambda_B can be interchanged with λA\lambda_A without changing E[U]E[U]. Setting each of these partial derivative to equal zero gives us the system of equations we're looking for.

dE[U]dλA=Wt1γdt(μAμCγλA(σA2+σC2)γσC2λB+γσC2+dλBdλA(μBμCγλB(σB2+σC2)γσC2λA+γσC2))=0\frac{dE[U]}{d\lambda_A} = W_t^{1-\gamma} dt \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} \left( \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2 \right) \right) = 0 dE[U]dλB=Wt1γdt(μBμCγλB(σB2+σC2)γσC2λA+γσC2+dλAdλB(μAμCγλA(σA2+σC2)γσC2λB+γσC2))=0\frac{dE[U]}{d\lambda_B} = W_t^{1-\gamma} dt \left( \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2 + \frac{d \lambda_A}{d\lambda_B} \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 \right) \right) = 0 γσC2λBγσC2+dλBdλA(γλB(σB2+σC2)+γσC2λAγσC2)=μAμC+dλBdλA(μBμC)\gamma \sigma_C^2 \lambda_B - \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} \left( \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_A - \gamma \sigma_C^2 \right) = \mu_A - \mu_C + \frac{d \lambda_B}{d\lambda_A} \left( \mu_B - \mu_C \right)

We now do some simplification.

After cancelling out Wt1γdtW_t^{1-\gamma} dt from both sides of the equation, the rearranged form becomes:

γλA(σA2+σC2)γσC2λB+dλBdλA(γλB(σB2+σC2)γσC2λA)=μAμC+γσC2+dλBdλA(μBμC+γσC2)-\gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \frac{d \lambda_B}{d\lambda_A} \left( - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A \right) = \mu_A - \mu_C + \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C + \gamma \sigma_C^2) γλB(σB2+σC2)γσC2λA+dλAdλB(γλA(σA2+σC2)γσC2λB)=μBμC+γσC2+dλAdλB(μAμC+γσC2-\gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \frac{d \lambda_A}{d\lambda_B} \left( - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B \right) = \mu_B - \mu_C + \gamma \sigma_C^2 + \frac{d \lambda_A}{d\lambda_B} (\mu_A - \mu_C + \gamma \sigma_C^2

I think the answer might be:

$$\lambda_A = \frac{\gamma \sigma_B^2 \sigma_C^2 + \mu_A \sigma_B^2 + \mu_A \sigma_C^2 - \mu_B \sigma_C^2 - \mu_C \sigma_B^2}{\gamma \sigma_A^2 \sigma_B^2 + \gamma \sigma_A^2 \sigma_C^2 + \gamma \sigma_B^2 \sigma_C^2}$$

$$\lambda_B = \frac{\gamma \sigma_A^2 \sigma_C^2 - \mu_A \sigma_C^2 + \mu_B \sigma_A^2 + \mu_B \sigma_C^2 - \mu_C \sigma_A^2}{\gamma \sigma_A^2 \sigma_B^2 + \gamma \sigma_A^2 \sigma_C^2 + \gamma \sigma_B^2 \sigma_C^2}$$

Simplifying:

$$\lambda_A = \frac{ \sigma_B^2 (\mu_A - \mu_C) + \sigma_C^2 (\mu_A - \mu_B) + \gamma \sigma_B^2 \sigma_C^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$

$$\lambda_B = \frac{ \sigma_C^2 (\mu_B - \mu_A) + \sigma_A^2 (\mu_B - \mu_C) + \gamma \sigma_A^2 \sigma_C^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$

$$\lambda_C = \frac{ \sigma_A^2 (\mu_C - \mu_B) + \sigma_B^2 (\mu_C - \mu_A) + \gamma \sigma_A^2 \sigma_B^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$
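A candidate closed form like this can be checked numerically. The sketch below assumes independent assets and treats $\lambda_A$ and $\lambda_B$ as free choice variables with $\lambda_C = 1 - \lambda_A - \lambda_B$, so the first-order conditions reduce to a 2x2 linear system. It compares the solution of that system with a closed form in which each weight increases in its own drift; the function names and parameter values are hypothetical.

```python
def three_asset_weights(mu, sigma, gamma):
    """Candidate closed-form weights for three independent assets A, B, C."""
    mu_a, mu_b, mu_c = mu
    va, vb, vc = (s * s for s in sigma)
    denom = gamma * (va * vb + va * vc + vb * vc)
    lam_a = (vb * (mu_a - mu_c) + vc * (mu_a - mu_b) + gamma * vb * vc) / denom
    lam_b = (vc * (mu_b - mu_a) + va * (mu_b - mu_c) + gamma * va * vc) / denom
    lam_c = (va * (mu_c - mu_b) + vb * (mu_c - mu_a) + gamma * va * vb) / denom
    return lam_a, lam_b, lam_c


def solve_first_order_conditions(mu, sigma, gamma):
    """Solve the two first-order conditions by Cramer's rule (lambda_C eliminated)."""
    mu_a, mu_b, mu_c = mu
    va, vb, vc = (s * s for s in sigma)
    # Coefficient matrix and right-hand side of the linear system in (lam_a, lam_b).
    a11, a12 = gamma * (va + vc), gamma * vc
    a21, a22 = gamma * vc, gamma * (vb + vc)
    r1 = mu_a - mu_c + gamma * vc
    r2 = mu_b - mu_c + gamma * vc
    det = a11 * a22 - a12 * a21
    lam_a = (r1 * a22 - a12 * r2) / det
    lam_b = (a11 * r2 - a21 * r1) / det
    return lam_a, lam_b, 1.0 - lam_a - lam_b


if __name__ == "__main__":
    mu = (0.08, 0.06, 0.03)      # hypothetical drifts
    sigma = (0.20, 0.15, 0.10)   # hypothetical volatilities
    print(three_asset_weights(mu, sigma, 3.0))
    print(solve_first_order_conditions(mu, sigma, 3.0))
```

The two routes should agree to rounding error, and the three weights should sum to one, which is a useful check on the signs in the numerators.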

Let's solve for the partial derivatives.

$$\frac{d \lambda_B}{d\lambda_A} = - \frac{ \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_B) }{ \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_A) }$$

$$\frac{d \lambda_A}{d\lambda_B} = - \frac{ \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_A) }{ \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_B) }$$

Part III: Risky Numeraire

In Part I, we made the case that there may be no such thing as a risk-free asset. In the case of treasuries, the typical example of the risk-free asset, the holder is exposed to inflation risks, dollar fluctuations, interest rate changes, and other factors. Given this, we constructed a model for the optimal allocation between two risky assets.

Every investor has their own basket of goods under which they estimate changes in their real purchasing power.[1] This subjective basket is not exactly cash, which was the motivation for the first essay. But what if it were somehow a known tradable asset? Or even more simply, perhaps an investor wants to denominate their returns in ETH, SPY, or some known liquid asset. Does our previous analysis still hold if we change the currency units? In this essay, I examine the case where, instead of using cash as a base for both assets $A$ and $B$, we designate asset $A$ as the numeraire, expressing asset $B$ in terms of $A$.

Let us begin with the optimal allocations derived from the previous cash-denominated model:

$$\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}, \quad \lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$
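As a quick illustration of these two expressions, the sketch below evaluates them for made-up parameter values and confirms that the weights sum to one. The function name and the numbers are hypothetical.

```python
def two_asset_weights(mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Cash-denominated optimal weights for two independent risky assets."""
    total_var = sigma_a ** 2 + sigma_b ** 2
    lam_a = (mu_a - mu_b + gamma * sigma_b ** 2) / (gamma * total_var)
    lam_b = (mu_b - mu_a + gamma * sigma_a ** 2) / (gamma * total_var)
    return lam_a, lam_b


if __name__ == "__main__":
    # Hypothetical inputs: B has the higher drift but also the higher volatility.
    lam_a, lam_b = two_asset_weights(0.05, 0.08, 0.10, 0.20, 2.0)
    print(lam_a, lam_b, lam_a + lam_b)
```

With these inputs the higher drift of $B$ is offset by its higher variance, and both weights come out to roughly one half.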

Both assets $A$ and $B$ are initially denominated in cash. We now shift our perspective by setting asset $A$ as the numeraire, effectively redefining all quantities in relation to $A$. This transition moves us from a cash-denominated framework to one in which asset $A$ is the central reference point.

Key Definitions in the $A$-Denominated Model:

  • $S_{B/A}$: Price of asset $B$ relative to $A$.
  • $S_{A/A}$: Price of asset $A$ relative to itself.
  • $\mu_{B/A}$, $\sigma_{B/A}$: Drift and volatility of $B$ with respect to $A$.
  • $\lambda_{A/A}$, $\lambda_{B/A}$: Portfolio weights for $A$ and $B$ in the $A$-denominated framework.
  • $W_{A,t}$: Wealth expressed in terms of $A$.
  • $dW_{A,t}$: Wealth dynamics in terms of $A$.

We start by solving for the simplest expressions.

$S_{A/A}$ is the price of asset $A$ relative to itself, so $S_{A/A} = \frac{S_A}{S_A} = 1$.

Since $S_{A/A} = 1$, its drift and volatility are zero: $\mu_{A/A} = 0$, $\sigma_{A/A} = 0$.

The price of $B$ in terms of $A$ is given by $S_{B/A} = \frac{S_B}{S_A}$.

In this model, the drift and volatility of $B$ relative to $A$ are defined as:

$$\mu_{B/A} = \mu_B - \mu_A, \quad \sigma_{B/A} = \sqrt{\sigma_A^2 + \sigma_B^2}.$$
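The volatility expression can be illustrated by simulation. The sketch below draws independent log-returns for two GBM assets, forms the log-return of the ratio $S_B / S_A$, and compares its annualized standard deviation with $\sqrt{\sigma_A^2 + \sigma_B^2}$. It checks only the volatility, not the drift; the function name, step size, sample count, and seed are arbitrary assumptions.

```python
import math
import random
import statistics


def ratio_log_return_vol(sigma_a, sigma_b, mu_a=0.05, mu_b=0.08,
                         dt=1.0 / 252, n_steps=100_000, seed=0):
    """Estimate the annualized volatility of log(S_B / S_A) for independent GBMs."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    ratio_returns = []
    for _ in range(n_steps):
        # Exact GBM log-returns over one step, driven by independent normal shocks.
        r_a = (mu_a - 0.5 * sigma_a ** 2) * dt + sigma_a * sqrt_dt * rng.gauss(0.0, 1.0)
        r_b = (mu_b - 0.5 * sigma_b ** 2) * dt + sigma_b * sqrt_dt * rng.gauss(0.0, 1.0)
        ratio_returns.append(r_b - r_a)  # log-return of S_B / S_A
    return statistics.pstdev(ratio_returns) / sqrt_dt


if __name__ == "__main__":
    print(ratio_log_return_vol(0.2, 0.3))
    print(math.sqrt(0.2 ** 2 + 0.3 ** 2))
```

The estimate should land close to the closed-form value, with sampling noise shrinking as `n_steps` grows.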

The transition from the cash-denominated optimal $B$ allocation, $\lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$, to the $A$-denominated model can be achieved by the following transformations:

Replace the cash-denominated drift difference $\mu_B - \mu_A$ with $\mu_{B/A}$:

$$\frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$$

Next, replace the combined variance $\sigma_A^2 + \sigma_B^2$ with $\sigma_{B/A}^2$:

$$\frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma \sigma_{B/A}^2}$$

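Finally, note that the numeraire asset $A/A$ has zero drift and zero volatility, $\mu_{A/A} = 0$ and $\sigma_{A/A} = 0$. In the $A$-denominated model the numeraire variance in the numerator is therefore $\sigma_{A/A}^2 = 0$ rather than $\sigma_A^2$, and the $\gamma \sigma_A^2$ term drops out:

$$\frac{\mu_{B/A} + \gamma \sigma_{A/A}^2}{\gamma \sigma_{B/A}^2} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$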

Thus, the optimal weight for $B$ in the $A$-denominated framework becomes:

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

The derivation above is perfectly legitimate, though if we don't want to take the results of my previous essay as given, we can start again from scratch.

Fully Deriving the Optimal Portfolio Weights in the $A$-Denominated Model

We start with maximizing our expected CRRA utility as before:

$$\underset{\lambda_{B/A}}{\max} \, E\left[U(W_{A,t+dt})\right] = E\left[\frac{(W_{A,t} + dW_{A,t})^{1-\gamma} - 1}{1 - \gamma}\right]$$

We note that in the $A$-denominated model the only risky asset is $B/A$, so the wealth dynamics are driven solely by $B/A$. The optimization problem simplifies to maximizing expected utility with respect to $\lambda_{B/A}$.

Since $S_{A/A} = 1$, asset $A$ contributes no differential to wealth dynamics in $A$-terms. Thus, wealth dynamics in this $A$-denominated framework are driven solely by $B/A$:

$$dW_{A,t} = \lambda_{B/A} W_{A,t} \frac{dS_{B/A,t}}{S_{B/A,t}},$$

$$dW_{A,t} = \lambda_{B/A} W_{A,t} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t \right).$$

We now substitute this wealth dynamic term into our expected utility calculation:

$$\underset{\lambda_{B/A}}{\max} \, E\left[U(W_{A,t+dt})\right] = E\left[\frac{\left(W_{A,t} + \lambda_{B/A} W_{A,t} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t \right)\right)^{1-\gamma} - 1}{1 - \gamma}\right].$$

Applying a second-order Taylor expansion to $E\left[(1 + x)^{1 - \gamma}\right]$ where $x = \lambda_{B/A} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t \right)$, we get:

$$E\left[(1 + x)^{1 - \gamma}\right] \approx 1 + (1 - \gamma) E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2].$$
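To get a feel for how accurate the second-order expansion is for small $x$, the sketch below compares $(1 + x)^{1-\gamma}$ with its quadratic approximation pointwise (before taking expectations). The function names and the test values of $x$ and $\gamma$ are illustrative assumptions.

```python
def crra_power(x, gamma):
    """Exact value of (1 + x)^(1 - gamma)."""
    return (1.0 + x) ** (1.0 - gamma)


def second_order_approx(x, gamma):
    """Second-order Taylor expansion of (1 + x)^(1 - gamma) around x = 0."""
    return 1.0 + (1.0 - gamma) * x + 0.5 * (1.0 - gamma) * (-gamma) * x ** 2


def approximation_error(x, gamma):
    """Absolute gap between the exact value and the quadratic approximation."""
    return abs(crra_power(x, gamma) - second_order_approx(x, gamma))


if __name__ == "__main__":
    for x in (0.02, 0.01, 0.005):
        print(x, approximation_error(x, 3.0))
```

The error is third order in $x$, so halving $x$ should cut it by roughly a factor of eight, which is why the approximation becomes exact in the limit of an infinitesimal time step.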

To first order in $dt$, the moments of $x$ are:

$$E[x] = \lambda_{B/A} \mu_{B/A} \, dt$$

$$E[x^2] = \lambda_{B/A}^2 \sigma_{B/A}^2 \, dt$$
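These two moments can be unpacked in a short expansion. The sketch below assumes $dN_t$ is a standard Brownian increment, so that $E[dN_t] = 0$ and $E[dN_t^2] = dt$ (an assumption stated here for completeness), and drops terms of order $dt^2$.

```latex
\begin{aligned}
E[x]   &= \lambda_{B/A} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, E[dN_t] \right)
        = \lambda_{B/A} \mu_{B/A} \, dt, \\
E[x^2] &= \lambda_{B/A}^2 \left( \mu_{B/A}^2 \, dt^2
          + 2 \mu_{B/A} \sigma_{B/A} \, dt \, E[dN_t]
          + \sigma_{B/A}^2 \, E[dN_t^2] \right) \\
       &= \lambda_{B/A}^2 \sigma_{B/A}^2 \, dt + O(dt^2).
\end{aligned}
```

Only the variance term survives at order $dt$, which is why the drift does not appear in $E[x^2]$.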

We now simplify our expected utility expression:

$$E[U(W_{A,t+dt})] \approx \frac{W_{A,t}^{1 - \gamma} \left(1 + (1 - \gamma) \lambda_{B/A} \mu_{B/A} \, dt + \frac{(1 - \gamma)(-\gamma)}{2} \lambda_{B/A}^2 \sigma_{B/A}^2 \, dt \right) - 1}{1 - \gamma}.$$

Now we differentiate the bracketed term with respect to $\lambda_{B/A}$ and set the derivative to zero:

$$\frac{d}{d\lambda_{B/A}} \left[ (1 - \gamma) \lambda_{B/A} \mu_{B/A} \, dt - \frac{\gamma (1 - \gamma)}{2} \lambda_{B/A}^2 \sigma_{B/A}^2 \, dt \right] = 0.$$

Simplify:

$$(1 - \gamma) \mu_{B/A} \, dt - \gamma (1 - \gamma) \lambda_{B/A} \sigma_{B/A}^2 \, dt = 0.$$

Divide both sides by $(1 - \gamma) \, dt$:

$$\mu_{B/A} - \gamma \lambda_{B/A} \sigma_{B/A}^2 = 0.$$

Thus, the optimal weight for $B/A$ is

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

And the optimal weight for $A/A$ is:

$$\lambda_{A/A} = 1 - \lambda_{B/A} = 1 - \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$
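The first-order condition can be cross-checked by maximizing the objective numerically. The sketch below uses the $\lambda$-dependent part of expected utility after dividing by $(1 - \gamma)$, namely $\lambda \mu_{B/A} - \frac{\gamma}{2} \lambda^2 \sigma_{B/A}^2$, and runs a plain ternary search over it. The function names, search bounds, and parameter values are assumptions made for illustration.

```python
def merton_share(mu, sigma, gamma):
    """Closed-form optimal weight mu / (gamma * sigma^2)."""
    return mu / (gamma * sigma ** 2)


def objective(lam, mu, sigma, gamma):
    """Lambda-dependent part of expected utility per unit time (concave for gamma > 0)."""
    return lam * mu - 0.5 * gamma * lam ** 2 * sigma ** 2


def maximize_by_ternary_search(f, lo=-10.0, hi=10.0, iterations=200):
    """Locate the maximizer of a concave function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1  # maximizer lies to the right of m1
        else:
            hi = m2  # maximizer lies to the left of m2
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    mu, sigma, gamma = 0.04, 0.25, 2.0  # hypothetical B/A drift, volatility, risk aversion
    lam_b = maximize_by_ternary_search(lambda lam: objective(lam, mu, sigma, gamma))
    print(lam_b, merton_share(mu, sigma, gamma), 1.0 - lam_b)
```

For these inputs the closed form gives $0.04 / (2 \times 0.0625) = 0.32$, and the search should land within numerical tolerance of that value, leaving about $0.68$ in the numeraire asset.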

Conclusion:

We can now express the optimal weights in the $A$-denominated model directly in terms of $\mu_{B/A}$ and $\sigma_{B/A}^2$.

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}, \quad \lambda_{A/A} = 1 - \lambda_{B/A}.$$

Thus, after transforming the cash-denominated model's optimal weights to the $A$-denominated model's optimal weights, we see that we've derived the famous Merton share.

Footnotes

  1. I am skeptical of standard national CPI measures, as discussed in Chapter 5 of Keynes' Treatise on Money. I think exact CPI calculation seems like a fundamentally futile task, though it's nuanced so I haven't made up my mind yet.