Optimal Portfolio Weights Under CRRA Utility

Summary

This series of essays explores the optimization of portfolio weights to maximize a Constant Relative Risk Aversion (CRRA) utility function over an agent's wealth. We use classic stochastic calculus techniques to model price processes as Geometric Brownian Motion (GBM). In Part I, we derive the optimal allocation between two risky assets and find our solution is an extension of the famous Merton Share. In Part II, we extend the analysis to three assets and then to an n-asset model. In Part III, we examine what happens when we change the numeraire from cash to a risky asset.


Part I: The Binary Asset Model

Model Definition

Suppose we have a universe of two stocks, $A$ and $B$, modeled as independent GBM processes with parameters $\mu_A$, $\mu_B$, $\sigma_A$, and $\sigma_B$. We also define $\lambda_A$ and $\lambda_B$ to be our portfolio weights for assets $A$ and $B$, such that $\lambda_A + \lambda_B = 1$. Finally, we have a CRRA utility function over possible wealth states $W$, $U(W) = \frac{W^{1-\gamma} - 1}{1 - \gamma}$, where $\gamma$ is our relative risk aversion parameter (we assume $\gamma > 0$ and $\gamma \neq 1$; as $\gamma \to 1$ the utility tends to $\ln W$). In what follows, we find the optimal portfolio weights $\lambda$ which maximize the expected utility of our future wealth.


Deriving a Closed Form Expected Utility Function

We intend to find the portfolio allocation $[\lambda_A, \lambda_B]$ which maximizes the expected utility of our wealth in the next period, that is:

$$\underset{\lambda_A, \lambda_B}{\max} \, E[U(W_{t+dt})].$$

Incorporating our CRRA utility function yields

$$E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right].$$

We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$ and $\lambda_B$:

$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}}.$$

Similarly, we note that each stock's price follows a GBM, defined by the stochastic differential equations (SDEs)

\begin{align*} dS_{A,t} &= \mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t} \\ dS_{B,t} &= \mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}, \end{align*}

where $N_{i,t}$ is our notation for the Wiener process driving asset $i$ at time $t$.

Substituting both individual asset processes into our wealth SDE yields

$$dW_t = \lambda_A \frac{W_t}{S_{A,t}} (\mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t}) + \lambda_B \frac{W_t}{S_{B,t}} (\mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}),$$

which we can simplify to

$$dW_t = W_t \left[ \lambda_A (\mu_A \, dt + \sigma_A \, dN_{A,t}) + \lambda_B (\mu_B \, dt + \sigma_B \, dN_{B,t}) \right].$$

Now, we can substitute this into our expected utility equation:

$$E[U(W_{t+dt})] = E\left[\frac{(W_t + W_t (\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})))^{1-\gamma}-1}{1-\gamma}\right].$$

By linearity of expectation, the constants $-1$ and $\frac{1}{1-\gamma}$ come outside the expectation:

$$\frac{E\left[(W_t + W_t (\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})))^{1-\gamma}\right] - 1}{1-\gamma}.$$

Since $W_t$ is known at time $t$, we can then pull $W_t^{1-\gamma}$ out of the expectation, yielding

$$\frac{W_t^{1-\gamma} E\left[\left(1 + \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})\right)^{1-\gamma}\right] - 1}{1-\gamma}.$$

We now consider the second-order Taylor series expansion of $(1 + x)^{1-\gamma}$ around $x = 0$, which is justified because $x$ is very small over an infinitesimally small time increment $dt$. In this case, $x=\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})$.

We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots$$

and we are careful to include the second-order term, which carries our volatility parameters.

With $f(x) = (1+x)^{1-\gamma}$ and $a = 0$, we have $f(0) = 1$, $f'(0) = 1-\gamma$, and $f''(0) = (1-\gamma)(-\gamma)$. Taking expectations term by term, this implies that

$$E[(1 + x)^{1-\gamma}] \approx 1 + (1 - \gamma)E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2].$$
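As a quick numerical sanity check on this expansion (before taking expectations), the sketch below compares $(1+x)^{1-\gamma}$ with its second-order approximation for a few small values of $x$. The value $\gamma = 3$ and the function names are illustrative assumptions, not part of the derivation.

```python
def crra_power(x, gamma):
    """Exact value of (1 + x)**(1 - gamma)."""
    return (1.0 + x) ** (1.0 - gamma)


def second_order_taylor(x, gamma):
    """Second-order Taylor expansion of (1 + x)**(1 - gamma) around x = 0."""
    return 1.0 + (1.0 - gamma) * x + 0.5 * (1.0 - gamma) * (-gamma) * x ** 2


gamma = 3.0  # illustrative risk aversion (assumption)
for x in (0.1, 0.01, 0.001):
    error = abs(crra_power(x, gamma) - second_order_taylor(x, gamma))
    print(f"x = {x:<6} truncation error = {error:.3e}")
```

The truncation error should shrink roughly like $x^3$, which is why the approximation is harmless for the tiny increments considered here.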

Using $x=\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t})$, we see that $E[x] = (\lambda_A \mu_A + \lambda_B \mu_B)dt$ because $E[dN_{i,t}]=0$.

Furthermore, to solve for $E[x^2]$, we substitute in $x$, which gives us

$$E[x^2] = E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2].$$

From here, we expand out the expression to

$$E[x^2] = E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2 + 2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t}) (\mu_B dt + \sigma_B dN_{B,t}) + (\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2].$$

We now have to do some algebra to untangle this a bit further by expanding each of these terms.

$$(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2 = \lambda_A^2 \mu_A^2 dt^2 + 2\lambda_A^2 \mu_A \sigma_A dt \, dN_{A,t} + \lambda_A^2 \sigma_A^2 dN_{A,t}^2$$

$$(\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2 = \lambda_B^2 \mu_B^2 dt^2 + 2\lambda_B^2 \mu_B \sigma_B dt \, dN_{B,t} + \lambda_B^2 \sigma_B^2 dN_{B,t}^2$$

$$2\lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t}) = 2\lambda_A \lambda_B \mu_A \mu_B dt^2 + 2\lambda_A \lambda_B \mu_A \sigma_B dt \, dN_{B,t} + 2\lambda_A \lambda_B \mu_B \sigma_A dt \, dN_{A,t} + 2\lambda_A \lambda_B \sigma_A \sigma_B dN_{A,t} dN_{B,t}$$

At this point, some properties of Brownian motion come to our aid, particularly that $E[dN_{i,t}]=0$, $E[dN_{i,t}^2]=dt$, and $E[dN_{A,t} dN_{B,t}]=0$ (since $A$ and $B$ are independent processes).

Taking expectations and discarding terms of order $dt^2$, which are negligible next to $dt$, we can make the following simplifications:

$$E[(\lambda_A (\mu_A dt + \sigma_A dN_{A,t}))^2] = \lambda_A^2 \sigma_A^2 dt$$

$$E[(\lambda_B (\mu_B dt + \sigma_B dN_{B,t}))^2] = \lambda_B^2 \sigma_B^2 dt$$

$$E[2\lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t})] = 0$$

Putting it all together, the simplified expression is $E[x^2] = \lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt$.
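The two moments can be checked by simulation. The following is a minimal Monte Carlo sketch, assuming illustrative parameter values (none of them come from the essay) and a finite step $dt = 0.01$ standing in for the infinitesimal increment. It draws independent increments $dN_{i,t} \sim \mathcal{N}(0, dt)$ and compares the sample moments of $x$ with the formulas above.

```python
import math
import random

# Illustrative parameters (assumptions, not values from the essay).
mu_a, mu_b = 0.10, 0.06
sigma_a, sigma_b = 0.25, 0.15
lam_a = 0.5
lam_b = 1.0 - lam_a
dt = 0.01
n_samples = 200_000

rng = random.Random(0)  # fixed seed so the run is reproducible
sqrt_dt = math.sqrt(dt)

sum_x = 0.0
sum_x2 = 0.0
for _ in range(n_samples):
    # Independent Wiener increments: dN ~ Normal(0, dt).
    dn_a = rng.gauss(0.0, sqrt_dt)
    dn_b = rng.gauss(0.0, sqrt_dt)
    x = lam_a * (mu_a * dt + sigma_a * dn_a) + lam_b * (mu_b * dt + sigma_b * dn_b)
    sum_x += x
    sum_x2 += x * x

mc_mean = sum_x / n_samples
mc_second_moment = sum_x2 / n_samples

# Formulas from the text (terms of order dt^2 dropped).
theory_mean = (lam_a * mu_a + lam_b * mu_b) * dt
theory_second_moment = (lam_a ** 2 * sigma_a ** 2 + lam_b ** 2 * sigma_b ** 2) * dt

print(f"E[x]:   simulated {mc_mean:.3e}, formula {theory_mean:.3e}")
print(f"E[x^2]: simulated {mc_second_moment:.3e}, formula {theory_second_moment:.3e}")
```

With a finite $dt$ the simulated second moment also contains the small $dt^2$ drift term that the derivation discards, so agreement is expected only up to sampling noise plus that higher-order correction.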

Returning to the expectation we set out to evaluate, with these new results in hand, we see that

$$E[(1 + x)^{1-\gamma}] = 1 + (1 - \gamma)(\lambda_A \mu_A + \lambda_B \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt).$$

This means we can now write our full expected utility maximization equation as:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left(1 + (1 - \gamma)(\lambda_A \mu_A + \lambda_B \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2)dt\right) - 1}{1-\gamma}$$

Quick aside: my hunch is that if we were to extend to a multi-asset model with $n$ independent assets, this becomes:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left(1 + (1 - \gamma)\left(\sum_{i=1}^{n} \lambda_i \mu_i\right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left(\sum_{i=1}^{n} \lambda_i^2 \sigma_i^2\right)dt\right) - 1}{1-\gamma}$$
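The conjectured multi-asset expression is easy to write as a small function, which at least lets us confirm that it collapses to the two-asset formula when $n = 2$. This is only a sketch of the hunch: the function name `expected_utility`, the example numbers, and the independence assumption are mine, and nothing here proves the $n$-asset formula.

```python
def expected_utility(wealth, weights, mus, sigmas, gamma, dt):
    """Second-order approximation of E[U(W_{t+dt})] for independent GBM assets.

    Hypothetical helper: implements the conjectured n-asset expression.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("portfolio weights must sum to 1")
    drift = sum(lam * mu for lam, mu in zip(weights, mus))
    variance = sum(lam ** 2 * sigma ** 2 for lam, sigma in zip(weights, sigmas))
    bracket = (1.0 + (1.0 - gamma) * drift * dt
               + 0.5 * (1.0 - gamma) * (-gamma) * variance * dt)
    return (wealth ** (1.0 - gamma) * bracket - 1.0) / (1.0 - gamma)


# Two-asset case, and a three-asset case whose third weight is zero.
two = expected_utility(1.0, [0.4, 0.6], [0.10, 0.06], [0.25, 0.15], gamma=3.0, dt=0.01)
three = expected_utility(1.0, [0.4, 0.6, 0.0], [0.10, 0.06, 0.20],
                         [0.25, 0.15, 0.50], gamma=3.0, dt=0.01)
print(two, three)
```

Putting zero weight on the third asset should reproduce the two-asset value, which is the minimal consistency we would expect from any valid generalization.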

Now, because we're in a dual-asset model where the weights sum to 100%, $\lambda_B$ is determined to be $1-\lambda_A$. We can substitute this into the expression above, which gives us:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left[1 + (1 - \gamma)(\lambda_A \mu_A + (1-\lambda_A) \mu_B)dt + \frac{(1 - \gamma)(-\gamma)}{2} (\lambda_A^2 \sigma_A^2 + (1-\lambda_A)^2 \sigma_B^2)dt\right] - 1}{1-\gamma}$$

Optimizing Portfolio Weights to Maximize Expected Utility

In order to maximize $E[U(W_{t+dt})]$, we can follow the classic method of differentiating the function with respect to $\lambda_A$ and setting this derivative equal to zero.

First, we notice that $W_t^{1-\gamma}$ and the constants $-1$ and $\frac{1}{1-\gamma}$ do not depend on $\lambda_A$, so the first-order condition only involves the bracketed expression, which we denote $f(\lambda_A)$:

$$f(\lambda_A) = 1 + (1 - \gamma)(\lambda_A \mu_A + (1 - \lambda_A) \mu_B)dt + \frac{(1-\gamma)(-\gamma)}{2} \left( \lambda_A^2 \sigma_A^2 + (1 - \lambda_A)^2 \sigma_B^2 \right) dt.$$

Now, we differentiate $f(\lambda_A)$ with respect to $\lambda_A$:

$$\frac{df}{d\lambda_A} = (1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A \sigma_A^2 - (1 - \lambda_A) \sigma_B^2 \right) dt.$$

This simplifies to:

$$\frac{df}{d\lambda_A} = (1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2 \right) dt.$$
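A derivative like this is easy to get wrong by a sign, so here is a small finite-difference check. It is a sketch under assumed parameter values (the numbers and the helper names `f` and `df_dlam_a` are illustrative); it compares the analytic derivative with a central difference of $f$.

```python
def f(lam_a, mu_a, mu_b, sigma_a, sigma_b, gamma, dt):
    """Bracketed expression f(lambda_A) from the text."""
    lam_b = 1.0 - lam_a
    drift = lam_a * mu_a + lam_b * mu_b
    variance = lam_a ** 2 * sigma_a ** 2 + lam_b ** 2 * sigma_b ** 2
    return 1.0 + (1.0 - gamma) * drift * dt + 0.5 * (1.0 - gamma) * (-gamma) * variance * dt


def df_dlam_a(lam_a, mu_a, mu_b, sigma_a, sigma_b, gamma, dt):
    """Analytic derivative of f with respect to lambda_A."""
    return ((1.0 - gamma) * (mu_a - mu_b) * dt
            + (1.0 - gamma) * (-gamma)
            * (lam_a * (sigma_a ** 2 + sigma_b ** 2) - sigma_b ** 2) * dt)


# Illustrative parameters (assumptions).
params = dict(mu_a=0.10, mu_b=0.06, sigma_a=0.25, sigma_b=0.15, gamma=3.0, dt=0.01)
step = 1e-4
for lam_a in (0.0, 0.3, 0.8):
    numeric = (f(lam_a + step, **params) - f(lam_a - step, **params)) / (2 * step)
    analytic = df_dlam_a(lam_a, **params)
    print(f"lambda_A = {lam_a}: numeric {numeric:.6e}, analytic {analytic:.6e}")
```

Because $f$ is quadratic in $\lambda_A$, the central difference agrees with the analytic derivative up to floating-point rounding.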

To find the optimal $\lambda_A$, set $\frac{df}{d\lambda_A} = 0$ and solve for $\lambda_A$:

$$(1 - \gamma) (\mu_A - \mu_B) dt + (1 - \gamma)(-\gamma) \left( \lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2 \right) dt = 0.$$

We can see that $(1 - \gamma) dt$ is a common factor in both terms. Since $dt$ is an infinitesimal time increment (which importantly is not zero) and $\gamma \neq 1$, we can simplify the equation by dividing through by $(1 - \gamma) dt$:

$$\mu_A - \mu_B - \gamma (\lambda_A (\sigma_A^2 + \sigma_B^2) - \sigma_B^2) = 0.$$

Now we rearrange the equation to solve for $\lambda_A$:

$$\gamma \lambda_A (\sigma_A^2 + \sigma_B^2) = \mu_A - \mu_B + \gamma \sigma_B^2$$

$$\lambda_A (\sigma_A^2 + \sigma_B^2) = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma}$$

From this we have found our optimal $\lambda_A$:

$$\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$

This stationary point is a maximum of expected utility rather than a minimum: the second derivative of $E[U(W_{t+dt})]$ with respect to $\lambda_A$ is $-\gamma W_t^{1-\gamma} (\sigma_A^2 + \sigma_B^2) dt$, which is negative for any $\gamma > 0$.

The optimal $\lambda_B$ follows easily:

$$\lambda_B = 1 - \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$


Conclusion

Given two assets modeled as independent GBM processes, wealth $W_t$, and a CRRA utility function, we have found that the optimal allocation to asset $A$ is $\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$ and the optimal allocation to asset $B$ is $\lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$.
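To make the result concrete, the sketch below evaluates the closed-form weights for one set of assumed parameters and cross-checks them against a brute-force grid search over the approximate expected utility. The parameter values, the grid, and the function names are illustrative assumptions; wealth is normalized to 1.

```python
def optimal_two_asset_weights(mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Closed-form optimal weights for two independent GBM assets."""
    lam_a = (mu_a - mu_b + gamma * sigma_b ** 2) / (gamma * (sigma_a ** 2 + sigma_b ** 2))
    return lam_a, 1.0 - lam_a


def expected_utility(lam_a, mu_a, mu_b, sigma_a, sigma_b, gamma, wealth=1.0, dt=0.01):
    """Second-order approximation of E[U(W_{t+dt})] as a function of lambda_A."""
    lam_b = 1.0 - lam_a
    drift = lam_a * mu_a + lam_b * mu_b
    variance = lam_a ** 2 * sigma_a ** 2 + lam_b ** 2 * sigma_b ** 2
    bracket = (1.0 + (1.0 - gamma) * drift * dt
               + 0.5 * (1.0 - gamma) * (-gamma) * variance * dt)
    return (wealth ** (1.0 - gamma) * bracket - 1.0) / (1.0 - gamma)


# Illustrative parameters (assumptions).
params = dict(mu_a=0.10, mu_b=0.06, sigma_a=0.25, sigma_b=0.15, gamma=3.0)
lam_a_star, lam_b_star = optimal_two_asset_weights(**params)

# Brute-force check: scan lambda_A from -1 to 2 in steps of 0.001.
grid = [-1.0 + 0.001 * k for k in range(3001)]
lam_a_grid = max(grid, key=lambda lam: expected_utility(lam, **params))

print(f"closed form: lambda_A = {lam_a_star:.4f}, lambda_B = {lam_b_star:.4f}")
print(f"grid search: lambda_A = {lam_a_grid:.3f}")
```

The grid search maximizes expected utility directly, so it also guards against accidentally picking a minimum when $\gamma > 1$ makes the factor $\frac{1}{1-\gamma}$ negative.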

You might have noticed that the portfolio allocations $\lambda_A$ and $\lambda_B$ don't have a subscript $t$. This is because, given that the drift and volatility parameters of the price processes are constant and that our risk aversion parameter $\gamma$ does not change, they are independent of both time and wealth! This implies a constant fractional allocation to each stock in our portfolio.

Notably, in the case where $B$ is a risk-free investment earning the rate $\mu_B$, so that $\sigma_B^2=0$, our optimal allocation reduces to $\lambda_A = \frac{\mu_A - \mu_B}{\gamma \sigma_A^2}$, which is the famous Merton Share!
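The reduction to the Merton Share can be confirmed numerically. In this sketch the helper names and the example numbers (an 8% risky drift, a 3% risk-free rate, 20% volatility, $\gamma = 2$) are assumptions chosen purely for illustration.

```python
def optimal_lambda_a(mu_a, mu_b, sigma_a, sigma_b, gamma):
    """Two-risky-asset optimal weight on asset A (independent GBMs)."""
    return (mu_a - mu_b + gamma * sigma_b ** 2) / (gamma * (sigma_a ** 2 + sigma_b ** 2))


def merton_share(mu, risk_free_rate, sigma, gamma):
    """Classic Merton Share: excess return over gamma times variance."""
    return (mu - risk_free_rate) / (gamma * sigma ** 2)


# Illustrative numbers (assumptions).
risky_weight = optimal_lambda_a(mu_a=0.08, mu_b=0.03, sigma_a=0.20, sigma_b=0.0, gamma=2.0)
merton_weight = merton_share(mu=0.08, risk_free_rate=0.03, sigma=0.20, gamma=2.0)
print(risky_weight, merton_weight)
```

Setting `sigma_b=0.0` is the only change needed to turn the two-risky-asset rule into the risk-free benchmark case.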

While all models are lossy, I take issue with the idea of a risk-free rate. In particular, the real returns on a nation's treasuries are sensitive to interest rate changes, inflation, and currency fluctuations. It also shouldn't be overlooked that big debt crises occur fairly regularly and nations do default. With this in mind, I think extending the dual-asset Merton Share model to two risky assets is an improvement toward realism.


Part II: Extension to Three Assets

In Part I, we derived the optimal allocations under an independent binary asset model where the two stocks follow geometric Brownian motion processes. We now extend the analysis to three assets and then to an n-asset model.

Model Definition

We have three assets $A$, $B$, and $C$ whose prices follow independent GBM processes with parameters $(\mu_A, \sigma_A)$, $(\mu_B, \sigma_B)$, and $(\mu_C, \sigma_C)$, respectively. We allocate our wealth $W$ among $A$, $B$, and $C$ in proportions $\lambda_A$, $\lambda_B$, and $\lambda_C$, respectively, such that $\lambda_A + \lambda_B + \lambda_C = 1$. We again maintain a Constant Relative Risk Aversion (CRRA) utility function $U(W) = \frac{W^{1-\gamma} - 1}{1 - \gamma}$, where $W_t$ is our wealth at time $t$ and $\gamma$ is our relative risk aversion parameter.

Deriving Our Expected Utility Function

We intend to find the portfolio allocation $[\lambda_A, \lambda_B, \lambda_C]$ which maximizes the expected utility of our wealth in the next period, that is:

$$\underset{\lambda_A, \lambda_B, \lambda_C}{\max} \, E[U(W_{t+dt})].$$

After incorporating our CRRA utility function, we see

$$E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right].$$

We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$, $\lambda_B$, and $\lambda_C$:

$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}} + \lambda_C W_t \frac{dS_{C,t}}{S_{C,t}}.$$

Similarly, we note that each stock's price follows a GBM, defined by the stochastic differential equations (SDEs)

\begin{align*} dS_{A,t} &= \mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t} \\ dS_{B,t} &= \mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t} \\ dS_{C,t} &= \mu_C S_{C,t} \, dt + \sigma_C S_{C,t} \, dN_{C,t}, \end{align*}

where $N_{i,t}$ is our notation for the Wiener process driving asset $i$ at time $t$.

Substituting all three individual asset processes into our wealth SDE yields

$$dW_t = \lambda_A \frac{W_t}{S_{A,t}} (\mu_A S_{A,t} \, dt + \sigma_A S_{A,t} \, dN_{A,t}) + \lambda_B \frac{W_t}{S_{B,t}} (\mu_B S_{B,t} \, dt + \sigma_B S_{B,t} \, dN_{B,t}) + \lambda_C \frac{W_t}{S_{C,t}} (\mu_C S_{C,t} \, dt + \sigma_C S_{C,t} \, dN_{C,t}),$$

which we can simplify to

$$dW_t = \lambda_A (\mu_A W_t \, dt + \sigma_A W_t \, dN_{A,t}) + \lambda_B (\mu_B W_t \, dt + \sigma_B W_t \, dN_{B,t}) + \lambda_C (\mu_C W_t \, dt + \sigma_C W_t \, dN_{C,t}),$$

and then to

$$dW_t = W_t \left[ \lambda_A (\mu_A \, dt + \sigma_A \, dN_{A,t}) + \lambda_B (\mu_B \, dt + \sigma_B \, dN_{B,t}) + \lambda_C (\mu_C \, dt + \sigma_C \, dN_{C,t}) \right].$$

Now we can substitute this into our expected utility equation:

$$E[U(W_{t+dt})] = E\left[ \frac {(W_t + W_t ( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) ))^{1-\gamma}-1} {1-\gamma} \right].$$

By linearity of expectation, the constants $-1$ and $\frac{1}{1-\gamma}$ come outside the expectation:

$$\frac {E\left[(W_t + W_t ( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) ))^{1-\gamma}\right] - 1} {1-\gamma}.$$

Since $W_t$ is known at time $t$, we can then pull $W_t^{1-\gamma}$ out of the expectation, yielding

$$\frac {W_t^{1-\gamma} E\left[\left(1 + \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) \right)^{1-\gamma}\right] - 1} {1-\gamma}.$$

We now consider the second-order Taylor series expansion of $(1 + x)^{1-\gamma}$ around $x = 0$, which is justified because $x$ is very small over an infinitesimally small time increment $dt$. In this case, $x= \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t})$.

We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots$$

and we are careful to include the second-order term, which carries our volatility parameters.

This implies that

$$E[(1 + x)^{1-\gamma}] \approx 1 + (1 - \gamma)E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2].$$

Using $x= \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t})$, we see that $E[x] = (\lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C)dt$ because $E[dN_{i,t}]=0$.

Furthermore, to solve for $E[x^2]$, we substitute in $x$, which gives us

$$E[x^2] = E[( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) )^2].$$

Though this expression seems a bit unwieldy, we can simplify it by first expanding the square and then using the properties of Wiener processes, notably that $E[dN_{i,t}] = 0$ and $E[dN_{i,t}^2] = dt$.

Expanding the square gives:

\begin{align*} &\lambda_A^2 (\mu_A dt + \sigma_A dN_{A,t})^2 + \lambda_B^2 (\mu_B dt + \sigma_B dN_{B,t})^2 + \lambda_C^2 (\mu_C dt + \sigma_C dN_{C,t})^2 \\ &+ 2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t}) \\ &+ 2 \lambda_A \lambda_C (\mu_A dt + \sigma_A dN_{A,t})(\mu_C dt + \sigma_C dN_{C,t}) \\ &+ 2 \lambda_B \lambda_C (\mu_B dt + \sigma_B dN_{B,t})(\mu_C dt + \sigma_C dN_{C,t}) \end{align*}

We can now simplify each term by considering the properties of $dN_{i,t}$ noted before:

For terms like $\lambda_A^2 (\mu_A dt + \sigma_A dN_{A,t})^2$, the expansion will give $\lambda_A^2 \mu_A^2 dt^2 + 2 \lambda_A^2 \mu_A \sigma_A dt \, dN_{A,t} + \lambda_A^2 \sigma_A^2 dN_{A,t}^2$. When taking the expected value of this, the $dt \, dN_{A,t}$ term disappears, the $dt^2$ term is negligible, and $dN_{A,t}^2$ becomes $dt$, leaving $\lambda_A^2 \sigma_A^2 dt$.

The cross terms like $2 \lambda_A \lambda_B (\mu_A dt + \sigma_A dN_{A,t})(\mu_B dt + \sigma_B dN_{B,t})$ expand out to

$$2 \lambda_A \lambda_B \mu_A \mu_B dt^2 + 2 \lambda_A \lambda_B \mu_A \sigma_B dt \, dN_{B,t} + 2 \lambda_A \lambda_B \sigma_A \mu_B dt \, dN_{A,t} + 2 \lambda_A \lambda_B \sigma_A \sigma_B dN_{A,t} \, dN_{B,t}.$$

In expectation, each term in this expression goes to zero: the $dt^2$ term is of higher order than $dt$ and is dropped, $E[dN_{i,t}]=0$ removes the $dt \, dN_{i,t}$ terms, and $E[dN_{A,t} \, dN_{B,t}]=0$ because the Wiener processes are independent, as in Part I.

After applying these simplifications, the expected value expression becomes:

$$E\left[\left( \lambda_A (\mu_A dt + \sigma_A dN_{A,t}) + \lambda_B (\mu_B dt + \sigma_B dN_{B,t}) + \lambda_C (\mu_C dt + \sigma_C dN_{C,t}) \right)^2\right] = (\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2)dt.$$

Returning to the expectation we set out to evaluate, with these new results in hand, we see that

$$E[(1 + x)^{1-\gamma}] = 1 + (1 - \gamma)( \lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C )dt + \frac{(1 - \gamma)(-\gamma)}{2} ( \lambda_A^2 \sigma_A^2 dt + \lambda_B^2 \sigma_B^2 dt + \lambda_C^2 \sigma_C^2 dt ).$$

This means we can now write our full expected utility maximization equation as:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left[1 + (1 - \gamma)( \lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C )dt + \frac{(1 - \gamma)(-\gamma)}{2} ( \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2 )dt\right] - 1}{1-\gamma}$$

We make one final adjustment by including our $\lambda_A + \lambda_B + \lambda_C = 1$ constraint to remove a degree of freedom from our model.

We first substitute $\lambda_C = 1 - \lambda_A - \lambda_B$ into the linear term:

$$\lambda_A \mu_A + \lambda_B \mu_B + \lambda_C \mu_C = \lambda_A \mu_A + \lambda_B \mu_B + (1 - \lambda_A - \lambda_B) \mu_C = \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C.$$

This also makes intuitive sense. Initially, the expression was the sum of each allocation percentage times the expected return of that investment, which is the expected return of our portfolio. The final expression is the same expected return, except we can conceptualize it as 100% of our portfolio returning $\mu_C$, plus, for each non-$C$ asset, how much more or less we'd make on that fraction of our portfolio against a $C$-based benchmark.

Now we need to handle the quadratic term, $\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \lambda_C^2 \sigma_C^2$. We substitute out $\lambda_C$, which yields $\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + (1 - \lambda_A - \lambda_B)^2 \sigma_C^2$. We note that generally

$$\left(1 - \sum_{i=1}^{n} x_i\right)^2 = 1 - 2 \sum_{i=1}^{n} x_i + 2 \sum_{1 \leq i < j \leq n} x_i x_j + \sum_{i=1}^{n} x_i^2.$$

This generalization will help us when we extend to the n-asset framework, but we can use it in our three-asset model too.
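The general squared-sum identity is simple to spot-check numerically before relying on it. The sketch below evaluates both sides for random vectors of several lengths; the helper names and the choice of random inputs are illustrative assumptions.

```python
import random


def squared_complement(xs):
    """Left-hand side: (1 - sum of x_i) squared."""
    return (1.0 - sum(xs)) ** 2


def expanded_form(xs):
    """Right-hand side: 1 - 2*sum(x_i) + 2*sum_{i<j} x_i*x_j + sum(x_i^2)."""
    n = len(xs)
    cross = sum(xs[i] * xs[j] for i in range(n) for j in range(i + 1, n))
    return 1.0 - 2.0 * sum(xs) + 2.0 * cross + sum(x * x for x in xs)


rng = random.Random(1)  # fixed seed for reproducibility
max_gap = 0.0
for n in range(1, 7):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    max_gap = max(max_gap, abs(squared_complement(xs) - expanded_form(xs)))

print(f"largest discrepancy across n = 1..6: {max_gap:.2e}")
```

Any discrepancy should be at the level of floating-point rounding, for every $n$ tried.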

Now we expand and simplify the quadratic term:

\begin{align*}
&\lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + (1 - \lambda_A - \lambda_B)^2 \sigma_C^2 \\
&= \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + ( 1 - 2\lambda_A - 2\lambda_B + 2\lambda_A\lambda_B + \lambda_A^2 + \lambda_B^2 ) \sigma_C^2 \\
&= \lambda_A^2 \sigma_A^2 + \lambda_B^2 \sigma_B^2 + \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + 2\lambda_A\lambda_B \sigma_C^2 + \lambda_A^2 \sigma_C^2 + \lambda_B^2 \sigma_C^2 \\
&= \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2
\end{align*}

We can now substitute these simplified expressions back into the original equation:

$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma} \left( 1 + (1 - \gamma) \left( \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C\right)dt + \frac{(1 - \gamma)(-\gamma)}{2} \left( \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2 \right)dt \right) - 1}{1-\gamma}$$

This expression for our expected utility incorporates all of the constraints of our model now that $\lambda_C$ has been eliminated in favor of $\lambda_A$ and $\lambda_B$. Crucially, this removes a degree of freedom from our model and allows the matrix inversion technique which follows to succeed.

Optimizing Portfolio Weights to Maximize Expected Utility (Take Three)

We know that the maximum of $E[U(W_{t+dt})]$ with respect to our $\lambda_i$s will have a tangent plane with zero slope in the $\lambda_A$ and $\lambda_B$ directions. That is, $\frac{dE[U(W_{t+dt})]}{d\lambda_i}=0$ for $i \in \{A,B\}$. From this we will get a system of equations which we can then solve to get our optimal portfolio allocations. We start by solving for $\frac{d}{d\lambda_A}E[U(W_{t+dt})]$.

To keep the lines readable, write $R$ for the linear (return) term and $V$ for the quadratic (variance) term of the expected utility expression:

$$R = \lambda_A (\mu_A - \mu_C) + \lambda_B (\mu_B - \mu_C) + \mu_C, \qquad V = \lambda_A^2 (\sigma_A^2 + \sigma_C^2) + \lambda_B^2 (\sigma_B^2 + \sigma_C^2) + 2\lambda_A\lambda_B \sigma_C^2 - 2\lambda_A \sigma_C^2 - 2\lambda_B \sigma_C^2 + \sigma_C^2.$$

\begin{align*}
\frac{d}{d\lambda_A} E[U(W_{t+dt})] &= \frac{d}{d\lambda_A} \left( \frac{ W_t^{1-\gamma} \left( 1 + (1 - \gamma) R \, dt + \frac{(1 - \gamma)(-\gamma)}{2} V \, dt \right) - 1 }{ 1-\gamma } \right) \\
&= \frac{W_t^{1-\gamma}}{1-\gamma} \frac{d}{d\lambda_A} \left( 1 + (1 - \gamma) R \, dt + \frac{(1 - \gamma)(-\gamma)}{2} V \, dt \right) \\
&= \frac{W_t^{1-\gamma}}{1-\gamma} \left[ \frac{d}{d\lambda_A} \left( (1 - \gamma) R \, dt \right) + \frac{d}{d\lambda_A} \left( \frac{(1 - \gamma)(-\gamma)}{2} V \, dt \right) \right] \\
&= W_t^{1-\gamma} \frac{d}{d\lambda_A} \left( R \, dt \right) - \frac{W_t^{1-\gamma} \gamma}{2} \frac{d}{d\lambda_A} \left( V \, dt \right) \\
&= W_t^{1-\gamma} \left( (\mu_A - \mu_C) dt + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C) dt \right) \\
&\quad - \frac{W_t^{1-\gamma} \gamma}{2} \left( 2 \lambda_A (\sigma_A^2 + \sigma_C^2) dt + 2 \lambda_B (\sigma_B^2 + \sigma_C^2) \frac{d \lambda_B}{d\lambda_A} dt + 2 \sigma_C^2 \left(\lambda_A \frac{d \lambda_B}{d\lambda_A} + \lambda_B\right) dt - 2 \sigma_C^2 dt - 2 \sigma_C^2 \frac{d\lambda_B}{d\lambda_A} dt \right) \\
&= W_t^{1-\gamma} dt \left( (\mu_A - \mu_C) + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C) - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) \frac{d \lambda_B}{d\lambda_A} - \gamma \sigma_C^2 \lambda_A \frac{d \lambda_B}{d\lambda_A} - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + \gamma \sigma_C^2 \frac{d\lambda_B}{d\lambda_A} \right) \\
&= W_t^{1-\gamma} dt \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} \left(\mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2\right) \right)
\end{align*}

We can apply a symmetry argument to find $\frac{dE[U]}{d\lambda_B}$, because swapping the roles of $A$ and $B$ (weights and parameters together) leaves $E[U]$ unchanged. Setting each of these derivatives equal to zero gives us the system of equations we're looking for.

\begin{align*}
\frac{dE[U]}{d\lambda_A} &= W_t^{1-\gamma} dt \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} \left( \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2 \right) \right) = 0 \\
\frac{dE[U]}{d\lambda_B} &= W_t^{1-\gamma} dt \left( \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_A + \gamma \sigma_C^2 + \frac{d \lambda_A}{d\lambda_B} \left( \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) - \gamma \sigma_C^2 \lambda_B + \gamma \sigma_C^2 \right) \right) = 0
\end{align*}

We now do some simplification.

After cancelling $W_t^{1-\gamma} dt$ (which is nonzero) and moving the $\gamma$ terms across, the first condition becomes:

$$\gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_B - \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} \left( \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_A - \gamma \sigma_C^2 \right) = \mu_A - \mu_C + \frac{d \lambda_B}{d\lambda_A} \left( \mu_B - \mu_C \right)$$

Keeping only the terms that contain a portfolio weight on the left-hand side, the rearranged form of both conditions becomes:

\begin{align*}
\gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_B + \frac{d \lambda_B}{d\lambda_A} \left( \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_A \right) &= \mu_A - \mu_C + \gamma \sigma_C^2 + \frac{d \lambda_B}{d\lambda_A} (\mu_B - \mu_C + \gamma \sigma_C^2) \\
\gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_A + \frac{d \lambda_A}{d\lambda_B} \left( \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 \lambda_B \right) &= \mu_B - \mu_C + \gamma \sigma_C^2 + \frac{d \lambda_A}{d\lambda_B} (\mu_A - \mu_C + \gamma \sigma_C^2)
\end{align*}

I think the answer might be:

$$\lambda_A = \frac{\gamma \sigma_B^2 \sigma_C^2 + \mu_A \sigma_B^2 + \mu_A \sigma_C^2 - \mu_B \sigma_C^2 - \mu_C \sigma_B^2}{\gamma \sigma_A^2 \sigma_B^2 + \gamma \sigma_A^2 \sigma_C^2 + \gamma \sigma_B^2 \sigma_C^2}$$

$$\lambda_B = \frac{\gamma \sigma_A^2 \sigma_C^2 - \mu_A \sigma_C^2 + \mu_B \sigma_A^2 + \mu_B \sigma_C^2 - \mu_C \sigma_A^2}{\gamma \sigma_A^2 \sigma_B^2 + \gamma \sigma_A^2 \sigma_C^2 + \gamma \sigma_B^2 \sigma_C^2}$$

Simplifying:

$$\lambda_A = \frac{ \sigma_B^2 (\mu_A - \mu_C) + \sigma_C^2 (\mu_A - \mu_B) + \gamma \sigma_B^2 \sigma_C^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$

$$\lambda_B = \frac{ \sigma_C^2 (\mu_B - \mu_A) + \sigma_A^2 (\mu_B - \mu_C) + \gamma \sigma_A^2 \sigma_C^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$

$$\lambda_C = \frac{ \sigma_A^2 (\mu_C - \mu_B) + \sigma_B^2 (\mu_C - \mu_A) + \gamma \sigma_A^2 \sigma_B^2 }{ \gamma ( \sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2 )}$$

As a sanity check, the three weights sum to one, and when $\mu_A = \mu_B = \mu_C$ they reduce to inverse-variance weights, for example $\lambda_A = \frac{\sigma_B^2 \sigma_C^2}{\sigma_A^2 \sigma_B^2 + \sigma_A^2 \sigma_C^2 + \sigma_B^2 \sigma_C^2}$.

Let's solve for the partial derivatives.

$$\frac{d \lambda_B}{d\lambda_A} = - \frac{ \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_B) }{ \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_A) }$$

$$\frac{d \lambda_A}{d\lambda_B} = - \frac{ \mu_B - \mu_C - \gamma \lambda_B (\sigma_B^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_A) }{ \mu_A - \mu_C - \gamma \lambda_A (\sigma_A^2 + \sigma_C^2) + \gamma \sigma_C^2 (1 - \lambda_B) }$$

Part III: Risky Numeraire

In Part I, we made the case that there may be no such thing as a risk-free asset. In the case of treasuries, the typical example of the risk-free asset, the holder is exposed to inflation risks, dollar fluctuations, interest rate changes, and other factors. Given this, we constructed a model for the optimal allocation between two risky assets.

Every investor has their own basket of goods under which they estimate changes in their real purchasing power.1 This subjective basket is not exactly cash, which was the motivation for the first essay. But what if it were somehow a known tradable asset? Or even more simply, perhaps an investor wants to denominate their returns in ETH, SPY, or some known liquid asset. Does our previous analysis still hold if we change the currency units? In this essay, I examine the case where, instead of using cash as a base for both assets $A$ and $B$, we designate asset $A$ as the numeraire, expressing asset $B$ in terms of $A$.

Let us begin with the optimal allocations derived from the previous cash-denominated model:

$$\lambda_A = \frac{\mu_A - \mu_B + \gamma \sigma_B^2}{\gamma (\sigma_A^2 + \sigma_B^2)}, \quad \lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}.$$

Both assets $A$ and $B$ are initially denominated in cash. We now shift our perspective by setting asset $A$ as the numeraire, effectively redefining all quantities in relation to $A$. This transition moves us from a cash-denominated framework to one in which asset $A$ is the central reference point.

Key Definitions in the $A$-Denominated Model:

  • $S_{B/A}$: Price of asset $B$ relative to $A$.
  • $S_{A/A}$: Price of asset $A$ relative to itself.
  • $\mu_{B/A}$, $\sigma_{B/A}$: Drift and volatility of $B$ with respect to $A$.
  • $\lambda_{A/A}$, $\lambda_{B/A}$: Portfolio weights for $A$ and $B$ in the $A$-denominated framework.
  • $W_{A,t}$: Wealth expressed in terms of $A$.
  • $dW_{A,t}$: Wealth dynamics in terms of $A$.

We start by solving for the simplest expressions.

$S_{A/A}$ is the price of asset $A$ relative to itself, so $S_{A/A} = \frac{S_A}{S_A} = 1$.

Since $S_{A/A} = 1$, its drift and volatility are zero: $\mu_{A/A} = 0$, $\sigma_{A/A} = 0$.

The price of $B$ in terms of $A$ is given by $S_{B/A} = \frac{S_B}{S_A}$.

In this model, the drift and volatility of $B$ relative to $A$ are defined as:

$$\mu_{B/A} = \mu_B - \mu_A, \quad \sigma_{B/A} = \sqrt{\sigma_A^2 + \sigma_B^2}.$$

The transition from the cash-denominated optimal $B$ allocation, $\lambda_B = \frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$, to the $A$-denominated model can be achieved by the following transformations:

Replace the cash-denominated drift difference $\mu_B - \mu_A$ with $\mu_{B/A}$:

$$\frac{\mu_B - \mu_A + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)}$$

Next, replace the combined variance $\sigma_A^2 + \sigma_B^2$ with $\sigma_{B/A}^2$:

$$\frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma (\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma \sigma_{B/A}^2}$$

Finally, note that the numeraire asset $A/A$ has zero drift and zero volatility, $\mu_{A/A} = 0$ and $\sigma_{A/A} = 0$, so the remaining $\gamma \sigma_A^2$ term in the numerator drops out:

$$\frac{\mu_{B/A} + \gamma \sigma_A^2}{\gamma \sigma_{B/A}^2} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

Thus, the optimal weight for $B$ in the $A$-denominated framework becomes:

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

The derivation above is perfectly legitimate, though if we don't want to take the results of my previous essay as axioms, we can start again from scratch.

Fully Deriving the Optimal Portfolio Weights in the $A$-Denominated Model

We start with maximizing our expected CRRA utility as before:

$$\underset{\lambda_{B/A}}{\max} \, E\left[U(W_{A,t+dt})\right] = E\left[\frac{(W_{A,t} + dW_{A,t})^{1-\gamma} - 1}{1 - \gamma}\right]$$

In the $A$-denominated model the only risky asset is $B/A$, so the optimization problem simplifies to maximizing expected utility with respect to $\lambda_{B/A}$ alone.

Since $S_{A/A} = 1$, asset $A$ contributes no differential to wealth dynamics in $A$-terms. Wealth dynamics in this framework are therefore driven solely by $B/A$:

$$dW_{A,t} = \lambda_{B/A} W_{A,t} \frac{dS_{B/A,t}}{S_{B/A,t}},$$

$$dW_{A,t} = \lambda_{B/A} W_{A,t} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t \right).$$

We now substitute this wealth dynamic term into our expected utility calculation:

$$\underset{\lambda_{B/A}}{\max} \, E\left[U(W_{A,t+dt})\right] = E\left[\frac{(W_{A,t} + \lambda_{B/A} W_{A,t} ( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t ))^{1-\gamma} - 1}{1 - \gamma}\right].$$

Applying a second-order Taylor expansion to $E\left[(1 + x)^{1 - \gamma}\right]$ where $x = \lambda_{B/A} \left( \mu_{B/A} \, dt + \sigma_{B/A} \, dN_t \right)$, we get:

$$E\left[(1 + x)^{1 - \gamma}\right] \approx 1 + (1 - \gamma) E[x] + \frac{(1 - \gamma)(-\gamma)}{2} E[x^2].$$

Using $E[dN_t] = 0$ and $E[dN_t^2] = dt$, and dropping the $dt^2$ term, the two moments are:

$$E[x] = \lambda_{B/A} \mu_{B/A} dt$$

$$E[x^2] = \lambda_{B/A}^2 \sigma_{B/A}^2 dt$$

We now simplify our expected utility expression:

$$E[U(W_{A,t+dt})] \approx \frac{W_{A,t}^{1 - \gamma} \left(1 + (1 - \gamma) \lambda_{B/A} \mu_{B/A} dt + \frac{(1 - \gamma)(-\gamma)}{2} \lambda_{B/A}^2 \sigma_{B/A}^2 dt \right) - 1}{1 - \gamma}.$$

Now we differentiate the bracketed term with respect to $\lambda_{B/A}$ and set the derivative to zero:

$$\frac{d}{d\lambda_{B/A}} \left[ (1 - \gamma) \lambda_{B/A} \mu_{B/A} dt - \frac{\gamma (1 - \gamma)}{2} \lambda_{B/A}^2 \sigma_{B/A}^2 dt \right] = 0.$$

Simplify:

$$(1 - \gamma) \mu_{B/A} dt - \gamma (1 - \gamma) \lambda_{B/A} \sigma_{B/A}^2 dt = 0.$$

Divide both sides by $(1 - \gamma) dt$:

$$\mu_{B/A} - \gamma \lambda_{B/A} \sigma_{B/A}^2 = 0.$$

Thus, the optimal weight for $B/A$ is

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

And the optimal weight for $A/A$ is:

$$\lambda_{A/A} = 1 - \lambda_{B/A} = 1 - \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}$$

Conclusion:

We can now express the optimal weights in the $A$-denominated model directly in terms of $\mu_{B/A}$ and $\sigma_{B/A}^2$.

$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma \sigma_{B/A}^2}, \quad \lambda_{A/A} = 1 - \lambda_{B/A}.$$

Thus, after transforming the cash-denominated model's optimal weights to the $A$-denominated model's optimal weights, we see that we've derived the famous Merton share.
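As a quick numerical sanity check on the first-order condition, here is a minimal Python sketch. The input values and the names `merton_share` and `objective` are illustrative assumptions, not part of the derivation. It evaluates the closed form $\mu_{B/A} / (\gamma \sigma_{B/A}^2)$ and compares it with a brute-force scan of the quadratic term we differentiated above.

```python
def merton_share(mu, sigma, gamma):
    """Closed-form weight mu / (gamma * sigma^2) from the first-order condition."""
    return mu / (gamma * sigma ** 2)


def objective(lam, mu, sigma, gamma):
    """Bracketed term with (1 - gamma) * dt divided out: lam*mu - (gamma/2)*lam^2*sigma^2."""
    return lam * mu - 0.5 * gamma * lam ** 2 * sigma ** 2


# Hypothetical inputs: drift and volatility of B measured in units of A.
mu_BA, sigma_BA, gamma = 0.05, 0.20, 2.0

lam_BA = merton_share(mu_BA, sigma_BA, gamma)
lam_AA = 1.0 - lam_BA

# Brute-force check: scan weights from -2 to 2 in steps of 0.001 and keep the best.
grid = [i / 1000 for i in range(-2000, 2001)]
best = max(grid, key=lambda lam: objective(lam, mu_BA, sigma_BA, gamma))

print(lam_BA, lam_AA, best)
```

The grid search is deliberately crude. Its only job is to confirm that the maximizer of the quadratic sits where the closed form says it should.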


Part IV: Extension to n Assets

In Part I, we solved the binary asset case. In Part II, we tried to move from two assets to three assets. The three-asset case is useful because it reveals something important: the algebra gets messy very quickly if we try to expand everything by hand.

At two assets, it is perfectly reasonable to eliminate one weight by writing $\lambda_B = 1 - \lambda_A$. At three assets, it is still possible to eliminate one weight by writing $\lambda_C = 1 - \lambda_A - \lambda_B$, though the expression becomes more cumbersome. But at $n$ assets, this approach becomes a bit masochistic.

There is also a subtle but important calculus point here. Once we write $\lambda_C = 1 - \lambda_A - \lambda_B$, the remaining variables $\lambda_A$ and $\lambda_B$ are independent coordinates for the constrained portfolio surface. So when we differentiate with respect to $\lambda_A$, we hold $\lambda_B$ fixed. We do not need to include a term like $\frac{d\lambda_B}{d\lambda_A}$.

Rather than trying to manage all of this by hand, the cleanest way forward is to use a Lagrange multiplier. This lets us impose the full-investment constraint directly and gives a formula that scales naturally from two assets to three assets to $n$ assets.


Model Definition

Suppose we have a universe of $n$ risky assets indexed by $i \in \{1,2,\ldots,n\}$. Each asset price follows an independent GBM process:

$$\frac{dS_{i,t}}{S_{i,t}} = \mu_i \, dt + \sigma_i \, dN_{i,t},$$

where $\mu_i$ is the drift of asset $i$, $\sigma_i$ is the volatility of asset $i$, and $N_{i,t}$ is a Wiener process for asset $i$.

We assume the Brownian shocks are independent across assets, so

$$E[dN_{i,t}dN_{j,t}] = \begin{cases} dt, & i=j, \\ 0, & i \neq j. \end{cases}$$

Let $\lambda_i$ denote the portfolio weight on asset $i$. As before, our portfolio weights must sum to one:

$$\sum_{i=1}^{n} \lambda_i = 1.$$

We again use CRRA utility:

$$U(W) = \frac{W^{1-\gamma} - 1}{1-\gamma},$$

where $\gamma$ is our relative risk aversion parameter.

Our goal is to solve

$$\underset{\lambda_1,\ldots,\lambda_n}{\max} \, E[U(W_{t+dt})],$$

subject to

$$\sum_{i=1}^{n} \lambda_i = 1.$$

Wealth Dynamics

The wealth process evolves according to the chosen portfolio weights:

$$dW_t = W_t \sum_{i=1}^{n} \lambda_i \frac{dS_{i,t}}{S_{i,t}}.$$

Substituting in each asset's GBM process gives

$$dW_t = W_t \sum_{i=1}^{n} \lambda_i (\mu_i \, dt + \sigma_i \, dN_{i,t}).$$

Equivalently,

$$\frac{dW_t}{W_t} = \sum_{i=1}^{n} \lambda_i \mu_i \, dt + \sum_{i=1}^{n} \lambda_i \sigma_i \, dN_{i,t}.$$

Now define

$$x = \sum_{i=1}^{n} \lambda_i (\mu_i \, dt + \sigma_i \, dN_{i,t}).$$

Then

$$W_{t+dt} = W_t(1+x).$$

So expected utility becomes

$$E[U(W_{t+dt})] = E\left[ \frac{W_t^{1-\gamma}(1+x)^{1-\gamma}-1}{1-\gamma} \right].$$

Since $W_t$ is known at time $t$, we can pull $W_t^{1-\gamma}$ out of the expectation:

$$E[U(W_{t+dt})] = \frac{ W_t^{1-\gamma}E[(1+x)^{1-\gamma}]-1 }{ 1-\gamma }.$$

Taylor Expanding Expected Utility

As before, we use a second-order Taylor expansion:

$$E[(1+x)^{1-\gamma}] \approx 1 + (1-\gamma)E[x] + \frac{(1-\gamma)(-\gamma)}{2}E[x^2].$$

First, we compute $E[x]$:

$$E[x] = E\left[ \sum_{i=1}^{n} \lambda_i (\mu_i \, dt + \sigma_i \, dN_{i,t}) \right].$$

Since $E[dN_{i,t}]=0$, this reduces to

$$E[x] = \left( \sum_{i=1}^{n} \lambda_i \mu_i \right)dt.$$

Now we compute $E[x^2]$:

$$E[x^2] = E\left[ \left( \sum_{i=1}^{n} \lambda_i (\mu_i \, dt + \sigma_i \, dN_{i,t}) \right)^2 \right].$$

The $dt^2$ terms disappear. The $dt \, dN_{i,t}$ terms disappear in expectation. And since the assets are independent, the cross terms $dN_{i,t}dN_{j,t}$ vanish for $i \neq j$.

Thus, only the own-variance terms remain:

$$E[x^2] = \left( \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2 \right)dt.$$

Substituting these terms back into the Taylor expansion gives

$$E[(1+x)^{1-\gamma}] \approx 1 + (1-\gamma) \left( \sum_{i=1}^{n} \lambda_i \mu_i \right)dt - \frac{\gamma(1-\gamma)}{2} \left( \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2 \right)dt.$$

So expected utility is approximately

$$E[U(W_{t+dt})] \approx \frac{ W_t^{1-\gamma} \left[ 1 + (1-\gamma) \left( \sum_{i=1}^{n} \lambda_i \mu_i \right)dt - \frac{\gamma(1-\gamma)}{2} \left( \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2 \right)dt \right] -1 }{ 1-\gamma }.$$

The terms $W_t$, $\gamma$, and $dt$ are fixed with respect to our portfolio weights. Therefore, maximizing expected utility is equivalent to maximizing the simpler quadratic objective

$$\sum_{i=1}^{n} \lambda_i \mu_i - \frac{\gamma}{2} \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2,$$

subject to

$$\sum_{i=1}^{n} \lambda_i = 1.$$

This is nice. The whole expected utility problem has collapsed into a tradeoff between portfolio drift and portfolio variance.


Solving the n-Asset Problem with a Lagrange Multiplier

We now solve

$$\underset{\lambda_1,\ldots,\lambda_n}{\max} \left[ \sum_{i=1}^{n} \lambda_i \mu_i - \frac{\gamma}{2} \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2 \right],$$

subject to

$$\sum_{i=1}^{n} \lambda_i = 1.$$

Define the Lagrangian:

$$\mathcal{L} = \sum_{i=1}^{n} \lambda_i \mu_i - \frac{\gamma}{2} \sum_{i=1}^{n} \lambda_i^2 \sigma_i^2 - \nu \left( \sum_{i=1}^{n} \lambda_i - 1 \right),$$

where $\nu$ is the Lagrange multiplier on the full-investment constraint.

Taking the derivative with respect to $\lambda_i$ gives

$$\frac{\partial \mathcal{L}}{\partial \lambda_i} = \mu_i - \gamma \lambda_i \sigma_i^2 - \nu.$$

Setting the first-order condition equal to zero:

$$\mu_i - \gamma \lambda_i \sigma_i^2 - \nu = 0.$$

Rearranging,

$$\gamma \lambda_i \sigma_i^2 = \mu_i - \nu.$$

Therefore,

$$\lambda_i = \frac{\mu_i - \nu}{\gamma \sigma_i^2}.$$

This already tells us a lot. Holding $\nu$ fixed, the optimal weight on asset $i$ is increasing in its drift $\mu_i$ and shrinks in magnitude as its variance $\sigma_i^2$ grows. This is exactly what we would hope to see.

Now we use the constraint that the weights sum to one:

$$\sum_{i=1}^{n} \lambda_i = 1.$$

Substituting in our expression for $\lambda_i$,

$$\sum_{i=1}^{n} \frac{\mu_i - \nu}{\gamma \sigma_i^2} = 1.$$

Multiplying both sides by $\gamma$,

$$\sum_{i=1}^{n} \frac{\mu_i - \nu}{\sigma_i^2} = \gamma.$$

Expanding,

$$\sum_{i=1}^{n} \frac{\mu_i}{\sigma_i^2} - \nu \sum_{i=1}^{n} \frac{1}{\sigma_i^2} = \gamma.$$

Solving for $\nu$ gives

$$\nu = \frac{ \sum_{i=1}^{n} \frac{\mu_i}{\sigma_i^2} - \gamma }{ \sum_{i=1}^{n} \frac{1}{\sigma_i^2} }.$$

Thus, the optimal allocation to asset $i$ is

$$\lambda_i^* = \frac{\mu_i - \nu}{\gamma \sigma_i^2},$$

where

$$\nu = \frac{ \sum_{j=1}^{n} \frac{\mu_j}{\sigma_j^2} - \gamma }{ \sum_{j=1}^{n} \frac{1}{\sigma_j^2} }.$$

This is the clean $n$-asset solution for independent risky assets.
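To make the formula concrete, here is a short Python sketch of the Lagrange multiplier solution. The function name `optimal_weights` and the four-asset inputs are made up for illustration. The code computes $\nu$ first and then each $\lambda_i^*$, exactly in the order of the derivation.

```python
def optimal_weights(mus, sigmas, gamma):
    """Return (weights, nu) for independent assets under full investment."""
    inv_var = [1.0 / s ** 2 for s in sigmas]
    # nu = (sum_j mu_j / sigma_j^2 - gamma) / (sum_j 1 / sigma_j^2)
    nu = (sum(m * iv for m, iv in zip(mus, inv_var)) - gamma) / sum(inv_var)
    # lambda_i = (mu_i - nu) / (gamma * sigma_i^2)
    weights = [(m - nu) * iv / gamma for m, iv in zip(mus, inv_var)]
    return weights, nu


# Hypothetical drifts and volatilities for four independent assets.
mus = [0.08, 0.05, 0.11, 0.03]
sigmas = [0.20, 0.10, 0.35, 0.05]
gamma = 3.0

weights, nu = optimal_weights(mus, sigmas, gamma)
print("weights:", weights)
print("nu:", nu)
print("sum of weights:", sum(weights))
```

Nothing here restricts the weights to lie between zero and one, so with other inputs some of them can come out negative or above one.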


Sanity Check: Recovering the Two-Asset Formula

It is worth checking that this formula recovers our original binary asset result.

For two assets, the solution gives

$$\lambda_A^* = \frac{\mu_A - \nu}{\gamma \sigma_A^2}, \quad \lambda_B^* = \frac{\mu_B - \nu}{\gamma \sigma_B^2},$$

where

$$\nu = \frac{ \frac{\mu_A}{\sigma_A^2} + \frac{\mu_B}{\sigma_B^2} - \gamma }{ \frac{1}{\sigma_A^2} + \frac{1}{\sigma_B^2} }.$$

After simplifying, this gives

$$\lambda_A^* = \frac{ \mu_A - \mu_B + \gamma \sigma_B^2 }{ \gamma(\sigma_A^2 + \sigma_B^2) },$$

and

$$\lambda_B^* = \frac{ \mu_B - \mu_A + \gamma \sigma_A^2 }{ \gamma(\sigma_A^2 + \sigma_B^2) }.$$

This is exactly the result from Part I.
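For readers who want the simplification spelled out, here is one way to do it for $\lambda_A^*$ (the case of $\lambda_B^*$ follows by swapping the labels). First multiply the numerator and denominator of $\nu$ by $\sigma_A^2 \sigma_B^2$:

$$\nu = \frac{ \mu_A \sigma_B^2 + \mu_B \sigma_A^2 - \gamma \sigma_A^2 \sigma_B^2 }{ \sigma_A^2 + \sigma_B^2 }.$$

Then subtract from $\mu_A$ over the common denominator:

$$\mu_A - \nu = \frac{ \mu_A \sigma_A^2 + \mu_A \sigma_B^2 - \mu_A \sigma_B^2 - \mu_B \sigma_A^2 + \gamma \sigma_A^2 \sigma_B^2 }{ \sigma_A^2 + \sigma_B^2 } = \frac{ \sigma_A^2 \left( \mu_A - \mu_B + \gamma \sigma_B^2 \right) }{ \sigma_A^2 + \sigma_B^2 }.$$

Dividing by $\gamma \sigma_A^2$ cancels the factor of $\sigma_A^2$:

$$\lambda_A^* = \frac{\mu_A - \nu}{\gamma \sigma_A^2} = \frac{ \mu_A - \mu_B + \gamma \sigma_B^2 }{ \gamma(\sigma_A^2 + \sigma_B^2) }.$$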


Sanity Check: The Correct Three-Asset Formula

The same formula also gives a clean three-asset result. For assets $A$, $B$, and $C$, we get

$$\lambda_A^* = \frac{ \sigma_C^2(\mu_A-\mu_B) + \sigma_B^2(\mu_A-\mu_C) + \gamma \sigma_B^2 \sigma_C^2 }{ \gamma( \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 ) },$$

$$\lambda_B^* = \frac{ \sigma_C^2(\mu_B-\mu_A) + \sigma_A^2(\mu_B-\mu_C) + \gamma \sigma_A^2 \sigma_C^2 }{ \gamma( \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 ) },$$

and

$$\lambda_C^* = \frac{ \sigma_B^2(\mu_C-\mu_A) + \sigma_A^2(\mu_C-\mu_B) + \gamma \sigma_A^2 \sigma_B^2 }{ \gamma( \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 ) }.$$

These weights sum to one. They also have a nice interpretation: each asset receives a baseline allocation determined by the other assets' variances, plus a tilt based on its return advantage over the other assets.

If all three assets have the same expected return, so $\mu_A = \mu_B = \mu_C$, then all of the return-difference terms disappear and we get

$$\lambda_A^* = \frac{ \sigma_B^2\sigma_C^2 }{ \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 },$$

$$\lambda_B^* = \frac{ \sigma_A^2\sigma_C^2 }{ \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 },$$

and

$$\lambda_C^* = \frac{ \sigma_A^2\sigma_B^2 }{ \sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2 }.$$

This is exactly the inverse-variance allocation. When all assets have the same drift, the only remaining problem is how to allocate across risk, and the lower-variance assets receive larger weights.
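Both claims in this sanity check are easy to test numerically. The sketch below uses made-up drifts and variances and two illustrative function names. It compares the three-asset closed form with the general Lagrange multiplier form, then checks that equal drifts reproduce inverse-variance weights.

```python
def lagrange_weights(mus, variances, gamma):
    """General independent-asset solution: lambda_i = (mu_i - nu) / (gamma * var_i)."""
    inv = [1.0 / v for v in variances]
    nu = (sum(m * i for m, i in zip(mus, inv)) - gamma) / sum(inv)
    return [(m - nu) * i / gamma for m, i in zip(mus, inv)]


def three_asset_closed_form(mus, variances, gamma):
    """Explicit three-asset weights for assets A, B, C."""
    (mA, mB, mC), (vA, vB, vC) = mus, variances
    denom = gamma * (vA * vB + vA * vC + vB * vC)
    lam_A = (vC * (mA - mB) + vB * (mA - mC) + gamma * vB * vC) / denom
    lam_B = (vC * (mB - mA) + vA * (mB - mC) + gamma * vA * vC) / denom
    lam_C = (vB * (mC - mA) + vA * (mC - mB) + gamma * vA * vB) / denom
    return [lam_A, lam_B, lam_C]


# Hypothetical inputs (variances, not volatilities).
mus = [0.08, 0.05, 0.11]
variances = [0.04, 0.01, 0.09]
gamma = 3.0

closed = three_asset_closed_form(mus, variances, gamma)
general = lagrange_weights(mus, variances, gamma)

# Equal drifts: the closed form should collapse to inverse-variance weights.
equal_drift = three_asset_closed_form([0.06, 0.06, 0.06], variances, gamma)
total_inv = sum(1.0 / v for v in variances)
inverse_variance = [(1.0 / v) / total_inv for v in variances]

print(closed, general)
print(equal_drift, inverse_variance)
```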


Fully Scalar n-Asset Formula

We can also write the $n$-asset solution as a single scalar expression. Let

$$q_i = \sigma_i^2.$$

Then the optimal weight on asset $i$ is

$$\lambda_i^* = \frac{ \gamma \prod_{k \neq i} q_k + \sum_{j \neq i} (\mu_i-\mu_j) \prod_{k \neq i,j} q_k }{ \gamma \sum_{j=1}^{n} \prod_{k \neq j} q_k }.$$

Substituting $q_i=\sigma_i^2$, this becomes

$$\lambda_i^* = \frac{ \gamma \prod_{k \neq i} \sigma_k^2 + \sum_{j \neq i} (\mu_i-\mu_j) \prod_{k \neq i,j} \sigma_k^2 }{ \gamma \sum_{j=1}^{n} \prod_{k \neq j} \sigma_k^2 }.$$

The convention here is that an empty product equals $1$. This matters in the two-asset case, where the term $\prod_{k \neq i,j} \sigma_k^2$ has no elements.

This scalar expression is useful because it shows the direct generalization of the two-asset and three-asset formulas. But it is not the form I would actually use computationally. For computation and interpretation, the Lagrange multiplier form is cleaner.
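Here is a small Python sketch that implements the scalar product formula literally and compares it with the Lagrange multiplier form. The inputs and function names are illustrative assumptions. Python's `math.prod` returns 1 for an empty iterable, which matches the empty-product convention needed in the two-asset case.

```python
from math import prod


def scalar_weights(mus, variances, gamma):
    """Product-form weights; q_k = variances[k]."""
    n = len(mus)
    denom = gamma * sum(
        prod(variances[k] for k in range(n) if k != j) for j in range(n)
    )
    weights = []
    for i in range(n):
        base = gamma * prod(variances[k] for k in range(n) if k != i)
        tilt = sum(
            (mus[i] - mus[j]) * prod(variances[k] for k in range(n) if k not in (i, j))
            for j in range(n)
            if j != i
        )
        weights.append((base + tilt) / denom)
    return weights


def lagrange_weights(mus, variances, gamma):
    """Lagrange multiplier form, used here only as a cross-check."""
    inv = [1.0 / v for v in variances]
    nu = (sum(m * i for m, i in zip(mus, inv)) - gamma) / sum(inv)
    return [(m - nu) * i / gamma for m, i in zip(mus, inv)]


# Hypothetical inputs for a two-asset and a four-asset universe.
two = ([0.08, 0.05], [0.04, 0.01])
four = ([0.08, 0.05, 0.11, 0.03], [0.04, 0.01, 0.1225, 0.0025])
gamma = 3.0

scalar_two = scalar_weights(*two, gamma)
scalar_four = scalar_weights(*four, gamma)
print(scalar_two, lagrange_weights(*two, gamma))
print(scalar_four, lagrange_weights(*four, gamma))
```

The nested products make this version quadratic-to-cubic in $n$, which is one practical reason to prefer the Lagrange multiplier form for computation.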


Interpreting the Independent n-Asset Solution

We can rewrite the solution in a way that makes the intuition clearer.

Define

$$H = \sum_{j=1}^{n} \frac{1}{\sigma_j^2}.$$

Now define the inverse-variance-weighted average drift:

$$\bar{\mu}_{\sigma^{-2}} = \frac{ \sum_{j=1}^{n} \frac{\mu_j}{\sigma_j^2} }{ \sum_{j=1}^{n} \frac{1}{\sigma_j^2} } = \frac{ \sum_{j=1}^{n} \frac{\mu_j}{\sigma_j^2} }{ H }.$$

Since

$$\nu = \bar{\mu}_{\sigma^{-2}} - \frac{\gamma}{H},$$

we can rewrite the optimal weight as

$$\lambda_i^* = \frac{ \mu_i - \bar{\mu}_{\sigma^{-2}} + \frac{\gamma}{H} }{ \gamma \sigma_i^2 }.$$

Splitting this into two terms gives

$$\lambda_i^* = \frac{\frac{1}{\sigma_i^2}}{\sum_{j=1}^{n} \frac{1}{\sigma_j^2}} + \frac{ \mu_i - \bar{\mu}_{\sigma^{-2}} }{ \gamma \sigma_i^2 }.$$

This is maybe the most intuitive version of the independent asset result.

The first term,

$$\frac{\frac{1}{\sigma_i^2}}{\sum_{j=1}^{n} \frac{1}{\sigma_j^2}},$$

is the inverse-variance allocation. It is the portfolio we get if expected returns are all equal and we only care about minimizing variance subject to being fully invested.

The second term,

$$\frac{ \mu_i - \bar{\mu}_{\sigma^{-2}} }{ \gamma \sigma_i^2 },$$

is the speculative tilt. Assets with drifts above the inverse-variance-weighted average drift receive larger allocations. Assets with drifts below that average receive smaller allocations.

As $\gamma$ increases, the speculative tilt shrinks. In the limit as $\gamma \to \infty$, the investor becomes infinitely risk averse and the portfolio approaches the inverse-variance allocation. As $\gamma$ decreases, the investor becomes more willing to tilt toward assets with higher expected returns.

This is a satisfying result. The model says that a risk-averse investor starts with an inverse-variance portfolio and then tilts toward assets that have better expected returns relative to that baseline.
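The decomposition can be sketched in a few lines of Python. The function name `decompose` and the inputs are hypothetical. The baseline does not depend on $\gamma$, while the tilt scales with $1/\gamma$ and nets to zero across assets.

```python
def decompose(mus, sigmas, gamma):
    """Split the optimal weights into inverse-variance baseline and speculative tilt."""
    inv = [1.0 / s ** 2 for s in sigmas]
    H = sum(inv)
    # Inverse-variance-weighted average drift.
    mu_bar = sum(m * i for m, i in zip(mus, inv)) / H
    baseline = [i / H for i in inv]
    tilt = [(m - mu_bar) * i / gamma for m, i in zip(mus, inv)]
    return baseline, tilt


# Hypothetical inputs.
mus = [0.08, 0.05, 0.11]
sigmas = [0.20, 0.10, 0.35]

baseline, tilt_low = decompose(mus, sigmas, gamma=2.0)
_, tilt_high = decompose(mus, sigmas, gamma=200.0)

print("baseline:", baseline)
print("tilt at gamma=2:", tilt_low)
print("tilt at gamma=200:", tilt_high)
```

Raising $\gamma$ by a factor of 100 shrinks every tilt by the same factor, which is the "approaches the inverse-variance allocation" limit in miniature.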


Part V: Matrix Form and Correlated Assets

The independent asset model is mathematically convenient, but it is obviously a simplification. In real markets, assets are correlated. Stocks move together. Bonds and equities can become correlated in crises. Crypto assets often behave like one giant risk factor wearing different ticker symbols.

So the next natural extension is to replace the independent variance term

$$\sum_{i=1}^{n} \lambda_i^2 \sigma_i^2$$

with a full covariance matrix.

Before doing that, it helps to rewrite the independent case in matrix form.


Matrix Form of the Independent Asset Case

Let

$$\lambda = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}, \quad \mathbf{1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$

For independent assets, define the diagonal covariance matrix

$$D = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}.$$

The objective function becomes

$$\lambda^\top \mu - \frac{\gamma}{2} \lambda^\top D \lambda,$$

subject to

$$\mathbf{1}^\top \lambda = 1.$$

The Lagrangian is

$$\mathcal{L} = \lambda^\top \mu - \frac{\gamma}{2} \lambda^\top D\lambda - \nu(\mathbf{1}^\top \lambda - 1).$$

Taking the derivative with respect to $\lambda$ gives

$$\mu - \gamma D\lambda - \nu \mathbf{1} = 0.$$

Therefore,

$$\gamma D\lambda = \mu - \nu \mathbf{1},$$

and so

$$\lambda = \frac{1}{\gamma} D^{-1} (\mu - \nu \mathbf{1}).$$

Now impose the full-investment constraint:

$$\mathbf{1}^\top \lambda = 1.$$

Substituting in the expression for $\lambda$,

$$\mathbf{1}^\top \frac{1}{\gamma} D^{-1} (\mu - \nu \mathbf{1}) = 1.$$

Multiplying through by $\gamma$,

$$\mathbf{1}^\top D^{-1}\mu - \nu \mathbf{1}^\top D^{-1}\mathbf{1} = \gamma.$$

Solving for $\nu$ gives

$$\nu = \frac{ \mathbf{1}^\top D^{-1}\mu - \gamma }{ \mathbf{1}^\top D^{-1}\mathbf{1} }.$$

Thus, in matrix form, the independent asset solution is

$$\lambda^* = \frac{1}{\gamma} D^{-1} (\mu - \nu \mathbf{1}),$$

where

$$\nu = \frac{ \mathbf{1}^\top D^{-1}\mu - \gamma }{ \mathbf{1}^\top D^{-1}\mathbf{1} }.$$

This is the same result as before. The only difference is that the notation is cleaner.


The Correlated Asset Case

Now suppose the assets are correlated. Instead of assuming

$$E[dN_{i,t}dN_{j,t}] = 0 \quad \text{for} \quad i \neq j,$$

we allow

$$E[dN_{i,t}dN_{j,t}] = \rho_{ij}dt.$$

The covariance between the instantaneous returns of assets $i$ and $j$ is then

$$\Sigma_{ij} = \rho_{ij}\sigma_i\sigma_j.$$

So the covariance matrix is

$$\Sigma = \begin{bmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1n}\sigma_1\sigma_n \\ \rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2n}\sigma_2\sigma_n \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n1}\sigma_n\sigma_1 & \rho_{n2}\sigma_n\sigma_2 & \cdots & \sigma_n^2 \end{bmatrix}.$$

The expected return term remains

$$\lambda^\top \mu.$$

But the variance term is no longer

$$\sum_{i=1}^{n} \lambda_i^2 \sigma_i^2.$$

Instead, portfolio variance is

$$\lambda^\top \Sigma \lambda.$$

Therefore, the optimization problem becomes

$$\underset{\lambda}{\max} \left[ \lambda^\top \mu - \frac{\gamma}{2} \lambda^\top \Sigma \lambda \right],$$

subject to

$$\mathbf{1}^\top \lambda = 1.$$

The Lagrangian is

$$\mathcal{L} = \lambda^\top \mu - \frac{\gamma}{2} \lambda^\top \Sigma\lambda - \nu(\mathbf{1}^\top \lambda - 1).$$

The first-order condition is

$$\mu - \gamma \Sigma\lambda - \nu \mathbf{1} = 0.$$

Rearranging,

$$\gamma \Sigma\lambda = \mu - \nu \mathbf{1}.$$

Assuming $\Sigma$ is invertible,

$$\lambda = \frac{1}{\gamma} \Sigma^{-1} (\mu - \nu \mathbf{1}).$$

Now impose the full-investment constraint:

$$\mathbf{1}^\top \frac{1}{\gamma} \Sigma^{-1} (\mu - \nu \mathbf{1}) = 1.$$

Multiplying through by $\gamma$,

$$\mathbf{1}^\top \Sigma^{-1}\mu - \nu \mathbf{1}^\top \Sigma^{-1}\mathbf{1} = \gamma.$$

Solving for $\nu$,

$$\nu = \frac{ \mathbf{1}^\top \Sigma^{-1}\mu - \gamma }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} }.$$

Thus, the optimal portfolio under correlated risky assets is

$$\lambda^* = \frac{1}{\gamma} \Sigma^{-1} (\mu - \nu \mathbf{1}),$$

where

$$\nu = \frac{ \mathbf{1}^\top \Sigma^{-1}\mu - \gamma }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} }.$$

This is the general solution. The independent asset model is just the special case where $\Sigma$ is diagonal.
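Below is a dependency-free Python sketch of the correlated solution. Everything about it is an illustrative assumption: the inputs, the helper names, and the choice to solve the two linear systems $\Sigma a = \mu$ and $\Sigma b = \mathbf{1}$ by plain Gaussian elimination instead of forming $\Sigma^{-1}$ explicitly. In practice one would likely reach for a linear algebra library.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def covariance(sigmas, corr):
    """Sigma_ij = rho_ij * sigma_i * sigma_j."""
    n = len(sigmas)
    return [[corr[i][j] * sigmas[i] * sigmas[j] for j in range(n)] for i in range(n)]


def optimal_weights(mus, cov, gamma):
    """lambda* = (1/gamma) * Sigma^{-1} (mu - nu * 1), with nu from full investment."""
    a = solve(cov, mus)                # Sigma^{-1} mu
    b = solve(cov, [1.0] * len(mus))   # Sigma^{-1} 1
    nu = (sum(a) - gamma) / sum(b)
    weights = [(ai - nu * bi) / gamma for ai, bi in zip(a, b)]
    return weights, nu


# Hypothetical three-asset universe with positive correlations.
mus = [0.08, 0.05, 0.11]
sigmas = [0.20, 0.10, 0.35]
corr = [[1.0, 0.3, 0.5], [0.3, 1.0, 0.2], [0.5, 0.2, 1.0]]
gamma = 3.0

cov = covariance(sigmas, corr)
weights, nu = optimal_weights(mus, cov, gamma)
print(weights, nu, sum(weights))
```

Passing an identity correlation matrix makes `cov` diagonal, and the same function then reproduces the independent asset weights.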


Minimum-Variance Portfolio Plus a Speculative Tilt

We can also decompose the correlated asset solution into two pieces.

The first piece is the global minimum-variance portfolio:

$$\lambda_{MV} = \frac{ \Sigma^{-1}\mathbf{1} }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} }.$$

The second piece is a return-seeking tilt:

$$\frac{1}{\gamma} \left[ \Sigma^{-1}\mu - \frac{ \mathbf{1}^\top \Sigma^{-1}\mu }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} } \Sigma^{-1}\mathbf{1} \right].$$

Putting them together,

$$\lambda^* = \frac{ \Sigma^{-1}\mathbf{1} }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} } + \frac{1}{\gamma} \left[ \Sigma^{-1}\mu - \frac{ \mathbf{1}^\top \Sigma^{-1}\mu }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} } \Sigma^{-1}\mathbf{1} \right].$$

This is a very useful way to understand the model.

The first term is the allocation we would choose if we only cared about minimizing variance while remaining fully invested. It does not use expected returns at all. It only uses the covariance matrix.

The second term is the speculative component. It tilts the portfolio toward assets that have attractive expected returns relative to the covariance structure of the asset universe.

As $\gamma \to \infty$, the speculative term disappears and we converge to the global minimum-variance portfolio. As $\gamma$ gets smaller, the speculative term becomes larger.

This is the exact same intuition as the independent asset case, except the meaning of risk is richer. We no longer penalize each asset only by its own variance. We penalize it by how it contributes to total portfolio variance.

That distinction matters. A high-volatility asset can still receive a meaningful allocation if it diversifies the rest of the portfolio. Conversely, a seemingly safe asset can receive a smaller allocation if it is highly correlated with everything else we already own.
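To see the two pieces separately, here is a two-asset Python sketch. It uses the explicit inverse of a 2x2 matrix to stay short. The inputs, the correlation of 0.6, and the function names are assumptions chosen for illustration.

```python
def inverse_2x2(m):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def decompose(mus, cov, gamma):
    """Return (minimum-variance weights, return-seeking tilt) for two assets."""
    inv = inverse_2x2(cov)
    inv_mu = matvec(inv, mus)            # Sigma^{-1} mu
    inv_one = matvec(inv, [1.0, 1.0])    # Sigma^{-1} 1
    scale = sum(inv_one)                 # 1' Sigma^{-1} 1
    min_var = [x / scale for x in inv_one]
    ratio = sum(inv_mu) / scale          # 1' Sigma^{-1} mu / 1' Sigma^{-1} 1
    tilt = [(m - ratio * o) / gamma for m, o in zip(inv_mu, inv_one)]
    return min_var, tilt


# Hypothetical inputs: volatilities 0.20 and 0.10 with correlation 0.6.
s1, s2, rho = 0.20, 0.10, 0.6
cov = [[s1 * s1, rho * s1 * s2], [rho * s1 * s2, s2 * s2]]
mus = [0.08, 0.05]

min_var, tilt_2 = decompose(mus, cov, gamma=2.0)
_, tilt_4 = decompose(mus, cov, gamma=4.0)
total = [m + t for m, t in zip(min_var, tilt_2)]

print("minimum variance:", min_var)
print("tilt (gamma=2):", tilt_2)
print("tilt (gamma=4):", tilt_4)
print("total (gamma=2):", total)
```

The minimum-variance piece sums to one and the tilt sums to zero, so the total stays fully invested for any $\gamma$.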


Part VI: What the Formula Is Really Saying

At this point, we have a compact solution for the optimal allocation across $n$ risky assets:

$$\lambda^* = \frac{1}{\gamma} \Sigma^{-1} (\mu - \nu \mathbf{1}),$$

where

$$\nu = \frac{ \mathbf{1}^\top \Sigma^{-1}\mu - \gamma }{ \mathbf{1}^\top \Sigma^{-1}\mathbf{1} }.$$

This is a pretty small formula given how much it contains.

The model says that optimal portfolio choice depends on three things:

  1. expected returns, encoded by $\mu$;
  2. the covariance structure of returns, encoded by $\Sigma$;
  3. relative risk aversion, encoded by $\gamma$.

The role of $\gamma$ is especially clear. Higher $\gamma$ means we care more about variance and less about expected return. Lower $\gamma$ means we are more willing to tolerate variance in pursuit of expected return.

But the formula also tells us something subtler. In a fully invested risky-asset-only portfolio, we are not deciding how much risky exposure to hold relative to cash. We are deciding how to distribute risky exposure across assets. This is why the solution naturally contains a minimum-variance portfolio.

When there is no risk-free asset, the investor must hold something. If expected returns are all equal, then the entire problem collapses into choosing the lowest-variance way to remain fully invested. In the independent asset case, that means inverse-variance weighting. In the correlated asset case, that means the global minimum-variance portfolio.

Expected returns then create tilts away from that baseline.

This is also why the risky-numeraire result from Part III is so clean. When we denominate everything in terms of asset $A$, the asset $A/A$ has no price movement relative to itself. It becomes the reference point. The only risky decision left in the two-asset $A$-denominated world is how much of $B/A$ to hold. That is why the formula collapses back to the familiar Merton-style share:

$$\lambda_{B/A} = \frac{ \mu_{B/A} }{ \gamma \sigma_{B/A}^2 }.$$

The cash-denominated risky/risky problem and the risky-numeraire problem are not contradictory. They are just different ways of representing the same underlying portfolio choice.


The Shadow Return Interpretation of $\nu$

The Lagrange multiplier $\nu$ is also worth interpreting. From the first-order condition,

$$\mu_i - \gamma \lambda_i \sigma_i^2 - \nu = 0$$

in the independent case. Rearranging,

$$\mu_i - \nu = \gamma \lambda_i \sigma_i^2.$$

So $\nu$ acts like a return threshold created by the full-investment constraint. Assets with expected returns above $\nu$ tend to receive larger weights, scaled by their variance. Assets with expected returns below $\nu$ tend to receive smaller weights or even negative weights if shorting is allowed.

This is similar in spirit to the role played by a risk-free rate in the classic Merton Share. But here, $\nu$ is not an externally given risk-free rate. It is determined endogenously by the asset universe, the covariance structure, and the fact that the portfolio weights must sum to one.

That is a nice conceptual payoff. If there is no risk-free asset in the model, the optimization still creates a benchmark return internally.
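A tiny Python illustration of the threshold idea, in the independent case and with made-up inputs: we compute $\nu$ and then look at which assets sit above or below it. The fourth asset is deliberately given a low drift so that it lands below the threshold.

```python
def shadow_return(mus, sigmas, gamma):
    """nu = (sum_j mu_j / sigma_j^2 - gamma) / (sum_j 1 / sigma_j^2)."""
    inv = [1.0 / s ** 2 for s in sigmas]
    return (sum(m * i for m, i in zip(mus, inv)) - gamma) / sum(inv)


# Hypothetical inputs; the last asset has a deliberately poor drift.
mus = [0.08, 0.05, 0.11, 0.01]
sigmas = [0.20, 0.10, 0.35, 0.15]
gamma = 3.0

nu = shadow_return(mus, sigmas, gamma)
weights = [(m - nu) / (gamma * s ** 2) for m, s in zip(mus, sigmas)]

for m, w in zip(mus, weights):
    side = "above" if m > nu else "below"
    print(f"mu={m:.2f} is {side} nu={nu:.4f}, weight={w:+.4f}")
```

In the independent case the sign of each weight is exactly the sign of $\mu_i - \nu$. With correlated assets that one-to-one link no longer holds asset by asset, since the first-order condition involves the whole row of $\Sigma$.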


What Happens If We Ban Shorting?

The formulas above allow shorting and leverage in the sense that individual weights can be negative or greater than one, as long as the weights sum to one. This is standard in the clean mathematical version of the problem, but it may not be what we want in practice.

If we impose constraints like

$$\lambda_i \geq 0$$

for all $i$, or

$$0 \leq \lambda_i \leq 1,$$

then the closed-form solution may no longer apply directly. The unconstrained optimum might tell us to short an asset with a bad risk-adjusted expected return. If shorting is not allowed, that asset's weight gets pushed to zero and the optimization has to be solved with inequality constraints.

In that case, the right mathematical object is a constrained quadratic program. Conceptually, though, the logic remains similar. We are still balancing expected return against variance, but some assets may hit boundary constraints and drop out of the active portfolio.

This is another reason I like deriving the unconstrained solution first. It gives the clean benchmark. Then constraints can be layered on top.
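For the independent-asset case, the long-only problem can be sketched without a general quadratic programming solver. Under the no-shorting constraint, the optimality conditions give $\lambda_i = \max\left(0, \frac{\mu_i - \nu}{\gamma \sigma_i^2}\right)$, with $\nu$ chosen so that the weights sum to one. The code below finds that $\nu$ by bisection. It assumes a diagonal covariance matrix and is not a substitute for a proper solver when assets are correlated; the function name and inputs are hypothetical.

```python
def long_only_weights(mu, sigma, gamma, tol=1e-12):
    """Long-only weights for independent assets, found by bisection on nu."""

    def weights_at(nu):
        # Assets whose expected return falls below nu are clipped to zero.
        return [max(0.0, (m - nu) / (gamma * s**2)) for m, s in zip(mu, sigma)]

    # Total weight is non-increasing in nu. At nu = max(mu) it is zero;
    # at the lower bracket every asset has weight >= 1, so the total is >= 1.
    hi = max(mu)
    lo = min(mu) - gamma * max(s**2 for s in sigma)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(weights_at(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return weights_at(0.5 * (lo + hi))


# Hypothetical inputs: the unconstrained optimum would short the first asset.
mu = [0.03, 0.08, 0.12]
sigma = [0.10, 0.20, 0.30]
gamma = 1.0

constrained = long_only_weights(mu, sigma, gamma)
print(constrained)
```

Here the first asset hits the boundary and drops out, and the remaining two assets share the full allocation according to the usual formula applied to the reduced universe.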


Limitations

The obvious weakness of this whole setup is that the inputs are doing enormous work.

The formulas look precise, but the quantities $\mu$ and $\Sigma$ are not handed to us by nature. We have to estimate them. And estimating expected returns is notoriously difficult. Small changes in $\mu$ can lead to large changes in optimal weights, especially when $\gamma$ is low.
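To make the sensitivity concrete, the sketch below bumps one expected return by a single percentage point and measures the largest resulting change in any weight, for a low and a high value of $\gamma$. It uses the independent-asset formula, and all inputs are hypothetical.

```python
def optimal_weights(mu, sigma, gamma):
    """Independent-asset weights (mu_i - nu) / (gamma * sigma_i^2)."""
    inv_var = [1.0 / s**2 for s in sigma]
    nu = (sum(m * v for m, v in zip(mu, inv_var)) - gamma) / sum(inv_var)
    return [(m - nu) * v / gamma for m, v in zip(mu, inv_var)]


# Hypothetical inputs, chosen only for illustration.
sigma = [0.15, 0.20, 0.25]
mu_base = [0.06, 0.08, 0.10]
mu_bumped = [0.06, 0.09, 0.10]  # second asset's expected return raised by 0.01


def max_weight_change(gamma):
    """Largest absolute change in any weight caused by the bump."""
    base = optimal_weights(mu_base, sigma, gamma)
    bumped = optimal_weights(mu_bumped, sigma, gamma)
    return max(abs(b - a) for a, b in zip(base, bumped))


changes = {gamma: max_weight_change(gamma) for gamma in (1.0, 5.0)}
for gamma, change in changes.items():
    print(f"gamma = {gamma}: largest weight change = {change:.4f}")
```

Because the return-seeking part of the solution is scaled by $1/\gamma$, the same estimation error moves the weights five times as much at $\gamma = 1$ as at $\gamma = 5$.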

The covariance matrix is usually easier to estimate than expected returns, but it is still unstable. Correlations change. Volatilities change. Assets that looked diversifying in normal times can suddenly become highly correlated in a crisis.

There is also the GBM assumption. GBM is analytically convenient, but real return distributions have jumps, fat tails, volatility clustering, changing regimes, and all sorts of other unpleasant features. The model is useful, but it is not reality.

Still, I think the exercise is valuable. It shows the basic structure of the problem in its cleanest form. If we know our beliefs about expected returns, covariances, and risk aversion, then the optimal portfolio has a simple shape:

$$\text{minimum-variance baseline} + \text{return-seeking tilt}.$$

That is the main lesson.


Conclusion

We began with the two-risky-asset version of the Merton-style allocation problem. We then moved to three assets, changed the numeraire, and finally generalized the model to $n$ risky assets.

For independent assets, the optimal allocation to asset $i$ is

$$\lambda_i^* = \frac{\mu_i - \nu}{\gamma \sigma_i^2},$$

where

$$\nu = \frac{\sum_{j=1}^{n} \frac{\mu_j}{\sigma_j^2} - \gamma}{\sum_{j=1}^{n} \frac{1}{\sigma_j^2}}.$$

For correlated assets, the optimal portfolio is

$$\lambda^* = \frac{1}{\gamma} \Sigma^{-1} (\mu - \nu \mathbf{1}),$$

where

$$\nu = \frac{\mathbf{1}^\top \Sigma^{-1}\mu - \gamma}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}}.$$

Equivalently,

$$\lambda^* = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}} + \frac{1}{\gamma} \left[ \Sigma^{-1}\mu - \frac{\mathbf{1}^\top \Sigma^{-1}\mu}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}} \Sigma^{-1}\mathbf{1} \right].$$

This final expression is my favorite version of the result. The first term is the minimum-variance portfolio. The second term is the speculative tilt. Risk aversion determines how much of that tilt we are willing to take.
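The decomposition can be checked numerically. The sketch below builds the minimum-variance term and the tilt term for a small correlated example, then compares their sum with the compact form $\frac{1}{\gamma}\Sigma^{-1}(\mu - \nu\mathbf{1})$. The covariance matrix, expected returns, and helper names are hypothetical, and the linear solver is a plain Gaussian elimination written out only to keep the example dependency-free.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


# Hypothetical inputs, chosen only for illustration.
mu = [0.05, 0.08, 0.11]
cov = [[0.0225, 0.0060, 0.0030],
       [0.0060, 0.0400, 0.0100],
       [0.0030, 0.0100, 0.0625]]
gamma = 3.0

s_inv_one = solve(cov, [1.0] * len(mu))  # Sigma^{-1} 1
s_inv_mu = solve(cov, mu)                # Sigma^{-1} mu
c = sum(s_inv_one)                       # 1' Sigma^{-1} 1
a = sum(s_inv_mu)                        # 1' Sigma^{-1} mu

# Two-term form: minimum-variance portfolio plus risk-aversion-scaled tilt.
min_var = [v / c for v in s_inv_one]
tilt = [(m - (a / c) * v) / gamma for m, v in zip(s_inv_mu, s_inv_one)]
weights = [p + t for p, t in zip(min_var, tilt)]

# Compact form using the shadow return nu.
nu = (a - gamma) / c
compact = [x / gamma for x in solve(cov, [m - nu for m in mu])]

print("minimum-variance:", min_var)
print("tilt:", tilt)
print("total:", weights)
```

The minimum-variance term sums to one and the tilt sums to zero, so the tilt only reallocates wealth across assets and never changes the total amount invested.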

So, in the end, the optimal risky-asset portfolio under CRRA utility has a surprisingly intuitive structure: start with the lowest-risk way to be fully invested, then tilt toward assets whose expected returns justify their contribution to portfolio risk.

Footnotes

  1. I am skeptical of standard national CPI measures, as discussed in Chapter 5 of Keynes' Treatise on Money. I think exact CPI calculation seems like a fundamentally futile task, though it's nuanced so I haven't made up my mind yet.