This series of essays explores the optimization of portfolio weights to maximize a Constant Relative Risk Aversion (CRRA) utility function over an agent's wealth. We use classic stochastic calculus techniques to model price processes as Geometric Brownian Motion (GBM). In Part I, we derive the optimal allocation between two risky assets and find our solution is an extension of the famous Merton Share. In Part II, we extend the analysis to three assets and then to an n-asset model. In Part III, we examine what happens when we change the numeraire from cash to a risky asset.
Part I: The Binary Asset Model
Model Definition
Suppose we have a universe of two stocks, A and B, modeled as independent GBM processes with parameters $\mu_A$, $\mu_B$, $\sigma_A$, and $\sigma_B$. We also define $\lambda_A$ and $\lambda_B$ to be our portfolio weights for assets A and B, such that $\lambda_A + \lambda_B = 1$. Finally, we have a CRRA utility function over possible wealth states $W$ such that $U(W) = \frac{W^{1-\gamma} - 1}{1-\gamma}$, where $\gamma$ is our relative risk aversion parameter. In what follows, we attempt to find the optimal portfolio weights $\lambda$ which maximize the expected utility of our future wealth.
Deriving a Closed Form Expected Utility Function
We intend to find the portfolio allocation $[\lambda_A, \lambda_B]$ which maximizes the expected utility of our wealth in the next period, that is:
$$\max_{\lambda_A, \lambda_B} E[U(W_{t+dt})].$$
Incorporating our CRRA utility function yields $E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right]$.
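We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$ and $\lambda_B$:
$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}}.$$
Similarly, we note that each stock's price follows a GBM, defined by the stochastic differential equations (SDEs)
$$\frac{dS_{A,t}}{S_{A,t}} = \mu_A\,dt + \sigma_A\,dN_{A,t}, \qquad \frac{dS_{B,t}}{S_{B,t}} = \mu_B\,dt + \sigma_B\,dN_{B,t},$$
where $N_{A,t}$ and $N_{B,t}$ are independent Wiener processes. Substituting these into the wealth dynamic gives $dW_t = W_t\,x$, where $x$ is the portfolio's return over $dt$, so that $W_t + dW_t = W_t(1+x)$.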
We now consider the second-order Taylor series expansion of $(1+x)^{1-\gamma}$ around $x = 0$, which is justified because $x$ is very small over an infinitesimally small time increment $dt$. In this case, $x = \lambda_A(\mu_A\,dt + \sigma_A\,dN_{A,t}) + \lambda_B(\mu_B\,dt + \sigma_B\,dN_{B,t})$.
We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by
$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \ldots$$
and we are careful to keep the second-order term, which carries our volatility parameters.
This implies that $E[(1+x)^{1-\gamma}] \approx 1 + (1-\gamma)E[x] + \frac{(1-\gamma)(-\gamma)}{2}E[x^2]$.
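To get a feel for how good the second-order approximation is, here is a minimal Python sketch that compares the exact value of $(1+x)^{1-\gamma}$ with its second-order Taylor polynomial for shrinking $x$. The function names and the choice $\gamma = 3$ are illustrative assumptions, not part of the derivation.

```python
def crra_power_exact(x: float, gamma: float) -> float:
    """Exact value of (1 + x)^(1 - gamma)."""
    return (1.0 + x) ** (1.0 - gamma)


def crra_power_taylor2(x: float, gamma: float) -> float:
    """Second-order Taylor approximation of (1 + x)^(1 - gamma) around x = 0."""
    return 1.0 + (1.0 - gamma) * x + 0.5 * (1.0 - gamma) * (-gamma) * x ** 2


if __name__ == "__main__":
    gamma = 3.0  # illustrative risk aversion (assumption)
    for x in (0.1, 0.01, 0.001):
        err = abs(crra_power_exact(x, gamma) - crra_power_taylor2(x, gamma))
        # The error should shrink roughly like x^3 as x gets smaller.
        print(f"x = {x:<6} approximation error = {err:.3e}")
```

The error falls off roughly like $x^3$, which is why dropping the higher-order terms is harmless once $x$ is a return over an infinitesimal $dt$.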
Using x=λA(μAdt+σAdNA,t)+λB(μBdt+σBdNB,t), we see that E[x]=(λAμA+λBμB)dt because E[dNi,t]=0.
Furthermore, to solve for $E[x^2]$, we substitute in $x$, which gives us
$$E[x^2] = E\left[\left(\lambda_A(\mu_A\,dt + \sigma_A\,dN_{A,t}) + \lambda_B(\mu_B\,dt + \sigma_B\,dN_{B,t})\right)^2\right].$$
At this point, some properties of Brownian motion come to our aid, particularly that $E[dN_{i,t}] = 0$, $E[dN_{i,t}^2] = dt$, and $E[dN_{A,t}\,dN_{B,t}] = 0$ (since A and B are independent processes).
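Taking expectations and dropping terms of order $dt^2$, we can make the following simplifications:
$$E\left[\left(\lambda_A(\mu_A\,dt + \sigma_A\,dN_{A,t})\right)^2\right] = \lambda_A^2\sigma_A^2\,dt$$
$$E\left[\left(\lambda_B(\mu_B\,dt + \sigma_B\,dN_{B,t})\right)^2\right] = \lambda_B^2\sigma_B^2\,dt$$
$$E\left[2\lambda_A\lambda_B(\mu_A\,dt + \sigma_A\,dN_{A,t})(\mu_B\,dt + \sigma_B\,dN_{B,t})\right] = 0$$
Putting it all together, the simplified expression is $E[x^2] = \lambda_A^2\sigma_A^2\,dt + \lambda_B^2\sigma_B^2\,dt$.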
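Returning to the expectation we set out to solve, and using $W_t + dW_t = W_t(1+x)$ together with these new results, we see that
$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma}\left(1 + (1-\gamma)(\lambda_A\mu_A + \lambda_B\mu_B)\,dt + \frac{(1-\gamma)(-\gamma)}{2}(\lambda_A^2\sigma_A^2 + \lambda_B^2\sigma_B^2)\,dt\right) - 1}{1-\gamma}.$$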
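Quick aside, my hunch is that if we were to extend to a multi-asset model this becomes:
$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma}\left(1 + (1-\gamma)\left(\sum_{i=1}^n \lambda_i\mu_i\right)dt + \frac{(1-\gamma)(-\gamma)}{2}\left(\sum_{i=1}^n \lambda_i^2\sigma_i^2\right)dt\right) - 1}{1-\gamma}$$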
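Now, because we're in a dual-asset model where the weights sum to 100%, $\lambda_B$ is determined to be $1 - \lambda_A$. We can substitute this into the two-asset expression for $E[U(W_{t+dt})]$, which gives us:
$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma}\left(1 + (1-\gamma)(\lambda_A\mu_A + (1-\lambda_A)\mu_B)\,dt + \frac{(1-\gamma)(-\gamma)}{2}(\lambda_A^2\sigma_A^2 + (1-\lambda_A)^2\sigma_B^2)\,dt\right) - 1}{1-\gamma}.$$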
Optimizing Portfolio Weights to Maximize Expected Utility
In order to maximize E[U(Wt+dt)], we can follow the classic method of differentiating the function with respect to λA and setting this partial derivative equal to zero.
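First, we notice that $\frac{W_t^{1-\gamma}}{1-\gamma}$ is a constant factor with respect to $\lambda_A$ (and the trailing $-\frac{1}{1-\gamma}$ is an additive constant), so we can focus on differentiating only the bracketed expression, which we can denote as $f(\lambda_A)$:
$$f(\lambda_A) = 1 + (1-\gamma)(\lambda_A\mu_A + (1-\lambda_A)\mu_B)\,dt + \frac{(1-\gamma)(-\gamma)}{2}(\lambda_A^2\sigma_A^2 + (1-\lambda_A)^2\sigma_B^2)\,dt.$$
Differentiating with respect to $\lambda_A$ and setting the result equal to zero gives
$$f'(\lambda_A) = (1-\gamma)(\mu_A - \mu_B)\,dt - \gamma(1-\gamma)\left(\lambda_A\sigma_A^2 - (1-\lambda_A)\sigma_B^2\right)dt = 0.$$
We can see that $(1-\gamma)\,dt$ is a common factor in both terms, and since $dt$ is an infinitesimal time increment (which importantly is not zero) and $\gamma \neq 1$, we can simplify the equation by dividing through by $(1-\gamma)\,dt$:
$$\mu_A - \mu_B - \gamma\left(\lambda_A\sigma_A^2 - (1-\lambda_A)\sigma_B^2\right) = 0.$$
Solving for $\lambda_A$ gives $\lambda_A(\sigma_A^2 + \sigma_B^2) = \frac{\mu_A - \mu_B}{\gamma} + \sigma_B^2$.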
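Given two assets modeled as independent GBM processes, wealth $W_t$, and a CRRA utility function, we have found that the optimal allocation to asset A is $\lambda_A = \frac{\mu_A - \mu_B + \gamma\sigma_B^2}{\gamma(\sigma_A^2 + \sigma_B^2)}$ and the optimal allocation to asset B is $\lambda_B = \frac{\mu_B - \mu_A + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)}$.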
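As a sanity check on the closed form, here is a small Python sketch that evaluates the two weights and confirms that nudging $\lambda_A$ in either direction lowers the $\lambda$-dependent part of expected utility (portfolio drift minus $\gamma/2$ times portfolio variance). The function names and the sample parameters are illustrative assumptions.

```python
from typing import Tuple


def two_asset_weights(mu_a: float, mu_b: float, sigma_a: float,
                      sigma_b: float, gamma: float) -> Tuple[float, float]:
    """Closed-form weights for two independent GBM assets under CRRA utility."""
    denom = gamma * (sigma_a ** 2 + sigma_b ** 2)
    lam_a = (mu_a - mu_b + gamma * sigma_b ** 2) / denom
    lam_b = (mu_b - mu_a + gamma * sigma_a ** 2) / denom
    return lam_a, lam_b


def drift_variance_objective(lam_a: float, mu_a: float, mu_b: float,
                             sigma_a: float, sigma_b: float, gamma: float) -> float:
    """Portfolio drift minus (gamma / 2) * portfolio variance, with lam_b = 1 - lam_a."""
    lam_b = 1.0 - lam_a
    drift = lam_a * mu_a + lam_b * mu_b
    variance = lam_a ** 2 * sigma_a ** 2 + lam_b ** 2 * sigma_b ** 2
    return drift - 0.5 * gamma * variance


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    params = dict(mu_a=0.08, mu_b=0.05, sigma_a=0.20, sigma_b=0.10, gamma=3.0)
    lam_a, lam_b = two_asset_weights(**params)
    print(f"lambda_A = {lam_a:.4f}, lambda_B = {lam_b:.4f}")
    best = drift_variance_objective(lam_a, **params)
    for bump in (-0.01, 0.01):
        # Moving away from the closed-form weight should reduce the objective.
        print(bump, drift_variance_objective(lam_a + bump, **params) < best)
```

With these inputs the weights come out to roughly 40% in A and 60% in B, and they sum to one by construction.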
You might have noticed that the portfolio allocations $\lambda_A$ and $\lambda_B$ don't have a subscript $t$. This is because, given that the drift and volatility parameters of the price processes are constant and that our risk aversion parameter $\gamma$ does not change, the allocations are independent of both time and wealth! This implies a constant fractional allocation to each stock in our portfolio.
Notably, in the case where B is a risk-free investment, implying that $\sigma_B^2 = 0$, our optimal allocation reduces to $\lambda_A = \frac{\mu_A - \mu_B}{\gamma\sigma_A^2}$, which is the famous Merton Share!
While all models are lossy, I take issue with the idea of a risk-free rate. In particular, the real returns on a nation's treasuries are sensitive to interest rate changes, inflation, and currency fluctuations. It also shouldn't be overlooked that big debt crises occur fairly regularly and nations do default. With this in mind, I think extending the dual-asset Merton Share model to two risky assets is an improvement toward realism.
Part II: Extension to Three Assets
In Part I, we derived the optimal allocations under an independent binary asset model where the two stocks follow geometric Brownian motion processes. We now extend the analysis to three assets and then to an n-asset model.
Model Definition
We have three assets A, B, and C whose prices follow GBM processes with parameters $(\mu_A, \sigma_A)$, $(\mu_B, \sigma_B)$, and $(\mu_C, \sigma_C)$, respectively. We allocate our wealth $W$ between A, B, and C in proportion $\lambda_A$, $\lambda_B$, and $\lambda_C$, respectively, such that $\lambda_A + \lambda_B + \lambda_C = 1$. We again maintain a Constant Relative Risk Aversion (CRRA) utility function $U(W) = \frac{W^{1-\gamma} - 1}{1-\gamma}$, where $W_t$ is our wealth at time $t$ and $\gamma$ is our relative risk aversion parameter.
Deriving Our Expected Utility Function
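We intend to find the portfolio allocation $[\lambda_A, \lambda_B, \lambda_C]$ which maximizes the expected utility of our wealth in the next period, that is:
$$\max_{\lambda_A, \lambda_B, \lambda_C} E[U(W_{t+dt})].$$
After incorporating our CRRA utility function, we see $E[U(W_{t+dt})] = E\left[\frac{(W_t + dW_t)^{1-\gamma} - 1}{1-\gamma}\right]$.
We can now define our wealth dynamic $dW_t$ as evolving according to the chosen portfolio weights $\lambda_A$, $\lambda_B$, and $\lambda_C$:
$$dW_t = \lambda_A W_t \frac{dS_{A,t}}{S_{A,t}} + \lambda_B W_t \frac{dS_{B,t}}{S_{B,t}} + \lambda_C W_t \frac{dS_{C,t}}{S_{C,t}}.$$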
We now consider the second-order Taylor series expansion of $(1+x)^{1-\gamma}$ around $x = 0$, which is justified because $x$ is very small over an infinitesimally small time increment $dt$. In this case, $x = \lambda_A(\mu_A\,dt + \sigma_A\,dN_{A,t}) + \lambda_B(\mu_B\,dt + \sigma_B\,dN_{B,t}) + \lambda_C(\mu_C\,dt + \sigma_C\,dN_{C,t})$.
We remember that the Taylor series of a function $f(x)$ around a point $a$ is given by
$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \ldots$$
and we are careful to keep the second-order term, which carries our volatility parameters.
This implies that $E[(1+x)^{1-\gamma}] \approx 1 + (1-\gamma)E[x] + \frac{(1-\gamma)(-\gamma)}{2}E[x^2]$.
Using x=λA(μAdt+σAdNA,t)+λB(μBdt+σBdNB,t)+λC(μCdt+σCdNC,t), we see that E[x]=(λAμA+λBμB+λCμC)dt because E[dNi,t]=0.
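Furthermore, to solve for $E[x^2]$, we substitute in $x$, which gives us
$$E[x^2] = E\left[\left(\lambda_A(\mu_A\,dt + \sigma_A\,dN_{A,t}) + \lambda_B(\mu_B\,dt + \sigma_B\,dN_{B,t}) + \lambda_C(\mu_C\,dt + \sigma_C\,dN_{C,t})\right)^2\right].$$
Though it seems a bit unwieldy, we can simplify this expression by first expanding the square and then using the properties of Wiener processes, notably that $E[dN_{i,t}] = 0$ and $E[dN_{i,t}^2] = dt$.
We can now simplify each term by considering the properties of $dN_{i,t}$ noted before:
For terms like $\lambda_A^2(\mu_A\,dt + \sigma_A\,dN_{A,t})^2$, the expansion gives $\lambda_A^2\mu_A^2\,dt^2 + 2\lambda_A^2\mu_A\sigma_A\,dt\,dN_{A,t} + \lambda_A^2\sigma_A^2\,dN_{A,t}^2$. When taking the expected value of this, the $dt^2$ term is negligible, the $dt\,dN_{A,t}$ term disappears, and $dN_{A,t}^2$ becomes $dt$, leaving $\lambda_A^2\sigma_A^2\,dt$.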
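The cross terms like $2\lambda_A\lambda_B(\mu_A\,dt + \sigma_A\,dN_{A,t})(\mu_B\,dt + \sigma_B\,dN_{B,t})$ expand out to
$$2\lambda_A\lambda_B\mu_A\mu_B\,dt^2 + 2\lambda_A\lambda_B\mu_A\sigma_B\,dt\,dN_{B,t} + 2\lambda_A\lambda_B\sigma_A\mu_B\,dt\,dN_{A,t} + 2\lambda_A\lambda_B\sigma_A\sigma_B\,dN_{A,t}\,dN_{B,t}.$$
Each term in this expression goes to zero in expectation: the $dt^2$ term is of higher order and negligible, $E[dN_{i,t}] = 0$ eliminates the $dt\,dN_{i,t}$ terms, and $E[dN_{A,t}\,dN_{B,t}] = 0$ because the assets are independent.
After applying these simplifications, the expected value expression becomes:
$$E[x^2] = \left(\lambda_A^2\sigma_A^2 + \lambda_B^2\sigma_B^2 + \lambda_C^2\sigma_C^2\right)dt.$$
We make one final adjustment by including our $\lambda_A + \lambda_B + \lambda_C = 1$ constraint to remove a degree of freedom from our model.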
We first substitute $\lambda_C = 1 - \lambda_A - \lambda_B$ into the linear term:
$$\lambda_A\mu_A + \lambda_B\mu_B + \lambda_C\mu_C = \lambda_A\mu_A + \lambda_B\mu_B + (1 - \lambda_A - \lambda_B)\mu_C = \lambda_A(\mu_A - \mu_C) + \lambda_B(\mu_B - \mu_C) + \mu_C.$$
This also makes intuitive sense. The initial expression is each allocation percentage times that asset's expected return, summed over assets, which is the expected return of our portfolio. The final expression is the same expected return, except we can conceptualize it as 100% of our portfolio returning $\mu_C$, plus, for each non-C asset, how much more or less we'd make on that fraction of our portfolio against a C-based benchmark.
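Now we need to handle the quadratic term, $\lambda_A^2\sigma_A^2 + \lambda_B^2\sigma_B^2 + \lambda_C^2\sigma_C^2$. We substitute out $\lambda_C$, which yields $\lambda_A^2\sigma_A^2 + \lambda_B^2\sigma_B^2 + (1 - \lambda_A - \lambda_B)^2\sigma_C^2$. We note that, in general,
$$\left(1 - \sum_{i=1}^n x_i\right)^2 = 1 - 2\sum_{i=1}^n x_i + 2\sum_{1 \le i < j \le n} x_i x_j + \sum_{i=1}^n x_i^2.$$
This generalization will help us when we extend to the n-asset framework, but we can use it in our three-asset model too.
Now we expand and simplify the quadratic term for $\lambda_C$:
$$(1 - \lambda_A - \lambda_B)^2\sigma_C^2 = \left(1 - 2\lambda_A - 2\lambda_B + 2\lambda_A\lambda_B + \lambda_A^2 + \lambda_B^2\right)\sigma_C^2.$$
Collecting the linear and quadratic terms, the expected utility becomes
$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma}\left(1 + (1-\gamma)\left[\lambda_A(\mu_A - \mu_C) + \lambda_B(\mu_B - \mu_C) + \mu_C\right]dt + \frac{(1-\gamma)(-\gamma)}{2}\left[\lambda_A^2\sigma_A^2 + \lambda_B^2\sigma_B^2 + (1 - \lambda_A - \lambda_B)^2\sigma_C^2\right]dt\right) - 1}{1-\gamma}.$$
This expression for our expected utility incorporates all of the constraints of our model now that $\lambda_C$ has been eliminated in favor of $\lambda_A$ and $\lambda_B$, which crucially removes a degree of freedom from our model and allows the matrix inversion technique which follows to succeed.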
Optimizing Portfolio Weights to Maximize Expected Utility (Take Three)
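We know that the maximum of $E[U(W_{t+dt})]$ with respect to our $\lambda_i$s will have a tangent plane with zero gradient in the $\lambda_A$ and $\lambda_B$ directions. That is, $\frac{\partial E[U(W_{t+dt})]}{\partial \lambda_i} = 0$ for $i \in \{A, B\}$. From this we will get a system of equations which we can then solve to get our optimal portfolio allocations. We start by solving for $\frac{\partial E[U(W_{t+dt})]}{\partial \lambda_A}$. After dividing out the common factor $W_t^{1-\gamma}\,dt$, setting it to zero gives
$$(\mu_A - \mu_C) - \gamma\left[\lambda_A\sigma_A^2 - (1 - \lambda_A - \lambda_B)\sigma_C^2\right] = 0.$$
We can apply a symmetry argument to find $\frac{\partial E[U]}{\partial \lambda_B}$, because swapping the labels A and B leaves $E[U]$ unchanged:
$$(\mu_B - \mu_C) - \gamma\left[\lambda_B\sigma_B^2 - (1 - \lambda_A - \lambda_B)\sigma_C^2\right] = 0.$$
Setting each of these partial derivatives equal to zero gives us the system of equations we're looking for.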
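Once $\lambda_C = 1 - \lambda_A - \lambda_B$ is substituted, the two first-order conditions form a 2×2 linear system in $\lambda_A$ and $\lambda_B$. The sketch below solves that system with Cramer's rule. It is a minimal illustration under the independence assumption of this section; the function name and the sample inputs are my own.

```python
from typing import Sequence, Tuple


def three_asset_weights(mu: Sequence[float], sigma: Sequence[float],
                        gamma: float) -> Tuple[float, float, float]:
    """Solve the two first-order conditions for (lambda_A, lambda_B), then back out lambda_C.

    Rearranged first-order conditions (independent assets):
        lam_a * (va + vc) + lam_b * vc        = (mu_a - mu_c) / gamma + vc
        lam_a * vc        + lam_b * (vb + vc) = (mu_b - mu_c) / gamma + vc
    """
    mu_a, mu_b, mu_c = mu
    va, vb, vc = (s ** 2 for s in sigma)
    a11, a12, a21, a22 = va + vc, vc, vc, vb + vc
    b1 = (mu_a - mu_c) / gamma + vc
    b2 = (mu_b - mu_c) / gamma + vc
    det = a11 * a22 - a12 * a21  # equals va*vb + va*vc + vb*vc, positive for nonzero volatilities
    lam_a = (b1 * a22 - a12 * b2) / det
    lam_b = (a11 * b2 - a21 * b1) / det
    return lam_a, lam_b, 1.0 - lam_a - lam_b


if __name__ == "__main__":
    # Hypothetical inputs: equal drifts, so only the variances should matter.
    weights = three_asset_weights(mu=(0.05, 0.05, 0.05), sigma=(0.1, 0.2, 0.4), gamma=3.0)
    print([round(w, 4) for w in weights])
```

With equal drifts the weights are proportional to $1/\sigma_i^2$, so the lowest-volatility asset receives the largest share.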
Part III: Changing the Numeraire from Cash to a Risky Asset
In Part I, we made the case that there may be no such thing as a risk-free asset. In the case of treasuries, the typical example of the risk-free asset, the holder is exposed to inflation risks, dollar fluctuations, interest rate changes, and other factors. Given this, we constructed a model for the optimal allocation between two risky assets.
Every investor has their own basket of goods under which they estimate changes in their real purchasing power.1 This subjective basket is not exactly cash, which was the motivation for the first essay. But what if it were somehow a known tradable asset? Or even more simply, perhaps an investor wants to denominate their returns in ETH, SPY, or some known liquid asset. Does our previous analysis still hold if we change the currency units? In this essay, I examine the case where, instead of using cash as a base for both assets A and B, we designate asset A as the numeraire, expressing asset B in terms of A.
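Let us begin with the optimal allocations derived from the previous cash-denominated model:
$$\lambda_A = \frac{\mu_A - \mu_B + \gamma\sigma_B^2}{\gamma(\sigma_A^2 + \sigma_B^2)}, \qquad \lambda_B = \frac{\mu_B - \mu_A + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)}.$$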
Both assets A and B are initially denominated in cash. We now shift our perspective by setting asset A as the numeraire, effectively redefining all quantities in relation to A. This transition moves us from a cash-denominated framework to one in which asset A is the central reference point.
Key Definitions in the A-Denominated Model:
$S_{B/A}$: Price of asset B relative to A.
$S_{A/A}$: Price of asset A relative to itself.
$\mu_{B/A}$, $\sigma_{B/A}$: Drift and volatility of B with respect to A.
$\lambda_{A/A}$, $\lambda_{B/A}$: Portfolio weights for A and B in the A-denominated framework.
$W_{A,t}$: Wealth expressed in terms of A.
$dW_{A,t}$: Wealth dynamics in terms of A.
We start by solving for the simplest expressions.
$S_{A/A}$ is the price of asset A relative to itself, so $S_{A/A} = \frac{S_A}{S_A} = 1$.
Since $S_{A/A} = 1$, its drift and volatility are zero: $\mu_{A/A} = 0$, $\sigma_{A/A} = 0$.
The price of B in terms of A is given by $S_{B/A} = \frac{S_B}{S_A}$.
In this model, the drift and volatility of B relative to A are defined as:
$$\mu_{B/A} = \mu_B - \mu_A, \qquad \sigma_{B/A} = \sqrt{\sigma_A^2 + \sigma_B^2}.$$
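The transition from the cash-denominated optimal B allocation, $\lambda_B = \frac{\mu_B - \mu_A + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)}$, to the A-denominated model can be achieved by the following transformations:
Replace the cash-denominated drift difference $\mu_B - \mu_A$ with $\mu_{B/A}$:
$$\frac{\mu_B - \mu_A + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)}$$
Next, replace the combined variance $\sigma_A^2 + \sigma_B^2$ with $\sigma_{B/A}^2$:
$$\frac{\mu_{B/A} + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)} = \frac{\mu_{B/A} + \gamma\sigma_A^2}{\gamma\sigma_{B/A}^2}$$
Finally, note that the numeraire asset A/A has zero drift and zero volatility, $\mu_{A/A} = 0$ and $\sigma_{A/A} = 0$, so the remaining $\gamma\sigma_A^2$ term in the numerator, which is A's own variance, becomes $\gamma\sigma_{A/A}^2 = 0$ in A-denominated terms:
$$\frac{\mu_{B/A} + \gamma\sigma_{A/A}^2}{\gamma\sigma_{B/A}^2} = \frac{\mu_{B/A}}{\gamma\sigma_{B/A}^2}$$
Thus, the optimal weight for B in the A-denominated framework becomes:
$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma\sigma_{B/A}^2}$$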
The derivation above is perfectly legitimate, though if we don't want to take as axiom the results of my previous essay, we can start again from scratch.
Fully Deriving the Optimal Portfolio Weights in the A-Denominated Model
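We start with maximizing our expected CRRA utility as before:
$$\max_{\lambda_{B/A}} E[U(W_{A,t+dt})].$$
We note that in the A-denominated model the only risky asset is B/A, so the wealth dynamics are driven solely by B/A. The optimization problem simplifies to maximizing expected utility with respect to $\lambda_{B/A}$.
Since $S_{A/A} = 1$, asset A contributes no differential to wealth dynamics in A-terms. Thus, wealth dynamics in this A-denominated framework are driven solely by B/A:
$$dW_{A,t} = \lambda_{B/A} W_{A,t} \frac{dS_{B/A,t}}{S_{B/A,t}} = \lambda_{B/A} W_{A,t}\left(\mu_{B/A}\,dt + \sigma_{B/A}\,dN_{B/A,t}\right),$$
where $N_{B/A,t}$ denotes the Wiener process driving B/A. Repeating the Taylor expansion and first-order condition from Part I with this single risky asset gives $\mu_{B/A} - \gamma\lambda_{B/A}\sigma_{B/A}^2 = 0$.
We can now express the optimal weights in the A-denominated model directly in terms of $\mu_{B/A}$ and $\sigma_{B/A}^2$.
$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma\sigma_{B/A}^2}, \qquad \lambda_{A/A} = 1 - \lambda_{B/A}.$$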
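As a quick numeric illustration, here is a minimal Python sketch that takes this essay's definitions of $\mu_{B/A}$ and $\sigma_{B/A}$ as given and evaluates the A-denominated weights. The function names and sample parameters are hypothetical, and the sketch does not re-derive the definitions themselves.

```python
import math
from typing import Tuple


def a_denominated_params(mu_a: float, mu_b: float,
                         sigma_a: float, sigma_b: float) -> Tuple[float, float]:
    """Drift and volatility of B relative to A, using the definitions in this essay."""
    mu_ba = mu_b - mu_a
    sigma_ba = math.sqrt(sigma_a ** 2 + sigma_b ** 2)  # assumes A and B are independent
    return mu_ba, sigma_ba


def merton_share(drift: float, sigma: float, gamma: float) -> float:
    """Merton-style share: drift / (gamma * sigma^2)."""
    return drift / (gamma * sigma ** 2)


if __name__ == "__main__":
    # Hypothetical cash-denominated parameters, for illustration only.
    mu_ba, sigma_ba = a_denominated_params(mu_a=0.05, mu_b=0.08, sigma_a=0.10, sigma_b=0.20)
    lam_ba = merton_share(mu_ba, sigma_ba, gamma=3.0)
    lam_aa = 1.0 - lam_ba
    print(f"lambda_B/A = {lam_ba:.4f}, lambda_A/A = {lam_aa:.4f}")
```

Here $\mu_{B/A} = 0.03$ and $\sigma_{B/A}^2 = 0.05$, so about 20% of A-denominated wealth goes to B/A and the rest stays in the numeraire.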
Thus, after transforming the cash-denominated model's optimal weights to the A-denominated model's optimal weights, we see that we've derived the famous Merton share.
Part IV: Extension to n Assets
In Part I, we solved the binary asset case. In Part II, we tried to move from two assets to three assets. The three-asset case is useful because it reveals something important: the algebra gets messy very quickly if we try to expand everything by hand.
At two assets, it is perfectly reasonable to eliminate one weight by writing λB=1−λA. At three assets, it is still possible to eliminate one weight by writing λC=1−λA−λB, though the expression becomes more cumbersome. But at n assets, this approach becomes a bit masochistic.
There is also a subtle but important calculus point here. Once we write $\lambda_C = 1 - \lambda_A - \lambda_B$, the remaining variables $\lambda_A$ and $\lambda_B$ are independent coordinates for the constrained portfolio surface. So when we differentiate with respect to $\lambda_A$, we hold $\lambda_B$ fixed. We do not need to include a term like $\frac{d\lambda_B}{d\lambda_A}$.
Rather than trying to manage all of this by hand, the cleanest way forward is to use a Lagrange multiplier. This lets us impose the full-investment constraint directly and gives a formula that scales naturally from two assets to three assets to n assets.
Model Definition
Suppose we have a universe of $n$ risky assets indexed by $i \in \{1, 2, \ldots, n\}$. Each asset price follows an independent GBM process:
$$\frac{dS_{i,t}}{S_{i,t}} = \mu_i\,dt + \sigma_i\,dN_{i,t},$$
where $\mu_i$ is the drift of asset $i$, $\sigma_i$ is the volatility of asset $i$, and $N_{i,t}$ is a Wiener process for asset $i$.
We assume the Brownian shocks are independent across assets, so
$$E[dN_{i,t}\,dN_{j,t}] = \begin{cases} dt, & i = j, \\ 0, & i \neq j. \end{cases}$$
Let $\lambda_i$ denote the portfolio weight on asset $i$. As before, our portfolio weights must sum to one:
$$\sum_{i=1}^n \lambda_i = 1.$$
We again use CRRA utility:
$$U(W) = \frac{W^{1-\gamma} - 1}{1-\gamma},$$
where $\gamma$ is our relative risk aversion parameter.
Our goal is to solve
$$\max_{\lambda_1, \ldots, \lambda_n} E[U(W_{t+dt})],$$
subject to
$$\sum_{i=1}^n \lambda_i = 1.$$
Wealth Dynamics
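The wealth process evolves according to the chosen portfolio weights:
$$dW_t = W_t\sum_{i=1}^n \lambda_i \frac{dS_{i,t}}{S_{i,t}}.$$
Substituting in each asset's GBM process gives
$$dW_t = W_t\sum_{i=1}^n \lambda_i(\mu_i\,dt + \sigma_i\,dN_{i,t}).$$
Equivalently,
$$\frac{dW_t}{W_t} = \sum_{i=1}^n \lambda_i\mu_i\,dt + \sum_{i=1}^n \lambda_i\sigma_i\,dN_{i,t}.$$
Now define
$$x = \sum_{i=1}^n \lambda_i(\mu_i\,dt + \sigma_i\,dN_{i,t}).$$
Then
$$W_{t+dt} = W_t(1+x).$$
So expected utility becomes
$$E[U(W_{t+dt})] = E\left[\frac{W_t^{1-\gamma}(1+x)^{1-\gamma} - 1}{1-\gamma}\right].$$
Since $W_t$ is known at time $t$, we can pull $W_t^{1-\gamma}$ out of the expectation:
$$E[U(W_{t+dt})] = \frac{W_t^{1-\gamma}E[(1+x)^{1-\gamma}] - 1}{1-\gamma}.$$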
Taylor Expanding Expected Utility
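As before, we use a second-order Taylor expansion:
$$E[(1+x)^{1-\gamma}] \approx 1 + (1-\gamma)E[x] + \frac{(1-\gamma)(-\gamma)}{2}E[x^2].$$
First, we compute $E[x]$:
$$E[x] = E\left[\sum_{i=1}^n \lambda_i(\mu_i\,dt + \sigma_i\,dN_{i,t})\right].$$
Since $E[dN_{i,t}] = 0$, this reduces to
$$E[x] = \left(\sum_{i=1}^n \lambda_i\mu_i\right)dt.$$
Now we compute $E[x^2]$:
$$E[x^2] = E\left[\left(\sum_{i=1}^n \lambda_i(\mu_i\,dt + \sigma_i\,dN_{i,t})\right)^2\right].$$
The $dt^2$ terms are of higher order and disappear. The $dt\,dN_{i,t}$ terms disappear in expectation. And since the assets are independent, the cross terms $dN_{i,t}\,dN_{j,t}$ vanish for $i \neq j$.
Thus, only the own-variance terms remain:
$$E[x^2] = \left(\sum_{i=1}^n \lambda_i^2\sigma_i^2\right)dt.$$
Substituting these terms back into the Taylor expansion gives
$$E[U(W_{t+dt})] \approx \frac{W_t^{1-\gamma} - 1}{1-\gamma} + W_t^{1-\gamma}\left(\sum_{i=1}^n \lambda_i\mu_i - \frac{\gamma}{2}\sum_{i=1}^n \lambda_i^2\sigma_i^2\right)dt.$$
The terms $W_t$, $\gamma$, and $dt$ are fixed with respect to our portfolio weights, and the factor $W_t^{1-\gamma}\,dt$ is positive. Therefore, maximizing expected utility is equivalent to maximizing the simpler quadratic objective
$$\sum_{i=1}^n \lambda_i\mu_i - \frac{\gamma}{2}\sum_{i=1}^n \lambda_i^2\sigma_i^2,$$
subject to
$$\sum_{i=1}^n \lambda_i = 1.$$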
This is nice. The whole expected utility problem has collapsed into a tradeoff between portfolio drift and portfolio variance.
Solving the n-Asset Problem with a Lagrange Multiplier
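We now solve
$$\max_{\lambda_1, \ldots, \lambda_n}\left[\sum_{i=1}^n \lambda_i\mu_i - \frac{\gamma}{2}\sum_{i=1}^n \lambda_i^2\sigma_i^2\right],$$
subject to
$$\sum_{i=1}^n \lambda_i = 1.$$
Define the Lagrangian:
$$\mathcal{L} = \sum_{i=1}^n \lambda_i\mu_i - \frac{\gamma}{2}\sum_{i=1}^n \lambda_i^2\sigma_i^2 - \nu\left(\sum_{i=1}^n \lambda_i - 1\right),$$
where $\nu$ is the Lagrange multiplier on the full-investment constraint.
Taking the derivative with respect to $\lambda_i$ gives
$$\frac{\partial \mathcal{L}}{\partial \lambda_i} = \mu_i - \gamma\lambda_i\sigma_i^2 - \nu.$$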
Setting the first-order condition equal to zero:
μi−γλiσi2−ν=0.
Rearranging,
γλiσi2=μi−ν.
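Therefore,
$$\lambda_i = \frac{\mu_i - \nu}{\gamma\sigma_i^2}.$$
This already tells us a lot. The optimal weight on asset $i$ is increasing in its drift $\mu_i$ and, for a given excess drift $\mu_i - \nu$, shrinks in magnitude as its variance $\sigma_i^2$ grows. This is exactly what we would hope to see.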
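Now we use the constraint that the weights sum to one:
$$\sum_{i=1}^n \lambda_i = 1.$$
Substituting in our expression for $\lambda_i$,
$$\sum_{i=1}^n \frac{\mu_i - \nu}{\gamma\sigma_i^2} = 1.$$
Multiplying both sides by $\gamma$,
$$\sum_{i=1}^n \frac{\mu_i - \nu}{\sigma_i^2} = \gamma.$$
Expanding,
$$\sum_{i=1}^n \frac{\mu_i}{\sigma_i^2} - \nu\sum_{i=1}^n \frac{1}{\sigma_i^2} = \gamma.$$
Solving for $\nu$ gives
$$\nu = \frac{\sum_{i=1}^n \frac{\mu_i}{\sigma_i^2} - \gamma}{\sum_{i=1}^n \frac{1}{\sigma_i^2}}.$$
Thus, the optimal allocation to asset $i$ is
$$\lambda_i^* = \frac{\mu_i - \nu}{\gamma\sigma_i^2},$$
where
$$\nu = \frac{\sum_{j=1}^n \frac{\mu_j}{\sigma_j^2} - \gamma}{\sum_{j=1}^n \frac{1}{\sigma_j^2}}.$$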
This is the clean n-asset solution for independent risky assets.
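The Lagrange multiplier form translates almost line for line into code. Here is a minimal Python sketch, assuming independent assets and strictly positive volatilities; the function names and the sample inputs are my own illustrative choices.

```python
from typing import List, Sequence


def lagrange_multiplier(mu: Sequence[float], sigma: Sequence[float], gamma: float) -> float:
    """nu = (sum_j mu_j / sigma_j^2 - gamma) / (sum_j 1 / sigma_j^2)."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    return (sum(m * w for m, w in zip(mu, inv_var)) - gamma) / sum(inv_var)


def optimal_weights_independent(mu: Sequence[float], sigma: Sequence[float],
                                gamma: float) -> List[float]:
    """lambda_i = (mu_i - nu) / (gamma * sigma_i^2) for independent assets."""
    nu = lagrange_multiplier(mu, sigma, gamma)
    return [(m - nu) / (gamma * s ** 2) for m, s in zip(mu, sigma)]


if __name__ == "__main__":
    # Hypothetical four-asset universe, for illustration only.
    mu = [0.08, 0.05, 0.03, 0.10]
    sigma = [0.20, 0.10, 0.15, 0.40]
    weights = optimal_weights_independent(mu, sigma, gamma=3.0)
    print([round(w, 4) for w in weights], "sum =", round(sum(weights), 10))
```

Whatever the inputs, the weights sum to one, because $\nu$ is chosen precisely so that the full-investment constraint holds.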
Sanity Check: Recovering the Two-Asset Formula
It is worth checking that this formula recovers our original binary asset result.
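For two assets, the solution gives
$$\lambda_A^* = \frac{\mu_A - \nu}{\gamma\sigma_A^2}, \qquad \lambda_B^* = \frac{\mu_B - \nu}{\gamma\sigma_B^2},$$
where
$$\nu = \frac{\frac{\mu_A}{\sigma_A^2} + \frac{\mu_B}{\sigma_B^2} - \gamma}{\frac{1}{\sigma_A^2} + \frac{1}{\sigma_B^2}}.$$
After simplifying, this gives
$$\lambda_A^* = \frac{\mu_A - \mu_B + \gamma\sigma_B^2}{\gamma(\sigma_A^2 + \sigma_B^2)},$$
and
$$\lambda_B^* = \frac{\mu_B - \mu_A + \gamma\sigma_A^2}{\gamma(\sigma_A^2 + \sigma_B^2)}.$$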
This is exactly the result from Part I.
Sanity Check: The Correct Three-Asset Formula
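The same formula also gives a clean three-asset result. For assets A, B, and C, we get
$$\lambda_A^* = \frac{\gamma\sigma_B^2\sigma_C^2 + (\mu_A - \mu_B)\sigma_C^2 + (\mu_A - \mu_C)\sigma_B^2}{\gamma(\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2)},$$
$$\lambda_B^* = \frac{\gamma\sigma_A^2\sigma_C^2 + (\mu_B - \mu_A)\sigma_C^2 + (\mu_B - \mu_C)\sigma_A^2}{\gamma(\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2)},$$
$$\lambda_C^* = \frac{\gamma\sigma_A^2\sigma_B^2 + (\mu_C - \mu_A)\sigma_B^2 + (\mu_C - \mu_B)\sigma_A^2}{\gamma(\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2)}.$$
These weights sum to one. They also have a nice interpretation: each asset receives a baseline allocation determined by the other assets' variances, plus a tilt based on its return advantage over the other assets.
If all three assets have the same expected return, so $\mu_A = \mu_B = \mu_C$, then all of the return-difference terms disappear and we get
$$\lambda_A^* = \frac{\sigma_B^2\sigma_C^2}{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2}, \quad \lambda_B^* = \frac{\sigma_A^2\sigma_C^2}{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2}, \quad \lambda_C^* = \frac{\sigma_A^2\sigma_B^2}{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_C^2 + \sigma_B^2\sigma_C^2}.$$
This is exactly the inverse-variance allocation. When all assets have the same drift, the only remaining problem is how to allocate across risk, and the lower-variance assets receive larger weights.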
Fully Scalar n-Asset Formula
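We can also write the n-asset solution as a single scalar expression:
$$\lambda_i^* = \frac{\gamma\prod_{k \neq i}\sigma_k^2 + \sum_{j \neq i}(\mu_i - \mu_j)\prod_{k \neq i,j}\sigma_k^2}{\gamma\sum_{j=1}^n\prod_{k \neq j}\sigma_k^2}.$$
The convention here is that an empty product equals 1. This matters in the two-asset case, where the term $\prod_{k \neq i,j}\sigma_k^2$ has no elements.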
This scalar expression is useful because it shows the direct generalization of the two-asset and three-asset formulas. But it is not the form I would actually use computationally. For computation and interpretation, the Lagrange multiplier form is cleaner.
Interpreting the Independent n-Asset Solution
We can rewrite the solution in a way that makes the intuition clearer.
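Define
$$H = \sum_{j=1}^n \frac{1}{\sigma_j^2}.$$
Now define the inverse-variance-weighted average drift:
$$\bar{\mu}_{\sigma^{-2}} = \frac{1}{H}\sum_{j=1}^n \frac{\mu_j}{\sigma_j^2}.$$
Since $\nu = \bar{\mu}_{\sigma^{-2}} - \frac{\gamma}{H}$, the optimal weight can be written as
$$\lambda_i^* = \frac{\frac{1}{\sigma_i^2}}{\sum_{j=1}^n \frac{1}{\sigma_j^2}} + \frac{\mu_i - \bar{\mu}_{\sigma^{-2}}}{\gamma\sigma_i^2}.$$
This is maybe the most intuitive version of the independent asset result.
The first term,
$$\frac{\frac{1}{\sigma_i^2}}{\sum_{j=1}^n \frac{1}{\sigma_j^2}},$$
is the inverse-variance allocation. It is the portfolio we get if expected returns are all equal and we only care about minimizing variance subject to being fully invested.
The second term,
$$\frac{\mu_i - \bar{\mu}_{\sigma^{-2}}}{\gamma\sigma_i^2},$$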
is the speculative tilt. Assets with drifts above the inverse-variance-weighted average drift receive larger allocations. Assets with drifts below that average receive smaller allocations.
As γ increases, the speculative tilt shrinks. In the limit as γ→∞, the investor becomes infinitely risk averse and the portfolio approaches the inverse-variance allocation. As γ decreases, the investor becomes more willing to tilt toward assets with higher expected returns.
This is a satisfying result. The model says that a risk-averse investor starts with an inverse-variance portfolio and then tilts toward assets that have better expected returns relative to that baseline.
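The two-piece structure is easy to see numerically. The following Python sketch splits the independent-asset solution into its inverse-variance baseline and its speculative tilt, then shows the tilt shrinking as $\gamma$ grows. Function names and sample inputs are illustrative assumptions.

```python
from typing import List, Sequence


def inverse_variance_weights(sigma: Sequence[float]) -> List[float]:
    """Baseline allocation: weights proportional to 1 / sigma_i^2."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    total = sum(inv_var)
    return [w / total for w in inv_var]


def speculative_tilt(mu: Sequence[float], sigma: Sequence[float], gamma: float) -> List[float]:
    """Tilt (mu_i - mu_bar) / (gamma * sigma_i^2), with mu_bar the inverse-variance-weighted drift."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    mu_bar = sum(m * w for m, w in zip(mu, inv_var)) / sum(inv_var)
    return [(m - mu_bar) / (gamma * s ** 2) for m, s in zip(mu, sigma)]


def decomposed_weights(mu: Sequence[float], sigma: Sequence[float], gamma: float) -> List[float]:
    """Optimal weights as baseline plus tilt."""
    base = inverse_variance_weights(sigma)
    tilt = speculative_tilt(mu, sigma, gamma)
    return [b + t for b, t in zip(base, tilt)]


if __name__ == "__main__":
    mu, sigma = [0.08, 0.05, 0.03], [0.20, 0.10, 0.15]  # hypothetical inputs
    print("baseline:", [round(w, 4) for w in inverse_variance_weights(sigma)])
    for gamma in (1.0, 5.0, 50.0):
        tilt = speculative_tilt(mu, sigma, gamma)
        print(f"gamma = {gamma:>4}: tilt =", [round(t, 4) for t in tilt])
```

The tilt always sums to zero, so it only moves weight between assets and never changes the total invested.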
Part V: Matrix Form and Correlated Assets
The independent asset model is mathematically convenient, but it is obviously a simplification. In real markets, assets are correlated. Stocks move together. Bonds and equities can become correlated in crises. Crypto assets often behave like one giant risk factor wearing different ticker symbols.
So the next natural extension is to replace the independent variance term
$$\sum_{i=1}^n \lambda_i^2\sigma_i^2$$
with a full covariance matrix.
Before doing that, it helps to rewrite the independent case in matrix form.
Matrix Form of the Independent Asset Case
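Let
$$\lambda = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}, \quad \mathbf{1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
For independent assets, define the diagonal covariance matrix
$$D = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}.$$
The objective function becomes
$$\lambda^\top\mu - \frac{\gamma}{2}\lambda^\top D\lambda,$$
subject to
$$\mathbf{1}^\top\lambda = 1.$$
The Lagrangian is
$$\mathcal{L} = \lambda^\top\mu - \frac{\gamma}{2}\lambda^\top D\lambda - \nu(\mathbf{1}^\top\lambda - 1).$$
Taking the derivative with respect to $\lambda$ and setting it equal to zero gives
$$\mu - \gamma D\lambda - \nu\mathbf{1} = 0.$$
Therefore,
$$\gamma D\lambda = \mu - \nu\mathbf{1},$$
and so
$$\lambda = \frac{1}{\gamma}D^{-1}(\mu - \nu\mathbf{1}).$$
Now impose the full-investment constraint:
$$\mathbf{1}^\top\lambda = 1.$$
Substituting in the expression for $\lambda$,
$$\mathbf{1}^\top\frac{1}{\gamma}D^{-1}(\mu - \nu\mathbf{1}) = 1.$$
Multiplying through by $\gamma$,
$$\mathbf{1}^\top D^{-1}\mu - \nu\,\mathbf{1}^\top D^{-1}\mathbf{1} = \gamma.$$
Solving for $\nu$ gives
$$\nu = \frac{\mathbf{1}^\top D^{-1}\mu - \gamma}{\mathbf{1}^\top D^{-1}\mathbf{1}}.$$
Thus, in matrix form, the independent asset solution is
$$\lambda^* = \frac{1}{\gamma}D^{-1}(\mu - \nu\mathbf{1}),$$
where
$$\nu = \frac{\mathbf{1}^\top D^{-1}\mu - \gamma}{\mathbf{1}^\top D^{-1}\mathbf{1}}.$$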
This is the same result as before. The only difference is that the notation is cleaner.
The Correlated Asset Case
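Now suppose the assets are correlated. Instead of assuming
$$E[dN_{i,t}\,dN_{j,t}] = 0 \quad \text{for } i \neq j,$$
we allow
$$E[dN_{i,t}\,dN_{j,t}] = \rho_{ij}\,dt.$$
The covariance between the instantaneous returns of assets $i$ and $j$ is then
$$\Sigma_{ij}\,dt = \rho_{ij}\sigma_i\sigma_j\,dt,$$
with $\rho_{ii} = 1$, so that $\Sigma$ is the full covariance matrix. Replacing $D$ with $\Sigma$ in the objective gives $\lambda^\top\mu - \frac{\gamma}{2}\lambda^\top\Sigma\lambda$, and the same Lagrangian steps go through as long as $\Sigma$ is invertible.
Thus, the optimal portfolio under correlated risky assets is
$$\lambda^* = \frac{1}{\gamma}\Sigma^{-1}(\mu - \nu\mathbf{1}),$$
where
$$\nu = \frac{\mathbf{1}^\top\Sigma^{-1}\mu - \gamma}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}.$$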
This is the general solution. The independent asset model is just the special case where Σ is diagonal.
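For readers who want to try the correlated case, here is a dependency-free Python sketch. Rather than forming $\Sigma^{-1}$ explicitly, it solves $\Sigma y = \mu$ and $\Sigma z = \mathbf{1}$ with a small Gaussian elimination routine, which is a choice I made for the illustration; in practice one would likely reach for a linear algebra library. All function names and inputs are hypothetical.

```python
from typing import List, Sequence


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("covariance matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def covariance_matrix(sigma: Sequence[float], rho: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sigma_ij = rho_ij * sigma_i * sigma_j."""
    n = len(sigma)
    return [[rho[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]


def optimal_weights_correlated(mu: Sequence[float], cov: Sequence[Sequence[float]],
                               gamma: float) -> List[float]:
    """lambda = (1 / gamma) * Sigma^{-1} (mu - nu * 1), with nu set by full investment."""
    y = solve_linear(cov, list(mu))          # Sigma^{-1} mu
    z = solve_linear(cov, [1.0] * len(mu))   # Sigma^{-1} 1
    nu = (sum(y) - gamma) / sum(z)
    return [(yi - nu * zi) / gamma for yi, zi in zip(y, z)]


if __name__ == "__main__":
    sigma = [0.20, 0.10]
    cov = covariance_matrix(sigma, [[1.0, 0.5], [0.5, 1.0]])  # hypothetical correlation of 0.5
    print([round(w, 4) for w in optimal_weights_correlated([0.08, 0.05], cov, gamma=3.0)])
```

Setting the off-diagonal correlations to zero recovers the independent-asset weights, which is a handy regression check.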
Minimum-Variance Portfolio Plus a Speculative Tilt
We can also decompose the correlated asset solution into two pieces.
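The first piece is the global minimum-variance portfolio:
$$\lambda_{MV} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}.$$
The second piece is a return-seeking tilt:
$$\frac{1}{\gamma}\left[\Sigma^{-1}\mu - \frac{\mathbf{1}^\top\Sigma^{-1}\mu}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}\Sigma^{-1}\mathbf{1}\right].$$
Putting them together,
$$\lambda^* = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}} + \frac{1}{\gamma}\left[\Sigma^{-1}\mu - \frac{\mathbf{1}^\top\Sigma^{-1}\mu}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}\Sigma^{-1}\mathbf{1}\right].$$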
This is a very useful way to understand the model.
The first term is the allocation we would choose if we only cared about minimizing variance while remaining fully invested. It does not use expected returns at all. It only uses the covariance matrix.
The second term is the speculative component. It tilts the portfolio toward assets that have attractive expected returns relative to the covariance structure of the asset universe.
As γ→∞, the speculative term disappears and we converge to the global minimum-variance portfolio. As γ gets smaller, the speculative term becomes larger.
This is the exact same intuition as the independent asset case, except the meaning of risk is richer. We no longer penalize each asset only by its own variance. We penalize it by how it contributes to total portfolio variance.
That distinction matters. A high-volatility asset can still receive a meaningful allocation if it diversifies the rest of the portfolio. Conversely, a seemingly safe asset can receive a smaller allocation if it is highly correlated with everything else we already own.
Part VI: What the Formula Is Really Saying
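At this point, we have a compact solution for the optimal allocation across $n$ risky assets:
$$\lambda^* = \frac{1}{\gamma}\Sigma^{-1}(\mu - \nu\mathbf{1}),$$
where
$$\nu = \frac{\mathbf{1}^\top\Sigma^{-1}\mu - \gamma}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}.$$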
This is a pretty small formula given how much it contains.
The model says that optimal portfolio choice depends on three things:
expected returns, encoded by μ;
the covariance structure of returns, encoded by Σ;
relative risk aversion, encoded by γ.
The role of γ is especially clear. Higher γ means we care more about variance and less about expected return. Lower γ means we are more willing to tolerate variance in pursuit of expected return.
But the formula also tells us something subtler. In a fully invested risky-asset-only portfolio, we are not deciding how much risky exposure to hold relative to cash. We are deciding how to distribute risky exposure across assets. This is why the solution naturally contains a minimum-variance portfolio.
When there is no risk-free asset, the investor must hold something. If expected returns are all equal, then the entire problem collapses into choosing the lowest-variance way to remain fully invested. In the independent asset case, that means inverse-variance weighting. In the correlated asset case, that means the global minimum-variance portfolio.
Expected returns then create tilts away from that baseline.
This is also why the risky-numeraire result from Part III is so clean. When we denominate everything in terms of asset A, the asset A/A has no price movement relative to itself. It becomes the reference point. The only risky decision left in the two-asset A-denominated world is how much of B/A to hold. That is why the formula collapses back to the familiar Merton-style share:
$$\lambda_{B/A} = \frac{\mu_{B/A}}{\gamma\sigma_{B/A}^2}.$$
The cash-denominated risky/risky problem and the risky-numeraire problem are not contradictory. They are just different ways of representing the same underlying portfolio choice.
The Shadow Return Interpretation of ν
The Lagrange multiplier $\nu$ is also worth interpreting. From the first-order condition,
$$\mu_i - \gamma\lambda_i\sigma_i^2 - \nu = 0$$
in the independent case. Rearranging,
$$\mu_i - \nu = \gamma\lambda_i\sigma_i^2.$$
So ν acts like a return threshold created by the full-investment constraint. Assets with expected returns above ν tend to receive larger weights, scaled by their variance. Assets with expected returns below ν tend to receive smaller weights or even negative weights if shorting is allowed.
This is similar in spirit to the role played by a risk-free rate in the classic Merton Share. But here, ν is not an externally given risk-free rate. It is determined endogenously by the asset universe, the covariance structure, and the fact that the portfolio weights must sum to one.
That is a nice conceptual payoff. If there is no risk-free asset in the model, the optimization still creates a benchmark return internally.
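To make the threshold reading of $\nu$ concrete, the Python sketch below computes $\nu$ for a small independent-asset universe and checks that the risk-adjusted marginal return $\mu_i - \gamma\lambda_i\sigma_i^2$ equals $\nu$ for every asset at the optimum. The helper names and the numbers are assumptions chosen so the arithmetic is easy to follow.

```python
from typing import List, Sequence


def endogenous_threshold(mu: Sequence[float], sigma: Sequence[float], gamma: float) -> float:
    """The multiplier nu implied by the full-investment constraint (independent assets)."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    return (sum(m * w for m, w in zip(mu, inv_var)) - gamma) / sum(inv_var)


def weights_from_threshold(mu: Sequence[float], sigma: Sequence[float],
                           gamma: float, nu: float) -> List[float]:
    """lambda_i = (mu_i - nu) / (gamma * sigma_i^2)."""
    return [(m - nu) / (gamma * s ** 2) for m, s in zip(mu, sigma)]


def marginal_risk_adjusted_return(mu_i: float, lam_i: float, sigma_i: float, gamma: float) -> float:
    """mu_i - gamma * lambda_i * sigma_i^2, which should equal nu at the optimum."""
    return mu_i - gamma * lam_i * sigma_i ** 2


if __name__ == "__main__":
    mu, sigma, gamma = [0.10, 0.06, 0.01], [0.20, 0.20, 0.20], 2.0  # hypothetical inputs
    nu = endogenous_threshold(mu, sigma, gamma)
    lam = weights_from_threshold(mu, sigma, gamma, nu)
    print(f"nu = {nu:.4f}")
    for m, l, s in zip(mu, lam, sigma):
        side = "above" if m > nu else "below"
        print(f"mu = {m:.2f} ({side} nu): weight = {l:+.4f}, "
              f"marginal = {marginal_risk_adjusted_return(m, l, s, gamma):.4f}")
```

In this example $\nu$ works out to about 3%, so the asset with a 1% drift sits below the threshold and receives a negative weight.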
What Happens If We Ban Shorting?
The formulas above allow shorting and leverage in the sense that individual weights can be negative or greater than one, as long as the weights sum to one. This is standard in the clean mathematical version of the problem, but it may not be what we want in practice.
If we impose constraints like
λi≥0
for all i, or
0≤λi≤1,
then the closed-form solution may no longer apply directly. The unconstrained optimum might tell us to short an asset with a bad risk-adjusted expected return. If shorting is not allowed, that asset's weight gets pushed to zero and the optimization has to be solved with inequality constraints.
In that case, the right mathematical object is a constrained quadratic program. Conceptually, though, the logic remains similar. We are still balancing expected return against variance, but some assets may hit boundary constraints and drop out of the active portfolio.
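For the independent-asset case, the no-shorting problem has a particularly simple structure: the KKT conditions give $\lambda_i = \max\left(0, \frac{\mu_i - \nu}{\gamma\sigma_i^2}\right)$, with $\nu$ chosen so the weights sum to one. The sketch below finds that $\nu$ by bisection. This is a technique I am swapping in for illustration, not a general quadratic programming solver, and it does not handle correlated assets or upper bounds. Names and inputs are hypothetical.

```python
from typing import List, Sequence, Tuple


def long_only_weights_independent(mu: Sequence[float], sigma: Sequence[float], gamma: float,
                                  tol: float = 1e-12, max_iter: int = 200) -> Tuple[List[float], float]:
    """Long-only weights for independent assets via bisection on the multiplier nu."""
    var = [s ** 2 for s in sigma]

    def total_weight(nu: float) -> float:
        # Sum of clipped weights; continuous and non-increasing in nu.
        return sum(max(0.0, (m - nu) / (gamma * v)) for m, v in zip(mu, var))

    hi = max(mu)                                           # total_weight(hi) == 0
    lo = min(m - gamma * v for m, v in zip(mu, var))       # total_weight(lo) >= 1
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if total_weight(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    nu = 0.5 * (lo + hi)
    weights = [max(0.0, (m - nu) / (gamma * v)) for m, v in zip(mu, var)]
    scale = sum(weights)  # absorb the tiny residual left by the bisection tolerance
    return [w / scale for w in weights], nu


if __name__ == "__main__":
    # Hypothetical inputs: the unconstrained optimum would short the third asset.
    weights, nu = long_only_weights_independent([0.10, 0.06, 0.01], [0.20, 0.20, 0.20], gamma=2.0)
    print([round(w, 4) for w in weights], f"nu = {nu:.4f}")
```

Here the low-drift asset is pushed to a zero weight and drops out of the active portfolio, while the remaining two assets are re-balanced against a higher threshold $\nu$.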
This is another reason I like deriving the unconstrained solution first. It gives the clean benchmark. Then constraints can be layered on top.
Limitations
The obvious weakness of this whole setup is that the inputs are doing enormous work.
The formulas look precise, but the quantities μ and Σ are not handed to us by nature. We have to estimate them. And estimating expected returns is notoriously difficult. Small changes in μ can lead to large changes in optimal weights, especially when γ is low.
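The sensitivity to $\mu$ is easy to quantify in the independent-asset model. The Python sketch below bumps one asset's drift by a single percentage point and reports how far the optimal weights move at a low and a high $\gamma$. The function names and the inputs are illustrative assumptions.

```python
from typing import List, Sequence


def optimal_weights_independent(mu: Sequence[float], sigma: Sequence[float],
                                gamma: float) -> List[float]:
    """lambda_i = (mu_i - nu) / (gamma * sigma_i^2), with nu from the full-investment constraint."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    nu = (sum(m * w for m, w in zip(mu, inv_var)) - gamma) / sum(inv_var)
    return [(m - nu) / (gamma * s ** 2) for m, s in zip(mu, sigma)]


def weight_shift_from_drift_bump(mu: Sequence[float], sigma: Sequence[float], gamma: float,
                                 asset: int, bump: float) -> List[float]:
    """Change in every optimal weight when mu[asset] is increased by `bump`."""
    base = optimal_weights_independent(mu, sigma, gamma)
    bumped_mu = list(mu)
    bumped_mu[asset] += bump
    bumped = optimal_weights_independent(bumped_mu, sigma, gamma)
    return [b - a for a, b in zip(base, bumped)]


if __name__ == "__main__":
    mu, sigma = [0.08, 0.05], [0.20, 0.10]  # hypothetical inputs
    for gamma in (1.0, 5.0):
        shift = weight_shift_from_drift_bump(mu, sigma, gamma, asset=0, bump=0.01)
        print(f"gamma = {gamma}: weight change = {[round(s, 4) for s in shift]}")
```

In this two-asset example a one-point error in $\mu_A$ moves the allocation by about 20 percentage points when $\gamma = 1$, but only about 4 points when $\gamma = 5$.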
The covariance matrix is usually easier to estimate than expected returns, but it is still unstable. Correlations change. Volatilities change. Assets that looked diversifying in normal times can suddenly become highly correlated in a crisis.
There is also the GBM assumption. GBM is analytically convenient, but real return distributions have jumps, fat tails, volatility clustering, changing regimes, and all sorts of other unpleasant features. The model is useful, but it is not reality.
Still, I think the exercise is valuable. It shows the basic structure of the problem in its cleanest form. If we know our beliefs about expected returns, covariances, and risk aversion, then the optimal portfolio has a simple shape:
minimum-variance baseline + return-seeking tilt.
That is the main lesson.
Conclusion
We began with the two-risky-asset version of the Merton-style allocation problem. We then moved to three assets, changed the numeraire, and finally generalized the model to n risky assets.
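For independent assets, the optimal allocation to asset $i$ is
$$\lambda_i^* = \frac{\mu_i - \nu}{\gamma\sigma_i^2},$$
where
$$\nu = \frac{\sum_{j=1}^n \frac{\mu_j}{\sigma_j^2} - \gamma}{\sum_{j=1}^n \frac{1}{\sigma_j^2}}.$$
For correlated assets, the optimal portfolio is
$$\lambda^* = \frac{1}{\gamma}\Sigma^{-1}(\mu - \nu\mathbf{1}),$$
where
$$\nu = \frac{\mathbf{1}^\top\Sigma^{-1}\mu - \gamma}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}.$$
Equivalently,
$$\lambda^* = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}} + \frac{1}{\gamma}\left[\Sigma^{-1}\mu - \frac{\mathbf{1}^\top\Sigma^{-1}\mu}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}}\Sigma^{-1}\mathbf{1}\right].$$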
This final expression is my favorite version of the result. The first term is the minimum-variance portfolio. The second term is the speculative tilt. Risk aversion determines how much of that tilt we are willing to take.
So, in the end, the optimal risky-asset portfolio under CRRA utility has a surprisingly intuitive structure: start with the lowest-risk way to be fully invested, then tilt toward assets whose expected returns justify their contribution to portfolio risk.
Footnotes
1. I am skeptical of standard national CPI measures, as discussed in Chapter 5 of Keynes' Treatise on Money. I think exact CPI calculation seems like a fundamentally futile task, though it's nuanced so I haven't made up my mind yet.