Portfolio Optimization with Python: An Example from the SET50 Index

Suebsak Rochanarat, FRM
Feb 11, 2021

In this article, I will go through a popular optimization technique, Mean-Variance Optimization, introduced by Markowitz (1952). All code is written in Python 3 in Google Colaboratory.

If you are new to Google Colab, see the link below on how to start writing Python there.

Outline:

  • Understanding Mean-Variance Optimization
  • The mathematical expression
  • Coding
  • End notes

Before digging into the code, let’s briefly recap the theory behind Markowitz’s Mean-Variance Optimization.

Understanding Mean-Variance Optimization

If we assume that investors are rational, they weigh expected risk (expressed as variance) against expected return (expressed as mean). This assumption implies two behaviors:

1. Prefer More to Less

2. Risk Aversion

These behaviors lead to two ways of framing the optimization:
finding the highest expected return for a given level of expected risk,
or finding the lowest expected risk for a given level of expected return.

Readers can study Mean-Variance Analysis and Modern Portfolio Theory (MPT) further on Investopedia via the links below.

The mathematical expression

From the above, the method requires only two inputs: expected return and variance.

1. The expected return is the average of the returns observed over a given investment horizon, taken across the data points i.
Expected return | where i is the data point

2. The variance is simply the square of the standard deviation (S.D.) of returns.

Expected Variance
Expected Volatility
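
The three quantities above can be written out as follows. This is a reconstruction under the usual sample-statistics convention, not a copy of the original figures: it assumes n observed returns R_i and an (n − 1) divisor for the variance, which matches the default behavior of pandas’ var() and cov().

```latex
\bar{R} = \frac{1}{n} \sum_{i=1}^{n} R_i
\qquad
\sigma^{2} = \frac{1}{n-1} \sum_{i=1}^{n} \left( R_i - \bar{R} \right)^{2}
\qquad
\sigma = \sqrt{\sigma^{2}}
```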

Now that we know the parameters for a single security, let’s combine them into a portfolio.

Expected portfolio return
Expected portfolio risk of i to j securities
Expected portfolio risk in matrix expression
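
To make the portfolio formulas concrete, here is a minimal sketch that computes the expected portfolio return and the portfolio variance both as a double sum over pairs of securities and in matrix form (w'Σw). The weights, expected returns and covariance matrix are hypothetical numbers chosen for illustration, not SET50 data.

```python
import numpy as np

# Hypothetical inputs for three securities (illustrative numbers only).
weights = np.array([0.5, 0.3, 0.2])           # portfolio weights, summing to 1
mean_returns = np.array([0.10, 0.06, 0.04])   # expected return of each security
cov_matrix = np.array([
    [0.040, 0.006, 0.002],
    [0.006, 0.010, 0.001],
    [0.002, 0.001, 0.005],
])

# Expected portfolio return: sum over i of w_i * E(R_i)
port_return = float(weights @ mean_returns)

# Portfolio variance as a double sum over every pair (i, j) of securities
n = len(weights)
port_var_sum = sum(
    weights[i] * weights[j] * cov_matrix[i, j]
    for i in range(n)
    for j in range(n)
)

# The same variance in matrix form: w' Sigma w
port_var_matrix = float(weights @ cov_matrix @ weights)

# Portfolio volatility is the square root of the variance
port_vol = np.sqrt(port_var_matrix)
```

Both expressions give the same variance; the matrix form is simply the compact way to write the double sum, and it is the one that scales to 50 securities.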

Our optimization model can then be constructed as follows:
Objective function:

Optimization objective

Given constraints:

Optimization constraints | sum of weights = 1.00 (short selling not allowed)
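
The original objective and constraint figures are not reproduced here, so the following is one standard way to state the problem, assumed rather than taken from the source: minimize the portfolio variance for a target expected return μ*, with weights that sum to one and stay non-negative to rule out short selling. Here w is the weight vector, μ the vector of expected returns, Σ the covariance matrix and N the number of securities.

```latex
\min_{w} \; w^{\top} \Sigma \, w
\quad \text{subject to} \quad
w^{\top} \mu = \mu^{*}, \qquad
\sum_{i=1}^{N} w_i = 1, \qquad
w_i \ge 0 \;\; \text{for all } i
```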

Conventional optimization as in Markowitz (1952) can suffer from the ‘Corner Portfolio Problem’, where the optimizer concentrates the weights in only a few securities. To alleviate this, we can apply ‘Forced Diversification’ by setting weight constraints as follows:

Weight boundary

Readers can read about the Corner Portfolio Problem via the link below. Note that in this article I will not impose ‘Forced Diversification’ or other methods that aim to overcome the drawbacks of Mean-Variance Optimization, such as ‘Black-Litterman Optimization’ and ‘Resampled Optimization’.
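
Although forced diversification is not applied in the rest of this article, a weight boundary is easy to express in code. The sketch below is an illustration only: it uses scipy.optimize.minimize with the SLSQP method to find a minimum-variance portfolio, first with weights between 0 and 1, then with hypothetical bounds of 10% to 50% per security. The covariance matrix and the bounds are made-up values.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical covariance matrix for three securities (illustrative only).
cov_matrix = np.array([
    [0.040, 0.006, 0.002],
    [0.006, 0.010, 0.001],
    [0.002, 0.001, 0.005],
])

def portfolio_variance(weights, cov_matrix):
    """Portfolio variance w' Sigma w."""
    return weights @ cov_matrix @ weights

def min_variance_weights(cov_matrix, lower, upper):
    """Minimum-variance weights with every weight kept inside [lower, upper]."""
    n = cov_matrix.shape[0]
    constraints = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},)
    bounds = tuple((lower, upper) for _ in range(n))  # the weight boundary
    result = minimize(
        portfolio_variance,
        x0=np.full(n, 1.0 / n),      # start from equal weights
        args=(cov_matrix,),
        method="SLSQP",
        bounds=bounds,
        constraints=constraints,
        options={"ftol": 1e-12},
    )
    return result.x

# Long-only portfolio with no further restriction on individual weights
unconstrained = min_variance_weights(cov_matrix, 0.0, 1.0)

# 'Forced diversification': each weight must lie between 10% and 50%
diversified = min_variance_weights(cov_matrix, 0.10, 0.50)
```

The bounded portfolio cannot have a lower variance than the unbounded one, since it searches a smaller set of weights; that is the price paid for avoiding concentrated positions.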

Coding

1. Preparing Data

For this section, I will first retrieve the SET50 constituents from the official SET website. Please see the link below on how to web scrape the SET50 constituents with Python.

Building on the code in that article, I will append the ‘.BK’ suffix to each ticker so that stock price data can be retrieved from Yahoo Finance.

suffix = '.BK'
ticker_list = list(set50_constituents['Ticker'])
yf_ticker = [item + suffix for item in ticker_list]
set50_constituents['YF_Ticker'] = yf_ticker

Now, to get data from Yahoo Finance, replace the ticker_list in the code from the link below with:

ticker_list = list(set50_constituents['YF_Ticker'])

Run the code; I can then plot the cumulative return of each stock in the SET50 for 2020, as in the picture below.

import matplotlib.pyplot as plt

(data_df['2020':].pct_change() + 1).cumprod().plot(figsize=(12, 6), legend=None)
plt.ylabel('Growth of hypothetical 1 Bt.')
Cumulative returns of stocks in SET50 for Y2020
data_df['2015':].to_excel('2015to2020_SET50_ClosedPrice.xlsx')  # export to Excel
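
The cumulative-return expression above can be checked on a toy example. The sketch below uses a small hypothetical price table (the tickers AAA.BK and BBB.BK and their prices are invented) to show that (pct_change() + 1).cumprod() gives the growth of one unit of currency invested on the first date.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers.
prices = pd.DataFrame(
    {"AAA.BK": [10.0, 11.0, 9.9], "BBB.BK": [20.0, 20.0, 22.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

# Daily simple returns; the first row is NaN because there is no prior price.
daily_ret = prices.pct_change()

# Chain the gross returns (1 + r) to get the growth of 1 unit invested.
growth = (daily_ret + 1).cumprod()

# The first row stays NaN; filling it with 1.0 anchors every line at 1.
growth_filled = growth.fillna(1.0)
```

The result equals each price divided by its first observation, which is why every line in the chart starts from the same level.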

2. Setting up parameters

Following the mathematical expression section, we can compute the mean returns and the covariance matrix as below:

import pandas as pd

closed_price = pd.read_excel('2015to2020_SET50_ClosedPrice.xlsx',
                             header=0, index_col=0, parse_dates=True)
daily_ret = closed_price.pct_change()
mean_returns = daily_ret.mean()  # mean vector
cov_matrix = daily_ret.cov()  # covariance matrix
risk_free_rate = 0.01  # assumed rf = 1% per annum
num_portfolios = 25000  # total simulated portfolios
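
Since mean_returns and cov_matrix are daily figures while risk_free_rate is annual, they need to be put on the same scale before they can be compared. The sketch below shows one common way to do this, assuming 252 trading days per year; the helper names annualised_performance and sharpe_ratio, and the randomly generated returns, are hypothetical and are not the functions used later in this article.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumption: trading days per year used for annualising

def annualised_performance(weights, mean_returns, cov_matrix):
    """Return (annualised volatility, annualised return) of a portfolio."""
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(mean_returns, dtype=float)
    cov = np.asarray(cov_matrix, dtype=float)
    ann_return = float(w @ mu) * TRADING_DAYS
    ann_vol = float(np.sqrt(w @ cov @ w)) * np.sqrt(TRADING_DAYS)
    return ann_vol, ann_return

def sharpe_ratio(weights, mean_returns, cov_matrix, risk_free_rate):
    """Annualised excess return per unit of annualised volatility."""
    ann_vol, ann_return = annualised_performance(weights, mean_returns, cov_matrix)
    return (ann_return - risk_free_rate) / ann_vol

# Synthetic daily returns standing in for real price data (illustration only).
rng = np.random.default_rng(0)
daily_ret = pd.DataFrame(rng.normal(0.0005, 0.01, size=(500, 3)),
                         columns=["A", "B", "C"])
mean_returns = daily_ret.mean()
cov_matrix = daily_ret.cov()

weights = np.array([0.4, 0.4, 0.2])
ann_vol, ann_return = annualised_performance(weights, mean_returns, cov_matrix)
ratio = sharpe_ratio(weights, mean_returns, cov_matrix, risk_free_rate=0.01)
```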

3. Calculating Efficient Frontier

For this part, I will rely on functions from Kim (2018), which can be found in the article and code linked below.

Generate the efficient frontier by running the function:

display_calculated_ef(mean_returns, cov_matrix, risk_free_rate)
SET50 efficient frontier
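
For readers who want to see what such a function does under the hood, the sketch below traces an efficient frontier by solving a minimum-variance problem for each target return. It is not Kim’s code: the function name min_variance_for_target, the use of scipy.optimize.minimize with SLSQP, and the three-security inputs are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def min_variance_for_target(mean_returns, cov_matrix, target_return):
    """Long-only weights with the lowest variance for a given target return."""
    n = len(mean_returns)
    constraints = (
        # portfolio return must hit the target
        {"type": "eq", "fun": lambda w: w @ mean_returns - target_return},
        # weights must sum to one
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
    )
    bounds = tuple((0.0, 1.0) for _ in range(n))  # no short selling
    return minimize(
        lambda w: w @ cov_matrix @ w,
        x0=np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=bounds,
        constraints=constraints,
        options={"ftol": 1e-10},
    )

# Hypothetical annual inputs for three securities (illustrative only).
mean_returns = np.array([0.10, 0.06, 0.04])
cov_matrix = np.array([
    [0.040, 0.006, 0.002],
    [0.006, 0.010, 0.001],
    [0.002, 0.001, 0.005],
])

# Solve once per target return; each solution is one point on the frontier.
target_returns = np.linspace(0.05, 0.09, 5)
solutions = [min_variance_for_target(mean_returns, cov_matrix, t)
             for t in target_returns]
frontier_vols = np.array([np.sqrt(s.fun) for s in solutions])
```

Plotting frontier_vols against target_returns gives the upward-sloping curve: above the minimum-variance portfolio, a higher target return can only be reached by accepting more volatility.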

Now, depict each stock’s volatility and return along with the efficient frontier.

display_ef_with_selected(mean_returns, cov_matrix, risk_free_rate)
SET50 efficient frontier with stocks’ volatility and returns

Investing in 50 securities at the same time may be impractical for individual investors. Let’s narrow the universe down to 10 securities, for instance, to see the shape of the efficient frontier more closely.

I will use these 10 selected securities to generate a new efficient frontier.

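daily_ret_adj = daily_ret[['AOT.BK', 'OSP.BK', 'BDMS.BK', 'BEM.BK', 'PTT.BK',
                           'CPALL.BK', 'EA.BK', 'GLOBAL.BK', 'GPSC.BK', 'IVL.BK']]
mean_returns = daily_ret_adj.mean()  # recompute inputs for the 10-stock subset
cov_matrix = daily_ret_adj.cov()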

With this new dataset, create a new efficient frontier with random portfolios by running this function.

display_calculated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate)
Efficient frontier with random portfolios
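
The random portfolios in the chart can be produced by drawing random weights, normalising them to sum to one and computing each portfolio’s annualised return, volatility and Sharpe ratio. The sketch below illustrates that idea only; the function simulate_random_portfolios, the 252-day annualisation and the three-security inputs are assumptions, not the function called above.

```python
import numpy as np

TRADING_DAYS = 252  # assumption: trading days per year used for annualising

def simulate_random_portfolios(num_portfolios, mean_returns, cov_matrix,
                               risk_free_rate, seed=0):
    """Draw random long-only portfolios and compute their annualised stats."""
    rng = np.random.default_rng(seed)
    n = len(mean_returns)
    weights = rng.random((num_portfolios, n))
    weights /= weights.sum(axis=1, keepdims=True)   # each row sums to 1
    ann_returns = weights @ mean_returns * TRADING_DAYS
    # w' Sigma w for every row of weights at once
    daily_var = np.einsum("ij,jk,ik->i", weights, cov_matrix, weights)
    ann_vols = np.sqrt(daily_var * TRADING_DAYS)
    sharpe = (ann_returns - risk_free_rate) / ann_vols
    return weights, ann_returns, ann_vols, sharpe

# Hypothetical daily inputs for three securities (illustrative only).
mean_returns = np.array([0.0006, 0.0004, 0.0002])
cov_matrix = np.array([
    [0.040, 0.006, 0.002],
    [0.006, 0.010, 0.001],
    [0.002, 0.001, 0.005],
]) / TRADING_DAYS

weights, ann_returns, ann_vols, sharpe = simulate_random_portfolios(
    5000, mean_returns, cov_matrix, risk_free_rate=0.01)

best = int(np.argmax(sharpe))      # simulated portfolio with the highest Sharpe ratio
safest = int(np.argmin(ann_vols))  # simulated portfolio with the lowest volatility
```

The simulated cloud only approximates the frontier: no random draw can beat the optimised portfolios, so the best simulated points sit on or just inside the curve.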

End notes

Markowitz’s (1952) Mean-Variance Optimization is easy and straightforward to implement. However, empirical studies have long questioned the assumptions on which it relies. For instance, the technique depends solely on historical data and is thus ‘backward-looking’. Black and Litterman (1992) introduced their model to overcome this shortcoming by mathematically incorporating ‘forward-looking’ components into the optimization problem.

Despite the ingenious solution offered by the Black-Litterman model, the optimization itself is still exposed to a number of model risks. As discussed in Chopra and Ziemba (1993), noise in parameter estimation can have a significant impact on the optimization. ‘Resampled Optimization’ by Michaud (2002) is one way to reduce model risk from estimation errors by deploying Monte Carlo simulations, and it also helps mitigate the ‘Corner Portfolio Problem’. Moreover, given advances in computational power, ‘Regularized Optimization’, a machine-learning technique, is another sound way to cope with this problem.

Impacts from parameter noise | Chopra and Ziemba (1993)

Another way to reduce the model risk arising from dimensionality is ‘Statistical Shrinkage’. Ledoit and Wolf (2003), for example, suggest a more robust estimate of the covariance matrix, obtained by shrinking the covariance parameters toward the mean correlation of the variables in the matrix.

Shrinkage Estimators for High-Dimensional Covariance Matrices | Williamson (2014)
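
The idea of shrinkage can be sketched in a few lines. The example below blends a sample covariance matrix with a constant-correlation target, in which every pair of securities is given the average sample correlation. This is an illustration under stated assumptions: the function names are hypothetical, the data are randomly generated, and the shrinkage intensity delta is a fixed value picked by hand, whereas Ledoit and Wolf derive an optimal intensity from the data, which this sketch does not attempt.

```python
import numpy as np

def constant_correlation_target(sample_cov):
    """Target matrix: sample variances kept, all correlations set to their mean."""
    std = np.sqrt(np.diag(sample_cov))
    corr = sample_cov / np.outer(std, std)
    n = sample_cov.shape[0]
    mean_corr = (corr.sum() - n) / (n * (n - 1))   # average off-diagonal correlation
    target = mean_corr * np.outer(std, std)
    np.fill_diagonal(target, std ** 2)             # keep the original variances
    return target

def shrink_covariance(sample_cov, delta):
    """Convex combination of the target and the sample covariance matrix."""
    target = constant_correlation_target(sample_cov)
    return delta * target + (1.0 - delta) * sample_cov

# Synthetic daily returns for four securities (illustration only).
rng = np.random.default_rng(1)
returns = rng.normal(0.0, 0.01, size=(60, 4))
sample_cov = np.cov(returns, rowvar=False)

# delta = 0.3 is a hypothetical intensity, not an estimated optimum.
shrunk_cov = shrink_covariance(sample_cov, delta=0.3)
```

With delta = 0 the sample matrix is returned unchanged, and with delta = 1 only the target remains; values in between pull extreme pairwise covariances toward the average while leaving the variances untouched.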

Lastly, the model may also suffer from the fact that volatility is time-varying. Studies such as Khanthavit (2001) have identified and confirmed the regime-switching behavior of volatility. This stylized fact holds in many markets around the globe. Therefore, the optimization should be reviewed frequently, or as soon as the market is detected to have shifted to another state.

References:

  • Markowitz, H. (1952). Portfolio Selection. The Journal of Finance, 7(1), 77–91.
  • Black, F., & Litterman, R. (1992). Global Portfolio Optimization. Financial Analysts Journal, 48(5).
  • Michaud, R., & Michaud, R. (2007). Estimation Error and Portfolio Optimization: A Resampling Solution. New Frontier Advisors, LLC.
  • PyPortfolioOpt. (n.d.). Retrieved February 01, 2021, from https://pyportfolioopt.readthedocs.io/en/latest/
  • Ledoit, O., & Wolf, M. (2003). Improved Estimation of the Covariance Matrix of Stock Returns With an Application to Portfolio Selection. Journal of Empirical Finance, 10(4), 603–621.
  • Khanthavit, A. (2001). A Markov-Switching Model for Mutual Fund Performance Evaluation. Manuscript, Thammasat University and Stock Exchange of Thailand.

Suebsak Rochanarat, FRM

Investment Analytics Analyst 🇹🇭. Certified Financial Risk Manager. Python Novice. Investment Management and Data Science.