“Proposal of creation of a portfolio with minimal risk”

The aim of this work is to propose a method for creating portfolios with minimal expected risk. The proposed method consists of two steps. In the first step, the authors use a method for finding a minimum spanning tree, a tool from graph theory, which is a field of discrete mathematics. A graph is defined as a set of vertices and edges. With this method the authors divide the assets, for example the constituents of a stock index, into several subgroups. From each group, the asset with the most incident edges is chosen. These selected assets are used to create a portfolio. In the second step, the authors minimize the standard deviation of the portfolio to calculate the weights of its assets. First, the weight of each asset is found so that the resulting portfolio has the lowest possible expected risk. Then the authors find the portfolio with the lowest possible expected risk at a required yield and create investment strategies. These strategies are compared over time and with each other on the basis of the coefficient of variation. The article can serve as a practical guide for an individual investor creating a minimal-risk portfolio, showing which assets of the selected index to purchase and in which weights.


INTRODUCTION
Yield and risk are two categories that are inseparable in financial markets. If an investor puts capital into only one underlying asset, the expected yield is proportional to the expected risk, measured, for example, by the standard deviation. If we divide the invested capital among several underlying assets, we speak of diversification, which results in a portfolio. The aim of diversification is to create a portfolio that has a high expected yield and a low level of risk. Obviously, not every diversification meets this objective.
For diversification to be effective, two questions must first be answered. The first concerns the choice of the underlying assets in the portfolio: can different sets of underlying assets achieve different rates of reduction of the expected portfolio risk?
The aim of the investor is to select a set of underlying assets such that the expected portfolio risk decreases significantly, that is, below the expected risk of investing in any single asset.
The second question, which arises once the first is answered, is how much should be invested in each selected underlying asset so that the resulting portfolio has the lowest expected risk. In other words, if we change the weights of the assets, will the resulting portfolio always have a lower expected risk?
Pioneering work in modern portfolio theory was done especially by Harry Markowitz (1952, 1959) and William Forsyth Sharpe (1992). These authors studied the effect of asset risk, return, correlation and diversification on expected investment portfolio returns. The asset selection process (the investment process) is in most cases carried out by one of two main approaches: technical analysis or fundamental analysis. Investors rarely use both approaches simultaneously. A popular technical analysis method is the Japanese candlestick method, which was used by Yeongija Goo, Darhsin Chen and Yiwei Chang (2007) for the analysis of the Taiwan Top 50 Tracker Fund and the Taiwan Mid-Cap 100 Tracker Fund. The methodology is used to predict future price trends based on the relationships among the opening, high, low and closing prices of the analyzed financial instruments. If we invest in a larger number of high-risk assets, we can use the reverse stock split method, which allows us to effectively merge these assets into a smaller number of proportionally more valuable shares. This method was used by Manuela Raisová, Martin Užik and Christian M. Hoffmeister (2016) for the analysis of assets of the V4 (Visegrad Group) countries. A quality theoretical introduction to quantifying return and risk for two-asset and three-asset portfolios is given by Michal Šoltés (2003, 2012). New approaches to portfolio selection using, for example, neural networks are introduced by Alberto Fernández and Sergio Gómez (2007) and by Ali S. Hadi, Azza A. El Naggar and Mona N. Abdel Bary (2016).
In this work we propose a two-step procedure for creating a portfolio with minimal expected risk. In the first step, we use the Minimum Spanning Tree method to select the most appropriate underlying assets for the portfolio; in the second step, we use the Generalized Reduced Gradient method to calculate their weights in the proposed portfolio.

DESCRIPTION OF THE DATABASE
We used the historical prices of the stocks comprising the Standard and Poor's 500 index (hereinafter referred to as the S&P 500). Shares included in the index represent about 70% of the total capitalization of the US stock market. Many experts consider the index to be the standard measure of the performance of the US stock market. It has been compiled since 1957; its creators have since determined its value retroactively back to 1926.
This index is the basis for the selection of shares for our portfolio. The analysis uses data from finance.yahoo.com, which are available from January 2, 1962. This sufficiently long time horizon allows us to compare the composition and performance of the created portfolios across different lengths of price history. Since the composition of the index is dynamic and its constituents are constantly changed and supplemented, we consider as relevant only shares with at least a five-year price history.
Considering that the average month has 20 trading days (after taking into account weekends and holidays, on which no trading takes place), we assume that the average year has about 240 business days. Depending on how long the price history of a particular company in the index is, we assign it to a group for analysis as shown in the following table. Clearly, the longer the required period, the fewer companies have a sufficient price history. Shares that have a 30-year price history also have, for example, a 5-year price history; the opposite is not necessarily true. It is therefore logical to observe the effect the time period has on the identification of shares for the portfolio we create. If the portfolios created for different price histories differed substantially in composition, it would mean that the selected time period has a significant impact on the method of choosing shares for our portfolio. We calculate the price history up to March 2016, when we launched the analysis.
For each share in the index we record the company name, the ticker, the sector in which the company operates and the following routine quantitative characteristics for each day of the reporting period:
• open price;
• highest price;
• lowest price;
• close price;
• trade volume;
• adjusted close price.
The S&P 500 index divides companies into 10 sectors. This distribution has also been used in the process of designing our portfolio:

STOCK SELECTION IN THE PORTFOLIO
After identifying and creating six groups according to the length of share price history (5, 10, 15, 20, 25 or 30 years), we calculate the classic daily return for each asset as:

$$r_n = \frac{P_n - P_{n-1}}{P_{n-1}}, \qquad (1)$$

where $P_n$ is the adjusted close price at time $n$, $P_{n-1}$ is the adjusted close price on the previous business day (at time $n-1$) and $r_n$ represents the daily return at time $n$.
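The daily return calculation can be illustrated with a minimal sketch. It assumes that the "classic" return is the simple (arithmetic) return on adjusted close prices; the function name and the price series are hypothetical and not taken from the analysis.

```python
def daily_returns(adjusted_close):
    """Simple daily returns r_n = (P_n - P_{n-1}) / P_{n-1}.

    Assumption: the 'classic' daily return is the arithmetic return.
    """
    return [
        (curr - prev) / prev
        for prev, curr in zip(adjusted_close, adjusted_close[1:])
    ]


# Hypothetical adjusted close prices for three consecutive business days.
prices = [100.0, 102.0, 99.96]
returns = daily_returns(prices)
```

A series of N prices yields N - 1 returns, so a 5-year history of roughly 1,200 business days gives about 1,199 daily returns per share.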
A correlation matrix has been calculated for each of the six groups from the daily yields of the stocks, using the standard Pearson correlation coefficient. The correlation matrix is a square symmetric matrix of size $n \times n$ ($n$ is the number of shares analyzed) with ones on the main diagonal, in which the value in row $i$ and column $j$ is the Pearson correlation coefficient of the daily yields of shares $i$ and $j$, given by:

$$\rho_{i,j} = \frac{\operatorname{cov}(r_i, r_j)}{\sigma_i \, \sigma_j}, \qquad (2)$$

where $r_i$ and $r_j$ are the daily yield series of shares $i$ and $j$ and $\sigma_i$, $\sigma_j$ are their standard deviations. The correlation matrices serve as the basis for calculating the distance matrices, which we obtain by a transformation following the methodology of Professor Mantegna (1999), according to the following equation:

$$d(i,j) = \sqrt{2\,(1 - \rho_{i,j})}, \qquad (3)$$

where $d(i,j)$ represents the distance between shares $i$ and $j$ in the index and $\rho_{i,j}$ is the correlation coefficient between these shares.
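As a hedged illustration of the step from daily yields to distances, the sketch below computes the Pearson coefficient and applies the Mantegna (1999) transformation d = sqrt(2(1 - rho)). The tickers "A", "B", "C" and their return series are hypothetical; the original analysis was carried out in R, so this pure-Python version is only an assumed equivalent.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(returns):
    """Map {ticker: daily returns} to {(ticker_i, ticker_j): distance}."""
    tickers = sorted(returns)
    dist = {}
    for i in tickers:
        for j in tickers:
            rho = 1.0 if i == j else pearson(returns[i], returns[j])
            # d = sqrt(2 * (1 - rho)); max() guards against tiny negative
            # values caused by floating-point rounding when rho is near 1.
            dist[i, j] = math.sqrt(max(0.0, 2.0 * (1.0 - rho)))
    return dist


# Hypothetical return series: B moves with A, C moves against A.
returns = {
    "A": [0.01, -0.02, 0.03],
    "B": [0.02, -0.04, 0.06],
    "C": [-0.01, 0.02, -0.03],
}
dist = distance_matrix(returns)
```

Perfectly correlated shares end up at distance 0 and perfectly anti-correlated shares at distance 2, so the distance always lies in the interval [0, 2].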
From the distance matrix, using tools of discrete mathematics, we obtain new mathematical objects: complete graphs, given by vertices and edges. The vertices represent the shares forming the index for the given length of price history. The edges between the vertices are weighted by the distances between the shares taken from the distance matrix. The formula that converts the correlation matrix into the distance matrix implies an inverse relationship between these two quantities: as the correlation of two shares grows, their mutual distance decreases, and conversely, as the correlation falls, their mutual distance increases. The transformation is carried out so that the distance takes non-negative values only, a criterion the correlation coefficient itself does not meet.
Applying the minimum spanning tree method, we obtain a minimum spanning tree for each of the six complete graphs. The method finds a subgraph of the original graph that is connected (there is a path between every pair of vertices), contains no cycles and has the minimal total edge weight. The minimum spanning tree represents a structure in which the shares that are closest to each other (that have the greatest mutual correlation) are linked. These minimum spanning trees and certain graph characteristics are the basis for the selection of shares for our portfolio.
Thus, for each price history we obtain one minimum spanning tree that represents the mutual links between shares in this history. At this stage of the analysis, we need to choose a certain number of shares from this structure systematically, so that we can create a portfolio with minimal risk. We select one share from each sector in order to diversify risk. Our portfolio therefore consists of ten shares, a number we consider appropriate for an individual investor. The following figure shows the calculated minimum spanning trees, with the individual sectors distinguished by color.
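A minimal sketch of how a minimum spanning tree can be extracted from weighted edges is given below. It uses Kruskal's algorithm, which is an assumption: the text does not name the algorithm, and the original analysis relied on the igraph library in R. The vertices and edge weights are hypothetical.

```python
def minimum_spanning_tree(vertices, weighted_edges):
    """Kruskal's algorithm; weighted_edges holds (weight, u, v) tuples."""
    parent = {v: v for v in vertices}

    def find(v):
        # Find the root of v's component, shortening the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components: no cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical complete graph on four shares; weights are distances.
vertices = ["A", "B", "C", "D"]
edges = [
    (0.5, "A", "B"), (0.9, "A", "C"), (0.7, "B", "C"),
    (1.2, "A", "D"), (0.8, "C", "D"), (1.1, "B", "D"),
]
mst = minimum_spanning_tree(vertices, edges)
```

For n vertices the tree always has n - 1 edges; here the three shortest edges that do not close a cycle (A-B, B-C, C-D) are kept.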
In this part, we show only partial results of our analysis (not all the price histories, only the 5-year and 30-year ones). Figure 1 compares the 5-year and 30-year price histories and graphically illustrates the use of the method. We can also see that the method is very useful in identifying market structures, since the shares of companies from the same sector are located close together. Our results confirm the findings of Professor Mantegna, who applied this method when analyzing the structure of the Dow Jones Industrial Average index. The graph on the left is significantly denser than the one on the right because of the different price history: as stated in Table 1, the 5-year history contains 478 assets compared with 197 assets in the 30-year history.

[Figure 1: minimum spanning trees. Left panel: 5 Year Return. Right panel: 30 Year Return.]

The most important part of the analysis is to identify the shares that we select for the portfolio as representatives of the particular sectors. Two graph characteristics serve as decision criteria: the degree of a vertex and the eccentricity of a vertex.
The degree of a vertex is the number of edges incident to the vertex. The higher the degree of a vertex, the more central it can be considered; it can represent the vertices in its neighborhood and describe the graph itself. A vertex with degree one is located at the periphery of the graph. A graph in which a large number of vertices have degree two has a shape called, in the terminology of graph theory, a path. A graph with one high-degree vertex whose other vertices all have degree one is called a star.
The eccentricity of a vertex is the distance from this vertex to the vertex most distant from it. The lower the eccentricity of a vertex, the more central it can be considered. The vertex with the lowest eccentricity is called the centre of the graph. The minimum of the eccentricities is the radius of the graph and the maximum of the eccentricities is the diameter of the graph. It follows that graphs in the shape of a path have a high diameter, whereas stars have a very low radius and diameter.
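The two characteristics can be sketched on a small path and a small star. The sketch assumes that distance is counted in edges (hops) rather than in edge weights; the text does not specify which was used. The graphs and vertex names are hypothetical.

```python
from collections import deque


def degrees(adj):
    """Degree of each vertex in an adjacency-list graph."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def eccentricity(adj, source):
    """Largest hop distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return max(dist.values())


# Hypothetical path A-B-C-D and star with hub H.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}

path_ecc = {v: eccentricity(path, v) for v in path}
star_ecc = {v: eccentricity(star, v) for v in star}

# Radius is the smallest eccentricity, diameter the largest.
path_radius, path_diameter = min(path_ecc.values()), max(path_ecc.values())
star_radius, star_diameter = min(star_ecc.values()), max(star_ecc.values())
```

With four vertices each, the path has radius 2 and diameter 3, while the star has radius 1 and diameter 2; the gap widens as the graphs grow.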
When compiling the portfolio, we try to ensure that the shares we choose represent the sector in which they are located in the best and most accurate way. In each sector, therefore, we look for a central vertex: a share that best represents the sector as a whole. Since every share corresponds to a specific vertex, we know the degree of the vertex, its eccentricity and the sector of the company the share represents. For each sector, we therefore look for the share with the highest vertex degree. If several shares in a sector have the same highest degree, the deciding criterion is the lowest eccentricity. In this way, we clearly identify the representative of each sector for each price history and create a portfolio comprising ten shares.
The results of this part of the analysis are presented in the following table (see Table 2), which indicates that despite the change of the time interval the method identifies the sectors' representatives relatively reliably, since many companies occur in the table several times. This points to the suitability of an identification method that is not very sensitive to the time period selected for the analysis; the investor does not need a large amount of historical data to compile a portfolio.
The aim of the following part is to set the shares (weights) of the stocks in the portfolio so that risk is minimized.
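The selection rule can be sketched as follows. The record layout, the tickers and the degree and eccentricity values are hypothetical. Keeping the first share encountered when both criteria tie is an additional assumption, since the text does not say how such a case is resolved.

```python
def pick_representatives(stocks):
    """Pick one ticker per sector: highest degree, then lowest eccentricity.

    stocks: list of dicts with keys ticker, sector, degree, eccentricity.
    """
    best = {}
    for stock in stocks:
        # Smaller key is better: negated degree first, eccentricity second.
        key = (-stock["degree"], stock["eccentricity"])
        current = best.get(stock["sector"])
        if current is None or key < current[0]:
            best[stock["sector"]] = (key, stock["ticker"])
    return {sector: ticker for sector, (key, ticker) in best.items()}


# Hypothetical vertices of a minimum spanning tree.
stocks = [
    {"ticker": "AAA", "sector": "Energy", "degree": 5, "eccentricity": 9},
    {"ticker": "BBB", "sector": "Energy", "degree": 5, "eccentricity": 7},
    {"ticker": "CCC", "sector": "Energy", "degree": 2, "eccentricity": 4},
    {"ticker": "DDD", "sector": "Utilities", "degree": 3, "eccentricity": 8},
    {"ticker": "EEE", "sector": "Utilities", "degree": 1, "eccentricity": 5},
]
representatives = pick_representatives(stocks)
```

In the Energy group, AAA and BBB share the highest degree, so the lower eccentricity decides in favor of BBB; CCC has the lowest eccentricity overall but is never considered because its degree is lower.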

MINIMIZING PORTFOLIO RISK
After identifying the shares of each portfolio, we optimize the portfolios on the basis of historical prices and the calculated returns: we look for the weights of the shares in the portfolio that minimize the risk to which the investor is exposed. The risk in this case is measured by the standard deviation of expected returns. The higher the standard deviation, the higher the portfolio risk for the investor. The standard deviation of the portfolio is calculated as the square root of the product of the row vector of the shares of individual stocks (weights), the covariance matrix of the returns of these stocks and the column vector of these weights:

$$\sigma = \sqrt{w \cdot \Sigma \cdot w^{T}}, \qquad (4)$$

where $w$ represents the row weight vector, $\Sigma$ the covariance matrix of returns and $w^{T}$ the transposed row weight vector (column weight vector).
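A worked instance of equation (4) for two assets may help. The covariance matrix and the weights below are hypothetical values chosen for easy arithmetic.

```python
import math


def portfolio_std(weights, cov):
    """Standard deviation sqrt(w * Sigma * w^T) of a portfolio."""
    n = len(weights)
    variance = sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


# Hypothetical covariance matrix: variances 0.04 and 0.09, covariance 0.01.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
weights = [0.5, 0.5]
sigma = portfolio_std(weights, cov)
```

The variance here is 0.25 * 0.04 + 2 * 0.25 * 0.01 + 0.25 * 0.09 = 0.0375, so the portfolio standard deviation (about 0.194) is below that of either asset held alone (0.2 and 0.3).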
The aim is to minimize this value for the portfolio of each price history. The optimization conditions are as follows:
• Short positions are not allowed (the weight cannot be negative). In other words, we offer guidance for an individual investor who creates a portfolio by purchasing shares.
• The portfolio can consist of any combination of the 10 sector representatives identified above, so it can contain from 1 to 10 shares (a weight may also be zero).
• The size of the invested funds is given by the budget constraint of the investor (the sum of the weights is equal to one).
• Financial instruments other than the previously identified shares are not included in the portfolio; the risk-free rate is also excluded.
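The conditions listed above can be summarized as one constrained optimization problem. This is a restatement in the notation of equation (4), with $w_i$ denoting the weight of the $i$-th sector representative; the index notation is introduced here only for illustration.

```latex
\begin{aligned}
\min_{w}\quad & \sigma(w) = \sqrt{w \, \Sigma \, w^{T}} \\
\text{subject to}\quad & \sum_{i=1}^{10} w_i = 1, \\
& w_i \ge 0, \qquad i = 1, \dots, 10 .
\end{aligned}
```

Because the square root is increasing, minimizing $\sigma(w)$ gives the same weights as minimizing the variance $w \, \Sigma \, w^{T}$.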
The covariance matrix is computed for each portfolio, and the weights of shares that minimize the standard deviation of the portfolio are determined by the Generalized Reduced Gradient (GRG) method. Meeting this objective gives the combination of weights of the individual portfolio components that has the least possible expected risk, meaning that with any other combination of weights the expected risk is higher. However, in addition to risk, the investor also takes into account the expected rate of return; the annual return of the investment, or of a specific share, appears to be the best indicator for the purpose of comparison. Therefore, the next part of the analysis deals with average annual returns.
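For readers without access to a GRG solver, the minimum-risk weights can be approximated with a much simpler technique. The sketch below uses projected gradient descent on the portfolio variance, which is explicitly not the GRG method used in the analysis; it only solves the same long-only, fully invested problem. The function names, step-size rule and covariance matrix are assumptions made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection onto {w : w_i >= 0, sum(w) = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(sorted(v, reverse=True), start=1):
        cumulative += value
        shift = (cumulative - 1.0) / k
        if value - shift > 0:
            theta = shift  # last k that passes the test gives the shift
    return [max(x - theta, 0.0) for x in v]


def min_variance_weights(cov, steps=2000):
    """Projected gradient descent on w * Sigma * w^T (not GRG)."""
    n = len(cov)
    w = [1.0 / n] * n  # start from equal weights
    # Step 1 / (2 * max absolute row sum) never exceeds 1 / L, where
    # L = 2 * (largest eigenvalue of Sigma) bounds the gradient's growth.
    step = 1.0 / (2.0 * max(sum(abs(c) for c in row) for row in cov))
    for _ in range(steps):
        grad = [2.0 * sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w


# Hypothetical uncorrelated assets with variances 0.04 and 0.01.
cov = [[0.04, 0.0],
       [0.0, 0.01]]
weights = min_variance_weights(cov)
```

For uncorrelated assets the minimum-variance weights are proportional to the inverse variances, so this example should converge to 20% in the first asset and 80% in the second.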
For each of the identified shares, the average annual rate of return is calculated by the formula:
Based on the average annual returns of the shares in the portfolio, it is clear that the return of the entire portfolio lies in the range between the lowest and the highest return among the shares of which it is composed. However, if the investor invested all funds in only one asset, the risk of the investment would be very high. Weighting the annual returns of the shares in each portfolio by the weights that minimize its risk gives, for each portfolio, the minimum return that an investor would require (the yield achieved with minimal risk).
On the basis of the relationship between risk and the individual yields of the shares in the portfolios, four basic investor strategies are formed. The first is the strategy of minimal risk described above. The return of this strategy is the minimum yield required by an investor. If the return of the minimal-risk portfolio is deducted from the yield of the share with the highest return in the portfolio, we get the range of returns available to the strategies. If this range is divided appropriately, strategies with gradually increasing required returns can be drawn up. We again solve the optimization problem, now with the additional condition that the return of the portfolio equals a specific value. As before, we minimize the standard deviation of the portfolio under the conditions mentioned above.
Thus, for each of the six portfolios we get four strategies appropriate for investors with varying degrees of risk tolerance. The first strategy is the already mentioned minimal risk strategy. The second is called a conservative strategy, the third a balanced strategy and the last an aggressive strategy. Each additional strategy provides a greater expected return and, of course, a higher risk. For each strategy, we calculate the weights of the individual financial instruments comprising the portfolio and the expected yield and risk of the strategy.
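The optimization problem with the additional return condition can be written out as follows. The symbols $\mu$ (row vector of the average annual returns of the ten shares) and $r^{*}$ (the required portfolio return) are assumed notation introduced here; the remaining symbols follow equation (4).

```latex
\begin{aligned}
\min_{w}\quad & \sqrt{w \, \Sigma \, w^{T}} \\
\text{subject to}\quad & w \, \mu^{T} = r^{*}, \\
& \sum_{i=1}^{10} w_i = 1, \qquad w_i \ge 0, \quad i = 1, \dots, 10 .
\end{aligned}
```

Since the weights are non-negative and sum to one, $w \, \mu^{T}$ is a weighted average of the individual returns, which is why a feasible $r^{*}$ must lie between the lowest and the highest share return.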
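How the required returns of the four strategies might be derived from the range of returns can be sketched as below. The text does not state how the range is divided, so the fractions 0.25, 0.5 and 0.75 are purely an assumption, as are the function name and the input values.

```python
def strategy_targets(min_risk_return, max_stock_return,
                     fractions=(0.25, 0.5, 0.75)):
    """Required return per strategy.

    Assumption: the range between the minimal-risk return and the highest
    single-stock return is split at the given (hypothetical) fractions.
    """
    span = max_stock_return - min_risk_return
    targets = {"minimal risk": min_risk_return}
    names = ("conservative", "balanced", "aggressive")
    for name, fraction in zip(names, fractions):
        targets[name] = min_risk_return + fraction * span
    return targets


# Hypothetical inputs: 8% return at minimal risk, 20% best single stock.
targets = strategy_targets(0.08, 0.20)
```

Each target would then serve as the required return in the constrained minimization, giving one set of weights per strategy.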
Data on the prices of individual stocks come from the freely available database finance.yahoo.com. The analysis was performed in the statistical programming language R. The igraph library was used for working with graphs. The optimization problems were solved with the Solver add-in of MS Office Excel, using the nonlinear Generalized Reduced Gradient algorithm.
In this part we again present results only for the 5-year and 30-year price histories. The results are presented in Tables 3 and 4. In Table 5, the riskiness of the individual portfolios is distinguished by colors (light shades: the lowest risk level; dark shades: the highest risk level).

CONCLUSION
The presented method provides practical guidance for the identification and selection of shares for an investor's portfolio. Using this method, we look for portfolios that offer the lowest risk under predefined performance conditions. A positive relationship between the portfolio's return and risk was confirmed. Table 5 shows, however, that the least risky strategy, measured by the coefficient of variation, is the balanced one.
In five portfolios it has the lowest coefficient of variation (only in the 25-year history is the conservative strategy less risky). Paradoxically, the portfolio created by the minimal risk strategy is not the portfolio with minimal risk in any of the cases. On the contrary, the minimal risk strategy is the riskiest, because portfolios created by this strategy show the highest coefficient of variation in all periods. The lowest risk overall is shown by the balanced portfolio in the five-year history, and the riskiest portfolio overall is the minimal risk one in the ten-year history.
The main benefit of the methodology used in this article is that it combines a well-known approach to portfolio optimization with graph theory, which is not often used in the portfolio creation process.
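The apparent paradox follows from how the coefficient of variation is defined: it divides risk by return, so a portfolio with the lowest standard deviation can still carry the most risk per unit of return. The sketch below assumes the usual definition (standard deviation divided by expected return); the numbers are hypothetical and are not taken from Table 5.

```python
def coefficient_of_variation(std_dev, expected_return):
    """Risk per unit of expected return (assumed standard definition)."""
    return std_dev / expected_return


# Hypothetical strategies as (standard deviation, expected annual return).
strategies = {
    "minimal risk": (0.10, 0.05),
    "balanced": (0.15, 0.12),
}
cv = {
    name: coefficient_of_variation(std_dev, ret)
    for name, (std_dev, ret) in strategies.items()
}
least_risky = min(cv, key=cv.get)
```

In this illustration the minimal risk strategy has the lower standard deviation (0.10 against 0.15) but the higher coefficient of variation (2.0 against 1.25), which mirrors the pattern described above.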