Corporate rating forecasting using Artificial Intelligence statistical techniques

ARTICLE INFO: Daniel Caridad, Jana Hančlová, Hosn el Woujoud Bousselmi and Lorena Caridad y López del Río (2019). Corporate rating forecasting using Artificial Intelligence statistical techniques. Investment Management and Financial Innovations, 16(2), 295-312. DOI: http://dx.doi.org/10.21511/imfi.16(2).2019.25. Received on Tuesday, 22 January 2019; accepted on Tuesday, 21 May 2019; released on Monday, 24 June 2019.


INTRODUCTION AND STATISTICAL METHODS IN RATING FORECASTING
Credit Rating Agencies (CRAs) provide ordinal assessments of the ability of companies, governments, institutions or financial assets to meet debt obligations on time. These ratings are issued by CRAs as 'objective' information about the financial health of their customers (although, in some cases, the CRAs provide ratings for third parties), bond issues, companies, institutions, and other agents or financial products. This information is based on two components: the first is estimated from financial and economic sources, usually public, and the second from so-called 'qualitative' data, which is part of the proprietary know-how of the agencies. The methodology is somewhat fuzzy, as the CRAs do not publish it in full. Once ratings are published, different agents, such as investors, banks, companies or public bodies, use them to assess the financial situation of the firm, the institution, or the asset issue; they are thus used as a measure of credit risk.
Most ratings are produced at the request of companies or institutions. For this purpose, an application is submitted to a CRA, and a contract is established between the parties. Costs are borne by the company, which can raise clear conflicts of interest. Some issues, or issuers, are not rated by CRAs as a matter of policy; in other cases, there are not enough data available to estimate a rating, or the issue is privately placed, or a rating is withdrawn because of new circumstances, as when a bond is called for redemption. Ratings evolve over time as the position of issuers and their obligations change, and CRAs always warn about the limitations of their judgements. Their ratings are qualified as 'opinions', and they declare that their purpose is not to guide investors in their decisions. Moreover, because the classification is discrete and ordinal, two issues with the same rating do not necessarily carry the same level of risk.
But how could an independent investor or institution evaluate the financial health of a company or a new issue? How can this be done without incurring the cost of a contract with a CRA? Some statistical methods have been used with this aim, and, as Artificial Intelligence (AI) tools emerge and become widespread, it is possible to model the rating of a company using public data. The Financial Stability Board (2017) identifies a number of potential benefits and risks for financial stability in the coming years, as more data become available.
If the rating process is associated with such uncertainty and red tape, the question arises whether companies or third parties can obtain these ratings with a less cumbersome and expensive approach, even for non-rated risks or issues. It will be shown that this is possible using public information available to investors and financial institutions. The answer to this question lies in using multivariate statistical methods and AI models to estimate company ratings from the available financial and economic data.
Most of the literature on financial rating forecasting focuses on bond rating prediction rather than on company ratings. Fitch (2018) describes the methodology and variables used. Some authors, such as Bissoondoyal-Bheenick and Treepongkaruna (2011), have modelled bank ratings. Gogas et al. (2013) introduced a rating forecasting model for these entities using numerical methods. Ptak-Chmielewska (2016) and Novotna (2012) compare several statistical methods; Altman (2010) develops a new metric to rate firms and to forecast their probability of default. Gangolf et al. (2016) provide a complete comparison of quantitative methods in credit rating forecasting with different statistical and AI techniques. Karminsky and Khromova (2016) forecast bank ratings using ordered probit models, with a sample from the Bankscope database. Khemakhem and Boujelbène (2015) forecast credit ratings in Tunisia. Mayer et al. (2017) examine the validation of ratings as computed by the CRAs and how they are influenced by the business cycle. Metz (2006) describes some alternative approaches in rating forecasting. Kumar and Haynes (2003) use artificial neural networks (ANN) and discriminant analysis for forecasting credit ratings.
In bond rating prediction, several statistical methods have provided models with medium or high accuracy, especially those using Artificial Intelligence with embedded learning capabilities. These models are oriented to forecasting bond ratings. Other techniques, such as Support Vector Machines (SVM), have also been used (Kaplan & Urwitz, 1979). Saha and Waheed (2017) use ANN to estimate bond ratings. These techniques can also be applied to company rating prediction, with advantages over classical statistical methods (Devasena, 2014) such as regression models, multivariate classification, or decision trees. Kim and Ahn (2012) use a modeling procedure called 'vector machines', also used by Rovira et al. (2005). Dima and Vasilache (2016) use a large sample of companies to estimate their ratings at different levels.
Back propagation Artificial Neural Networks (ANN) have been applied in forecasting ratings, usually of bond issues, although here they are used to rate companies. ANNs are fully connected, layered, feed-forward non-linear models, based on a set of causal variables used to estimate one or several dependent variables. In this case, the exogenous variables include financial data obtained from public sources for a randomly chosen set of companies rated by S&P and Moody's. The causal variables constitute the ANN input layer, from which the activations flow through one or several hidden layers and finally reach the output layer, which gives the rating of each company. The back propagation procedure starts with a random set of weights or parameters, which are modified for each additional case: the output is compared with the real rating, and the backward phase adjusts the parameters to reduce the errors, usually by a gradient descent method. This recursive procedure is repeated until the parameter estimates stabilize.
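The back propagation procedure described above can be illustrated with a minimal sketch. It is not the network estimated in this paper: the toy data, the function name train_mlp, the number of hidden neurons, the learning rate and the number of epochs are all assumptions chosen for illustration. It only shows the forward phase, the comparison with the observed value, and the case-by-case gradient descent update of the weights.

```python
import math
import random


def train_mlp(samples, n_hidden=3, lr=0.05, epochs=200, seed=0):
    """Online back propagation for a one-hidden-layer MLP with one linear output.

    samples: list of (inputs, target) pairs. All settings are illustrative.
    Returns the hidden weights, the output weights and the mean squared
    error recorded at each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Random initial weights; the last entry of each row is the bias term.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, y in samples:
            # Forward phase: tanh hidden activations, then a linear output.
            h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
                 for row in w_hid]
            out = sum(w * v for w, v in zip(w_out, h)) + w_out[-1]
            err = out - y
            sq_err += err * err
            # Backward phase: gradient descent on 0.5 * err**2 for this case.
            for j in range(n_hidden):
                delta_h = err * w_out[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    w_hid[j][i] -= lr * delta_h * x[i]
                w_hid[j][-1] -= lr * delta_h
                w_out[j] -= lr * err * h[j]
            w_out[-1] -= lr * err
        history.append(sq_err / len(samples))
    return w_hid, w_out, history


# Hypothetical toy data: two inputs and a smooth target (not the paper's data).
data_rng = random.Random(1)
toy = []
for _ in range(40):
    x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    toy.append((x, 0.5 * x[0] - 0.3 * x[1]))

_, _, history = train_mlp(toy)
print("first-epoch MSE:", history[0], "last-epoch MSE:", history[-1])
```

In this sketch the weights are updated after every case, which mirrors the description of parameters being modified for each additional case; batch updates would be an equally valid design.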
In the following pages, a set of publicly available variables is proposed as explanatory of the ratings levels, with a short introduction to companies' long-term ratings provided by two main CRAs (S&P and Moody's). Then, two random samples of more than one thousand companies are obtained using Bloomberg's databases. In addition, fifteen economic and financial variables are measured for each of them during a five-year period. With this information, several rating forecasting models are proposed, both cross-section and dynamic, and their accuracy is assessed with the main explanatory factors.

CREDIT RATING FACTORS
The specification of econometric models to forecast the credit rating of a company involves using exogenous financial variables that produce a causal effect on their creditworthiness, and on their capacity to fulfill their future obligations. Public data are available for most of the companies that are traded on financial markets. In Bloomberg's database, this information and the ratings obtained from the main CRAs are available.
The variables used in the proposed models are of different types: a) related to company size: total sales revenues, short- and long-term debt, net income, total assets and non-current assets; b) related to its economic activity: net profits, inventories, cash and short-term investments; c) related to financial aspects: EBITDA, EBIT, ROA, ROE, EPS, CFO, FFO, net interest expenses, total equity, cash flow from ordinary operations, total capital expenditures, returns to shareholders, financial expenses, financial expenses due to sales, operating expenses; d) market variables: current market capitalization, five-year Credit Default Swaps; e) leverage data of different relative magnitudes, such as cash flow over total debt or over total assets, short- and long-term debt ratios, and so on; liquidity ratios, sales evolution and stock market capitalization.
Several authors, including Jayadev (2006), propose using financial ratios as exogenous causal variables. The most widely used variable for forecasting the rating of a company is its size, measured by total equity (Horrigan, 1966; Kaplan & Urwitz, 1979; Pinches & Mingo, 1973; Maher & Sen, 1997; Huang et al., 2004), by its assets (Pinches & Mingo, 1973) and total sales (Surkan & Singleton, 1990), or by its capital (Horrigan, 1966; Kaplan & Urwitz, 1979; Maher & Sen, 1997; Huang et al., 2004) and the level of corporate debt (Chaveesuk et al., 1999; Huang et al., 2004). The size of a company is usually linked to its ability to cope with financial crises; that is, a larger company should be better able to cope with business cycles and, therefore, its credit rating should be higher. Evidence supporting this belief is found in Huang et al. (2004), who measured the contribution of several variables to rating prediction and concluded that the variables with the greatest predictive power for their US data sample were two related to size (total assets and total liabilities) and one financial ratio (total long-term debt over total capital paid). Activity variables regularly refer to company sales, either as a ratio or as a growth rate (Horrigan, 1966; Dutta & Shekhar, 1988; Surkan & Singleton, 1990). Most activity variables try to capture the speed of operations, such as the burden of interest paid relative to total expenses or total sales (Kaplan & Urwitz, 1979; Shin et al., 2005). The relationship between the rating and the activity level is important, since these ratios indicate the pace of the company's activities, i.e. whether its active projects contribute to the adequate payment of its commitments.
The variables related to the financing of companies usually refer to the ratio between debt (short- or long-term) and total assets (Shin et al., 2005; Dutta & Shekhar, 1988; Chaveesuk et al., 1999) or between debt and equity (Kaplan & Urwitz, 1979; Huang et al., 2004). It is also common to use liquidity ratios built from current assets and/or current liabilities. In this type of ratio, quotients or differences between the two current components are used, or either is expressed as a proportion of capital or assets (Chaveesuk et al., 1999; Dutta & Shekhar, 1988). Other approaches use genetic algorithms to determine the ratings (Lee & Lin, 2014), or 'double ensemble' methods (Kwon et al., 2013). Bongaerts (2014) discusses the role of rating agencies and some alternatives for risk assessment. The purpose of liquidity ratios is to capture whether the company's financial situation is adequate to meet the immediate payment of its obligations, and how its funding is structured.
Some other ratios used to predict ratings measure the company's profitability over a time interval. These ratios estimate efficiency through items from the same income statement (Horrigan, 1966; Pinches & Mingo, 1973; Dutta & Shekhar, 1988; Huang et al., 2004), or in relation to the amounts invested in the company (Surkan & Singleton, 1990; Kaplan & Urwitz, 1979). Irmatova (2017) uses an alternative non-parametric method for forecasting country ratings. The volatility of a company's stock price is another element that has been included in this type of approach (Kaplan & Urwitz, 1979; Maher & Sen, 1997). Its objective is to measure the degree of uncertainty that the market perceives and that materializes in the market price of the company's shares. However, these types of variables may also capture effects other than credit quality, related to capital market fluctuations.

COMPANIES' RATINGS
Moody's and S&P define an ordered classification of ratings linked to an estimated probability of default for each company on its debt obligations. These classifications are quite similar and are defined by a set of letters and numbers. A summary of these ratings is presented in Table 1 with the estimated probabilities of default.
In fact, S&P and Fitch use ten categories for prime ratings (from AAA to BBB) with their equivalents in Moody's (from Aaa to Baa); S&P uses 13 categories for non-prime ratings (from BB to D), while Moody's uses 11 ratings (from Ba to C and an additional -or Ca, equivalent to S&P's D); Fitch provides 11 ratings. All these are considered long-term measures. The agencies also provide a classification for short-term evaluations. Besides these levels, a plus or minus notch can be attached to the rating as a warning of a possible change. Each rating agency uses its own procedures to calculate the ratings of companies and institutions. It uses financial data (generally available to investors) and so-called 'qualitative data', which take into account business strategies and changes in the company's industry or sector.
In this study, only past yearly financial information is used, that is, the 'quantitative data' that are publicly available for each company. With these data plus the ratings calculated by S&P and Moody's, some models are proposed to forecast the actual ratings, and, as it will be shown, the re- From the measured variables, several well-known ratios are proposed. Subsequently, both types of explanatory variables will be used in the specification of models, as well as some qualitative factors, such as the sector to which the company belongs. Given the number of variables that can be used to replicate a company's rating, the following financial variables have been selected, measured in millions of euros: X 1 = sales revenues, X 2 = EBITDA, X 3 = EBIT, X 4 = interest expenses, X 5 = net income, X 6 = balance sheet total assets, X 7 = non-current assets, X 8 = inventories, X 9 = cash and short-term investments, X 10 = short- and long-term debt, X 11 = total equity, X 12 = cash flow from operations, X 13 = CAPEX, X 14 = dividends paid and X 15 = current market capitalization. Some of these variables are used by rating agencies, such as the size of each company; at the higher rating levels, there are no medium- or low-size companies.
From these variables, some usual economic-financial ratios can be obtained: R 1 = EBITDA margin over sales = X 2 / X 1, R 2 = EBIT margin = X 3 / X 1, R 3 = net profit margin = X 5 / X 1, R 4 = interests paid coverage = X 2 / X 4, R 5 = DFT/(DFT+EQUITY) = X 10 / (X 10 + X 11), R 6 = DFN/EBITDA = (X 10 - X 9) / X 2, R 7 = financial autonomy = X 11 / X 7, R 8 = X 10 / X 12 and R 9 = FCF = X 12 + X 13 + X 14. In some cases, these ratios are used to forecast ratings, but the original variables have been found to be more useful for this purpose. Three additional variables are also available, associated with the growth of sales, profits and market capitalization over the last five years.
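As an illustration of how the ratios R 1 to R 9 follow from the base variables, the sketch below computes them for one hypothetical company. The figures are invented for the example and are not taken from either sample. Two assumptions are made and should be checked against the data source: R 3 is computed as net income over sales (X 5 / X 1), which is what 'net profit margin' denotes, and CAPEX and dividends are stored as negative outflows, so that the sum X 12 + X 13 + X 14 yields free cash flow.

```python
def financial_ratios(x):
    """Compute R1..R9 from a dict mapping 1..15 to X1..X15 (millions of euros).

    Assumptions of this sketch: R3 uses net income (X5), and X13 (CAPEX)
    and X14 (dividends) are recorded with a negative sign.
    """
    def div(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "R1": div(x[2], x[1]),             # EBITDA margin over sales
        "R2": div(x[3], x[1]),             # EBIT margin
        "R3": div(x[5], x[1]),             # net profit margin (assumed X5/X1)
        "R4": div(x[2], x[4]),             # interest coverage
        "R5": div(x[10], x[10] + x[11]),   # debt over debt plus equity
        "R6": div(x[10] - x[9], x[2]),     # net debt over EBITDA
        "R7": div(x[11], x[7]),            # financial autonomy
        "R8": div(x[10], x[12]),           # debt over operating cash flow
        "R9": x[12] + x[13] + x[14],       # free cash flow
    }


# Hypothetical company, values in millions of euros.
company = {1: 1000.0, 2: 200.0, 3: 150.0, 4: 20.0, 5: 90.0,
           6: 2000.0, 7: 1200.0, 8: 100.0, 9: 50.0, 10: 600.0,
           11: 900.0, 12: 180.0, 13: -70.0, 14: -30.0, 15: 1500.0}

ratios = financial_ratios(company)
for name, value in ratios.items():
    print(name, round(value, 4))
```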
The variables to be predicted are the companies' ratings produced by the CRAs considered: Y 1 = S&P rating and Y 2 = Moody's rating. These are ordinal variables, although rating agencies estimate them numerically and then categorize them by defining an interval partition. In their final score, in addition to the economic-financial data of each company, they use other types of data, including opinions or rumors, and, above all, the previous rating level, which has given rise to unforeseen and sometimes worrying situations.
Neural networks have been used in numerous approaches in the last decade, with precedents in Spain, to forecast bond ratings, although most of these are linked to Asian institutions and to journals related to Artificial Intelligence (Zhao et al., 2015;Tsai et al., 2008). These techniques are used in this paper to estimate the ratings of companies from different sectors.

DATA SAMPLES
Two random samples are used, obtained from the Bloomberg financial database, including firms from different sectors. The first includes n = 1,324 companies for which the previous variables (X), corresponding to fiscal year 2014, have been observed. The second includes n = 1,094 firms with the X variables measured over a period of five years (2010-2014). Both samples were selected after the effects of the 2008 financial crisis had been taken into account. For each of them, the S&P rating and the Moody's rating are also available. The companies' ratings in the samples are shown below. In the rating distribution, companies are most frequent in the categories A+ to BB+, considered as recommended investments; companies classified as non-prime, in BB or lower categories, are also present. Even in these non-recommended classifications, many companies have been selected. In groups C and D there are fewer companies, because their mortality is high. On the other hand, in the highest categories, the number of companies is much smaller. In short, the sample distribution of the selected companies (in both samples) is similar to the distribution in the population of companies that are rated by the agencies.
Both ratings are available for most of the selected firms, although 22% of the companies do not have Moody's rating. For this reason, the S&P valuation will be initially used as endogenous variable in the ANN model; ratings are codified in 20 categories from AAA (or Aaa) to D (or Ca), and some additional variables are obtained defining lower numbers of ratings.
These Y valuation variables are forecasted. As could be expected, merging the ratings into a smaller number of classes increases the predictive power of the models. A general description of the variables used in the modeling process is presented, starting with the economic-financial variables (X) and the ratios (R): the basic descriptive statistics are shown in the following tables.
Although most previous publications use financial ratios, in our ANN these variables are not relevant once the X variables are included.
Besides modeling the original S&P and Moody's ratings, as these are somewhat 'diffuse', some transformations can be applied to reduce the original twenty categories to a smaller number. Here, diverging from previous works, the ANN models presented forecast the full range of ratings. Considering a first classification of the rating into two categories (stable and speculative), 57.9% of the companies analyzed have a rating equal to or higher than BBB-, and 42.1% fall into speculative (non-prime) categories. Using a larger number of categories, the distribution is presented in Table 6.
The univariate description of each of the X variables shows that none of them can be used individually as a benchmark for rating estimation, and their description is thus omitted. In the 'utilities' sector, the behavior of the EBITDA variable is contrary to that of other sectors, at least for several companies. In the rest, ratings tend to be higher for entities with a higher value of X 2 . The variable X 4 behaves heterogeneously: in several sectors ratings tend to rise with X 4 , while in others the opposite occurs. The net income variable, as expected, tends to increase with higher revenues, at least for the best-rated companies. Total assets are not clearly related to the company's rating; in some sectors, the relationship is even negative. The distribution of non-current assets in each sector is also very variable. The level of inventories is higher in the industrial sector, so it is not predictive of the company's rating. The distribution of short-term and liquid assets is also variable, although some industrial, telecommunications, computing and energy companies hold substantial reserves. Debt, both short- and long-term, is highly variable, and is notably high in some industrial, telecommunications and utilities companies. The firm's capital varies hugely between companies and sectors. Cash flow is quite heterogeneous across sectors, and not directly related to ratings. In energy and telecommunications firms, the level of investment is very high in comparison with the rest of the sectors. In the health industries, returns to shareholders tend to exceed those in the rest. Market capitalization is quite variable and does not, by itself, show a direct link to ratings.
In the second sample, the results are similar: it is not possible to clearly associate any of the variables with the ratings when each exogenous variable is considered alone. In addition, when analyzing the temporal evolution of each variable against ratings, there is no clear association. Consequently, multivariate methods are needed to forecast ratings from the financial data of each firm.

RATING PREDICTION MODELS
Most published works deal with the prediction of bond issue ratings, and far fewer with those of companies, which is the problem addressed here. The modeling of the ratings of the companies in both samples is based on ANN techniques. Other approaches are clearly possible, using several multivariate statistical techniques or multinomial logistic (or similar) models. Discriminant analysis methods are easy to apply, but nonlinear relationships between variables limit their use. ANNs were first used in the 1980s to classify bonds. In rating prediction it is usual to merge rating classes, but here we address the forecasting of the final, non-aggregated rating of each company. The higher the degree of aggregation of the rating categories, the higher the proportion of correct predictions. For example, Garavaglia (1991) attempts to predict all rating levels, ranging from the highest credit quality to the lowest, with a correct prediction rate of 23%, and then groups the ratings into three classes (investment grade, from AAA to BBB; speculative grade, from BB to C; and the last with the D score), reaching 84% of correct predictions. As the granularity, that is, the number of classes to predict, increases, this proportion decreases. Here, in contrast with previous publications, several models are estimated to forecast the exact rating over the whole range published by S&P.
With respect to the time interval considered, in sample I the data cover just one year and, therefore, ratings are estimated at the end of this period. With sample II, which has data from a five-year period, it is possible to address dynamic prediction, or at least to predict the rating of the last fiscal year taking into account the public financial X variables of previous years. Between 70% and 80% of the available data are used to estimate the models, leaving the rest to validate the results. In addition to the usual goodness-of-fit measures and the proportion of exact hits in the rating prediction, the proportion of hits within a moving interval is obtained, counting as correct the immediately preceding and subsequent categories. Since the ratings of S&P and Moody's differ from each other in almost 45% of the evaluations, it is usual to see differences of one or two steps in the score.
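The hit proportions used throughout the following sections (exact rating, plus or minus one notch, plus or minus two notches) can be written as a single measure with a tolerance parameter. The sketch below assumes ratings are coded as consecutive integers; the function name hit_rate and the five example cases are hypothetical and serve only to show the calculation.

```python
def hit_rate(actual, predicted, tolerance=0):
    """Share of cases whose predicted rating code lies within `tolerance`
    notches of the actual code (tolerance=0 counts exact hits only)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical rating codes for five companies (integer-coded ratings).
actual_codes = [1, 5, 10, 12, 20]
predicted_codes = [1, 6, 8, 15, 20]

exact = hit_rate(actual_codes, predicted_codes)             # exact rating
within_one = hit_rate(actual_codes, predicted_codes, 1)     # +/- one notch
within_two = hit_rate(actual_codes, predicted_codes, 2)     # +/- two notches
print(exact, within_one, within_two)
```

By construction, the measure cannot decrease as the tolerance widens, which is why the reported percentages rise from the exact rating to the one-notch and two-notch intervals.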
In the first data set, a sample of companies whose data correspond to the last year considered, an MLP network (16, 7, 1) is estimated using the exogenous economic-financial variables plus the sector to which each company belongs. The results obtained with this network improve on those presented in the literature that use all rating levels in the S&P classification: in 29.81% of the cases, the rating is reproduced exactly. If a prediction is accepted when it matches the exact rating or one of the two adjacent ones, 58.10% of correct predictions are obtained in the test set, and if the objective is to forecast the rating within an interval of two classes above or below the Y value, the success rate increases to 80.24% of the S&P scores.
The ROC curves show the behavior at each of the rating levels.
Because the number of data is not large, some rating classes do not show sufficient detail. With the sample II models, these problems will be corrected. Another way to read these curves is to measure the area under them: higher values indicate better predictability. The most important explanatory variable in the network is 'market capitalization', in line with the importance that most authors attach to the size of the company evaluated. Net income is the next most important variable. Again, the ROC curves show the difficulty of estimating the predictive accuracy for some infrequent ratings. As for the importance of the different independent variables in the rating prediction model, Table 10 shows that the first two variables remain the market capitalization of the companies and the net income, although permuted with respect to the S&P model.
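The area under a ROC curve for one rating level against all others can be computed directly from the model scores, without drawing the curve, as the proportion of (positive, negative) pairs that the score orders correctly. The sketch below is a generic illustration of that definition, not the software used for the curves reported here; the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model scores for one rating level (higher = more likely).
    labels: 1 if the company actually has that rating, 0 otherwise.
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        # With no cases on one side the area is undefined, which is the
        # practical difficulty with infrequent rating classes.
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for one rating level against the rest.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 0, 1, 0, 0]
area = roc_auc(example_scores, example_labels)
print(round(area, 4))
```

A value of 0.5 corresponds to a score with no discriminating power and 1.0 to perfect separation; with very few positive cases the estimate rests on a small number of pairs and is correspondingly unstable.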
Merging several classes, as is done in several referenced works, improves the predictive capacity of the networks substantially. Consider, as the variable to explain for S&P, YSn ∈ {1,2,3}, with YSn = 1 for the lowest ratings up to BB+, YSn = 2 for the ratings from BBB- to BBB+, and YSn = 3 for the ratings from A- to AAA. The training set includes 536 cases and the test set 247. The network uses seven neurons in the hidden layer and reaches 61.5% of correct classifications, as can be seen in Table 11.
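The three-class grouping can be expressed as a simple mapping from the 20-level numeric rating code. The sketch below assumes the coding in which 1 stands for AAA, 2 for AA+ and so on down to 20 for D, and further assumes that the ten investment-grade notches occupy codes 1 to 10 in the usual order (AAA, AA+, AA, AA-, A+, A, A-, BBB+, BBB, BBB-); the function name group_rating is hypothetical.

```python
def group_rating(code):
    """Map a 20-level rating code (1 = AAA, ..., 20 = D) to three classes.

    Returns 3 for A- to AAA (codes 1-7), 2 for BBB- to BBB+ (codes 8-10)
    and 1 for BB+ and below (codes 11-20). The code-to-symbol ordering is
    an assumption of this sketch.
    """
    if not 1 <= code <= 20:
        raise ValueError("rating code must be between 1 and 20")
    if code <= 7:
        return 3
    if code <= 10:
        return 2
    return 1


# Grouped class for every code on the 20-level scale.
grouped = {code: group_rating(code) for code in range(1, 21)}
print(grouped)
```

The same mapping with a single cut at code 10 would give the two-class split between ratings at or above BBB- and speculative ratings.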
The importance of the exogenous variables appears in Table 12, in which, again, the market capitalization appears first, followed by the variables related to the income. The description of the data in the three groups shows clear differences between them. The importance of the introduced variables appears in Table 13, in which, again, the market capitalization appears first, followed by the variables related to the income.
In Table 13, there are clear differences between the groups, as can be seen in the Wilks's tests of homogeneity of means, that produce p-values p < 0.001, for all the variables.
These grouped results are not considered especially informative, since the prediction of grouped rating classes is not the priority objective; they are presented simply as a point of comparison with previous results in the literature. With other classification methods, such as discriminant analysis, the results are less accurate. Using the same classification variables as in the ANN, Fisher's discriminant functions lead to lower forecasting power.
Only 52.4% of correct predictions were achieved, compared to 61.5% with the neural network, although fewer data were used in the training set. With logistic models, the results are similar to those obtained by discriminant analysis, that is, also inferior to those of the neural network models.

DYNAMIC RATING PREDICTION MODELS
In this case, sample II is used. The first model includes as exogenous variables the fifteen X variables considered, the industrial sector of the company and the year to which the data correspond. The explained variable is the S&P rating with all its categories; that is, categories are not merged into blocks to improve the percentage of hits, which, in previous works, does not reach 25% of correctly predicted ratings.
Thus, there are 45 input variables to the network and two factors. The selected network has a hidden layer with 13 neurons, with hyperbolic tangent activation function, and the S&P rating as the output layer. This output variable takes the following values: 1 for the maximum level AAA, 2 for the next, AA+, and so on up to the value 20 for the D rating. The number of data used is 4,812, with 3,530 being dedicated, a little over 70%, as a subsample for estimating the network (training set) and 1,562, the remaining 30%, for validation.
The correct rating forecasts were 33.97% in the training set and 29.92% in the test set, compared with the previous result of Garavaglia (1991), who reached 23% using 17 rating levels instead of the 20 considered here. If a forecast is accepted as correct when it is either the actual rating or the next higher or lower level, the correct forecasts are 65.29% in the training set and 62.09% in the test set.
If the ratios are included as additional input variables, the percentages of correct predictions increase slightly (3%), so, for economy of parameters, the proposed model with the 15 exogenous variables for each year, the sector and the year factor is retained; in this case, the correct rating forecasts are 35.41% in the training set. If the aim is to forecast the actual rating plus or minus one notch, the correct forecasts rise to 64.59%. Allowing a maximum deviation of two steps, this value increases to 81.53%. In the test set, these percentages are 30.73%, 59.73% and 80.15%, respectively. Not taking into account the sector to which the company belongs, these percentages decrease slightly. The ROC curve shows the behavior for each of the ratings.
The capitalization level (X 15 ) is the input variable with the greatest predictive power, in line with the results of several authors who associate creditworthiness with the size of the company. The next is associated with liquidity (X 9 ) and the third, (X 14 ) corresponds to dividends paid to shareholders.
Likewise, Moody's ratings can be estimated using similar models. With the same topology as in the network selected for S&P, i.e. with the fifteen yearly explanatory base variables (X), the economic sector and the year associated with the data, for the entire period 2010-2014, the following results are obtained, with n = 4,201 cases, of which 2,987 are in the training set (71.1%) and 1,214 (28.9%) in the test set. The selected network is still an MLP (47, 13, 1), and the output variable is Moody's rating structured into the whole set of 20 levels, without adding any lagged ratings. In the training set, 1,152 cases were correctly estimated, that is, 38.57% of the total. If the objective is to estimate the rating with a maximum deviation of one step, there are 835 additional correct cases, increasing the predictive capacity to 66.52%. If a maximum deviation of two rating levels is allowed in the prediction, the percentage of correctly classified cases increases to 84.40%. In the test set, these measures of predictive capacity are maintained, with respective proportions of 34.43%, 65.16% and 83.69%.
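The training-set percentages for the Moody's model follow directly from the case counts given above, as the short check below shows. Only figures stated in the text are used; the helper name pct is hypothetical.

```python
def pct(part, whole, digits=2):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


# Case counts reported for the Moody's model.
train_n, test_n = 2987, 1214
exact_hits = 1152       # training cases with the exact rating
adjacent_hits = 835     # additional training cases within one step

total_n = train_n + test_n
train_share = pct(train_n, total_n, 1)
test_share = pct(test_n, total_n, 1)
exact_rate = pct(exact_hits, train_n)
within_one_rate = pct(exact_hits + adjacent_hits, train_n)

print(total_n, train_share, test_share, exact_rate, within_one_rate)
```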
The forecasts obtained for the rating of Moody's are therefore more accurate than those obtained for the S&P rating, which were 30.73%, 59.73% and 80.15%, respectively. Perhaps, it could be concluded that S&P's ratings are overvalued relative to reality, and that Moody's proposes more compatible ratings according to economic-financial variables of companies.
A clearer way to summarize the information provided by the ROC curves is the area under the curve associated with each rating level, although it should be remembered that some categories have a rather small number of data. The relative importance of each variable entered in the network is shown in the following table.
The capitalization level (X 15 ) has the second greatest predictive power, in line with the results of several authors who associate creditworthiness with the size of the company, and is down one position with respect to the S&P model. The first is the cash flow (X 12 ). Note the difference between the criteria prevailing at Moody's and at S&P. It is possible to simplify the previous model by eliminating some explanatory variables. The most accurate model eliminates the temporal factor as an explanatory variable, in addition to the variables 'total non-current assets' (X 7 ) and 'inventories' (X 8 ). The model finally estimated is an MLP (40, 19, 1), estimated with 3,539 cases in the training set and 1,564 in the test set. The proportion of correctly classified cases rises to 37.64%, to 65.78% with a deviation of one level on either side, and to 81.75% with a deviation of two levels on either side. For the test set, these proportions are 33.31%, 62.15% and 81.52%, respectively.
The ROC curves show the behavior of the model for each of the ratings. The importance of explanatory variables is shown below. It is the total equity (X 11 ) and the current market capitalization (X 15 ) variables that have the greatest predictive power.

CONCLUSION
Forecasting long-term companies' ratings can be addressed using several statistical and econometric techniques. Previous references use a simplified approach, reducing the number of categories in the rating variable by defining class intervals of ratings. Nowadays, S&P and Moody's use twenty levels, and this actual rating has been estimated with different statistical and AI models. Most of the academic literature is oriented to estimating the rating of bond issues and credit instruments, while here the objective is to forecast the long-term rating of companies, which is obtained with fairly high accuracy; and this can be done without the ordinary costs incurred when a company requests a rating from a CRA. With this aim, only public data are used, based on the economic and financial information that companies publish with their fiscal-year accounts. Repeated measures over several years also improve the accuracy of forecasting. Of course, the proposed methodology can be extended when these institutions report partial data (for example, on a monthly or quarterly basis). ANNs provide a flexible instrument to forecast the rating provided by S&P or Moody's, and to estimate it within a moving interval with semi-amplitude equal to one or two rating levels. AI models are more precise than other multivariate discrimination and classification procedures.
The input variables used in the proposed models are related to company size, activity, financial aspects, market capitalization, cash flow, liquidity, sales, and debt. Cross-section and fixed panel data are used in two different samples, obtaining the corresponding forecasts. Better results are obtained when five-year public data are used as explanatory variables, taking into account the industry sector of each company. These variables cannot be individually linked to the published ratings; thus they are treated jointly. One random sample of 1,324 companies is used for the cross-section model, with ratings from both CRAs. A second sample of 1,087 companies, covering a five-year period, is the basis of a dynamic (and more precise) forecasting model. Some alternative models can be specified, in some cases omitting some of the variables, with similar results.
Comparing S&P and Moody's ratings, the former tend to be higher than the latter (in both samples and every year). In addition, Moody's results tend to be more accurate and robust in alternative models. This could be related to possible conflicts of interest in the rating of companies, which are customers of the