In the era of data mining, selecting key factors is an important step in economic investment. In this paper, variables are selected with a forward strategy. Based on the Goodman-Kruskal τ (GK-τ) model, we apply supervised discretization to high-dimensional data, derive new quantitative views from historical data, and form a new portfolio model. Empirical tests show that the investment strategy given by this model obtains better returns and has practical value.
Inspired by the research on variable discretization by Huang Wen [1], this paper uses the GK-τ model with a forward supervised discretization strategy to identify the key factors among high-dimensional variables, and conducts economic portfolio research on these factors rather than on factors chosen directly by investors.
Among the models for estimating correlation, GK-τ is chosen because it better measures both local and global correlation.
The rest of the article is organized as follows: the second part introduces the model, and the third part analyzes the results.
The GK-τ model in high dimension and prediction by the forward method.
In the model, the discretized independent variable is the independent variable to be discretized by the forward method. Epy denotes prediction accuracy, evaluated both without the independent variable and with it. Because the value without the independent variable is fixed for a given dependent variable, comparing predictive ability is equivalent to comparing accuracy with the independent variable.
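As an illustration of the accuracy comparison described above, the following is a minimal sketch of the standard Goodman-Kruskal τ for two categorical variables, τ(Y|X) = (E[p(Y|X)] − E[p(Y)]) / (1 − E[p(Y)]), where E[p(Y)] is the expected accuracy of proportional prediction without X and E[p(Y|X)] is the expected accuracy with X. The function name `gk_tau` and the convention of returning 0 for a constant dependent variable are assumptions made for this example, not taken from the paper.

```python
from collections import Counter


def gk_tau(x, y):
    """Goodman-Kruskal tau of y given x for two equal-length label sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    count_x = Counter(x)           # marginal counts of the independent variable
    count_y = Counter(y)           # marginal counts of the dependent variable
    count_xy = Counter(zip(x, y))  # joint counts

    # Expected accuracy of proportional prediction without x: sum_j p(y_j)^2
    ep_y = sum((c / n) ** 2 for c in count_y.values())

    # Expected accuracy with x: sum_i sum_j p(x_i, y_j)^2 / p(x_i)
    ep_y_given_x = sum(
        (c / n) ** 2 / (count_x[xi] / n) for (xi, _), c in count_xy.items()
    )

    if ep_y >= 1.0:
        # y is constant, so tau is undefined; 0.0 is returned here by assumption.
        return 0.0
    return (ep_y_given_x - ep_y) / (1.0 - ep_y)


if __name__ == "__main__":
    # x determines y completely, so tau is 1
    print(gk_tau([0, 0, 1, 1], ["late", "late", "on_time", "on_time"]))
    # x carries no information about y, so tau is 0
    print(gk_tau([0, 0, 1, 1], ["late", "on_time", "late", "on_time"]))
```

Since the denominator depends only on the dependent variable, ranking candidate independent variables by τ gives the same order as ranking them by the accuracy term E[p(Y|X)].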
The GK-τ model, combined with forward supervised discretization, is used to build the investment forecasting model in place of the traditional approach of predicting from directly selected factors, with the aim of improving predictive ability and achieving better investment results.
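One way to read forward supervised discretization is as a greedy search: start with no cut points and repeatedly add the cut that most increases τ of the dependent variable given the binned variable. The sketch below follows that reading. The candidate cuts (midpoints between adjacent distinct values), the stopping parameters `max_cuts` and `min_gain`, and all function names are assumptions made for illustration; the paper does not specify them.

```python
from bisect import bisect_right
from collections import Counter


def gk_tau(x, y):
    """Goodman-Kruskal tau of y given x for two equal-length label sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    ep_y = sum((c / n) ** 2 for c in count_y.values())
    ep_y_given_x = sum(
        (c / n) ** 2 / (count_x[xi] / n) for (xi, _), c in count_xy.items()
    )
    if ep_y >= 1.0:
        return 0.0  # constant y: tau undefined, 0.0 returned by assumption
    return (ep_y_given_x - ep_y) / (1.0 - ep_y)


def forward_discretize(values, y, max_cuts=2, min_gain=1e-3):
    """Greedily add cut points to a continuous variable to maximize tau(y | bins)."""
    distinct = sorted(set(values))
    # Candidate cuts: midpoints between adjacent distinct values (an assumption).
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    cuts, best_tau = [], 0.0
    while len(cuts) < max_cuts:
        best_cut, best_cut_tau = None, best_tau
        for cut in candidates:
            if cut in cuts:
                continue
            trial = sorted(cuts + [cut])
            # The bin index of a value is the number of cut points at or below it.
            bins = [bisect_right(trial, v) for v in values]
            tau = gk_tau(bins, y)
            if tau > best_cut_tau:
                best_cut, best_cut_tau = cut, tau
        # Stop when no cut helps or the improvement is below min_gain.
        if best_cut is None or best_cut_tau - best_tau < min_gain:
            break
        cuts = sorted(cuts + [best_cut])
        best_tau = best_cut_tau
    return cuts, best_tau


if __name__ == "__main__":
    ages = [20, 25, 30, 50, 55, 60]   # toy values, not from the paper
    on_time = [0, 0, 0, 1, 1, 1]      # toy binary dependent variable
    print(forward_discretize(ages, on_time))
```

Refining a partition does not decrease τ, so a search of this kind needs some stopping rule; the gain threshold used here is only one possible choice.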
Empirical results and analysis.
Part of the data in this paper comes from a bank's loan income database. Payment punctuality (on time or not on time) is taken as the dependent variable, while assets, income, debt, economic demand and age are continuous independent variables. Payment punctuality is a binary variable (0 means unable to pay on time, 1 means paying on time), and age is continuous.
After discretization, a variable such as age can be divided into young, middle-aged and elderly groups. The results show that the second variable selected is economic demand, with a prediction result of 0.83812, which is better than directly choosing the single best variable and indicates better predictive ability.
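The forward choice of a second variable can be sketched in the same spirit: at each step, add the discretized variable that most increases τ of the dependent variable given the joint categories of all selected variables. The column names and values below are toy data invented for this example and are unrelated to the bank data; `forward_select` and `max_vars` are likewise hypothetical.

```python
from collections import Counter


def gk_tau(x, y):
    """Goodman-Kruskal tau of y given x for two equal-length label sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    ep_y = sum((c / n) ** 2 for c in count_y.values())
    ep_y_given_x = sum(
        (c / n) ** 2 / (count_x[xi] / n) for (xi, _), c in count_xy.items()
    )
    if ep_y >= 1.0:
        return 0.0  # constant y: tau undefined, 0.0 returned by assumption
    return (ep_y_given_x - ep_y) / (1.0 - ep_y)


def forward_select(columns, y, max_vars=2):
    """Greedily pick discretized variables that maximize tau(y | selected variables)."""
    selected, best_tau = [], 0.0
    while len(selected) < max_vars:
        best_name, best_name_tau = None, best_tau
        for name in columns:
            if name in selected:
                continue
            names = selected + [name]
            # Each row's joint category is the tuple of its values on the chosen columns.
            joint = list(zip(*(columns[n] for n in names)))
            tau = gk_tau(joint, y)
            if tau > best_name_tau:
                best_name, best_name_tau = name, tau
        if best_name is None:
            break  # no remaining variable improves tau
        selected.append(best_name)
        best_tau = best_name_tau
    return selected, best_tau


if __name__ == "__main__":
    # Toy, already-discretized columns (hypothetical names and values).
    columns = {
        "income_bin": [0, 0, 0, 0, 1, 1, 1, 1],
        "debt_bin": [0, 0, 1, 1, 0, 0, 1, 1],
        "age_bin": [0, 0, 0, 0, 0, 1, 0, 1],
    }
    on_time = [0, 0, 0, 1, 1, 1, 1, 1]
    print(forward_select(columns, on_time))
```

In this toy example the second variable raises τ above the value reached by the best single variable, which is the kind of comparison reported above.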
The GK-τ model is a ratio-based prediction model that combines weight factors, from part to whole, with a thermostatic element, and uses forward discretization to cut the intervals more effectively and obtain better predictive ability.
It overcomes some shortcomings of the traditional investment model, in which key factors are selected directly in practice, and it is tested empirically on the information given by bank loan data.
The results show that the model has a certain scope of application and potential, and that it offers useful guidance for the economic investment of ordinary investors.
It also provides a new idea for applications in the investment market.