Abstract:
Red tides result from the interaction of multiple natural factors, including physical, chemical, and biological ones. To address the difficulty of selecting impact factors and the low accuracy of red tide prediction, this study proposes an extreme gradient boosting (XGBoost) model for red tide level prediction based on a random forest (RF) feature selection method. We used Sansha Bay as the study area and took the annual red tide event data from 2005 to 2019 as the model input. First, we combined RF feature importance with Pearson correlation analysis to obtain the final feature ranking. Second, according to the AUC value of each feature under the RF algorithm, we determined the optimal number of features and then selected the optimal feature set for the XGBoost model based on feature importance. Finally, we trained the XGBoost classification model on this optimal feature set. Experimental results show that the method achieves higher classification accuracy than other classification methods and offers a new approach to predicting red tide levels in Sansha Bay.
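The abstract does not state how the RF importance scores and the Pearson correlation analysis are combined into one ranking. The following is a minimal sketch of one plausible reading, not the authors' procedure: features are ordered by importance, and a feature is skipped when it is strongly correlated with one already kept. The feature names, sample values, importance scores, and the 0.9 threshold are all hypothetical placeholders. In practice the scores would come from a fitted random forest rather than being typed in by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_features(columns, importance, max_abs_r=0.9):
    """Order features by importance, skipping any feature whose absolute
    Pearson correlation with an already-kept feature exceeds max_abs_r.
    (Assumed combination rule; the source does not specify one.)"""
    ordered = sorted(importance, key=importance.get, reverse=True)
    kept = []
    for name in ordered:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_r for k in kept):
            kept.append(name)
    return kept


# Hypothetical observations; names and values are placeholders, not study data.
columns = {
    "water_temp": [18.0, 20.5, 23.0, 25.5, 27.0, 24.0],
    "air_temp": [17.0, 19.5, 22.0, 24.5, 26.0, 23.0],  # water_temp minus 1
    "salinity": [30.1, 28.4, 29.9, 27.5, 30.3, 28.0],
    "dissolved_oxygen": [7.9, 6.2, 7.1, 6.8, 7.5, 6.0],
}

# Hypothetical importance scores standing in for random forest output.
importance = {
    "water_temp": 0.40,
    "air_temp": 0.25,
    "salinity": 0.20,
    "dissolved_oxygen": 0.15,
}

# air_temp is dropped because it is perfectly correlated with water_temp.
ranked = rank_features(columns, importance)
print(ranked)
```

Under this reading, the ranked list would then be truncated to the feature count that gives the best AUC, and the retained features would be used to train the XGBoost classifier.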