As cities become smarter and more data-driven, a big question looms: how effectively can data support infrastructure asset management when that data is often limited and of low quality, especially in smaller municipalities?

A recent paper in the Journal of Transportation Engineering compares how accurately several types of algorithms predict the deterioration of asphalt roads.

From the paper:

Using an example of a data set containing roads in the US and Canada, this study showcased the value of data analytics in infrastructure asset management and decision making. Predicting the deterioration in PCI was used as the case example in this study because of its popularity among municipalities in Ontario. The study compared GBT, random forest classifier, and naïve Bayes coupled with kernels. It was shown that only with affordable (and circumstantial and noncausal) data, the algorithms were able to predict deterioration with high accuracy. The data were segmented multiple times and according to different rationales. The algorithm parameters were also altered several times. This helped in analyzing the reasons for accuracy variations. Consequently, several tips were discussed for enhancing accuracy in practice and in future studies. Of particular interest is the performance of two examples of ensemble learning algorithms: random forest and GBT. The accuracy of GBT and random forest was boosted by 25% and 10%, respectively, compared with their base learner. Other methods of learning, such as k-NN (a lazy learner), were investigated as well. The impact of data normalization was tested on k-NN, and it was observed that normalization increased the cross-validation accuracy by at least 10%. Furthermore, kernel density estimates were used to increase the accuracy of the naïve Bayes classifier (instead of a generic Gaussian distribution). It was observed that the accuracy of naïve Bayes classifier was boosted dramatically by using more realistic density estimates.
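
The point about normalization and k-NN can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the records are synthetic and hand-made (not the paper's data), the two features (pavement age in years and a traffic count called `aadt`) and the "good"/"poor" labels are hypothetical, and the set is deliberately built so that the large-valued traffic feature swamps age in raw Euclidean distances. For brevity the min-max scaling is fit once on all records; a stricter evaluation would fit it on each training fold.

```python
import math
from collections import Counter

# Synthetic, hand-made records (NOT from the paper): (age_years, aadt, condition).
# Condition depends on age only; traffic is an unrelated, large-scale feature.
ROADS = [
    (2, 5000, "good"), (4, 30000, "good"), (6, 12000, "good"), (8, 22000, "good"),
    (20, 5500, "poor"), (22, 29000, "poor"), (25, 12500, "poor"), (28, 21000, "poor"),
]


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(features, labels, k=1):
    """Hold out each record in turn and predict it from all the others."""
    hits = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        rest_x = features[:i] + features[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_x, rest_y, x, k) == y
    return hits / len(features)


features = [(age, aadt) for age, aadt, _ in ROADS]
labels = [condition for _, _, condition in ROADS]

raw_acc = leave_one_out_accuracy(features, labels)
norm_acc = leave_one_out_accuracy(min_max_normalize(features), labels)
print(f"raw features:        {raw_acc:.2f}")
print(f"normalized features: {norm_acc:.2f}")
```

On this contrived set, the raw features match each road to a neighbor with similar traffic but the opposite condition, while the normalized features let age contribute to the distance. The gap here is exaggerated by construction. On real data the effect would be smaller: the paper reports an increase of at least 10% in cross-validation accuracy.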