Extended spatial decision tree algorithm for classifying hotspot occurrence
Dissertation Abstract:
Forest fires in Riau Province, Indonesia, are a yearly disaster, especially in the dry season. They cause many negative effects on various aspects of life for people in Indonesia and in neighboring countries, including Singapore and Malaysia. To minimize these effects, classifying hotspot (active fire) occurrence is an essential activity in fire prevention. The existing methods for classifying hotspot occurrence, including logistic regression and decision tree algorithms, do not include spatial objects in the forest fires dataset because they are designed for non-spatial datasets. However, the supporting factors for hotspot occurrence are mostly represented as spatial objects. Therefore, spatial objects should be included in forest fires datasets for classifying hotspot occurrence in order to obtain classifiers with high accuracy.
This study proposed a new spatial decision tree algorithm, the extended spatial ID3 decision tree algorithm, to classify hotspot occurrence from a forest fires dataset that contains point, line, and polygon features. The method extends the existing spatial decision tree algorithm, which works on polygon features only. The proposed algorithm uses spatial information gain to choose the best splitting layers from a set of explanatory layers. A new formula for spatial information gain was proposed using spatial measures for point, line, and polygon features.
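The abstract does not state the proposed formula, so the following is only a minimal sketch of the general idea: an ID3-style information gain in which class proportions are computed from summed spatial measures rather than from record counts. It assumes the measure is a non-negative weight such as a point count, a line length, or a polygon area; the field names `target` and `measure` and the function names are hypothetical and are not taken from the dissertation.

```python
import math
from collections import defaultdict


def entropy(measure_by_class):
    """Shannon entropy (in bits) with class proportions taken from summed measures."""
    total = sum(measure_by_class.values())
    if total == 0:
        return 0.0
    h = 0.0
    for m in measure_by_class.values():
        if m > 0:
            p = m / total
            h -= p * math.log2(p)
    return h


def spatial_information_gain(records, layer):
    """Information gain of splitting on `layer`, weighting each record by its measure.

    Each record is a dict holding the layer values, a class label under
    "target", and an assumed spatial measure under "measure"
    (e.g. count for points, length for lines, area for polygons).
    """
    overall = defaultdict(float)
    by_value = defaultdict(lambda: defaultdict(float))
    for r in records:
        overall[r["target"]] += r["measure"]
        by_value[r[layer]][r["target"]] += r["measure"]

    total = sum(overall.values())
    if total == 0:
        return 0.0

    # Expected entropy after the split, weighted by each partition's share of the measure.
    remainder = sum(
        sum(classes.values()) / total * entropy(classes)
        for classes in by_value.values()
    )
    return entropy(overall) - remainder


def best_splitting_layer(records, layers):
    """Return the explanatory layer with the highest measure-weighted gain."""
    return max(layers, key=lambda layer: spatial_information_gain(records, layer))
```

With every measure set to 1, this reduces to the ordinary ID3 information gain. This illustrates how a spatial measure could enter the calculation, not how the dissertation defines it.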
The extended spatial ID3 algorithm was applied to a real forest fires dataset consisting of 10 explanatory layers (river, road, city center, land cover, source of income, precipitation in mm/day, screen temperature in K, 10 m wind speed in m/s, peatland type, and peatland depth) and a target layer. The target layer consisted of true alarm data (hotspots in 2008) and false alarm data. The result was a spatial decision tree with 134 leaves and an accuracy of 71.12 percent. After pruning, the tree became smaller, with 122 leaves and an accuracy of 71.66 percent.
For comparison, classifiers for hotspot occurrence were also developed using non-spatial methods, namely the ID3 algorithm, the C4.5 algorithm, and logistic regression. The accuracies of the decision trees generated by the ID3 and C4.5 algorithms were 49.02 and 65.24 percent, respectively, while the accuracy of the logistic regression model was 68.63 percent. These empirical results on the real spatial forest fires dataset demonstrate that the extended spatial ID3 algorithm achieves higher accuracy than the non-spatial methods.
The spatial decision tree was then tested on a new forest fires dataset containing hotspots from 2010. The accuracy of the tree without pruning was 60.06 percent, and the accuracy of the pruned tree was 61.89 percent. The pruned tree did not classify about 4.24 percent of the objects in the new dataset. These unclassified objects were mostly located in non-peatland areas where forestry and agriculture are the sources of income of the people living there. Moreover, most of the unclassified objects were located in plantation and dryland forest.