Use of noise to augment training data: A neural network method of mineral-potential mapping in regions of limited known deposit examples

Bibliographic Details
Published in: Natural resources research (New York, N.Y.), Vol. 12, no. 2, pp. 141-152
Main Authors: BROWN, Warick M, GEDEON, Tamas D, GROVES, David I
Format: Journal Article
Language: English
Published: Heidelberg Springer 01-06-2003
Springer Nature B.V
Description
Summary: One of the main factors affecting the performance of MLP neural networks trained with the backpropagation algorithm in mineral-potential mapping is the paucity of deposit relative to barren training patterns. To overcome this problem, random noise is added to the original training patterns to create additional synthetic deposit training data. Experiments on the effect of the number of deposits available for training in the Kalgoorlie Terrane orogenic gold province show that both the classification performance of a trained network and the quality of the resultant prospectivity map increase significantly with increased numbers of deposit patterns. Experiments are conducted to determine the optimum amount of noise using both uniform and normally distributed random noise. Through the addition of noise to the original deposit training data, the number of deposit training patterns is increased from approximately 50 to 1000. The percentage of correct classifications improves significantly for the independent test set as well as for deposit patterns in the test set. For example, using ±40% uniform random noise, the test-set classification performance increases from 67.9% and 68.0% to 72.8% and 77.1% (for test-set overall and test-set deposit patterns, respectively). Indices for the quality of the resultant prospectivity map (i.e., D/A, D × (D/A), and area under an ROC curve, where D is the percentage of deposits and A is the percentage of the total area for the highest prospectivity map-class) also increase from 8.2, 105, 0.79 to 17.9, 226, 0.87, respectively. Increasing the size of the training-stop data set results in a further increase in classification performance to 73.5%, 77.4%, 14.7, 296, 0.87 for test-set overall and test-set deposit patterns, D/A, D × (D/A), and area under the ROC curve, respectively.
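The noise-augmentation step and the two ratio-based map-quality indices described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `augment_with_noise` and `map_quality_indices` are hypothetical, and the sketch assumes that noise is applied multiplicatively to each feature value (so ±40% uniform noise scales a feature by a factor between 0.6 and 1.4); the summary does not state how the noise is scaled, nor how the normal-noise level is parameterized.

```python
import numpy as np


def augment_with_noise(patterns, n_total, level=0.4, kind="uniform", seed=0):
    """Return n_total patterns: the originals followed by noisy copies.

    Assumption (not stated in the summary): noise is multiplicative.
    With kind="uniform", each feature is scaled by a factor drawn from
    [1 - level, 1 + level). With kind="normal", level is used as the
    standard deviation of the relative perturbation.
    """
    rng = np.random.default_rng(seed)
    patterns = np.asarray(patterns, dtype=float)
    n_extra = n_total - len(patterns)
    if n_extra < 0:
        raise ValueError("n_total must be at least the number of originals")

    # Choose, with replacement, the original each synthetic pattern is based on.
    idx = rng.integers(0, len(patterns), size=n_extra)
    base = patterns[idx]

    if kind == "uniform":
        noise = rng.uniform(-level, level, size=base.shape)
    elif kind == "normal":
        noise = rng.normal(0.0, level, size=base.shape)
    else:
        raise ValueError("kind must be 'uniform' or 'normal'")

    synthetic = base * (1.0 + noise)
    return np.vstack([patterns, synthetic])


def map_quality_indices(d_pct, a_pct):
    """Return (D/A, D * (D/A)) for the highest prospectivity map-class.

    d_pct: percentage of deposits falling in that class (D).
    a_pct: percentage of the total area covered by that class (A).
    """
    ratio = d_pct / a_pct
    return ratio, d_pct * ratio


if __name__ == "__main__":
    # Placeholder data: 50 deposit patterns with 10 input features each.
    rng = np.random.default_rng(1)
    deposits = rng.uniform(0.1, 1.0, size=(50, 10))
    augmented = augment_with_noise(deposits, n_total=1000, level=0.4)
    print(augmented.shape)
```

The placeholder array stands in for real deposit patterns; only the counts (about 50 originals expanded to 1000) and the ±40% noise level are taken from the summary.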
ISSN: 1520-7439, 1573-8981
DOI: 10.1023/A:1024218913435