Thermal maturity and TOC prediction using machine learning techniques: case study from the Cretaceous–Paleocene source rock, Taranaki Basin, New Zealand


Thermal maturity, organic richness and kerogen type are key parameters in source rock characterization. Because geochemical analyses are costly and rock samples are often unavailable, alternative methods are needed to predict total organic carbon (TOC) and maturity indicators where geochemical data are missing. In this study, machine learning (ML) techniques were integrated with well-log data and applied to Cretaceous–Paleocene formations in the Taranaki Basin, New Zealand. The first objective of this research is a novel approach to maturity prediction using Tmax and vitrinite reflectance (VR%). The organic richness, expressed as TOC content, was predicted as well. Geochemical and well-log data collected from the Cretaceous Rakopi and North Cape formations and the Paleocene Mangahewa Formation were processed and prepared for the ML techniques. Five ML techniques, namely Bayesian regularization for feed-forward neural networks (BRNN), random forest (RF), support vector machine (SVM) for regression, linear regression (LR) and Gaussian process regression (GPR), were employed to predict TOC, Tmax and VR, and their results were compared. For TOC prediction, the best model, RF, achieved a coefficient of determination (R²) of 0.964. For Tmax prediction, BRNN with one hidden layer achieved R² = 0.828. BRNN with two hidden layers produced the best model for VR prediction, achieving R² = 0.636. The comparison showed that all five techniques performed very well for TOC prediction, with R² > 0.96. In contrast, BRNN with one hidden layer was the only technique to achieve R² > 0.8 for Tmax, and BRNN with two hidden layers was the only technique to achieve R² > 0.6 for VR.
Therefore, this research provides strong empirical evidence that ML techniques can capture the nonlinear relationships between well-log data and TOC and the maturity indicators, relationships that existing linear models may not fully capture.
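
The models above are compared by their coefficient of determination. As a minimal sketch of that comparison step, assuming the standard definition R² = 1 − SS_res / SS_tot, the following Python computes R² for each model and ranks the models by score. The function names, model labels and numeric values are hypothetical placeholders for illustration only; they are not the study's code or data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def rank_models(observed, predictions_by_model):
    """Return (model name, R^2) pairs sorted from best to worst score."""
    scores = {
        name: r_squared(observed, preds)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical placeholder values (NOT measurements from the study).
observed_values = [1.0, 2.0, 3.0, 4.0]
hypothetical_predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],  # tracks the observations closely
    "model_b": [2.5, 2.5, 2.5, 2.5],  # always predicts the observed mean
}

for name, score in rank_models(observed_values, hypothetical_predictions):
    print(f"{name}: R^2 = {score:.3f}")
```

A model that always predicts the observed mean scores R² = 0 under this definition, which gives a baseline for reading the reported values: the TOC scores above 0.96 leave little unexplained variance, while the best VR score of 0.636 leaves considerably more.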

Journal of Petroleum Exploration and Production Technology