CNN is a deep learning algorithm by considering spatial inputs. Identical to other neural networks, CNN neurons have learnable weights and biases. However, CNN is mainly used for processing data with a grid topology, giving it a specific characteristic of its architecture [26].CNN is a feedforward network because information flow occurs in one direction only, that is, from their inputs to their outputs [27]. The CNN model uses three main layers, namely, the convolutional, pooling, and fully connected layers (Figure 7). The convolutional and pooling layers are used to reduce the computational complexity. Meanwhile, the fully connected layer is the flattened layer connected to the output. Various pooling techniques are available in the architecture of CNN. However, max pooling is mostly used in CNN layers, where the pooling window contains the maximum value from each elementCNN–LSTM was developed for visual time series prediction problems and generating textual descriptions from the sequences of images. The CNN–LSTM architecture uses CNN layers for feature extraction on input data and combines with LSTM to support sequence prediction. Specifically, CNN extracts the features from spatial inputs and uses them in the LSTM architecture to output the caption. The architecture of the CNN–LSTM model is illustrated in Figure 8. the maximum value from each element [28]The applications of this hybrid model have been used to solve many problems, such as rod pumping [29], particulate matter [30], waterworks [31], and heart rate signals [32]. Studies have demonstrated promising results; for example, Xingjian et al. [33] predicted the future rainfall intensity in a local region over a relatively short period. The experiments show that the CNN–LSTM network captures spatiotemporal correlations better and consistently outperforms the fully connected LSTM (FC‐LSTM) model for precipitation forecasting.