
REVISTA PERSPECTIVAS, Volume 7, No. 1 / January - July 2025 / e-ISSN: 2661
MACHINE LEARNING ALGORITHMS FOR
PREDICTIVE MAINTENANCE: A SYSTEMATIC
LITERATURE MAPPING
Jorge Paredes Carrillo 1 jorge.paredes01@epn.edu.ec
Carlos Romero Barreno 2 carlos.romerob@espoch.edu.ec
1 Programa de Doctorado en Ingeniería Eléctrica, Escuela Politécnica Nacional (EPN), Quito, Ecuador.
2 Maestría en Electrónica y Automatización, Escuela Superior Politécnica de Chimborazo (ESPOCH), Riobamba, Ecuador.
Abstract
Predictive maintenance is a practice that industrial
companies can apply to their processes thanks to
technologies such as artificial intelligence and the
Internet of Things. Machine Learning algorithms
are used in many fields to make predictions or
classifications. Predictive maintenance is an area
of research that provides new practices, strategies,
or methodologies. As a relatively new field, the
methodologies are still scattered and there is little
information on the maturity of the algorithms.
To provide a solid foundation, a systematic
literature mapping is presented to give engineers
and researchers an overview of machine learning
algorithms used in predictive maintenance. The
results show growth in publications in recent years,
which reflects the interest in this area of research.
However, most of the contributions in this field
are proofs of concept, and it is still
difficult to obtain a prototype that can be validated
as a complete and certified system. This paper
describes the main machine learning algorithms
used in predictive maintenance according to their
type of use and supervision, analyses their input
and output parameters, and determines their
maturity.
Keywords: Machine Learning, Predictive
Maintenance, Systematic Literature Mapping,
PdM.
Received: 06/03/2023. Accepted: 05/06/2024. Published: 20/01/2025
DOI: https://doi.org/10.47187/perspectivas.7.1.227

I. Introduction
The world is currently experiencing a new
industrial revolution called 'Industry 4.0', driven
by advances in technologies such as artificial
intelligence, the Internet of Things (IoT) and big
data [1]. Digital transformation requires an
approach that combines physical and
digital systems. Integrating these two
kinds of system produces a large amount of data
from different parts of a factory, which must be
processed to extract information [2].
IoT architectures in particular generate large
volumes of data [3]. Much of this data consists of
events and alarms that occur on the production
line of a factory. Processing and analyzing this
data yields information about the production process
that supports decision
making, maintenance tasks, fault detection, cost
reduction and operator safety [4].
These benefits are closely related to internal
processes in the manufacturing industry, where
strategies are needed to identify
possible failures in critical machinery and so
avoid unplanned shutdowns that affect production
[5]. For example, [6] proposes monitoring
an oil refinery's compressed gas system, using
Industrial Internet of Things architectures to
obtain data from specific machines and
machine learning algorithms to predict
the machine's current state. A system for
predictive maintenance tasks is proposed
in [7], which obtains a health index for
the machinery of an entire factory. In [8], a predictive
maintenance model uses neural network algorithms
to determine the remaining life of a machine and
to support maintenance scheduling in a factory.
In [9], quantitative and qualitative
analysis is combined with machine learning techniques
to predict failures, thereby aiding maintenance
decision making and reducing the costs associated
with these tasks.
Several nomenclatures can be found in the literature
to describe maintenance strategies. In
this paper we adopt the classification proposed
in [10], which distinguishes the following
strategies:
• Corrective maintenance: carried out to repair
a machine after a fault has occurred.
• Preventive maintenance: carried out at
regular intervals according to a maintenance
schedule, even if the machine has not yet failed.
• Predictive maintenance: attempts to predict
a future failure before it occurs
in order to plan maintenance tasks and reduce
costs.
Fig. 1 gives an overview of the types of maintenance.
Although corrective maintenance is the simplest
strategy, it requires stopping production to correct
the fault, which increases maintenance costs.
Preventive maintenance is effective in preventing
breakdowns, but increases costs by performing
unnecessary maintenance when the machine is in
optimal condition. Predictive maintenance uses
data on specific machine quantities and a history
of failures. It can also use statistical approaches
and machine learning algorithms. Therefore,
predictive maintenance has several advantages,
such as maximizing machine uptime and reducing
maintenance tasks and associated costs [11].
Predictive maintenance uses machine learning, an
application of artificial intelligence, as its main
tool. This approach is attractive because
several machine learning algorithms have recently
emerged that are accurate and comparatively easy to
implement. In addition, machine learning is
capable of handling large amounts of data and
extracting hidden relationships in dynamic
settings such as industrial environments
[12]. Machine learning can therefore serve as
a powerful tool in predictive maintenance tasks,
although its performance depends heavily on the algorithm
used. The aim of this paper is thus to
present a systematic literature mapping of
the main machine learning algorithms used in
predictive maintenance. The paper provides
background on these
algorithms, their main applications and their
maturity levels, as a basis for future research work.
The paper is structured as follows: Section II
presents background concepts for
the different machine learning algorithms. Section
III presents related work. Section IV presents the
planning and execution of the
systematic literature mapping. Section V presents
the results: the main machine learning
algorithms used, the
types of input data and outputs produced by the
algorithms, and their maturity. The final section
presents the contributions and conclusions of this
paper.
Fig. 1: Overview of types of maintenance
II. Background
This section briefly describes the concepts that
proved most relevant
after the systematic mapping:
• Machine learning: a branch of artificial
intelligence that allows machines to learn from data
without being explicitly programmed. This allows
machines to make predictions, classify or
identify patterns [13].
• Supervised algorithms: they base their learning
on a set of previously labelled data, so that the
value of the target attribute is known [14].
• Unsupervised algorithms: they base their
learning on an unlabelled data set, for which no target
value or class is known. They are typically used for clustering
tasks [14].
• Regression: aims to predict a numerical result
[15].
• Classification: aims to predict a categorical
outcome [15].
• Linear regression: a supervised machine
learning algorithm. It is a data analysis
technique that predicts the value of an unknown
variable as a linear function of one or more
related, known variables [16].
• Decision tree: a supervised, non-parametric
machine learning algorithm. It has a
hierarchical tree structure consisting of a root
node, internal nodes and leaf nodes. It can be
used for regression or classification tasks [17].
• Random forest: a popular machine learning
algorithm used for classification or regression.
It is an ensemble of decision trees whose
individual predictions are combined [18].
• Support Vector Machines: a supervised
machine learning algorithm that
finds the optimal boundary between
classes. It can be used for both
regression and classification. It is based on the
principle of separating two classes by means
of a hyperplane; the training points closest to
the hyperplane are called support vectors [19].
• Neural Networks: a type of machine learning
algorithm, most often trained in a supervised
manner, that is loosely inspired by the behavior
of the human brain. It
can be defined as a network of interconnected
nodes. Neural networks can be used to perform
classification or regression [20].
• K-means: is an unsupervised machine learning
algorithm that attempts to form clusters of
data with similar characteristics [21].
• K-nearest neighbor: a supervised non-parametric
machine learning algorithm. It is
based on the distance between data points
and classifies an object according to the classes
of its nearest neighbors. This algorithm is
designed to perform classification, although
it can also be applied to regression [22].
• Long short-term memory: a type of
recurrent neural network, in which the output of the previous
step is fed back into the current step. It is specifically
designed to handle sequential data. It is used
for classification or regression [23].
• Autoencoder: a machine learning technique
that learns a compressed representation of the input data
and attempts to reconstruct the input from it, which
helps to remove errors or flag outliers
[24].
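The difference between classification and regression can be illustrated with k-nearest neighbor, which the list above notes can serve both purposes. The following is a minimal sketch in plain Python, not taken from any of the reviewed papers: the sensor readings, labels and remaining-hours values are hypothetical, and a practical implementation would normally scale the features before computing distances.

```python
import math
from collections import Counter

def nearest_targets(train_x, train_y, query, k):
    """Targets of the k training points closest to the query (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return [train_y[i] for i in order[:k]]

def knn_classify(train_x, labels, query, k=3):
    """Classification: majority vote among the k nearest neighbors."""
    return Counter(nearest_targets(train_x, labels, query, k)).most_common(1)[0][0]

def knn_regress(train_x, values, query, k=3):
    """Regression: average of the numeric targets of the k nearest neighbors."""
    neighbours = nearest_targets(train_x, values, query, k)
    return sum(neighbours) / len(neighbours)

# Hypothetical readings: (vibration in mm/s, temperature in degrees C)
readings = [(1.0, 40.0), (1.2, 42.0), (1.1, 41.0),
            (4.0, 70.0), (4.2, 72.0), (3.9, 69.0)]
states = ["healthy", "healthy", "healthy", "faulty", "faulty", "faulty"]
hours_left = [900.0, 850.0, 880.0, 60.0, 40.0, 70.0]

query = (4.1, 71.0)
predicted_state = knn_classify(readings, states, query)      # categorical outcome
predicted_hours = knn_regress(readings, hours_left, query)   # numerical outcome
```

The same neighbors are used in both cases; only the way their targets are combined (vote versus average) changes with the task.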

III. Related works
The use of machine learning algorithms in predictive
maintenance is an active area of
research, and several literature
reviews have already been published in this
field. Carvalho et al., in a review from 2019, describe four
important algorithms (random forest, neural
networks, support vector machines and k-means) and
the type of equipment
to which these algorithms can be applied [25].
Machine learning algorithms can be used to
perform regression or classification. Classification
is an important task for which supervised or
unsupervised algorithms can be used. Saranavan et
al. provide a literature review focused on
supervised machine learning algorithms for
classification, covering methodologies,
advantages and disadvantages [26].
Another approach is to compare machine learning
algorithms used in predictive maintenance.
Silvestrin et al. provide a comparison of
convolutional neural networks with time series,
finding significant differences in the use of
convolutional neural networks [27]. In the context
of Industry 4.0, Serradilla
et al., in their literature review, present
machine learning architectures that can be
used in predictive maintenance tasks, with a view to
reproducibility and replicability in different
environments. Also within Industry 4.0,
Drakaki et al. provide an insight into the main
machine learning algorithms used in predictive
maintenance of induction motors, focusing on
machine learning architectures and techniques
[28].
Among works on more specific applications,
the most notable is
that of Olesen et al., which identifies new trends and
challenges that can be addressed by using predictive
maintenance and machine learning in pumping
systems and thermal power plants [29].
None of the works described above focuses on
classifying the algorithms or on analyzing the types
of input data required by each machine learning
algorithm and the output it produces.
Similarly, none focuses on
providing a maturity level for the machine learning
algorithms used in predictive maintenance.
IV. Systematic Literature Mapping
A systematic literature mapping (SLM)
provides an overview of an area of interest.
This method identifies, appraises and interprets
information relevant to a particular area, problem
or phenomenon of interest [30]. Like a systematic
literature review, it is a secondary study that aims to
critically evaluate research with a similar scope.
The methodology proposed by [31] is used to
carry out the SLM.
A. Scope of the Study
The main objective of this study is to provide an
overview of the state of the art of machine learning
based data analysis algorithms used in predictive
maintenance. To successfully achieve this goal, the
following research questions have been proposed:
• RQ 1. What types of machine learning
algorithms are used in predictive maintenance?
• RQ1.1 What are the algorithms?
A classification of all the algorithms
found will help the reader to identify
similarities and differences between them
and, in turn, to improve
predictive maintenance.
• RQ 2. What input data does the machine
learning algorithm use?
Identifying the input and output parameters is
important as it will help to better understand
how the algorithms used in machine learning
work.
• RQ 2.1 What types of data does the machine
learning algorithm use?
Knowing whether the data used in machine
learning is synthetic or real data is important
for predictive maintenance applications.
• RQ 2.2 What input parameters are required
for this type of data?

• RQ 3. What is the output of the algorithm?
It is important to know the type of output the
algorithm produces to use it for predictive
maintenance tasks.
• RQ 4. What is the maturity of the algorithms
used?
It is important to know the maturity level of the
algorithms in order to know which ones are most
commonly used in predictive maintenance tasks.
B. Study Identification
A database-driven search approach was used
in this study, with Scopus as the main search
database:
1) Search String
The choice of keywords to construct the search
string was based on common terms used in
the literature and terms related to this work
(for example, PdM or ML). Some of the terms
suggested by [32] were used to find synonyms.
Table I shows the search string used.
The search string was validated by an expert
in the field. The expert provided 10 relevant
items, and the search string retrieved 9 of them,
that is, 90% of the items
provided by the expert.
A search was performed on 5 October 2023
and 6019 items were found.
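The validation step above amounts to computing the recall of the search string against the expert's reference set. A minimal sketch follows; the item identifiers are placeholders, since the paper does not list the 10 items provided by the expert.

```python
def search_recall(expert_items, retrieved_items):
    """Fraction of the expert's reference items that the search string retrieved."""
    expert = set(expert_items)
    found = expert & set(retrieved_items)
    return len(found) / len(expert)

# Placeholder identifiers (assumption): 10 reference items, 9 of them retrieved.
expert_items = [f"item-{i}" for i in range(1, 11)]
retrieved_items = expert_items[:9] + ["unrelated-item"]

recall = search_recall(expert_items, retrieved_items)
```

Items retrieved by the search but absent from the expert's set do not affect this figure, which measures coverage of the reference set only.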
2) Inclusion and Exclusion Procedure
The inclusion and exclusion process consists of
two phases: an automatic phase and a manual
phase. The automatic phase uses the Scopus
filters, whose values are listed in Table II,
while the manual phase uses the CADIMA
software [33]. A flowchart of the inclusion and
exclusion procedure is shown in Fig. 2.
TABLE I
Search String Used
((((predictive AND (maintenance or monitoring)) or PdM))
AND (“machine learning” or ML) AND (algorithm OR model
OR strategy OR technique))
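The boolean structure of the search string in Table I can be mirrored as a small predicate over a title or abstract. This is only an illustrative sketch: the function names are hypothetical, and the whole-word matching with an optional plural "s" is an assumption that does not reproduce how Scopus itself matches terms.

```python
import re

def contains(text, term):
    """Case-insensitive whole-word (or whole-phrase) match, allowing a plural 's'."""
    pattern = r"\b" + re.escape(term) + r"s?\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

def matches_search_string(text):
    """Mirror the three AND-ed groups of the search string in Table I."""
    topic = (
        contains(text, "predictive")
        and (contains(text, "maintenance") or contains(text, "monitoring"))
    ) or contains(text, "PdM")
    method = contains(text, "machine learning") or contains(text, "ML")
    artefact = any(
        contains(text, term)
        for term in ("algorithm", "model", "strategy", "technique")
    )
    return topic and method and artefact

# Hypothetical titles used only to exercise the predicate.
hit = matches_search_string("A machine learning model for predictive maintenance of pumps")
miss = matches_search_string("Predictive maintenance scheduling with integer programming")
```

The second title fails because it satisfies the topic group but none of the terms in the machine learning group.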
The manual phase was carried out on 2349
articles, after duplicates had been removed
with a built-in CADIMA function, using
the inclusion and exclusion criteria in Table
III. Before starting the manual phase, a pilot
phase was carried out between the principal
investigator and the expert with a set of 10
randomly selected articles to standardize the
inclusion and exclusion criteria. The title and
abstract of each article were read and marked
as included or excluded. To ensure inter-rater
reliability, a minimum Krippendorff alpha coefficient
of 0.8 was required, which is an accepted threshold in
most studies [34]. At the end of the pilot study,
a Krippendorff alpha coefficient of 1 was
obtained.
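For reference, Krippendorff's alpha for nominal labels can be computed as one minus the ratio of observed to expected disagreement. The sketch below is a plain Python version under that standard definition; it is not the CADIMA implementation, and the pilot labels are hypothetical because the paper reports only the resulting coefficient.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(ratings_by_unit):
    """Krippendorff's alpha for nominal labels.

    ratings_by_unit holds one list of labels per article; articles with
    fewer than two labels cannot be paired and are skipped.
    """
    coincidences = Counter()
    for labels in ratings_by_unit:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is ever used")
    return 1 - observed / expected

# Hypothetical pilot: two raters agree on all 10 articles.
pilot = [["included", "included"]] * 4 + [["excluded", "excluded"]] * 6
alpha_pilot = krippendorff_alpha_nominal(pilot)
```

With full agreement the observed disagreement is zero, so the coefficient equals 1, which is the situation reported for the pilot study.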
TABLE II
Inclusion Criteria Used in Scopus
Filter: Values
Research Field: Engineering; Computer Science
Type of document: Conference article; Journal article
Language: English
Fig. 2: Inclusion and exclusion procedure
The manual inclusion and exclusion process
consisted of 3 iterations carried out by the principal
investigator. In the first iteration, the title was
read and 502 articles were retained for the next
iteration. In the second iteration, the abstract was
read and 116 articles were retained. In the third iteration,
the conclusions were read and 77 articles were retained.
This set of 77 articles was used in the coding and
information extraction phase. Fig. 3 shows the
percentage of articles included in each iteration.
The articles included are presented in Appendix
A.
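The screening counts reported in this section can be turned into step-by-step retention rates. The sketch below uses only the figures stated in the text; the stage names are descriptive labels chosen here, and the percentages in Fig. 3 may be computed against a different base.

```python
stages = [
    ("Scopus search results", 6019),
    ("after automatic filters and duplicate removal", 2349),
    ("after title screening", 502),
    ("after abstract screening", 116),
    ("after conclusion screening", 77),
]

def retention_rates(stages):
    """Share of articles kept at each step, relative to the previous step."""
    return [
        (name, kept / previous)
        for (_, previous), (name, kept) in zip(stages, stages[1:])
    ]

for name, rate in retention_rates(stages):
    print(f"{name}: {rate:.1%}")
```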
Four labels were used to classify articles in the
manual inclusion and exclusion phase. These
labels are:
• Included: the article meets all the inclusion criteria
and none of the exclusion criteria.
• Excluded: the article meets the exclusion criteria
or none of the inclusion criteria.
• Unclear: the investigator is in doubt as to whether
the article should be included or excluded.
• Secondary: the article is a secondary or tertiary
contribution.
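One possible reading of how these labels follow from the criteria in Table III is sketched below. The criterion keys and the precedence of the checks are assumptions made for illustration; the paper does not specify how CADIMA or the investigator resolved overlapping cases.

```python
# Hypothetical short names for the five inclusion criteria of Table III.
INCLUSION_CRITERIA = (
    "related_to_pdm",
    "ml_based_data_analysis",
    "describes_algorithm",
    "describes_inputs_outputs_data",
    "primary_contribution",
)

def label_article(answers):
    """Map a reviewer's answers to a label.

    answers maps each criterion to True, False, or None when the reviewer
    is not sure. Assumed precedence: Secondary, Excluded, Unclear, Included.
    """
    if answers.get("primary_contribution") is False:
        return "Secondary"
    values = [answers.get(name) for name in INCLUSION_CRITERIA]
    if any(value is False for value in values):
        return "Excluded"
    if any(value is None for value in values):
        return "Unclear"
    return "Included"

all_met = {name: True for name in INCLUSION_CRITERIA}
example_label = label_article(all_met)
```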
TABLE III
Inclusion and Exclusion Criteria
Inclusion criteria (all must be met):
• The article must be related to predictive
maintenance.
• The article must be related to data analysis
based on machine learning.
• The article must contain information
about the machine learning algorithm
used.
• The article must contain information
about the input and output parameters
as well as the data used in the machine
learning model.
• The article must be a primary contribution.
Exclusion criteria (none may be met):
• The article is a secondary or tertiary
contribution.
Fig. 3: Articles included in each iteration
V. Results
Fig. 5 shows the number of articles published
between 2014 and 2023 that satisfy the inclusion
and exclusion criteria presented in this paper.
Among the selected articles, predictive maintenance
appears as a technique from
2014 onwards. Interest in this field of research has increased
in recent years, reaching a peak between 2021
and 2023. This trend is related to the amount of
data generated by industrial equipment and the
latest advances in machine learning algorithms.
The small number of works in the field
of predictive maintenance is due to the
complexity of implementing efficient strategies
in production environments [39]. In addition,
the number of machine learning
algorithms is limited because data science is
still a relatively new field of study and there
are still no defined methodologies for obtaining
historical maintenance and failure data in
industrial environments.
A. RQ. 1. What types of machine learning
algorithms are used in predictive maintenance?
The articles reviewed are classified along two
dimensions: type of use (regression or
classification) and type of supervision. Most
articles report supervised machine
learning algorithms used for regression
(data prediction). This is because the datasets
used are labelled and categorized, which makes
it easier to make a prediction or classification.
In contrast, few works use
unsupervised machine learning algorithms,
as these only aim to find patterns of possible
failures for future use in maintenance task
planning. Unsupervised algorithms are also more
prone to error when making predictions or
classifications. Among the selected papers,
unsupervised algorithms are used only in
regression tasks.
Fig. 6 shows the proportion of papers using
supervised and unsupervised algorithms and
whether they are used for regression or
classification.
In more than 50% of the selected papers, machine
learning algorithms are used for regression,
since one of the main objectives of predictive
maintenance is to estimate the remaining useful
life (RUL). Classification algorithms, in turn,
assign the machine or equipment a state of health
chosen from
categories proposed by each author.
Table IV summarizes the types of algorithms
according to their use and type of supervision in
the articles studied.
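As an illustration of RUL estimation as a regression task, the sketch below fits a straight line to a degradation signal and extrapolates it to a failure threshold. It is a deliberately simple assumption-laden example, not a method taken from the reviewed papers: the vibration values, the threshold and the linear degradation trend are all hypothetical.

```python
def estimate_rul(times, health, failure_threshold):
    """Estimate remaining useful life by linear extrapolation.

    Fits health = slope * time + intercept by least squares and returns the
    time left after the last observation until the fitted line reaches the
    threshold. Returns None when the signal is not rising towards it.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_h = sum(health) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (h - mean_h) for t, h in zip(times, health))
    slope = sxy / sxx
    intercept = mean_h - slope * mean_t
    if slope <= 0:
        return None  # no degradation trend to extrapolate
    time_at_failure = (failure_threshold - intercept) / slope
    return max(0.0, time_at_failure - times[-1])

# Hypothetical vibration amplitude (mm/s) sampled every 100 operating hours.
operating_hours = [0.0, 100.0, 200.0, 300.0]
vibration = [2.0, 2.5, 3.0, 3.5]
rul_hours = estimate_rul(operating_hours, vibration, failure_threshold=5.0)
```

A classification approach would instead map the same readings to a discrete health state, with the categories defined by each author.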
RQ. 1.1. What are the algorithms?
The papers consulted use a range of machine
learning algorithms that can be applied to
predictive maintenance tasks. These algorithms
range from those with a relatively simple
mathematical basis, such as linear regression,
to the more mathematically complex variants
of artificial neural networks.
Authors rarely rely on a single algorithm in
their work; they typically use several in order to compare
the accuracy and error of their main contribution.
The most used algorithms for supervised
models are:
• Linear regression
• Decision tree
• Random forests
• Support vector machines
• Neural Networks
• K-Nearest Neighbour
• Gradient boosting
• XGBoost
• AdaBoost
• Long short-term memory
• Autoencoder
The choice among these algorithms depends heavily
on the practical application and the data available.
For example, for a dataset with many
outliers, decision trees are a robust choice,
whereas AdaBoost is vulnerable to them.
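The sensitivity of AdaBoost to outliers can be seen in its sample reweighting step. The sketch below implements one round of the standard discrete AdaBoost weight update; the scenario, in which a hypothetical weak learner misclassifies only a single outlier out of ten samples, is an assumption chosen to make the effect visible.

```python
import math

def adaboost_reweight(weights, misclassified):
    """One round of the discrete AdaBoost sample-weight update.

    weights: current sample weights, summing to 1.
    misclassified: booleans, True where the weak learner was wrong.
    Returns the normalized new weights and the weak learner's vote weight.
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0 < error < 0.5:
        raise ValueError("weak learner must beat chance and must not be perfect")
    alpha = 0.5 * math.log((1 - error) / error)
    updated = [
        w * math.exp(alpha if wrong else -alpha)
        for w, wrong in zip(weights, misclassified)
    ]
    total = sum(updated)
    return [w / total for w in updated], alpha

# Ten equally weighted samples; only the outlier (index 0) is misclassified.
weights = [0.1] * 10
misclassified = [True] + [False] * 9
new_weights, alpha = adaboost_reweight(weights, misclassified)
```

After this update the misclassified samples together hold half of the total weight, so a single outlier that started at 10% now receives as much attention as the other nine samples combined. A decision tree trained without reweighting gives that point no such special influence.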
On the other hand, the most used algorithms in
unsupervised models are:
• K-means
• Neural Networks
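For the unsupervised case, a minimal k-means sketch (Lloyd's algorithm) is given below. It is written in plain Python for illustration; the sensor readings are hypothetical, and the starting centroids are supplied by the caller, whereas practical implementations usually choose them randomly or with a seeding heuristic.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assign every point to the index of its closest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment

# Hypothetical unlabelled readings: (vibration in mm/s, temperature in degrees C)
readings = [(1.0, 40.0), (1.2, 42.0), (1.1, 41.0),
            (4.0, 70.0), (4.2, 72.0), (3.9, 69.0)]
centroids, clusters = k_means(readings, centroids=[readings[0], readings[3]])
```

No labels are used at any point: the two groups emerge from the distances alone, and interpreting one of them as a possible failure pattern is left to the analyst.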
Fig. 4: Classification Scheme used
Fig. 5: Number of articles published between 2014 and 2023
Fig. 6: Papers per type of use and supervision