At the beginning of the 20th century, the mass electrification of buildings led to a boom in household electrical technologies [1]. Nowadays, almost all buildings are equipped with heating, hot water, and ventilation systems. This equipment accounts for the overwhelming majority of building energy consumption (77% in 2020; source: https://www.ceren.fr/, accessed on 1 July 2022). Because buildings are generally occupied by several people who do not all share the same occupancy habits, using this equipment efficiently is complicated in practice. The building sector (residential and tertiary) therefore offers considerable potential for reducing energy consumption, which raises the problem of optimizing building energy consumption through the efficient use of heating, domestic hot water, and ventilation systems. Many studies have sought to assess the impact of occupancy and/or occupants on the energy consumption of buildings. For instance, Bing Dong et al. [
2] showed that although building insulation and the number of occupants influence energy consumption, it is the habits of the occupants that correlate most strongly with consumption. To reach this result, the authors studied five types of housing with different insulation envelopes and different numbers of occupants. In each building, passive infrared (PIR) motion sensors and four power-monitoring systems were installed to record consumption data. Kaiyu Sun and Tianzhen Hong [
3] identified three occupant styles (austere, wasteful, and normal) and showed that occupant style has a significant impact on energy consumption. The authors also showed that, under an occupant-independent energy management system, energy consumption is only weakly influenced by occupant style. Zhiyuan He et al. [4], like [3], sought to quantify the potential energy savings obtained by improving occupant behavior; however, they used real survey data from Singapore. They considered four occupant styles (normal, wasteful, moderate, and austere) and incorporated the Markov-chain occupancy models developed by Yixing Chen et al. [5]. Compared with the normal style, their work shows a 13.4% increase in consumption for the wasteful style, a 9.5% reduction for the moderate style, and a 21% reduction for the austere style. W. Zhang et al. [
6] conducted a survey on the energy usage of 112 families in high-rise buildings and found that energy consumption and thermal satisfaction vary widely between occupants, and that occupant behavior matters more than the quality and quantity of the equipment used when it comes to lowering energy usage. M.S. Aliero et al. [7] showed that different control strategies must be used for commercial and residential buildings to account for occupant responses and unexpected variations in occupancy and weather conditions. Ashouri et al. [8] proposed a recommendation system that informs occupants of the energy savings they could achieve, based on past energy consumption patterns extracted with data-mining techniques (clustering, association rules, artificial neural networks). An efficient heating, ventilation, and air-conditioning (HVAC) system is also important for occupants’ health; González-Lezcano [9] emphasized the need to maintain optimal indoor air quality to promote the well-being of inhabitants.
With the correlation between occupants’ habits and building energy consumption established, several tools have been developed to model these habits. J. Page et al. [10] used an inhomogeneous Markov chain to model the transitions between presence (1) and absence (0); the occupancy profile is then generated with the inverse cumulative distribution function (CDF) method. Shide Salim et al. [11] used an inhomogeneous Markov chain to predict transitions from one area to another in a workplace. Data were collected using a real-time locating system (RTLS), and the transition probabilities are a function of the occupant, the weather, and the day of the week. Zhaoxuan Li et al. [12] also applied Markovian modeling to the occupancy profile of a residential building. The transition matrices are estimated by maximum likelihood, and the procedure is optimized using the Pearson divergence test to determine the best training window. The authors compared their method with other models (SVM, ANN, probability sampling) over different prediction horizons (15 min, 30 min, and 24 h). Their model performs better over the 15 and 30 min horizons and comparably over the 24 h horizon. Kabbaj, O.A. et al. [
13] used hidden Markov chains to predict the occupancy state from synthetic occupancy data. In practice, data are often missing or corrupted, for instance because of hardware or network problems. The authors developed a model adapted to this situation and report interesting results on simulated data. Ardeshir Mahdavi et al. [
14] used an empirical method based on computing occupancy frequencies for a given time interval and thresholding them to retain the significant proportions. They relied on the occupancy status of an office obtained from a motion sensor. Their method performs comparably to those of Reinhart [15] and Page et al. [10]. Mohammad Saiedur Rahaman et al. [16] exploited data generated by the employees of a shopping center. Each employee wears a Bluetooth Low Energy beacon that emits a unique ID; four Bluetooth gateways scattered around the mall collect the IDs of nearby beacons, the detection interval, and the variations in the received signal strength indicator (RSSI). This information allows each employee carrying a beacon to be located in time and space (states). The authors compared several machine learning algorithms for determining employee positions from the received signal strength, namely decision tree (DT), random forest (RF), support vector machine (SVM), multilayer perceptron (MLP), and k-nearest neighbors (KNN), and showed that the random forest performs best. Jesica E.M. et al. [
17] used LSTM networks combined with different classification algorithms (SVM, RF, MLP, KNN) to predict the number of occupants in three offices. Environmental data (CO₂, temperature, etc.), the number of occupants, and the consumption of certain appliances were collected. Their strategy was to predict the environmental variables with LSTM networks (one per office) and then classify the predictions. This strategy gives good results, and the random forest outperforms the other classification algorithms. In papers [18,19], LSTM networks are also used to predict the occupancy state. Hamza Elkhoukhi et al. [
20] used an LSTM network to predict the CO₂ concentration and combined this prediction with the ventilation rate, the normal CO₂ concentration in the air, and the CO₂ generation rate per person in a steady-state model to determine the number of occupants. Their model predicts the number of occupants with 70% accuracy. Marina Dorokhova et al. [21] used k-means clustering to estimate the occupancy state and an LSTM network trained on these states to predict the following ones. Their model predicts the occupancy status (presence/absence) with over 97% accuracy. In papers [
22,23,24,25], single-hidden-layer feedforward neural networks called extreme learning machines (ELMs) are used. In an ELM, the weights entering the hidden-layer neurons are generated randomly and are not learned; only the weights connected to the output layer are learned [26]. ELMs show quite good performance in predicting the occupancy state. Having an accurate room occupancy profile is crucial for effective HVAC system control: knowing the occupancy schedule makes it possible to establish a heating and ventilation schedule, and knowing the number of occupants allows this equipment to be controlled more effectively. Yukun Yuan et al. [
27] sought to minimize the power consumption of the system, penalized by a term accounting for occupant comfort. Finally, Seungwoo Lee et al. [28], after predicting arrival times, determined the preheating or ventilation time needed to ensure comfort in the room.
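As a minimal sketch of the inhomogeneous Markov chain approach attributed above to Page et al. [10], the following Python code draws a presence/absence profile by inverse-CDF sampling. The transition probabilities, the 15 min time step, and the function names are illustrative assumptions, not values or code taken from the cited work.

```python
import random


def transition_matrix(hour):
    """Hypothetical time-dependent transition probabilities.

    Row i gives P(next state | current state i), with 0 = absent, 1 = present.
    """
    if 8 <= hour < 18:
        # Assumed daytime behavior: arrivals are likely, departures are rare.
        return [[0.6, 0.4], [0.1, 0.9]]
    # Assumed night-time behavior: the room tends to empty and stay empty.
    return [[0.95, 0.05], [0.5, 0.5]]


def sample_next_state(row, u):
    """Inverse-CDF sampling: first state whose cumulative probability exceeds u."""
    cumulative = 0.0
    for state, probability in enumerate(row):
        cumulative += probability
        if u < cumulative:
            return state
    return len(row) - 1  # guard against floating-point round-off


def simulate_day(initial_state=0, steps_per_hour=4, seed=0):
    """Generate one day of occupancy states (assumed 15 min steps by default)."""
    rng = random.Random(seed)
    state = initial_state
    profile = []
    for step in range(24 * steps_per_hour):
        hour = step // steps_per_hour
        row = transition_matrix(hour)[state]  # the matrix depends on the time of day
        state = sample_next_state(row, rng.random())
        profile.append(state)
    return profile


profile = simulate_day()
print(profile)
```

Because the transition matrix changes with the hour, the chain is inhomogeneous; replacing the two hard-coded matrices with probabilities estimated from measured data would be the natural next step.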
As we have seen, neural networks can predict the occupancy state according to two strategies. The first is to predict environmental variables and then infer the occupancy state from these predictions. The second is to predict the state directly from observed data. In this work, we adopt the second strategy and propose a model for predicting the occupancy state of a building based on a priori labeling and an LSTM network. We use an architecture that links the data of the different rooms of a building to predict the occupancy state of all rooms, without restricting ourselves to only two states (presence/absence). This architecture (provided by TensorFlow) has the advantage of allowing a single network to be fitted for all the rooms of a building, thus avoiding the difficult task of building and tuning a different architecture for each room.
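To illustrate what linking the data of several rooms in a single network can mean in practice, the sketch below arranges multi-room observations into sliding windows with one multi-class label per room. It is only an assumed data layout: the features (CO₂ and temperature), the window length, the three state labels, and the function name build_samples are hypothetical and are not taken from this work.

```python
def build_samples(observations, states, window):
    """Return (inputs, targets) pairs for a sequence model shared by all rooms.

    observations[t][r] is the feature list of room r at time step t;
    states[t][r] is the integer occupancy label of room r at time step t.
    """
    samples = []
    for start in range(len(observations) - window):
        # Concatenate the features of every room at each step of the window.
        inputs = [
            [value for room in observations[t] for value in room]
            for t in range(start, start + window)
        ]
        # Target: the state of every room at the step following the window.
        targets = list(states[start + window])
        samples.append((inputs, targets))
    return samples


# Toy data (hypothetical): 2 rooms, 2 features per room, 5 time steps.
observations = [
    [[420.0, 20.1], [410.0, 19.8]],
    [[480.0, 20.4], [412.0, 19.8]],
    [[650.0, 21.0], [415.0, 19.9]],
    [[700.0, 21.3], [520.0, 20.2]],
    [[610.0, 21.1], [560.0, 20.5]],
]
# Hypothetical labels: 0 = empty, 1 = low occupancy, 2 = high occupancy.
states = [[0, 0], [1, 0], [2, 0], [2, 1], [1, 1]]

samples = build_samples(observations, states, window=3)
for inputs, targets in samples:
    print(len(inputs), "steps x", len(inputs[0]), "features ->", targets)
```

Each input has shape (window, rooms × features) and each target holds one state per room, which is the kind of layout a single recurrent network with one output per room could consume.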