Anticipating fluctuations in electricity demand through energy demand forecasting is of paramount significance to effectively manage energy systems. Historically, predicting energy demand was relatively simple: load followed much the same pattern each day, with gradual peaks in the morning and evening and only modest variation across seasons. However, the rapid deployment of renewables, especially behind-the-meter rooftop solar, along with EV adoption and the electrification of heating and cooling, has introduced new, far less predictable load shifts, since both energy production and demand now depend heavily on weather conditions.
These weather-driven swings in electricity demand over the course of the day will only become more pronounced as renewable energy continues to grow. Moreover, US energy demand is projected to increase 4.7% over the next five years, with peak load demand rising by 38 gigawatts, which does not help the cause either.
Given this anticipated rise in electricity demand and the increased adoption of variable renewable energy sources, it will be critical for utility companies to optimize their power demand forecasting to align with impending fluctuations in the weather. This study aimed to create and compare models capable of predicting energy demand for San Diego based solely on short-term weather forecasts and historical energy data. We focused on two use cases: 1) predicting the next hour of energy demand and 2) predicting the next 24 hours (day-ahead) of energy demand.
To effectively forecast energy demand, our models take the following key variables as input: day of the year, day of the week, hour, historical energy demand, temperature, and solar radiation. The day, time, and temperature variables cover the expected seasonal and daily variations in energy demand, while solar radiation indicates energy production from residential solar. Historical weather data was obtained from the National Oceanic and Atmospheric Administration (NOAA) API, and energy demand data was collected from the US Energy Information Administration (EIA).
Air Temp Data
Electricity Demand Data
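To make the feature set concrete, below is a minimal pandas sketch of how these inputs could be assembled from an hourly dataset. The file name and column names (timestamp, temperature, solar_radiation, demand_mwh) are hypothetical placeholders rather than the exact schema we used.

```python
import pandas as pd

# Hypothetical file and column names; the real dataset schema may differ.
df = pd.read_csv("san_diego_hourly.csv", parse_dates=["timestamp"])

# Calendar features that capture seasonal and daily cycles
df["day_of_year"] = df["timestamp"].dt.dayofyear
df["day_of_week"] = df["timestamp"].dt.dayofweek
df["hour"] = df["timestamp"].dt.hour

# The six model inputs: calendar features, weather, and demand
features = df[["day_of_year", "day_of_week", "hour",
               "temperature", "solar_radiation", "demand_mwh"]]
```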
For this study, we built a Long Short-Term Memory (LSTM) model, chosen for its ability to handle long time-series data, and evaluated its effectiveness at forecasting electricity demand. Recurrent Neural Networks (RNNs), in general, feed vectors of related data called sequences into a “looped-layered” network to predict desired scalar or vector outputs. An LSTM follows the classic neural network structure, but instead of plain neurons passing input to output within its hidden layers, it uses LSTM recurrent units. These units, in addition to passing input to output, retain key vectors of previously inputted data that resemble short- and long-term memory.
Basic 1 Layer LSTM Network
To dive a bit deeper, the LSTM unit retains its “memory” through two states: the cell state (long-term memory) and the hidden state (short-term memory). The cell state holds a vector of older time-step data that the unit deems important, local to the unit, while the hidden state holds a vector of data from the previous time-step. An LSTM unit at time-step t therefore works by first using the previous hidden state (t-1) and input x(t) to update the cell state from (t-1) to (t). Next, the unit uses cell state (t) to create hidden state (t), which is then passed on to the unit at the next time-step. The LSTM unit uses a combination of feedforward neural network gates to manage the information retained in each of these memory states: the Forget, Input, and Candidate gates manage what information is deleted from and inserted into the cell state (long-term memory), while the Output gate merges information from the cell state and the prior time-step’s hidden state to create the new hidden state. The gates take advantage of the sigmoid and tanh functions, which scale values to the ranges 0 to 1 and -1 to 1, respectively, to control the degree of memory management for each element of the memory vectors.
LSTM Recurrent Unit
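To make the gate arithmetic concrete, here is a minimal NumPy sketch of a single LSTM time-step. It is purely illustrative; in practice the deep learning library handles these updates internally. The weight containers W, U, and b are assumed to hold one matrix/vector per gate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time-step. W, U, b are dicts holding the weights for the
    forget (f), input (i), candidate (g), and output (o) gates."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # what to erase from long-term memory
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # how much new information to write
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # candidate values to write
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # what to expose as short-term memory
    c_t = f * c_prev + i * g          # updated cell state (long-term memory)
    h_t = o * np.tanh(c_t)            # updated hidden state (short-term memory)
    return h_t, c_t
```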
Zooming out, our LSTM model takes the weather and energy demand data from the past 24 hours, along with the weather forecast for the forthcoming hour (or 24 hours), as input to forecast the energy demand for the next hour or next 24 hours. This data is structured into two-dimensional arrays, the sequences mentioned before, where each row holds one hour's set of weather and energy demand data. LSTMs, like RNNs in general, then take the input sequence to predict the output variable, in our model’s case the next hour or 24 hours of energy demand.
To best incorporate the historical (past 24 hours) and future (next hour or 24 hours) data as inputs into the LSTM model, a sequence was made for each: the historical sequence is a 24 x 6 array composed of the five calendar and weather variables plus energy demand, while the future sequence is a 1 x 5 array for the hourly case, or a 24 x 5 array for the 24-hour case, composed only of the forecasted weather values.
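Continuing the earlier sketch, the sequences could be assembled along the following lines. The helper below is illustrative: the column names are the hypothetical ones used earlier, and the "future" weather values are taken from the historical record as a stand-in for actual forecasts.

```python
import numpy as np

def build_sequences(features, weather_cols, horizon=1, lookback=24):
    """Build (historical, future) input pairs and demand targets.
    `features` is the hourly DataFrame from before; `weather_cols`
    are the five calendar/weather columns (no demand)."""
    hist_cols = weather_cols + ["demand_mwh"]
    X_hist, X_future, y = [], [], []
    for t in range(lookback, len(features) - horizon):
        X_hist.append(features[hist_cols].iloc[t - lookback:t].to_numpy())      # 24 x 6
        X_future.append(features[weather_cols].iloc[t:t + horizon].to_numpy())  # horizon x 5
        y.append(features["demand_mwh"].iloc[t:t + horizon].to_numpy())         # next hour(s) of demand
    return np.array(X_hist), np.array(X_future), np.array(y)

X_hist, X_future, y = build_sequences(
    features,
    weather_cols=["day_of_year", "day_of_week", "hour", "temperature", "solar_radiation"],
    horizon=1,  # set horizon=24 for the day-ahead case
)
```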
Considering the varying importance of each input sequence, the LSTM model was architected to process each sequence through its own branch; each branch consists of three LSTM layers and three dropout layers. By splitting the input into two branches, the model can better capture temporal dependencies in the historical context while also considering the influence of the future predictors. The branches are then merged at the end of the network to integrate the learned representations before making a final prediction.
LSTM Model Structure
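A sketch of this two-branch architecture, written with the Keras functional API, is shown below. The layer widths and dropout rate are assumptions for illustration; only the overall structure, three LSTM and three dropout layers per branch merged before the output layer, mirrors the description above.

```python
from tensorflow.keras import layers, Model

def build_model(lookback=24, horizon=1, n_hist_feats=6, n_future_feats=5):
    # Branch 1: past 24 hours of calendar/weather variables plus demand
    hist_in = layers.Input(shape=(lookback, n_hist_feats))
    x = layers.LSTM(64, return_sequences=True)(hist_in)  # layer widths are assumptions
    x = layers.Dropout(0.2)(x)
    x = layers.LSTM(64, return_sequences=True)(x)
    x = layers.Dropout(0.2)(x)
    x = layers.LSTM(32)(x)
    x = layers.Dropout(0.2)(x)

    # Branch 2: forecasted weather for the next hour(s)
    fut_in = layers.Input(shape=(horizon, n_future_feats))
    y = layers.LSTM(32, return_sequences=True)(fut_in)
    y = layers.Dropout(0.2)(y)
    y = layers.LSTM(32, return_sequences=True)(y)
    y = layers.Dropout(0.2)(y)
    y = layers.LSTM(16)(y)
    y = layers.Dropout(0.2)(y)

    # Merge the learned representations and predict demand for each horizon step
    merged = layers.concatenate([x, y])
    out = layers.Dense(horizon)(merged)
    return Model(inputs=[hist_in, fut_in], outputs=out)
```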
The entire dataset was split into training and testing subsets, with the former comprising 80% of the data. The model was then trained on the training subset for 20 epochs using the Adam optimizer to update the model's weights through backpropagation. Training ended with a mean squared error loss below 0.005%.
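In code, the split and training loop could look roughly like this. The learning rate and batch size are assumptions; only the 80/20 split, 20 epochs, Adam optimizer, and MSE loss come from the setup above. The split is kept chronological to avoid leaking future information into training.

```python
from tensorflow.keras.optimizers import Adam

# Chronological 80/20 split (assumed; shuffling would leak future data)
split = int(0.8 * len(X_hist))
train = ([X_hist[:split], X_future[:split]], y[:split])
test = ([X_hist[split:], X_future[split:]], y[split:])

model = build_model(horizon=1)
model.compile(optimizer=Adam(learning_rate=1e-3), loss="mse")
model.fit(train[0], train[1], epochs=20, batch_size=32)
```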
With the model trained, we provided it with the unseen test dataset to see how it performed at predicting the next hour of energy demand. A root mean squared error of 2.53% on the test dataset indicates that the model performed extraordinarily well. Despite the high intra-day volatility of demand, driven mainly by behind-the-meter rooftop solar generation, the model captures the early-afternoon dip and the sharp ramp toward the nightly peak very accurately.
Last Two Weeks of Predicted Energy Demand Against Test Dataset
(RNN Hour-Ahead Model)
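For reference, the headline error figure can be computed along these lines; here the RMSE is expressed relative to mean demand to obtain a percentage, which is one common convention (the exact normalization may differ).

```python
import numpy as np

y_pred = model.predict(test[0])
y_true = test[1]

rmse = np.sqrt(np.mean((y_pred.ravel() - y_true.ravel()) ** 2))
# Express the error relative to mean demand to obtain a percentage figure
rmse_pct = 100 * rmse / np.mean(y_true)
print(f"Hour-ahead RMSE: {rmse_pct:.2f}% of mean demand")
```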
When analyzing the hourly root mean square error for the hour-ahead model, the model exhibits its highest errors during the mid-day demand dip, primarily attributable to behind-the-meter (BtM) solar production. This is contrary to expectations of weaker performance during demand peaks. It suggests that the solar radiation and temperature data fed into the model don't provide complete insight into the actual behind-the-meter production that reduces the system's demand.
RMSE for each Hour of Day (RNN Hour-Ahead Model)
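The hourly error profile behind this chart can be reproduced with a simple group-by over the hour of day; test_hours below is a hypothetical array holding the hour-of-day of each test sample, taken from the timestamps aligned with the test set.

```python
import pandas as pd

errors = pd.DataFrame({
    "hour": test_hours,                              # hypothetical hour-of-day labels (0-23)
    "sq_err": (y_pred.ravel() - y_true.ravel()) ** 2,
})
hourly_rmse = errors.groupby("hour")["sq_err"].mean().pow(0.5)
print(hourly_rmse)
```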
Our day-ahead model aims to forecast the entire day's demand curve. The impact of this change is evident in the notable increase in error. The model underestimates both the daily dip and the peak in demand, primarily due to the absence of previous-hour demand feedback, which would enable continuous refinement of its predictions. However, it's worth noting that during midnight and the early morning hours, the model continues to perform admirably.
Last Two Weeks of Predicted Energy Demand Against Test Dataset
(RNN Day-Ahead Model)
Analyzing the hourly root mean square error (RMSE) values for the day-ahead model reveals a pattern akin to the hour-ahead model, with the highest error values occurring during hours of photovoltaic (PV) production. The overall error values are higher than those of the hour-ahead model, which aligns with our expectations. Notably, a significant deviation is observed at midnight, where the error spikes considerably. This anomaly is attributable to the model's structure: under the current setup, each day is treated as a distinct block of information, so for hour 0 the model lacks past data to refine its prediction, leading to heightened errors. This limitation highlights a clear drawback of the model, one that could be mitigated through adjustments to our sequence structure.
RMSE for each Hour of Day (RNN Day-Ahead Model)
Our exploration of various models and structures has provided valuable insights into their performance characteristics, strengths, and weaknesses. Each model demonstrated satisfactory performance, yet with distinct advantages and limitations. For instance, extending the forecasting horizon for our neural network model revealed an accuracy trade-off, a common challenge in predictive modeling.
Remarkably, the hour-ahead neural network model emerged as a standout performer with an RMSE of less than 3%. However, there remains room for improvement in error reduction.
A significant challenge observed across all models is the inadequate representation of behind-the-meter solar generation in our input data. This deficiency contributes to increased RMSE values during solar production hours. To address this issue, one proposed strategy is to refine our dataset by selectively trimming earlier data and focusing on recent years such as 2023, during which installed capacity has grown exponentially. This adjustment aims to better capture the correlation between solar radiation levels and the consequent decrease in electricity load due to behind-the-meter generation.
Furthermore, considering datasets that include information on installed rooftop solar capacity could provide valuable insights. Although such data may not integrate cleanly with our current inputs at first, it could enhance our modeling of intraday demand dips.
Another avenue for improvement involves adopting a strategy that merges multiple models and weights their predictions, as sketched below. This approach can help mitigate inefficiencies, particularly evident in the day-ahead model's struggle to accurately predict load for midnight (hour 0 of our sequence structure).
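As a rough illustration of the weighting mechanics, the sketch below blends predictions from two hypothetical models in inverse proportion to their validation RMSE; the prediction and RMSE variables are placeholders, not results from this study.

```python
import numpy as np

def blend(preds, rmses):
    """Weight each model's prediction by the inverse of its validation RMSE,
    so the historically more accurate model contributes more."""
    weights = 1.0 / np.asarray(rmses, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, preds))

# Hypothetical usage: combine two candidate forecasts for the same horizon
combined = blend([pred_model_a, pred_model_b], [rmse_model_a, rmse_model_b])
```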