Tutorial: Visualising Solar Forecast Data Using Python and Web APIs

This tutorial will show you how to visualise solar forecast data using Python and web APIs.


Web APIs are becoming an increasingly popular standard for communicating data. The days of CSV downloads, file transfers, and even web-scraping are numbered, and for good reason. Before APIs can be used and integrated into any system or process, however, there’s usually an element of investigation and play that needs to happen.

What You'll Need For This Tutorial

Understanding how an API works, what data it contains, and how that data is structured and formatted are all things that have to happen upfront. One technique that helps is collecting the data and transforming it into a state that is ready to be visualised. That’s what we’ll cover today 😎

For this tutorial, we’re going to create a Python script which pulls Belgian solar forecast data via an API, post-processes it into a pandas DataFrame and finally visualises the output as a time series graph.

Why solar forecasts? They’re an interesting time-based dataset to work with, since they’re critical for many different actors in the energy sector: grid operators, renewable asset owners, operators and developers, traders, smart buildings, energy management systems, and the list goes on.

So let’s begin!

 

Querying the Elia solar forecast API using Postman

Before we get into any coding, let’s first take a look at the API which Elia (the Belgian Transmission System Operator) has created. On the technical operations page you can see that there’s only one operation/endpoint to integrate and a variety of parameters we can play with.

Let’s boot up Postman and create a request using the following query parameters:

We also need to include an API subscription key (retrieved under the “My Subscriptions” area on the re.alto portal). This needs to be included in your request header with the following key-value pair:

"OCP-Apim-Subscription-Key" : "YOUR-API-TOKEN"
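
To make the request concrete, here is a minimal sketch of the same call described with Python's standard library, without actually sending anything. The endpoint URL, the parameter names (dateFrom, dateTo, sourceId) and their values are the ones used in the script later in this tutorial; YOUR-API-TOKEN is a placeholder.

```python
from urllib.parse import urlencode

# Endpoint and parameter values as used in the script later in this tutorial.
endpoint = "https://api.realto.io/elia-sf-BE/GetChartDataForZone"
params = {"dateFrom": "2021-03-07", "dateTo": "2021-03-15", "sourceId": 1}

# The subscription key travels in a request header, not in the URL.
headers = {"OCP-Apim-Subscription-Key": "YOUR-API-TOKEN"}  # placeholder token

# The full URL once the query parameters are filled in.
url = f"{endpoint}?{urlencode(params)}"
print(url)
```

In Postman, the three parameters go in the query parameters tab and the key-value pair goes in the headers tab.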

Hit run and your Postman client should get a response. For example:

Okay great, so now we have a fairly good idea of what data the API contains and what post-processing we’ll need to do before we can visualise the output.
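
The structure of the response can be sketched roughly as follows. The field names are the ones the script in this tutorial relies on. The numeric values and the exact form of the date string are made-up placeholders for illustration, not real Elia data, and the real response may carry additional fields.

```python
import json

# Illustrative payload only: field names come from the script in this tutorial,
# the values are placeholders and not real forecast data.
sample_response = """
{
  "SolarForecastingChartDataForZoneItems": [
    {
      "StartsOn": {"DateTime": "/Date(1615075200000)/"},
      "WeekAheadForecast": 0.0,
      "DayAheadForecast": 0.0,
      "MostRecentForecast": 0.0,
      "RealTime": 0.0
    }
  ]
}
"""

payload = json.loads(sample_response)
items = payload["SolarForecastingChartDataForZoneItems"]

# Each item is one time step; the timestamp is nested inside "StartsOn".
first = items[0]
print(sorted(first.keys()))
print(first["StartsOn"]["DateTime"])
```

Two things stand out: the items live under a single top-level key, and the timestamp is a nested, encoded string rather than a plain date. Both need handling before anything can be plotted.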

Step One - Setting Up the Python Script and Details

Over to our Python script. Let’s start off by defining our imports and variables.

## script.py

# imports

import requests
import pandas as pd
import re
import matplotlib.pyplot as plt
import seaborn as sns

# inputs

token = 'YOUR_API_TOKEN_HERE'
endpoint = 'https://api.realto.io/elia-sf-BE/GetChartDataForZone'
sourceId = 1 # 1 = Belgium
dateFrom = '2021-03-07'
dateTo = '2021-03-15'

 

A note on the packages used here: "requests" calls the API, "re" runs the regular expression needed during post-processing, pandas provides the data structure, and seaborn (which builds on matplotlib) creates the time series plot at the end.

Step Two - Fetching and Preparing the Data

Next, let’s query the API and set up our DataFrame:

# call API and set-up DataFrame

response = requests.get(
    endpoint,
    headers={'OCP-Apim-Subscription-Key': token},
    params={'dateFrom': dateFrom, 'dateTo': dateTo, 'sourceId': sourceId},
)
response.raise_for_status()  # stop early if the API returns an HTTP error status
payload = response.json()
data = payload['SolarForecastingChartDataForZoneItems']
df = pd.DataFrame(data)

Here we’re calling the API and then taking the “SolarForecastingChartDataForZoneItems” array and converting it into a Pandas DataFrame. Let’s take a look at what that produces:
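
If the nested StartsOn column gets in the way while exploring, one optional alternative is pandas' json_normalize, which flattens nested objects into dotted column names. The sketch below uses a single placeholder item in the assumed shape of the response; the values are not real data.

```python
import pandas as pd

# Placeholder item in the assumed response shape; values are not real data.
sample_items = [
    {
        "StartsOn": {"DateTime": "/Date(1615075200000)/"},
        "WeekAheadForecast": 0.0,
        "DayAheadForecast": 0.0,
        "MostRecentForecast": 0.0,
        "RealTime": 0.0,
    }
]

# pd.DataFrame keeps StartsOn as a column of dicts ...
nested = pd.DataFrame(sample_items)
print(type(nested.loc[0, "StartsOn"]))

# ... while json_normalize expands it into a "StartsOn.DateTime" column.
flat = pd.json_normalize(sample_items)
print(flat.columns.tolist())
```

The rest of this tutorial sticks with pd.DataFrame and reads the nested value row by row, so this is only a convenience for inspection.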

Step Three - Plotting a Time Series Graph and Visualising the Output

One last thing we need to do is add a proper timestamp to each row. The StartsOn column currently holds a nested object whose DateTime field wraps a Unix epoch value (in milliseconds) inside a JSON date string. Let’s go through each row, pull the digits out of that string with a regular expression, convert them into a pandas timestamp, and finally add the result to the DataFrame as a new column.

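# create timestamps

dates = []

for index, row in df.iterrows():
    # first run of digits in the string = epoch milliseconds; cast to int before converting
    millis = int(re.findall(r"\d+", row['StartsOn']['DateTime'])[0])
    dates.append(pd.to_datetime(millis, unit='ms'))

df['DateTime'] = dates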
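
To see what the regular expression is doing in isolation, here is a small standard-library sketch. It assumes the DateTime strings look like /Date(1615075200000)/; that sample string is hypothetical, and the real payload may differ (for example by carrying a timezone offset after the digits). The helper name parse_json_date is made up for this example, and treating the epoch as UTC is also an assumption.

```python
import re
from datetime import datetime, timezone


def parse_json_date(value: str) -> datetime:
    """Convert a '/Date(<milliseconds>)/' string into a UTC datetime."""
    match = re.search(r"\d+", value)  # first run of digits = epoch milliseconds
    if match is None:
        raise ValueError(f"No epoch value found in {value!r}")
    millis = int(match.group())
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# Hypothetical input string in the assumed format.
print(parse_json_date("/Date(1615075200000)/").isoformat())
# 2021-03-07T00:00:00+00:00
```

The loop in the script does the same extraction per row, then lets pandas do the conversion with unit='ms'.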

Once that’s done the data is ready to be visualised!

Plotting the Graph:

There are a number of Python packages for visualising data; the one we’re using is seaborn. Check out the visuals in its examples gallery for inspiration. Since our DataFrame is set up, getting a time series graph is simply a case of choosing which columns to plot, plus any settings we want to play with (e.g. colours, titles, sizes). Let’s keep it simple:

# visualising

new_df = df[['WeekAheadForecast', 'DayAheadForecast', 'MostRecentForecast', 'RealTime', 'DateTime']].set_index('DateTime')
sns.set_style("darkgrid")
fig, ax = plt.subplots(figsize=(10, 10))
sns.lineplot(data=new_df, ax=ax, palette="tab10", linewidth=2.5)
plt.show()  # needed when running as a script rather than in a notebook

This outputs the following graph:

You can see the forecasts become more accurate the closer you are to the forecast date… Who would have guessed?
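
One way to put a number on that observation is the mean absolute error of each forecast against the measured values. The sketch below runs on a tiny placeholder DataFrame whose values are invented so that the example is self-contained; it demonstrates the calculation only and says nothing about the real data. Swapping in the DataFrame built above would give the actual figures.

```python
import pandas as pd

# Placeholder values for illustration only; not real Elia measurements.
sample = pd.DataFrame({
    "WeekAheadForecast": [100.0, 220.0, 310.0],
    "DayAheadForecast": [110.0, 205.0, 290.0],
    "MostRecentForecast": [118.0, 201.0, 302.0],
    "RealTime": [120.0, 200.0, 300.0],
})

forecast_columns = ["WeekAheadForecast", "DayAheadForecast", "MostRecentForecast"]

# Mean absolute error of each forecast against the measured values.
# Rows where either value is missing are skipped by pandas' mean().
mae = {
    column: (sample[column] - sample["RealTime"]).abs().mean()
    for column in forecast_columns
}

for column, error in mae.items():
    print(f"{column}: {error:.1f}")
```

A lower value means the forecast tracked the measurements more closely over the selected period.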

And voila, that’s it!

 

In Summary:

We queried the solar forecast API, post-processed the data, and visualised the output on a time series graph.

We hope this tutorial proved useful to you. Feel free to reach out to us with any questions you may have.

 

The full Python script

You can find a copy of our script on our GitHub page.