Friday, 30 October 2020

PyBloom coding project part 7: Plotting graphs from the data

Previously, I set the coding environments, and wrote the code to generate the weather observation data. Here in part 7, I create the graphs of the temperature data.

Generate graphs function

Whilst the point of this program is to look at the current temperature outside just by glancing at my Hue lamps, I also want to be able to look at the history of temperature measurements. I could’ve logged the temperature measurements into a csv file and examine them as a spreadsheet, but I wanted to present them in a more accessible way = data visualisation.

Python has many powerful visualisation libraries, and I’ve spent some time with MatPlotLib (with its Seaborn) wrapper, and find it difficult because it’s comprehensive. Looking at PyGal, it seems much simpler to work with, so I decided to use it for my program.

The technologies

PyGal
Custom CSS

The code

Observation sets

def generate_graphs(timestamp):

# observation sets

First things first, let’s set the points that mark the bounds of the observation data. I’m interested in data over the last day, over the last week and over the last month.

now = datetime.strptime(timestamp, DATETIME_STRING)

If you remember from earlier, maths with dates and times is surprisingly difficult, but using the datetime module makes it all easier. However, both Python and SQLite have their own implementation of datetime, so we need to make sure both use the same data dictionary.

DATETIME_STRING defines the format (data dictionary)
The .strptime method parses a string into a datetime object according to the format

This statement creates a datetime object now from a timestamp string. This might seem puzzling as there’s a perfectly good datetime.now() function, but this is a different now object. It’s not the up-to-the-microsecond now() from the operating system, but the now of the most recent weather observation.

last_day = now - timedelta(days=1)

last_week = now - timedelta(weeks=1)

last_month = now - timedelta(weeks=4)

The timedelta module fulfils the promise of simpler maths, with simple-to-understand syntax.

observation_sets = {

'last_day': last_day,

'last_week': last_week,

'last_month': last_month

}

Having calculated the observation points, I put them in an easy-to-access Python dictionary.

Fetching the data

rows = get_rows('colours')

There are two sets of data that we’re interested in: the colour values (which will map onto the temperatures) and the temperatures themselves. This first get_rows fetches all the columns from the colours table. It’s important to note that the results are returned as a list of tuples. This is going to be hard to manipulate later as we want a list of hex values - but hex values that PyGal understands (i.e. prefixed by #), which isn’t the same as Python type hex (i.e. prefixed by 0x). We therefore need to parse the results.

hex_list = [f'#{hex}' for hex in [row['hex_value'] for row in rows]]

We use a combination of nested list comprehension and f-strings to do this conversion:

[row['hex_value'] for row in rows] : this list comprehension cycles through the rows, and creates a list of the hex values
for hex in [row['hex_value'] … : an outer list comprehension cycles through the newly-created list
[f'#{hex}' for hex … : each item of this outer list is inserted into the output string (in effect inserting the hash sign prefix for each element), to create the final list

temps_count = {row['temperature']: 0 for row in rows}

This second command is to set up histogram bins for each of the temperature thresholds, using a bit of dictionary comprehension. Now we have the temperature data in more accessible formats, let’s move onto the observation data.

for string, then in observation_sets.items():

Since we want a similar graph for each of three different time periods, we can save a bit of effort by iterating through the observation points (then). Thanks to the dictionary, we can access both the data and the string representation, which we’ll need in a moment.

sql = 'WHERE timestamp BETWEEN datetime((?)) AND datetime((?))'

when = (then, now)

results = get_rows('observations', rows_sql=sql, args=when)

Again, our little get_rows utility makes accessing the database a little easier. The double brackets look a bit odd. The variable replaced by the underlying SQLite .execute method is represented by (?). However, they are themselves parameters into SQLite’s own datetime method, which is why they need to be enclosed in another set of brackets.

The underlying SQLite query uses its own datetime method to convert Python datetime objects into SQL datetime strings. SQL doesn’t ask for the data dictionary explicitly; it’s assumed that the string corresponds to one of the accepted formats.

Once we have the bounds correctly interpreted into the query, we can select only the data that we’re interested in. This is the reason I chose SQLite over csv. If we’d used a csv file to store all the data, I’d have to read all the data into a dataframe into memory, then search it for the data using a library such as Pandas. Not a big overhead, but given SQLite is built into Python, it's an opportunity to import one less library.

Generating the bar graphs

times = [row['timestamp'] for row in rows]

temps = [row['temperature'] for row in rows]

The database gives us a list of dictionaries; each dictionary contains a timestamp and a measurement. We need to convert these multiple lists into two lists, one containing all the timestamps, and another containing all the measurements. This is akin to transposing rows and columns of a table. This is simply done using some list comprehension.

bar_chart = pygal.Bar(x_label_rotation=20,

x_labels_major_count=6,

show_minor_x_labels=False,

show_legend=False)

Now we get into using PyGal to generate the graph. This first statement instantiates the chart object. It’s at this point that we declare it’s a bar graph, and also tell it to rotate the x labels (so that they’ll fit), limit x-axis labels to 6 in total (otherwise they’ll look cluttered), and remove the legend (because there’s only one series being shown).

bar_chart.add('Temperature', [

{'value': temp,

'color': '#' + lookup_colour(find_temp_threshold(temp))}

for temp in temps]

)

Now we add the data series, and its title. To make the colours meaningful, the colour of the bar matches the Bloom colour. This means adding each value to be plotted individually, as a dictionary which describes the value and the colour. The colour is calculated using our previously defined functions, remembering to prefix with a hash symbol. We loop all the temperature measurements using list comprehension.

bar_chart.x_labels = times

The final piece of formatting is to add labels for the x axis, using a nice simple syntax.

filename = string + '_bar.svg'

bar_chart.render_to_file(FILEPATH + filename)

We want the filename to reflect the data set, so this little snippet takes the string representation of the data from the dictionary key, creates a full file path prepending the FILEPATH = './app/static/' global constant, and saves the generated graph there.

Generating the pie charts

We want to generate one pie chart for every bar chart (just in case we need them). The first thing we need to do is to count the number of occurrences of temperature in the bins that were previously generated.

for temp in temps:

temp_threshold = find_temp_threshold(temp)

temps_count[temp_threshold] += 1

The algorithm is simplified as we’re done most of the prep work beforehand. For each temperature reading in this observation set, we first find its corresponding temperature threshold, then increment the count in its bin.

custom_style = Style(colors=(tuple(hex_list)))

We can reuse the hex values from the colours table to create a custom colour key for the pie chart. These values are in the form of a list of strings, each prefixed with a hash, which we did earlier. To be properly formatted for PyGal, we now need to convert this list into a tuple, then pass it to the Style function as one of the optional keyword arguments.

pie_chart = pygal.Pie(inner_radius=0.6,

style=custom_style)

Instantiating the pie chart is straightforward; the only configuration parameters are the size of its donut hole and the colours of the sections.

for temp, count in temps_count.items():

pie_chart.add(str(temp), count)

Each pie segment needs to be added individually, so we have a little loop.

filename = string + '_pie.svg'

pie_chart.render_to_file(FILEPATH + filename)

Finally, let’s save the chart to be used by the web app later.

Wrapping up

Now we’ve done it once, we can repeat it for each of the remaining observation sets.

return 'Created graphs'

The database utility has taken care of closing the connection to the database, so all that’s left is to return an acknowledgement string for debugging purposes. The graphs themselves are already stored (or overwritten) in the predefined folder within the loop.

Putting it together

def generate_graphs(timestamp):

# observation sets

now = datetime.strptime(timestamp, DATETIME_STRING)

last_day = now - timedelta(days=1)

last_week = now - timedelta(weeks=1)

last_month = now - timedelta(weeks=4)

observation_sets = {

'last_day': last_day,

'last_week': last_week,

'last_month': last_month

}

# get datapoints from database

rows = get_rows('colours')

hex_list = [f'#{hex}' for hex in [row['hex_value'] for row in rows]]

temps_count = {row['temperature']: 0 for row in rows}

# 3x graphs for every reading in last day, week, month

for string, then in observation_sets.items():

# fetch data

sql = 'WHERE timestamp BETWEEN datetime((?)) AND datetime((?))'

when = (then, now)

results = get_rows('observations', rows_sql=sql, args=when)

# generate bar graph

times = [row['timestamp'] for row in rows]

temps = [row['temperature'] for row in rows]

bar_chart = pygal.Bar(x_label_rotation=20,

x_labels_major_count=6,

show_minor_x_labels=False,

show_legend=False)

bar_chart.add('Temperature', [

{'value': temp,

'color': '#' + lookup_colour(find_temp_threshold(temp))}

for temp in temps]

)

bar_chart.x_labels = times

filename = string + '_bar.svg'

bar_chart.render_to_file(FILEPATH + filename)

# generate pie chart

for temp in temps:

temp_threshold = find_temp_threshold(temp)

temps_count[temp_threshold] += 1

custom_style = Style(colors=(tuple(hex_list)))

pie_chart = pygal.Pie(inner_radius=0.6,

style=custom_style)

for temp, count in temps_count.items():

pie_chart.add(str(temp), count)

filename = string + '_pie.svg'

pie_chart.render_to_file(FILEPATH + filename)

return 'Created graphs'

We've covered the final functional module in this part. They now need to be strung together, which is what I'll cover in part 8 when I talk about orchestrating the code. Also visit https://github.com/Schmoiger/pybloom for the full story.

Thursday, 29 October 2020

PyBloom coding project part 6: Setting the colours

We're into part 6 of this project, and I figure out how to set the colours of the Philips Hue lamps. Of course it's more complicated than I first thought.

Hue lamp object

This is the module that accesses the colour light bulbs, Philips Hue Bloom, and changes the colour according to the temperature observation. The observations are accessed from the persistent SQLite database. The module then makes use of a Python library that wraps the Hue API, and a couple of utilities that do the maths to convert colours expressed as CSS hex values into xy values that the Hue lamps understand.

The technologies

Philips Hue API
Qhue library
Rgbxy library
SQLite database

The code

Initialisation

class hue_lamp:

def __init__(self, lamp_id):

# not accessible

ip = HUE_IP

username = HUE_USERNAME

Instantiating a hue_lamp object creates the accessors (getters and setters) for the lamp colours. These first two variables within the object are only needed during initialisation, in order to set up the connection to the Hue Bridge, so aren't made available in an object method.

Time for a detour to get to know the Hue API. Philips encourages the hacker community and has comprehensive developer documentation on its website, which walks through setup. Whilst the API is a RESTful interface, you don’t actually send messages to the cloud. Instead, you communicate with the Hue Bridge, which is somewhere in your internal network, so there are none of the security issues of the big bad internet.

self.bridge = qhue.Bridge(ip, username)

The Bridge is identified uniquely by its IP address. It recognises your username as trustworthy, and allows access to its methods. How it decides you are trustworthy is simple and elegant - when you set up the username on the Bridge, you have to press a button on the Bridge at the same time, therefore proving that you are in possession of it and that it’s in the same internal network.

self.getter = self.bridge.lights[lamp_id]()

self.setter = self.bridge.lights[lamp_id]

The actual data object was a little difficult to get my head around, and it took a bit of trial and error to get the syntax for the getter and setter methods right. The getter is a function that returns a Python dictionary of all the resources associated with the bulb. This dictionary is created by Qhue from the JSON returned in the API. Here’s an example JSON response:

{

"state": {

"hue": 50000,

"on": true,

"effect": "none",

"alert": "none",

"bri": 200,

"sat": 200,

"ct": 500,

"xy": [0.5, 0.5],

"reachable": true,

"colormode": "hs"

"type": "Living Colors",

"name": "LC 1",

"modelid": "LC0015",

"swversion": "1.0.3"

}

The setter is different as it reflects the difference in the API. The API allows sending a PUT message to the URL of the root of the bulb; you don’t have to access the URL of the actual setting.

self.is_on = self.getter['state']['on']

self.colour = self.getter['state']['xy']

self.name = self.getter['name']

The final step in initialisation is to make a few of the statuses easier to access. Whether the light is on or not is a boolean within the state dictionary, which is itself within the returned dictionary. The colour (in xy values) is a list within the same dictionary. To access a dictionary within a dictionary you can simply chain the key names. By contrast, the name is a string in the returned dictionary, so no chaining is needed.

Representation string

def __str__(self):

if self.is_on:

status = 'on and is set to xy:' + str(self.colour)

As usual, I’ve written a representation string to aid in debugging. If the light is on, we have access to information on the state and colour.

else:

status = 'off'

If the light is off, we don’t know its colour, so the status information is different.

return f'{self.name} is {status}'

The return string uses my favourite f-string Python function to construct the message.

Setter methods

def turn_on(self):

self.setter.state(on=True)

return 'Lamp turned on'

def turn_off(self):

self.setter.state(on=False)

return 'Lamp turned off'

These first two methods simplify the (already quite simple) way to to turn lights on and off, to make it more readable.

def set_colour(self, colour): # colour is a tuple of xy values

self.setter.state(on=True)

self.setter.state(xy=colour)

return 'Lamp changed colour'

This final setter method first turns on the lamp if it isn’t already on, then sets its colour. Perhaps obviously, if the lamp is off, changing its colour alone will not make the bulb show the colour; the state has to be set to on as well.

The colours lookup table

At its core, there is a lookup table that converts temperature to colour. Since this colour is to be shown both on the Hue lamps and on the accompanying website, I decided to store this table externally. I had experience of using csv files for this purpose, but I decided to use SQLite. Both are supported natively by Python, but SQLite has some built-in methods which will be useful later on. This first utility function pulls this lookup table from the external database.

def lookup_colour(temp):

This helper function takes in a temperature value, and will return a corresponding rgb hex value.

sql = 'WHERE temperature = (?)'

what = (temp, ) # tuple with single item

results = get_rows('colours', 'hex_value', rows_sql=sql, args=what)

We now use the second utility function that accesses our database. It’s quite a flexible function that handles opening and closing the connection to the SQLite database, and creating the SQLite query from filter criteria. Later on, we’ll get into the detail of how this utility is created, but for now, let’s see why this particular query was constructed.

The SQLite query we’re trying to construct is:

SELECT ‘hex_value’

FROM ‘colours’

WHERE temperature = temp

hex_value is the column which contains the data we’re interested in
colors is the table that contains all the data
temperature is the row which we want to filter
= temp is the condition we want to apply to the row

Under the hood, the utility function makes use of the SQLite .execute(sql, args_opt) method. If the SQLite query contains one or more (?), SQLite3 replaces the ? with the arguments from the second half of the method. This is a neat way of passing parameters from Python to SQLite.

There are a fair few syntax wrinkles to keep track of. The SQLite must be a string, the ? within that string must be enclosed by brackets, and the arguments must be presented as a tuple. In this case, we’re passing a single parameter, the temperature. Because it’s a tuple of a single value, we need the trailing comma to make it work - it’s one of those Python things that tripped me up the first time I came across it.

return results[0][0] # to return the hex value string only

The function returns a table as a list of tuples, where each tuple is a row. In this case, we’re getting a table of a single value [(hex_string, )]. Both lists and tuples are iterable, so we can chain index values to access the string directly. It looks awkward, though.

Calculating the temperature to look up

def find_temp_threshold(temp):

rows = get_rows('colours', 'temperature')

all_thresholds = [row['temperature'] for row in rows]

First, let’s get all the temperature thresholds from our external table using our utility database access function. As we saw in the previous section, the result is a table comprising of a list of dictionaries. Each value can be accessed using an index or by its column label. In the last function, I used the index to access the value for brevity but in this function I use the label as it’s more readable. From the database setup (described later), we know that this is an integer which means we can use normal list methods.

max_threshold = max(all_thresholds)

min_threshold = min(all_thresholds)

Finding the maximum and minimum threshold temperatures are now easy to find.

Calculating the colour

As mentioned earlier, we can’t just pass the CSS hex value of the colour we want to the Hue lamp and expect it to understand. To get the web site to talk to the bulb, we need to convert rgb hex values to xy coordinates.

temp_threshold = int(temp/5) * 5

if temp_threshold > max_threshold:

temp_threshold = max_threshold

if temp_threshold < min_threshold:

temp_threshold = min_threshold

return temp_threshold

The algorithm does the following:

Round the temperature down to the nearest multiple of 5
If the temperature is higher than 40, return the higher bound temperature which is 40
If the temperature is lower than -10, return the lower bound temperature which is -10

The resulting temperature table looks like this:

Temperature range (celsius)	Corresponding colour (hex value)
40 or greater	ff0000
Between 35 and 40	ff4000
Between 30 and 35	ff8000
Between 25 and 30	ffbf00
Between 20 and 25	ffff00
Between 15 and 20	bfff00
Between 10 and 15	00ff80
Between 5 and 10	00ffbf
Between 0 and 5	00ffff
Between -5 and 0	00bfff
Between -5 and -10	0080ff
Lower than -10	0040ff

def convert_temp_to_colour(temp):

temp_threshold = find_temp_threshold(temp)

Having used our little helper function to calculate the temperature threshold for our lookup table, we now use a couple of modules from the rgbxy library to do the maths of the conversion from rgb hex to xy coordinate. I’ve previously touched on why this maths is complicated, but let’s recap. The hex value is used in CSS. Each pair of hex digits is the value for each of red, green and blue. So the first entry in the table for temperatures of 40C or greater shows maximum red (ff) and no green or blue (00 each). Note that rgb values do not have any concept of saturation (brightness), so maximum saturation is assumed. So the colour for 40C is bright maximum red.

converter = Converter(GamutA)

Each Hue bulb has its own colour gamut, the range of colours that it can reproduce (i.e. colour space constrained by its physical limitations). This first statement sets the applicable colour gamut to the one supported by my Hue Blooms, in this case gamut A.

colour = converter.hex_to_xy(lookup_colour(temp_threshold))

return colour

There’s a few nested things going on here, so let’s unpack them.

We found the temp_threshold using the algorithm described above
The lookup_colour function fetches the corresponding hex value from the lookup table in the external database table
We use the .hex_to_xy method from the converter library to do the maths
And because we set the gamut already, the method ensures that resulting colour is within the colour space of the Bloom lamp

Identifying the lamps

When we set colours and status of the Hue lamps, we need to be sure we’re doing so to the correct lamps. The Hue Bridge maintains a representation of all of the items (bulbs, lamps, switches, sensors etc) attached to it, and each item is assigned an ID. These are created when the item is provisioned.

hue_lamp_ids = {

'den bloom': 10,

'lounge bloom': 11

}

This dictionary is another lookup table that maps IDs to human-readable names, to make the code easier to follow. I got the numbers by taking a look at the JSON from the Bridge, using the web tool provided by Philips. You can find it on the development page.

Putting it together

class hue_lamp:

def __init__(self, lamp_id):

# not accessible

ip = HUE_IP

username = HUE_USERNAME

# accessible

self.bridge = qhue.Bridge(ip, username)

self.getter = self.bridge.lights[lamp_id]()

self.setter = self.bridge.lights[lamp_id]

self.is_on = self.getter['state']['on']

self.colour = self.getter['state']['xy']

self.name = self.getter['name']

def __str__(self):

if self.is_on:

status = 'on and is set to xy:' + str(self.colour)

else:

status = 'off'

return f'{self.name} is {status}'

def turn_on(self):

self.setter.state(on=True)

return 'Lamp turned on'

def turn_off(self):

self.setter.state(on=False)

return 'Lamp turned off'

def set_colour(self, colour): # colour is a tuple of xy values

self.setter.state(on=True)

self.setter.state(xy=colour)

return 'Lamp changed colour'

def lookup_colour(temperature):

# lookup table of temperatures to colours in database.sqlite3

sql = 'WHERE temperature = (?)'

what = (temperature, ) # tuple with single item

results = get_rows('colours', 'hex_value', rows_sql=sql, args=what)

return results[0][0] # to return the hex value string only

def find_temp_threshold(temp):

# find max and min thresholds from external database

rows = get_rows('colours', 'temperature')

all_thresholds = [row['temperature'] for row in rows]

max_threshold = max(all_thresholds)

min_threshold = min(all_thresholds)

temp_threshold = int(temp/5) * 5

if temp_threshold > max_threshold:

temp_threshold = max_threshold

if temp_threshold < min_threshold:

temp_threshold = min_threshold

return temp_threshold

def convert_temp_to_colour(temp):

temp_threshold = find_temp_threshold(temp)

converter = Converter(GamutA)

colour = converter.hex_to_xy(lookup_colour(temp_threshold))

return colour

hue_lamp_ids = {

'den bloom': 10,

'lounge bloom': 11

}

Part 6 has taken us through the second main module and how to work the the Philips Hue API. In part 7, I'll take you through how to create pretty graphs from the data, the fundamentals of data visualisation. Also visit https://github.com/Schmoiger/pybloom for the full story.