Tuesday 3 November 2020

PyBloom coding project part 9: Working with the database

Over the previous parts, I've shown how I set up my coding environments, and how I wrote the main program. In this part 9, I look at how I set up my external database, including learning SQL.


Database utilities

The database is set up automatically on first connection, but the tables within need to be instantiated.


Global libraries and constants


import sqlite3


DEFAULT_DB = 'database.sqlite3'


All will become clear later!


con = db_connect()  # connect to the database

cur = con.cursor()  # instantiate a cursor obj


Let’s connect to the database and start creating our two tables.


Table of Bloom colours


colours_sql = """

CREATE TABLE IF NOT EXISTS colours (


We create here the colours table, using the modifier IF NOT EXISTS to make sure we don’t delete a load of data if we accidentally run this script after the system is live.


id integer PRIMARY KEY,

temperature integer UNIQUE,

hex_value text NOT NULL


The three columns created are:

  • The primary key, id, constrained to be an integer

  • The temperature, constrained to be unique (as we should never have one temperature associated with more than one hex value)

  • The corresponding hex_value, which is constrained to be text since SQLite doesn’t have a hex data type, and can’t not exist if the temperature exists


The constraints will prevent the code from breaking as I’ve not written much error case handling.


cur.execute(colours_sql)


We need to pre-populate this colours table.


temp_colours = {  

    40: 'ff0000',  # colour value if temperature is >= than key

    35: 'ff4000',  # note: value is string, not hex

    30: 'ff8000',

    25: 'ffbf00',

    20: 'ffff00',

    15: 'bfff00',

    10: '00ff80',

     5: '00ffbf',

     0: '00ffff',

    -5: '00bfff',

   -10: '0080ff',

   -15: '0040ff'

}



These values are taken from https://www.w3schools.com/colors/colors_picker.asp. The hex values are actually strings, even though Python can very happily handle hex natively. This is because the value is intended to be used by the Python RGBxy library, the Python PyGal library, the database and the web page CSS; and neither RGBxy nor SQL can understand hex. String is therefore the common format across all of them.


colours_sql = '''INSERT OR IGNORE INTO colours (temperature, hex_value)

                 VALUES (?, ?)'''


By using the OR IGNORE modifier, we can be sure that we’re not overwriting any value if the table already exists and the script has been run in error. (I haven’t written all CRUD operations as this script is for create only; I’ll be writing the update and delete operations in the main code.) 


for temperature, hex_value in temp_colours.items():

    cur.execute(colours_sql, (temperature, hex_value))


The SQLite query above has spaces for two parameters, which we’ll pass as we iterate through the temp_colours dictionary. Every key, value pair gets written as a new row.


Table of observations


observations_sql = """

CREATE TABLE IF NOT EXISTS observations (

    id integer PRIMARY KEY,

    timestamp text UNIQUE,

    temperature real NOT NULL,

    detailed_status text NOT NULL)"""

cur.execute(observations_sql)


The rows for the table of observations are:

  • The primary key, id, constrained to be an integer

  • The timestamp, constrained to be unique (as we should never be able to make more than one observation simultaneously)

  • The corresponding observation values: temperature (constrained to be real since we want to keep the decimals) and detailed_status (constrained to be text


con.commit()

con.close()  # close connection


Finally, we commit the changes and close the connection.


Utility function to connect to the database


def db_connect(db_file=DEFAULT_DB):

    # create connection to SQLite database

    connection = sqlite3.connect(db_file)

    return connection


Because we only have one place where we declare the database filename, it means all the code remains coherent.


Utility function to get rows from a table


def get_rows(table, columns='*', **kwargs):


To make this utility flexible, it’s meant to simplify 3 use cases: 

  1. Getting all the rows from the table, in which case you don’t have to call this function with any further parameters beyond the table name

  2. Getting only the rows that fulfil a condition described in an optional dictionary of keyword arguments (**kwargs) (defaults to no filters)

  3. Getting only the named columns (defaults to all columns)


if (table == 'colours') or (table == 'observations'):


Some rudimentary logic to prevent calling for a table that doesn’t actually exist, as the code doesn’t handle SQLite errors.


con = db_connect()  # connect to the database


Here we reuse the first utility function to connect to the database. We don’t have to remember which database file and where because we’re using the utility. 


con.row_factory = sqlite3.Row


The next statement refactors the output from the queries into a list of dictionaries, which makes it a little easier to access the content. In this list, each item is a row, and each item consists of a dictionary of column name and value pairs. Each row is accessed directly by its index, but each column can be accessed by index or label. So to access the hex_value for the first row we use result[0][2] or result[0]['hex_value'].


cur = con.cursor()  # instantiate a cursor obj


Next we instantiate a cursor object, which is SQLite’s accessor object. 


rows_sql = ''

args = ()


If there aren’t any keywords passed, then we just want all the rows from the named table, and we don’t have to modify the SQLite query further.


for key, value in kwargs.items():

    if key == 'rows_sql':

        rows_sql = ' ' + value

    elif key == 'args':

        args = value


If we do have a rows filter, then we need to insert it into the SQLite query. We expect a simple dictionary of {row_sql: sql, args: (arg1_value, …)}. This little loop unpacks them. Note: I’m relying on the SQL string to be well-formed; the only parsing I’m doing is to add a space so that the final SQL query will be well-formed too.


if rows_sql.count('?') != len(args):

    results = 'Unexpected number of arguments in row modifier'


Then a simple check to see if we have the same number of arguments as places to put them.


cur.execute('SELECT ' + columns + ' FROM ' + table + rows_sql, args)


Now we have everything we need to build our .execute command. The first argument is a simple string concatenation to build the full SQL query. The second argument is the tuple of values to be inserted in place of any (?).


results = cur.fetchall()

con.close()  # close connection

return results


Once we get the results (as a list of dictionaries), we close the connection to the database and return the results.


Putting it together

import sqlite3


DEFAULT_DB = 'database.sqlite3'



def db_connect(db_file=DEFAULT_DB):

    # create connection to SQLite database

    connection = sqlite3.connect(db_file)

    return connection



# utility function, handles opening and closing the database connection

def get_rows(table, columns='*', **kwargs):

    # columns_names should be a string enclosing a tuple of selected columns

    if (table == 'colours') or (table == 'observations'):

        con = db_connect()  # connect to the database

        cur = con.cursor()  # instantiate a cursor obj

        con.row_factory = sqlite3.Row


        rows_sql = ''

        args = ()

        for key, value in kwargs.items():

            if key == 'rows_sql':

                rows_sql = ' ' + value

            elif key == 'args':

                args = value

        if rows_sql.count('?') != len(args):

            results = 'Unexpected number of arguments in row modifier'


        cur.execute('SELECT ' + columns + ' FROM ' + table + rows_sql, args)

        results = cur.fetchall()

    else:

        results = 'invalid table name'

    con.close()  # close connection

    return results



con = db_connect()  # connect to the database

cur = con.cursor()  # instantiate a cursor obj


# create table of Hue Bloom colours if one doesn't already exist

colours_sql = """

CREATE TABLE IF NOT EXISTS colours (

    id integer PRIMARY KEY,

    temperature integer UNIQUE,

    hex_value text NOT NULL)"""

cur.execute(colours_sql)


# create table of weather observations if one doesn't already exist

observations_sql = """

CREATE TABLE IF NOT EXISTS observations (

    id integer PRIMARY KEY,

    timestamp text UNIQUE,

    temperature real NOT NULL,

    detailed_status text NOT NULL)"""

cur.execute(observations_sql)


# populate Hue colours db with default values if doesn't already exist

temp_colours = {  # from https://www.w3schools.com/colors/colors_picker.asp

    40: 'ff0000',  # colour value if temperature is >= than key

    35: 'ff4000',  # note: value is string, not hex

    30: 'ff8000',

    25: 'ffbf00',

    20: 'ffff00',

    15: 'bfff00',

    10: '00ff40',

     5: '00ffbf',

     0: '00ffff',

    -5: '00bfff',

   -10: '0080ff'

}


sql = '''INSERT OR IGNORE INTO colours (temperature, hex_value) 

   VALUES (?, ?)'''

for temperature, hex_value in temp_colours.items():

    cur.execute(sql, (temperature, hex_value))


con.commit()

con.close()  # close connection



This part has introduced the second programming language of SQL. Over the course of this project I've found that you need a combination of languages in order to get things done. In the next part 10, I'll start to use another language, HTML, and introduce a scripting framework to simplify things. Also visit https://github.com/Schmoiger/pybloom for the full story.

No comments:

Post a Comment

It's always great to hear what you think. Please leave a comment, and start a conversation!