Using Python's `sqlite3` Module

A walk through the Standard Library's reference implementation of DB API 2

Assumptions

I assume that you have a Python interpreter on your own machines. I'll used Python 2.7 in my examples, but they will work equally well in 2.6 and almost as well in 2.5 or earlier.

I am also assuming that you will be following along. You learn best by doing, so do.

A reminder that the resources I'll be referencing are available in the examples directory of this repository:

$ git clone git://github.com/cewing/training.python_sql.git

Getting Started

Start by moving to the examples folder, opening a Python interpreter and importing the sqlite3 module:

(create an examples folder if you don't have the repo cloned)

$ cd examples
$ python2.7
Python 2.7.1 (r271:86832, Apr  4 2011, 22:22:40)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3

Learning About the Module

We can poke the module a bit to learn about it:

>>> sqlite3.sqlite_version
'3.6.12'
>>> sqlite3.apilevel
'2.0'
>>> sqlite3.paramstyle
'qmark'
>>> sqlite3.threadsafety
1

level	meaning
0	Not safe
1	Safe at Module level only
2	Safe at Module and Connection
3	Safe at Module, Connection and Cursor

Connecting

SQLite3 is a file-based system, and it will create the file it needs if one doesn't exist. We can create a sqlite3 database just by attempting to connect to it:

>>> import createdb
>>> createdb.main()
Need to create database and schema
>>> reload(createdb)
<module 'createdb' from 'createdb.pyc'>
>>> createdb.main()
Database exists, assume schema does, too.

Let's see how this works

edit createdb.py

Open createdb.py in your favorite text editor:

import os
import sqlite3

DB_FILENAME = 'books.db'
DB_IS_NEW = not os.path.exists(DB_FILENAME)

def main():
    conn =  sqlite3.connect(DB_FILENAME)
    if DB_IS_NEW:
        print 'Need to create database and schema'
    else:
        print 'Database exists, assume schema does, too.'
    conn.close()

if __name__ == '__main__':
    main()

Set Up The Schema

Make the following changes to createdb.py:

DB_FILENAME = 'books.db'
SCHEMA_FILENAME = 'ddl.sql' # <- this is new
DB_IS_NEW = not os.path.exists(DB_FILENAME)

def main():
    with sqlite3.connect(DB_FILENAME) as conn: # <- context mgr
        if DB_IS_NEW: # A whole new if clause:
            print 'Creating schema'
            with open(SCHEMA_FILENAME, 'rt') as f:
                schema = f.read()
            conn.executescript(schema)
        else:
            print 'Database exists, assume schema does, too.'
    # delete the `conn.close()` that was here.

Verify Your Work

Quit your python interpreter and delete the file books.db that should be in the examples folder

Then run the script from the command line to try it out:

$ python2.7 createdb.py
Creating schema
$ python2.7 createdb.py
Database exists, assume schema does, too.

Introspect the Database

Add the following to createdb.py:

# in the imports, add this line:
from utils import show_table_metadata

else:
    # in the else clause, replace the print statement with this:
    print "Database exists, introspecting:"
    tablenames = ['author', 'book']
    cursor = conn.cursor()
    for name in tablenames:
        print "\n"
        show_table_metadata(cursor, name)

Then try running python2.7 createdb.py again

My Results

$ python2.7 createdb.py
Table Metadata for 'author':
cid        | name       | type       | notnull    | dflt_value | pk         |
-----------+------------+------------+------------+------------+------------+-
0          | authorid   | INTEGER    | 1          | None       | 1          |
-----------+------------+------------+------------+------------+------------+-
1          | name       | TEXT       | 0          | None       | 0          |
-----------+------------+------------+------------+------------+------------+-


Table Metadata for 'book':
cid        | name       | type       | notnull    | dflt_value | pk         |
-----------+------------+------------+------------+------------+------------+-
0          | bookid     | INTEGER    | 1          | None       | 1          |
-----------+------------+------------+------------+------------+------------+-
1          | title      | TEXT       | 0          | None       | 0          |
-----------+------------+------------+------------+------------+------------+-
2          | author     | INTEGER    | 1          | None       | 0          |
-----------+------------+------------+------------+------------+------------+-

Inserting Data

Let's load up some data. Fire up your interpreter and type:

>>> import sqlite3
>>> insert = """
... INSERT INTO author (name) VALUES("Iain M. Banks");"""
>>> with sqlite3.connect("books.db") as conn:
...     cur = conn.cursor()
...     cur.execute(insert)
...     cur.rowcount
...     cur.close()
...
<sqlite3.Cursor object at 0x10046e880>
1
>>>

Did that work?

Querying Data

Let's query our database to find out:

>>> query = """
... SELECT * from author;"""
>>> with sqlite3.connect("books.db") as conn:
...     cur = conn.cursor()
...     cur.execute(query)
...     rows = cur.fetchall()
...     for row in rows:
...         print row
...
<sqlite3.Cursor object at 0x10046e8f0>
(1, u'Iain M. Banks')

Alright! We've got data in there. Let's make it more efficient

Parameterized Statements

Try this:

>>> insert = """
... INSERT INTO author (name) VALUES(?);"""
>>> authors = [["China Mieville"], ["Frank Herbert"],
... ["J.R.R. Tolkien"], ["Susan Cooper"], ["Madeline L'Engle"]]
>>> with sqlite3.connect("books.db") as conn:
...     cur = conn.cursor()
...     cur.executemany(insert, authors)
...     print cur.rowcount
...     cur.close()
...
<sqlite3.Cursor object at 0x10046e8f0>
5

Check Your Work

Again, query the database:

>>> query = """
... SELECT * from author;"""
>>> with sqlite3.connect("books.db") as conn:
...     cur = conn.cursor()
...     cur.execute(query)
...     rows = cur.fetchall()
...     for row in rows:
...         print row
...
<sqlite3.Cursor object at 0x10046e8f0>
(1, u'Iain M. Banks')
...
(4, u'J.R.R. Tolkien')
(5, u'Susan Cooper')
(6, u"Madeline L'Engle")

Transactions

Transactions let you group a number of operations together, allowing you to make sure they worked before you actually push the results into the database.

In SQLite3, operations that belong to the Data Manipulation subset (INSERT, UPDATE, DELETE) require an explicit commit unless auto-commit has been enabled.

So far, commits have been hidden from us by the with statement. The context manager takes care of committing when the context closes (at the end of the with statement)

Let's add some code so we can see the effect of transactions.

Populating the Database

Let's start by seeing what happens when you try to look for newly added data before the insert transaction is committed.

Begin by quitting your interpreter and deleting books.db.

Then re-create the database, empty:

$ python2.7 createdb.py
Creating schema

Setting Up the Test

In populatedb.py, add this code at the end of the file:

with sqlite3.connect(DB_FILENAME) as conn1:
    print "\nOn conn1, before insert:"
    show_authors(conn1)

    authors = ([author] for author in AUTHORS_BOOKS.keys())
    cur = conn1.cursor()
    cur.executemany(author_insert, authors)
    print "\nOn conn1, after insert:"
    show_authors(conn1)

    with sqlite3.connect(DB_FILENAME) as conn2:
        print "\nOn conn2, before commit:"
        show_authors(conn2)

        conn1.commit()
        print "\nOn conn2, after commit:"
        show_authors(conn2)

Running the Test

Quit your python interpreter and run the populatedb.py script:

$ python2.7 populatedb.py
On conn1, before insert:
no rows returned
On conn1, after insert:
(1, u'China Mieville')
(2, u'Frank Herbert')
(3, u'Susan Cooper')
(4, u'J.R.R. Tolkien')
(5, u"Madeline L'Engle")
On conn2, before commit:
no rows returned
On conn2, after commit:
(1, u'China Mieville')
(2, u'Frank Herbert')
(3, u'Susan Cooper')
(4, u'J.R.R. Tolkien')
(5, u"Madeline L'Engle")

Rollback

That's all well and good, but what happens if an error occurs?

Transactions can be rolled back in order to wipe out partially completed work.

Like with commit, using connect as a context manager in a with statement will automatically rollback for exceptions.

Let's rewrite our populatedb script so it explicitly commits or rolls back a transaction depending on exceptions occurring

edit populatedb.py (slide 1)

First, add the following function above the if __name__ == '__main__' block:

def populate_db(conn):
    authors = ([author] for author in AUTHORS_BOOKS.keys())
    cur = conn.cursor()
    cur.executemany(author_insert, authors)

    for author in AUTHORS_BOOKS.keys():
        params = ([book, author] for book in AUTHORS_BOOKS[author])
        cur.executemany(book_insert, params)

edit populatedb.py (slide 2)

Then, in the runner:

with sqlite3.connect(DB_FILENAME) as conn1:
    with sqlite3.connect(DB_FILENAME) as conn2:
        try:
            populate_db(conn1)
            print "\nauthors and books on conn2 before commit:"
            show_authors(conn2)
            show_books(conn2)
        except Exception:
            conn1.rollback()
            print "\nauthors and books on conn2 after rollback:"
            show_authors(conn2)
            show_books(conn2)
            raise
        else:
            conn1.commit()
            print "\nauthors and books on conn2 after commit:"
            show_authors(conn2)
            show_books(conn2)

Try it Out

Remove books.db and recrete the database, then run our script:

$ rm books.db
$ python2.7 createdb.py
Creating schema
$ python2.7 populatedb.py

authors and books on conn2 after rollback:
no rows returned
no rows returned
Traceback (most recent call last):
  File "populatedb.py", line 57, in <module>
    populate_db(conn1)
  File "populatedb.py", line 46, in populate_db
    cur.executemany(book_insert, params)
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

Oooops, Fix It

Okay, we got an error, and the transaction was rolled back correctly.

Open utils.py and find this:

'Susan Cooper': ["The Dark is Rising", ["The Greenwitch"]],

Fix it like so:

'Susan Cooper': ["The Dark is Rising", "The Greenwitch"],

It appears that we were attempting to bind a list as a parameter. Ooops.

Try It Again

Now that the error in our data is repaired, let's try again:

$ python2.7 populatedb.py

Reporting authors and books on conn2 before commit:
no rows returned
no rows returned
Reporting authors and books on conn2 after commit:
(1, u'China Mieville')
(2, u'Frank Herbert')
(3, u'Susan Cooper')
(4, u'J.R.R. Tolkien')
(5, u"Madeline L'Engle")
(1, u'Perdido Street Station', 1)
(2, u'The Scar', 1)
(3, u'King Rat', 1)
(4, u'Dune', 2)
(5, u"Hellstrom's Hive", 2)
(6, u'The Dark is Rising', 3)
(7, u'The Greenwitch', 3)
(8, u'The Hobbit', 4)
(9, u'The Silmarillion', 4)
(10, u'A Wrinkle in Time', 5)
(11, u'A Swiftly Tilting Planet', 5)

Isolation

So far, our transactions have been managed. Either explicitly by us, or automatically by the context manager statement with

This behavior is the result of an aspect of the database connection called the isolation level. There are three isolation levels available:

DEFERRED - Locks the database once changes have begun to be written to the filesystem. Read-only operations are not blocked
IMMEDIATE - Locks the database as soon as a transaction is begun. Read-only operations are not blocked
EXCLUSIVE - Locks the database as soon as a transaction is begun. This blocks any read-only operations as well

The default level is DEFERRED

Autocommit

The isolation level of a connection can be set with a keyword argument provided to the connect constructor:

con = sqlite3.connect('mydb.db', isolation_level="EXCLUSIVE")

If you explicitly set this argument to None, you can enable autocommit behavior.

If autocommit is enabled, then any DML operations that occur on a connection will be immediately committed

Testing Autocommit

First, edit populatedb.py:

with sqlite3.connect(DB_FILENAME,
                     isolation_level=None) as conn1:
    with sqlite3.connect(DB_FILENAME,
                         isolation_level=None) as conn2:

Next, undo your changes to utils.py so that the error we had will happen again

Finally, delete books.db, recreate it and test the populate script:

$ rm books.db
$ python2.7 createdb.py
Creating schema
$ python2.7 populatedb.py

The Result

authors and books on conn2 after rollback:
(1, u'China Mieville')
(2, u'Frank Herbert')
(3, u'Susan Cooper')
(4, u'J.R.R. Tolkien')
(5, u"Madeline L'Engle")
(1, u'Perdido Street Station', 1)
(2, u'The Scar', 1)
(3, u'King Rat', 1)
(4, u'Dune', 2)
(5, u"Hellstrom's Hive", 2)
(6, u'The Dark is Rising', 3)
Traceback (most recent call last):
  File "populatedb.py", line 57, in <module>
    populate_db(conn1)
  File "populatedb.py", line 46, in populate_db
    cur.executemany(book_insert, params)
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

EXCLUSIVE isolation

There's not a whole lot of difference between the default "DEFERRED" isolation level and "IMMEDIATE"

There's quite a large difference, though for the "EXCLUSIVE" level.

Open threaded.py in your editors.

This is an example of using our existing database population setup in a threaded environment. One thread will load the database, the other will read it.

Take a few moments to review the control flow here. What should happen?

Testing It

First, re-fix the bug in our utils.py file so that we don't get errors when running this test.

Then, kill the old database, recreate it and run our new script:

$ rm books.db
$ python2.7 createdb.py
Creating schema
$ python2.7 threaded.py

The Results

2013-04-30 15:37:37,556 (Writer    ) connecting
2013-04-30 15:37:37,556 (Reader    ) waiting to sync
2013-04-30 15:37:37,556 (Writer    ) connected
2013-04-30 15:37:37,557 (Writer    ) changes made
2013-04-30 15:37:37,557 (Writer    ) waiting to sync
2013-04-30 15:37:39,556 (MainThread) sending sync event
2013-04-30 15:37:39,557 (Reader    ) beginning read
2013-04-30 15:37:39,557 (Reader    ) beginning read
2013-04-30 15:37:39,557 (Writer    ) PAUSING
2013-04-30 15:37:42,559 (Writer    ) CHANGES COMMITTED
2013-04-30 15:37:42,590 (Reader    ) selects issued
(1, u'China Mieville')
(2, u'Frank Herbert')
(3, u'Susan Cooper')
(4, u'J.R.R. Tolkien')
(5, u"Madeline L'Engle")
2013-04-30 15:37:42,590 (Reader    ) results fetched
2013-04-30 15:37:42,590 (Reader    ) beginning read
2013-04-30 15:37:42,590 (Reader    ) selects issued
(1, u'Perdido Street Station', 1)
(2, u'The Scar', 1)
(3, u'King Rat', 1)
(4, u'Dune', 2)
(5, u"Hellstrom's Hive", 2)
(6, u'The Dark is Rising', 3)
(7, u'The Greenwitch', 3)
(8, u'The Hobbit', 4)
(9, u'The Silmarillion', 4)
(10, u'A Wrinkle in Time', 5)
(11, u'A Swiftly Tilting Planet', 5)
2013-04-30 15:37:42,591 (Reader    ) results fetched

The End

There's a lot more about both the DB API and SQLite that could be said.

Unfortunately, that's all the time we have for tonight.

Thanks very much for your attention.

Questions?

Using Python's sqlite3 Module

View document source.

Using Python's sqlite3 Module

Assumptions

Getting Started

Learning About the Module

Connecting

edit createdb.py

Set Up The Schema

Verify Your Work

Introspect the Database

My Results

Inserting Data

Querying Data

Parameterized Statements

Check Your Work

Transactions

Populating the Database

Setting Up the Test

Running the Test

Rollback

edit populatedb.py (slide 1)

edit populatedb.py (slide 2)

Try it Out

Oooops, Fix It

Try It Again

Isolation

Autocommit

Testing Autocommit

The Result

EXCLUSIVE isolation

Testing It

The Results

The End

Using Python's `sqlite3` Module

Using Python's `sqlite3` Module