*********************************** A 20-minute introduction to DB-API2 *********************************** **An introduction to the standard interface for Pythonic database interactions.** History ======= Despite the norms of SQL, individual databases have lots of differences. Programmers don't want to have to think about implementation details for underlying systems. It would be nice to have a single API to hide these details. Any package implementing this API would then be interchangeable. Finalized in 1996, PEP 248 specified DB-API version 1.0 to fulfill this goal: "This API has been defined to encourage similarity between the Python modules that are used to access databases. By doing this, we hope to achieve a consistency leading to more easily understood modules, code that is generally more portable across databases, and a broader reach of database connectivity from Python." source: http://www.python.org/dev/peps/pep-0248/ By 2001, PEP 249 brought version 2.0 of the DB-API specification, with improvements: * New column types were added to support all basic data types in "modern" SQL * New API constants were added to help detect differences between implementations * The semantics for calling stored procedures were clarified. * Class-based exceptions were added to improve error handling possibilities Discussions are currently underway to push DB-API v3.0, particularly in light of the changes in Python 3.0 source: http://www.python.org/dev/peps/pep-0249/ Specification, not Implementation --------------------------------- It is important to remember that PEP 249 is **only a specification**. There is no code or package for DB-API 2.0 on it's own. Since 2.5, the Python Standard Library has provided a `reference implementation of the api `_ based on the SQLite3 RDBMS. Before version 2.5, this package was available as ``pysqlite`` (and it still is). Installing Implementations ========================== To use the DB API with any database other than SQLite3, you must have an underlying API package available. Implementations are available for: * PostgreSQL (**psycopg2**, txpostgres, ...) * MySQL (**mysql-python**, PyMySQL, ...) * MS SQL Server (**adodbapi**, pymssql, mxODBC, pyodbc, ...) * Oracle (**cx_Oracle**, mxODBC, pyodbc, ...) * and many more... source: http://wiki.python.org/moin/DatabaseInterfaces Most db api packages can be installed using typical Pythonic methods:: $ pip install psycopg2 $ pip install mysql-python ... Some Assembly Required ---------------------- Most api packages will require that the development headers for the underlying database system be available. Without these, the C symbols required for communication with the db are not present and the wrapper cannot work. **Moreover** Some of the db api wrappers have special installation requirements. The MS SQL package runs only on Windows and requires ``pywin32``. It is included in versions of ``pywin32`` since v211. The ``cx_Oracle`` package has binary installers, or can be installed from source using distutils:: $ python setup.py build $ python setup.py install The API ======= **What is in the DB API?** Globals ------- DB-API2 implementations provide the following global values: **apilevel** String constant indicating the api version ("1.0" or "2.0") **threadsafety** Integer constant between 0 and 3 indicating the scope in which threads may safely be used **paramstyle** String contstant indicating the style of marker used for parameter substitution in SQL expressions These can be used to tailor your program's expectations. A Constructor ------------- DB API provides a constructor ``connect``, which returns a ``Connection`` object: .. code-block:: python connect(parameters) This can be considered the entry point for the module. Once you've got a connection, everything else flows from there. The *parameters* required and accepted by the ``connect`` constructor will vary from implementation to implementation, since they are highly specific to the underlying database. A Connection ------------ Some of the methods of the ``connection`` may not be supported by all implementations: **.close()** Closes the connection to the database permanently. Attempts to use the connection after calling this will raise a DB-API ``Error``. **.commit()** explicitly commit any pending transactions to the databse. The method should be a no-op if the underlying db does not support transactions. **.rollback()** This optional method causes a transaction to be rolled back to the starting point. It may not be implemented everywhere. **.cursor()** returns a ``Cursor`` object which uses this ``Connection``. A Cursor - settings ------------------- You can use a few values to control the rows returned by the cursor: **.arraysize** An integer which controls how many rows are returned at a time by ``.fetchmany`` (and optionally how many to send at a time with ``.executemany``) Defaults to ``1`` **.setinputsizes(sizes)** Used to set aside memory regions for paramters passed to an operation **.setoutputsize(size[, column])** Used to control buffer size for large columns returned by an operation (``BLOB`` or ``LONG`` types, for example). The final two methods may be implemented as no-ops A Cursor - operations --------------------- The cursor should be used to run operations on the database: **.execute(operation[, parameters])** Prepares and then runs a database operation. Parameter style (seq or mapping) and markers are implementation specific (see ``paramstyle``) **.executemany(operation[, seq_of_params])** Prepares and the runs an operations once for each set of parameters provided (this replaces the old v1 behavior of passing a seq to ``.execute``). **.callproc(procname[, parameters])** Calls a stored DB procedure with the provided parameters. Returns a modified version of the provided parameters with ``output`` and ``input/output`` parameters replaced A Cursor - attributes --------------------- These attributes of ``Cursor`` can help you learn about the results of operations: **.rowcount** Tells how many rows have been returned or affected by the last operation. The number will be ``-1`` if no operation has been performed. **.description** Returns a sequence of 7-item sequences describing each of the columns in the result row(s) returned (None if no operation has been performed): * name * type_code * display_size (optional) * internal_size (optional) * precision (optional) * scale (optional) * null_ok (optional) A Cursor - results ------------------ The return value ``.execute`` or ``.executemany`` is undefined and should not be used. These methods are the way to get results after an operation: **.fetchone()** Returns the next row from a result set, and ``None`` when none remain. **.fetchmany([size=cursor.arraysize])** Returns a sequence of *size* rows (or fewer) from a result set. An empty sequence is returned when no rows remain. Defaults to ``arraysize``. **.fetchall()** Returns all (remaining) rows from a result set. This behavior *may* be affected by ``arraysize``. Note that each of these methods will raise a DB API ``Error`` if no operation has been performed (or if no result set was produced) Data Types & Constructors ------------------------- The DB-API provides types and constructors for data: * **Date(year, month, day)** - constructs an object holding a date value * **Time(hour, min, sec)** - constructs an object holding a time value * **Timestamp(y, m, d, h, min, s)** - constructs an object holding a timestamp Each of the above has a corresponding **FromTicks(ticks)** which returns the same type given a single integer argument (seconds since the *epoch*) * **Binary(string)** - constructs an object to hold long binary string data * **STRING** - a type to describe columns that hold string values (``CHAR``) * **BINARY** - a type to describe long binary columns (``BLOB``, ``RAW``) * **NUMBER** - a type to describe numeric columns * **DATETIME** - a type to describe date/time/datetime columns * **ROWID** - a type to describe the ``Row ID`` column in a database SQL ``NULL`` values are represented by Python's ``None`` Exceptions ---------- The DB API specification requires implementations to create the following hierarchy of custom Exception classes:: StandardError ├──Warning └──Error ├──InterfaceError (a problem with the db api) └──DatabaseError (a problem with the database) ├──DataError (bad data, values out of range, etc.) ├──OperationalError (the db has an issue out of our control) ├──IntegrityError ├──InternalError ├──ProgrammingError (something wrong with the operation) └──NotSupportedError (the operation is not supported) Aside from some custom extensions not required by the specification, that's it.