| ========================== |
| Performing raw SQL queries |
| ========================== |
| |
| .. currentmodule:: django.db.models |
| |
| When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you |
| can fall back to writing raw SQL. Django gives you two ways of performing raw |
| SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and |
| return model instances`__, or you can avoid the model layer entirely and |
| `execute custom SQL directly`__. |
| |
| __ `performing raw queries`_ |
| __ `executing custom SQL directly`_ |
| |
| .. _executing-raw-queries: |
| |
| Performing raw queries |
| ====================== |
| |
| .. versionadded:: 1.2 |
| |
| The ``raw()`` manager method can be used to perform raw SQL queries that |
| return model instances: |
| |
| .. method:: Manager.raw(raw_query, params=None, translations=None) |
| |
| This method method takes a raw SQL query, executes it, and returns a |
| :class:`~django.db.models.query.RawQuerySet` instance. This |
| :class:`~django.db.models.query.RawQuerySet` instance can be iterated |
| over just like an normal QuerySet to provide object instances. |
| |
| This is best illustrated with an example. Suppose you've got the following model:: |
| |
| class Person(models.Model): |
| first_name = models.CharField(...) |
| last_name = models.CharField(...) |
| birth_date = models.DateField(...) |
| |
| You could then execute custom SQL like so:: |
| |
| >>> for p in Person.objects.raw('SELECT * FROM myapp_person'): |
| ... print p |
| John Smith |
| Jane Jones |
| |
| Of course, this example isn't very exciting -- it's exactly the same as |
| running ``Person.objects.all()``. However, ``raw()`` has a bunch of other |
| options that make it very powerful. |
| |
| .. admonition:: Model table names |
| |
| Where'd the name of the ``Person`` table come from in that example? |
| |
| By default, Django figures out a database table name by joining the |
| model's "app label" -- the name you used in ``manage.py startapp`` -- to |
| the model's class name, with an underscore between them. In the example |
| we've assumed that the ``Person`` model lives in an app named ``myapp``, |
| so its table would be ``myapp_person``. |
| |
| For more details check out the documentation for the |
| :attr:`~Options.db_table` option, which also lets you manually set the |
| database table name. |
| |
| .. warning:: |
| |
| No checking is done on the SQL statement that is passed in to ``.raw()``. |
| Django expects that the statement will return a set of rows from the |
| database, but does nothing to enforce that. If the query does not |
| return rows, a (possibly cryptic) error will result. |
| |
| .. warning:: |
| |
| If you are performing queries on MySQL, note that MySQL's silent type coercion |
| may cause unexpected results when mixing types. If you query on a string |
| type column, but with an integer value, MySQL will coerce the types of all values |
| in the table to an integer before performing the comparison. For example, if your |
| table contains the values ``'abc'``, ``'def'`` and you query for ``WHERE mycolumn=0``, |
| both rows will match. To prevent this, perform the correct typecasting |
| before using the value in a query. |
| |
| Mapping query fields to model fields |
| ------------------------------------ |
| |
| ``raw()`` automatically maps fields in the query to fields on the model. |
| |
| The order of fields in your query doesn't matter. In other words, both |
| of the following queries work identically:: |
| |
| >>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person') |
| ... |
| >>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person') |
| ... |
| |
| Matching is done by name. This means that you can use SQL's ``AS`` clauses to |
| map fields in the query to model fields. So if you had some other table that |
| had ``Person`` data in it, you could easily map it into ``Person`` instances:: |
| |
| >>> Person.objects.raw('''SELECT first AS first_name, |
| ... last AS last_name, |
| ... bd AS birth_date, |
| ... pk as id, |
| ... FROM some_other_table''') |
| |
| As long as the names match, the model instances will be created correctly. |
| |
| Alternatively, you can map fields in the query to model fields using the |
| ``translations`` argument to ``raw()``. This is a dictionary mapping names of |
| fields in the query to names of fields on the model. For example, the above |
| query could also be written:: |
| |
| >>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'} |
| >>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map) |
| |
| Index lookups |
| ------------- |
| |
| ``raw()`` supports indexing, so if you need only the first result you can |
| write:: |
| |
| >>> first_person = Person.objects.raw('SELECT * from myapp_person')[0] |
| |
| However, the indexing and slicing are not performed at the database level. If |
| you have a big amount of ``Person`` objects in your database, it is more |
| efficient to limit the query at the SQL level:: |
| |
| >>> first_person = Person.objects.raw('SELECT * from myapp_person LIMIT 1')[0] |
| |
| Deferring model fields |
| ---------------------- |
| |
| Fields may also be left out:: |
| |
| >>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person') |
| |
| The ``Person`` objects returned by this query will be deferred model instances |
| (see :meth:`~django.db.models.query.QuerySet.defer()`). This means that the |
| fields that are omitted from the query will be loaded on demand. For example:: |
| |
| >>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'): |
| ... print p.first_name, # This will be retrieved by the original query |
| ... print p.last_name # This will be retrieved on demand |
| ... |
| John Smith |
| Jane Jones |
| |
| From outward appearances, this looks like the query has retrieved both |
| the first name and last name. However, this example actually issued 3 |
| queries. Only the first names were retrieved by the raw() query -- the |
| last names were both retrieved on demand when they were printed. |
| |
| There is only one field that you can't leave out - the primary key |
| field. Django uses the primary key to identify model instances, so it |
| must always be included in a raw query. An ``InvalidQuery`` exception |
| will be raised if you forget to include the primary key. |
| |
| Adding annotations |
| ------------------ |
| |
| You can also execute queries containing fields that aren't defined on the |
| model. For example, we could use `PostgreSQL's age() function`__ to get a list |
| of people with their ages calculated by the database:: |
| |
| >>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person') |
| >>> for p in people: |
| ... print "%s is %s." % (p.first_name, p.age) |
| John is 37. |
| Jane is 42. |
| ... |
| |
| __ http://www.postgresql.org/docs/8.4/static/functions-datetime.html |
| |
| Passing parameters into ``raw()`` |
| --------------------------------- |
| |
| If you need to perform parameterized queries, you can use the ``params`` |
| argument to ``raw()``:: |
| |
| >>> lname = 'Doe' |
| >>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname]) |
| |
| ``params`` is a list of parameters. You'll use ``%s`` placeholders in the |
| query string (regardless of your database engine); they'll be replaced with |
| parameters from the ``params`` list. |
| |
| .. warning:: |
| |
| **Do not use string formatting on raw queries!** |
| |
| It's tempting to write the above query as:: |
| |
| >>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname |
| >>> Person.objects.raw(query) |
| |
| **Don't.** |
| |
| Using the ``params`` list completely protects you from `SQL injection |
| attacks`__, a common exploit where attackers inject arbitrary SQL into |
| your database. If you use string interpolation, sooner or later you'll |
| fall victim to SQL injection. As long as you remember to always use the |
| ``params`` list you'll be protected. |
| |
| __ http://en.wikipedia.org/wiki/SQL_injection |
| |
| .. _executing-custom-sql: |
| |
| Executing custom SQL directly |
| ============================= |
| |
| Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to |
| perform queries that don't map cleanly to models, or directly execute |
| ``UPDATE``, ``INSERT``, or ``DELETE`` queries. |
| |
| In these cases, you can always access the database directly, routing around |
| the model layer entirely. |
| |
| The object ``django.db.connection`` represents the |
| default database connection, and ``django.db.transaction`` represents the |
| default database transaction. To use the database connection, call |
| ``connection.cursor()`` to get a cursor object. Then, call |
| ``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()`` |
| or ``cursor.fetchall()`` to return the resulting rows. After performing a data |
| changing operation, you should then call |
| ``transaction.commit_unless_managed()`` to ensure your changes are committed |
| to the database. If your query is purely a data retrieval operation, no commit |
| is required. For example:: |
| |
| def my_custom_sql(): |
| from django.db import connection, transaction |
| cursor = connection.cursor() |
| |
| # Data modifying operation - commit required |
| cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz]) |
| transaction.commit_unless_managed() |
| |
| # Data retrieval operation - no commit required |
| cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz]) |
| row = cursor.fetchone() |
| |
| return row |
| |
| If you are using more than one database you can use |
| ``django.db.connections`` to obtain the connection (and cursor) for a |
| specific database. ``django.db.connections`` is a dictionary-like |
| object that allows you to retrieve a specific connection using its |
| alias:: |
| |
| from django.db import connections |
| cursor = connections['my_db_alias'].cursor() |
| # Your code here... |
| transaction.commit_unless_managed(using='my_db_alias') |
| |
| By default, the Python DB API will return results without their field |
| names, which means you end up with a ``list`` of values, rather than a |
| ``dict``. At a small performance cost, you can return results as a |
| ``dict`` by using something like this:: |
| |
| def dictfetchall(cursor): |
| "Returns all rows from a cursor as a dict" |
| desc = cursor.description |
| return [ |
| dict(zip([col[0] for col in desc], row)) |
| for row in cursor.fetchall() |
| ] |
| |
| Here is an example of the difference between the two:: |
| |
| >>> cursor.execute("SELECT id, parent_id from test LIMIT 2"); |
| >>> cursor.fetchall() |
| ((54360982L, None), (54360880L, None)) |
| |
| >>> cursor.execute("SELECT id, parent_id from test LIMIT 2"); |
| >>> dictfetchall(cursor) |
| [{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}] |
| |
| |
| .. _transactions-and-raw-sql: |
| |
| Transactions and raw SQL |
| ------------------------ |
| |
| When you make a raw SQL call, Django will automatically mark the |
| current transaction as dirty. You must then ensure that the |
| transaction containing those calls is closed correctly. See :ref:`the |
| notes on the requirements of Django's transaction handling |
| <topics-db-transactions-requirements>` for more details. |
| |
| .. versionchanged:: 1.3 |
| |
| Prior to Django 1.3, it was necessary to manually mark a transaction |
| as dirty using ``transaction.set_dirty()`` when using raw SQL calls. |
| |
| Connections and cursors |
| ----------------------- |
| |
| ``connection`` and ``cursor`` mostly implement the standard Python DB-API |
| described in :pep:`249` (except when it comes to :doc:`transaction handling |
| </topics/db/transactions>`). If you're not familiar with the Python DB-API, note |
| that the SQL statement in ``cursor.execute()`` uses placeholders, ``"%s"``, |
| rather than adding parameters directly within the SQL. If you use this |
| technique, the underlying database library will automatically add quotes and |
| escaping to your parameter(s) as necessary. (Also note that Django expects the |
| ``"%s"`` placeholder, *not* the ``"?"`` placeholder, which is used by the SQLite |
| Python bindings. This is for the sake of consistency and sanity.) |