lib/django-1.4/docs/topics/db/sql.txt - external/googleappengine/python - Git at Google

 ==========================
 Performing raw SQL queries
 ==========================

 .. currentmodule:: django.db.models

 When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you
 can fall back to writing raw SQL. Django gives you two ways of performing raw
 SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and
 return model instances`__, or you can avoid the model layer entirely and
 `execute custom SQL directly`__.

 __ `performing raw queries`_
 __ `executing custom SQL directly`_

 .. _executing-raw-queries:

 Performing raw queries
 ======================

 .. versionadded:: 1.2

 The ``raw()`` manager method can be used to perform raw SQL queries that
 return model instances:

 .. method:: Manager.raw(raw_query, params=None, translations=None)

 This method method takes a raw SQL query, executes it, and returns a
 :class:`~django.db.models.query.RawQuerySet` instance. This
 :class:`~django.db.models.query.RawQuerySet` instance can be iterated
 over just like an normal QuerySet to provide object instances.

 This is best illustrated with an example. Suppose you've got the following model::

     class Person(models.Model):
         first_name = models.CharField(...)
         last_name = models.CharField(...)
         birth_date = models.DateField(...)

 You could then execute custom SQL like so::

     >>> for p in Person.objects.raw('SELECT * FROM myapp_person'):
     ...     print p
     John Smith
     Jane Jones

 Of course, this example isn't very exciting -- it's exactly the same as
 running ``Person.objects.all()``. However, ``raw()`` has a bunch of other
 options that make it very powerful.

 .. admonition:: Model table names

     Where'd the name of the ``Person`` table come from in that example?

     By default, Django figures out a database table name by joining the
     model's "app label" -- the name you used in ``manage.py startapp`` -- to
     the model's class name, with an underscore between them. In the example
     we've assumed that the ``Person`` model lives in an app named ``myapp``,
     so its table would be ``myapp_person``.

     For more details check out the documentation for the
     :attr:`~Options.db_table` option, which also lets you manually set the
     database table name.

 .. warning::

     No checking is done on the SQL statement that is passed in to ``.raw()``.
     Django expects that the statement will return a set of rows from the
     database, but does nothing to enforce that. If the query does not
     return rows, a (possibly cryptic) error will result.

 .. warning::

     If you are performing queries on MySQL, note that MySQL's silent type coercion
     may cause unexpected results when mixing types. If you query on a string
     type column, but with an integer value, MySQL will coerce the types of all values
     in the table to an integer before performing the comparison. For example, if your
     table contains the values ``'abc'``, ``'def'`` and you query for ``WHERE mycolumn=0``,
     both rows will match. To prevent this, perform the correct typecasting
     before using the value in a query.

 Mapping query fields to model fields
 ------------------------------------

 ``raw()`` automatically maps fields in the query to fields on the model.

 The order of fields in your query doesn't matter. In other words, both
 of the following queries work identically::

     >>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person')
     ...
     >>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person')
     ...

 Matching is done by name. This means that you can use SQL's ``AS`` clauses to
 map fields in the query to model fields. So if you had some other table that
 had ``Person`` data in it, you could easily map it into ``Person`` instances::

     >>> Person.objects.raw('''SELECT first AS first_name,
     ...                              last AS last_name,
     ...                              bd AS birth_date,
     ...                              pk as id,
     ...                       FROM some_other_table''')

 As long as the names match, the model instances will be created correctly.

 Alternatively, you can map fields in the query to model fields using the
 ``translations`` argument to ``raw()``. This is a dictionary mapping names of
 fields in the query to names of fields on the model. For example, the above
 query could also be written::

     >>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'}
     >>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map)

 Index lookups
 -------------

 ``raw()`` supports indexing, so if you need only the first result you can
 write::

     >>> first_person = Person.objects.raw('SELECT * from myapp_person')[0]

 However, the indexing and slicing are not performed at the database level. If
 you have a big amount of ``Person`` objects in your database, it is more
 efficient to limit the query at the SQL level::

     >>> first_person = Person.objects.raw('SELECT * from myapp_person LIMIT 1')[0]

 Deferring model fields
 ----------------------

 Fields may also be left out::

     >>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person')

 The ``Person`` objects returned by this query will be deferred model instances
 (see :meth:`~django.db.models.query.QuerySet.defer()`). This means that the
 fields that are omitted from the query will be loaded on demand. For example::

     >>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'):
     ...     print p.first_name, # This will be retrieved by the original query
     ...     print p.last_name # This will be retrieved on demand
     ...
     John Smith
     Jane Jones

 From outward appearances, this looks like the query has retrieved both
 the first name and last name. However, this example actually issued 3
 queries. Only the first names were retrieved by the raw() query -- the
 last names were both retrieved on demand when they were printed.

 There is only one field that you can't leave out - the primary key
 field. Django uses the primary key to identify model instances, so it
 must always be included in a raw query. An ``InvalidQuery`` exception
 will be raised if you forget to include the primary key.

 Adding annotations
 ------------------

 You can also execute queries containing fields that aren't defined on the
 model. For example, we could use `PostgreSQL's age() function`__ to get a list
 of people with their ages calculated by the database::

     >>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person')
     >>> for p in people:
     ...     print "%s is %s." % (p.first_name, p.age)
     John is 37.
     Jane is 42.
     ...

 __ http://www.postgresql.org/docs/8.4/static/functions-datetime.html

 Passing parameters into ``raw()``
 ---------------------------------

 If you need to perform parameterized queries, you can use the ``params``
 argument to ``raw()``::

     >>> lname = 'Doe'
     >>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])

 ``params`` is a list of parameters. You'll use ``%s`` placeholders in the
 query string (regardless of your database engine); they'll be replaced with
 parameters from the ``params`` list.

 .. warning::

     **Do not use string formatting on raw queries!**

     It's tempting to write the above query as::

         >>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname
         >>> Person.objects.raw(query)

     **Don't.**

     Using the ``params`` list completely protects you from `SQL injection
     attacks`__, a common exploit where attackers inject arbitrary SQL into
     your database. If you use string interpolation, sooner or later you'll
     fall victim to SQL injection. As long as you remember to always use the
     ``params`` list you'll be protected.

 __ http://en.wikipedia.org/wiki/SQL_injection

 .. _executing-custom-sql:

 Executing custom SQL directly
 =============================

 Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to
 perform queries that don't map cleanly to models, or directly execute
 ``UPDATE``, ``INSERT``, or ``DELETE`` queries.

 In these cases, you can always access the database directly, routing around
 the model layer entirely.

 The object ``django.db.connection`` represents the
 default database connection, and ``django.db.transaction`` represents the
 default database transaction. To use the database connection, call
 ``connection.cursor()`` to get a cursor object. Then, call
 ``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()``
 or ``cursor.fetchall()`` to return the resulting rows. After performing a data
 changing operation, you should then call
 ``transaction.commit_unless_managed()`` to ensure your changes are committed
 to the database. If your query is purely a data retrieval operation, no commit
 is required. For example::

     def my_custom_sql():
         from django.db import connection, transaction
         cursor = connection.cursor()

         # Data modifying operation - commit required
         cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
         transaction.commit_unless_managed()

         # Data retrieval operation - no commit required
         cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
         row = cursor.fetchone()

         return row

 If you are using more than one database you can use
 ``django.db.connections`` to obtain the connection (and cursor) for a
 specific database. ``django.db.connections`` is a dictionary-like
 object that allows you to retrieve a specific connection using its
 alias::

     from django.db import connections
     cursor = connections['my_db_alias'].cursor()
     # Your code here...
     transaction.commit_unless_managed(using='my_db_alias')

 By default, the Python DB API will return results without their field
 names, which means you end up with a ``list`` of values, rather than a
 ``dict``. At a small performance cost, you can return results as a
 ``dict`` by using something like this::

     def dictfetchall(cursor):
         "Returns all rows from a cursor as a dict"
         desc = cursor.description
         return [
             dict(zip([col[0] for col in desc], row))
             for row in cursor.fetchall()
         ]

 Here is an example of the difference between the two::

     >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
     >>> cursor.fetchall()
     ((54360982L, None), (54360880L, None))

     >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
     >>> dictfetchall(cursor)
     [{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}]


 .. _transactions-and-raw-sql:

 Transactions and raw SQL
 ------------------------

 When you make a raw SQL call, Django will automatically mark the
 current transaction as dirty. You must then ensure that the
 transaction containing those calls is closed correctly. See :ref:`the
 notes on the requirements of Django's transaction handling
 <topics-db-transactions-requirements>` for more details.

 .. versionchanged:: 1.3

 Prior to Django 1.3, it was necessary to manually mark a transaction
 as dirty using ``transaction.set_dirty()`` when using raw SQL calls.

 Connections and cursors
 -----------------------

 ``connection`` and ``cursor`` mostly implement the standard Python DB-API
 described in :pep:`249` (except when it comes to :doc:`transaction handling
 </topics/db/transactions>`). If you're not familiar with the Python DB-API, note
 that the SQL statement in ``cursor.execute()`` uses placeholders, ``"%s"``,
 rather than adding parameters directly within the SQL. If you use this
 technique, the underlying database library will automatically add quotes and
 escaping to your parameter(s) as necessary. (Also note that Django expects the
 ``"%s"`` placeholder, *not* the ``"?"`` placeholder, which is used by the SQLite
 Python bindings. This is for the sake of consistency and sanity.)
	==========================
	Performing raw SQL queries
	==========================

	.. currentmodule:: django.db.models

	When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you
	can fall back to writing raw SQL. Django gives you two ways of performing raw
	SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and
	return model instances`__, or you can avoid the model layer entirely and
	`execute custom SQL directly`__.

	__ `performing raw queries`_
	__ `executing custom SQL directly`_

	.. _executing-raw-queries:

	Performing raw queries
	======================

	.. versionadded:: 1.2

	The ``raw()`` manager method can be used to perform raw SQL queries that
	return model instances:

	.. method:: Manager.raw(raw_query, params=None, translations=None)

	This method method takes a raw SQL query, executes it, and returns a
	:class:`~django.db.models.query.RawQuerySet` instance. This
	:class:`~django.db.models.query.RawQuerySet` instance can be iterated
	over just like an normal QuerySet to provide object instances.

	This is best illustrated with an example. Suppose you've got the following model::

	class Person(models.Model):
	first_name = models.CharField(...)
	last_name = models.CharField(...)
	birth_date = models.DateField(...)

	You could then execute custom SQL like so::

	>>> for p in Person.objects.raw('SELECT * FROM myapp_person'):
	... print p
	John Smith
	Jane Jones

	Of course, this example isn't very exciting -- it's exactly the same as
	running ``Person.objects.all()``. However, ``raw()`` has a bunch of other
	options that make it very powerful.

	.. admonition:: Model table names

	Where'd the name of the ``Person`` table come from in that example?

	By default, Django figures out a database table name by joining the
	model's "app label" -- the name you used in ``manage.py startapp`` -- to
	the model's class name, with an underscore between them. In the example
	we've assumed that the ``Person`` model lives in an app named ``myapp``,
	so its table would be ``myapp_person``.

	For more details check out the documentation for the
	:attr:`~Options.db_table` option, which also lets you manually set the
	database table name.

	.. warning::

	No checking is done on the SQL statement that is passed in to ``.raw()``.
	Django expects that the statement will return a set of rows from the
	database, but does nothing to enforce that. If the query does not
	return rows, a (possibly cryptic) error will result.

	.. warning::

	If you are performing queries on MySQL, note that MySQL's silent type coercion
	may cause unexpected results when mixing types. If you query on a string
	type column, but with an integer value, MySQL will coerce the types of all values
	in the table to an integer before performing the comparison. For example, if your
	table contains the values ``'abc'``, ``'def'`` and you query for ``WHERE mycolumn=0``,
	both rows will match. To prevent this, perform the correct typecasting
	before using the value in a query.

	Mapping query fields to model fields
	------------------------------------

	``raw()`` automatically maps fields in the query to fields on the model.

	The order of fields in your query doesn't matter. In other words, both
	of the following queries work identically::

	>>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person')
	...
	>>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person')
	...

	Matching is done by name. This means that you can use SQL's ``AS`` clauses to
	map fields in the query to model fields. So if you had some other table that
	had ``Person`` data in it, you could easily map it into ``Person`` instances::

	>>> Person.objects.raw('''SELECT first AS first_name,
	... last AS last_name,
	... bd AS birth_date,
	... pk as id,
	... FROM some_other_table''')

	As long as the names match, the model instances will be created correctly.

	Alternatively, you can map fields in the query to model fields using the
	``translations`` argument to ``raw()``. This is a dictionary mapping names of
	fields in the query to names of fields on the model. For example, the above
	query could also be written::

	>>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'}
	>>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map)

	Index lookups
	-------------

	``raw()`` supports indexing, so if you need only the first result you can
	write::

	>>> first_person = Person.objects.raw('SELECT * from myapp_person')[0]

	However, the indexing and slicing are not performed at the database level. If
	you have a big amount of ``Person`` objects in your database, it is more
	efficient to limit the query at the SQL level::

	>>> first_person = Person.objects.raw('SELECT * from myapp_person LIMIT 1')[0]

	Deferring model fields
	----------------------

	Fields may also be left out::

	>>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person')

	The ``Person`` objects returned by this query will be deferred model instances
	(see :meth:`~django.db.models.query.QuerySet.defer()`). This means that the
	fields that are omitted from the query will be loaded on demand. For example::

	>>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'):
	... print p.first_name, # This will be retrieved by the original query
	... print p.last_name # This will be retrieved on demand
	...
	John Smith
	Jane Jones

	From outward appearances, this looks like the query has retrieved both
	the first name and last name. However, this example actually issued 3
	queries. Only the first names were retrieved by the raw() query -- the
	last names were both retrieved on demand when they were printed.

	There is only one field that you can't leave out - the primary key
	field. Django uses the primary key to identify model instances, so it
	must always be included in a raw query. An ``InvalidQuery`` exception
	will be raised if you forget to include the primary key.

	Adding annotations
	------------------

	You can also execute queries containing fields that aren't defined on the
	model. For example, we could use `PostgreSQL's age() function`__ to get a list
	of people with their ages calculated by the database::

	>>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person')
	>>> for p in people:
	... print "%s is %s." % (p.first_name, p.age)
	John is 37.
	Jane is 42.
	...

	__ http://www.postgresql.org/docs/8.4/static/functions-datetime.html

	Passing parameters into ``raw()``
	---------------------------------

	If you need to perform parameterized queries, you can use the ``params``
	argument to ``raw()``::

	>>> lname = 'Doe'
	>>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])

	``params`` is a list of parameters. You'll use ``%s`` placeholders in the
	query string (regardless of your database engine); they'll be replaced with
	parameters from the ``params`` list.

	.. warning::

	Do not use string formatting on raw queries!

	It's tempting to write the above query as::

	>>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname
	>>> Person.objects.raw(query)

	Don't.

	Using the ``params`` list completely protects you from `SQL injection
	attacks`__, a common exploit where attackers inject arbitrary SQL into
	your database. If you use string interpolation, sooner or later you'll
	fall victim to SQL injection. As long as you remember to always use the
	``params`` list you'll be protected.

	__ http://en.wikipedia.org/wiki/SQL_injection

	.. _executing-custom-sql:

	Executing custom SQL directly
	=============================

	Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to
	perform queries that don't map cleanly to models, or directly execute
	``UPDATE``, ``INSERT``, or ``DELETE`` queries.

	In these cases, you can always access the database directly, routing around
	the model layer entirely.

	The object ``django.db.connection`` represents the
	default database connection, and ``django.db.transaction`` represents the
	default database transaction. To use the database connection, call
	``connection.cursor()`` to get a cursor object. Then, call
	``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()``
	or ``cursor.fetchall()`` to return the resulting rows. After performing a data
	changing operation, you should then call
	``transaction.commit_unless_managed()`` to ensure your changes are committed
	to the database. If your query is purely a data retrieval operation, no commit
	is required. For example::

	def my_custom_sql():
	from django.db import connection, transaction
	cursor = connection.cursor()

	# Data modifying operation - commit required
	cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
	transaction.commit_unless_managed()

	# Data retrieval operation - no commit required
	cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
	row = cursor.fetchone()

	return row

	If you are using more than one database you can use
	``django.db.connections`` to obtain the connection (and cursor) for a
	specific database. ``django.db.connections`` is a dictionary-like
	object that allows you to retrieve a specific connection using its
	alias::

	from django.db import connections
	cursor = connections['my_db_alias'].cursor()
	# Your code here...
	transaction.commit_unless_managed(using='my_db_alias')

	By default, the Python DB API will return results without their field
	names, which means you end up with a ``list`` of values, rather than a
	``dict``. At a small performance cost, you can return results as a
	``dict`` by using something like this::

	def dictfetchall(cursor):
	"Returns all rows from a cursor as a dict"
	desc = cursor.description
	return [
	dict(zip([col[0] for col in desc], row))
	for row in cursor.fetchall()
	]

	Here is an example of the difference between the two::

	>>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
	>>> cursor.fetchall()
	((54360982L, None), (54360880L, None))

	>>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
	>>> dictfetchall(cursor)
	[{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}]


	.. _transactions-and-raw-sql:

	Transactions and raw SQL
	------------------------

	When you make a raw SQL call, Django will automatically mark the
	current transaction as dirty. You must then ensure that the
	transaction containing those calls is closed correctly. See :ref:`the
	notes on the requirements of Django's transaction handling
	<topics-db-transactions-requirements>` for more details.

	.. versionchanged:: 1.3

	Prior to Django 1.3, it was necessary to manually mark a transaction
	as dirty using ``transaction.set_dirty()`` when using raw SQL calls.

	Connections and cursors
	-----------------------

	``connection`` and ``cursor`` mostly implement the standard Python DB-API
	described in :pep:`249` (except when it comes to :doc:`transaction handling
	</topics/db/transactions>`). If you're not familiar with the Python DB-API, note
	that the SQL statement in ``cursor.execute()`` uses placeholders, ``"%s"``,
	rather than adding parameters directly within the SQL. If you use this
	technique, the underlying database library will automatically add quotes and
	escaping to your parameter(s) as necessary. (Also note that Django expects the
	``"%s"`` placeholder, not the ``"?"`` placeholder, which is used by the SQLite
	Python bindings. This is for the sake of consistency and sanity.)