| ================== |
| Multiple databases |
| ================== |
| |
| .. versionadded:: 1.2 |
| |
| This topic guide describes Django's support for interacting with |
| multiple databases. Most of the rest of Django's documentation assumes |
| you are interacting with a single database. If you want to interact |
| with multiple databases, you'll need to take some additional steps. |
| |
| Defining your databases |
| ======================= |
| |
| The first step to using more than one database with Django is to tell |
| Django about the database servers you'll be using. This is done using |
| the :setting:`DATABASES` setting. This setting maps database aliases, |
| which are a way to refer to a specific database throughout Django, to |
| a dictionary of settings for that specific connection. The settings in |
| the inner dictionaries are described fully in the :setting:`DATABASES` |
| documentation. |
| |
| Databases can have any alias you choose. However, the alias |
| ``default`` has special significance. Django uses the database with |
| the alias of ``default`` when no other database has been selected. If |
| you don't have a ``default`` database, you need to be careful to |
| always specify the database that you want to use. |
| |
| The following is an example ``settings.py`` snippet defining two |
| databases -- a default PostgreSQL database and a MySQL database called |
| ``users``: |
| |
| .. code-block:: python |
| |
| DATABASES = { |
| 'default': { |
| 'NAME': 'app_data', |
| 'ENGINE': 'django.db.backends.postgresql_psycopg2', |
| 'USER': 'postgres_user', |
| 'PASSWORD': 's3krit' |
| }, |
| 'users': { |
| 'NAME': 'user_data', |
| 'ENGINE': 'django.db.backends.mysql', |
| 'USER': 'mysql_user', |
| 'PASSWORD': 'priv4te' |
| } |
| } |
| |
| If you attempt to access a database that you haven't defined in your |
| :setting:`DATABASES` setting, Django will raise a |
| ``django.db.utils.ConnectionDoesNotExist`` exception. |
| |
| Synchronizing your databases |
| ============================ |
| |
| The :djadmin:`syncdb` management command operates on one database at a |
| time. By default, it operates on the ``default`` database, but by |
| providing a :djadminopt:`--database` argument, you can tell syncdb to |
| synchronize a different database. So, to synchronize all models onto |
| all databases in our example, you would need to call:: |
| |
| $ ./manage.py syncdb |
| $ ./manage.py syncdb --database=users |
| |
| If you don't want every application to be synchronized onto a |
| particular database, you can define a :ref:`database |
| router<topics-db-multi-db-routing>` that implements a policy |
| constraining the availability of particular models. |
| |
| Alternatively, if you want fine-grained control of synchronization, |
| you can pipe all or part of the output of :djadmin:`sqlall` for a |
| particular application directly into your database prompt, like this:: |
| |
| $ ./manage.py sqlall sales | ./manage.py dbshell |
| |
| Using other management commands |
| ------------------------------- |
| |
| The other ``django-admin.py`` commands that interact with the database |
| operate in the same way as :djadmin:`syncdb` -- they only ever operate |
| on one database at a time, using :djadminopt:`--database` to control |
| the database used. |
| |
| .. _topics-db-multi-db-routing: |
| |
| Automatic database routing |
| ========================== |
| |
| The easiest way to use multiple databases is to set up a database |
| routing scheme. The default routing scheme ensures that objects remain |
| 'sticky' to their original database (i.e., an object retrieved from |
| the ``foo`` database will be saved on the same database). The default |
| routing scheme ensures that if a database isn't specified, all queries |
| fall back to the ``default`` database. |
| |
| You don't have to do anything to activate the default routing scheme |
| -- it is provided 'out of the box' on every Django project. However, |
| if you want to implement more interesting database allocation |
| behaviors, you can define and install your own database routers. |
| |
| Database routers |
| ---------------- |
| |
| A database Router is a class that provides up to four methods: |
| |
| .. method:: db_for_read(model, **hints) |
| |
| Suggest the database that should be used for read operations for |
| objects of type ``model``. |
| |
| If a database operation is able to provide any additional |
| information that might assist in selecting a database, it will be |
| provided in the ``hints`` dictionary. Details on valid hints are |
| provided :ref:`below <topics-db-multi-db-hints>`. |
| |
| Returns None if there is no suggestion. |
| |
| .. method:: db_for_write(model, **hints) |
| |
| Suggest the database that should be used for writes of objects of |
| type Model. |
| |
| If a database operation is able to provide any additional |
| information that might assist in selecting a database, it will be |
| provided in the ``hints`` dictionary. Details on valid hints are |
| provided :ref:`below <topics-db-multi-db-hints>`. |
| |
| Returns None if there is no suggestion. |
| |
| .. method:: allow_relation(obj1, obj2, **hints) |
| |
| Return True if a relation between obj1 and obj2 should be |
| allowed, False if the relation should be prevented, or None if |
| the router has no opinion. This is purely a validation operation, |
| used by foreign key and many to many operations to determine if a |
| relation should be allowed between two objects. |
| |
| .. method:: allow_syncdb(db, model) |
| |
| Determine if the ``model`` should be synchronized onto the |
| database with alias ``db``. Return True if the model should be |
| synchronized, False if it should not be synchronized, or None if |
| the router has no opinion. This method can be used to determine |
| the availability of a model on a given database. |
| |
| A router doesn't have to provide *all* these methods - it omit one or |
| more of them. If one of the methods is omitted, Django will skip that |
| router when performing the relevant check. |
| |
| .. _topics-db-multi-db-hints: |
| |
| Hints |
| ~~~~~ |
| |
| The hints received by the database router can be used to decide which |
| database should receive a given request. |
| |
| At present, the only hint that will be provided is ``instance``, an |
| object instance that is related to the read or write operation that is |
| underway. This might be the instance that is being saved, or it might |
| be an instance that is being added in a many-to-many relation. In some |
| cases, no instance hint will be provided at all. The router checks for |
| the existence of an instance hint, and determine if that hint should be |
| used to alter routing behavior. |
| |
| Using routers |
| ------------- |
| |
| Database routers are installed using the :setting:`DATABASE_ROUTERS` |
| setting. This setting defines a list of class names, each specifying a |
| router that should be used by the master router |
| (``django.db.router``). |
| |
| The master router is used by Django's database operations to allocate |
| database usage. Whenever a query needs to know which database to use, |
| it calls the master router, providing a model and a hint (if |
| available). Django then tries each router in turn until a database |
| suggestion can be found. If no suggestion can be found, it tries the |
| current ``_state.db`` of the hint instance. If a hint instance wasn't |
| provided, or the instance doesn't currently have database state, the |
| master router will allocate the ``default`` database. |
| |
| An example |
| ---------- |
| |
| .. admonition:: Example purposes only! |
| |
| This example is intended as a demonstration of how the router |
| infrastructure can be used to alter database usage. It |
| intentionally ignores some complex issues in order to |
| demonstrate how routers are used. |
| |
| This example won't work if any of the models in ``myapp`` contain |
| relationships to models outside of the ``other`` database. |
| :ref:`Cross-database relationships <no_cross_database_relations>` |
| introduce referential integrity problems that Django can't |
| currently handle. |
| |
| The master/slave configuration described is also flawed -- it |
| doesn't provide any solution for handling replication lag (i.e., |
| query inconsistencies introduced because of the time taken for a |
| write to propagate to the slaves). It also doesn't consider the |
| interaction of transactions with the database utilization strategy. |
| |
| So - what does this mean in practice? Say you want ``myapp`` to |
| exist on the ``other`` database, and you want all other models in a |
| master/slave relationship between the databases ``master``, ``slave1`` and |
| ``slave2``. To implement this, you would need 2 routers:: |
| |
| class MyAppRouter(object): |
| """A router to control all database operations on models in |
| the myapp application""" |
| |
| def db_for_read(self, model, **hints): |
| "Point all operations on myapp models to 'other'" |
| if model._meta.app_label == 'myapp': |
| return 'other' |
| return None |
| |
| def db_for_write(self, model, **hints): |
| "Point all operations on myapp models to 'other'" |
| if model._meta.app_label == 'myapp': |
| return 'other' |
| return None |
| |
| def allow_relation(self, obj1, obj2, **hints): |
| "Allow any relation if a model in myapp is involved" |
| if obj1._meta.app_label == 'myapp' or obj2._meta.app_label == 'myapp': |
| return True |
| return None |
| |
| def allow_syncdb(self, db, model): |
| "Make sure the myapp app only appears on the 'other' db" |
| if db == 'other': |
| return model._meta.app_label == 'myapp' |
| elif model._meta.app_label == 'myapp': |
| return False |
| return None |
| |
| class MasterSlaveRouter(object): |
| """A router that sets up a simple master/slave configuration""" |
| |
| def db_for_read(self, model, **hints): |
| "Point all read operations to a random slave" |
| return random.choice(['slave1','slave2']) |
| |
| def db_for_write(self, model, **hints): |
| "Point all write operations to the master" |
| return 'master' |
| |
| def allow_relation(self, obj1, obj2, **hints): |
| "Allow any relation between two objects in the db pool" |
| db_list = ('master','slave1','slave2') |
| if obj1._state.db in db_list and obj2._state.db in db_list: |
| return True |
| return None |
| |
| def allow_syncdb(self, db, model): |
| "Explicitly put all models on all databases." |
| return True |
| |
| Then, in your settings file, add the following (substituting ``path.to.`` with |
| the actual python path to the module where you define the routers):: |
| |
| DATABASE_ROUTERS = ['path.to.MyAppRouter', 'path.to.MasterSlaveRouter'] |
| |
| The order in which routers are processed is significant. Routers will |
| be queried in the order the are listed in the |
| :setting:`DATABASE_ROUTERS` setting . In this example, the |
| ``MyAppRouter`` is processed before the ``MasterSlaveRouter``, and as a |
| result, decisions concerning the models in ``myapp`` are processed |
| before any other decision is made. If the :setting:`DATABASE_ROUTERS` |
| setting listed the two routers in the other order, |
| ``MasterSlaveRouter.allow_syncdb()`` would be processed first. The |
| catch-all nature of the MasterSlaveRouter implementation would mean |
| that all models would be available on all databases. |
| |
| With this setup installed, lets run some Django code:: |
| |
| >>> # This retrieval will be performed on the 'credentials' database |
| >>> fred = User.objects.get(username='fred') |
| >>> fred.first_name = 'Frederick' |
| |
| >>> # This save will also be directed to 'credentials' |
| >>> fred.save() |
| |
| >>> # These retrieval will be randomly allocated to a slave database |
| >>> dna = Person.objects.get(name='Douglas Adams') |
| |
| >>> # A new object has no database allocation when created |
| >>> mh = Book(title='Mostly Harmless') |
| |
| >>> # This assignment will consult the router, and set mh onto |
| >>> # the same database as the author object |
| >>> mh.author = dna |
| |
| >>> # This save will force the 'mh' instance onto the master database... |
| >>> mh.save() |
| |
| >>> # ... but if we re-retrieve the object, it will come back on a slave |
| >>> mh = Book.objects.get(title='Mostly Harmless') |
| |
| |
| Manually selecting a database |
| ============================= |
| |
| Django also provides an API that allows you to maintain complete control |
| over database usage in your code. A manually specified database allocation |
| will take priority over a database allocated by a router. |
| |
| Manually selecting a database for a ``QuerySet`` |
| ------------------------------------------------ |
| |
| You can select the database for a ``QuerySet`` at any point in the |
| ``QuerySet`` "chain." Just call ``using()`` on the ``QuerySet`` to get |
| another ``QuerySet`` that uses the specified database. |
| |
| ``using()`` takes a single argument: the alias of the database on |
| which you want to run the query. For example:: |
| |
| >>> # This will run on the 'default' database. |
| >>> Author.objects.all() |
| |
| >>> # So will this. |
| >>> Author.objects.using('default').all() |
| |
| >>> # This will run on the 'other' database. |
| >>> Author.objects.using('other').all() |
| |
| Selecting a database for ``save()`` |
| ----------------------------------- |
| |
| Use the ``using`` keyword to ``Model.save()`` to specify to which |
| database the data should be saved. |
| |
| For example, to save an object to the ``legacy_users`` database, you'd |
| use this:: |
| |
| >>> my_object.save(using='legacy_users') |
| |
| If you don't specify ``using``, the ``save()`` method will save into |
| the default database allocated by the routers. |
| |
| Moving an object from one database to another |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| If you've saved an instance to one database, it might be tempting to |
| use ``save(using=...)`` as a way to migrate the instance to a new |
| database. However, if you don't take appropriate steps, this could |
| have some unexpected consequences. |
| |
| Consider the following example:: |
| |
| >>> p = Person(name='Fred') |
| >>> p.save(using='first') # (statement 1) |
| >>> p.save(using='second') # (statement 2) |
| |
| In statement 1, a new ``Person`` object is saved to the ``first`` |
| database. At this time, ``p`` doesn't have a primary key, so Django |
| issues a SQL ``INSERT`` statement. This creates a primary key, and |
| Django assigns that primary key to ``p``. |
| |
| When the save occurs in statement 2, ``p`` already has a primary key |
| value, and Django will attempt to use that primary key on the new |
| database. If the primary key value isn't in use in the ``second`` |
| database, then you won't have any problems -- the object will be |
| copied to the new database. |
| |
| However, if the primary key of ``p`` is already in use on the |
| ``second`` database, the existing object in the ``second`` database |
| will be overridden when ``p`` is saved. |
| |
| You can avoid this in two ways. First, you can clear the primary key |
| of the instance. If an object has no primary key, Django will treat it |
| as a new object, avoiding any loss of data on the ``second`` |
| database:: |
| |
| >>> p = Person(name='Fred') |
| >>> p.save(using='first') |
| >>> p.pk = None # Clear the primary key. |
| >>> p.save(using='second') # Write a completely new object. |
| |
| The second option is to use the ``force_insert`` option to ``save()`` |
| to ensure that Django does a SQL ``INSERT``:: |
| |
| >>> p = Person(name='Fred') |
| >>> p.save(using='first') |
| >>> p.save(using='second', force_insert=True) |
| |
| This will ensure that the person named ``Fred`` will have the same |
| primary key on both databases. If that primary key is already in use |
| when you try to save onto the ``second`` database, an error will be |
| raised. |
| |
| Selecting a database to delete from |
| ----------------------------------- |
| |
| By default, a call to delete an existing object will be executed on |
| the same database that was used to retrieve the object in the first |
| place:: |
| |
| >>> u = User.objects.using('legacy_users').get(username='fred') |
| >>> u.delete() # will delete from the `legacy_users` database |
| |
| To specify the database from which a model will be deleted, pass a |
| ``using`` keyword argument to the ``Model.delete()`` method. This |
| argument works just like the ``using`` keyword argument to ``save()``. |
| |
| For example, if you're migrating a user from the ``legacy_users`` |
| database to the ``new_users`` database, you might use these commands:: |
| |
| >>> user_obj.save(using='new_users') |
| >>> user_obj.delete(using='legacy_users') |
| |
| Using managers with multiple databases |
| -------------------------------------- |
| |
| Use the ``db_manager()`` method on managers to give managers access to |
| a non-default database. |
| |
| For example, say you have a custom manager method that touches the |
| database -- ``User.objects.create_user()``. Because ``create_user()`` |
| is a manager method, not a ``QuerySet`` method, you can't do |
| ``User.objects.using('new_users').create_user()``. (The |
| ``create_user()`` method is only available on ``User.objects``, the |
| manager, not on ``QuerySet`` objects derived from the manager.) The |
| solution is to use ``db_manager()``, like this:: |
| |
| User.objects.db_manager('new_users').create_user(...) |
| |
| ``db_manager()`` returns a copy of the manager bound to the database you specify. |
| |
| Using ``get_query_set()`` with multiple databases |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| If you're overriding ``get_query_set()`` on your manager, be sure to |
| either call the method on the parent (using ``super()``) or do the |
| appropriate handling of the ``_db`` attribute on the manager (a string |
| containing the name of the database to use). |
| |
| For example, if you want to return a custom ``QuerySet`` class from |
| the ``get_query_set`` method, you could do this:: |
| |
| class MyManager(models.Manager): |
| def get_query_set(self): |
| qs = CustomQuerySet(self.model) |
| if self._db is not None: |
| qs = qs.using(self._db) |
| return qs |
| |
| Exposing multiple databases in Django's admin interface |
| ======================================================= |
| |
| Django's admin doesn't have any explicit support for multiple |
| databases. If you want to provide an admin interface for a model on a |
| database other than that specified by your router chain, you'll |
| need to write custom :class:`~django.contrib.admin.ModelAdmin` classes |
| that will direct the admin to use a specific database for content. |
| |
| ``ModelAdmin`` objects have five methods that require customization for |
| multiple-database support:: |
| |
| class MultiDBModelAdmin(admin.ModelAdmin): |
| # A handy constant for the name of the alternate database. |
| using = 'other' |
| |
| def save_model(self, request, obj, form, change): |
| # Tell Django to save objects to the 'other' database. |
| obj.save(using=self.using) |
| |
| def delete_model(self, request, obj): |
| # Tell Django to delete objects from the 'other' database |
| obj.delete(using=self.using) |
| |
| def queryset(self, request): |
| # Tell Django to look for objects on the 'other' database. |
| return super(MultiDBModelAdmin, self).queryset(request).using(self.using) |
| |
| def formfield_for_foreignkey(self, db_field, request=None, **kwargs): |
| # Tell Django to populate ForeignKey widgets using a query |
| # on the 'other' database. |
| return super(MultiDBModelAdmin, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs) |
| |
| def formfield_for_manytomany(self, db_field, request=None, **kwargs): |
| # Tell Django to populate ManyToMany widgets using a query |
| # on the 'other' database. |
| return super(MultiDBModelAdmin, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs) |
| |
| The implementation provided here implements a multi-database strategy |
| where all objects of a given type are stored on a specific database |
| (e.g., all ``User`` objects are in the ``other`` database). If your |
| usage of multiple databases is more complex, your ``ModelAdmin`` will |
| need to reflect that strategy. |
| |
| Inlines can be handled in a similar fashion. They require three customized methods:: |
| |
| class MultiDBTabularInline(admin.TabularInline): |
| using = 'other' |
| |
| def queryset(self, request): |
| # Tell Django to look for inline objects on the 'other' database. |
| return super(MultiDBTabularInline, self).queryset(request).using(self.using) |
| |
| def formfield_for_foreignkey(self, db_field, request=None, **kwargs): |
| # Tell Django to populate ForeignKey widgets using a query |
| # on the 'other' database. |
| return super(MultiDBTabularInline, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs) |
| |
| def formfield_for_manytomany(self, db_field, request=None, **kwargs): |
| # Tell Django to populate ManyToMany widgets using a query |
| # on the 'other' database. |
| return super(MultiDBTabularInline, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs) |
| |
| Once you've written your model admin definitions, they can be |
| registered with any ``Admin`` instance:: |
| |
| from django.contrib import admin |
| |
| # Specialize the multi-db admin objects for use with specific models. |
| class BookInline(MultiDBTabularInline): |
| model = Book |
| |
| class PublisherAdmin(MultiDBModelAdmin): |
| inlines = [BookInline] |
| |
| admin.site.register(Author, MultiDBModelAdmin) |
| admin.site.register(Publisher, PublisherAdmin) |
| |
| othersite = admin.Site('othersite') |
| othersite.register(Publisher, MultiDBModelAdmin) |
| |
| This example sets up two admin sites. On the first site, the |
| ``Author`` and ``Publisher`` objects are exposed; ``Publisher`` |
| objects have an tabular inline showing books published by that |
| publisher. The second site exposes just publishers, without the |
| inlines. |
| |
| Using raw cursors with multiple databases |
| ========================================= |
| |
| If you are using more than one database you can use |
| ``django.db.connections`` to obtain the connection (and cursor) for a |
| specific database. ``django.db.connections`` is a dictionary-like |
| object that allows you to retrieve a specific connection using its |
| alias:: |
| |
| from django.db import connections |
| cursor = connections['my_db_alias'].cursor() |
| |
| Limitations of multiple databases |
| ================================= |
| |
| .. _no_cross_database_relations: |
| |
| Cross-database relations |
| ------------------------ |
| |
| Django doesn't currently provide any support for foreign key or |
| many-to-many relationships spanning multiple databases. If you |
| have used a router to partition models to different databases, |
| any foreign key and many-to-many relationships defined by those |
| models must be internal to a single database. |
| |
| This is because of referential integrity. In order to maintain a |
| relationship between two objects, Django needs to know that the |
| primary key of the related object is valid. If the primary key is |
| stored on a separate database, it's not possible to easily evaluate |
| the validity of a primary key. |
| |
| If you're using Postgres, Oracle, or MySQL with InnoDB, this is |
| enforced at the database integrity level -- database level key |
| constraints prevent the creation of relations that can't be validated. |
| |
| However, if you're using SQLite or MySQL with MyISAM tables, there is |
| no enforced referential integrity; as a result, you may be able to |
| 'fake' cross database foreign keys. However, this configuration is not |
| officially supported by Django. |