| ============ |
| File Uploads |
| ============ |
| |
| .. currentmodule:: django.core.files.uploadedfile |
| |
| When Django handles a file upload, the file data is placed in |
| :attr:`request.FILES <django.http.HttpRequest.FILES>` (for more on the |
| ``request`` object see the documentation for :doc:`request and response objects |
| </ref/request-response>`). This document explains how files are stored on disk |
| and in memory, and how to customize the default behavior. |
| |
| Basic file uploads |
| ================== |
| |
| Consider a simple form containing a :class:`~django.forms.FileField`:: |
| |
|     from django import forms |
| |
|     class UploadFileForm(forms.Form): |
|         title = forms.CharField(max_length=50) |
|         file = forms.FileField() |
| |
| A view handling this form will receive the file data in |
| :attr:`request.FILES <django.http.HttpRequest.FILES>`, which is a dictionary |
| containing a key for each :class:`~django.forms.FileField` (or |
| :class:`~django.forms.ImageField`, or other :class:`~django.forms.FileField` |
| subclass) in the form. So the data from the above form would |
| be accessible as ``request.FILES['file']``. |
| |
| Note that :attr:`request.FILES <django.http.HttpRequest.FILES>` will only |
| contain data if the request method was ``POST`` and the ``<form>`` that posted |
| the request has the attribute ``enctype="multipart/form-data"``. Otherwise, |
| ``request.FILES`` will be empty. |
| |
| Most of the time, you'll simply pass the file data from ``request`` into the |
| form as described in :ref:`binding-uploaded-files`. This would look |
| something like:: |
| |
|     from django.http import HttpResponseRedirect |
|     from django.shortcuts import render_to_response |
| |
|     # Imaginary function to handle an uploaded file. |
|     from somewhere import handle_uploaded_file |
| |
|     def upload_file(request): |
|         if request.method == 'POST': |
|             form = UploadFileForm(request.POST, request.FILES) |
|             if form.is_valid(): |
|                 handle_uploaded_file(request.FILES['file']) |
|                 return HttpResponseRedirect('/success/url/') |
|         else: |
|             form = UploadFileForm() |
|         return render_to_response('upload.html', {'form': form}) |
| |
| Notice that we have to pass :attr:`request.FILES <django.http.HttpRequest.FILES>` |
| into the form's constructor; this is how file data gets bound into a form. |
| |
| Handling uploaded files |
| ----------------------- |
| |
| .. class:: UploadedFile |
| |
| The final piece of the puzzle is handling the actual file data from |
| :attr:`request.FILES <django.http.HttpRequest.FILES>`. Each entry in this |
| dictionary is an ``UploadedFile`` object -- a simple wrapper around an uploaded |
| file. You'll usually use one of these methods to access the uploaded content: |
| |
| .. method:: read() |
| |
| Read the entire uploaded data from the file. Be careful with this |
| method: if the uploaded file is huge it can overwhelm your system if you |
| try to read it into memory. You'll probably want to use ``chunks()`` |
| instead; see below. |
| |
| .. method:: multiple_chunks() |
| |
| Returns ``True`` if the uploaded file is big enough to require |
| reading in multiple chunks. By default this will be any file |
| larger than 2.5 megabytes, but that's configurable; see below. |
| |
| .. method:: chunks() |
| |
| A generator returning chunks of the file. If ``multiple_chunks()`` is |
| ``True``, you should use this method in a loop instead of ``read()``. |
| |
| In practice, it's often easiest simply to use ``chunks()`` all the time; |
| see the example below. |
| |
| .. attribute:: name |
| |
| The name of the uploaded file (e.g. ``my_file.txt``). |
| |
| .. attribute:: size |
| |
| The size, in bytes, of the uploaded file. |
| |
| There are a few other methods and attributes available on ``UploadedFile`` |
| objects; see `UploadedFile objects`_ for a complete reference. |
| |
| Putting it all together, here's a common way you might handle an uploaded file:: |
| |
|     def handle_uploaded_file(f): |
|         # The ``with`` block closes the file even if a write fails. |
|         with open('some/file/name.txt', 'wb+') as destination: |
|             for chunk in f.chunks(): |
|                 destination.write(chunk) |
| |
| Looping over ``UploadedFile.chunks()`` instead of using ``read()`` ensures that |
| large files don't overwhelm your system's memory. |
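| |
| The same loop works for any per-chunk processing, not only copying. As a |
| minimal sketch, the helper below computes a SHA-256 digest one chunk at a |
| time. ``FakeUpload`` and ``sha256_of_upload()`` are hypothetical names; |
| ``FakeUpload`` only mimics ``chunks()`` so the example runs without Django. |
| In a view you would pass ``request.FILES['file']`` instead: |
| |
```python
import hashlib


class FakeUpload:
    """Hypothetical stand-in for UploadedFile; it only provides chunks()."""

    def __init__(self, data, chunk_size=4):
        self._data = data
        self._chunk_size = chunk_size

    def chunks(self):
        # Yield the payload in fixed-size pieces, as UploadedFile.chunks() would.
        for start in range(0, len(self._data), self._chunk_size):
            yield self._data[start:start + self._chunk_size]


def sha256_of_upload(f):
    """Hash an upload chunk by chunk so the whole file is never held in memory."""
    digest = hashlib.sha256()
    for chunk in f.chunks():
        digest.update(chunk)
    return digest.hexdigest()


upload = FakeUpload(b"hello, upload")
checksum = sha256_of_upload(upload)
```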
| |
| Where uploaded data is stored |
| ----------------------------- |
| |
| Before you save uploaded files, the data needs to be stored somewhere. |
| |
| By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold |
| the entire contents of the upload in memory. This means that saving the file |
| involves only a read from memory and a write to disk and thus is very fast. |
| |
| However, if an uploaded file is too large, Django will write the uploaded file |
| to a temporary file stored in your system's temporary directory. On a Unix-like |
| platform this means you can expect Django to generate a file called something |
| like ``/tmp/tmpzfp6I6.upload``. If an upload is large enough, you can watch this |
| file grow in size as Django streams the data onto disk. |
| |
| These specifics -- 2.5 megabytes; ``/tmp``; etc. -- are simply "reasonable |
| defaults". Read on for details on how you can customize or completely replace |
| upload behavior. |
| |
| Changing upload handler behavior |
| -------------------------------- |
| |
| Four settings control Django's file upload behavior: |
| |
| :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` |
| The maximum size, in bytes, for files that will be uploaded into memory. |
| Files larger than :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be |
| streamed to disk. |
| |
| Defaults to 2.5 megabytes. |
| |
| :setting:`FILE_UPLOAD_TEMP_DIR` |
| The directory where uploaded files larger than |
| :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be stored. |
| |
| Defaults to your system's standard temporary directory (i.e. ``/tmp`` on |
| most Unix-like systems). |
| |
| :setting:`FILE_UPLOAD_PERMISSIONS` |
| The numeric mode (e.g. ``0644``) to set newly uploaded files to. For |
| more information about what these modes mean, see the `documentation for |
| os.chmod`_. |
| |
| If this isn't given or is ``None``, you'll get operating-system |
| dependent behavior. On most platforms, temporary files will have a mode |
| of ``0600``, and files saved from memory will be saved using the |
| system's standard umask. |
| |
| .. warning:: |
| |
| If you're not familiar with file modes, please note that the leading |
| ``0`` is very important: it indicates an octal number, which is the |
| way that modes must be specified. If you try to use ``644``, you'll |
| get totally incorrect behavior. |
| |
| **Always prefix the mode with a 0.** |
| |
| :setting:`FILE_UPLOAD_HANDLERS` |
| The actual handlers for uploaded files. Changing this setting allows |
| complete customization -- even replacement -- of Django's upload |
| process. See `upload handlers`_, below, for details. |
| |
| Defaults to:: |
| |
| ("django.core.files.uploadhandler.MemoryFileUploadHandler", |
| "django.core.files.uploadhandler.TemporaryFileUploadHandler",) |
| |
| Which means "try to upload to memory first, then fall back to temporary |
| files." |
| |
| .. _documentation for os.chmod: http://docs.python.org/library/os.html#os.chmod |
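| |
| As a rough illustration, a settings fragment that overrides the first three |
| settings and spells out the default handlers might look like the following. |
| The 5 MB limit and the ``/var/tmp/django-uploads`` directory are made-up |
| values, not recommendations. The mode is written as ``0o644``, the octal |
| spelling accepted by Python 2.6 and later and required by Python 3, where a |
| bare leading ``0`` is a syntax error: |
| |
```python
# settings.py (fragment); every value below is an illustrative assumption.

# Keep uploads of up to 5 MB in memory (the default is 2621440 bytes, 2.5 MB).
FILE_UPLOAD_MAX_MEMORY_SIZE = 5 * 1024 * 1024

# Hypothetical directory for larger uploads; it should exist and be writable
# by the server process.
FILE_UPLOAD_TEMP_DIR = "/var/tmp/django-uploads"

# Octal mode rw-r--r--. Decimal 644 would be a completely different mode.
FILE_UPLOAD_PERMISSIONS = 0o644

# The default handler pair, listed explicitly: memory first, then temp files.
FILE_UPLOAD_HANDLERS = (
    "django.core.files.uploadhandler.MemoryFileUploadHandler",
    "django.core.files.uploadhandler.TemporaryFileUploadHandler",
)
```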
| |
| ``UploadedFile`` objects |
| ======================== |
| |
| In addition to those inherited from :class:`File`, all ``UploadedFile`` objects |
| define the following methods/attributes: |
| |
| .. attribute:: UploadedFile.content_type |
| |
| The content-type header uploaded with the file (e.g. ``text/plain`` or |
| ``application/pdf``). Like any data supplied by the user, you shouldn't |
| trust that the uploaded file is actually this type. You'll still need to |
| validate that the file contains the content that the content-type header |
| claims -- "trust but verify." |
| |
| .. attribute:: UploadedFile.charset |
| |
| For ``text/*`` content-types, the character set (e.g. ``utf8``) supplied |
| by the browser. Again, "trust but verify" is the best policy here. |
| |
| .. method:: UploadedFile.temporary_file_path() |
| |
| Only files uploaded onto disk will have this method; it returns the full |
| path to the temporary uploaded file. |
| |
| .. note:: |
| |
| Like regular Python files, you can read the file line-by-line simply by |
| iterating over the uploaded file: |
| |
| .. code-block:: python |
| |
|     for line in uploadedfile: |
|         do_something_with(line) |
| |
| However, *unlike* standard Python files, :class:`UploadedFile` only |
| understands ``\n`` (also known as "Unix-style") line endings. If you know |
| that you need to handle uploaded files with different line endings, you'll |
| need to do so in your view. |
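| |
| One way to handle other line endings in a view is to split the raw chunks |
| yourself. The sketch below is an assumption about how you might do that, not |
| part of Django: the hypothetical helper ``iter_universal_lines()`` accepts any |
| iterable of byte strings (for example ``uploadedfile.chunks()``) and treats |
| ``\n``, ``\r\n`` and ``\r`` as line endings, including a ``\r\n`` pair that |
| is split across two chunks: |
| |
```python
def iter_universal_lines(chunks):
    """Yield lines (without terminators) from an iterable of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # bytes.splitlines() breaks on \n, \r and \r\n; keep the terminators
        # so a trailing \r can still be joined with a \n in the next chunk.
        pieces = buffer.splitlines(True)
        # Hold back the last piece: it may be incomplete, or it may end in
        # \r with the matching \n still to come.
        buffer = pieces.pop() if pieces else b""
        for piece in pieces:
            yield piece.rstrip(b"\r\n")
    if buffer:
        # Whatever is left after the final chunk is the last line.
        yield buffer.rstrip(b"\r\n")


lines = list(iter_universal_lines([b"a\r", b"\nb\rc\n", b"d"]))
```
| |
| The helper yields byte strings; decoding them (for instance with the charset |
| reported for the upload) is left to the caller. |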
| |
| Upload handlers |
| =============== |
| |
| When a user uploads a file, Django passes off the file data to an *upload |
| handler* -- a small class that handles file data as it gets uploaded. Upload |
| handlers are initially defined in the :setting:`FILE_UPLOAD_HANDLERS` setting, |
| which defaults to:: |
| |
| ("django.core.files.uploadhandler.MemoryFileUploadHandler", |
| "django.core.files.uploadhandler.TemporaryFileUploadHandler",) |
| |
| Together the ``MemoryFileUploadHandler`` and ``TemporaryFileUploadHandler`` |
| provide Django's default file upload behavior of reading small files into memory |
| and large ones onto disk. |
| |
| You can write custom handlers that customize how Django handles files. You |
| could, for example, use custom handlers to enforce user-level quotas, compress |
| data on the fly, render progress bars, and even send data to another storage |
| location directly without storing it locally. |
| |
| Modifying upload handlers on the fly |
| ------------------------------------ |
| |
| Sometimes particular views require different upload behavior. In these cases, |
| you can override upload handlers on a per-request basis by modifying |
| ``request.upload_handlers``. By default, this list will contain the upload |
| handlers given by :setting:`FILE_UPLOAD_HANDLERS`, but you can modify the list |
| as you would any other list. |
| |
| For instance, suppose you've written a ``ProgressBarUploadHandler`` that |
| provides feedback on upload progress to some sort of AJAX widget. You'd add this |
| handler to your upload handlers like this:: |
| |
| request.upload_handlers.insert(0, ProgressBarUploadHandler()) |
| |
| You'd probably want to use ``list.insert()`` in this case (instead of |
| ``append()``) because a progress bar handler would need to run *before* any |
| other handlers. Remember, the upload handlers are processed in order. |
| |
| If you want to replace the upload handlers completely, you can just assign a new |
| list:: |
| |
| request.upload_handlers = [ProgressBarUploadHandler()] |
| |
| .. note:: |
| |
| You can only modify upload handlers *before* accessing |
| ``request.POST`` or ``request.FILES`` -- it doesn't make sense to |
| change upload handlers after upload handling has already |
| started. If you try to modify ``request.upload_handlers`` after |
| reading from ``request.POST`` or ``request.FILES``, Django will |
| raise an error. |
| |
| Thus, you should always modify upload handlers as early in your view as |
| possible. |
| |
| Also, ``request.POST`` is accessed by |
| :class:`~django.middleware.csrf.CsrfViewMiddleware` which is enabled by |
| default. This means you will need to use |
| :func:`~django.views.decorators.csrf.csrf_exempt` on your view to allow you |
| to change the upload handlers. You will then need to use |
| :func:`~django.views.decorators.csrf.csrf_protect` on the function that |
| actually processes the request. Note that this means that the handlers may |
| start receiving the file upload before the CSRF checks have been done. |
| Example code: |
| |
| .. code-block:: python |
| |
|     from django.views.decorators.csrf import csrf_exempt, csrf_protect |
| |
|     @csrf_exempt |
|     def upload_file_view(request): |
|         request.upload_handlers.insert(0, ProgressBarUploadHandler()) |
|         return _upload_file_view(request) |
| |
|     @csrf_protect |
|     def _upload_file_view(request): |
|         ... # Process request |
| |
| |
| Writing custom upload handlers |
| ------------------------------ |
| |
| All file upload handlers should be subclasses of |
| ``django.core.files.uploadhandler.FileUploadHandler``. You can define upload |
| handlers wherever you wish. |
| |
| Required methods |
| ~~~~~~~~~~~~~~~~ |
| |
| Custom file upload handlers **must** define the following methods: |
| |
| ``FileUploadHandler.receive_data_chunk(self, raw_data, start)`` |
| Receives a "chunk" of data from the file upload. |
| |
| ``raw_data`` is a byte string containing the uploaded data. |
| |
| ``start`` is the position in the file where this ``raw_data`` chunk |
| begins. |
| |
| The data you return will get fed into the subsequent upload handlers' |
| ``receive_data_chunk`` methods. In this way, one handler can be a |
| "filter" for other handlers. |
| |
| Return ``None`` from ``receive_data_chunk`` to short-circuit the remaining |
| upload handlers so that they never receive this chunk. This is useful if |
| you're storing the uploaded data yourself and don't want future handlers to |
| store a copy of the data. |
| |
| If you raise a ``StopUpload`` exception, the upload will abort; if you |
| raise a ``SkipFile`` exception, the file will be skipped entirely. |
| |
| ``FileUploadHandler.file_complete(self, file_size)`` |
| Called when a file has finished uploading. |
| |
| The handler should return an ``UploadedFile`` object that will be stored |
| in ``request.FILES``. Handlers may also return ``None`` to indicate that |
| the ``UploadedFile`` object should come from subsequent upload handlers. |
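| |
| The return-value rules above can be modelled in plain Python. The sketch |
| below is a simplified model for illustration, not Django's parser: |
| ``feed_file()`` and the three handler classes are hypothetical, they do not |
| subclass ``FileUploadHandler``, and a tuple stands in for the ``UploadedFile`` |
| that a real ``file_complete()`` would return: |
| |
```python
class UppercaseFilter:
    """Transforms each chunk and passes the result down the chain."""

    def receive_data_chunk(self, raw_data, start):
        return raw_data.upper()

    def file_complete(self, file_size):
        return None  # defer to a later handler


class CollectingSink:
    """Stores the data itself and keeps it from reaching later handlers."""

    def __init__(self):
        self.parts = []

    def receive_data_chunk(self, raw_data, start):
        self.parts.append(raw_data)
        return None  # short-circuit: later handlers never see this chunk

    def file_complete(self, file_size):
        # Stand-in for an UploadedFile: (content, size).
        return (b"".join(self.parts), file_size)


class NeverReached:
    """Sits after the sink, so it should receive no chunks at all."""

    def __init__(self):
        self.calls = 0

    def receive_data_chunk(self, raw_data, start):
        self.calls += 1
        return raw_data

    def file_complete(self, file_size):
        return None


def feed_file(handlers, chunks):
    """Simplified model of the chaining rules; not Django's actual parser."""
    position = 0  # offset of the current chunk in the original stream
    for chunk in chunks:
        data = chunk
        for handler in handlers:
            data = handler.receive_data_chunk(data, position)
            if data is None:
                break  # remaining handlers are skipped for this chunk
        position += len(chunk)
    # The first handler to return something other than None supplies the file.
    for handler in handlers:
        result = handler.file_complete(position)
        if result is not None:
            return result
    return None


handlers = [UppercaseFilter(), CollectingSink(), NeverReached()]
result = feed_file(handlers, [b"ab", b"cd"])
```
| |
| Here ``result`` is the tuple built by ``CollectingSink``, and ``NeverReached`` |
| receives no chunks because the sink returned ``None`` for each one. |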
| |
| Optional methods |
| ~~~~~~~~~~~~~~~~ |
| |
| Custom upload handlers may also define any of the following optional methods or |
| attributes: |
| |
| ``FileUploadHandler.chunk_size`` |
| Size, in bytes, of the "chunks" Django should store into memory and feed |
| into the handler. That is, this attribute controls the size of chunks |
| fed into ``FileUploadHandler.receive_data_chunk``. |
| |
| For maximum performance the chunk sizes should be divisible by ``4`` and |
| should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are |
| multiple chunk sizes provided by multiple handlers, Django will use the |
| smallest chunk size defined by any handler. |
| |
| The default is 64*2\ :sup:`10` bytes, or 64 KB. |
| |
| ``FileUploadHandler.new_file(self, field_name, file_name, content_type, content_length, charset)`` |
| Callback signaling that a new file upload is starting. This is called |
| before any data has been fed to any upload handlers. |
| |
| ``field_name`` is a string name of the file ``<input>`` field. |
| |
| ``file_name`` is the unicode filename that was provided by the browser. |
| |
| ``content_type`` is the MIME type provided by the browser -- e.g. |
| ``'image/jpeg'``. |
| |
| ``content_length`` is the length of the file given by the browser. |
| Sometimes this won't be provided and will be ``None``. |
| |
| ``charset`` is the character set (e.g. ``utf8``) given by the browser. |
| Like ``content_length``, this sometimes won't be provided. |
| |
| This method may raise a ``StopFutureHandlers`` exception to prevent |
| future handlers from handling this file. |
| |
| ``FileUploadHandler.upload_complete(self)`` |
| Callback signaling that the entire upload (all files) has completed. |
| |
| ``FileUploadHandler.handle_raw_input(self, input_data, META, content_length, boundary, encoding)`` |
| Allows the handler to completely override the parsing of the raw |
| HTTP input. |
| |
| ``input_data`` is a file-like object that supports ``read()``-ing. |
| |
| ``META`` is the same object as ``request.META``. |
| |
| ``content_length`` is the length of the data in ``input_data``. Don't |
| read more than ``content_length`` bytes from ``input_data``. |
| |
| ``boundary`` is the MIME boundary for this request. |
| |
| ``encoding`` is the encoding of the request. |
| |
| Return ``None`` if you want upload handling to continue, or a tuple of |
| ``(POST, FILES)`` if you want to return the new data structures suitable |
| for the request directly. |
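| |
| To make the ``chunk_size`` rule concrete, the sketch below models how the |
| effective chunk size follows from several handlers. |
| ``effective_chunk_size()`` and both handler classes are hypothetical; only |
| the 64 KB default and the smallest-value-wins rule come from the text above: |
| |
```python
DEFAULT_CHUNK_SIZE = 64 * 2 ** 10  # 65536 bytes, the documented default


class DefaultChunkHandler:
    """Hypothetical handler that keeps the default chunk size."""

    chunk_size = DEFAULT_CHUNK_SIZE


class SmallChunkHandler:
    """Hypothetical handler asking for 8 KB chunks (divisible by 4)."""

    chunk_size = 8 * 2 ** 10


def effective_chunk_size(handlers):
    """Model of the documented rule: the smallest declared chunk size wins."""
    return min(handler.chunk_size for handler in handlers)


size = effective_chunk_size([DefaultChunkHandler(), SmallChunkHandler()])
```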