 .. _dynamodb2_tut:

 ===============================================
 An Introduction to boto's DynamoDB v2 interface
 ===============================================

 This tutorial focuses on the boto interface to AWS' DynamoDB_ v2. This tutorial
 assumes that you have boto already downloaded and installed.

 .. _DynamoDB: http://aws.amazon.com/dynamodb/

 .. warning::

     This tutorial covers the **SECOND** major release of DynamoDB (including
     local secondary index support). The documentation for the original
     version of DynamoDB (& boto's support for it) is at
     :doc:`DynamoDB v1 <dynamodb_tut>`.

 The v2 DynamoDB API has both a high-level & low-level component. The low-level
 API (contained primarily within ``boto.dynamodb2.layer1``) provides an
 interface that roughly matches what is provided by the API. It supports
 all options available to the service.

 The high-level API attempts to make interacting with the service more natural
 from Python. It supports most of the featureset.


 The High-Level API
 ==================

 Most of the interaction centers around a single object, the ``Table``. Tables
 act as a way to effectively namespace your records. If you're familiar with
 database tables from an RDBMS, tables will feel somewhat familiar.


 Creating a New Table
 --------------------

 To create a new table, you need to call ``Table.create`` & specify (at a
 minimum) both the table's name as well as the key schema for the table.

 Since both the key schema and local secondary indexes cannot be
 modified after the table is created, you'll need to plan ahead of time how you
 think the table will be used. Both the keys & indexes are also used for
 querying, so you'll want to represent the data you'll need when querying
 there as well.

 For the schema, you can either have a single ``HashKey`` or a combined
 ``HashKey+RangeKey``. The ``HashKey`` by itself should be thought of as a
 unique identifier (for instance, like a username or UUID). It is typically
 looked up as an exact value.
 A ``HashKey+RangeKey`` combination is slightly different, in that the
 ``HashKey`` acts like a namespace/prefix & the ``RangeKey`` acts as a value
 that can be referred to by a sorted range of values.
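
 To make the namespace/sorted-range idea concrete, here is a minimal sketch
 in plain Python. ``ToyTable`` and its methods are hypothetical names used
 only for illustration; this models the behaviour described above & is not
 how boto or DynamoDB are implemented.

```python
import bisect


class ToyTable(object):
    """Hypothetical in-memory model of a HashKey+RangeKey table."""

    def __init__(self):
        # Maps each hash key to a list of (range_key, item) pairs that is
        # kept sorted by range key.
        self.partitions = {}

    def put(self, hash_key, range_key, item):
        rows = self.partitions.setdefault(hash_key, [])
        range_keys = [existing for existing, _ in rows]
        position = bisect.bisect_left(range_keys, range_key)
        if position < len(rows) and rows[position][0] == range_key:
            rows[position] = (range_key, item)  # same full key: replace
        else:
            rows.insert(position, (range_key, item))

    def query_beginswith(self, hash_key, prefix):
        # Only one partition is consulted; results come back in range order.
        rows = self.partitions.get(hash_key, [])
        return [item for range_key, item in rows
                if range_key.startswith(prefix)]


table = ToyTable()
table.put('standard_user', 'Doe', {'first_name': 'John'})
table.put('standard_user', 'Adams', {'first_name': 'Amy'})
table.put('standard_user', 'Dane', {'first_name': 'Jane'})
table.put('super_user', 'Dent', {'first_name': 'Arthur'})

matches = table.query_beginswith('standard_user', 'D')
```

 The hash key selects a single partition, & the range key determines the
 order of the results inside it.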

 For the local secondary indexes, you can choose from an ``AllIndex``, a
 ``KeysOnlyIndex`` or an ``IncludeIndex`` field. Each builds an index of values
 that can be queried on. The ``AllIndex`` duplicates all values onto the index
 (to prevent additional reads to fetch the data). The ``KeysOnlyIndex``
 duplicates only the keys from the schema onto the index. The ``IncludeIndex``
 lets you specify a list of fieldnames to duplicate over.
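
 The difference between the three index types is which fields get copied
 onto the index. The sketch below illustrates that in plain Python; the
 ``project`` helper & its ``kind`` values are hypothetical & are not part of
 boto.

```python
def project(item, key_fields, kind, include=()):
    """Return the fields of ``item`` that a given index type would copy."""
    if kind == 'all':
        wanted = set(item)
    elif kind == 'keys_only':
        wanted = set(key_fields)
    elif kind == 'include':
        # Key fields are always present; ``include`` adds extra fieldnames.
        wanted = set(key_fields) | set(include)
    else:
        raise ValueError('unknown index kind: %r' % (kind,))
    return dict((name, value) for name, value in item.items()
                if name in wanted)


item = {
    'account_type': 'standard_user',
    'last_name': 'Doe',
    'date_joined': 1366056668,
    'first_name': 'John',
    'email': 'john@example.com',
}
keys = ['account_type', 'last_name', 'date_joined']

keys_only = project(item, keys, 'keys_only')
with_email = project(item, keys, 'include', include=['email'])
everything = project(item, keys, 'all')
```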

 Simple example::

     >>> from boto.dynamodb2.fields import HashKey
     >>> from boto.dynamodb2.table import Table

     # Uses your ``aws_access_key_id`` & ``aws_secret_access_key`` from either a
     # config file or environment variable & the default region.
     >>> users = Table.create('users', schema=[
     ...     HashKey('username'),
     ... ])

 A full example::

     >>> from boto.dynamodb2.fields import HashKey, RangeKey, AllIndex
     >>> from boto.dynamodb2.layer1 import DynamoDBConnection
     >>> from boto.dynamodb2.table import Table
     >>> from boto.dynamodb2.types import NUMBER

     >>> users = Table.create('users', schema=[
     ...     HashKey('account_type', data_type=NUMBER),
     ...     RangeKey('last_name'),
     ... ], throughput={
     ...     'read': 5,
     ...     'write': 15,
     ... }, indexes=[
     ...     AllIndex('EverythingIndex', parts=[
     ...         HashKey('account_type', data_type=NUMBER),
     ...     ])
     ... ],
     ... # If you need to specify custom parameters like keys or region info...
     ... connection=DynamoDBConnection(
     ...     aws_access_key_id='key',
     ...     aws_secret_access_key='key',
     ...     region='us-west-2'
     ... ))


 Using an Existing Table
 -----------------------

 Once a table has been created, using it is relatively simple. You can either
 specify just the ``table_name`` (allowing the object to lazily do an additional
 call to get details about itself if needed) or provide the ``schema/indexes``
 again (same as what was used with ``Table.create``) to avoid extra overhead.

 Lazy example::

     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

 Efficient example::

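     >>> from boto.dynamodb2.fields import HashKey, RangeKey, AllIndex
     >>> from boto.dynamodb2.table import Table
     >>> from boto.dynamodb2.types import NUMBER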
     >>> users = Table('users', schema=[
     ...     HashKey('account_type', data_type=NUMBER),
     ...     RangeKey('last_name'),
     ... ], indexes=[
     ...     AllIndex('EverythingIndex', parts=[
     ...         HashKey('account_type', data_type=NUMBER),
     ...     ])
     ... ])


 Creating a New Item
 -------------------

 Once you have a ``Table`` instance, you can add new items to the table. There
 are two ways to do this.

 The first is to use the ``Table.put_item`` method. Simply hand it a dictionary
 of data & it will create the item on the server side. This dictionary should
 be relatively flat (as you can nest in other dictionaries) & **must** contain
 the keys used in the ``schema``.

 Example::

     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     # Create the new user.
     >>> users.put_item(data={
     ...     'username': 'johndoe',
     ...     'first_name': 'John',
     ...     'last_name': 'Doe',
     ... })
     True

 The alternative is to manually construct an ``Item`` instance & tell it to
 ``save`` itself. This is useful if the object will be around for a while & you
 don't want to re-fetch it.

 Example::

     >>> from boto.dynamodb2.items import Item
     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     # WARNING - This doesn't save it yet!
     >>> johndoe = Item(users, data={
     ...     'username': 'johndoe',
     ...     'first_name': 'John',
     ...     'last_name': 'Doe',
     ... })
     # The data now gets persisted to the server.
     >>> johndoe.save()
     True


 Getting an Item & Accessing Data
 --------------------------------

 With data now in DynamoDB, if you know the key of the item, you can fetch it
 back out. Specify the key value(s) as kwargs to ``Table.get_item``.

 Example::

     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     >>> johndoe = users.get_item(username='johndoe')

 Once you have an ``Item`` instance, it presents a dictionary-like interface to
 the data::

     >>> johndoe = users.get_item(username='johndoe')

     # Read a field out.
     >>> johndoe['first_name']
     'John'

     # Change a field (DOESN'T SAVE YET!).
     >>> johndoe['first_name'] = 'Johann'

     # Delete data from it (DOESN'T SAVE YET!).
     >>> del johndoe['last_name']


 Updating an Item
 ----------------

 Just creating new items or changing only the in-memory version of the ``Item``
 isn't particularly effective. To persist the changes to DynamoDB, you have
 three choices.

 The first is sending all the data with the expectation nothing has changed
 since you read the data. DynamoDB will verify the data is in the original state
 and, if so, will write all of the item's data. If that expectation fails, the call
 will fail::

     >>> johndoe = users.get_item(username='johndoe')
     >>> johndoe['first_name'] = 'Johann'
     >>> johndoe['whatever'] = "man, that's just like your opinion"
     >>> del johndoe['last_name']

     # Affects all fields, even the ones not changed locally.
     >>> johndoe.save()
     True

 The second is a full overwrite. If you can be confident your version of the
 data is the most correct, you can force an overwrite of the data::

     >>> johndoe = users.get_item(username='johndoe')
     >>> johndoe['first_name'] = 'Johann'
     >>> johndoe['whatever'] = "man, that's just like your opinion"
     >>> del johndoe['last_name']

     # Specify ``overwrite=True`` to fully replace the data.
     >>> johndoe.save(overwrite=True)
     True

 The last is a partial update. If you've only modified certain fields, you
 can send a partial update that only writes those fields, allowing other
 (potentially changed) fields to go untouched::

     >>> johndoe = users.get_item(username='johndoe')
     >>> johndoe['first_name'] = 'Johann'
     >>> johndoe['whatever'] = "man, that's just like your opinion"
     >>> del johndoe['last_name']

     # Partial update, only sending/affecting the
     # ``first_name/whatever/last_name`` fields.
     >>> johndoe.partial_save()
     True
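
 The three choices differ in what is checked before writing & in what is
 written. The toy model below sketches that difference in plain Python,
 assuming a dictionary stands in for the table. ``ToyStore`` &
 ``ConflictError`` are hypothetical names; this is not boto's implementation.

```python
class ConflictError(Exception):
    """Raised when the stored item no longer matches what was read."""


class ToyStore(object):
    """Hypothetical in-memory stand-in for a table, keyed by username."""

    def __init__(self):
        self.rows = {}

    def save(self, key, original, new, overwrite=False):
        # Full save: every field is written. Unless overwrite is set, the
        # stored item must still equal the copy that was originally read.
        if not overwrite and self.rows.get(key) != original:
            raise ConflictError(key)
        self.rows[key] = dict(new)

    def partial_save(self, key, original, new):
        # Partial save: only locally changed fields are checked and written.
        current = self.rows.setdefault(key, {})
        changed = [f for f in set(original) | set(new)
                   if original.get(f) != new.get(f)]
        for field in changed:
            if current.get(field) != original.get(field):
                raise ConflictError(field)
        for field in changed:
            if field in new:
                current[field] = new[field]
            else:
                current.pop(field, None)


store = ToyStore()
store.rows['johndoe'] = {'first_name': 'John', 'last_name': 'Doe'}

# Our process reads the item, then another writer adds a field.
original = dict(store.rows['johndoe'])
store.rows['johndoe']['email'] = 'john@example.com'

# Local edits: change one field, delete another.
edited = dict(original, first_name='Johann')
del edited['last_name']

try:
    store.save('johndoe', original, edited)
    full_save_conflicted = False
except ConflictError:
    full_save_conflicted = True

# The partial save succeeds because the fields we touched are unchanged
# on the "server", and it leaves the other writer's field alone.
store.partial_save('johndoe', original, edited)
```

 In this model, ``save(..., overwrite=True)`` would also have succeeded, but
 it would have discarded the ``email`` field added by the other writer.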


 Deleting an Item
 ----------------

 You can also delete items from the table. You have two choices, depending on
 what data you have present.

 If you already have an ``Item`` instance, the easiest approach is just to call
 ``Item.delete``::

     >>> johndoe.delete()
     True

 If you don't have an ``Item`` instance & you don't want to incur the
 ``Table.get_item`` call to get it, you can call the ``Table.delete_item`` method::

     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     >>> users.delete_item(username='johndoe')
     True


 Batch Writing
 -------------

 If you're loading a lot of data at a time, making use of batch writing can
 both speed up the process & reduce the number of write requests made to the
 service.

 Batch writing involves wrapping the calls you want batched in a context manager.
 The context manager imitates the ``Table.put_item`` & ``Table.delete_item``
 APIs. Getting & using the context manager looks like::

     >>> import time
     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     >>> with users.batch_write() as batch:
     ...     batch.put_item(data={
     ...         'username': 'anotherdoe',
     ...         'first_name': 'Another',
     ...         'last_name': 'Doe',
     ...         'date_joined': int(time.time()),
     ...     })
     ...     batch.put_item(data={
     ...         'username': 'alice',
     ...         'first_name': 'Alice',
     ...         'date_joined': int(time.time()),
     ...     })
     ...     batch.delete_item(username='jane')

 However, there are some limitations on what you can do within the context
 manager.

 * It can't read data at all or batch any other operations.
 * You can't put & delete the same data within a batch request.

 .. note::

     Additionally, the context manager can only batch 25 items at a time for a
     request (this is a DynamoDB limitation). It is handled for you so you can
     keep writing additional items, but you should be aware that 100 ``put_item``
     calls is 4 batch requests, not 1.
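
 The arithmetic behind that note can be sketched in a few lines of plain
 Python. The ``chunked`` helper is hypothetical & only illustrates how 100
 queued writes split into groups of at most 25; boto does the equivalent
 for you inside the context manager.

```python
def chunked(items, size=25):
    """Yield successive slices of at most ``size`` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 100 pending writes, as in the note above.
pending = [{'username': 'user%d' % i} for i in range(100)]
batch_sizes = [len(batch) for batch in chunked(pending)]
```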


 Querying
 --------

 Manually fetching out each item by itself isn't tenable for large datasets.
 To cope with fetching many records, you can either perform a standard query,
 query via a local secondary index or scan the entire table.

 A standard query typically gets run against a hash+range key combination.
 Filter parameters are passed as kwargs & use a ``__`` to separate the fieldname
 from the operator being used to filter the value.
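
 As a rough illustration of that naming convention, the sketch below splits
 each kwarg name on its last ``__`` into a fieldname & an operator. The
 ``split_filter`` & ``parse_filters`` helpers are hypothetical; boto's own
 parsing may differ in its details.

```python
def split_filter(name):
    """Split 'last_name__beginswith' into ('last_name', 'beginswith')."""
    field, separator, operator = name.rpartition('__')
    if not separator:
        raise ValueError('expected <field>__<operator>, got %r' % (name,))
    return field, operator


def parse_filters(**filters):
    # Sorted so the result does not depend on keyword-argument ordering.
    return sorted(split_filter(name) + (value,)
                  for name, value in filters.items())


parsed = parse_filters(
    account_type__eq='standard_user',
    last_name__beginswith='D'
)
```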

 In terms of querying, our original schema is less than optimal. For the
 following examples, we'll be using the following table setup::

     >>> users = Table.create('users', schema=[
     ...     HashKey('account_type'),
     ...     RangeKey('last_name'),
     ... ], indexes=[
     ...     AllIndex('DateJoinedIndex', parts=[
     ...         HashKey('account_type'),
     ...         RangeKey('date_joined', data_type=NUMBER),
     ...     ]),
     ... ])

 When executing the query, you get an iterable back that contains your results.
 These results may be spread over multiple requests as DynamoDB paginates them.
 This is done transparently, but you should be aware it may take more than one
 request.

 To run a query for last names starting with the letter "D"::

     >>> names_with_d = users.query(
     ...     account_type__eq='standard_user',
     ...     last_name__beginswith='D'
     ... )

     >>> for user in names_with_d:
     ...     print user['first_name']
     'Bob'
     'Jane'
     'John'

 You can also reverse results (``reverse=True``) as well as limit them
 (``limit=2``)::

     >>> rev_with_d = users.query(
     ...     account_type__eq='standard_user',
     ...     last_name__beginswith='D',
     ...     reverse=True,
     ...     limit=2
     ... )

     >>> for user in rev_with_d:
     ...     print user['first_name']
     'John'
     'Jane'

 You can also run queries against the local secondary indexes. Simply provide
 the index name (``index='DateJoinedIndex'``) & filter parameters against its
 fields::

     >>> import time

     # Users within the last hour.
     >>> recent = users.query(
     ...     account_type__eq='standard_user',
     ...     date_joined__gte=time.time() - (60 * 60),
     ...     index='DateJoinedIndex'
     ... )

     >>> for user in recent:
     ...     print user['first_name']
     'Alice'
     'Jane'

 Finally, if you need to query on data that's not in either a key or in an
 index, you can run a ``Table.scan`` across the whole table, which accepts a
 similar but expanded set of filters. If you're familiar with the Map/Reduce
 concept, this is akin to what DynamoDB does.

 .. warning::

     Scans are eventually consistent & run over the entire table, so relatively speaking,
     they're more expensive than plain queries or queries against an LSI.

 An example scan of all records in the table looks like::

     >>> all_users = users.scan()

 Filtering a scan looks like::

     >>> owners_with_emails = users.scan(
     ...     is_owner__eq=1,
     ...     email__null=False,
     ... )

     >>> for user in owners_with_emails:
     ...     print user['first_name']
     'George'
     'John'


 Parallel Scan
 -------------

 DynamoDB also includes a feature called "Parallel Scan", which allows you
 to make use of **extra** read capacity to divide up your result set & scan
 an entire table faster.

 This does require extra code on the user's part & you should ensure that
 you need the speed boost, have enough data to justify it and have the extra
 capacity to read it without impacting other queries/scans.

 To run it, you should pick the ``total_segments`` to use, which is an integer
 representing the number of temporary partitions you'd divide your table into.
 You then need to spin up a thread/process for each one, giving each
 thread/process a ``segment``, which is a zero-based integer of the segment
 you'd like to scan.
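
 The key property of segments is that each item belongs to exactly one of
 them, so the workers together cover the table once. The toy model below
 illustrates this in plain Python. How DynamoDB actually assigns items to
 segments is internal to the service; the CRC-based ``segment_of`` function
 here is purely an assumption for illustration.

```python
import zlib


def segment_of(key, total_segments):
    # Hypothetical assignment rule: any stable function of the key works
    # for the purposes of this illustration.
    return zlib.crc32(key.encode('utf-8')) % total_segments


def scan_segment(keys, segment, total_segments):
    """Return the keys a single worker would see for its segment."""
    return [key for key in keys
            if segment_of(key, total_segments) == segment]


usernames = ['alice', 'bob', 'fred', 'jane', 'johndoe']
total_segments = 3
segments = [scan_segment(usernames, i, total_segments)
            for i in range(total_segments)]
```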

 An example of using parallel scan to send out email to all users might look
 something like::

     #!/usr/bin/env python
     import threading

     import boto.ses
     import boto.dynamodb2
     from boto.dynamodb2.table import Table


     AWS_ACCESS_KEY_ID = '<YOUR_AWS_KEY_ID>'
     AWS_SECRET_ACCESS_KEY = '<YOUR_AWS_SECRET_KEY>'
     APPROVED_EMAIL = 'some@address.com'


     def send_email(email):
         # Using Amazon's Simple Email Service, send an email to a given
         # email address. You must already have an email you've verified with
         # AWS before this will work.
         conn = boto.ses.connect_to_region(
             'us-east-1',
             aws_access_key_id=AWS_ACCESS_KEY_ID,
             aws_secret_access_key=AWS_SECRET_ACCESS_KEY
         )
         conn.send_email(
             APPROVED_EMAIL,
             "[OurSite] New feature alert!",
             "We've got some exciting news! We added a new feature to...",
             [email]
         )


     def process_segment(segment=0, total_segments=10):
         # This method/function is executed in each thread, each getting its
         # own segment to process through.
         conn = boto.dynamodb2.connect_to_region(
             'us-east-1',
             aws_access_key_id=AWS_ACCESS_KEY_ID,
             aws_secret_access_key=AWS_SECRET_ACCESS_KEY
         )
         table = Table('users', connection=conn)

         # We pass in the segment & total_segments to scan here.
         for user in table.scan(segment=segment, total_segments=total_segments):
             send_email(user['email'])


     def send_all_emails():
         pool = []
         # We're choosing to divide the table in 3, then...
         pool_size = 3

         # ...spinning up a thread for each segment.
         for i in range(pool_size):
             worker = threading.Thread(
                 target=process_segment,
                 kwargs={
                     'segment': i,
                     'total_segments': pool_size,
                 }
             )
             pool.append(worker)
             # We start them to let them start scanning & consuming their
             # assigned segment.
             worker.start()

         # Finally, we wait for each to finish.
         for thread in pool:
             thread.join()


     if __name__ == '__main__':
         send_all_emails()


 Batch Reading
 -------------

 Similar to batch writing, batch reading can also help reduce the number of
 API requests necessary to access a large number of items. The
 ``Table.batch_get`` method takes a list (or any sliceable collection) of keys
 & fetches all of them, presented as an iterator interface.

 This is done lazily, so if you never iterate over the results, no requests are
 executed. Additionally, if you only iterate over part of the set, the minimum
 number of calls is made to fetch those results (typically max 100 per
 response).
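
 The lazy behaviour can be sketched with a plain Python generator. In the
 sketch below, ``lazy_batch_get`` & ``fake_fetch`` are hypothetical
 stand-ins (the page size of 100 is taken from the paragraph above); this is
 a model of the idea, not boto's implementation.

```python
def lazy_batch_get(keys, fetch_page, page_size=100):
    """Yield items, fetching one page of keys only when it is needed."""
    for start in range(0, len(keys), page_size):
        for item in fetch_page(keys[start:start + page_size]):
            yield item


requests_made = []


def fake_fetch(page):
    # Stand-in for a network call; records how many keys were requested.
    requests_made.append(len(page))
    return [{'username': key} for key in page]


keys = ['user%d' % i for i in range(250)]
results = lazy_batch_get(keys, fake_fetch)

requests_before_iterating = list(requests_made)  # nothing fetched yet
first = next(results)                            # triggers the first page
requests_after_first_item = list(requests_made)
```

 Consuming the rest of ``results`` would trigger two more requests, for 100
 & 50 keys respectively.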

 Example::

     >>> from boto.dynamodb2.table import Table
     >>> users = Table('users')

     # No request yet.
     >>> many_users = users.batch_get(keys=[
     ...     {'username': 'alice'},
     ...     {'username': 'bob'},
     ...     {'username': 'fred'},
     ...     {'username': 'jane'},
     ...     {'username': 'johndoe'},
     ... ])

     # Now the request is performed, requesting all five in one request.
     >>> for user in many_users:
     ...     print user['first_name']
     'Alice'
     'Bobby'
     'Fred'
     'Jane'
     'John'


 Deleting a Table
 ----------------

 Deleting a table is a simple exercise. When you no longer need a table, simply
 run::

     >>> users.delete()


 Next Steps
 ----------

 You can find additional information about other calls & parameter options
 in the :doc:`API docs <ref/dynamodb2>`.