Dump and Restore

New in Kay-0.2.0

Overview

The new App Engine SDK release has a dump/restore capability, which lets you dump and restore all the data in the datastore. However, to do so you need to execute a command for every kind you have defined, specifying the kind name each time. Kay provides wrapper commands that dump and restore all the data more easily.

The commands are introduced below. Please execute them in your project directory.

Dumping from a server environment

You can execute the following command to dump all the data from the server:

$ python manage.py dump_all -n 20090919

Please bear in mind that this command determines which URL to access according to your app.yaml settings.

This command will ask for a username and password, so answer with the app admin’s credentials. All the data will then be dumped into the directory _backup/20090919. The files with a .dat suffix contain the actual data. You can specify which directory to dump to with the -n option.

Clearing all the data on a server environment

To delete all the data on a server environment, you can do as follows:

$ python manage.py clear_datastore [-c]

With the -c option, all the memcache data is deleted as well.

Restoring dumped data to a server environment

You can restore the data dumped in the directory _backup/20090919 to a server environment with the following command:

$ python manage.py restore_all -n 20090919

In some cases, you need to delete all the data on the server. (TODO: More detailed explanation for this)

Dumping data from a local dev server

A local dev server needs to be running in order to dump data from, or restore data to, a local environment. You can dump all the data from a local dev server with the following command:

$ python manage.py dump_all -n 20090919local -u http://localhost:8080/remote_api

The only difference between dumping from a server environment and from a local environment is the -u option.
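
If you dump regularly, the two dump commands above can be assembled by a small script. The following is only a sketch: the helper name dump_command and the idea of deriving the directory name from a date are assumptions made for illustration and are not part of Kay. It uses only the -n and -u options described above.

```python
import datetime
import sys


def dump_command(day, url=None):
    """Build the argument list for `manage.py dump_all`.

    day: a datetime.date, used as the directory name (e.g. 20090919).
    url: optional remote API URL; pass it only for a local dev server.
    """
    cmd = [sys.executable, "manage.py", "dump_all", "-n", day.strftime("%Y%m%d")]
    if url is not None:
        # The -u option is the only difference between server and local dumps.
        cmd += ["-u", url]
    return cmd


# Arguments for a server dump and for a local dev server dump.
server_cmd = dump_command(datetime.date(2009, 9, 19))
local_cmd = dump_command(datetime.date(2009, 9, 19),
                         url="http://localhost:8080/remote_api")
```

To actually run a dump, you could pass the resulting list to subprocess.check_call from your project directory. The command will still prompt for the username and password.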

Deleting all the data from a local dev server

To delete all the data in a local environment, please stop the server and restart it with the -c option.

$ python manage.py runserver -c

Restoring dumped data to a local dev server

To restore dumped data to a local dev server, you need to specify the URL of the local server's remote API handler with the -u option.

$ python manage.py restore_all -n 20090919local -u http://localhost:8080/remote_api

Notice about autogenerated ids

When you restore data in this way, all the autogenerated ids are restored as they were, so all ReferenceProperty relations and parent/child relations are fully restored, which is very convenient. However, you have to remember that restoring data dumped from another environment can cause collisions between restored ids and autogenerated ids. In some cases you may need to reset the id counter of the application with db.allocate_ids.

Resetting the counter can be cumbersome, so you can avoid the problem altogether by creating all entities with a key_name. kay.models.NamedModel helps you create entities with a key_name.
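
If you do need to move the id counter past the restored ids, one possible approach is to keep allocating ids in batches until the allocator has passed the largest restored id. The following is a minimal sketch, not Kay's own method. The helper name advance_id_counter, the batch size, and the max_restored_id value (which you would have to determine yourself) are all assumptions. The allocator is passed in as an argument and is expected to behave like db.allocate_ids, that is, to take a key and a count and return the first and last ids it reserved.

```python
def advance_id_counter(allocate_ids, model_key, max_restored_id, batch=1000):
    """Reserve ids in batches until the counter has passed max_restored_id.

    allocate_ids: a callable that behaves like db.allocate_ids(model, count)
                  and returns a (first_id, last_id) tuple.
    model_key:    the key (kind and parent) whose id counter is advanced.
    Returns the last id that was reserved.
    """
    while True:
        first, last = allocate_ids(model_key, batch)
        if last >= max_restored_id:
            return last


class FakeAllocator(object):
    """Stand-in for db.allocate_ids so the sketch runs without the SDK."""

    def __init__(self):
        self.next_id = 1

    def __call__(self, model_key, count):
        first = self.next_id
        self.next_id += count
        return first, self.next_id - 1


fake = FakeAllocator()
last_reserved = advance_id_counter(fake, "bbs_image", 2500)
```

On App Engine you would pass db.allocate_ids itself together with a key for the kind in question, for example one built with db.Key.from_path. Please treat this as an outline and check it against your own data before relying on it.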

Dealing with dump/restore failures

When dumping or restoring fails, you can configure per-kind options for bulkloader by creating _backup/__init__.py.

  • Failure case 1

    When entities are huge, restoring more than one entity at a time might fail because every API call is limited to 1 MB. For example, you can tell bulkloader to restore one bbs_image entity at a time by creating _backup/__init__.py with the following contents.

    _backup/__init__.py:

    restore_options = {
      'bbs_image': ['--batch_size=1'],
    }
    
  • Failure case 2

    Dumping 1000 entities from a local dev server fails with an error. The dump succeeds with the following configuration:

    _backup/__init__.py:

    dump_options = {
      'chat_message': ['--num_threads=1'],
    }
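
If both failure cases apply to your project, the two settings can presumably be combined in a single file. The following sketch assumes that restore_options and dump_options are read independently from the same _backup/__init__.py. The kinds and options are simply the ones from the two cases above.

```python
# _backup/__init__.py
# Sketch combining both failure cases; it assumes the two dictionaries
# can coexist in one file.

# Restore one bbs_image entity per API call to stay under the size limit.
restore_options = {
    'bbs_image': ['--batch_size=1'],
}

# Dump chat_message entities with a single thread.
dump_options = {
    'chat_message': ['--num_threads=1'],
}
```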
    

If you encounter any other failure case, please let me know. I will add configuration examples for such cases to this section.