内容简介:Sync Postgres data between databases. Designed for:pgsync is a command line tool. To install, run:
pgsync
Sync Postgres data between databases. Designed for:
- speed - up to 4x faster than traditional tools on a 4-core machine
- security - built-in methods to prevent sensitive data from ever leaving the server
- convenience - sync partial tables, groups of tables, and related records
Installation
pgsync is a command line tool. To install, run:
gem install pgsync
This will give you the pgsync
command. If installation fails, you may need to install.
In your project directory, run:
pgsync --init
This creates .pgsync.yml
for you to customize. We recommend checking this into your version control (assuming it doesn’t contain sensitive information). pgsync
commands can be run from this directory or any subdirectory.
How to Use
Sync all tables
pgsync
Note:pgsync assumes your schema is already set up on your local machine. See theif that’s not the case.
Sync specific tables
pgsync table1,table2
Sync specific rows (existing rows are overwritten)
pgsync products "where store_id = 1"
You can also preserve existing rows
pgsync products "where store_id = 1" --preserve
Or truncate them
pgsync products "where store_id = 1" --truncate
Exclude Tables
pgsync --exclude users
To always exclude, add to .pgsync.yml
.
exclude: - table1 - table2
For Rails, you probably want to exclude schema migrations and ActiveRecord metadata.
exclude: - schema_migrations - ar_internal_metadata
Groups
Define groups in .pgsync.yml
:
groups: group1: - table1 - table2
And run:
pgsync group1
You can also use groups to sync a specific record and associated records in other tables.
To get product 123
with its reviews, last 10 coupons, and store, use:
groups: product: products: "where id = {1}" reviews: "where product_id = {1}" coupons: "where product_id = {1} order by created_at desc limit 10" stores: "where id in (select store_id from products where id = {1})"
And run:
pgsync product:123
Schema
Note:pgsync is designed to sync data. You should use a schema migration tool to manage schema changes. The methods in this section are provided for convenience but not recommended.
Sync schema before the data
pgsync --schema-first
Note:This wipes out existing data
Specify tables
pgsync table1,table2 --schema-first
Or just the schema
pgsync --schema-only
pgsync does not try to sync Postgres extensions.
Data Protection
Always make sure your connection is secure
when connecting to a database over a network you don’t fully trust. Your best option is to connect over SSH or a VPN. Another option is to use sslmode=verify-full
. If you don’t do this, your database credentials can be compromised.
Sensitive Information
Prevent sensitive information like email addresses from leaving the remote server.
Define rules in .pgsync.yml
:
data_rules: email: unique_email last_name: random_letter birthday: random_date users.auth_token: value: secret visits_count: statement: "(RANDOM() * 10)::int" encrypted_*: null
last_name
matches all columns named last_name
and users.last_name
matches only the users table. Wildcards are supported, and the first matching rule is applied.
Options for replacement are:
unique_email unique_phone unique_secret random_letter random_int random_date random_time random_ip value statement null untouched
Rules starting with unique_
require the table to have a primary key. unique_phone
requires a numeric primary key.
Multiple Databases
To use with multiple databases, run:
pgsync --init db2
This creates .pgsync-db2.yml
for you to edit. Specify a database in commands with:
pgsync --db db2
Safety
To keep you from accidentally overwriting production, the destination is limited to localhost
or 127.0.0.1
by default.
To use another host, add to_safe: true
to your .pgsync.yml
.
Large Tables
For extremely large tables, sync in batches.
pgsync large_table --in-batches
The script will resume where it left off when run again, making it great for backfills.
Foreign Keys
By default, tables are copied in parallel. If you use foreign keys, this can cause violations. You can specify tables to be copied serially with:
pgsync group1 --debug
Reference
Help
pgsync --help
Version
pgsync --version
Scripts
Use groups when possible to take advantage of parallelism.
For Ruby scripts, you may need to do:
Bundler.with_clean_env do system "pgsync ..." end
Dependencies
If installation fails, your system may be missing Ruby or libpq.
On Mac, run:
brew install postgresql
On Ubuntu, run:
sudo apt-get install ruby-dev libpq-dev build-essential
Upgrading
Run:
gem install pgsync
To use master, run:
gem install specific_install gem specific_install https://github.com/ankane/pgsync.git
Thanks
Inspired by heroku-pg-transfer .
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/ankane/pgsync.git cd pgsync bundle install createdb pgsync_test1 createdb pgsync_test2 createdb pgsync_test3 bundle exec rake test
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。