PgSync: Sync Postgres data between databases

栏目: IT技术 · 发布时间: 5年前

内容简介:Sync Postgres data between databases. Designed for:pgsync is a command line tool. To install, run:

pgsync

Sync Postgres data between databases. Designed for:

  • speed - up to 4x faster than traditional tools on a 4-core machine
  • security - built-in methods to prevent sensitive data from ever leaving the server
  • convenience - sync partial tables, groups of tables, and related records

:tangerine: Battle-tested at Instacart

Installation

pgsync is a command line tool. To install, run:

gem install pgsync

This will give you the pgsync command. If installation fails, you may need to install.

In your project directory, run:

pgsync --init

This creates .pgsync.yml for you to customize. We recommend checking this into your version control (assuming it doesn’t contain sensitive information). pgsync commands can be run from this directory or any subdirectory.

How to Use

Sync all tables

pgsync

Note:pgsync assumes your schema is already set up on your local machine. See theif that’s not the case.

Sync specific tables

pgsync table1,table2

Sync specific rows (existing rows are overwritten)

pgsync products "where store_id = 1"

You can also preserve existing rows

pgsync products "where store_id = 1" --preserve

Or truncate them

pgsync products "where store_id = 1" --truncate

Exclude Tables

pgsync --exclude users

To always exclude, add to .pgsync.yml .

exclude:
  - table1
  - table2

For Rails, you probably want to exclude schema migrations and ActiveRecord metadata.

exclude:
  - schema_migrations
  - ar_internal_metadata

Groups

Define groups in .pgsync.yml :

groups:
  group1:
    - table1
    - table2

And run:

pgsync group1

You can also use groups to sync a specific record and associated records in other tables.

To get product 123 with its reviews, last 10 coupons, and store, use:

groups:
  product:
    products: "where id = {1}"
    reviews: "where product_id = {1}"
    coupons: "where product_id = {1} order by created_at desc limit 10"
    stores: "where id in (select store_id from products where id = {1})"

And run:

pgsync product:123

Schema

Note:pgsync is designed to sync data. You should use a schema migration tool to manage schema changes. The methods in this section are provided for convenience but not recommended.

Sync schema before the data

pgsync --schema-first

Note:This wipes out existing data

Specify tables

pgsync table1,table2 --schema-first

Or just the schema

pgsync --schema-only

pgsync does not try to sync Postgres extensions.

Data Protection

Always make sure your connection is secure when connecting to a database over a network you don’t fully trust. Your best option is to connect over SSH or a VPN. Another option is to use sslmode=verify-full . If you don’t do this, your database credentials can be compromised.

Sensitive Information

Prevent sensitive information like email addresses from leaving the remote server.

Define rules in .pgsync.yml :

data_rules:
  email: unique_email
  last_name: random_letter
  birthday: random_date
  users.auth_token:
    value: secret
  visits_count:
    statement: "(RANDOM() * 10)::int"
  encrypted_*: null

last_name matches all columns named last_name and users.last_name matches only the users table. Wildcards are supported, and the first matching rule is applied.

Options for replacement are:

unique_email
unique_phone
unique_secret
random_letter
random_int
random_date
random_time
random_ip
value
statement
null
untouched

Rules starting with unique_ require the table to have a primary key. unique_phone requires a numeric primary key.

Multiple Databases

To use with multiple databases, run:

pgsync --init db2

This creates .pgsync-db2.yml for you to edit. Specify a database in commands with:

pgsync --db db2

Safety

To keep you from accidentally overwriting production, the destination is limited to localhost or 127.0.0.1 by default.

To use another host, add to_safe: true to your .pgsync.yml .

Large Tables

For extremely large tables, sync in batches.

pgsync large_table --in-batches

The script will resume where it left off when run again, making it great for backfills.

Foreign Keys

By default, tables are copied in parallel. If you use foreign keys, this can cause violations. You can specify tables to be copied serially with:

pgsync group1 --debug

Reference

Help

pgsync --help

Version

pgsync --version

Scripts

Use groups when possible to take advantage of parallelism.

For Ruby scripts, you may need to do:

Bundler.with_clean_env do
  system "pgsync ..."
end

Dependencies

If installation fails, your system may be missing Ruby or libpq.

On Mac, run:

brew install postgresql

On Ubuntu, run:

sudo apt-get install ruby-dev libpq-dev build-essential

Upgrading

Run:

gem install pgsync

To use master, run:

gem install specific_install
gem specific_install https://github.com/ankane/pgsync.git

Thanks

Inspired by heroku-pg-transfer .

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/ankane/pgsync.git
cd pgsync
bundle install

createdb pgsync_test1
createdb pgsync_test2
createdb pgsync_test3

bundle exec rake test

以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

游戏编程模式

游戏编程模式

Robert Nystrom / GPP翻组 / 人民邮电出版社 / 2016-9-1 / 61.4

游戏开发一直是热门的领域,掌握良好的游戏编程模式是开发人员的应备技能。本书细致地讲解了游戏开发需要用到的各种编程模式,并提供了丰富的示例。 全书共分20章,通过三大部分内容全面介绍了与游戏编程模式相关的各类知识点。首部分介绍了基础知识和框架;第二部分深入探索设计模式,并介绍了模式与游戏开发之间的关联;第三部分介绍了13种有效的游戏设计模式。 本书提供了丰富的代码示例,通过理论和代码示例......一起来看看 《游戏编程模式》 这本书的介绍吧!

图片转BASE64编码
图片转BASE64编码

在线图片转Base64编码工具

随机密码生成器
随机密码生成器

多种字符组合密码

HSV CMYK 转换工具
HSV CMYK 转换工具

HSV CMYK互换工具