Converting MySQL to PostgreSQL
当前需要将 MySQL 数据迁移至 PostgreSQL,参考资料后决定采用 pgloader 去做这件事。
pgloader https://github.com/dimitri/pgloader
pgloader 介绍 https://pgloader.io/
pgloader 文档 https://pgloader.readthedocs.io/en/latest/
pgloader 安装
apt-get 方式与 docker
根据介绍可知比较简单的方式有 apt-get 与 使用 docker。
apt-get 方式
apt-get install pgloader
使用 docker
docker pull dimitri/pgloader
docker run --rm --name pgloader dimitri/pgloader:latest pgloader --version
docker run --rm --name pgloader dimitri/pgloader:latest pgloader --help
由于我的环境是 CentOS 7,所以就采用 docker 方式安装了。
使用中遇到了一些问题,发现已经有人提了 issue
Couldn’t re-execute SBCL with proper personality flags #511
开发者认为是 SBCL 的问题,在最后有人给出了解决方案。
Running in Docker (SBCL warning)
源码安装
clone 至本地后,需要执行bootstrap-centos7.sh
安装所需依赖
git clone https://github.com/dimitri/pgloader.git
cd pgloader
chmod +x ./bootstrap-centos7.sh
./bootstrap-centos7.sh
make pgloader
bootstrap-centos7.sh 中内容为
#!/usr/bin/env bash
sudo yum -y install yum-utils rpmdevtools @"Development Tools" \
sqlite-devel zlib-devel
# Enable epel for sbcl
sudo yum -y install epel-release
sudo yum -y install sbcl
# Missing dependency
sudo yum install freetds freetds-devel -y
sudo ln -s /usr/lib64/libsybdb.so.5 /usr/lib64/libsybdb.so
# prepare the rpmbuild setup
rpmdev-setuptree
因为需要考虑离线安装的方式,所以下载所需依赖的 rpm 包,注意考虑依赖是否存在依赖需要安装或者升级。
sbcl
cl-asdf-20101028-8.el7.noarch.rpm
common-lisp-controller-7.4-8.el7.noarch.rpm
sbcl-1.4.0-1.el7.x86_64.rpm
freetds
unixODBC-2.3.1-14.el7.x86_64.rpm
libtool-ltdl-2.4.2-22.el7_3.x86_64.rpm
freetds-libs-1.1.11-1.el7.x86_64.rpm
freetds-1.1.11-1.el7.x86_64.rpm
freetds-devel-1.1.11-1.el7.x86_64.rpm
rpm -ivh --replacefiles *.rpm
后再去 make 即可。
make 时如果出现问题,可以参考如下 issue
PGLOADER installation in centos #876
开发者提供了不必在编译时获取构建依赖项的版本
https://github.com/dimitri/pgloader/releases/download/v3.6.0/pgloader-bundle-3.6.0.tgz
使用
详细使用方式参照上方文档连接
./build/bin/pgloader --version
./build/bin/pgloader --help
alias
alias pgloader=./bin/pgloader
pg.load 中定义操作
LOAD DATABASE
FROM mysql://mysql_user:mysql_passwd@localhost:3306/mysql_database
INTO pgsql://pg_user:pg_passwd@localhost:5432/pg_database
WITH include drop, create tables, create indexes, workers = 8, concurrency = 1
ALTER SCHEMA 'mysql_database' RENAME TO 'public'
;
加载数据
pgloader pg.load
数据处理
The following table shows the mapping between PostgreSQL (source) data types and MySQL data types.
Source Type | MySQL Type | Comment |
---|---|---|
INT | INT | |
SMALLINT | SMALLINT | |
BIGINT | BIGINT | |
SERIAL | INT | Sets AUTO_INCREMENT in its table definition. |
SMALLSERIAL | SMALLINT | Sets AUTO_INCREMENT in its table definition. |
BIGSERIAL | BIGINT | Sets AUTO_INCREMENT in its table definition. |
BIT | BIT | |
BOOLEAN | TINYINT(1) | |
REAL | FLOAT | |
DOUBLE PRECISION | DOUBLE | |
NUMERIC | DECIMAL | |
DECIMAL | DECIMAL | |
MONEY | DECIMAL(19,2) | |
CHAR | CHAR/LONGTEXT | Depending on its length. MySQL Server 5.6 and higher can have CHAR columns with a length up to 255 characters. Anything larger is migrated as LONGTEXT. |
NATIONAL CHARACTER | CHAR/LONGTEXT | Depending on its length. MySQL Server 5.6 and higher can have VARCHAR columns with a length up to 65535 characters. Anything larger is migrated to one of the TEXT blob types. In MySQL, a character set of strings depends on the column character set instead of the data type. |
VARCHAR | VARCHAR/MEDIUMTEXT/LONGTEXT | Depending on its length. MySQL Server 5.6 and higher can have VARCHAR columns with a length up to 65535 characters. Anything larger is migrated to one of the TEXT blob types. |
NATIONAL CHARACTER VARYING | VARCHAR/MEDIUMTEXT/LONGTEXT | Depending on its length. MySQL Server 5.6 and higher can have VARCHAR columns with a length up to 65535 characters. Anything larger is migrated to one of the TEXT blob types. In MySQL, a character set of strings depends on the column character set instead of the data type. |
DATE | DATE | |
TIME | TIME | |
TIMESTAMP | DATETIME | |
INTERVAL | TIME | |
BYTEA | LONGBLOB | |
TEXT | LONGTEXT | |
CIDR | VARCHAR(43) | |
INET | VARCHAR(43) | |
MACADDR | VARCHAR(17) | |
UUID | VARCHAR(36) | |
XML | LONGTEXT | |
JSON | LONGTEXT | |
TSVECTOR | LONGTEXT | |
TSQUERY | LONGTEXT | |
ARRAY | LONGTEXT | |
POINT | POINT | |
LINE | LINESTRING | Although LINE length is infinite, and LINESTRING is finite in MySQL, it is approximated. |
LSEG | LINESTRING | A LSEG is like a LINESTRING with only two points. |
BOX | POLYGON | A BOX is a POLYGON with five points and right angles. |
PATH | LINESTRING | |
POLYGON | POLYGON | |
CIRCLE | POLYGON | A POLYGON is used to approximate a CIRCLE. |
TXID_SNAPSHOT | VARCHAR |
MySQL 中的点和线
迁移后 geometry 相关的字段是不对的,需要另外处理。
MySQL 中的点、线字段类型是point
、linestring
。
PostgreSQL 中点、线字段类型是geometry(Point,4326)
、geometry(LineString,4326)
。
MySQL 中的点生成方式为ST_GeomFromText('POINT(longitude latitude)', 4326)
MySQL 中的线生成方式为GeometryFromText('LINESTRING(x0 y0, x1 y1, x2 y2)',0)
PostgreSQL 中的点
表中已经储存了经纬度,所以直接 drop 掉重新 add 后再维护数据。
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE table_name ADD COLUMN column_name geometry(Point, 4326);
重新生成
UPDATE table_name SET column_name = ST_SetSRID(ST_Point(longitude, latitude), 4326);
或者
UPDATE table_name SET column_name = ST_GeomFromText('POINT(longitude latitude)', 4326);
PostgreSQL 中的线
同理
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE table_name ADD COLUMN column_name geometry(LineString, 4326);
UPDATE table_name SET column_name = ST_GeomFromText('LINESTRING(x0 y0, x1 y1, x2 y2)', 4326));