导入csv文件
LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.csv'
into table `jjdb_fkdb_all_dropdup_20w` character set utf8
fields terminated by ',' optionally enclosed by '"' escaped by '"'
lines terminated by '\r\n';
注:导入csv文件,对文件的结构和内容要求苛刻,失败概率很大。
导入txt文件
1,将csv文件保存为txt文件输出
df.to_csv('jjdb_fkdb_all_dropdup.txt',sep='\t',index=False)
2,进入数据库
mysql -uroot -puse mysql57
3,数据导入mysql
LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.txt'
若txt中字段的顺序与table中字段的顺序不一致,则会报错,所以以下命令中,按照table中的顺序输入字段名称:
INTO TABLE jjdb_fkdb_all_dropdup_20w IGNORE 1 LINES (xzqhdm,jjdbh,jjdwdm,jjybh,jjyxm,jjtbh,jjtip,jjsj,bjdh,jjlyh,bjrxm,bjrxbdm,lxdh,lxdz,jqdz,gxdwdm,jqlxdm,zddwxzb,zddwyzb,jqztdm,gxsjc,gljqbh,tfhm,rksjc,bjnr,bjlxmc,labels,sjdbh,jjdbh_f,cjdbh,fkdbh,fksj,fkdwip,fktbh,fkybh,fkdwdm,fkyxm,jwqdm,jwqmc,cjrxm,sjcjsj,ddxcsj,jqlbdm,jqxldm,jqfssj,jqjssj,jqdjdm,hzdjdm,afcslxdm,qhjzlbdm,cjqk,sfphxsaj,sfcczaaj,sfjjjf,cdclqk,cdryqk,hzyydm,cljgdm,cljg,rksjc_f,gxsjc_f,jqztdm_f,id);
导入成功
Query OK, 6331041 rows affected, 65535 warnings (2 min 30.16 sec)Records: 6331041 Deleted: 0 Skipped: 0 Warnings: 48877809
注意:
1,导入过程中出现1261报错:
ERROR 1261 (01000): Row 404 doesn't contain data for all columns
则设置sql_mode,操作如下:
show variables like "sql_mode";set sql_mode='';
2,导入过程中出现1064报错:
ERROR 1062 (23000): Duplicate entry '20' for key 'PRIMARY'
则在Navicat中,将table中的主键去掉即可,但如此导入结束后会发现数据量增加。
导出csv文件
从表jjdb_fkdb_all_dropdup_copy中导出csv文件(导出的csv文件用pandas读取可能会出错)
select * into outfile '/var/lib/mysql-files/all_vs_dup.csv' fields terminated by '\t' lines terminated by '\n' from jjdb_fkdb_all_dropdup_copy;
利用pandas从数据库中读取数据
import pandas as pdimport pymysql# sql 命令sql_cmd = "SELECT * FROM table_name"# 用DBAPI构建数据库链接enginecon = pymysql.connect(host='172.**.**.**', user='****', password='**这里填写数据库密码**', database='**填写数据库名称**', charset='utf8', use_unicode=True)df = pd.read_sql(sql_cmd, con)