Huge mysql database migration

deano

Dean Kamali
Huge mysql database migration

Hello

I have a huge MySQL database, approximately 85 GB for 2011 alone; previous years are relatively smaller in size.

From the docs, I should use the following command to import data into my InfiniDB database.

[code]idbmysql -e 'select * from source_table;' -N db2 | /usr/local/Calpont/bin/cpimport db table1 -s '\t'[/code]

I'm using the open source version of InfiniDB on a virtual machine with 4 GB of memory.

The only issue I faced is that my VM runs out of memory soon after running the cpimport utility.

After reading up on how the cpimport tool works, I learned that it loads data into memory before it starts importing it into the columns.

I tried splitting the database by lines using the split command. It works, but it requires a lot of babysitting before you finish importing the entire database.
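
Roughly what that looked like, as a sketch (the chunk size and file names here are just for illustration, and the source_db/archive_db/calldetailrecord names are the ones I use later in this post):

[code]# Dump the table to a tab-delimited file, then cut it into fixed-size pieces
idbmysql -proot -e 'select * from calldetailrecord;' -N source_db > calldetailrecord.tbl
split -l 1000000 calldetailrecord.tbl chunk_

# Feed the chunks to cpimport one at a time
for f in chunk_*; do
    /usr/local/Calpont/bin/cpimport archive_db calldetailrecord "$f" -s '\t'
done[/code]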

I found a better way of doing it by following the steps below.

What I have done is to count the number of rows in MySQL using the following command:

1. [code]select count(*) from table_name;[/code]
Note the number of rows down (in my case this is a small database with 23,897,040 rows).

2. Run a bash loop that uses LIMIT to restrict the number of rows fed to the cpimport tool on each pass:

[code]for (( i=0; i <= 23897040; i+=1000000 )); do idbmysql -proot -e "select * from calldetailrecord limit $i,1000000;" -N source_db | /usr/local/Calpont/bin/cpimport archive_db calldetailrecord -j501 -s '\t'; done[/code]

This will feed one million rows at a time until it reaches 23,897,040.

I found this is the only way I could get the job done without having my virtual machine run out of memory.

Is there a better way? ;)

bdempsey

Robert Dempsey
Re: Huge mysql database migration

cpimport doesn't normally read the whole input file into memory. By default, it reads about 10MB of data at a time. It's probably the mysql client under the covers in the idbmysql script that is consuming all your memory. Try adding "--quick" to the idbmysql command so the mysql client doesn't buffer results faster than cpimport can consume them.
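
For example, something like this (assuming the idbmysql script passes extra options straight through to the underlying mysql client, and reusing the database/table names from your loop above):

[code]idbmysql --quick -proot -e 'select * from calldetailrecord;' -N source_db | /usr/local/Calpont/bin/cpimport archive_db calldetailrecord -s '\t'[/code]

With --quick the client streams each row as it is received instead of buffering the whole result set, so you shouldn't need the LIMIT loop at all.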

deano

Dean Kamali
Re: Huge mysql database migration

Wow, it has been almost a year already...

Well, still working on the same project... anyway.

I have been following a slightly different strategy when importing data, which I think is a lot better than the steps above:

1. Take a MySQL dump of your production database (see the sketch just after this list)

2. Convert the MySQL dump file to a CSV file

3. Import the CSV into InfiniDB using cpimport
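
For step 1 I just use plain mysqldump; something along these lines (the exact flags and file name here are only an example, not a requirement):

[code]# Dump only the data for the table. --skip-extended-insert puts one row
# per INSERT line, which keeps the conversion in step 2 simple.
mysqldump -u root -p --no-create-info --skip-extended-insert source_db calldetailrecord > MySQL-Backup.sql[/code]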

Note: you can also generate the CSV directly with a SELECT ... INTO OUTFILE in MySQL, but in my environment this just takes a lot of time and causes lots of issues.
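
For reference, that direct route is a statement along these lines (path and delimiter are just an example; the file lands on the MySQL server's own filesystem and needs the FILE privilege):

[code]mysql -u root -p -e "select * from calldetailrecord into outfile '/tmp/calldetailrecord.csv' fields terminated by '|' lines terminated by '\n';" source_db[/code]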

Converting the MySQL dump file to CSV is very annoying; here is part of what I did to get it done.

[code]awk '/INSERT INTO `Table-Name` VALUES/' MySQL-Backup.sql | awk '{print $NF;}' | sed -e 's/(//;s/);//;s/,/|/g;s|["'\'']||g' -e 's/)/\n/g' -e 's/(//g' -e 's/^|//g' > FILE-NAME.csv[/code]

Now sometimes you will get a mismatch when importing the data: InfiniDB complains that it expects 36 columns, for example, while some of the rows in your MySQL dump have fewer than 36 fields for some reason.

You can fix that with the following awk script (change the 36 to your own column count):

[code]awk -F'|' -v OFS='|' '{for(i=NF+1;i<=36;i++)$i=""}1' FILE-NAME.csv [/code]

It counts the fields in each row and, if there are fewer than 36, appends empty fields to pad the row out.
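
Once the rows are padded, step 3 is just cpimport again; you can pipe the awk output straight in, the same way as the mysql pipe earlier (archive_db and calldetailrecord are the names from my first post, and -s '|' matches the delimiter the conversion above produced):

[code]awk -F'|' -v OFS='|' '{for(i=NF+1;i<=36;i++)$i=""}1' FILE-NAME.csv | /usr/local/Calpont/bin/cpimport archive_db calldetailrecord -s '|'[/code]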

Hope this helps

cheers