Infinidb broken....

steffzt
Joined: Feb 5 2010
Senior Boarder

Posts: 30

Steffen Kirschke
Infinidb broken....

Last night, cpimport stopped working on my InfiniDB installation (1.5 final).

In the colxml error logs I sometimes see messages like this:
2010-07-17 10:48:16 (7033) ERR : colxml exception: DDLPackageProcessor::getColumnsForTable: while reading columns for table logdb.mlogs: Error occured when calling makeJobList

This happens sometimes, not always.

In the cpimport error log:

2010-07-16 19:50:19 (2274) ERR : Error in acquiring table lock for table logdb.mlogs; OID-3262; BRM error setting a table lock. [1006]

But it shouldn't: there's only one process at a time writing data.
Trying to clear the table lock, shutting down InfiniDB, clearing shm
and restarting didn't change anything.

I thought, OK, my table may be broken, so let's rename it and create a new
one. No other jobs were accessing the database at the time.
This didn't work, and afterwards the system was in read-only mode.
I set it to read-write and tried to drop the table, with the same result.

OK, I uninstalled InfiniDB completely and removed all remaining files and
directories, except the Debian packages of course.

Uninstallation:
-------------------
abu-hub001:~# sudo apt-get purge calpont calpont-mysql calpont-mysqld
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
calpont* calpont-mysql* calpont-mysqld*
0 upgraded, 0 newly installed, 3 to remove and 144 not upgraded.
After this operation, 40.0MB disk space will be freed.
Do you want to continue [Y/n]?
(Reading database ... 36057 files and directories currently installed.)
Removing calpont ...
Purging configuration files for calpont ...
dpkg: warning: while removing calpont, directory '/usr/local/Calpont/etc' not empty so not removed.
dpkg: warning: while removing calpont, directory '/usr/local/Calpont/bin' not empty so not removed.
Removing calpont-mysql ...
Purging configuration files for calpont-mysql ...
dpkg: warning: while removing calpont-mysql, directory '/usr/local/Calpont/lib' not empty so not removed.
Removing calpont-mysqld ...
Purging configuration files for calpont-mysqld ...
dpkg: warning: while removing calpont-mysqld, directory '/usr/local/Calpont/mysql/lib/mysql/plugin' not empty so not removed.
dpkg: warning: while removing calpont-mysqld, directory '/usr/local/Calpont/mysql/lib/mysql' not empty so not removed.
dpkg: warning: while removing calpont-mysqld, directory '/usr/local/Calpont/mysql/lib' not empty so not removed

steffzt
Re:Infinidb broken....

And this is what I get from syslog when trying to start InfiniDB:

Jul 17 12:12:57 abu-hub001
Jul 17 12:13:17 abu-hub001
Jul 17 12:18:36 abu-hub001 DMLProc[28907]: 36.559305 |0|0|0| I 20 CAL0002: DMLProc starts bulk roll back.
Jul 17 12:18:36 abu-hub001 DMLProc[28907]: 36.560027 |0|0|0| I 20 CAL0002: No table need rollback
Jul 17 12:18:36 abu-hub001 messagequeue[28907]: 36.574938 |0|0|0| I 20 CAL0002: DMLProc finished bulk roll back.
Jul 17 12:18:36 abu-hub001 DMLProc[28907]: 36.576397 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll transactions.
Jul 17 12:18:41 abu-hub001 writeengine[28915]: 41.606001 |0|0|0| I 19 CAL0060: dbbuilder system catalog status: System catalog appears to exist. It will remain intact for reuse. The database is not recreated.
Jul 17 12:18:50 abu-hub001 kernel: mysqld[29203]: segfault at 7f1f712f8630 ip 00007f1f95b45db7 sp 00007f1f96eee400 error 4 in libgcc_s.so.1[7f1f95b36000+16000]
Jul 17 12:18:55 abu-hub001

steffzt
Re:Infinidb broken....

Platform Process check: ./infinidb: line 112: [: too many arguments
DONE

I found that this happens whenever an old workernode is left running
(which I had overlooked).

I added the following to killProcesses() in the infinidb start script to
make sure I'll never again get trapped by hanging processes:

sleep 1
# to make sure, they are really dead
pkill -9 DMLProc > /dev/null
pkill -9 DDLProc > /dev/null
pkill -9 ExeMgr > /dev/null
pkill -9 PrimProc > /dev/null
pkill -9 controllernode > /dev/null
pkill -9 workernode > /dev/null
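
A slightly more cautious variant of the same idea, as a sketch only: it first asks each process to exit with SIGTERM, waits a moment, sends SIGKILL only to survivors, and then reports anything still alive. The function name kill_leftovers and the INFINIDB_PROCS variable are made up for this example and are not part of the shipped infinidb script; the process names are the ones from the list above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the shipped infinidb script.
# Process names are taken from the pkill list in the post above.
INFINIDB_PROCS="DMLProc DDLProc ExeMgr PrimProc controllernode workernode"

# kill_leftovers NAME...
# Send SIGTERM first, wait briefly, then SIGKILL whatever survived.
# Returns non-zero if any named process is still running afterwards.
kill_leftovers() {
    for name in "$@"; do
        # pgrep/pkill -x match the exact process name only
        if pgrep -x "$name" > /dev/null 2>&1; then
            pkill -TERM -x "$name" > /dev/null 2>&1
        fi
    done

    # Give the processes a moment to shut down on their own
    sleep 1

    for name in "$@"; do
        if pgrep -x "$name" > /dev/null 2>&1; then
            pkill -KILL -x "$name" > /dev/null 2>&1
        fi
    done

    # Final check: report anything that is still in the process table
    status=0
    for name in "$@"; do
        if pgrep -x "$name" > /dev/null 2>&1; then
            echo "still running: $name" >&2
            status=1
        fi
    done
    return $status
}
```

Inside the start script this could be called as `kill_leftovers $INFINIDB_PROCS` (unquoted on purpose, so the list is split into one argument per process name).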

steffzt
Re:Infinidb broken....

Finally got it working again.

It seems there was a hung workernode that I overlooked when checking the process table.

davidhill
Joined: Oct 27 2009
Administrator

Posts: 564

david hill
Re:Infinidb broken....

Thanks for the feedback.

We will investigate this issue:

Platform Process check: ./infinidb: line 112: [: too many arguments
DONE

On the error related to logging not working:

InfiniDB Logging check: grep: /var/log/Calpont/debug.log: No such file or directory
ERROR: InfiniDB logging not functioning

Reasons this could happen:

1. You aren't using syslog or rsyslog, which are what we currently support.
2. The syslog config file didn't get set up correctly (we append our entries to the bottom of it).
3. The syslog daemon didn't get restarted correctly after the install.
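
The three reasons above can be narrowed down with a quick check like the following. This is a rough sketch, not an official tool: the function names are invented for the example, the daemon names (rsyslogd, syslogd, syslog-ng) are the usual process names and may differ on a given system, and only the debug.log path is taken from the error message above.

```shell
#!/bin/sh
# Hypothetical diagnostic; only the log path comes from the error message above.
DEBUG_LOG="/var/log/Calpont/debug.log"

# Print the name of the first known syslog daemon that is running, if any.
running_syslog_daemon() {
    for daemon in rsyslogd syslogd syslog-ng; do
        if pgrep -x "$daemon" > /dev/null 2>&1; then
            echo "$daemon"
            return 0
        fi
    done
    return 1
}

# check_logging [FILE]
# Report which syslog daemon is running and whether the debug log exists.
# Returns non-zero if the log file is missing.
check_logging() {
    log_file="${1:-$DEBUG_LOG}"
    if daemon=$(running_syslog_daemon); then
        echo "syslog daemon: $daemon"
    else
        echo "syslog daemon: none found"
    fi
    if [ -f "$log_file" ]; then
        echo "debug log: present"
    else
        echo "debug log: missing ($log_file)"
        return 1
    fi
}
```

Running `check_logging` with no argument checks the default path. If the reported daemon is syslog-ng, that points to reason 1; if a supported daemon is running but the file is missing, reasons 2 and 3 are the ones to look at.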

Thanks for the input on killing the processes. There are times when InfiniDB or mysqld needs to be forcefully shut down if an error occurs while processing a query or a DDL/DML statement. So that is a good procedure for making sure all processes are fully shut down after 'infinidb stop' has been run on a system that was having a problem.

The BRM going into read-only mode is a good indication that something failed, causing InfiniDB to lock the BRM data. The log files would generally show what happened and why it went into read-only mode.
As you found, our logs, except for debug.log, are written to /var/log/messages.

David H.

davidhill
Re:Infinidb broken....

I was able to reproduce the "too many arguments" script error and opened a bug:

https://bugs.launchpad.net/infinidb/+bug/607368

The problem occurs when one workernode wasn't cleanly shut down and the start launched a second one, which caused the scripting error.

So we are looking at changing the script:

1. To handle multiple copies running without causing the error.
2. To possibly add the forced killing of the processes, as you did in your version.
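
For readers wondering how a second workernode can turn into a shell error: line 112 of the infinidb script is not shown in this thread, so the following is only a guess at the general mechanism, with made-up variable and function names. If a PID lookup returns two PIDs and the result is used unquoted inside `[ ... ]`, the shell splits it into two words and the test gets more arguments than it can parse. bash reports this as "[: too many arguments"; other shells word the message differently.

```shell
#!/bin/sh
# Hypothetical illustration; the real line 112 of the infinidb script
# is not shown in this thread.

# Simulated result of a PID lookup that found two workernode processes.
pids="28907 28915"

# Unquoted: $pids is split into two words, so the test becomes
#   [ 28907 28915 != "" ]
# which is not a valid test expression and fails with an error.
unquoted_check() {
    [ $pids != "" ]
}

# Quoted: the whole value is passed as a single argument, so the test
# stays valid no matter how many PIDs were found.
quoted_check() {
    [ -n "$pids" ]
}
```

Quoting the variable is the usual fix for this class of error. Whether the script should then refuse to start or clean up the extra process is a separate decision.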

steffzt
Re:Infinidb broken....

One thing to add:

The reason I tried to restart InfiniDB was that I ran into problems with cpimport and very slow results on selects.
cpimport didn't work because InfiniDB was in read-only mode and, after setting it
to read-write, the first attempt to change anything (cpimport, renaming or altering
tables, dropping a table...) put it back into read-only mode, failing the requested action.

This led me to a completely new, clean install, and I hoped to be back up soon,
which failed due to the leftover workernode.

Now, how can this situation be reproduced?
- start a long-running query
- think, oops, that'll give me a few rows too many
- ^C in the mysql client
- and we have a workernode that hangs.

It doesn't always happen, but frequently. (I annotated the bug.)
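
After the ^C step above, one way to see whether a process was left behind is to count the instances of each InfiniDB process by exact name. This is just a sketch with invented function names; the process names are the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical check; function names are made up for this example.

# count_procs NAME: print how many processes are named exactly NAME.
count_procs() {
    pgrep -x "$1" 2>/dev/null | wc -l | tr -d ' '
}

# report_leftovers NAME...
# Print "NAME: COUNT" for every name with at least one running process.
# Returns 0 if anything was found, 1 if the process table is clean.
report_leftovers() {
    found=1
    for name in "$@"; do
        n=$(count_procs "$name")
        if [ "$n" -gt 0 ]; then
            echo "$name: $n"
            found=0
        fi
    done
    return $found
}
```

For example, running `report_leftovers workernode controllernode` after 'infinidb stop' should print nothing. Any line of output means a process survived the stop.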

It also _seems_ to happen sometimes when cpimport loses its
parent process; this is not verified, because cpimport is too fast. ;-)

@logs: You're right, I'm using syslog-ng V3.0.
Thanks for the tip.

davidhill
Re:Infinidb broken....

In case you haven't found the syslog-ng setup posting, here it is:

http://www.infinidb.org/index.php?option=com_kunena&Itemid=64&func=view&...

Also, thanks for the procedure to reproduce the hanging workernode issue. We will investigate and file a bug for that one.

David H.