Query performance with and's and or's

3 posts / 0 new
Last post
vdhanda
vdhanda's picture
Offline
Last seen: 1 month 1 week ago
Joined: Dec 3 2010
Expert Boarder

Posts: 49

Vidur Dhanda
Query performance with and's and or's

Query performance with and's and or's

It seems that infinidb has a performance issue when the query contains and's and or's. The query that I am trying to understand is:

mysql> select count(*) from csad where active_business = 1 and (bnkrpt = 0 or (bnkrpt= 1 and ifnull(cscore, 0) > 0)) and ( (datepll = 201011 and as_of_date = '2010-10-31' ) or (datepll = 201010 and as_of_date = '2010-09-30' ) );
+----------+
| count(*) |
+----------+
| 54348633 |
+----------+
1 row in set, 1 warning (1 hour 3 min 5.86 sec)

During query execution, PrimProc is consuming all 4 CPUs in a 4-way AMD Opteron box with 24 GB RAM. And there is no swapping with free reporting 9GB of free RAM.

Calpont.xml is attached.
An (abbreviated) description of the table is attached.
Transcript of the complete idbmysql session with trace and stats is attached.

In an attempt to understand this "poor" performance, I ran several queries and am close to concluding that there is indeed a challenge when the where clause has multiple and's and or's.

mysql> select count(*) from csad;
+------------+
| count(*) |
+------------+
| 1524303387 |
+------------+
1 row in set (31.83 sec)

mysql> select count(*) from csad where active_business = 1;
+-----------+
| count(*) |
+-----------+
| 762485208 |
+-----------+
1 row in set (39.71 sec)

mysql> select count(*) from csad where (bnkrpt = 0);
+------------+
| count(*) |
+------------+
| 1483582902 |
+------------+
1 row in set (50.56 sec)

mysql> select count(*) from csad where (bnkrpt= 1 and ifnull(cscore, 0) > 0);
+----------+
| count(*) |
+----------+
| 5887771 |
+----------+
1 row in set (24.73 sec)

Excellent performance thus far. Now put an OR in the clause.

mysql> select count(*) from csad where (bnkrpt = 0 or (bnkrpt= 1 and ifnull(cscore, 0) > 0));
+------------+
| count(*) |
+------------+
| 1489470673 |
+------------+
1 row in set (2 min 43.37 sec)

Slight degradation ...

mysql> select count(*) from csad where datepll = 201011 and as_of_date = '2010-10-31';
+----------+
| count(*) |
+----------+
| 58399542 |
+----------+
1 row in set (5.57 sec)

mysql> select count(*) from csad where datepll = 201010 and as_of_date = '2010-09-30';
+----------+
| count(*) |
+----------+
| 56882936 |
+----------+
1 row in set (4.03 sec)

mysql> select count(*) from csad where (datepll = 201011 and as_of_date = '2010-10-31' ) or (datepll = 201010 and as_of

radams
radams's picture
Offline
Last seen: 1 day 14 hours ago
Joined: Jan 3 2011
Administrator

Posts: 492

Robert Adams
Re: Query performance with and's and or's

Hi Vidur,

Yes, the difference is the result of using the OR. With the AND, blocks are eliminated and filters are applied at a low level. With the OR, all blocks are retrieved.

Thanks,

Robert
Calpont

vdhanda
vdhanda's picture
Offline
Last seen: 1 month 1 week ago
Joined: Dec 3 2010
Expert Boarder

Posts: 49

Vidur Dhanda
Re: Query performance with and's and or's

Can I get a little more visibility into this? Given the difference in performance, it is important to understand this in some detail. I'm assuming "With the OR, all blocks are retrieved" is not suggesting a complete scan of both sides ...

Vidur