Very poor update/insert ETL performance

davesisk
Joined: Oct 17 2011
Junior Boarder

Posts: 3

Dave Sisk, DBA
Very poor update/insert ETL performance

Greetings! We are doing some testing with the community edition of InfiniDB. We've found good performance on insert-only loads (roughly 800-1000 rows/second) and good query performance.

However, our data integration architecture involves capturing changes on the source databases and incrementally applying those changes on the target via "upsert" functionality. In other words, we first attempt an update...if no rows were updated, then we perform an insert. Fairly simple approach.
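For readers unfamiliar with the pattern, the update-then-insert logic described above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module as a stand-in for the target database, and the table `dim_customer` and its columns are hypothetical names, not part of our actual setup.

```python
import sqlite3

def upsert(conn, key, name):
    """Try an UPDATE first; fall back to an INSERT when no row matched."""
    cur = conn.execute(
        "UPDATE dim_customer SET name = ? WHERE id = ?", (name, key)
    )
    if cur.rowcount == 0:  # nothing updated, so the key is new
        conn.execute(
            "INSERT INTO dim_customer (id, name) VALUES (?, ?)", (key, name)
        )

# sqlite3 in-memory database used purely as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
# No primary key or index, mirroring the situation described in this post.
conn.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT)")

# Three captured changes; the third updates the row inserted by the first.
for key, name in [(1, "Acme"), (2, "Globex"), (1, "Acme Corp")]:
    upsert(conn, key, name)
conn.commit()

rows = conn.execute("SELECT id, name FROM dim_customer ORDER BY id").fetchall()
```

Each change costs at least one UPDATE, and without a key or index each of those UPDATEs has to locate its row by scanning, which is consistent with the slowdown described below.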

We've seen upsert performance of about 1-2 rows/second. :ohmy: Essentially, it takes us just as long to process 1,000 changed rows as it does to drop and re-insert 1,000,000+ rows. Presumably, this is because InfiniDB does not currently support primary or unique keys, nor any explicitly created indexes.

Are primary keys, unique keys, or indexes supported in the commercial edition? If not, is support for them planned at any time in the future?

Thanks and best regards,
Dave

radams
Joined: Jan 3 2011
Administrator

Posts: 487

Robert Adams
Re: Very poor update/insert ETL performance

Hi Dave,

InfiniDB is not intended to be a row-by-row OLTP database. We are a column-based system that can easily make updates at a hundred thousand rows per second using our cpimport tool. Here is a link that explains a process of extract, load, and transform.

http://infinidb.org/component/content/article/53/225

Thanks,

Robert
Calpont

davesisk
Re: Very poor update/insert ETL performance

I completely understand that InfiniDB is not an OLTP database...that is why we are looking at it in the first place. ;)

However, just because it's not an OLTP database does not mean that updates are never needed. Dimension maintenance is not always a simple insert...it depends on the dimension type, for instance. Surely a data warehousing database company is at least aware of the different dimension types and how each should be processed...right?

So, does cpimport handle updates as well as inserts? How would you integrate cpimport with a tool such as Talend Open Studio or Pentaho Data Integration?

Regards,
Dave

radams
Re: Very poor update/insert ETL performance

Hi Dave,

cpimport does inserts, not updates. We have customers who use many different BI tools with InfiniDB. We have quick start guides for some of them on our website at calpont.com.

http://www.calpont.com/resources/resource-library/category/9-infinidb-en...
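Since the bulk loader only inserts, one way to apply a batch of changed rows without row-by-row upserts is to land them in a staging table, delete the target rows they supersede, and then insert the whole batch in one statement. The sketch below illustrates that set-based pattern only. It uses Python's sqlite3 module as a stand-in, the names `dim_customer` and `stage_customer` are hypothetical, and whether these exact statements perform well (or are supported) on InfiniDB is an assumption that would need testing.

```python
import sqlite3

def apply_changes(conn, changes):
    """Apply a batch of changed rows using set-based statements."""
    # 1. Land the changes in a staging table. In a real deployment this is
    #    the step a bulk loader would perform (assumption).
    conn.execute("CREATE TEMP TABLE stage_customer (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO stage_customer VALUES (?, ?)", changes)
    # 2. Remove target rows that are superseded by a staged row.
    conn.execute(
        "DELETE FROM dim_customer WHERE id IN (SELECT id FROM stage_customer)"
    )
    # 3. Insert every staged row, both new keys and replacements.
    conn.execute(
        "INSERT INTO dim_customer (id, name) "
        "SELECT id, name FROM stage_customer"
    )
    conn.execute("DROP TABLE stage_customer")
    conn.commit()

# Stand-in target table with two existing rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# One changed row (id 1) and one new row (id 3).
apply_changes(conn, [(1, "Acme Corp"), (3, "Initech")])
rows = conn.execute("SELECT id, name FROM dim_customer ORDER BY id").fetchall()
```

The number of statements stays constant regardless of batch size, which suits a column store better than one UPDATE per changed row. It covers overwrite-style dimension changes only; dimensions that keep history would need additional logic.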

Thanks,

Robert
Calpont