MySQL extended insert mode in Talend Open Studio
Par Pierrick, mercredi 28 novembre 2007 à 11:33 / categorie: Talend / tags: / #107 / rss
In feature 2378, I've implemented MySQL specific extended insert mode. Extended insert means that instead of inserting lines one by one, you insert many lines in the same insert query. Don't get confuse with a transaction mecanism, it's not. The advantage is speed.
To illustrate the performance improvement we'll have in Talend Open Studio 2.3.0M2 using extended inserts, I've created a benchmark : we read lines from a delimited file and we insert them in a table. 3 simple fields per line (numeric id, firstname, lastname). 1 million of lines to insert.

rows per insert job execution time improvement N times faster
1 102.7 N/A 1
10 28.6 72.1 % 3.6
100 15.6 84.8 % 6.6
500 14.1 86.3 % 7.3
1000 14.3 86.1 % 7.2
5000 13.9 86.5 % 7.4
10000 14.7 85.7 % 7.0
As you can read, performance improvement is huge. When inserting lines by block of 1000, the whole job is nearly 8 times faster. We also see that a too high value decreases performance. This is because the more we have high value, the more we use memory.
Warning: don't use a very high value for this new parameter because MySQL limits the size of the query in max_allowed_packet, which is 16M by default on my MySQL 5.0.45.
As a comparison, the bulk insert is still a lot faster. The same operation is done in 6.5 seconds.
Commentaires
Aucun commentaire pour le moment.
Ajouter un commentaire
Les commentaires pour ce billet sont fermés.