Friday 21 December 2007
By Pierrick,
Friday 21 December 2007 at 00:08 / category: Talend
Talend Open Studio 2.3.0M2 is out. Here is what's new in Perl generation compared to the current main release, 2.2.3. As you will see, Perl code generation is still progressing :-) There are 13 new components and 8 new features in existing components. This post only covers Perl code generation; there are of course more new features, fully listed on the official ChangeLog page under releases 2.3.0M1 and 2.3.0M2.
Thursday 29 November 2007
By Pierrick,
Thursday 29 November 2007 at 16:44 / category: Talend
Three years ago, I introduced in PhpWebGallery a very fast way to update several rows of the same table at once. See PhpWebGallery Subversion revision 625 for details. I don't remember how the idea came to me, but I've now implemented it as a component in Talend Open Studio. The purpose is to speed up mass updates.
The standard way to update several rows of a table, with different values for each row of course, is to run one query per row. In a web application, it is really bad not to know in advance how many queries each page will run. In any other situation, it is bad simply because it is very slow.
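One possible way to get a fixed number of queries, whatever the number of rows, is to load the new values into a temporary table and apply them with a single UPDATE. The sketch below is only an illustration under assumptions: it uses Python with SQLite as a stand-in for MySQL, the table and column names (`images`, `updates`, `name`) are hypothetical, and it is not necessarily what revision 625 or the Talend component does.

```python
import sqlite3


def mass_update(conn, rows):
    """Update many rows of the hypothetical `images` table at once.

    `rows` is a list of (id, new_name) tuples. The number of SQL statements
    is constant (4), whatever the number of rows to update.
    """
    if not rows:
        return 0
    cur = conn.cursor()
    # 1. Temporary table holding the new values.
    cur.execute("CREATE TEMP TABLE updates (id INTEGER PRIMARY KEY, name TEXT)")
    # 2. One multi-row INSERT to fill it (small batches assumed here, since
    #    databases limit the number of bound parameters per statement).
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    cur.execute("INSERT INTO updates (id, name) VALUES " + placeholders, params)
    # 3. One UPDATE driven by the temporary table.
    cur.execute(
        """
        UPDATE images
           SET name = (SELECT u.name FROM updates u WHERE u.id = images.id)
         WHERE id IN (SELECT id FROM updates)
        """
    )
    updated = cur.rowcount
    # 4. Clean up.
    cur.execute("DROP TABLE updates")
    conn.commit()
    return updated
```

Called as `mass_update(conn, [(1, "x"), (3, "z")])`, it changes both rows with the same four statements it would use for a thousand rows.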
Wednesday 28 November 2007
By Pierrick,
Wednesday 28 November 2007 at 11:33 / category: Talend
In feature 2378, I've implemented the MySQL-specific extended insert mode. Extended insert means that instead of inserting rows one by one, you insert many rows in the same INSERT query. Don't confuse it with a transaction mechanism; it is not one. The advantage is speed.
To illustrate the performance improvement we'll have in Talend Open Studio 2.3.0M2 using extended inserts, I've created a benchmark: we read lines from a delimited file and insert them into a table, with 3 simple fields per line (numeric id, firstname, lastname) and 1 million lines to insert.
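As a rough illustration of the idea, here is a minimal sketch that groups rows into multi-row INSERT statements. It is written in Python against SQLite rather than generated Perl against MySQL, and the table name `people` and the batch size are assumptions made for the example, not values from the benchmark.

```python
import sqlite3


def extended_insert(conn, rows, batch_size=100):
    """Insert (id, firstname, lastname) rows with multi-row INSERT statements.

    Returns the number of INSERT statements sent, which is roughly
    len(rows) / batch_size instead of len(rows).
    """
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(?, ?, ?)" group per row in the batch.
        placeholders = ", ".join(["(?, ?, ?)"] * len(batch))
        params = [value for row in batch for value in row]
        conn.execute(
            "INSERT INTO people (id, firstname, lastname) VALUES " + placeholders,
            params,
        )
        statements += 1
    conn.commit()
    return statements
```

With 250 rows and the default batch size, the database receives 3 INSERT statements instead of 250.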
Friday 23 November 2007
By Pierrick,
Friday 23 November 2007 at 17:51 / category: Talend
I've updated the first Talend Open Studio "use case", which I wrote nearly one year ago with release 1.1.0RC1. This time I use new features from Talend Open Studio 2.2.x: tUnite and tNormalize avoid the temporary file, and the "include sub directories" option in tFileList makes the job smarter.

Monday 22 October 2007
By Pierrick,
Monday 22 October 2007 at 22:09 / category: GNU/linux

We're using Debian Etch (with GNU/Linux) as a server at the Talend office. We need to reach a remote Microsoft SQL Server database. The first step is to run a select query from the command line.
We need to install FreeTDS: FreeTDS is a set of libraries for Unix and Linux that allows your programs to natively talk to Microsoft SQL Server and Sybase databases.
We have to define an "interface" for the Microsoft SQL Server in the FreeTDS "interfaces" file. Finally, we use sqsh, a command line client for Sybase and Microsoft SQL Server.
Friday 7 September 2007
By Pierrick,
Friday 7 September 2007 at 16:12 / category: GNU/linux

A long time ago, I tried to connect to an SSH server with my private key in batch mode (from a cron task). I couldn't find a way to do it. Now I have: it is as simple as having no passphrase on your private key. This is less secure (though still much more secure than an FTP connection), but it makes SSH usable from a cron task.
Tuesday 21 August 2007
By Pierrick,
Tuesday 21 August 2007 at 18:06 / category: Talend

Richard and I both spent 2 weeks on a major improvement proposed by Richard.
.----------------------------------------------------.
| job        | TOS 2.1.1 | TOS 2.2.0M1 | improvement |
+------------+-----------+-------------+-------------+
| Scenario 2 | 20.8 s    | 16.9 s      | 18.8 %      |
| Scenario 3 | 81.2 s    | 30.4 s      | 62.6 %      |
'------------+-----------+-------------+-------------'
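The improvement column appears consistent with the relative reduction in execution time, (old - new) / old. The short Python sketch below recomputes it from the two timing columns; the function name is hypothetical and the formula is inferred from the figures, not stated in the post.

```python
from decimal import Decimal, ROUND_HALF_UP


def improvement_percent(before, after):
    """Relative time reduction in percent, rounded to one decimal place.

    `before` and `after` are timings in seconds, passed as strings so that
    the decimal arithmetic is exact.
    """
    before, after = Decimal(before), Decimal(after)
    percent = (before - after) / before * 100
    return percent.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(improvement_percent("20.8", "16.9"))  # Scenario 2
print(improvement_percent("81.2", "30.4"))  # Scenario 3
```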
Friday 6 July 2007
By Pierrick,
Friday 6 July 2007 at 12:08 / category: Development

As another reminder for myself, here is a list of join examples with MySQL (to compare with the Oracle behaviour in the previous blog post).
By Pierrick,
Friday 6 July 2007 at 11:56 / category: Development

As a reminder for myself, here is a list of join examples using Oracle.
Thursday 10 May 2007
By Pierrick,
Thursday 10 May 2007 at 10:10 / category: Opensource
PEM is an open source web application that lets project users share their own project extensions. PEM stands for Project Extension Manager.

Tuesday 17 April 2007
By Pierrick,
Tuesday 17 April 2007 at 11:48 / category: GNU/linux

When your Subversion repository gets bigger and bigger, you need a way to back up only what's new rather than the whole repository. Thanks to Subversion revisions, we can easily identify what's new since the last backup. I've used this principle to write a Perl script that makes incremental backups.
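The principle can be sketched as follows. This is a Python illustration, not the Perl script from the post: it relies on the standard `svnlook youngest` and `svnadmin dump --incremental -r LOWER:UPPER` commands, and the function names and the convention of remembering the last backed-up revision are assumptions.

```python
import subprocess


def next_dump_range(last_backed_up, youngest):
    """Return the (first, last) revisions still to back up, or None.

    Use last_backed_up = -1 when no backup has been made yet, so that the
    first dump starts at revision 0.
    """
    if youngest <= last_backed_up:
        return None
    return (last_backed_up + 1, youngest)


def build_dump_command(repo_path, first, last):
    """Build the svnadmin command dumping only revisions first..last."""
    return ["svnadmin", "dump", repo_path, "-r", f"{first}:{last}", "--incremental"]


def youngest_revision(repo_path):
    """Ask Subversion for the most recent revision number of the repository."""
    result = subprocess.run(
        ["svnlook", "youngest", repo_path],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


def incremental_backup(repo_path, last_backed_up, dump_file):
    """Dump new revisions to dump_file; return the new last backed-up revision."""
    revisions = next_dump_range(last_backed_up, youngest_revision(repo_path))
    if revisions is None:
        return last_backed_up  # nothing new since the last backup
    with open(dump_file, "wb") as output:
        subprocess.run(build_dump_command(repo_path, *revisions), check=True, stdout=output)
    return revisions[1]
```

The caller is expected to store the returned revision number somewhere (a small state file, for instance) and pass it back on the next run.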
Wednesday 24 January 2007
By Pierrick,
Wednesday 24 January 2007 at 13:58 / category: Perl
I have an Oracle database with UTF-8 data inside. In a Perl script, I want to extract, transform and print this data to STDOUT, the standard output. The only difference between this entry and "Oracle to file in UTF-8, with Perl" is the destination of the data, so to avoid redundancy, take the time to read the previous entry.
By Pierrick,
Wednesday 24 January 2007 at 13:45 / category: Perl
I have an Oracle database with UTF-8 data inside. In a Perl script, I want to extract, transform and load this data into a file.
Monday 8 January 2007
By Pierrick,
Monday 8 January 2007 at 12:17 / category: Talend

Talend Open Studio release 1.1.0 is out, exactly 3 months after release 1.0.0. This release contains many new features and of course many new Perl components. The list of new features is described on Freshmeat.
To give an example, TOS is now able to perform a job such as:
- retrieve email files from a remote POP3 server
- extract information from email headers (such as the "From" information)
- count the number of emails coming from the same author, with the new aggregate functions
- sort the result
- load the result in bulk mode into a MySQL database
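The middle steps of such a job (extract the "From" header, count per author, sort) can be sketched in a few lines. This is a hand-written Python illustration using the standard `email` module, not the Perl code that TOS generates; the function name is hypothetical, and the POP3 retrieval and MySQL bulk load steps are left out.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def count_senders(raw_emails):
    """Count emails per sender address and sort by decreasing count.

    `raw_emails` is an iterable of raw message texts (headers and body).
    Returns a list of (address, count) tuples; ties are sorted by address.
    """
    counts = Counter()
    for raw in raw_emails:
        message = message_from_string(raw)
        # parseaddr splits 'Alice <alice@example.org>' into name and address.
        _name, address = parseaddr(message.get("From", ""))
        if address:
            counts[address.lower()] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```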
TOS can also read XML files with standard XPath queries, and even read and write LDIF files. Duplicates can be removed from a data flow.
To write components such as tAggregateRow or tSortRow, the 1.0 code generation model needed some improvements. Indeed, when you sort a list of lines, you first need to read all the lines before outputting the first sorted line. This behaviour was not possible in TOS 1.0. We've implemented a system of virtual components. A virtual component hides a set of sub-components working together. This new technical feature of the Perl code generation model opens up many possibilities for component writers.
For example, tSortRow is a virtual component hiding a tArray (filling a Perl array) and a tSortIn (sorting the array values and outputting the result). tSortIn starts its execution once tArray has finished filling the Perl array. The first and second screenshots represent the same job.
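To make the idea concrete, here is a small conceptual sketch of a virtual component wrapping two sub-components. It is written in Python with hypothetical class names and only mirrors the behaviour described above; it is not how the Perl code generation model is actually implemented.

```python
class ArrayFill:
    """Plays the role of tArray: stores every incoming row in memory."""

    def __init__(self):
        self.rows = []

    def consume(self, incoming):
        for row in incoming:
            self.rows.append(row)


class SortIn:
    """Plays the role of tSortIn: sorts the stored rows and outputs them."""

    def __init__(self, source, key):
        self.source = source
        self.key = key

    def rows(self):
        yield from sorted(self.source.rows, key=self.key)


class SortRow:
    """Virtual component: one visible component hiding two sub-components."""

    def __init__(self, key):
        self._fill = ArrayFill()
        self._sort = SortIn(self._fill, key)

    def process(self, incoming):
        # The whole input must be read before the first sorted row can
        # be output, so the fill step runs to completion first.
        self._fill.consume(incoming)
        return self._sort.rows()
```

From the outside, `SortRow(key=...).process(rows)` looks like a single component, while internally the sort step only starts once the fill step has seen the last row.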

Of course, there are many other new features in TOS 1.1; in this post I wanted to focus on the Perl part of TOS.
Wednesday 3 January 2007
By Pierrick,
Wednesday 3 January 2007 at 10:58 / category: Talend
I've written my first use case with Talend Open Studio: my purpose is to generate a whitelist of email addresses based on the emails already accepted in my inbox. Using Talend Open Studio saved me maybe 2 or 3 hours compared to developing a Perl script from scratch. The generated Perl script is nearly as fast as if I had written the script specifically for this task.
