Thursday, February 19 2009
Talend Exchange, the new Ecosystem
By Pierrick Le Gall on Thursday, February 19 2009, 00:14 - Talend
I'm happy to announce that Talend Ecosystem has moved to Talend Exchange. It's a new name. "Ecosystem" was also used to describe the business partners and it was confusing for Talend customers. But the name is obviously not the only change. Visit Talend Exchange.
Friday, October 17 2008
When memory matters, multiple keys hash
By Pierrick Le Gall on Friday, October 17 2008, 00:41 - Perl

Once again I need to load a huge amount of data into memory with a Perl hash. See the previous post about this, When memory matters. This time the hash key is made of several values. The value corresponding to each key is an array of scalar data. I have 6 string values for the key and 10 floating-point values for the corresponding value.
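The data layout described above can be sketched in a few lines. This is a Python sketch rather than Perl, and it is only one possible layout, not necessarily the one the post ends up recommending: the 6 key fields are joined into a single string key, and the 10 floats are stored in a packed array. The separator character, the `make_key` and `load` helpers and the sample rows are all assumptions made for the illustration.

```python
from array import array

SEP = "\x1f"  # unit separator, assumed never to occur inside a key field


def make_key(fields):
    """Join several key fields into one string key."""
    return SEP.join(fields)


def load(rows, key_width=6):
    """Index rows by a composite key built from the first key_width fields."""
    table = {}
    for row in rows:
        key = make_key(row[:key_width])
        # array('d') stores the floats as packed C doubles, not as objects
        table[key] = array("d", (float(v) for v in row[key_width:]))
    return table


# Two made-up rows: 6 string fields followed by 10 numeric fields
rows = [
    ["fr", "35", "rennes", "a", "b", "c"] + [str(i * 0.5) for i in range(10)],
    ["fr", "29", "brest", "a", "b", "c"] + [str(i * 1.5) for i in range(10)],
]

table = load(rows)
values = table[make_key(["fr", "29", "brest", "a", "b", "c"])]
print(len(table), len(values), values[2])
```

A single joined string per entry avoids one nested container per key field, which is usually where the memory goes when the number of entries is large.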
Wednesday, September 24 2008
PhpWebGallery turns Piwigo
By Pierrick Le Gall on Wednesday, September 24 2008, 00:07 - Piwigo

PhpWebGallery becomes Piwigo. The name of the project is changing. This is not a fork, just a rename.
What are the advantages of "Piwigo" over "PhpWebGallery", or "why did we decide to change?":
- shorter: easier to remember
- unique: a search engine request on "Piwigo" will bring you to the Piwigo project pages
- no PHP in the name: I now find it odd to make the technology obvious in the application name
- keeps the PWG letters
Wednesday, July 16 2008
When memory matters
By Pierrick Le Gall on Wednesday, July 16 2008, 00:31 - Perl
I need to load a huge amount of data into memory with a Perl hash. The value corresponding to each key is an array of scalar data. The key is most of the time created from a single field of my array, but it can be made of several fields. The number of fields in the array may vary a lot, but most of the time it will be around 5 scalar values.
Wednesday, June 18 2008
Install ecosystem components directly from Talend Open Studio
By Pierrick Le Gall on Wednesday, June 18 2008, 00:21 - Talend
Another new feature in Talend Open Studio 2.4.0: the ability to install ecosystem components directly from your installed Talend Open Studio. No need to browse the ecosystem web application to find the component that fits your needs.

Friday, June 13 2008
Parallel executions on iterate links
By Pierrick Le Gall on Friday, June 13 2008, 00:49 - Talend

New feature in Talend Open Studio 2.4: the ability to run iterations in parallel. After a tLoop or a tFileList you can set the "number of parallel executions" on the iterate link. If you're running on a quad-core computer, it might be interesting to ask for 4 executions in parallel. 4 executions in parallel means that 4 "iterations" will be executed at the same time as long as some iterations remain. If your tFileList finds 1,000 files, you won't have 1,000 parallel executions, but 4, theoretically dividing the total execution time by 4.
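The scheduling described above, a fixed number of workers consuming a longer list of iterations, can be sketched as follows. This is a Python illustration of the idea, not the code Talend Open Studio generates; `run_iterations` and `fake_work` are hypothetical names, and the short sleep stands in for the real work of one iteration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_iterations(items, work, max_parallel=4):
    """Run work(item) for every item, at most max_parallel at a time."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def tracked(item):
        # Record how many iterations are in flight, to check the limit holds
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        try:
            return work(item)
        finally:
            with lock:
                state["running"] -= 1

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(tracked, items))
    return results, state["peak"]


def fake_work(name):
    """Stand-in for one iteration, e.g. processing one file."""
    time.sleep(0.001)
    return name.upper()


files = [f"file_{i}.csv" for i in range(1000)]
results, peak = run_iterations(files, fake_work, max_parallel=4)
print(len(results), "iterations done, at most", peak, "at a time")
```

The 1,000 iterations all run, but never more than 4 at once. The speed-up is only theoretical here too: it depends on the iterations being independent and on the machine having idle cores or the work being I/O bound.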
Thursday, June 5 2008
TalendForge.org planet is on its orbit
By Pierrick Le Gall on Thursday, June 5 2008, 00:47 - Talend

I am proud to introduce you to the TalendForge planet. A planet is an online aggregation of several blog feeds related to a common topic. Currently registered planet members are Stéphane Mallet (Java developer at Talend), Sebastiao Correia (Java developer at Talend), Olivier Carbone (Training and Support manager at Talend) and me (Community Manager, Perl developer at Talend).
The purpose of this planet is to keep you informed about Talend Open Studio and related technologies/products, from the inside. This is not the Talend corporate blog, but a place where people who make Talend Open Studio give information to Talend Open Studio users.
As far as I'm concerned, only my posts with the "Talend" tag will be visible in this planet. TalendForge.org planet was placed into orbit thanks to Planet Planet. This Python script refreshes the planet every hour.
The planet Earth picture comes from Wikimedia Commons.
Thursday, May 22 2008
External command piped to Talend data flow
By Pierrick Le Gall on Thursday, May 22 2008, 00:14

With Talend Open Studio 2.4 and a Perl project, the tPipeRow component appears in the palette. tPipeRow sends each input row to an external command, fetches the returned line and sends it to the next component. tPipeRow does not launch the external script as many times as there are input rows, but only once, when the data flow is initialized. This makes the whole thing very fast. For each STDIN line, the external script must produce one STDOUT line, without buffering.
From a technical point of view, it's very interesting to see 2 scripts running in parallel and communicating through file descriptors. The tPipeRow code is very short (but was not that short to write) because it simply uses IPC::Open2.
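The mechanism can be sketched as follows: one long-lived child process, one line written to its STDIN and one line read back from its STDOUT per row. tPipeRow itself relies on Perl's IPC::Open2; this sketch uses Python's subprocess module as a rough equivalent. The child command (a one-liner that upper-cases each line and flushes immediately) and the `pipe_rows` helper are made up for the example.

```python
import subprocess
import sys

# Hypothetical external command: upper-cases each STDIN line and
# flushes STDOUT after every line, as the component requires.
CHILD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]


def pipe_rows(rows, command):
    """Send each row to a single child process and yield its replies."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipes
    )
    try:
        for row in rows:
            proc.stdin.write(row + "\n")
            proc.stdin.flush()
            # One line in, one line out: blocks until the child answers
            yield proc.stdout.readline().rstrip("\n")
    finally:
        proc.stdin.close()  # the child sees end of input and exits
        proc.wait()
        proc.stdout.close()


output = list(pipe_rows(["pierrick", "larry"], CHILD))
print(output)
```

If the external command buffers its output, the `readline()` call waits forever, which is why the "one STDOUT line per STDIN line, without buffering" rule matters.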
Saturday, April 12 2008
Automated test results on talendforge.org
By Pierrick Le Gall on Saturday, April 12 2008, 00:15 - Talend

After many months of work, the Talend development team is proud to announce the public availability of our automated test results. You can browse them on talendforge.org. As the "about" view says, it should allow our development team to: detect regressions, ensure backward compatibility, follow up on bug fixing.
Monday, March 17 2008
multithreading for Perl jobs
By Pierrick Le Gall on Monday, March 17 2008, 00:50 - Talend

In Talend Open Studio, the multithreading option makes it possible to execute 2 subjobs in parallel. It was implemented in Java code generation last summer (see feature 1335) for TOS 2.1 and I implemented it for Perl code generation last Monday (see feature 3302). In Perl, the current multithreading option is not implemented with threads but with processes: I fork the parent process into child processes.
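The fork-based approach can be sketched as follows: the parent forks one child per subjob, each child runs its subjob and exits, and the parent waits for all of them. This is a Python sketch for Unix-like systems, not the Perl code that Talend Open Studio generates; `run_in_parallel` and the two sample subjobs are hypothetical.

```python
import os


def run_in_parallel(subjobs):
    """Fork one child process per subjob, then wait for all of them."""
    pids = []
    for subjob in subjobs:
        pid = os.fork()
        if pid == 0:
            # Child process: run the subjob, then exit right away so that
            # it never continues into the parent's code path.
            status = 0
            try:
                subjob()
            except BaseException:
                status = 1
            os._exit(status)
        pids.append(pid)

    # Parent process: collect the exit status of each child, in launch order
    statuses = []
    for pid in pids:
        _, raw_status = os.waitpid(pid, 0)
        statuses.append(os.WEXITSTATUS(raw_status))
    return statuses


def ok_subjob():
    pass


def broken_subjob():
    raise RuntimeError("boom")


statuses = run_in_parallel([ok_subjob, broken_subjob])
print(statuses)
```

Each child gets its own copy of the parent's memory, so subjobs cannot corrupt each other's variables, but they also cannot share results without going through files, pipes or exit statuses.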
Wednesday, February 20 2008
Extract fields from a positional file with Perl
By Pierrick Le Gall on Wednesday, February 20 2008, 23:13 - Perl

Here is a sample of a positional file:
Pierrick   LE GALL    026169
Erwann     LE GALL    002080
Larry      WALL       053174
The first name takes 11 characters, the last name 11 characters, the age 3 characters and the size 3 characters. We want to extract these fields into an array. I propose to use the unpack function.
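To make the field layout concrete, here is a sketch of the same extraction in Python, whose struct module offers an unpack function in the same spirit as Perl's. It is not the Perl code from the post. The widths (11, 11, 3, 3) come from the description above; the `parse` helper is hypothetical, and the sample lines are rebuilt with explicit padding so that the widths are exact.

```python
import struct

# 11 bytes first name, 11 bytes last name, 3 bytes age, 3 bytes size
LAYOUT = struct.Struct("11s11s3s3s")


def parse(line):
    """Split one fixed-width line into [firstname, lastname, age, size]."""
    raw = LAYOUT.unpack(line.encode("ascii"))
    firstname, lastname, age, size = (f.decode("ascii").rstrip() for f in raw)
    return [firstname, lastname, int(age), int(size)]


# Rebuild the sample lines with explicit padding
people = [
    ("Pierrick", "LE GALL", 26, 169),
    ("Erwann", "LE GALL", 2, 80),
    ("Larry", "WALL", 53, 174),
]
lines = ["{:<11}{:<11}{:03d}{:03d}".format(*p) for p in people]

records = [parse(line) for line in lines]
for record in records:
    print(record)
```

The format string plays the role of the template: each field is described once by its width, instead of being cut out with a series of substring calls.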
Friday, December 21 2007
Talend Open Studio 2.3.0M2 is out
By Pierrick Le Gall on Friday, December 21 2007, 00:08 - Talend
Talend Open Studio 2.3.0M2 is out. Let me list what's new in Perl code generation compared to the current main release, 2.2.3. As you will see, Perl code generation is still in progress :-) 13 new components, 8 new features in existing components. In this blog post I only list news about Perl code generation; there are of course more new features, fully listed on the official ChangeLog page under releases 2.3.0M1 and 2.3.0M2.
Thursday, November 29 2007
MySQL bulk update with Talend Open Studio
By Pierrick Le Gall on Thursday, November 29 2007, 16:44 - Talend
3 years ago, I introduced in PhpWebGallery a very fast way to update several lines of the same table, at once. See PhpWebGallery Subversion revision 625 for details. I don't remember how this idea came to me, but I've implemented it as a component in Talend Open Studio. The purpose is to improve speed on mass updates.
The standard way to update several lines of a table, with different values for each line of course, is to perform a query for each line to update. In a web application it is a really bad thing not to know in advance the number of queries for each page. In any other situation, it's not good because it's very slow.
Wednesday, November 28 2007
MySQL extended insert mode in Talend Open Studio
By Pierrick Le Gall on Wednesday, November 28 2007, 11:33 - Talend
In feature 2378, I've implemented the MySQL-specific extended insert mode. Extended insert means that instead of inserting lines one by one, you insert many lines in the same insert query. Don't confuse this with a transaction mechanism; it is not one. The advantage is speed.
To illustrate the performance improvement we'll have in Talend Open Studio 2.3.0M2 using extended inserts, I've created a benchmark: we read lines from a delimited file and we insert them in a table. 3 simple fields per line (numeric id, firstname, lastname). 1 million lines to insert.
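Here is a sketch of what an extended insert looks like: several rows packed into the VALUES clause of a single INSERT query. It is written in Python and runs against an in-memory SQLite database, used only as a stand-in because it ships with Python and accepts the same multi-row VALUES syntax. The `people` table, the `extended_insert` helper and the batch size of 100 are assumptions, not what the Talend component does.

```python
import sqlite3


def extended_insert(conn, table, columns, rows, batch_size=100):
    """Insert rows with one multi-row INSERT query per batch.

    Returns the number of queries sent.
    """
    col_list = ", ".join(columns)
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    queries = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # INSERT INTO t (a, b, c) VALUES (?, ?, ?), (?, ?, ?), ...
        sql = "INSERT INTO {} ({}) VALUES {}".format(
            table, col_list, ", ".join([one_row] * len(batch))
        )
        params = [value for row in batch for value in row]
        conn.execute(sql, params)
        queries += 1
    return queries


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, firstname TEXT, lastname TEXT)")

rows = [(i, "first{}".format(i), "last{}".format(i)) for i in range(1000)]
queries = extended_insert(conn, "people", ["id", "firstname", "lastname"], rows)
count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(queries, "queries for", count, "rows")
```

With 100 rows per query, 1,000 rows need 10 queries instead of 1,000, which is where the speed comes from: far fewer round trips and far less per-query overhead.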
Friday, November 23 2007
New whitelist generator with TOS 2.3.0M1
By Pierrick Le Gall on Friday, November 23 2007, 17:51 - Talend
I've updated the first Talend Open Studio "use case" I wrote nearly one year ago with release 1.1.0RC1. This time I use new features from Talend Open Studio 2.2.x: tUnite and tNormalize avoid the temporary file, and the "include sub directories" option in tFileList makes the job smarter.

Monday, October 22 2007
Debian Linux as a Microsoft SQL Server client
By Pierrick Le Gall on Monday, October 22 2007, 22:09 - GNU/linux

We're using Debian Etch (with GNU/Linux) as a server at the Talend office. We need to reach a remote Microsoft SQL Server database. The first step is to perform a select query from the command line.
We need to install FreeTDS: "FreeTDS is a set of libraries for Unix and Linux that allows your programs to natively talk to Microsoft SQL Server and Sybase databases."
We have to define an "interface" for the Microsoft SQL Server in the FreeTDS "interfaces" file. Finally, we use sqsh, a command line client for Sybase and Microsoft SQL Server.
Friday, September 7 2007
SSH, key authentication and batch mode
By Pierrick Le Gall on Friday, September 7 2007, 16:12 - GNU/linux

A long time ago, I tried to connect to an SSH server with my private key in batch mode (from a cron task). I didn't find a way to do it. Now I have. It is as simple as having no passphrase on your private key. This is less secure (but still much more secure than an FTP connection), and it makes SSH possible in a cron task.
Tuesday, August 21 2007
Talend 2.2.0M1 and Perl code performances
By Pierrick Le Gall on Tuesday, August 21 2007, 18:06 - Talend

Richard and I both spent 2 weeks on a major improvement proposed by Richard.
.----------------------------------------------------.
| job        | TOS 2.1.1 | TOS 2.2.0M1 | improvement |
+------------+-----------+-------------+-------------+
| Scenario 2 | 20.8 s    | 16.9 s      | 18.8 %      |
| Scenario 3 | 81.2 s    | 30.4 s      | 62.6 %      |
'------------+-----------+-------------+-------------'
- Scenario 3 details in Talendforge wiki
- Scenario 2 details in Talendforge wiki
Friday, July 6 2007
MySQL joins
By Pierrick Le Gall on Friday, July 6 2007, 12:08 - Développement

As another reminder for myself, here is a list of join examples with MySQL (to compare with the Oracle behaviour described in the previous blog post).
Oracle joins
By Pierrick Le Gall on Friday, July 6 2007, 11:56 - Développement

As a reminder for myself, here is a list of join examples using Oracle.
