Each component works on a Perl array. This array is the translation of a database row or a file line into what I call a Talend row: @row. In Talend 2.1.x and earlier, @row was copied from the input connection to the current component, which transformed it and then copied it to the output connection. That makes two copies for each component.

[Figure: simple job example]

In this job example, somewhere in the code, we had:

while (my @tMysqlInput_1 = $sth->fetchrow_array()) {
    # ...
    my @row3 = @tMysqlInput_1;
    my @tFilterRow_1 = @row3;
    # ...
    my @row4 = @tFilterRow_1;
    my @tLogRow_1 = @row4;
    # ...
}

The new way of doing things is:

while (my $tMysqlInput_1 = $sth->fetchrow_arrayref()) {
    # ...
    my $row3 = $tMysqlInput_1;
    my $tFilterRow_1 = $row3;
    # ...
    my $row4 = $tFilterRow_1;
    my $tLogRow_1 = $row4;
    # ...
}

Here $tMysqlInput_1 is an array reference, so each assignment copies only a memory address and never the row data itself.
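To make the difference concrete, here is a minimal standalone sketch that does not come from the generated Talend code. The row contents and the variable names (@input, @row_copy, $input_ref, $row_ref) are made up for illustration. It shows that assigning an array duplicates every element, while assigning a reference leaves a single array that both variables point to.

```perl
use strict;
use warnings;

# Hypothetical row standing in for a database row or a file line.
my @input = (1, 'alice', 'paris');

# Old style: list assignment copies every element into a new array.
my @row_copy = @input;
$row_copy[1] = 'bob';      # changes the copy only, @input is untouched

# New style: scalar assignment copies only the reference (an address).
my $input_ref = \@input;
my $row_ref   = $input_ref;
$row_ref->[2] = 'lyon';    # changes the one shared array

print "input:      @input\n";
print "row_copy:   @row_copy\n";
print "same array: ", ($row_ref == $input_ref ? "yes" : "no"), "\n";
```

The flip side of sharing is that a change made through any of the references is visible through all of them, so a component that needs a private version of the row has to copy it explicitly.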

In a job that has very few components (not many copies) and a small schema (@row is very small), the extra copies are not a real problem. In a complex job with a huge schema, however, the improvement becomes significant.
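One rough way to see how the cost scales is to time both styles with Perl's core Benchmark module. The sketch below is an assumption-laden toy, not a measurement of Talend itself: the schema width ($columns), the number of components ($components) and the two subroutines are hypothetical, and the copy loop only mimics the two copies per component described above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sizes, chosen only for illustration.
my $columns    = 200;    # width of the schema
my $components = 10;     # number of components the row passes through

my @row = map { "value_$_" } 1 .. $columns;

# Older style: two full array copies for each component.
sub by_copy {
    my @current = @row;
    for (1 .. $components) {
        my @connection = @current;       # component to connection
        @current       = @connection;    # connection to next component
    }
    return scalar @current;
}

# Newer style: only the array reference is passed along.
sub by_reference {
    my $current = \@row;
    for (1 .. $components) {
        my $connection = $current;
        $current       = $connection;
    }
    return scalar @$current;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { copy => \&by_copy, reference => \&by_reference });
```

Changing $columns and $components should show the gap growing with both, which is the effect described above. The actual numbers depend on the machine and the Perl version.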