Sorting


Hello,

I've had a search through CPAN, and have not been able to find an
answer yet, but I would like to know if there is something like
File::Sort which will allow me to specify that there is one or more
header records at the start of the input which should be untouched by
the sort.  Does anyone know of such a module (or an easy way to do
this using File::Sort!)

Thx,
k

On 7 Jun, 09:44, k@bytebrothers.co.uk wrote:

> File::Sort which will allow me to specify that there is one or more
> header records at the start of the input which should be untouched by
> the sort.

OK, no responses, so I had time to find more research material, which
led me to this solution.  Any advice on ways to tighten this up a tad
without losing too much readability?

The data look like this (delimiters line up vertically):
==================================
  Licence | Created| Crtd By | Products |  Qty | To Loc |  Last  | DZone
     01799|05/06/07|     OOS1|  NIV0327R|   960|  YH3621|        | BACK
         1|07/06/07|    SPODE|  STT0014V|   156|   SFF15|        | S
     10106|06/06/07|    DALEC|  VAN1383T|     0|   JLE12|        | GDSIN1
      1015|29/05/07|  OOSOFFC|  CIF0012T|   192|  XP4417|        | BACK
      1022|31/05/07|    WOODC|  DET0065Y|   141|  XE4313|        | BACK
     10222|04/06/07|  COLEROB|  FLU0473P|  1640|   UAB12|  SMITHN| None
     10319|07/06/07| HALLPHIL|  SCH3318Q|   240|   MDL22|        | GDSIN1
     10350|07/06/07|   QUINNJ|  DOS0030K|  4072|   CRH52|        | GDSIN1
==================================

So, to preserve the header and sort by the 'Products' column:

==================================
#!/usr/local/bin/perl -w
@lines = ();
@key   = ();
while (<>)
{
        $row++;
        if ($row == 1)
        {
                print;
                next;
        }
        chomp;
        push @lines,$_;
        push @key, (split(/\|/))[3];

}

@indices = sort {$key[$a] cmp $key[$b]} 0..$#lines;
foreach $index (@indices)
{
        print "$lines[$index]\n";
}

==================================

k@bytebrothers.co.uk wrote:
> I would like to know if there is something like File::Sort which will
> allow me to specify that there is one or more header records at the
> start of the input which should be untouched by the sort.

my ( @headers, @records );
while ( <DATA> ) {
     push @headers, $_;
     push @records, <DATA> if /^===/;

}

print @headers, sort @records;

__DATA__
First header
Another header
============================
Record B
Record C
Record A

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

On Jun 7, 11:27 am, k@bytebrothers.co.uk wrote:

use strict;

> @lines = ();
> @key   = ();

No need to initialize an array to the empty list.  That's what it is
already.

> while (<>)
> {
>         $row++;

This variable already exists for you.  Its name is '$.'.  No need to
keep track of the line count separately.
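
A minimal sketch of the '$.' idea, generalised to the "one or more header records" from the original question.  The split_header name and its $header_count parameter are made up for illustration; they are not from the thread or from any module.

```perl
use strict;
use warnings;

# Split the lines read from a filehandle into header lines and body lines.
# $. holds the current input line number of the last filehandle read,
# so no hand-maintained counter is needed.
# split_header and $header_count are hypothetical names for this sketch.
sub split_header {
    my ($fh, $header_count) = @_;
    my (@headers, @body);
    while (my $line = <$fh>) {
        if ($. <= $header_count) { push @headers, $line }
        else                     { push @body,    $line }
    }
    return (\@headers, \@body);
}

# Demo using an in-memory filehandle so the sketch is self-contained.
my $text = "Header\nb\na\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my ($headers, $body) = split_header($in, 1);
close $in;

# Headers go out untouched, the rest is sorted as plain strings.
print @$headers, sort @$body;
```

The body lines could just as well be fed to the keyed sort discussed in the rest of the thread instead of a plain sort.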

>         if ($row == 1)
>         {
>                 print;
>                 next;
>         }
>         chomp;
>         push @lines,$_;
>         push @key, (split(/\|/))[3];

> }

> @indices = sort {$key[$a] cmp $key[$b]} 0..$#lines;
> foreach $index (@indices)
> {
>         print "$lines[$index]\n";}

rather than messing with a bunch of indices, I would prefer a
Schwartzian transform.  The syntax has a bit of a learning curve, but
once you "get it", it becomes intuitive.

So my rewrite of your script comes down to:
#!/opt2/perl/bin/perl
use strict;
use warnings;

my @lines;
while (<DATA>) {
   print and next if $. == 1;
   push @lines, $_;

}

print  map { $_->[0] }
      sort { $a->[1] cmp $b->[1] }
       map { [ $_, (split /\|/)[3] ] }
      @lines;
__DATA__
Licence | Created| Crtd By | Products |  Qty | To Loc |  Last  | DZone
   01799|05/06/07|     OOS1|  NIV0327R|   960|  YH3621|        | BACK
       1|07/06/07|    SPODE|  STT0014V|   156|   SFF15|        | S
   10106|06/06/07|    DALEC|  VAN1383T|     0|   JLE12|        | GDSIN1
    1015|29/05/07|  OOSOFFC|  CIF0012T|   192|  XP4417|        | BACK
    1022|31/05/07|    WOODC|  DET0065Y|   141|  XE4313|        | BACK
   10222|04/06/07|  COLEROB|  FLU0473P|  1640|   UAB12|  SMITHN| None
   10319|07/06/07| HALLPHIL|  SCH3318Q|   240|   MDL22|        | GDSIN1
   10350|07/06/07|   QUINNJ|  DOS0030K|  4072|   CRH52|        | GDSIN1

Paul Lalli

On 7 Jun, 16:46, Paul Lalli <mri@gmail.com> wrote:

> On Jun 7, 11:27 am, k@bytebrothers.co.uk wrote:
> > Any advice on ways to tighten this up a tad without losing too much readability?

> rather than messing with a bunch of indices, I would prefer a
> Schwartzian transform.  The syntax has a bit of a learning curve, but
> once you "get it", it becomes intuitive.

> print  map { $_->[0] }
>       sort { $a->[1] cmp $b->[1] }
>        map { [ $_, (split /\|/)[3] ] }
>       @lines;

Oh, that's sweet!  All I need to do now is sit down and work out
exactly how the feck that works!

On 8 Jun, 09:28, k@bytebrothers.co.uk wrote:

I've been working through this, and I think I'm getting there, slowly;
there's something going on here with anonymous list references, for a
start.  But how would I use this paradigm if there was a more
complicated key?  For example, in my original example, if I needed to
sort by the second column, which contains a date, I would have done
something like:

     @fields = split(/\|/);
     ($dy,$mn,$yr) = split(/\//,$fields[1]);
     push @key, "$yr$mn$dy";
    etc...

How would this transform approach allow me to do something similar?

On Jun 8, 5:59 am, k@bytebrothers.co.uk wrote:

Well, obviously, it's going to be a little messier, but the concept is
the same:

print  map { $_->[0] }
      sort { $a->[1] cmp $b->[1] }
       map { [
               $_,
               do {
                  my ($d,$m,$y) = split '/', (split /\|/)[1];
                  "$y$m$d";
               }
             ]
           }
      @lines;
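
For anyone who wants to run the date version on its own, here is a small self-contained sketch.  The date_key helper name is made up for illustration, and the three rows are copied from the sample data earlier in the thread.  Plain string comparison with cmp works on the "yymmdd" key because every part is a zero-padded two-digit number; with two-digit years that only holds while all dates fall in the same century.

```perl
use strict;
use warnings;

# Build a sortable "yymmdd" key from the dd/mm/yy date held in the second
# pipe-delimited field.  date_key is a hypothetical helper name.
sub date_key {
    my ($line) = @_;
    my $date = (split /\|/, $line)[1];
    my ($d, $m, $y) = split m{/}, $date;
    return "$y$m$d";
}

# Three rows from the sample data, deliberately out of date order.
my @lines = (
    "     10350|07/06/07|   QUINNJ|  DOS0030K|  4072|   CRH52|        | GDSIN1\n",
    "      1015|29/05/07|  OOSOFFC|  CIF0012T|   192|  XP4417|        | BACK\n",
    "     10222|04/06/07|  COLEROB|  FLU0473P|  1640|   UAB12|  SMITHN| None\n",
);

# Same decorate / sort / undecorate shape as above, with the key in a sub.
my @by_date = map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] }
              map  { [ $_, date_key($_) ] }
              @lines;

print @by_date;
```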

When trying to decipher a Schwartzian transform, read it backwards.
1) We start with the array of @lines.
2) The bottom map transforms the array of lines into a list of array
references.  The first element of the array reference is the line
itself, and the second is the value we want to sort by eventually.  In
this case, that's the "year-month-day" value.
3) The sort now takes this list of array references, and sorts it by
the second element of each referenced array.  That is, it sorts the
array references on our sort key.
4) The top map takes this sorted list of array references and
transforms it to a new list containing the first element of each
referenced array - that is, the original line.
5) print is passed this list of lines.

It might be helpful if you break it out into its individual steps.
In this case, I'll use a generic get_key() to represent obtaining the
sort key from your line.  That's the only part of a Schwartzian
transform that ever changes.  The syntax is always the same for the
rest of it.

my @lines_keys = map { [ $_, get_key($_) ] } @lines;
my @sorted_lines_keys = sort { $a->[1] cmp $b->[1] } @lines_keys;
my @sorted_lines = map { $_->[0] } @sorted_lines_keys;
print @sorted_lines;
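
The four statements above need a get_key() and some @lines before they will run.  A sketch that fills both in, assuming the key is the 'Products' column (field index 3) and using three rows from the sample data; the st_sort wrapper is a made-up name, added only so the steps can be called as a unit.

```perl
use strict;
use warnings;

# Hypothetical key extractor: the fourth pipe-delimited field ('Products').
sub get_key {
    my ($line) = @_;
    return (split /\|/, $line)[3];
}

# The three steps of the transform, one statement each.
# st_sort is an illustrative wrapper, not part of any module.
sub st_sort {
    my @lines = @_;
    # 1) decorate: pair each line with its sort key
    my @lines_keys = map { [ $_, get_key($_) ] } @lines;
    # 2) sort the pairs on the key (second element)
    my @sorted_lines_keys = sort { $a->[1] cmp $b->[1] } @lines_keys;
    # 3) undecorate: keep only the original line (first element)
    return map { $_->[0] } @sorted_lines_keys;
}

my @lines = (
    "         1|07/06/07|    SPODE|  STT0014V|   156|   SFF15|        | S\n",
    "     10106|06/06/07|    DALEC|  VAN1383T|     0|   JLE12|        | GDSIN1\n",
    "      1015|29/05/07|  OOSOFFC|  CIF0012T|   192|  XP4417|        | BACK\n",
);

print st_sort(@lines);
```

Note that the key keeps the field's leading spaces.  That is harmless here because the 'Products' values all have the same width, but a key with varying padding would need trimming first.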

Hope that helps,
Paul Lalli

On 8 Jun, 11:32, Paul Lalli <mri@gmail.com> wrote:

> When trying to decipher a Schwartzian transform, read it backwards.
> 1) We start with the array of @lines.
> 2) The bottom map transform the array of lines into a list of array
> references.  The first element of the array reference is the line
> itself, and the second is the value we want to sort by eventually.  In
> this case, that's the "year-month-day" value.
> 3) The sort now takes this list of array references, and sorts it by
> the second element of each referenced array.  That is, it sorts the
> array references on our sort key.
> 4) The top map takes this sorted list of array references and
> transforms it to a new list containing the first element of each
> referenced array - that is, the original line.
> 5) print is passed this list of lines.

I think I just had a religious experience.  That is new and wonderful,
and thank you for explaining it for me!

On Jun 8, 6:46 am, k@bytebrothers.co.uk wrote:

> On 8 Jun, 11:32, Paul Lalli <mri@gmail.com> wrote:
> > [description of Schwartzian Transform]
> I think I just had a religious experience.  That is new and
> wonderful, and thank you for explaining it for me!

You're welcome.  Glad to help.

I would be remiss, however, if I didn't point out that Uri has created
a module which generalizes the creation of a Schwartzian Transform
sort algorithm (amongst other things).  It is available on the CPAN,
named Sort::Maker.  Using that module, the process becomes:

use Sort::Maker;
my $sorter = make_sorter('ST', string => \&get_key);
print $sorter->(@lines);

#get_key simply extracts the key from your data
#so in the second example, it would be:
sub get_key {
  my $date = (split /\|/, $_)[1];
  my ($d, $m, $y) = split '/', $date;
  "$y$m$d";

}

#in the original, it would be as simple as:
sub get_key {
  (split /\|/)[3];

}

Paul Lalli

Paul Lalli <mri@gmail.com> writes:
> When trying to decipher a Schwartzian transform, read it backwards.

That was the most difficult part to wrap my brain around. It's the reason
that, upon encountering an ST, I *still* have to stop and think about it
for a moment to parse it.

sherm--

--
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net

>>>>> "k" == keith  <k@bytebrothers.co.uk> writes:

  k> I think I just had a religious experience.  That is new and wonderful,
  k> and thank you for explaining it for me!

if you want a module to do all that (and more) for you, check out
Sort::Maker.

uri

--
Uri Guttman  ------  u@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
