On Tue, May 31, 2011 at 2:15 AM, Agnello George
<agnello.dso...@gmail.com> wrote:
> i am given ever day a list of files  which i wget  within a parent folder .
>
> my  team gives me    a structure where theses files are located
*snip*
> using the above data given to me  i check whether the files exist  and if
> they do i download them
*snip*
> i need to create a tar of this in the same given format

<wall>

I like the idea of computers doing what people otherwise have to
do manually (often tediously). :) That's sort of what made me
want to be a programmer. :) While you hear other people complain
about how long certain data tasks take, you can often think of
a simple program to cut hours or days down to seconds or
minutes.

In any case, I wrote a program to do as much as I could make
sense of from the OP's requirements. I'm not entirely sure I
understood correctly, but either way I made some assumptions
about the format of the file he is given. :) I assumed the first
line would always contain the source server and the destination
server in that order preceded by the words "download" and
"upload" respectively. I assumed the next line would be blank and
every line after that would be a file path, optionally
followed by more file names (each of which is joined with the
directory of the first file on the line). For example:

foo/bar baz

Would represent foo/bar and foo/baz.
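
As a minimal sketch of just that expansion rule, separate from the
full program further down: expand_line is a hypothetical helper name
used only for illustration, and it assumes the names on a line are
separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: expand one instruction line into full paths.
# Names after the first inherit the directory of the first name.
sub expand_line
{
    my ($line) = @_;

    my ($first, @rest) = split ' ', $line;

    # Blank lines expand to nothing.
    return unless defined $first;

    # Everything up to the last slash, if there is one.
    my ($directory) = $first =~ m{^(.*)/[^/]+$};

    # No directory part: leave the names exactly as given.
    return ($first, @rest) unless defined $directory;

    return ($first, map { "$directory/$_" } @rest);
}

print "$_\n" for expand_line("foo/bar baz");
```

Run as-is, that prints foo/bar and foo/baz on separate lines.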

Once I have that parsed I dump it to the user and prompt them to
continue (so that if the file is malformed or the parsing just
screws up it doesn't waste any time or do anything naughty). Once
the user says to continue, it uses LWP::Simple to fetch the
files, and Archive::Tar to put them into a tarball (preserving
the tree structure). I don't know what you do with the
tarball afterward, so that's as far as I got: by default the
example program dumps the Archive::Tar object, but if passed any
arguments, it writes the tarball to the path in the first one.

For example:

perl foo.pl blah.tar

Would write the tarball to blah.tar.

Currently the instructions come from DATA, but the function
that parses them accepts a file handle, so you can instead open a
file on disk and pass that handle (or modify the function to
accept a file name as a string).  I did it for fun so
now you get to have fun criticizing it. ;D It hasn't been tested
thoroughly so there are probably a few bugs and there are
definitely going to be some areas that need improvement before
you could call it robust. I used my own Web server as the source
for this example so that I could actually test the LWP::Simple
and Archive::Tar stuff.
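
Swapping DATA for a file on disk should only change how the handle
is opened. A rough sketch of that, where read_lines is a made-up
stand-in for parse_instructions and the temporary file exists only so
the example can run on its own:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for parse_instructions: anything that reads from a handle.
sub read_lines
{
    my ($fh) = @_;
    my @lines = <$fh>;
    chomp @lines;
    return \@lines;
}

# Create a throwaway instructions file so the sketch is runnable.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "news.html\njs/acc.dev.js acc.js\n";
close $out or die "Failed to close '$path': $!";

# Open a lexical handle on the file and pass it where \*DATA went.
open my $fh, '<', $path or die "Failed to open '$path': $!";
my $lines = read_lines($fh);
close $fh;

print scalar(@$lines), " lines read\n";
```

In the real program the same open call would take the path of the
instructions file your team gives you.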

@OP: if you need the tarball compressed then I noticed that
Archive::Tar::write accepts an extra argument to specify that
(COMPRESS_GZIP, for example), but I didn't bother for this
example. :)
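
If you do want it compressed, something like the following should
work. It is only a sketch: the member name and its content are
placeholders, and gzip support assumes IO::Zlib is installed.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Temp qw(tempdir);

my $tar = Archive::Tar->new;

# Placeholder member; the real program adds the fetched files instead.
$tar->add_data('js/acc.js', "// placeholder content\n");

my $dir = tempdir(CLEANUP => 1);
my $path = "$dir/blah.tar.gz";

# COMPRESS_GZIP is exported by Archive::Tar.
$tar->write($path, COMPRESS_GZIP) or die $tar->error;

# Read the archive back to confirm the tree structure survived.
my $check = Archive::Tar->new($path);
print "$_\n" for $check->list_files;
```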

Note that the first line of the instructions won't fit on a
single line in an E-mail so it will definitely wrap. :) It is up
to you to fix that before running the program (though the error
handling should catch it if you don't).


#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dumper;
use LWP::Simple;
use Archive::Tar;

main(@ARGV);

sub fetch_files
{
    my ($src_uri, $file_names) = @_;

    $src_uri =~ s{[\\/]$}{};

    my %files;

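    for my $file (@{$file_names})
    {
        my $content = get("$src_uri/$file");

        # get() returns undef on failure; skip those files so that
        # undef never reaches the tarball. Testing definedness also
        # keeps legitimately empty files.
        unless(defined $content)
        {
            warn "Failed to get '$file'";
            next;
        }

        $files{$file} = $content;
    }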

    return \%files;
}

sub main
{
    my @args = @_;
    my $instructions = parse_instructions(\*DATA);
    prompt_to_continue($instructions);
    my $files = fetch_files(@$instructions{qw(src_uri files)});
    my $tar = make_tarball($files);

    # I don't know how you plan to upload it to the server so
    # I'll just write it out to the file system instead. :)
    $tar->write($args[0]) if @args;
    print Dumper $tar unless @args;
}

sub make_tarball
{
    my ($files) = @_;

    my $tar = Archive::Tar->new;

    for my $file (keys %{$files})
    {
        $tar->add_data($file, $files->{$file});
    }

    return $tar;
}

sub parse_instructions
{
    my ($fh) = @_;

    my @lines = <$fh>;

    die "Malformed instructions file" unless @lines >= 2;

    # \S+ rather than [^ ]+ so that the trailing newline isn't
    # captured as part of the URI.
    my $uri_re = qr(http://\S+);

    my ($src_uri, $dest_uri) =
            $lines[0] =~
            m{download.*?($uri_re).*upload.*?($uri_re)};

    die "Failed to parse server URIs"
            unless defined $src_uri && defined $dest_uri;

    my $start_line = 2;

    unless($lines[1] =~ /^\s*$/)
    {
        warn "The second line isn't empty" ;
        $start_line--;
    }

    @lines = @lines[$start_line..$#lines];

    my @files;

    for my $line (@lines)
    {
        my @dirfiles = split ' ', $line;

        next unless @dirfiles;

        if(@dirfiles > 1)
        {
            my ($directory) = $dirfiles[0] =~ m{^(.*)/[^/]+$};

            if(defined $directory)
            {
                # Later names on the line share the directory of
                # the first name.
                $_ = "$directory/$_" for @dirfiles[1..$#dirfiles];
            }
            else
            {
                warn "Failed to parse directory";
            }
        }

        push @files, @dirfiles;
    }

    return {
        src_uri => $src_uri,
        dest_uri => $dest_uri,
        files => \@files
    };
}

sub prompt_to_continue
{
    my ($instructions) = @_;
    print Dumper $instructions;
    print "Continue? (yes/no) ";
    my $answer = <STDIN>;

    # Treat end-of-input the same as "no".
    exit(1) unless defined $answer && $answer =~ /^y(?:es)?$/i;
}

__DATA__
download the files form wget http://castopulence.org  and upload on
http://prodserver/weebsite1

news.html
projects.xhtml
downloads.xml
js/acc.dev.js  acc.js
schitzwin/index.html  schitzwin.pl



-- 
Brandon McCaig <http://www.bamccaig.com/> <bamcc...@gmail.com>
V zrna gur orfg jvgu jung V fnl. Vg qbrfa'g nyjnlf fbhaq gung jnl.
Castopulence Software <http://www.castopulence.org/> <bamcc...@castopulence.org>
