Camilo Gonzalez wrote:
> Bob Showalter wrote:
> 
> > [redirected to [EMAIL PROTECTED]]
> > 
> > Camilo Gonzalez wrote:
> > 
> > 
> > > I need to read from a temp file that is about 10 megs in
> > > 1.5 meg increments and write the results to a database. I can't slurp
> > > up the whole temp file because I'm only allowed 2 megs of memory.
> > > I was hoping to read in only 1.5 megs per pass of the file, but
> > > the read seems to be slurping up the whole thing. Do I need
> > > sysread? 
> > > 
> > > Anyway, here's my attempt:
> > > 
> > > 
> > > open (TEMP, "dataTemp.txt") or die "Can't open dataTemp for reading:$!";
> > > my $offset = 0;
> > > while (read TEMP, my $tempbuf, 1500000, $offset)
> > > 
> > > 
> > 
> > Because you increment $offset by 1500000 at the end of the loop,
> > you're extending $tempbuf with each pass through the loop.
> > 
> Okay, let me make sure I understand read correctly. The arguments it
> takes are filehandle, scalar, length, and offset. I always thought
> offset is where it will begin reading the filehandle in bytes

Nope. See perldoc -f read. It's the offset into your variable.

> , i.e.
> if I specify an offset of 1500000, it will begin reading TEMP at that
> point and continue for length or another 1500000. Am I wrong? If so,
> how do I read in a specified length?

The third argument is the number of bytes you want to read. Each read
continues reading from where the last one left off, unless you use seek() to
move the file pointer.
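
A quick illustration of what the offset argument does (untested, and the
file name is just a stand-in):

   open my $fh, '<', 'dataTemp.txt' or die "open: $!";
   read $fh, my $buf, 10;               # $buf holds bytes 0-9 of the file
   read $fh, $buf, 10;                  # $buf replaced with bytes 10-19
   read $fh, $buf, 10, length $buf;     # bytes 20-29 appended to $buf
   seek $fh, 0, 0;                      # seek() is what moves the file pointer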

> Remember, I only have 2 megs of
> memory to work with. How can I flush out everything so I get a fresh
> 2 megs with every pass?

Don't use $offset. But remember what I said about reading a partial record
at the 1.5 Mb boundary.
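
Something along these lines should do it (untested sketch; it assumes
records are separated by \r, \n, or \f, as in your split):

   my $size     = 1_500_000;
   my $leftover = '';
   open my $fh, '<', 'dataTemp.txt' or die "Can't open dataTemp: $!";
   while (read $fh, my $chunk, $size) {
       $chunk = $leftover . $chunk;   # glue on the partial record from the last pass
       my @records = split /[\r\n\f]+/, $chunk;
       # if the chunk didn't end on a separator, the last piece is incomplete
       $leftover = ($chunk =~ /[\r\n\f]\z/) ? '' : pop @records;
       for my $record (@records) {
           # split /\t/ and do your insert here, as in your inner loop
       }
   }
   if (length $leftover) {
       # same split-and-insert for the final record
   }
   close $fh;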

> 
> > 
> > A hint: When you write big numbers in perl, you can use underscores
> > to make them more readable: 
> > 
> >   $size = 1_500_000;      # less likely to add or drop a zero!
> > 
> > If you don't use $offset, you'll fix that problem. However, you'll
> > introduce another, because the logical record that spans the 1.5 Mb
> > boundary won't be processed correctly. You'll have to add logic to
> > handle that. 
> > 
> > Is the input data not delimited by a known string that you could
> > set $/ to? Are input rows variously delimited by \r, \n, and \f?
> > 
> I'm working on a Mac with OS X and it tends to mangle and randomly(?)
> add line feeds.

Seems odd. I dunno.

> 
> > 
> > 
> > 
> > >       {
> > > my @temp = split/[\r\n\f]/, $tempbuf;
> > > 
> > > foreach my $temp(@temp)
> > >           {
> > >                   my ($label,
> > >                   $genre,
> > >                   $catalogue_num,
> > >                   $upc,
> > >                   $artist,
> > >                   $title,
> > >                   $description,
> > >                   $format,
> > >                   $shop_price,
> > >                   $dist_price,
> > >                   $dist_1,
> > >                   $dist_2,
> > >                   $vendor,
> > >                   $cost,
> > >                   $release_date,
> > >                   $special_buy,
> > >                   $items_buy_unit,
> > >                   $items_sell_unit,
> > >                   $quant_break,
> > >                   $asset_acct,
> > >                   $buy,
> > >                   $buy_unit_measure,
> > >                   $expense_cos_account,
> > >                   $inactive_item,
> > >                   $income_acct,
> > >                   $inventory,
> > >                   $item_picture,
> > >                   $minimum_level,
> > >                   $selling_base_price,
> > >                   $price_level_b,
> > >                   $price_level_c,
> > >                   $price_level_e,
> > >                   $price_level_f,
> > >                   $sell,
> > >                   $sell_unit_measure,
> > >                   $quantity_break) = split/\t/, $temp;
> > > 
> > > next if $genre eq "";
> > > 
> > > my $inserts = "INSERT INTO bigdata VALUES (
> > >                   '$label',
> > >                   '$genre',
> > >                   '$catalogue_num',
> > >                   '$upc',
> > >                   '$artist',
> > >                   '$title',
> > >                   '$description',
> > >                   '$format',
> > >                   '$shop_price',
> > >                   '$dist_price',
> > >                   '$dist_1',
> > >                   '$dist_2',
> > >                   '$vendor',
> > >                   '$cost',
> > >                   '$release_date',
> > >                   '$special_buy',
> > >                   '$items_buy_unit',
> > >                   '$items_sell_unit',
> > >                   '$quant_break',
> > >                   '$asset_acct',
> > >                   '$buy',
> > >                   '$buy_unit_measure',
> > >                   '$expense_cos_account',
> > >                   '$inactive_item',
> > >                   '$income_acct',
> > >                   '$inventory',
> > >                   '$item_picture',
> > >                   '$minimum_level',
> > >                   '$selling_base_price',
> > >                   '$price_level_b',
> > >                   '$price_level_c',
> > >                   '$price_level_e',
> > >                   '$price_level_f',
> > >                   '$sell',
> > >                   '$sell_unit_measure',
> > >                   '$quantity_break')";
> > > 
> > > my $sth = $dbh->do($inserts)
> > >     or warn "Can't load <TEMP> into bigdata: $!:$@"; 
> > > 
> > > 
> > 
> > If the database driver you're using allows placeholders, you can
> > simplify this greatly. Also, using a prepared statement (if
> > supported) can speed this up as well. 
> > 
> > 
> Placeholders are allowed but prepared statements are not.

You can do this and avoid all the intermediate variables:

   $dbh->do('insert into bigdata values (' . 
      join(',', map "'$_'", split /\t/, $temp) . ')');
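
Since your driver does allow placeholders, a variation like this (untested,
and assuming every record has the full set of tab-separated fields) avoids
the quoting trouble you'd hit when a value contains an apostrophe:

   my @fields = split /\t/, $temp;
   my $sql = 'insert into bigdata values ('
           . join(',', ('?') x @fields) . ')';
   $dbh->do($sql, undef, @fields) or warn "insert failed: ", $dbh->errstr;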

> 
> > 
> > 
> > > 
> > > 
> > >        }
> > >        $offset += 1500000;
> > >        }
> > > close (TEMP);

