FYI, in case anyone else runs into this: the method of clearing out
the local database and then pushing is working fine, and seems to be
a good way to get everything onto Heroku piecemeal.
However, and this is really weird, if you have multi-column indexes on
the tables you're pushing, something about the push process seems to
change their column order. This happened to me a couple of times
before I realized what was going on. I'm fixing it just by dropping
the indexes and re-adding them.
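
For anyone who wants to check for this before rebuilding anything,
here is a minimal sketch of how the comparison could be done. The
helper name and the sample index names are made up for illustration;
it only compares plain hashes of index name => ordered column list,
so it assumes you have already collected those from your local and
Heroku databases.

```ruby
# Hypothetical helper: report indexes whose column order differs from what
# the local schema expects. Both arguments map index name => ordered columns.
def reordered_indexes(expected, actual)
  expected.select do |name, columns|
    found = actual[name]
    # Same set of columns, but listed in a different order.
    found && found != columns && found.sort == columns.sort
  end.keys
end

# Example data (invented for illustration only).
expected = {
  'index_events_on_user_id_and_created_at' => %w[user_id created_at],
  'index_events_on_kind'                   => %w[kind]
}
actual = {
  'index_events_on_user_id_and_created_at' => %w[created_at user_id],
  'index_events_on_kind'                   => %w[kind]
}

puts reordered_indexes(expected, actual).inspect
```

In a Rails console the hashes could probably be built from
ActiveRecord::Base.connection.indexes(:your_table), which returns
objects with a name and a columns list, but I haven't tried that
against a Heroku database.
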
There is a bit of a gotcha with adding indexes, though: Heroku only
allows a database operation to run for 6 seconds by default. This
might not be long enough to build a large index, so in your migration
you need to include the following line to extend the limit (the value
is in milliseconds, so this allows 60 seconds). Keep it on a single
line so the string literal isn't broken:

ActiveRecord::Base.connection.select_all('set statement_timeout to 60000')

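
To make the drop-and-re-add step concrete, here is a rough sketch of
the statements involved. The helper name, table, and columns are
hypothetical, and it assumes the index uses Rails' default naming
(index_<table>_on_<col>_and_<col>) and that the names are trusted,
since nothing is quoted. It only builds SQL strings and does not talk
to a database.

```ruby
# Hypothetical helper: build the statements needed to rebuild one index with
# its columns in the intended order. Nothing here connects to a database.
def rebuild_index_sql(table, columns, timeout_ms = 60_000)
  # Assumes Rails' default index naming convention.
  name = "index_#{table}_on_#{columns.join('_and_')}"
  [
    # Raise the per-statement limit first (value is in milliseconds).
    "SET statement_timeout TO #{Integer(timeout_ms)}",
    "DROP INDEX IF EXISTS #{name}",
    "CREATE INDEX #{name} ON #{table} (#{columns.join(', ')})"
  ]
end

# Example with made-up table and column names.
rebuild_index_sql('events', %w[user_id created_at]).each { |sql| puts sql }
```

Inside a migration, each string could be passed to
ActiveRecord::Base.connection.execute in turn. Using remove_index and
add_index after the timeout line should amount to the same thing.
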
On Mar 19, 2:12 pm, Mike <[email protected]> wrote:
> Just an update on ways to get a large dataset up to Heroku.
>
> I saw that they're working on an update to Taps to make it work better
> with large datasets. In the meantime, however, here's a hack that
> essentially lets you upload only individual tables (or parts of
> tables) with Taps.
>
> Taps does not clear out its target database, on either a push or pull,
> but instead just tries to insert everything it receives. If there are
> duplicates with uniqueness constraints such as primary keys, this will
> cause it to error out; otherwise, presumably, it'll add duplicate
> entries.
>
> This can be used to our advantage in pushing a large database up to
> Heroku. If you do a push with nothing in your local database other
> than your new entries, Taps will happily send those up, seamlessly
> merging them into the Heroku database. I still haven't had the chance
> to upload my large datasets that were the original topic of this
> thread, but I have confirmed that the following procedure works on a
> small test set:
> First, get a full copy of the current database. This is only necessary
> if the tables you are adding to have primary keys or other unique
> fields. By having them in the local database, you prevent newly added
> entries from conflicting.
> Then, add whatever you need to the tables, and remove all entries that
> are already on the application. Full tables can be removed; Taps seems
> to function fine without a full table set being pushed.
> Finally, a db:push will now send only your new information. You can
> break up your new information into chunks and send them up in batches
> to avoid a single overly large push.
>
> While this is going on, it's important to prevent the live app from
> doing anything that might add entries into the table being pushed
> into, or you might get primary key conflicts.
>
> On Mar 15, 6:26 pm, Terence Lee <[email protected]> wrote:
>
> > What I've been told for tmp folder size is about 1gb but it's a soft
> > limit.
>
> > -Terence
>
> > On Mon, 2010-03-15 at 08:52 -0700, Mike wrote:
> > > Also in that link you linked to:
> > > Slug Size: 500MB - Hard
>
> > > Man, that probably includes the temp directory.
>
> > > Getting data onto Heroku is pretty friggin hard....
>
> > > On Mar 15, 9:27 am, Daniele <[email protected]> wrote:
> > > > On 15 Mar, 01:13, Mike <[email protected]> wrote:
>
> > > > > That's a really good idea on having a controller take the upload into
> > > > > temp. Is there a size limit on the temp directory?
>
> > > > I don't know, sorry.
>
> > > > > Or a time limit on
> > > > > how long a dyno can be locked to a single upload before being
> > > > > restarted?
>
> > > > "Request Length: 30 seconds - Hard" (http://legal.heroku.com/aup)
>
> > > > > A rather low timeout for a medium-sized upload...
>
>