CACHE TABLE works with partitioned tables.
I guess you’re experimenting with a default local metastore and the
metastore_db directory doesn’t exist in the first place. In this case,
none of the metastore tables/views exist at first, and you get the error
message you saw when the PARTITIONS metastore table is queried.
I recommend using the data generators provided with MLlib to generate synthetic
data for your scalability tests - provided they're well suited for your
algorithms. They let you control things like number of examples and
dimensionality of your dataset, as well as number of partitions.
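For example, a rough sketch with one of them (this assumes the
LinearDataGenerator API in org.apache.spark.mllib.util; other generators such
as KMeansDataGenerator follow a similar pattern):

    import org.apache.spark.mllib.util.LinearDataGenerator

    // Assumes an existing SparkContext named sc.
    // 1,000,000 labeled points, 100 features, noise eps = 0.1, 64 partitions.
    val data =
      LinearDataGenerator.generateLinearRDD(sc, 1000000, 100, 0.1, 64).cache()

    println(s"examples: ${data.count()}, partitions: ${data.partitions.length}")

Running a scalability sweep is then just a matter of varying the example
count, feature count, and number of partitions passed to the generator.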
As far as
Hi all,
I am trying to contribute some machine learning algorithms to MLlib.
I must evaluate their performance on a cluster, changing the input data
size, the number of CPU cores, and their parameters.
I would like to build and deploy my development version of Spark on EC2
automatically.
Is there already a build script for doing that?
Thanks for the update, Nate. I'm looking forward to seeing how these
projects turn out.
David, Packer looks very, very interesting. I'm gonna look into it more
next week.
Nick
On Thu, Oct 2, 2014 at 8:00 PM, Nate D'Amico wrote:
Bit of progress on our end, bit of lagging as well. Our guy leading the
effort got a little bogged down on a client project to update the hive/sql
testbed to the latest spark/sparkSQL; we're also launching a public service,
so we have been a bit scattered recently.
Will have some more updates probably after next week.
I think this is exactly what Packer is for. See e.g.
http://www.packer.io/intro/getting-started/build-image.html
On a related note, the current AMI for HVM systems (e.g. m3.*, r3.*) has a
bad package for httpd, which causes Ganglia not to start. For some reason I
can't get access to the raw AMI to fix it.
Is there perhaps a way to define an AMI programmatically? Like, a
collection of base AMI id + list of required stuff to be installed + list
of required configuration changes. I’m guessing that’s what people use
things like Puppet, Ansible, or maybe also AWS CloudFormation for, right?
If we could define AMIs that way, rebuilding or updating them would be much
simpler.
Hi,
In Spark 1.1 HiveContext, I ran a CREATE TABLE command for a partitioned table
followed by a CACHE TABLE command and got a java.sql.SQLSyntaxErrorException:
Table/View 'PARTITIONS' does not exist. But CACHE TABLE worked fine when the
table was not partitioned.
Can anybody confirm whether caching of partitioned tables is supported?
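Roughly, the sequence looks like this (the table name and columns below are
just placeholders, not my actual schema):

    import org.apache.spark.sql.hive.HiveContext

    // Assumes Spark 1.1 and an existing SparkContext named sc.
    val hiveContext = new HiveContext(sc)

    // Create a partitioned Hive table (placeholder name and columns).
    hiveContext.sql(
      "CREATE TABLE IF NOT EXISTS logs (msg STRING) PARTITIONED BY (dt STRING)")

    // Caching the partitioned table is the step that throws
    // java.sql.SQLSyntaxErrorException: Table/View 'PARTITIONS' does not exist.
    hiveContext.sql("CACHE TABLE logs")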