Cool. Filed a task for us to work on that.
https://bugzilla.mozilla.org/show_bug.cgi?id=672527
On 7/19/11 12:05 PM, Stack wrote:
Set the region size very large (in trunk you can actually disable splitting).
St.Ack
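For anyone wanting to try this, here is a minimal sketch using the 0.90-era Java client (the table name "queue" and family "q" are placeholders, not from the thread): setting a per-table max file size far beyond what the table will ever reach means the split threshold is never hit. The cluster-wide equivalent knob is hbase.hregion.max.filesize.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateQueueTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("queue");
    // A split threshold far larger than the table will ever grow
    // effectively disables splitting for this table.
    desc.setMaxFileSize(Long.MAX_VALUE);
    desc.addFamily(new HColumnDescriptor("q"));

    admin.createTable(desc);
  }
}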
On Tue, Jul 19, 2011 at 8:26 AM, Daniel Einspanjer
wrote:
We use a queue table like this too and ran into the same problem. How
did you configure it such that it never splits?
-Daniel
On 7/16/11 4:24 PM, Stack wrote:
I learned Friday that our fellas on the frontend are using an HBase
table to do simple queuing. They insert stuff to be processed by
Mozilla is using it for a test project that is building a data warehouse
on top of our Bugzilla installation. While it is still a bit young, it
is usable and very exciting, not only for the searching capabilities,
but also for the application-friendly extensions to HBase such as linked
fields,
I set up a Google Docs spreadsheet a while back and shared it on this
list that specifically broke down the costs associated with doing
SuperMicro 2-node 2U or 4-node 2U servers. The problem with the 2.5-inch
drives is that you can't get a large drive that is enterprise class
(important for vib
o pin
on your chest if you are working toward getting HBase commit access.
(Did I mention getting paid for it?)
Thanks for your time,
Daniel Einspanjer
Metrics Architect
Mozilla Corporation
o non-HDFS data, you can literally wedge it in like
8gb. The biggest things that are not HDFS data are logs, and those
can go into the HDFS partition; they tend to be low volume but can add
up over time since the default is not to reap them.
On Thu, Sep 30, 2010 at 4:17 PM, Daniel Einspanjer
wrote:
Right now, most of our boxes have 3 disks in them. We take a small
partition on each of those and RAID-stripe them together to use as the
OS partition, then allocate the rest of the disks as JBOD for HDFS storage.
We are building out a new cluster and I'm wondering if there are any
better ideas
Question regarding configuration and tuning...
Our current configuration/schema has fairly low HLog rollover sizes to
keep the possibility of data loss to a minimum. When we upgrade to .89
with append support, I imagine we'll be able to safely set this to a
much larger size. Are there any r
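No answer is preserved in this excerpt, but for reference, a sketch of the knobs involved. The property names below are the ones used in the 0.89/0.90-era codebase (worth verifying against your release), and in practice they belong in hbase-site.xml on the region servers; the standalone snippet just shows how they relate:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalRollSizing {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // With append support, the log can safely grow toward a full
    // HDFS block before rolling; a roll happens at roughly
    // blocksize * multiplier.
    conf.setLong("hbase.regionserver.hlog.blocksize", 64L * 1024 * 1024);
    conf.setFloat("hbase.regionserver.logroll.multiplier", 0.95f);

    long rollAt = (long) (conf.getLong("hbase.regionserver.hlog.blocksize", 0)
        * conf.getFloat("hbase.regionserver.logroll.multiplier", 0.95f));
    System.out.println("roll at ~" + rollAt + " bytes");
  }
}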
I've been trying to figure out what specs our next HBase cluster
should have. That mostly involves considering the balance between #
nodes, disks, memory, and CPU.
I put together this rough Google Docs spreadsheet with inaccurate but
somewhat relative prices for some SuperMicro enclosures th
Really good point about the firewall loophole, thanks for bringing it up.
The code that I wrote is very much the bridge daemon you suggested. So I
guess it just needs to remain living separately from Thrift.
-Daniel
On 9/10/10 12:05 PM, Time Less wrote:
If it were to be included in HBase in
able to offer an ASF grant we could include this in HBase.
-ryan
On Thu, Sep 9, 2010 at 5:36 PM, Daniel Einspanjer
wrote:
Cross posting my recent blog entry...
As documented in THRIFT-601, sending random data to Thrift can cause it
to leak memory.
At Mozilla, we use a web load balancer to distribute traffic to our
Thrift machines, and the default liveness check it uses is a simple TCP
connect. We also had Nag
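Daniel's actual bridge code isn't included in the thread. As a minimal sketch of the idea (the port and endpoint are made up), the daemon answers the balancer's health checks over HTTP by making a real HBase call, so nothing ever pokes the Thrift port with raw TCP connects:

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class HealthCheckDaemon {
  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    HttpServer server = HttpServer.create(new InetSocketAddress(9099), 0);
    server.createContext("/health", new HttpHandler() {
      public void handle(HttpExchange ex) throws IOException {
        int status;
        byte[] body;
        try {
          // Any cheap real call proves the cluster is reachable.
          new HBaseAdmin(conf).isMasterRunning();
          status = 200;
          body = "OK".getBytes();
        } catch (Exception e) {
          status = 500;
          body = "FAIL".getBytes();
        }
        ex.sendResponseHeaders(status, body.length);
        OutputStream os = ex.getResponseBody();
        os.write(body);
        os.close();
      }
    });
    server.start();
  }
}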
Xavier recently mentioned some code we use at Mozilla that should help here.
It is a unioning scanner that would let you define a list of scan ranges to run
for the job. You'd set a prefix for each Friday in the selected months and the
range of 143000 through 143060.
Then you'd apply a filter on t
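The unioning scanner code itself isn't posted in this excerpt; a serial sketch of the same effect with the stock client would run one Scan per range and concatenate the results (the table name and key layout here are assumptions for illustration):

import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class MultiRangeScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "events");

    // One date prefix per Friday in the selected months (examples only),
    // each limited to the 143000-143060 time window.
    List<String> fridays = Arrays.asList("100604", "100611", "100618");
    for (String day : fridays) {
      Scan scan = new Scan(
          Bytes.toBytes(day + "143000"),   // start row, inclusive
          Bytes.toBytes(day + "143060"));  // stop row, exclusive
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
      } finally {
        scanner.close();
      }
    }
    table.close();
  }
}

A real unioning scanner would present these ranges through a single scanner interface (and a MapReduce job would want one split per range), but the row-range arithmetic is the same.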
Matthew,
Maybe instead of changing the replication factor, you could spin up new nodes
with a different datacenter/rack configuration, which would cause Hadoop to
ensure the replicas are not solely on those temp nodes?
Matthew LeMieux wrote:
J-D,
Thank you for the very fast response.
org.apache.hadoop.hbase.master.RegionServerOperation: Updated row
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647
in region .META.,,1 with startcode=1276778868841, server=10.2.72.74:60020
On 6/17/10 11:42 AM, Daniel Einspanjer wrote:
Currently, in our production cluster, almost all of the traffic for a
day ends up assigned to a single RS and that causes the load on that
machine to be too high.
With our last release, we salted our rowkeys so that rather than
starting with the date:
100617
they now start with the firs
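The exact scheme is cut off above, but a generic sketch of the salting pattern (the bucket count and key layout are assumptions, not Mozilla's actual code) prefixes the date-based key with a hash-derived bucket so writes spread across region servers:

import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKey {
  private static final int BUCKETS = 16;

  public static byte[] makeKey(String date, String uniquePart) {
    // Derive the bucket from a stable hash of the unique part so
    // the same record always lands in the same bucket.
    int bucket = Math.abs(uniquePart.hashCode()) % BUCKETS;
    // e.g. "07-100617-<uuid>" instead of "100617-<uuid>"
    return Bytes.toBytes(String.format("%02d-%s-%s", bucket, date, uniquePart));
  }

  public static void main(String[] args) {
    System.out.println(Bytes.toString(makeKey("100617", "dcad-4c98")));
  }
}

The trade-off is on the read side: fetching one day now takes one scan per bucket, which is where something like the unioning scanner mentioned earlier in these threads comes in.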
I didn't realize that Lily was that far along; I thought you were
still in R&D for a few more months. This sounds very promising and
we'll take a look at what you have available.
-Daniel
On 6/8/10 3:38 AM, Steven Noels wrote:
On Tue, Jun 8, 2010 at 2:55 AM, Daniel Einspanjer
At the moment, we want to do nothing more than execute a callback
function *after* a get/incr/delete has successfully completed. If the
callback fails to execute for some reason, we'd want to log an error,
but wouldn't want it to have any impact on the HBase side of things.
This is an extreme
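HBASE-2001 would provide these hooks server-side. Purely for illustration, a client-side wrapper that honors the contract described above (all names here are hypothetical): the callback fires only after the write succeeds, and a callback failure is logged without affecting the HBase operation.

import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class CallbackTable {
  private static final Log LOG = LogFactory.getLog(CallbackTable.class);

  public interface Callback { void onSuccess(Put put); }

  private final HTable table;
  private final Callback callback;

  public CallbackTable(HTable table, Callback callback) {
    this.table = table;
    this.callback = callback;
  }

  public void put(Put put) throws IOException {
    table.put(put);               // the real write; exceptions propagate
    try {
      callback.onSuccess(put);    // e.g. hand the doc to the indexer
    } catch (Exception e) {
      LOG.error("Callback failed; HBase write already committed", e);
    }
  }
}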
We are specifically looking for the ability to create callbacks on
put, increment, and delete for specific tables so we can implement the
indexing solution. This is actually advance preparation for Socorro 2.0,
which won't be released until August or maybe September, so we have some
dev time.
rue'}, {NAME => 'processed_data',
VERSIONS => '3', COMPRESSION => 'LZO', TTL => '2147483647', BLOCKSIZE =>
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
'raw_data', COMPRESSION => 'LZO', VERSIONS => '3', TTL => '2147483647',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
Is there any other information I should provide that could lead to other
important config changes we should make on this upgrade?
Daniel Einspanjer
Mozilla Corporation
Mozilla is taking a hard look at using Elastic Search as an
indexing/searching mechanism for Socorro 2.0. We're evaluating the
possibility of using the HBASE-2001 patch as a mechanism to hook in
NRT indexing of the documents.
-Daniel
On 6/3/10 5:36 PM, Steven Noels wrote:
On Thu, Ju
rther about use cases and
implementation plans to see where we might be able to effectively collaborate.
Daniel Einspanjer
Metrics Architect
Mozilla Corporation
Just wanted to see if anyone knew of any potential problems with my
desired HBase schema for this project.
Please feel free to comment here or on the discussion page of the wiki.
https://wiki.mozilla.org/BouncerRealTimeMetricsProject
Daniel Einspanjer
Metrics Architect
Mozilla Corporation