>
> Can the sstableloader job run from outside a Cassandra node? or it has to
> be run from inside Cassandra node.
>
Yes, I'm a fan of running sstableloader on a server that is not one of the
nodes in the cluster. You can maximise the throughput by running multiple
instances
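The parallel-instance approach above can be sketched from a standalone host. Target addresses and staging paths below are hypothetical; the host only needs the Cassandra tarball, a matching cassandra.yaml, and network access to the cluster.

```shell
# Build one loader command per source-node snapshot staged on this host; in
# practice each would be launched with & and followed by `wait`.
cmds=()
for node in node1 node2 node3; do
  cmds+=("sstableloader -d 10.0.1.11,10.0.1.12 /staging/$node/my_ks/my_table")
done
printf '%s\n' "${cmds[@]}"
```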
Hello Erick,
I have one more question.
Can the sstableloader job run from outside a Cassandra node? or it has to
be run from inside Cassandra node.
When I tried it from the cassandra node it worked, but when I try to run it
from outside the cassandra cluster (a standalone machine which doesn
Thanks Erick, I will go through the posts and get back if I have any
questions.
On Mon, Nov 9, 2020 at 1:58 PM Erick Ramirez
wrote:
> A few months ago, I was asked a similar question so I wrote instructions
> for this. It depends on whether the clusters are identical or not. The
> posts define w
A few months ago, I was asked a similar question so I wrote instructions
for this. It depends on whether the clusters are identical or not. The
posts define what "identical" means.
If the source and target cluster are identical in configuration, follow the
procedure here -- https://community.datas
Hello,
I have a few questions regarding restoring data from snapshots using
sstableloader.
If I have a 6-node Cassandra cluster with vnodes (256) and I have taken a
snapshot of all 6 nodes, and I have to restore to another cluster:
1. Does the target cluster have to be of the same size?
2. If 1
Ok, thanks very much for the answer!
On Fri, Feb 7, 2020 at 9:00 PM Erick Ramirez wrote:
> INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 -
>> Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
>>
>
> The message gets logged when SSTables are being cache
>
> INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 -
> Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
>
The message gets logged when SSTables are being cached and the cache fills
up faster than objects are evicted from it. Note that the message is
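For context, the number in that log line is a byte count: it is exactly the 512 MB default of `file_cache_size_in_mb` (a Cassandra 3.x cassandra.yaml setting), the ceiling the chunk cache hit. A quick check of the figure:

```shell
# 536870912 from the log message is bytes; dividing down gives the 512 MB
# default of file_cache_size_in_mb. The warning is informational, not a failure.
bytes=536870912
echo "$((bytes / 1024 / 1024)) MB"
```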
Hi folks,
When sstableloader hits a very large sstable cassandra may end up logging a
message like this:
INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 -
Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
The loading process doesn't abort, an
ndra.apache.org"
Subject: Re: sstableloader: How much does it actually need?
Message from External Sender
Yes, you will have all the data in two nodes provided there are no mutation
drops at the node level or the data is repaired.
For example, if you have data A, B, C and D, with RF=3 and 4 nodes (node1, nod
nodes would *not* have all the data; but am more than willing to
> learn.
>
> On the other thing: that's an attractive option, but in our case, the
> target cluster will likely come into use before the source-cluster data is
> available to load. Seemed to me the safest approach was s
er data is
available to load. Seemed to me the safest approach was sstableloader.
Thanks
On Wed, Feb 5, 2020 at 6:56 PM Erick Ramirez wrote:
> Unfortunately, there isn't a guarantee that 2 nodes alone will have the
> full copy of data. I'd rather not say "it depends". 😁
>
> Another option is the DSE-bulk loader but it will require to convert to
> csv/json (good option if you don't like to play with sstableloader and deal
> to get all the sstables from all the nodes)
> https://docs.datastax.com/en/dsbulk/doc/index.html
>
Thanks, Sergio. T
Another option is the DSBulk loader, but it will require converting to
CSV/JSON (a good option if you don't like to play with sstableloader and
dealing with getting all the sstables from all the nodes)
https://docs.datastax.com/en/dsbulk/doc/index.html
Cheers
Sergio
On Wed, Feb 5, 2020 at
tool
refresh. If the target cluster is already built and you can't assign the
same tokens then sstableloader is your only option. Cheers!
P.S. No need to apologise for asking questions. That's what we're all here
for. Just keep them coming. 👍
>
Scenario: Cassandra 3.11.x, 4 nodes, RF=3; moving to identically-sized
cluster via snapshots and sstableloader.
As far as I can tell, in the topology given above, any 2 nodes contain all
of the data. In terms of migrating this cluster, would there be any
downsides or risks with snapshotting and
rather than 4x
sstableloader in parallel).
Also, thanks to everyone for confirming no issue with num_tokens and
sstableloader; appreciate it.
On Mon, Jan 27, 2020 at 9:02 AM Durity, Sean R
wrote:
> I would suggest to be aware of potential data size expansion. If you load
> (for example) three copi
original data size (or,
origin RF * target RF), until compaction can run.
Sean Durity – Staff Systems Engineer, Cassandra
From: Erick Ramirez
Sent: Friday, January 24, 2020 11:03 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: sstableloader & num_tokens change
If I may just loop this
Hello
Concerning the original question, I agree with @eric_ramirez:
sstableloader is agnostic to the number of tokens.
just for info @voytek, check this post out
https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
You may be interested to know if
On the subject of DSBulk, sstableloader is the tool of choice for this
scenario.
+1 to Sergio and I'm confirming that DSBulk is designed as a bulk loader
for CSV/JSON formats. Cheers!
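For reference, a minimal DSBulk round trip looks roughly like this; the keyspace, table, hosts, and export directory are placeholders:

```shell
# Export from the source cluster to CSV, then load into the target cluster.
dsbulk unload -h 10.0.0.1 -k my_ks -t my_table -url ./export
dsbulk load   -h 10.0.1.1 -k my_ks -t my_table -url ./export
```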
If I may just loop this back to the question at hand:
I'm curious if there are any gotchas with using sstableloader to restore
snapshots taken from 256-token nodes into a cluster with 32-token (or your
preferred number of tokens) nodes (otherwise same # of nodes and same RF).
On Fri, J
ulk support migration cluster to cluster without CSV or JSON
> export?
>
> Thanks and Regards
>
> On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth wrote:
>
>> Instead of sstableloader consider dsbulk by datastax.
>>
>> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback &l
Why? Seems to me that the old Cassandra -> CSV/JSON and CSV/JSON -> new
Cassandra are unnecessary steps in my case.
On Fri, Jan 24, 2020 at 10:34 AM Nitan Kainth wrote:
> Instead of sstableloader consider dsbulk by datastax.
>
> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchb
Instead of sstableloader consider dsbulk by datastax.
On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback
wrote:
> Jon Haddad has previously made the case for num_tokens=4. His Accelerate
> 2019 talk is available at:
>
>
>
> https://www.youtube.com/watch?v=swL7bCnolkU
>
. The
caveats are explored at:
https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
From: Voytek Jarnot
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 24, 2020 at 10:39 AM
To: "user@cassandra.apache.org"
Subject: sstabl
will likely be using sstableloader to do so. I'm curious if there are any
gotchas with using sstableloader to restore snapshots taken from 256-token
nodes into a cluster with 32-token nodes (otherwise same # of nodes and
same RF).
Thanks in advance.
: Anthony Goetz
Subject: [EXTERNAL] Re: Sstableloader
Thank you Anthony and Jonathan. To add a new ring it doesn't have to be the same
version of Cassandra, right? For example, DSE 5.12, which is 3.11.0, has sstables
with the mc name, and Apache 3.11.3 also uses sstable names with mc. We should
still be able to add
SE DC
>
>
>
> Note: OpsCenter will stop working once you add OSS nodes.
>
>
>
> *From: *Jonathan Koppenhofer
> *Reply-To: *Cassandra User List
> *Date: *Wednesday, May 29, 2019 at 6:45 PM
> *To: *Cassandra User List
> *Subject: *[EXTERNAL] Re: Sstableloader
User List
Date: Wednesday, May 29, 2019 at 6:45 PM
To: Cassandra User List
Subject: [EXTERNAL] Re: Sstableloader
Has anyone tried to do a DC switch as a means to migrate from Datastax to OSS?
This would be the safest route as the ability to revert back to Datastax is
easy. However, I'm curiou
If cassandra version is same, it should work
Regards,
Nitan
Cell: 510 449 9629
> On May 28, 2019, at 4:21 PM, Rahul Reddy wrote:
>
> Hello,
>
> Does sstableloader works between datastax and Apache cassandra. I'm trying to
> migrate dse 5.0.7 to Apache 3.11.1 ?
Hello,
I can't answer this question about the sstableloader (even though I think
it should be ok). My understanding, even though I'm not really up to date
with latest Datastax work, is that DSE uses a modified but compatible
version of Cassandra, for everything that is not
Hello,
Does sstableloader work between DataStax and Apache Cassandra? I'm trying
to migrate DSE 5.0.7 to Apache 3.11.1?
--
Hi all,
I've been running into the following issue while trying to restore a C*
database via sstableloader:
Could not retrieve endpoint ranges:
org.apache.thrift.transport.TTransportException: Frame size (352518912)
larger than max length (15728640)!
java.lang.RuntimeException: Cou
Hello community,
I'm receiving some strange streaming errors while trying to restore certain
sstables snapshots with sstableloader to a new cluster.
While the cluster is up and running and nodes are communicating with
each other, I can see streams failing to the nodes with no obvious reaso
On Mon, Dec 3, 2018 at 4:24 PM Oliver Herrmann
wrote:
>
> You are right. The number of nodes in our cluster is equal to the
> replication factor. For that reason I think it should be sufficient to call
> sstableloader only from one node.
>
The next question is then: do
environment the user
>> that would do the restore does not have write access to the data folder.
>>
>
> OK, not entirely sure that's a reasonable setup, but do you imply that
> with sstableloader you don't need to process every snapshot taken -- that
> is, also visiting
It's a bug in the sstableloader introduced many years ago - before that, it
worked as described in documentation...
Oliver Herrmann at "Fri, 30 Nov 2018 17:05:43 +0100" wrote:
OH> Hi,
OH> I'm having some problems to restore a snapshot using sstableloader. I
re that's a reasonable setup, but do you imply that with
sstableloader you don't need to process every snapshot taken -- that is,
also visiting every node? That would only be true if your replication
factor equals to the number of nodes, IMO.
--
Alex
shot directory to the data directory and then running `nodetool refresh` is
the supported way. Why use sstableloader for that?
--
Alex
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
the message
> "Skipping file mc-11-big-Data.db: table snap1.snap1. doesn't exist".
>
Hi,
I imagine moving the files from snapshot directory to the data directory
and then running `nodetool refresh` is the supported way. Why use
sstableloader for that?
--
Alex
Hi,
I'm having some problems restoring a snapshot using sstableloader. I'm
using Cassandra 3.11.1 and followed the instructions for creating and
restoring from this page:
https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/toolsBulkloader.html
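The snapshot-then-load procedure referenced there, as a rough sketch; the tag, keyspace, and host names are made up:

```shell
# On each source node: flush memtables, then take a tagged snapshot.
nodetool flush my_ks
nodetool snapshot -t migrate my_ks
# Snapshots land under <data_dir>/my_ks/<table>-<id>/snapshots/migrate/.
# After copying a snapshot to a staging area laid out as keyspace/table,
# stream it into the target cluster:
sstableloader -d target-host1,target-host2 /staging/my_ks/my_table
```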
8:00, Kalyan Chakravarthy wrote:
> I’m trying to migrate data between two clusters on different networks.
> Ports: 7001,7199,9046,9160 are open between them. But port:7000 is not
> open. When I run sstableloader command, got the following exception.
> Command:
>
> :/a/cassandra/
I’m trying to migrate data between two clusters on different networks. Ports
7001, 7199, 9046, and 9160 are open between them, but port 7000 is not. When I
run the sstableloader command, I get the following exception.
Command:
:/a/cassandra/bin# ./sstableloader -d
192.168.98.99/abc/cassandra/data
Hi,
I’m new to Cassandra, please help me with sstableloader. Thank you in advance.
I’m trying to migrate data between two clusters which are on different networks.
Migrating data from ‘c1’ to ‘c2’
Which one will be the source and which one will be destination??
And where should I run
ot the
> replica data
> - yes, if you want to use nodetool refresh as some sort of recovery
> solution, MAKE SURE YOU STORE THE TOKEN LIST with the
> sstables/snapshots/backups for the nodes.
>
> On Wed, Aug 29, 2018 at 8:57 AM Durity, Sean R <
> sean_r_dur...@homedepot.com>
/snapshots/backups for the nodes.
On Wed, Aug 29, 2018 at 8:57 AM Durity, Sean R
wrote:
> Sstableloader, though, could require a lot more disk space – until
> compaction can reduce. For example, if your RF=3, you will essentially be
> loading 3 copies of the data. Then it will get replicat
Sstableloader, though, could require a lot more disk space – until compaction
can reduce. For example, if your RF=3, you will essentially be loading 3 copies
of the data. Then it will get replicated 3 more times as it is being loaded.
Thus, you could need up to 9x disk space.
Sean Durity
From
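Sean's up-to-9x figure follows from multiplying the copies you load by the copies the target makes; as a quick check with the thread's values:

```shell
# Loading every node's snapshot streams source_rf full copies, and the target
# cluster then replicates each incoming row target_rf times.
source_rf=3
target_rf=3
expansion=$((source_rf * target_rf))
echo "up to ${expansion}x the logical data size until compaction reduces it"
```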
Removing dev...
Nodetool refresh only picks up new SSTables that have been placed in the
table's directory. It doesn't account for actual ownership of the data like
sstableloader does. Refresh will only work properly if the SSTables you are
copying in are completely covered by that node's token
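The distinction amounts to two different restore paths; paths and addresses below are illustrative:

```shell
# nodetool refresh: files must already sit in the table's data directory on a
# node whose token ranges cover them; Cassandra just picks the files up.
cp /backup/my_ks/my_table/*.db /var/lib/cassandra/data/my_ks/my_table-*/
nodetool refresh my_ks my_table

# sstableloader: parses the sstables and streams each partition to whichever
# nodes currently own it, so token ownership need not match.
sstableloader -d 10.0.1.11 /backup/my_ks/my_table
```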
Hi Cassandra users, Cassandra dev,
When recovering using SSTables from a snapshot, I want to know what are the
key differences between using:
1. Nodetool refresh and,
2. SSTableloader
Does nodetool refresh have restrictions that need to be met?
Does nodetool refresh work even if there is a
What’s the cardinality of hash?
Do they have the same schema? If so you may be able to take a snapshot and
hardlink it in / refresh instead of sstableloader. Alternatively you could drop
the index from the destination keyspace and add it back in after the load
finishes.
How big are the
What does “hash” Data look like?
Rahul
On Jul 24, 2018, 11:30 AM -0400, Arpan Khandelwal , wrote:
> I need to clone data from one keyspace to another keyspace.
> We do it by taking snapshot of keyspace1 and restoring in keyspace2 using
> sstableloader.
>
> Suppose we have follo
I need to clone data from one keyspace to another keyspace.
We do it by taking snapshot of keyspace1 and restoring in keyspace2 using
sstableloader.
Suppose we have following table with index on hash column. Table has around
10M rows.
-
CREATE TABLE message (
id uuid
Never mind, found it. It's not a supported version.
> On Jun 19, 2018, at 2:41 PM, rajpal reddy wrote:
>
>
> Hello,
>
> I’m trying to use sstableloader from DSE 4.8.4 (2.1.12) to Apache 3.11.1; I’m
> getting the error below, but it works fine when I use the sstableloader from
> DSE 5.1.2 (Apache 3.11.0).
> Could
Hello,
I’m trying to use sstableloader from DSE 4.8.4 (2.1.12) to Apache 3.11.1; I’m
getting the error below, but it works fine when I use the sstableloader from
DSE 5.1.2 (Apache 3.11.0).
Could not retrieve endpoint ranges:
java.io.IOException: Failed to open transport to: host-ip:9160.
Any work around to use
that it’s Cassandra.Cassandra from
> root to the Data folder and either run as root or sudo it.
>
> If it’s compacted it won’t be there so you won’t have the file. I’m not
> aware of this event being communicated to Sstableloader via SEDA. Besides,
> the sstable that you are loading SH
compacted it won’t be there so you won’t have the file. I’m not aware
of this event being communicated to Sstableloader via SEDA. Besides, the
sstable that you are loading SHOULD not be live. If you are streaming a live
sstable, it means you are using sstableloader not as it is designed to be used
hanks!
On Sun, Feb 18, 2018 at 3:58 PM, Rahul Singh
wrote:
> Check permissions maybe? Who owns the files vs. who is running
> sstableloader.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 18, 2018, 4:26 AM -0500, shalom sagges ,
> wr
Check permissions maybe? Who owns the files vs. who is running sstableloader.
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Feb 18, 2018, 4:26 AM -0500, shalom sagges , wrote:
> Hi All,
>
> C* version 2.0.14.
>
> I was loading some data to another cluster using SST
Hi All,
C* version 2.0.14.
I was loading some data to another cluster using SSTableLoader. The
streaming failed with the following error:
Streaming error occurred
java.lang.RuntimeException: java.io.*FileNotFoundException*:
/data1/keyspace1/table1/keyspace1-table1-jb-65174-Data.db (No such
l see this
exception:
sstableloader -d cass1
/snapshot_data/keyspace1/cf1-2195c1a0bc1011e69b699bbcfdee6372
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of
/snapshot_data/keyspace1/cf1-2195c1a0bc1011e69b699bbcfdee6372/keyspace1-
I'm trying to use sstableloader to bulk load some data to my 4 DC cluster,
and I can't quite get it to work. Here is how I'm trying to run it:
sstableloader -d 127.0.0.1 -i {csv list of private ips of nodes in cluster}
myks/mttest
At first this seems to work, with a steady st
/simonefranzini
On Fri, Feb 10, 2017 at 4:28 PM, Simone Franzini
wrote:
> I am trying to ingest some data from a cluster to a different cluster via
> sstableloader. I am running DSE 4.8.7 / Cassandra 2.1.14.
> I have re-created the schemas and followed other instructions here
I am trying to ingest some data from a cluster to a different cluster via
sstableloader. I am running DSE 4.8.7 / Cassandra 2.1.14.
I have re-created the schemas and followed other instructions here:
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsBulkloader_t.html
I am initially
Hello,
It's about 2500 sstables worth 25TB of data.
The -t parameter doesn't change anything; -t 1000 and -t 1 behave the same.
Most probably I'm facing some limitation at the target cluster.
I'm preparing to split sstables and run up to ten parallel sstableloader
sessions.
Regards,
Osman
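Splitting a large table directory into batches for parallel sessions can be sketched like this. The demo builds dummy sstable files in a temp directory; real component sets have more files per sstable than the two created here, and real paths would differ:

```shell
# Round-robin sstable component sets into N batch directories, one per
# planned sstableloader session.
N=3
work=$(mktemp -d)
mkdir -p "$work/src"
for n in 1 2 3 4 5 6; do
  touch "$work/src/mc-$n-big-Data.db" "$work/src/mc-$n-big-Index.db"
done
i=0
for f in "$work"/src/*-Data.db; do
  b=$((i % N)); i=$((i + 1))
  mkdir -p "$work/batch_$b/my_ks/my_table"
  base="${f%-Data.db}"
  mv "$base"-* "$work/batch_$b/my_ks/my_table/"   # keep each set together
done
# Each batch_<n>/my_ks/my_table can now be fed to its own sstableloader.
ls "$work/batch_0/my_ks/my_table"
```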
On 11-10-2016 21:46, Raj
ZGATLIOGLU <
osman.yozgatlio...@krontech.com> wrote:
> Hello,
>
> Thank you Adam and Rajath.
>
> I'll split input sstables and run parallel jobs for each.
> I tested this approach and run 3 parallel sstableloader job without -t
> parameter.
> I raised stream_throughput
Hello,
Thank you Adam and Rajath.
I'll split the input sstables and run parallel jobs for each.
I tested this approach and ran 3 parallel sstableloader jobs without the -t
parameter.
I raised the stream_throughput_outbound_megabits_per_sec parameter from 200 to
600 Mbit/sec on all of the target nodes.
But
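The two throttles touched in this thread, as example commands (600 is just the value used above; host and path are placeholders):

```shell
# On each target node, raise the inbound stream cap at runtime (Mb/s), no
# restart needed:
nodetool setstreamthroughput 600
# And lift the loader's own cap with -t when starting the session:
sstableloader -d 10.0.1.11 -t 600 /staging/my_ks/my_table
```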
Hi Osman,
You cannot restart the streaming only to the failed nodes specifically. You
can restart the sstableloader job itself. Compaction will eventually take
care of the redundant rows.
- Rajath
Rajath Subramanyam
On Sun, Oct 9, 2016 at 7:38 PM, Adam Hutson wrote
It'll start over from the beginning.
On Sunday, October 9, 2016, Osman YOZGATLIOGLU <
osman.yozgatlio...@krontech.com> wrote:
> Hello,
>
> I have running a sstableloader job.
> Unfortunately some of nodes restarted since beginnig streaming.
> I see streaming stop for th
Hello,
I have a running sstableloader job.
Unfortunately some of the nodes restarted since streaming began.
I see streaming stopped for those nodes.
Can I restart that streaming somehow?
Or if I restart the sstableloader job, will it start from the beginning?
Regards,
Osman
Thank you for your answer Kai.
On 17 Aug 2016, at 11:34, Kai Wang wrote:
yes, you are correct.
On Tue, Aug 16, 2016 at 2:37 PM, Jean Tremblay wrote:
Hi,
I’m using Cassandra 3.7.
In the documentation for sst
yes, you are correct.
On Tue, Aug 16, 2016 at 2:37 PM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:
> Hi,
>
> I’m using Cassandra 3.7.
>
> In the documentation for sstableloader I read the following:
>
> << Note: To get the best throughput
Hi,
I’m using Cassandra 3.7.
In the documentation for sstableloader I read the following:
<< Note: To get the best throughput from SSTable loading, you can use multiple
instances of sstableloader to stream across multiple machines. No hard limit
exists on the number of SSTable
back to node X.
If you do not have information on where the sstable comes from or if you
added / removed nodes, then using the sstableloader is probably a good
idea. If you really don't like sstableloader (not sure why), you can paste
all the sstables to all the nodes then nodetool refresh + nod
Hi,
in the docs it still says that the sstableloader still uses gossip
(
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsBulkloader_t.html
http://docs.datastax.com/en/cassandra/3.x/cassandra/tools/toolsBulkloader.html
)
but this blog (
http://www.datastax.com/dev/blog/using-the
ion with the destination node.
>
> If not you should check what is the configured storage_port in the
> destination node and set that in the cassandra.yaml of the source node so
> it's picked up by sstableloader.
>
Can you telnet 10.211.55.8 7000? This is the port used for streaming
communication with the destination node.
If not you should check what is the configured storage_port in the
destination node and set that in the cassandra.yaml of the source node so
it's picked up by sstableloader.
2016-
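A quick way to confirm the storage port is reachable from the loader host, using the address and port from this thread:

```shell
# Exit status 0 means the TCP connect to the storage port succeeded.
nc -zv 10.211.55.8 7000
```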
Hello,
I am trying to load the SSTables (from a Titan graph keyspace) of a
one-node-cluster (C* v2.2.6) into another node, but I cannot figure out how to
properly use the sstableloader. The target keyspace and table exist in the
target node. If they do not exist I get a proper error message
Hi everyone
I am currently working with Cassandra 3.5. I would like to know if it is
possible to restore backups without using sstableloader. I have been
referring to the following pages in the datastax documentation:
https://docs.datastax.com/en/cassandra/3.x/cassandra/operations
1/11/16, 5:21 AM, "Noorul Islam K M" wrote:
>
>>
>>I have a need to stream data to new cluster using sstableloader. I
>>spawned a machine with 32 cores assuming that sstableloader scaled with
>>respect to cores. But it doesn't look like so.
>>
>
Make sure streaming throughput isn’t throttled on the destination cluster.
Stream from more machines (divide sstables between a bunch of machines, run in
parallel).
On 1/11/16, 5:21 AM, "Noorul Islam K M" wrote:
>
>I have a need to stream data to new cluster using
I have a need to stream data to new cluster using sstableloader. I
spawned a machine with 32 cores assuming that sstableloader scaled with
respect to cores. But it doesn't look like so.
I am getting an average throughput of 18 MB/s which seems to be pretty
low (I might be wrong).
Is ther
You only need patch for sstableloader.
You don't have to upgrade your cassandra servers at all.
So,
1. fetch the latest cassandra-2.1 source
$ git clone https://git-wip-us.apache.org/repos/asf/cassandra.git
$ cd cassandra
$ git checkout origin/cassandra-2.1
2. build it
$
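The remaining steps, sketched under the assumption that the stock ant build is used (the checkout above already selected the cassandra-2.1 branch):

```shell
# 2. build it (requires ant and a JDK)
ant
# 3. run the freshly built loader against your staged table directory
./bin/sstableloader -d <target-node> /path/to/my_ks/my_table
```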
hi, Yuki
Thank you very much!
The issue's description almost fits my case!
1. My Cassandra version is 2.1.11
2. my table has several colomn with collection type
3. Before failed this time, I can use sstableloader to load the data
into this table, but
I got
This is a known issue.
https://issues.apache.org/jira/browse/CASSANDRA-10700
It is fixed in not-yet-released version 2.1.13.
So, you need to build from the latest cassandra-2.1 branch to try.
On Mon, Dec 28, 2015 at 5:28 PM, 土卜皿 wrote:
> hi, all
>I used the sstableloader many
hi, all
I used the sstableloader many times successfully, but I got the
following error:
[root@localhost pengcz]# /usr/local/cassandra/bin/sstableloader -u user -pw
password -v -d 172.21.0.131 ./currentdata/keyspace/table
Could not retrieve endpoint ranges:
java.lang.IllegalArgumentException
to migrate nearly 1 TB of data from a 6-node cluster to a 3-node
one. Neither copying sstables nor nodetool refresh seems a great option,
unless I am missing something.
Using sstableloader seems a more logical option. Still a bottleneck if you
need to do it for every node in your source cluster
Hello George,
You can use sstable2json to create the json of your keyspace and then load
this json to your keyspace in new cluster using json2sstable utility.
On Tue, Dec 1, 2015 at 3:06 AM, Robert Coli wrote:
> On Thu, Nov 19, 2015 at 7:01 AM, George Sigletos
> wrote:
>
>> We would like to mig
On Thu, Nov 19, 2015 at 7:01 AM, George Sigletos
wrote:
> We would like to migrate one keyspace from a 6-node cluster to a 3-node
> one.
>
http://www.pythian.com/blog/bulk-loading-options-for-cassandra/
=Rob
Hello,
We would like to migrate one keyspace from a 6-node cluster to a 3-node one.
Since an individual node does not contain all data, this means that we
should run the sstableloader 6 times, one for each node of our cluster.
To be precise, do "nodetool flush " then run sstablelo
Thanks, Rob. We use spark-cassandra-connector to read data from the table, then
do a repartition action.
Some nodes with large files run this task too slowly, maybe several hours,
which is unacceptable.
But those nodes with small files finish quickly.
So I think if sstableloader can
On Thu, Nov 12, 2015 at 6:44 AM, qihuang.zheng wrote:
> question is : why sstableloader can’t balance data file size?
>
Because it streams ranges from the source SStable to a distributed set of
ranges, especially if you are using vnodes.
It is a general property of Cassandra's str
onnecot to read table and repartition. Spark repartition job
below indicate:
If nodes has none data.db like first two nodes, InputSize is 0.0B,and nodes
with large files like the last one running too long!
My question is: why can't sstableloader balance data file sizes?
Thanks, qihuang.zheng