Sorry for my off topic question, it's the first time I heard of swift storage 
servers running an ARM processor. 

I think ARM architectures can be an interesting alternative to replace classic 
computing servers, but swift storage nodes did not catch my attention as a 
valid alternative, until now.

From my perspective, the savings in power consumption and real state in your 
datacenter are minimized if you load your servers with two dozen SATA disks (a 
typical swift node configuration) and a dual 10GbE connection, for example.

May be there are benefits I'm not aware of, but I would really love to hear 
about them :-)

Cheers
Diego

Enviado desde mi iPhone, perdona la brevedad

El 27/10/2012, a las 02:26, Nathan Trueblood <nat...@truebloodllc.com> escribió:

> Hey folks-
> 
> I'm trying to figure out what's going wrong with my Swift deployment on a 
> small cluster of "mini" servers.   I have a small test cluster (5 storage 
> nodes, 1 proxy) of mini-servers that are ARM-based.   The proxy is a regular, 
> Intel-based server with plenty of RAM.   The object/account/container servers 
> are relatively small, with 2GB of RAM per node.
> 
> Everything starts up fine, but now I'm trying to troubleshoot a strange 
> problem.   After I successfully upload a few test files, it seems like the 
> storage system stops responding and the proxy gives me a 503 error.
> 
> Here's the test sequence I run on my proxy:
> 
> lab@proxy01:~/bin$ ./swiftcl.sh stat
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass stat
>    Account: AUTH_system
> Containers: 5
>    Objects: 4
>      Bytes: 47804968
> Accept-Ranges: bytes
> X-Timestamp: 1351294912.72119
> lab@proxy01:~/bin$ ./swiftcl.sh upload myfiles1 /home/lab/bigfile1 
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass upload 
> myfiles1 /home/lab/bigfile1
> home/lab/bigfile1
> lab@proxy01:~/bin$ ./swiftcl.sh upload myfiles2 /home/lab/bigfile1 
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass upload 
> myfiles2 /home/lab/bigfile1
> home/lab/bigfile1
> lab@proxy01:~/bin$ ./swiftcl.sh upload myfiles3 /home/lab/bigfile1 
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass upload 
> myfiles3 /home/lab/bigfile1
> home/lab/bigfile1
> lab@proxy01:~/bin$ ./swiftcl.sh upload myfiles4 /home/lab/bigfile1 
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass upload 
> myfiles4 /home/lab/bigfile1
> home/lab/bigfile1
> lab@proxy01:~/bin$ ./swiftcl.sh upload myfiles5 /home/lab/bigfile1 
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass upload 
> myfiles5 /home/lab/bigfile1
> Object PUT failed: 
> http://172.16.1.111:8080/v1/AUTH_system/myfiles5/home/lab/bigfile1 503 
> Service Unavailable  [first 60 chars of response] 503 Service Unavailable
> 
> The server is currently unavailable
> lab@proxy01:~/bin$ ./swiftcl.sh stat
> swift -A http://proxy01:8080/auth/v1.0 -U system:root -K testpass stat
>    Account: AUTH_system
> Containers: 6
>    Objects: 5
>      Bytes: 59756210
> Accept-Ranges: bytes
> X-Timestamp: 1351294912.72119
> 
> Here's the corresponding log on the Proxy:
> 
> Oct 26 17:06:52 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/06/52 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0010
> Oct 26 17:07:13 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/13 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0017
> Oct 26 17:07:13 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/13 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0016
> Oct 26 17:07:22 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/22 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0010
> Oct 26 17:07:22 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/22 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0016
> Oct 26 17:07:27 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/27 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0010
> Oct 26 17:07:27 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/27 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0016
> Oct 26 17:07:27 proxy01 proxy-server Handoff requested (1) (txn: 
> tx6946419daba54efe9c2878f8a2a78f88) (client_ip: 172.16.1.111)
> Oct 26 17:07:27 proxy01 proxy-server Handoff requested (2) (txn: 
> tx6946419daba54efe9c2878f8a2a78f88) (client_ip: 172.16.1.111)
> Oct 26 17:07:33 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/33 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0010
> Oct 26 17:07:33 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/33 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0016
> Oct 26 17:07:33 proxy01 proxy-server Handoff requested (1) (txn: 
> tx5f9659f74cb2491f9a63cbb84f680c5c) (client_ip: 172.16.1.111)
> Oct 26 17:07:33 proxy01 proxy-server Handoff requested (2) (txn: 
> tx5f9659f74cb2491f9a63cbb84f680c5c) (client_ip: 172.16.1.111)
> Oct 26 17:07:39 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/39 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0009
> Oct 26 17:07:39 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/07/39 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0009
> Oct 26 17:07:39 proxy01 proxy-server Handoff requested (1) (txn: 
> tx8dc917a4a8c84c40a4429b7bab0323c6) (client_ip: 172.16.1.111)
> Oct 26 17:07:39 proxy01 proxy-server Handoff requested (2) (txn: 
> tx8dc917a4a8c84c40a4429b7bab0323c6) (client_ip: 172.16.1.111)
> Oct 26 17:07:40 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: tx8dc917a4a8c84c40a4429b7bab0323c6) (client_ip: 
> 172.16.1.111)
> Oct 26 17:07:41 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: tx07a1f5dfaa23445a88eaa4a2ade68466) (client_ip: 
> 172.16.1.111)
> Oct 26 17:07:43 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: tx938d08b706844db3886695b798bd9fad) (client_ip: 
> 172.16.1.111)
> Oct 26 17:07:47 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: txa35e9f8a54924f139e13d6f3a5dc457f) (client_ip: 
> 172.16.1.111)
> Oct 26 17:07:55 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: txc2f5c35b54a14b32a2d0178423bc7771) (client_ip: 
> 172.16.1.111)
> Oct 26 17:08:11 proxy01 proxy-server Object PUT returning 503, 1/2 required 
> connections (txn: txd8df89e76edd40f3b4c04091b3cbc857) (client_ip: 
> 172.16.1.111)
> Oct 26 17:08:17 proxy01 proxy-server - 127.0.0.1 27/Oct/2012/00/08/17 GET 
> /auth/v1.0/ HTTP/1.0 200 - - - - - - - - 0.0011
> 
> And here's the log from one of the Storage Nodes:
> 
> Oct 26 17:07:20 data05 container-server 192.168.1.111 - - 
> [27/Oct/2012:00:07:20 +0000] "PUT /sda6/200589/AUTH_system/myfiles1" 201 - 
> "tx063cd0643ef64d169b3bc5a86a808df4" "-" "-" 0.0510
> Oct 26 17:07:20 data05 account-server 192.168.1.204 - - [27/Oct/2012:00:07:20 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles1" 201 - 
> "tx063cd0643ef64d169b3bc5a86a808df4" "-" "-" 0.0154 ""
> Oct 26 17:07:20 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:20 
> +0000] "HEAD /sda6/36256/AUTH_system/myfiles1/home/lab/bigfile1" 404 - "-" 
> "tx6f1018e5da4d4974a85d544a0874a581" "-" 0.0015
> Oct 26 17:07:21 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:21 
> +0000] "PUT /sda6/36256/AUTH_system/myfiles1/home/lab/bigfile1" 201 - "-" 
> "tx59f931d3281d41159eba0e0c88c18107" "-" 1.0255
> Oct 26 17:07:21 data05 container-server 192.168.1.202 - - 
> [27/Oct/2012:00:07:21 +0000] "PUT 
> /sda6/200589/AUTH_system/myfiles1/home/lab/bigfile1" 201 - 
> "tx59f931d3281d41159eba0e0c88c18107" "-" "-" 0.0218
> Oct 26 17:07:24 data05 account-replicator Beginning replication run
> Oct 26 17:07:25 data05 account-replicator Replication run OVER
> Oct 26 17:07:25 data05 account-replicator Attempted to replicate 1 dbs in 
> 0.21688 seconds (4.61073/s)
> Oct 26 17:07:25 data05 account-replicator Removed 0 dbs
> Oct 26 17:07:25 data05 account-replicator 2 successes, 0 failures
> Oct 26 17:07:25 data05 account-replicator no_change:0 ts_repl:0 diff:2 
> rsync:0 diff_capped:0 hashmatch:0 empty:0
> Oct 26 17:07:28 data05 account-server 192.168.1.204 - - [27/Oct/2012:00:07:28 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles1" 201 - "-" "-" "-" 0.0021 ""
> Oct 26 17:07:29 data05 account-server 192.168.1.205 - - [27/Oct/2012:00:07:29 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles2" 201 - 
> "txbace9f92a41a4645beb9159a802fc12d" "-" "-" 0.0178 ""
> Oct 26 17:07:30 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:30 
> +0000] "PUT /sda6/63086/AUTH_system/myfiles2/home/lab/bigfile1" 201 - "-" 
> "txa4fac7ceab2948ae9010b4ce4cb56b97" "-" 0.7135
> Oct 26 17:07:33 data05 object-replicator Starting object replication pass.
> Oct 26 17:07:33 data05 object-replicator 4/4 (100.00%) partitions replicated 
> in 0.17s (23.71/sec, 0s remaining)
> Oct 26 17:07:33 data05 object-replicator 4 suffixes checked - 50.00% hashed, 
> 0.00% synced
> Oct 26 17:07:33 data05 object-replicator Partition times: max 0.1012s, min 
> 0.0164s, med 0.0273s
> Oct 26 17:07:33 data05 object-replicator Object replication complete. (0.00 
> minutes)
> Oct 26 17:07:34 data05 account-server 192.168.1.206 - - [27/Oct/2012:00:07:34 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles3" 201 - 
> "txa547450c156d4df9ad7e42c04dcabd8f" "-" "-" 0.0168 ""
> Oct 26 17:07:34 data05 container-server 192.168.1.111 - - 
> [27/Oct/2012:00:07:34 +0000] "PUT /sda6/118594/AUTH_system/myfiles3" 202 - 
> "txa547450c156d4df9ad7e42c04dcabd8f" "-" "-" 0.0399
> Oct 26 17:07:35 data05 container-server 192.168.1.206 - - 
> [27/Oct/2012:00:07:35 +0000] "PUT 
> /sda6/118594/AUTH_system/myfiles3/home/lab/bigfile1" 201 - 
> "tx6946419daba54efe9c2878f8a2a78f88" "-" "-" 0.0022
> Oct 26 17:07:35 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:35 
> +0000] "PUT /sda6/35436/AUTH_system/myfiles3/home/lab/bigfile1" 201 - "-" 
> "tx6946419daba54efe9c2878f8a2a78f88" "-" 0.6806
> Oct 26 17:07:36 data05 account-server 192.168.1.205 - - [27/Oct/2012:00:07:36 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles2" 201 - "-" "-" "-" 0.0021 ""
> Oct 26 17:07:37 data05 object-auditor Begin object audit "forever" mode (ZBF)
> Oct 26 17:07:37 data05 object-auditor Object audit (ZBF) "forever" mode 
> completed: 0.04s. Total quarantined: 0, Total errors: 0, Total files/sec: 
> 89.46 , Total bytes/sec: 0.00, Auditing time: 0.04, Rate: 0.95
> Oct 26 17:07:40 data05 account-server 192.168.1.204 - - [27/Oct/2012:00:07:40 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles4" 201 - 
> "txe095c94f8e5c4c4681a8d6a9c330bb7b" "-" "-" 0.0200 ""
> Oct 26 17:07:40 data05 container-server 192.168.1.111 - - 
> [27/Oct/2012:00:07:40 +0000] "PUT /sda6/217542/AUTH_system/myfiles4" 202 - 
> "txe095c94f8e5c4c4681a8d6a9c330bb7b" "-" "-" 0.0671
> Oct 26 17:07:40 data05 container-server 192.168.1.206 - - 
> [27/Oct/2012:00:07:40 +0000] "PUT 
> /sda6/217542/AUTH_system/myfiles4/home/lab/bigfile1" 201 - 
> "tx5f9659f74cb2491f9a63cbb84f680c5c" "-" "-" 0.0023
> Oct 26 17:07:40 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:40 
> +0000] "PUT /sda6/49697/AUTH_system/myfiles4/home/lab/bigfile1" 201 - "-" 
> "tx5f9659f74cb2491f9a63cbb84f680c5c" "-" 0.6038
> Oct 26 17:07:44 data05 container-updater Begin container update sweep
> Oct 26 17:07:44 data05 container-replicator Beginning replication run
> Oct 26 17:07:44 data05 account-server 192.168.1.206 - - [27/Oct/2012:00:07:44 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles1" 201 - "-" "-" "-" 0.0021 ""
> Oct 26 17:07:44 data05 account-server 192.168.1.206 - - [27/Oct/2012:00:07:44 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles3" 201 - "-" "-" "-" 0.0021 ""
> Oct 26 17:07:44 data05 account-server 192.168.1.206 - - [27/Oct/2012:00:07:44 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles4" 201 - "-" "-" "-" 0.0021 ""
> Oct 26 17:07:45 data05 account-server 192.168.1.203 - - [27/Oct/2012:00:07:45 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles2" 201 - "-" "-" "-" 0.0023 ""
> Oct 26 17:07:45 data05 account-server 192.168.1.203 - - [27/Oct/2012:00:07:45 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles1" 201 - "-" "-" "-" 0.0020 ""
> Oct 26 17:07:45 data05 container-replicator Replication run OVER
> Oct 26 17:07:45 data05 account-server 192.168.1.203 - - [27/Oct/2012:00:07:45 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles3" 201 - "-" "-" "-" 0.0223 ""
> Oct 26 17:07:45 data05 container-replicator Attempted to replicate 5 dbs in 
> 0.83001 seconds (6.02399/s)
> Oct 26 17:07:45 data05 container-replicator Removed 0 dbs
> Oct 26 17:07:45 data05 container-replicator 10 successes, 0 failures
> Oct 26 17:07:45 data05 container-replicator no_change:8 ts_repl:0 diff:2 
> rsync:0 diff_capped:0 hashmatch:0 empty:0
> Oct 26 17:07:45 data05 container-updater Container update sweep completed: 
> 0.98s
> Oct 26 17:07:46 data05 account-server 192.168.1.205 - - [27/Oct/2012:00:07:46 
> +0000] "PUT /sda6/116021/AUTH_system/myfiles5" 201 - 
> "tx7ba0264da03c485ca3c3b4048f6a3d8c" "-" "-" 0.0098 ""
> Oct 26 17:07:46 data05 container-server 192.168.1.111 - - 
> [27/Oct/2012:00:07:46 +0000] "PUT /sda6/122255/AUTH_system/myfiles5" 202 - 
> "tx7ba0264da03c485ca3c3b4048f6a3d8c" "-" "-" 0.0465
> Oct 26 17:07:46 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:46 
> +0000] "HEAD /sda6/150861/AUTH_system/myfiles5/home/lab/bigfile1" 404 - "-" 
> "txf08e8b91b2894a9da099f26b5300327b" "-" 0.0011
> Oct 26 17:07:46 data05 object-server 192.168.1.111 - - [27/Oct/2012:00:07:46 
> +0000] "PUT /sda6/150861/AUTH_system/myfiles5/home/lab/bigfile1" 507 - "-" 
> "tx8dc917a4a8c84c40a4429b7bab0323c6" "-" 0.0031
> 
> Right now, there is nothing special in the object-server.conf:
> 
> [DEFAULT]
> bind_ip = 192.168.1.206
> workers = 1
> 
> [pipeline:main]
> pipeline = object-server
> 
> [app:object-server]
> use = egg:swift#object
> 
> [object-replicator]
> 
> [object-updater]
> 
> [object-auditor]
> 
> [object-server]
> set log_level = DEBUG
> 
> My general question is: how do I troubleshoot this?   I've tried turning on 
> debug logging on the Object Server, but I don't get any additional clues.
> 
> The Object-servers do give a 507 error, which might indicate a disk problem, 
> but there is nothing wrong with the storage drive.   And also if there was a 
> fundamental drive problem then I wouldn't be able to upload objects in the 
> first place.
> 
> My best guess at the moment is that I should adjust the configuration of the 
> object servers to reflect their lower performance.   What would I adjust, the 
> cache size?   I'm not seeing any log messages that would indicate a time-out. 
>   
> 
> Thanks in advance for your advice.
> 
> -N
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

Reply via email to