Traffic Type : public gone when using SG and advanced network

2013-10-15 Thread Bjoern Teipel

Since no user has been able to answer my question, maybe a dev knows...

The traffic type "public" is gone when using SG and advanced networking in 4.2.
How am I supposed to use the CloudStack load balancing (VR) in this
constellation?


Bjoern


4.2.1 package issues

2013-10-17 Thread Bjoern Teipel

Hi all,

Did anybody encounter issues during RPM packaging?
I get jar file read errors and I can't tell where they are coming from.
It seems Maven tries to download those jars during the compile from an
invalid URL, because the downloaded files just contain a web page:


Internet2 Shibboleth Project has moved

How can I fix the dependency URLs?

Bjoern

[INFO] 


[INFO] BUILD FAILURE
[INFO] 


[INFO] Total time: 44:39.903s
[INFO] Finished at: Thu Oct 17 10:12:50 PDT 2013
[INFO] Final Memory: 57M/393M
[INFO] 

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile 
(default-compile) on project cloud-awsapi: Compilation failure: 
Compilation failure:
[ERROR] error: error reading 
/home/bteipel/.m2/repository/org/apache/axis2/mex/1.5.4/mex-1.5.4-impl.jar; 
error in opening zip file
[ERROR] error: error reading 
/home/bteipel/.m2/repository/org/apache/axis2/axis2-mtompolicy/1.5.4/axis2-mtompolicy-1.5.4.jar; 
error in opening zip file
[ERROR] error: error reading 
/home/bteipel/.m2/repository/org/apache/ws/commons/axiom/axiom-dom/1.2.10/axiom-dom-1.2.10.jar; 
error in opening zip file
[ERROR] error: error reading 
/home/bteipel/.m2/repository/org/opensaml/opensaml1/1.1/opensaml1-1.1.jar; 
error in opening zip file
[ERROR] error: error reading 
/home/bteipel/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar; 
error in opening zip file

[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the 
-e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, 
please read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

[ERROR]
[ERROR] After correcting the problems, you can resume the build with the 
command

[ERROR]   mvn  -rf :cloud-awsapi
error: Bad exit status from /var/tmp/rpm-tmp.4NppiB (%build)
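
The "error in opening zip file" messages suggest that the listed files under ~/.m2/repository are not valid zip archives, which fits the observation that they contain a web page. Below is a minimal Python sketch, assuming the local repository lives in the default ~/.m2/repository location, that lists such files so they can be deleted and fetched again. The function names are made up for illustration, and this does not fix the repository URL itself.

```python
import zipfile
from pathlib import Path


def find_corrupt_jars(repo_root):
    """Return .jar files under repo_root that are not readable zip archives."""
    corrupt = []
    for jar in sorted(Path(repo_root).rglob("*.jar")):
        # A jar is a zip archive; an HTML page saved under a .jar name fails
        # this check, which matches the "error in opening zip file" message.
        if not zipfile.is_zipfile(str(jar)):
            corrupt.append(jar)
    return corrupt


def remove_corrupt_jars(repo_root, dry_run=True):
    """List corrupt jars and delete them only when dry_run is False."""
    corrupt = find_corrupt_jars(repo_root)
    for jar in corrupt:
        if not dry_run:
            jar.unlink()  # the artifact is then missing from the local repository
    return corrupt
```

Calling remove_corrupt_jars(Path.home() / ".m2" / "repository") only reports the files; passing dry_run=False deletes them, after which the build can be re-run so the artifacts are fetched again.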




Urgent : Network stuck in implementing state

2014-01-26 Thread Bjoern Teipel
Hi guys,

I had trouble while creating a new VLAN and now I can't delete it
anymore, because it's stuck in the Implementing state.
I guess that happened after the VR did not come up and I restarted the
management server so I could delete the VR and all addresses.

If I spun up a guest, the VR and the guest would come up, but I still
want to get rid of the network since I have an error in the network service
offering. After I cleaned everything up through the GUI, I can see that an
external IP of the old VR is still associated with the VLAN, but it can't be
deleted (the GUI doesn't offer me the option).

In the logs I found this and the NPE matches the table content :



2014-01-26 13:53:02,857 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ])
Network id=212 is destroyed successfully, cleaning up corresponding
resources now.
2014-01-26 13:53:02,871 DEBUG [network.guru.DirectNetworkGuru]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ])
Releasing ip 10.16.48.1 of placeholder nic Nic[189-null-null-10.16.48.1]
2014-01-26 13:53:02,872 DEBUG [db.Transaction.Transaction]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ])
Rolling back the transaction: Time = 14 Name =
 -AsyncJobManagerImpl$1.run:494-Executors$RunnableAdapter.call:471-FutureTask.run:262-ThreadPoolExecutor.runWorker:1145-ThreadPoolExecutor$Worker.run:615-Thread.run:744;
called by
-Transaction.rollback:897-Transaction.removeUpTo:840-Transaction.close:664-TransactionContextBuilder.interceptException:63-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:133-NetworkManagerImpl.destroyNetwork:3144-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125-NetworkServiceImpl.deleteNetwork:1767-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125-DeleteNetworkCmd.execute:70-ApiDispatcher.dispatch:158-AsyncJobManagerImpl$1.run:531
2014-01-26 13:53:02,877 ERROR [cloud.async.AsyncJobManagerImpl]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ])
Unexpected exception while executing
org.apache.cloudstack.api.command.user.network.DeleteNetworkCmd
java.lang.NullPointerException
at com.cloud.network.guru.DirectNetworkGuru.trash(DirectNetworkGuru.java:311)
at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at com.cloud.network.NetworkManagerImpl.destroyNetwork(NetworkManagerImpl.java:3144)
at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at com.cloud.network.NetworkServiceImpl.deleteNetwork(NetworkServiceImpl.java:1767)
at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at org.apache.cloudstack.api.command.user.network.DeleteNetworkCmd.execute(DeleteNetworkCmd.java:70)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-01-26 13:53:02,880 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ])
Complete async job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ],
jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: null
2014-01-26 13:53:02,888 DEBUG [cloud.async.SyncQueueManagerImpl]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ]) Sync
queue (6) is currently empty
2014-01-26 13:53:02,889 WARN  [cloud.async.AsyncJobManagerImpl]
(Job-Executor-2:job-1127 = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ]) Unable
to unregister active job [ 1127 ] = [ f90b84ab-0d4a-45ce-9cd1-d73b63e24020 ]
from JMX monitoring


mysql> select * from nics  where id = 189\G
*** 1. row ***
            id: 189
          uuid: 623d0af3-7ca5-4700-89a2-26252abdc054
   instance_id: NULL
   mac_address: NULL
   ip4_address: 10.16.48.1
       netmask: NULL
       gateway: NULL
       ip_type: NULL
 broadcast_uri: NULL
    network_id: 212
          mode: NULL
         state: Reserved
      strategy: PlaceHolder
 reserver_name: NULL
reservation_id: NULL
     device_id: 0
   update_time: 2014-01-24 16:22:56
 isolation_uri: NULL
   ip6_address: NULL
   default_nic: 0
       vm_type: DomainRouter
       created: 2014-01-25 00:22:56
       removed: NULL
   ip6_gateway: NULL
      ip6_cidr: NULL
  secondary_ip: 0
   display_nic: 1
1

Re: Urgent : Network stuck in implementing state

2014-01-30 Thread Bjoern Teipel
Thanks, I got one step further and the NPE no longer comes up, but I guess I
need to change the overall network state?

2014-01-30 15:23:16,159 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-67:job-1218 = [ 7de934e0-2b42-476b-83da-cf03109331fa ])
Network is not implemented: Ntwk[212|Guest|16]
2014-01-30 15:23:16,160 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-67:job-1218 = [ 7de934e0-2b42-476b-83da-cf03109331fa ])
Network is not not in the correct state to be destroyed: Implementing
2014-01-30 15:23:16,164 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-67:job-1218 = [ 7de934e0-2b42-476b-83da-cf03109331fa ])
Complete async job-1218 = [ 7de934e0-2b42-476b-83da-cf03109331fa ],
jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Failed
to delete network



On Mon, Jan 27, 2014 at 10:57 PM, Sanjeev Neelarapu <
sanjeev.neelar...@citrix.com> wrote:

> Hi Bjoern,
>
> Since the nic's instance_id is NULL, you can set the removed field to
> some timestamp (e.g. now()). Also check the state of the external IP address
> in the user_ip_address table. If that IP address is in the Allocated state,
> set it to NULL and try to delete the network again.
>
> -Sanjeev
>
> -Original Message-
> From: Bjoern Teipel [mailto:bjoern.tei...@gmail.com]
> Sent: Monday, January 27, 2014 12:44 PM
> To: users; dev@cloudstack.apache.org
> Subject: Urgent : Network stuck in implementing state
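
Sanjeev's suggestion can be tried out on a toy database first. The following Python sketch uses an in-memory SQLite database rather than the real CloudStack MySQL schema. The reduced table layouts, the network_id column on user_ip_address, the 'Allocated' state string and the function names are all assumptions made for illustration, so the statements would need to be checked against the real tables (and a database backup taken) before anything similar is run there.

```python
import sqlite3


def soft_delete_placeholder_nic(conn, nic_id):
    """Set removed on a NIC, but only if no instance is attached to it."""
    cur = conn.execute(
        "UPDATE nics SET removed = CURRENT_TIMESTAMP "
        "WHERE id = ? AND instance_id IS NULL AND removed IS NULL",
        (nic_id,),
    )
    return cur.rowcount  # number of rows changed


def clear_allocated_ips(conn, network_id):
    """Set state to NULL for IPs of this network that are still 'Allocated'."""
    cur = conn.execute(
        "UPDATE user_ip_address SET state = NULL "
        "WHERE network_id = ? AND state = 'Allocated'",
        (network_id,),
    )
    return cur.rowcount


def make_toy_db():
    """Build a toy database holding only the columns the advice touches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER, "
        "network_id INTEGER, removed TEXT)"
    )
    conn.execute(
        "CREATE TABLE user_ip_address (id INTEGER PRIMARY KEY, "
        "network_id INTEGER, state TEXT)"
    )
    # Values taken from the thread: placeholder nic 189 on network 212.
    conn.execute("INSERT INTO nics VALUES (189, NULL, 212, NULL)")
    # Hypothetical leftover IP row; the id and state are assumptions.
    conn.execute("INSERT INTO user_ip_address VALUES (1, 212, 'Allocated')")
    return conn
```

Both helpers return the number of rows they changed, so a result of 0 means the WHERE guard did not match and nothing was touched.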

CLOUDSTACK-5859 [HA] Shared storage failure results in reboot loop; VMs with Local storage brought offline

2014-03-26 Thread Bjoern Teipel
Hi folks,

In light of CloudStack issue #5859, I would like to know what the
intention was for the kvmheartbeat.sh script, which ultimately can
reboot (fence) machines.
I was in the unfortunate position of finding 10 hypervisors in an unknown
state after the NFS volume became unresponsive while CMAN/CLVM and
fenced were running, and everything went south. A reboot by
this script caused the hypervisor to fence, and after over 50% of the
cluster nodes had left, the cluster was in a state without quorum.
I personally can't see why anyone would want a rebooting hypervisor,
especially if other storage pools like CLVM or local were serving
fine and would have increased the availability of those VMs.

Usually you fix your NFS, storage or network problem and reboot the
affected VMs.

Thanks,
Bjoern
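
For reference, the quorum arithmetic behind this can be sketched in Python, assuming a simple-majority rule with one vote per node. Cluster configurations can change this (for example with weighted votes or a quorum disk), and the function names are made up for illustration.

```python
def has_quorum(total_votes, present_votes):
    """Simple majority: strictly more than half of all votes must be present."""
    return present_votes > total_votes // 2


def nodes_that_may_leave(total_votes):
    """How many single-vote nodes can leave before quorum is lost."""
    quorum = total_votes // 2 + 1  # smallest vote count that is a majority
    return total_votes - quorum
```

Under these assumptions, and taking a 10-node cluster purely as an illustrative size, nodes_that_may_leave(10) gives 4, so quorum is lost once a fifth node reboots.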


4.2.1: CLVM to CLVM migrate storage not successfull

2014-05-28 Thread Bjoern Teipel
Hi devs,

I think I have found a bug affecting primary storage migration when
the source and destination volumes are both on CLVM storage.
So far I could follow the trail:

- secondary storage is mounted on the source host with a new template id
- cp is issued to copy the /dev/volgroup/uuid CLVM LV to the NFS mount from
the sec storage
- sec storage is mounted on the destination host
- an LV is created on the destination CLVM
- qemu-img convert is launched to migrate the sec storage disk to the CLVM LV,
but it fails with

2014-05-28 19:54:07,421 ERROR [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) Failed to convert
/mnt/b3a9edfb-2ce7-3c96-ae77-78131bffaa4f/99bddb1b-13aa-4559-aac2-080d5c9576fc.qcow2
to /dev/vg_vmem-003a74_vdi5/3811e2e5-9ef8-4825-9b84-7f5c6fdf87fd the
error was: qemu-img: Could not open
'/mnt/b3a9edfb-2ce7-3c96-ae77-78131bffaa4f/99bddb1b-13aa-4559-aac2-080d5c9576fc.qcow2'qemu-img:
Could not open 
'/mnt/b3a9edfb-2ce7-3c96-ae77-78131bffaa4f/99bddb1b-13aa-4559-aac2-080d5c9576fc.qcow2'

At this point I have no clue what might be wrong. All storage is
available and has enough free space. Help would be appreciated; I can
also file a bug if it turns out to be one.

When I do the same conversion from CLVM to an NFS volume on the same hosts,
all operations complete, with the exception of another bug: RAW
volumes get converted to qcow2 and Windows VMs suddenly can't boot
anymore, so I have to convert those back to RAW.

Thanks for any help,
Bjoern
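
Since the second step copies an LV with cp to a file that is later opened under a .qcow2 name, one thing that could be checked on the destination host is whether the file exists at the path qemu-img is given and whether its header looks like a qcow image at all. This is a minimal Python sketch, not a diagnosis: the function name and return labels are hypothetical, and the only format fact relied on is that qcow and qcow2 files start with the four bytes "QFI\xfb".

```python
import os

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a qcow/qcow2 image


def probe_image(path):
    """Return 'missing', 'unreadable', 'qcow' or 'not-qcow' for an image path."""
    if not os.path.exists(path):
        return "missing"
    try:
        with open(path, "rb") as f:
            header = f.read(4)
    except OSError:
        return "unreadable"
    # Anything without the magic bytes (for example a raw copy of an LV)
    # is reported as 'not-qcow', whatever its file extension says.
    return "qcow" if header == QCOW_MAGIC else "not-qcow"
```

Running probe_image on the /mnt/... path from the error message, on the host where qemu-img ran, would distinguish a mount or path problem ('missing', 'unreadable') from a file whose content does not match its extension ('not-qcow').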


CloudStack log:

2014-05-28 19:38:46,952 DEBUG [cloud.async.AsyncJobManagerImpl]
(catalina-exec-1:null) submit async job-3613 = [
68bcafc5-9db8-4de9-bcee-e174d6982ab8 ], details: AsyncJobVO {id:3613,
userId: 3, accountId: 3, sessionKey: null, instanceType: None,
instanceId: null, cmd:
org.apache.cloudstack.api.command.admin.vm.MigrateVMCmd,
cmdOriginator: null, cmdInfo:
{"response":"json","sessionkey":"zG/kiTfpBAgaaNNmCz7tjQRJ7qs\u003d","virtualmachineid":"7d2ae1c9-f306-4050-b53c-deec39d90f8a","cmdEventType":"VM.MIGRATE","ctxUserId":"3","storageid":"f451646b-9e04-42ab-9787-41e4dd5e90c6","httpmethod":"GET","_":"1401331288946","projectid":"a98b95ee-1eba-44f1-a714-00254ec068c5","ctxAccountId":"3","ctxStartEventId":"14750"},
cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0,
processStatus: 0, resultCode: 0, result: null, initMsid: 110493003717,
completeMsid: null, lastUpdated: null, lastPolled: null, created:
null}
2014-05-28 19:38:46,953 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
Executing org.apache.cloudstack.api.command.admin.vm.MigrateVMCmd for
job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ]
2014-05-28 19:38:46,983 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
VM state transitted from :Stopped to Migrating with event:
StorageMigrationRequestedvm's original host id: 7 new host id: null
host id before state transition: null
2014-05-28 19:38:47,006 DEBUG
[storage.motion.AncientDataMotionStrategy] (Job-Executor-65:job-3613 =
[ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ]) copyAsync inspecting src
type VOLUME copyAsync inspecting dest type VOLUME
2014-05-28 19:38:47,009 DEBUG
[cache.allocator.StorageCacheRandomAllocator]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
Can't find staging storage in zone: 1
2014-05-28 19:38:47,035 DEBUG [agent.transport.Request]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
Seq 35-134884895: Sending  { Cmd , MgmtId: 110493003717, via: 35, Ver:
v1, Flags: 100111,
[{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"59c7ca2e-dbd1-4d09-a9a2-8c4e28f773f8","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"da7702e3-9629-4a5d-908d-d0d31eb9ba18","id":11,"poolType":"CLVM","host":"localhost","path":"/vg_vmem-003a74_vdi2","port":0}},"name":"ROOT-106","size":42949672960,"path":"c31934cc-bc60-460f-9a36-d7c9b6eb466b","volumeId":163,"vmName":"i-7-106-VM","accountId":7,"format":"QCOW2","id":163,"hypervisorType":"KVM"}},"destTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"59c7ca2e-dbd1-4d09-a9a2-8c4e28f773f8","volumeType":"ROOT","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://xx/vol/cloud_sec_1","_role":"Image"}},"name":"ROOT-106","size":42949672960,"path":"volumes/7/163","volumeId":163,"vmName":"i-7-106-VM","accountId":7,"format":"QCOW2","id":163,"hypervisorType":"KVM"}},"executeInSequence":true,"wait":10800}}]
}
2014-05-28 19:54:06,689 DEBUG [agent.transport.Request]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
Seq 35-134884895: Received:  { Ans: , MgmtId: 110493003717, via: 35,
Ver: v1, Flags: 110, { CopyCmdAnswer } }
2014-05-28 19:54:06,706 DEBUG [agent.transport.Request]
(Job-Executor-65:job-3613 = [ 68bcafc5-9db8-4de9-bcee-e174d6982ab8 ])
Seq 45-2103117064: Sending  { Cmd , MgmtId: 1104