HI Daan,
I am not sure exactly how to monkey-test cloudstack, what I did was to do
$ mvn test
which shows “BUILD SUCCESS”. I also did this:
$ mvn clean install -P systemvm,developer
which also succeeded. 

Is that what you mean? If not, please let me know what to do and I will further 
test it.
I will also work on assigning the proper reviewers now.

Thanks,
Ding

On Apr 3, 2014, at 12:04 PM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:

> thanks Ding,
> 
> I saw your update. Did your run a cloud with this code; i.e. did you
> monkey-test it?
> 
> On Thu, Apr 3, 2014 at 5:26 PM, Ding Yuan <y...@ece.utoronto.ca> wrote:
>> Oops, sorry I didn't publish my diff. I just published it on review board.
>> Thanks for the comment Daan! Please let me know if I further need to improve 
>> it.
>> 
>> Ding
>> 
>> On Apr 3, 2014, at 9:52 AM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
>> 
>>> Ding,
>>> 
>>> I think we can dare to do so in master as it will not see release for
>>> a while. We'll just have to be aware of the locations and be on alert
>>> for any stacktraces that will pass this list. I would not like to do
>>> this on the 4.4 branch even when it is an improvement of code quality
>>> as such. It might do things or prevent things from happening that we
>>> need done.
>>> 
>>> I don't see a new version of the diff in the review request. Did you
>>> 'Update' -> 'Upload Diff'?
>>> 
>>> regards,
>>> Daan
>>> 
>>> On Thu, Apr 3, 2014 at 12:34 AM, Ding Yuan <y...@ece.utoronto.ca> wrote:
>>>> Uploaded a new patch to 19917. Changed the verbosity to debug, and 
>>>> addressed
>>>> Daan's comment on providing more distinctive text messages.
>>>> 
>>>> Sorry that I haven't split them into smaller patches.
>>>> 
>>>> Note in a few cases the original code was like:
>>>>        try {
>>>>           pstmt = txn.prepareAutoCloseStatement(sql);
>>>>           String gmtCutTime =
>>>> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), cutTime);
>>>>           pstmt.setString(1, gmtCutTime);
>>>>           pstmt.setString(2, gmtCutTime);
>>>> 
>>>>           ResultSet rs = pstmt.executeQuery();
>>>>           while (rs.next()) {
>>>>               RunningHostCountInfo info = new RunningHostCountInfo();
>>>>               info.setDcId(rs.getLong(1));
>>>>               info.setHostType(rs.getString(2));
>>>>               info.setCount(rs.getInt(3));
>>>> 
>>>>               l.add(info);
>>>>                      }
>>>>                } catch (SQLException e) {
>>>>                } catch (Throwable e) {
>>>>                }
>>>> 
>>>> The try block only throws SQLException as checked exception, and this code
>>>> would also swallow any unchecked exceptions. I removed the catch 
>>>> (Throwable)
>>>> in these cases to avoid potentially swallowing any unexpected runtime
>>>> exceptions. Please let me know if this is not desirable so I can further
>>>> update.
>>>> 
>>>> Thanks,
>>>> Ding
>>>> 
>>>> On Apr 2, 2014, at 5:17 PM, Ding Yuan <y...@eecg.toronto.edu> wrote:
>>>> 
>>>> Thanks all for the quick comments!
>>>> If i understand the discussion correctly, I will just change all the added
>>>> log printing statements to debug verbosity. I will upload a new patch for
>>>> that shortly.
>>>> 
>>>> Now a bit back story: the reason we are doing this is that we recently did
>>>> an analysis on many bugs collected from JIRA to understand why today's 
>>>> cloud
>>>> system fails. And we found that almost all of the cluster-wide failures are
>>>> caused by buggy exception handling, which would often turn a component
>>>> failure into a cluster-wide one. One of the common bug pattern is ignoring
>>>> some exceptions -- allowing them to propagate and finally turn into 
>>>> disaster.
>>>> Therefore we built a simple static checking tool just to check some simple
>>>> rules for exception handling, such as if an exception is ignored.
>>>> Admittedly, it would be much harder to reason about the potential
>>>> consequences caused for ignoring a certain exception, that's why without
>>>> much more domain knowledge I can only recommend to 1) avoid over-catching 
>>>> an
>>>> exception, especially when the handling logic will swallow it, and 2) log
>>>> them, as what this patch does.
>>>> 
>>>> Nevertheless, it seems the four cases I mentioned in JIRA-6242 are
>>>> particularly suspicious. It might be worthwhile to double check their
>>>> correctness if you have time. I am reposting them below.
>>>> 
>>>> Thanks!
>>>> Ding
>>>> 
>>>> =========================
>>>> Case 1:
>>>> Line: 375, File: "com/cloud/network/ovs/OvsTunnelManagerImpl.java"
>>>> 
>>>> 326: try {
>>>> 327: String myIp = getGreEndpointIP(dest.getHost(), nw);
>>>> .. .. ..
>>>> 373: } catch (Exception e) {
>>>> 374: // I really thing we should do a better handling of these exceptions
>>>> 375: s_logger.warn("Ovs Tunnel network created tunnel failed", e);
>>>> 376: }
>>>> 
>>>> Comment seems to suggest for a better handling.
>>>> =========================
>>>> =========================
>>>> Case 2:
>>>> Line: 2295, File: "com/cloud/resource/ResourceManagerImpl.java"
>>>> 
>>>> 2284: try {
>>>> 2285: /*
>>>> 2286: * FIXME: this is a buggy logic, check with alex. Shouldn't
>>>> 2287: * return if propagation return non null
>>>> 2288: */
>>>> 2289: Boolean result = _clusterMgr.propagateResourceEvent(h.getId(),
>>>> ResourceState.Event.UpdatePassword);
>>>> 2290: if (result != null) {
>>>> 2291: return result;
>>>> 2292: }
>>>> 2293:
>>>> 2294: doUpdateHostPassword(h.getId());
>>>> 2295: } catch (AgentUnavailableException e) {
>>>> 2296: }
>>>> 
>>>> Seem from the comment the logic should be fixed.
>>>> A similar code snipeet is at:
>>>> Line: 2276, File: "com/cloud/resource/ResourceManagerImpl.java"
>>>> =========================
>>>> 
>>>> =========================
>>>> Case 3:
>>>> 
>>>> Line: 184, File:
>>>> "org/apache/cloudstack/api/command/user/autoscale/CreateAutoScaleVmGroupCmd.java"
>>>> 
>>>> 174: try
>>>> 175: {
>>>> 176: success = _autoScaleService.configureAutoScaleVmGroup(this);
>>>> 177: if (success) {
>>>> 178: vmGroup = _entityMgr.findById(AutoScaleVmGroup.class, getEntityId());
>>>> 179: AutoScaleVmGroupResponse responseObject =
>>>> _responseGenerator.createAutoScaleVmGroupResponse(vmGroup);
>>>> 180: setResponseObject(responseObject);
>>>> 181: responseObject.setResponseName(getCommandName());
>>>> 182: }
>>>> 183: } catch (Exception ex) {
>>>> 184: // TODO what will happen if Resource Layer fails in a step inbetween
>>>> 185: s_logger.warn("Failed to create autoscale vm group", ex);
>>>> 186: }
>>>> 
>>>> The comment, "TODO", seems to suggest for better handling.
>>>> =========================
>>>> =========================
>>>> Case 4:
>>>> 
>>>> Line: 222, File: "com/cloud/api/ApiDispatcher.java"
>>>> This snippet is moved to ParamProcessWorker.java in the trunk.
>>>> 
>>>> 203: try {
>>>> 204: setFieldValue(field, cmd, paramObj, parameterAnnotation);
>>>>    .. ..
>>>> 220: } catch (CloudRuntimeException cloudEx) {
>>>> 221: s_logger.error("CloudRuntimeException", cloudEx);
>>>> 222: // FIXME: Better error message? This only happens if the API command 
>>>> is
>>>> not executable, which typically
>>>> 223: //means
>>>> 224: // there was
>>>> 225: // and IllegalAccessException setting one of the parameters.
>>>> 226: throw new ServerApiException(ApiErrorCode.INTERNAL_ERROR, "Internal
>>>> error executing API command " + cmd.getCommandName().substring(0,
>>>> cmd.getCommandName().length() - 8));
>>>> 227: }
>>>> 
>>>> The "FIXME" comment seems to suggest for getter error message.
>>>> =========================
>>>> On Apr 2, 2014, at 4:37 PM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
>>>> 
>>>> Ding Yuan,
>>>> 
>>>> Any objections to that?
>>>> 
>>>> On Wed, Apr 2, 2014 at 10:32 PM, Alena Prokharchyk
>>>> <alena.prokharc...@citrix.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 4/2/14, 1:27 PM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
>>>> 
>>>> I think we agree indeed. Doesn't mean we should start this discuss
>>>> thread or write a arch guideline on the wiki somewhere. Maybe Ding
>>>> Yuan wants to do a preliminary version of it?
>>>> 
>>>> 
>>>> Wiki guide would be useful indeed.
>>>> 
>>>> 
>>>> 
>>>> In the meantime I don't think that it hurts for the present patch to
>>>> do everything in debug and decide about higher levels needed later.
>>>> 
>>>> 
>>>> Agree.
>>>> 
>>>> 
>>>> 
>>>> regards,
>>>> 
>>>> On Wed, Apr 2, 2014 at 10:11 PM, Alena Prokharchyk
>>>> <alena.prokharc...@citrix.com> wrote:
>>>> 
>>>> Daan,
>>>> 
>>>> Correct me if I¹m wrong, but all of the logging added by Ding, fall
>>>> under
>>>> "to go with it or to indicate passing a certain code path². I¹ve just
>>>> noticed that some of them were added with DEBUG, and some with WARN
>>>> level,
>>>> and wanted to correct that.
>>>> 
>>>> So we should:
>>>> 
>>>> 1) For sure: never print them out in WARN as there is nothing admin
>>>> should
>>>> do in this case, because the code just handles them by ignoring.
>>>> 2) Figure out what would be the correct level to log them with: INFO or
>>>> DEBUG
>>>> 
>>>> From ³Logging best practices² articles, I can see that people use INFO
>>>> as
>>>> a ³storyline² of normal application behavior, and DEBUG for sort of
>>>> information that helps you to track down the failure cases scenarios.
>>>> To
>>>> me, stuff added by Ding, falls under second category. But I might be
>>>> wrong
>>>> as I don¹t recall on the spot any discussions happening on the debug
>>>> topic, from the mailing list.
>>>> 
>>>> -Alena.
>>>> 
>>>> 
>>>> On 4/2/14, 12:57 PM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
>>>> 
>>>> Alena,
>>>> 
>>>> What I read in your comment is a description of INFO vs WARN. Debug
>>>> would be only for outputting stacktraces to go with it or to indicate
>>>> passing a certain code path.
>>>> 
>>>> Agree?
>>>> 
>>>> On Wed, Apr 2, 2014 at 8:31 PM, Alena Prokharchyk
>>>> <alena.prokharc...@citrix.com> wrote:
>>>> 
>>>> 
>>>> -----------------------------------------------------------
>>>> This is an automatically generated e-mail. To reply, visit:
>>>> https://reviews.apache.org/r/19917/#review39324
>>>> -----------------------------------------------------------
>>>> 
>>>> 
>>>> Is there a reason why logs for some exceptions are being logged in
>>>> DEBUG mode, and some in WARN? From my point of view, if the code only
>>>> catches it and doesn't error out, it should be logged in DEBUG. Lots of
>>>> Admins are seeking for WARN statements in the log, and they might be
>>>> confused seeing WARN w/o further failure or retry.
>>>> 
>>>> - Alena Prokharchyk
>>>> 
>>>> 
>>>> On April 2, 2014, 1:55 p.m., Ding Yuan wrote:
>>>> 
>>>> 
>>>> -----------------------------------------------------------
>>>> This is an automatically generated e-mail. To reply, visit:
>>>> https://reviews.apache.org/r/19917/
>>>> -----------------------------------------------------------
>>>> 
>>>> (Updated April 2, 2014, 1:55 p.m.)
>>>> 
>>>> 
>>>> Review request for cloudstack.
>>>> 
>>>> 
>>>> Repository: cloudstack-git
>>>> 
>>>> 
>>>> Description
>>>> -------
>>>> 
>>>> This is the patch for JIRA-6242. See
>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6242 for more
>>>> details.
>>>> Thanks!
>>>> 
>>>> 
>>>> Diffs
>>>> -----
>>>> 
>>>> 
>>>> engine/orchestration/src/com/cloud/agent/manager/AgentManagerImpl.java
>>>> 0d41bc1
>>>> 
>>>> engine/orchestration/src/com/cloud/agent/manager/ClusteredAgentManager
>>>> Im
>>>> pl.java 01508a4
>>>> 
>>>> engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java
>>>> 3e088db
>>>> 
>>>> engine/orchestration/src/org/apache/cloudstack/engine/datacenter/entit
>>>> y/
>>>> api/db/dao/EngineDataCenterDaoImpl.java 4b6818e
>>>> engine/schema/src/com/cloud/dc/dao/DataCenterDaoImpl.java ea5039f
>>>> engine/schema/src/com/cloud/host/dao/HostDaoImpl.java 426c90d
>>>> engine/schema/src/com/cloud/storage/dao/StoragePoolHostDaoImpl.java
>>>> e42eaf4
>>>> engine/schema/src/com/cloud/storage/dao/VMTemplateDaoImpl.java
>>>> 34fdca5
>>>> engine/schema/src/com/cloud/upgrade/dao/Upgrade2214to30.java
>>>> 58dd916
>>>> engine/schema/src/com/cloud/vm/dao/ConsoleProxyDaoImpl.java 5e9c2f0
>>>> engine/schema/src/com/cloud/vm/dao/SecondaryStorageVmDaoImpl.java
>>>> 1f382d6
>>>> 
>>>> engine/storage/src/org/apache/cloudstack/storage/datastore/DataObjectM
>>>> an
>>>> agerImpl.java 6ed1274
>>>> 
>>>> framework/ipc/src/org/apache/cloudstack/framework/serializer/OnwireCla
>>>> ss
>>>> Registry.java 83c8a42
>>>> 
>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/discoverer/XcpSer
>>>> ve
>>>> rDiscoverer.java 0ad6dc4
>>>> 
>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>> rC
>>>> onnectionPool.java b779085
>>>> 
>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>> rS
>>>> torageProcessor.java e512046
>>>> 
>>>> plugins/storage/volume/solidfire/src/org/apache/cloudstack/storage/dat
>>>> as
>>>> tore/lifecycle/SolidFirePrimaryDataStoreLifeCycle.java af6a77a
>>>> server/src/com/cloud/resource/ResourceManagerImpl.java f9a59ba
>>>> server/src/com/cloud/server/ConfigurationServerImpl.java b8da4c8
>>>> 
>>>> services/console-proxy/server/src/com/cloud/consoleproxy/ConsoleProxyT
>>>> hu
>>>> mbnailHandler.java 06f21d3
>>>> utils/src/com/cloud/utils/net/NetUtils.java 6350986
>>>> 
>>>> Diff: https://reviews.apache.org/r/19917/diff/
>>>> 
>>>> 
>>>> Testing
>>>> -------
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Ding Yuan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Daan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Daan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Daan
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Daan
>>> 
>> 
> 
> 
> 
> -- 
> Daan
> 

Reply via email to