[ https://issues.apache.org/jira/browse/HIVE-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465976#comment-16465976 ]
Alan Gates commented on HIVE-19429:
-----------------------------------

{quote}How much memory does your machine have?
{quote}
256G

{quote}I could not find a way to get the test results for the failed test.
{quote}
Yeah, I have not gotten to that part yet. It should be easy enough to change the ResultsAnalyzer to grab that information as well. It may require revivifying the container to obtain the logs. It would be better, though, if we could teach the container to print the logs of failed tests so that "docker logs" automatically gets them in the first pass.

{quote}I think the existing batching logic is better than the one you have since we don't have to hardcode the directory names. The existing batching logic is much more customizable with regard to the batch sizes of individual CliDrivers.
{quote}
I don't like that I have the directory names etc. hard coded in the code. At the very least this should be in configuration. I have completely rewritten the MvnCommandFactory at least twice. Every time I tried to make it more general, though, it got insanely complicated. That leads me to the conclusion that rather than making this code much smarter, we should make the tests much simpler. We should not have to read two config files to figure out which qfiles to run with which tests. Ideally we could figure out a way to surface qfiles as individual tests rather than having them all buried in one test. I have some thoughts on how to achieve this, but it's longer term.

Also, I haven't found the flexibility of different batch sizes worth the effort. One size fits all isn't perfect but seems to be good enough.

{quote}I think it would be useful to run these containers in a cluster so that we can support multiple patches at a time to speed up the testing.
{quote}
Definitely. I happen to have a beefy machine handy, but that isn't the general case.
I designed it to support multiple container providers, so it should be easy to write a ContainerClient that supports Yarn or Kubernetes instead of simple Docker.

{quote}Also, not sure if there is a way to run a command on an existing docker container so that we can re-use deployed containers.
{quote}
I am not a Docker expert, but I think this is an anti-pattern. Spinning up a new container is very fast and very low cost. To reuse a container you either have to build it as a standing service that can keep taking requests (which is much more complex than running a simple test command), or turn the container into an image and then start a new container on that image (so you are starting a new container anyway). Both of these are much more heavyweight than just starting a new container. Occasionally we may be forced to restart the container to get information out of it (as in the case of getting logs from failed tests).

> Investigate alternative technologies like docker containers to increase parallelism
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-19429
>                 URL: https://issues.apache.org/jira/browse/HIVE-19429
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Vihang Karajgaonkar
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
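
The multi-provider design described in the comment could look roughly like the sketch below. Only the name ContainerClient comes from the comment. The method runInNewContainer, the classes ContainerResult, DockerContainerClient and BatchRunner, and the image name are hypothetical, and the real interface in the patch may differ. The Docker client here only renders a "docker run" command line rather than executing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical outcome of one container run; field names are illustrative. */
final class ContainerResult {
    final String containerName;
    final int exitCode;
    final String logs;

    ContainerResult(String containerName, int exitCode, String logs) {
        this.containerName = containerName;
        this.exitCode = exitCode;
        this.logs = logs;
    }
}

/** Assumed shape of the provider abstraction; the real interface may differ. */
interface ContainerClient {
    /** Starts a fresh container, runs the command in it, and returns the outcome. */
    ContainerResult runInNewContainer(String containerName, List<String> command);
}

/** Docker-flavored client that only renders the CLI call instead of executing it. */
final class DockerContainerClient implements ContainerClient {
    private final String image;

    DockerContainerClient(String image) {
        this.image = image;
    }

    /** Builds "docker run --name <name> <image> <command...>". */
    List<String> commandLine(String containerName, List<String> command) {
        List<String> cmd = new ArrayList<>(
            Arrays.asList("docker", "run", "--name", containerName, image));
        cmd.addAll(command);
        return cmd;
    }

    @Override
    public ContainerResult runInNewContainer(String containerName, List<String> command) {
        // A real client would launch this with ProcessBuilder and capture the output.
        String rendered = String.join(" ", commandLine(containerName, command));
        return new ContainerResult(containerName, 0, "would run: " + rendered);
    }
}

/** Provider-agnostic driver: one new container per batch of tests. */
final class BatchRunner {
    private final ContainerClient client;

    BatchRunner(ContainerClient client) {
        this.client = client;
    }

    List<ContainerResult> runAll(List<List<String>> batchCommands) {
        List<ContainerResult> results = new ArrayList<>();
        for (int i = 0; i < batchCommands.size(); i++) {
            results.add(client.runInNewContainer("batch-" + i, batchCommands.get(i)));
        }
        return results;
    }
}

final class ContainerClientDemo {
    public static void main(String[] args) {
        // "example-test-image" is a placeholder image name, not one from the patch.
        BatchRunner runner = new BatchRunner(new DockerContainerClient("example-test-image"));
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("mvn", "test", "-Dtest=TestCliDriver"),
            Arrays.asList("mvn", "test", "-Dtest=TestNegativeCliDriver"));
        for (ContainerResult result : runner.runAll(batches)) {
            System.out.println(result.containerName + ": " + result.logs);
        }
    }
}
```

Under these assumptions, a Yarn or Kubernetes provider would implement the same interface and BatchRunner would stay unchanged. Starting one new container per batch also matches the point above that reusing containers is not worth the added complexity.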