GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/59
SPARK-1166: clean vpc_id if the group was just now created

Reported in https://spark-project.atlassian.net/browse/SPARK-1166

In some rare situations (when the newly created master_group and slave_group have a valid vpc_id), the user receives the following error when running the spark-ec2 script:

```
Setting up security groups...
ERROR:boto:400 Bad Request
ERROR:boto:<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Invalid value 'null' for protocol. VPC security group rules must specify protocols explicitly.</Message></Error></Errors><RequestID>fc56f0ba-915a-45b6-8555-05d4dd0f14ee</RequestID></Response>
Traceback (most recent call last):
  File "./spark_ec2.py", line 813, in <module>
    main()
  File "./spark_ec2.py", line 806, in main
    real_main()
  File "./spark_ec2.py", line 689, in real_main
    conn, opts, cluster_name)
  File "./spark_ec2.py", line 244, in launch_cluster
    slave_group.authorize(src_group=master_group)
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/securitygroup.py", line 184, in authorize
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/connection.py", line 2181, in authorize_security_group
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/connection.py", line 944, in get_status
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Invalid value 'null' for protocol. VPC security group rules must specify protocols explicitly.</Message></Error></Errors><RequestID>fc56f0ba-915a-45b6-8555-05d4dd0f14ee</RequestID></Response>
```

The related code in boto's SecurityGroup.authorize is as follows:

```
group_name = None
if not self.vpc_id:
    group_name = self.name
group_id = None
if self.vpc_id:
    group_id = self.id
src_group_name = None
src_group_owner_id = None
src_group_group_id = None
if src_group:
    cidr_ip = None
    src_group_owner_id = src_group.owner_id
    if not self.vpc_id:
        src_group_name = src_group.name
    else:
        if hasattr(src_group, 'group_id'):
            src_group_group_id = src_group.group_id
        else:
            src_group_group_id = src_group.id
status = self.connection.authorize_security_group(group_name,
                                                  src_group_name,
                                                  src_group_owner_id,
                                                  ip_protocol,
                                                  from_port,
                                                  to_port,
                                                  cidr_ip,
                                                  group_id,
                                                  src_group_group_id)
```

So when the security groups have just been created for a new cluster, we should clear their vpc_id on the user's behalf, so that boto builds the name-based authorization request instead of the VPC one.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CodingCat/spark SPARK-1166

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/59.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #59

----

commit ecf485c3cd31a475951b427eafbd1eaa1dfb71d5
Author: CodingCat <zhunans...@gmail.com>
Date: 2014-03-03T02:56:27Z

    clean vpc_id if the group was just now created

----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
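A minimal sketch of the idea described in the PR, assuming a get_or_make_group-style helper like the one in spark_ec2.py (the helper name and structure here are illustrative assumptions, not the PR's actual diff): if the group was just now created, its vpc_id is cleared so that boto's SecurityGroup.authorize() takes the group-name branch rather than the VPC branch, which requires an explicit ip_protocol.

```
def get_or_make_group(conn, name):
    """Return the EC2 security group with the given name, creating it if needed."""
    groups = conn.get_all_security_groups()
    existing = [g for g in groups if g.name == name]
    if existing:
        return existing[0]
    print("Creating security group " + name)
    group = conn.create_security_group(name, "Spark EC2 group")
    # The group was just now created: clear its vpc_id so that a later
    # group.authorize(src_group=...) builds the name-based request
    # (group_name / src_group_name) instead of the VPC request that
    # rejects a null protocol.
    group.vpc_id = None
    return group
```

Here conn would be a boto 2.x EC2 connection (e.g. from boto.ec2.connect_to_region()); whether the merged change clears vpc_id in exactly this place is an assumption.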