Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Belmiro Moreira
Hi, with the Ocata upgrade we decided to run local placements (one service per cellV1) because we were nervous about possible scalability issues and especially about the increase in scheduling time. Fortunately, this has now been addressed by the placement-req-filter work. We slowly started to aggregate our

Re: [Openstack-operators] Custom libvirt fragment for instance type?

2018-01-16 Thread Belmiro Moreira
Hi Jonathan, this was introduced in Pike. Belmiro On Tue, 16 Jan 2018 at 22:48, Jonathan Proulx wrote: > On Tue, Jan 16, 2018 at 03:49:25PM -0500, Jonathan Proulx wrote: > :On Tue, Jan 16, 2018 at 08:42:00PM +, Tim Bell wrote: > ::If you want to hide the VM signature, you can use the > img_

Re: [Openstack-operators] reminder: ops meetups team meeting on #openstack-operators

2017-10-24 Thread Belmiro Moreira
Hi, can we add this meeting into the official IRC meetings page? https://wiki.openstack.org/wiki/Meetings http://eavesdrop.openstack.org/ thanks, Belmiro On Tue, 24 Oct 2017 at 15:51, Chris Morgan wrote: > Next meeting in about 10 minutes from now > > Chris > > -- > Chris Morgan > __

Re: [Openstack-operators] Libvirt CPU map (host-model)

2017-10-10 Thread Belmiro Moreira
elps some, > > Thanks, > Paul Browne > > [1] https://pastebin.com/JshWi6i3 > [2] https://pastebin.com/5b8cAanP > [3] https://bugzilla.redhat.com/show_bug.cgi?id=1495171 > > On 9 October 2017 at 04:59, Belmiro Moreira gmail.com> wrote: > >> Hi, >> the CPU m

[Openstack-operators] Libvirt CPU map (host-model)

2017-10-08 Thread Belmiro Moreira
Hi, the CPU model that we expose to the guest VMs varies depending on the compute node use case. We use "cpu_mode=host-passthrough" for the compute nodes that run batch processing VMs and "cpu_mode=host-model" for the compute nodes that run service VMs. The reason to have "cpu_mode=host-model" is because

Re: [Openstack-operators] [nova] Should we allow passing new user_data during rebuild?

2017-10-04 Thread Belmiro Moreira
In our cloud, rebuild is the only way for a user to keep the same IP. Unfortunately, we don't offer floating IPs yet. Also, we use the user_data to bootstrap some actions in new instances (puppet, ...). Considering all the use-cases for rebuild, it would be great if the user_data could be updated at r

Re: [Openstack-operators] [openstack-dev] [nova] Queens PTG recap - everything else

2017-09-19 Thread Belmiro Moreira
Hi Matt, thanks for these great summaries. I didn't find any mention of nested quotas. Were they discussed at the PTG, and what can we expect for Queens? thanks, Belmiro CERN On Mon, Sep 18, 2017 at 11:58 PM, Matt Riedemann wrote: > There was a whole lot of other stuff discussed at the PTG. The d

Re: [Openstack-operators] [nova][scheduler] Anyone relying on the host_subset_size config option?

2017-05-28 Thread Belmiro Moreira
This option is useful in large deployments. Our scheduler strategy is to "pack"; however, we are not interested in applying this strategy per individual compute node but per set of nodes. One of the advantages is that when a user creates consecutive instances in the same AVZ it's unlikely that they will be
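As a rough illustration of the behaviour discussed in this thread, the sketch below models a scheduler that sorts the weighed hosts best-first and then picks one at random from the top host_subset_size entries. It is a simplified model under that assumption, not nova's scheduler code; the function name, host names, and weights are hypothetical.

```python
import random


def select_host(weighed_hosts, host_subset_size=1, rng=random):
    """Pick one host at random from the best `host_subset_size` candidates.

    `weighed_hosts` is a list of (host_name, weight) tuples; a higher
    weight means a better candidate. Hypothetical helper for illustration,
    not nova's implementation.
    """
    if not weighed_hosts:
        raise ValueError("no candidate hosts")
    # Sort best-first, then clamp the subset size to a valid range.
    ranked = sorted(weighed_hosts, key=lambda hw: hw[1], reverse=True)
    size = max(1, min(host_subset_size, len(ranked)))
    return rng.choice(ranked[:size])[0]


# Hypothetical weights for four compute nodes.
hosts = [("node-a", 0.9), ("node-b", 0.8), ("node-c", 0.2), ("node-d", 0.1)]
```

With host_subset_size=1 the best host is always returned; with a larger value, consecutive requests are spread over the top candidates, which is the "pack per set of nodes" effect described above.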

Re: [Openstack-operators] [openstack-dev] [keystone][nova][cinder][glance][neutron][horizon][policy] defining admin-ness

2017-05-26 Thread Belmiro Moreira
Hi, thanks for bringing this up for discussion on the Operators list. Options 1 and 2 are not complementary but completely different. So, considering "Option 2" and the goal to target it for Queens, I would prefer not to go through a migration path in Pike and then again in Queens. Belmiro On Fri, May 2

Re: [Openstack-operators] [openstack-dev] [Nova] [Cells] Stupid question: Cells v2 & AZs

2017-05-23 Thread Belmiro Moreira
Hi David, AVZs are basically aggregates. In cells_v2 aggregates are defined in the cell_api, so it will be possible to have multiple AVZs per cell and AVZs that spread between different cells. Belmiro On Wed, May 24, 2017 at 5:14 AM, David Medberry wrote: > Hi Devs and Implementers, > > A quest

Re: [Openstack-operators] [openstack-dev] [nova] Boston Forum session recap - searchlight integration

2017-05-22 Thread Belmiro Moreira
Hi Matt, if by "incomplete results" you mean retrieving the instance UUIDs (in the cell_api) for the cells that failed to answer, I would prefer to have incomplete results rather than a failed operation. Belmiro On Mon, May 22, 2017 at 11:39 AM, Matthew Booth wrote: > On 19 May 2017 at 20:07, Mike Baye

Re: [Openstack-operators] Successful nova-network to Neutron Migration

2017-05-20 Thread Belmiro Moreira
Hi Joe, congrats. Can you also make your script changes for IPv6 available? The more the better for any site that is still working on the migration, like us :) thanks, Belmiro On Sat, May 20, 2017 at 6:51 PM, Joe Topjian wrote: > Hi all, > > There probably aren't a lot of people in this situa

Re: [Openstack-operators] nova mitaka->newton online_data_migrations

2017-03-07 Thread Belmiro Moreira
Hi Saverio, when not using "max_count" the message "Running batches of 50 until complete" is always printed. If you are not getting any errors and there is no more output, the migrations should have finished. Unfortunately, there is no "All Done" message when the online_data_migrations finish. You can

Re: [Openstack-operators] [nova] FYI: live_migration_progress_timeout will default to 0 and be deprecated in Ocata

2017-02-09 Thread Belmiro Moreira
+1 We use "block-migration" and we needed to disable this timeout. Belmiro CERN On Thu, Feb 9, 2017 at 5:29 PM, Matt Riedemann wrote: > This is just a heads up to anyone running with this since Liberty, there > is a patch [1] that will go into Ocata which deprecates the > live_migration_progres

Re: [Openstack-operators] [openstack-dev] [nova] How to expose the compute node local disks to instances

2016-12-09 Thread Belmiro Moreira
he LVM driver to > use that pool of space to present volumes to your compute instances. > > Thanks, > Sean > > On Thu, Dec 08, 2016 at 07:46:35PM +0100, Belmiro Moreira wrote: > > Hi, > > > > we have a set of disk servers (JBOD) that we would like to integrate in

[Openstack-operators] [openstack-dev] [nova] How to expose the compute node local disks to instances

2016-12-08 Thread Belmiro Moreira
Hi, we have a set of disk servers (JBOD) that we would like to integrate into our cloud to run applications like Hadoop and Spark. Using file disks for storage and a huge "/var/lib/nova" is not an option for these use cases so we would like to expose the local drives directly to the VMs as ephe

Re: [Openstack-operators] How to tune scheduling for "Insufficient compute resources" (race conditions ?)

2016-11-30 Thread Belmiro Moreira
How many nova-schedulers are you running? You can hit this issue when multiple nova-schedulers select the same compute node for different instances. Belmiro On Wed, Nov 30, 2016 at 3:56 PM, Massimo Sgaravatto < massimo.sgarava...@gmail.com> wrote: > Hi all > > I have a problem with scheduling in

Re: [Openstack-operators] How do you handle purge of database tables ?

2016-06-23 Thread Belmiro Moreira
Hi, we wrote this blog post a year ago but it can still be useful depending on the OpenStack version that you are running. http://openstack-in-production.blogspot.ch/2015/05/purging-nova-databases-in-cell.html Belmiro On Thu, Jun 23, 2016 at 3:32 PM, Nick Jones wrote: > > On Thu, 2016-06-23 at

Re: [Openstack-operators] Duplicates and confusion in nova policy.json files

2016-06-16 Thread Belmiro Moreira
Hi Sam, can you describe the issues that you experienced with the policy files, and how you fixed them? In my test environment for the upgrade to Liberty, using the policy files from Kilo, I'm also seeing some problems. For example, operations that are not allowed by policy (e.g. resize)

Re: [Openstack-operators] [openstack-dev] [glance] Proposal for a mid-cycle virtual sync on operator issues

2016-06-01 Thread Belmiro Moreira
Eastern) is > quite difficult. If there's strong interest from San Jose, we may have > to settle for a rather awkward choice below: > > > > http://www.timeanddate.com/worldclock/meetingdetails.html?year=2016&month=6&day=9&hour=4&min=0&sec=0&p1=881&am

Re: [Openstack-operators] [openstack-dev] [glance] Proposal for a mid-cycle virtual sync on operator issues

2016-05-31 Thread Belmiro Moreira
Hi Nikhil, I'm interested in this discussion. Initially you were proposing Thursday June 9th, 2016 at 2000UTC. Are you also suggesting changing the date? I ask because the new timeanddate suggestions show 6/7 of June. Belmiro On Tue, May 31, 2016 at 6:13 PM, Nikhil Komawar wrote: > Hey, > > > Than

Re: [Openstack-operators] Display all nova (or other service) configuration parameters

2016-05-08 Thread Belmiro Moreira
When a service starts with the log level configured to debug, you can see which options/values it is using. Belmiro On Sat, May 7, 2016 at 10:17 PM, Sergio Cuellar Valdes < scuell...@kionetworks.com> wrote: > Hi everybody, > > How can you display all the values that has nova or other service and that

Re: [Openstack-operators] [nova][neutron] What are your cells networking use cases?

2016-02-26 Thread Belmiro Moreira
Hi, thanks Carl for the info about the DHCP plans. Our DHCP concern is that currently the DHCP agent needs to be assigned to a network and then it creates a port for each subnet. In our infrastructure we only consider a network with several hundred subnets. By default the DHCP agent runs in the net

[Openstack-operators] Operators Meetup Manchester - sessions schedule available?

2016-01-15 Thread Belmiro Moreira
Hi, for the Ops Meetup in Manchester we have the following etherpad: https://etherpad.openstack.org/p/MAN-ops-meetup but is there any schedule already available for the sessions? thanks, Belmiro ___ OpenStack-operators mailing list OpenStack-operators@li

Re: [Openstack-operators] [openstack-dev] [nova][cells] Should flavors in the API DB for cells v2 be soft-deletable?

2016-01-08 Thread Belmiro Moreira
IMHO I think it's a great way to fix the URI problem. +1 Belmiro On Fri, Jan 8, 2016 at 3:23 PM, Sylvain Bauza wrote: > > > Le 08/01/2016 15:10, Andrew Laski a écrit : > >> On 01/08/16 at 12:43pm, John Garbutt wrote: >> >>> On 7 January 2016 at 19:59, Matt Riedemann >>> wrote: >>> There i

Re: [Openstack-operators] Confirm resize in Kilo with Cells

2015-11-23 Thread Belmiro Moreira
", "flavorid", "vcpu_weight", "id"], "nova_object.name": "Flavor", "nova_object.data": {"root_gb": 20, "name": "m1.small", "ephemeral_gb": 0, "memory_mb": 2048, "vcpus": 1, "e

Re: [Openstack-operators] Confirm resize in Kilo with Cells

2015-11-23 Thread Belmiro Moreira
Hi Mathieu, thanks for the related bugs. But I'm observing this on 2015.1.1. On Sun, Nov 22, 2015 at 12:58 AM, Mathieu Gagné wrote: > On 2015-11-21 4:47 PM, Belmiro Moreira wrote: > > Hi, > > We are about to upgrade nova to kilo using cells and we noticed > > the resiz

[Openstack-operators] Confirm resize in Kilo with Cells

2015-11-21 Thread Belmiro Moreira
Hi, We are about to upgrade nova to kilo using cells and we noticed the resize/migrate functionality is not working properly. The instance is correctly resized/migrated but fails to “confirm resize” with the following trace: 2015-11-21 22:40:49.804 26786 ERROR nova.api.openstack.wsgi [req-67f6a22

Re: [Openstack-operators] [Scale][Performance] / compute_nodes ratio experience

2015-11-18 Thread Belmiro Moreira
Hi, we are still running nova Juno and I don't see this performance issue. (I can comment on Kilo next week). Per cell, we have a node that runs conductor + other control plane services. The number of conductor workers varies between 16 and 48. We try not to have more than 200 compute nodes per

Re: [Openstack-operators] Running mixed stuff Juno & Kilo , Was: cinder-api with rbd driver ignores ceph.conf

2015-11-18 Thread Belmiro Moreira
Hi Saverio, we always upgrade one component at a time. Cinder was one of the first components that we upgraded to kilo, meaning that other components (glance, nova, ...) were running Juno. We didn't have any problem with this setup. Belmiro CERN On Tue, Nov 17, 2015 at 6:01 PM, Saverio Proto wr

Re: [Openstack-operators] Informal Ops Meetup?

2015-10-29 Thread Belmiro Moreira
+1 Belmiro On Thursday, 29 October 2015, Kris G. Lindgren wrote: > We seem to have enough interest… so meeting time will be at 10am in the > Prince room (if we get an actual room I will send an update). > > Does anyone have any ideas about what they want to talk about? I am > pretty much op

Re: [Openstack-operators] [Large Deployments Team] Updated Agenda for Later Today

2015-06-22 Thread Belmiro Moreira
Hi, just added our use-cases/patches to the etherpad. Belmiro On Fri, Jun 19, 2015 at 11:09 PM, Kris G. Lindgren wrote: > Mike added our use case to the etherpad [1] today. I talked it over > with Carl Baldwin and he seemed ok with the format. If you guys want to > add your uses cases to th

[Openstack-operators] Random NUMA cell selection can leave NUMA cells unused

2015-06-05 Thread Belmiro Moreira
Hi, I would like to draw your attention to the bug https://bugs.launchpad.net/nova/+bug/1461777 since it can impact the efficiency of your cloud. It affects Juno and Kilo deployments. Belmiro ___ OpenStack-operators mailing list OpenStack-operators@lis

Re: [Openstack-operators] max_age and until_refresh for fixing Nova quotas

2015-03-21 Thread Belmiro Moreira
Hi, I just posted on our operations blog how CERN is dealing with the quota synchronization problem. http://openstack-in-production.blogspot.fr/2015/03/nova-quota-usage-synchronization.html Hope it helps, cheers, Belmiro On Sat, Mar 21, 2015 at 12:55 AM, Sam Morrison wrote: > I’d need to go thr

[Openstack-operators] [openstack-operators] [openstack-dev] [nova] Nova options as instance metadata

2015-03-04 Thread Belmiro Moreira
Hi, in nova there are several options that can be defined in the flavor (extra specs) and/or as image properties. This is great; however, to deploy some of these options we will need to offer the same image with different properties or let the users upload the same image with the right properties. It

Re: [Openstack-operators] How to handle updates of public images?

2015-02-05 Thread Belmiro Moreira
We don't delete public images from Glance because it breaks migrate/resize and block live migration. Not tested with upstream Kilo, though. As a consequence, our public image list has been growing over time... In order to manage image releases we use "glance image properties" to tag them. Some rele

Re: [Openstack-operators] Deprecation of in tree EC2 API in Nova for Kilo release

2015-01-29 Thread Belmiro Moreira
I completely agree with Tim and Daniel. Also, deprecating the nova EC2 API without having the community engaged with the new stackforge “EC2 standalone service” can lead to no EC2 support at all. On Thu, Jan 29, 2015 at 4:46 AM, Saju M wrote: > I think, new EC2 API also uses EC2 API in the nova fo

Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-16 Thread Belmiro Moreira
Hi, we had similar issues. In our case, sometimes (not really a pattern here!) nova-compute didn't consume messages even if everything was apparently happy. We started monitoring the queue sizes and restarting nova-compute. We are still using "python-oslo-messaging-1.3.0.2", however the problem d

[Openstack-operators] New services disable reason

2015-01-14 Thread Belmiro Moreira
Hi, as operators I would like to have your comments/suggestions on: https://review.openstack.org/#/c/136645/1 With a large number of nodes, several services are disabled for various reasons (in our case mainly hardware interventions). To help operations we use the "disable reason" as a fast filt

Re: [Openstack-operators] How to remove a Zone from openstack

2015-01-09 Thread Belmiro Moreira
Hi Alex, you need to create "host aggregates" to define other availability zones. For more info see: http://docs.openstack.org/havana/config-reference/content/host-aggregates.html The default availability zone can be changed with the configuration option: default_availability_zone Belmiro On Fri

Re: [Openstack-operators] [nova] Specifying a list of tenants for scheduler filtering

2014-12-20 Thread Belmiro Moreira
Hi Mustafa, the filter doesn't expect a comma-delimited list. It supports a tenant per aggregate. Belmiro On Fri, Dec 19, 2014 at 11:56 PM, Mustafa Jamil wrote: > > The AggregateMultiTenancyIsolation nova scheduler filter is supposed to > check whether a tenant requesting nova to schedule a VM i
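A simplified sketch of why a comma-delimited value does not behave as a list here, assuming the filter compares the requesting tenant ID against the aggregate's filter_tenant_id metadata value as a whole string. This is an illustrative model, not the filter's source; the function name and tenant IDs are hypothetical.

```python
def host_passes(aggregate_metadata, tenant_id):
    """Return True if a host in this aggregate may serve `tenant_id`.

    `aggregate_metadata` is a dict of the aggregate's metadata. A host
    whose aggregate has no `filter_tenant_id` key is open to every tenant.
    Hypothetical helper for illustration only.
    """
    allowed = aggregate_metadata.get("filter_tenant_id")
    if allowed is None:
        return True
    # Whole-string comparison: the value is treated as one tenant ID,
    # so "tenant-a,tenant-b" matches neither tenant-a nor tenant-b.
    return tenant_id == allowed
```

Under this assumption, an aggregate tagged with "tenant-a" accepts tenant-a, while one tagged with "tenant-a,tenant-b" rejects both tenants, which matches the one-tenant-per-aggregate behaviour described in the reply.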

Re: [Openstack-operators] DB archive deleted rows

2014-10-02 Thread Belmiro Moreira
Ks) > > Simon. > > -- > Simon McCartney > "If not me, who? If not now, when?" > +447710836915 > > On 2 October 2014 at 15:18:32, Belmiro Moreira ( > moreira.belmiro.email.li...@gmail.com) wrote: > > Hi, > our nova DBs are growing rapidly and it

[Openstack-operators] DB archive deleted rows

2014-10-02 Thread Belmiro Moreira
Hi, our nova DBs are growing rapidly and it's time to start pruning them... I'm trying "archive deleted rows"; however, it is not working and I'm getting the following warning in the logs: "IntegrityError detected when archiving table" Searching for this problem I found the bug " https://bugs.lau

Re: [Openstack-operators] DB sync (Havana -> Icehouse). Unknown column 'instances.ephemeral_key_uuid'

2014-10-02 Thread Belmiro Moreira
Thanks Jonathan. I'm still trying to understand what is going on... On Tue, Sep 30, 2014 at 4:29 PM, Jonathan Proulx wrote: > On Tue, Sep 30, 2014 at 9:07 AM, Belmiro Moreira > wrote: > > Hi, > > I'm testing the nova DB sync (Havana -> Icehouse) in a copy o

[Openstack-operators] DB sync (Havana -> Icehouse). Unknown column 'instances.ephemeral_key_uuid'

2014-09-30 Thread Belmiro Moreira
Hi, I'm testing the nova DB sync (Havana -> Icehouse) in a copy of my production databases and I'm getting the following "CRITICAL" in nova-manage log for the 230-> 231 migration. Looking at this particular migration it creates a new column " instances.ephemeral_key_uuid" however it's complaining t

Re: [Openstack-operators] limit num instance-type per host

2014-09-24 Thread Belmiro Moreira
Hi, one possible solution to "mitigate" this problem is to stack instances instead of spreading them. This means that the scheduler will select the "most used" compute node, maximizing in this way the space for large instances. For that you need to set the "ram_weight_multiplier" option with a negative n
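A minimal sketch of the stacking effect, assuming a RAM weigher that scores each host as the multiplier times its free RAM normalised to the 0..1 range, with the highest score winning. This is a simplified model rather than nova's weigher code; the function name, host names, and RAM figures are hypothetical, and a real scheduler would first filter out hosts that cannot fit the instance.

```python
def pick_host(free_ram_mb, ram_weight_multiplier=1.0):
    """Return the host with the highest weight.

    `free_ram_mb` maps host name -> free RAM in MB. A positive multiplier
    favours the emptiest host (spreading); a negative one favours the
    fullest host (stacking). Hypothetical helper for illustration.
    """
    if not free_ram_mb:
        raise ValueError("no candidate hosts")
    lo, hi = min(free_ram_mb.values()), max(free_ram_mb.values())
    span = (hi - lo) or 1  # avoid division by zero when all hosts are equal

    def weight(host):
        normalized = (free_ram_mb[host] - lo) / span
        return ram_weight_multiplier * normalized

    return max(free_ram_mb, key=weight)


# Hypothetical free RAM per compute node, in MB.
free = {"node-a": 64000, "node-b": 8000, "node-c": 32000}
```

Here pick_host(free, 1.0) returns "node-a" (the emptiest node), while pick_host(free, -1.0) returns "node-b" (the fullest node), leaving the large free blocks on the other nodes available for big instances.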

Re: [Openstack-operators] InstanceInfoCacheNotFound: Info cache for instance X could not be found.

2014-09-21 Thread Belmiro Moreira
Yes, it was merged in Icehouse however it wasn't in time for 2014.1.2 On Thu, Sep 18, 2014 at 8:47 PM, Alex Leonhardt wrote: > Thanks for for ding this! Not sure but I guess this got into icehouse? > > Alex > On 18 Sep 2014 17:19, "Belmiro Moreira" < > more

Re: [Openstack-operators] InstanceInfoCacheNotFound: Info cache for instance X could not be found.

2014-09-18 Thread Belmiro Moreira
ve been some API endpoint > setting that was missing or wrong. Possible to post your nova config so I > can compare to what we use right now ? > > Alex > > On 17 September 2014 15:23, Belmiro Moreira < > moreira.belmiro.email.li...@gmail.com> wrote: > >> Hi, >> I&

Re: [Openstack-operators] InstanceInfoCacheNotFound: Info cache for instance X could not be found.

2014-09-17 Thread Belmiro Moreira
Hi, I'm observing exactly the same problem. But in my case it is happening every time a VM is deleted. I'm using icehouse. Any idea? regards, Belmiro On Fri, Jul 18, 2014 at 10:22 AM, Alex Leonhardt wrote: > Hi All, > > I keep seeing this in the logs when deleting an instance ( and it takes