On Tuesday, August 21, 2018 at 12:05:09 AM UTC+2, zachar...@puppet.com 
wrote:
>
> Hi Steve, 
>
> The query you posted looks like the inactive_nodes CTE that's used as a 
> default filter which strips inactive or deactivated nodes from the 
> response. As far as I know it should show up on any query and I don't think 
> it's related to gc in this case. 
>
>

Hi Zachary,

Thanks for the response,

So after a bit of analysis we realise querying for a fact  by value_string 
was super slow . It's gin index
so while LIKEs were fast but =s were less good 125ms vs 6s


puppetdb> explain analyze select * from facts where value_string LIKE 
'nachodev6.cern.ch';
+------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                
                                                               |
|------------------------------------------------------------------------------------------------------------------------------------------|
| Bitmap Heap Scan on facts  (cost=1018.68..15873.81 rows=3958 width=129) 
(actual time=124.456..124.469 rows=8 loops=1)                    |
|   Recheck Cond: (value_string ~~ 'nachodev6.cern.ch'::text)              
                                                                |
|   Heap Blocks: exact=5                                                    
                                                               |
|   ->  Bitmap Index Scan on facts_value_string_trgm  (cost=0.00..1017.69 
rows=3958 width=0) (actual time=124.435..124.435 rows=8 loops=1) |
|         Index Cond: (value_string ~~ 'nachodev6.cern.ch'::text)          
                                                                |
| Planning time: 0.170 ms                                                  
                                                                |
| Execution time: 124.627 ms                                                
                                                               |
+------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.145s
puppetdb> explain analyze select * from facts where value_string = 
'nachodev6.cern.ch';
+---------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                
                                    |
|---------------------------------------------------------------------------------------------------------------|
| Seq Scan on facts  (cost=0.00..935197.30 rows=3958 width=129) (actual 
time=5466.492..6886.163 rows=8 loops=1) |
|   Filter: (value_string = 'nachodev6.cern.ch'::text)                      
                                    |
|   Rows Removed by Filter: 27237747                                        
                                    |
| Planning time: 0.129 ms                                                  
                                     |
| Execution time: 6886.208 ms                                              
                                     |
+---------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 6.904s (6 seconds)
puppetdb> 


We have this case a lot for queries like 

query_facts( 'hostgroup_0="dark" and hostgroup_1="web"', 
['ipaddress','ipaddress6']) 


and we could not figure a way to make them use LIKE.


So we added an index which made a hugely massive difference:


CREATE INDEX concurrently facts_value_string_idx ON facts(value_string);

and now we no longer doing full table scans for the above.

Attached is our CPU drop on the database as well as the latency drop on the 
facts endpoint of puppetdb 4.

Our puppetdb is now very happy rather than very sad.

We do now have a limit of 8191 on fact value but we can live with that.



Steve.
 

> On Monday, August 20, 2018 at 5:03:28 AM UTC-7, Steve Traylen wrote:
>>
>>
>>
>> Hi,
>>
>> We recently upgraded to puppetdb 4.4.
>>
>> There is query that takes a while to run, 3 or 4 minutes though we are 
>> unsure of why it is even running.
>>
>> The query below looks related to some kind of clean up or garbage 
>> collection however this particular puppetdb node has two relevant 
>> properties:
>>
>> * gc-interval is set to 0 and indeed there are no gc events in the logs.
>> * This particular node only receives /pdb/query requests and no /pdb/cmd 
>> requests. We have always and still do dedicate nodes to command and query 
>> traffic by redirection at haproxy level.
>>
>> What is the action that triggers the query below. 
>>
>> WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated 
>> IS NOT NULL OR expired IS NOT NULL)) SELECT fs.certname AS certnam e, 
>> fp.name AS name, f.value AS value FROM factsets fs INNER JOIN facts f ON 
>> fs.id = f.factset_id INNER JOIN fact_paths fp ON f.fact_path_id = fp.id 
>> INNER JOIN value_types vt ON vt.id = f .value_type_id LEFT JOIN 
>> environments env ON fs.environment_id = env.id WHERE (fp.depth = 0 AND 
>> (((((fs.certname) in ( (SELECT fs.certname AS certname FROM factsets fs 
>> INNER JOIN facts f ON fs.id = f.factset_id INNER JOIN fact_paths fp ON 
>> f.fact_path_id = fp.id INNER JOIN value_types vt ON f.value_type_id = 
>> vt.id LEFT JOIN environments env ON fs.environment_id = env.id WHERE (
>> vt.id <> 5 AND ((fp.path = $1) AND (f.value_string = $2)))) ) ) AND 
>> ((fs.certname) in ( (SELECT fs.certname AS certname FROM factsets fs INNER 
>> JOIN facts f ON fs.id = f.factset_ id INNER JOIN fact_paths fp ON 
>> f.fact_path_id = fp.id INNER JOIN value_types vt ON f.value_type_id = 
>> vt.id LEFT JOIN environments env ON fs.environment_id = env.id WHERE (
>> vt.id <> 5 AND ((fp.path = $3) AND (f.value_string = $4)))) ) )) AND ((
>> fp.name = $5) OR (fp.name = $6))) AND NOT ((fs.certname) in ( (SELECT 
>> inactive_nodes.certname AS certname FROM inactive_nodes) ) )))
>>
>> Steve Traylen.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-users/c92ff0b9-7779-4d5e-88b6-93f10eb525b0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to