From: drum.lu...@gmail.com <mailto:drum.lu...@gmail.com>   Sent: Tuesday, May 
03, 2016 2:55 AM
I'm trying to improve the performance of the query below, but I just don't know 
what else I can do...

Please have a look and let me know if you can help somehow. Also, if you 
need some extra data, just ask me please.

* Note that the gorfs.inode_segments table is 1.7 TB in size.

I have the following Query:

explain analyze
SELECT split_part(full_path, '/', 4)::INT AS account_id,
       split_part(full_path, '/', 6)::INT AS note_id,
       split_part(full_path, '/', 9)::TEXT AS variation,
       st_size,
       segment_index,
       reverse(split_part(reverse(full_path), '/', 1)) as file_name,
       i.st_ino,
       full_path,
       (i.st_size / 1000000::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
FROM gorfs.inodes i
JOIN gorfs.inode_segments s
  ON i.st_ino = s.st_ino_target
WHERE
      i.checksum_md5 IS NOT NULL
  AND s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'
  AND i.st_size > 0
  AND split_part(s.full_path, '/', 4)::INT IN (
        SELECT account.id
        FROM public.ja_clients AS account
        WHERE NOT (
                ((account.last_sub_pay > EXTRACT('epoch' FROM
                  (transaction_timestamp() - CAST('4 Months' AS INTERVAL)))) AND
                 (account.price_model > 0)) OR
                (account.regdate > EXTRACT('epoch' FROM
                  (transaction_timestamp() - CAST('3 Month' AS INTERVAL)))) OR
                (((account.price_model = 0) AND (account.jobcredits > 0)) AND
                 (account.last_login > EXTRACT('epoch' FROM
                  (transaction_timestamp() - CAST('4 Month' AS INTERVAL)))))
        ) LIMIT 100
      );
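One thing that may be worth trying before any restructuring (a sketch, assuming every row's fourth path segment really is numeric; the index names are hypothetical) is a pair of indexes so the planner doesn't have to recompute the split_part() cast and the regex against every row of the 1.7 TB table:

```sql
-- Expression index matching the split_part(...)::INT filter in the query,
-- so the IN (...) subquery condition can use an index lookup.
-- Note: this will fail to build if any row has a non-numeric 4th segment.
CREATE INDEX CONCURRENTLY inode_segments_account_idx
    ON gorfs.inode_segments ((split_part(full_path, '/', 4)::INT));

-- A text_pattern_ops index can serve the anchored prefix of the regex
-- ('^/userfiles/account/...') as an ordinary prefix search.
CREATE INDEX CONCURRENTLY inode_segments_full_path_idx
    ON gorfs.inode_segments (full_path text_pattern_ops);
```

CREATE INDEX CONCURRENTLY avoids taking a long write lock on a table this large, at the cost of a slower build.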

 

There is one obvious solution: restructure your data. It is not in a 
"standard" form, but you're trying to query it as if it were. You are turning 
your long full_path string into columns on every execution; if performance is a 
concern, that overhead has to be eliminated.

Your two choices would be either to restructure this table directly (requiring 
a change in the app code that fills it), or to use it to fill a proper table 
that already has everything decomposed from the long full_path string, via 
post-processing after the insert.  A third consideration would be to archive 
older/unneeded rows off to a history table to reduce row counts.  This is about 
proper structure.
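A minimal sketch of the second option, using the column names from the original query (the new table name and column choices are assumptions, not a prescribed schema):

```sql
-- Hypothetical decomposed table: the path components become real columns,
-- computed once at load time instead of parsed on every query.
CREATE TABLE gorfs.inode_segments_decomposed (
    st_ino_target BIGINT   NOT NULL,
    segment_index SMALLINT NOT NULL,
    account_id    INT,
    note_id       INT,
    variation     TEXT,
    file_name     TEXT,
    full_path     TEXT     NOT NULL
);

-- One-time (or periodic) post-processing fill from the existing table:
INSERT INTO gorfs.inode_segments_decomposed
SELECT st_ino_target,
       segment_index,
       split_part(full_path, '/', 4)::INT,
       split_part(full_path, '/', 6)::INT,
       split_part(full_path, '/', 9),
       reverse(split_part(reverse(full_path), '/', 1)),
       full_path
FROM gorfs.inode_segments
WHERE full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+';
```

With plain INT columns in place, account_id can be indexed directly and the join against public.ja_clients no longer pays the per-row string-parsing cost.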

Mike Sofen
