Hi,

I'm reading the code for vacuum/analyze, and it looks like we currently call vacuum_rel/analyze_rel once for each relation specified. This means that if a relation is specified more than once, we simply vacuum/analyze it that many times. Do we gain any advantage by vacuuming/analyzing a relation back-to-back within a single command? I strongly feel we do not.

I'm thinking we could do a simple optimization here by transforming VACUUM/ANALYZE commands as follows:

1) VACUUM t1, t2, t1, t2, t1;
   transform to --> VACUUM t1, t2;

2) VACUUM ANALYZE t1(a1), t2(a2), t1(b1), t2(b2), t1(c1);
   transform to --> VACUUM ANALYZE t1(a1, b1, c1), t2(a2, b2);

3) ANALYZE t1, t2, t1, t2, t1;
   transform to --> ANALYZE t1, t2;

4) ANALYZE t1(a1), t2(a2), t1(b1), t2(b2), t1(c1);
   transform to --> ANALYZE t1(a1, b1, c1), t2(a2, b2);
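To make the proposed transformation concrete, here is a minimal sketch of the merge step in Python. It is only an illustration, not PostgreSQL code: the function name merge_relations and the (relname, columns) representation are hypothetical, relations are matched by name only (a real implementation would probably need to compare resolved relation OIDs, since different spellings can name the same table), and treating a bare "t1" as covering any column list for t1 is my assumption about how the mixed case should behave.

```python
def merge_relations(targets):
    """Collapse duplicate relations in a VACUUM/ANALYZE target list.

    targets: list of (relname, columns) pairs in command order, where
    columns is a list of column names, or None when no column list was
    given (meaning the whole table). Hypothetical representation.
    """
    merged = {}  # dicts keep insertion order, so first-mention order is preserved
    for relname, columns in targets:
        if relname not in merged:
            # First mention: copy the column list so the input is not mutated.
            merged[relname] = None if columns is None else list(columns)
            continue
        existing = merged[relname]
        if existing is None or columns is None:
            # Assumption: a whole-table mention subsumes any column list.
            merged[relname] = None
        else:
            # Union of the column lists, keeping first-seen order.
            for col in columns:
                if col not in existing:
                    existing.append(col)
    return list(merged.items())
```

With this sketch, the target list from case (2) or (4), [("t1", ["a1"]), ("t2", ["a2"]), ("t1", ["b1"]), ("t2", ["b2"]), ("t1", ["c1"])], collapses to [("t1", ["a1", "b1", "c1"]), ("t2", ["a2", "b2"])], and the list from case (1) or (3) collapses to [("t1", None), ("t2", None)].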
The use cases above may look rather unrealistic, and we could consider rejecting cases (1) and (3) with an error, but cases (2) and (4) seem quite possible in customer scenarios (though I'm not certain of that). Please feel free to add any other use cases you can think of.

The main advantage of this optimization is that the commands can become a bit faster, because we avoid processing the same relation more than once.

I would like to hear opinions on this. I'm not sure whether this optimization has already been considered and rejected for some specific reason. If so, it would be great if someone could point me to those discussions. It could also be that I'm misunderstanding the current vacuum/analyze code; feel free to correct me.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com