KaiGai

On Tue, Nov 19, 2013 at 9:41 AM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:

> Thanks for your review.
>
> 2013/11/19 Jim Mlodgenski <jimm...@gmail.com>:
> > My initial review on this feature:
> > - The patches apply and build, but it produces a warning:
> > ctidscan.c: In function ‘CTidInitCustomScanPlan’:
> > ctidscan.c:362:9: warning: unused variable ‘scan_relid’ [-Wunused-variable]
> >
> This variable was only used in the Assert() macro, so it causes a warning
> if you don't pass --enable-cassert to the configure script.
> Anyway, I adjusted the code to check the relid of RelOptInfo directly.


>
The warning is now gone.
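
For anyone reading along, the pattern behind that warning can be sketched outside PostgreSQL. This is only an illustration under assumptions: it uses the standard assert() as a stand-in for Assert(), and DemoRelOptInfo and demo_init_plan are hypothetical names, not code from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the planner's RelOptInfo; only relid matters here. */
typedef struct DemoRelOptInfo
{
    unsigned int relid;
} DemoRelOptInfo;

/*
 * Warning-prone shape (shown as a comment only):
 *
 *     unsigned int scan_relid = rel->relid;
 *     assert(scan_relid > 0);
 *
 * When assertions are compiled out, nothing reads scan_relid any more,
 * so the compiler reports it as an unused variable.
 */

/* Fixed shape: the assertion reads the field directly, so no local is left over. */
static int
demo_init_plan(const DemoRelOptInfo *rel)
{
    assert(rel->relid > 0);
    return (int) rel->relid;
}
```

With this shape the build is warning-free whether or not assertions are enabled, because rel is still read by the return statement.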


> > I'd recommend that you split the part1 patch containing the ctidscan
> > contrib into its own patch. It is more than half of the patch and it
> > certainly stands on its own. IMO, ctidscan fits a very specific use case
> > and would be better off being an extension instead of in contrib.
> >
> OK, I split them off. Part 1 is the custom-scan API itself, part 2 is the
> ctidscan portion, and part 3 is the remote join on postgres_fdw.
>

Attached is a patch for the documentation. I think the documentation still
needs a little more work, but it is pretty close. I can add some more
detail to it once I finish adapting the hadoop_fdw to use the custom scan
API and have a better understanding of all of the calls.


> Thanks,
> --
> KaiGai Kohei <kai...@kaigai.gr.jp>
>
*** a/doc/src/sgml/custom-scan.sgml	2013-11-18 17:50:02.652039003 -0500
--- b/doc/src/sgml/custom-scan.sgml	2013-11-22 09:09:13.624254649 -0500
***************
*** 8,47 ****
    <secondary>handler for</secondary>
   </indexterm>
   <para>
!   Custom-scan API enables extension to provide alternative ways to scan or
!   join relations, being fully integrated with cost based optimizer,
!   in addition to the built-in implementation.
!   It consists of a set of callbacks, with a unique name, to be invoked during
!   query planning and execution. Custom-scan provider should implement these
!   callback functions according to the expectation of API.
   </para>
   <para>
!   Overall, here is four major jobs that custom-scan provider should implement.
!   The first one is registration of custom-scan provider itself. Usually, it
!   shall be done once at <literal>_PG_init()</literal> entrypoint on module
!   loading.
!   The other three jobs shall be done for each query planning and execution.
!   The second one is submission of candidate paths to scan or join relations,
!   with an adequate cost, for the core planner.
!   Then, planner shall chooses a cheapest path from all the candidates.
!   If custom path survived, the planner kicks the third job; construction of
!   <literal>CustomScan</literal> plan node, being located within query plan
!   tree instead of the built-in plan node.
!   The last one is execution of its implementation in answer to invocations
!   by the core executor.
   </para>
   <para>
!   Some of contrib module utilize the custom-scan API. It may be able to
!   provide a good example for new development.
    <variablelist>
     <varlistentry>
      <term><xref linkend="ctidscan"></term>
      <listitem>
       <para>
!       Its logic enables to skip earlier pages or terminate scan prior to
!       end of the relation, if inequality operator on <literal>ctid</literal>
!       system column can narrow down the scope to be scanned, instead of
!       the sequential scan that reads a relation from the head to the end.
       </para>
      </listitem>
     </varlistentry>
--- 8,46 ----
    <secondary>handler for</secondary>
   </indexterm>
   <para>
!   The custom-scan API enables an extension to provide alternative ways to scan
!   or join relations, fully integrated with the cost-based optimizer. The API
!   consists of a set of callbacks, with a unique name, to be invoked during query
!   planning and execution. A custom-scan provider should implement these callback
!   functions according to the expectations of the API.
   </para>
   <para>
!   Overall, there are four major tasks that a custom-scan provider should
!   implement. The first task is the registration of the custom-scan provider
!   itself. Usually, this needs to be done once at the <literal>_PG_init()</literal>
!   entrypoint when the module is loaded. The remaining three tasks are all done
!   while a query is being planned and executed. The second task is the submission
!   of candidate paths to either scan or join relations, with an adequate cost,
!   to the core planner. Then, the planner chooses the cheapest path from all of
!   the candidates. If the custom path survives, the planner starts the third
!   task: construction of a <literal>CustomScan</literal> plan node, located
!   within the query plan tree instead of the built-in plan node. The last task
!   is the execution of its implementation in answer to invocations by the core
!   executor.
   </para>
   <para>
!   Some contrib modules utilize the custom-scan API. They may provide good
!   examples for new development.
    <variablelist>
     <varlistentry>
      <term><xref linkend="ctidscan"></term>
      <listitem>
       <para>
!       The custom scan in this module enables a scan to skip earlier pages or
!       terminate prior to the end of the relation, if an inequality operator on
!       the <literal>ctid</literal> system column can narrow down the scope to be
!       scanned, instead of a sequential scan which reads a relation from the
!       head to the end.
       </para>
      </listitem>
     </varlistentry>
***************
*** 49,70 ****
      <term><xref linkend="postgres-fdw"></term>
      <listitem>
       <para>
!       Its logic replaces a local join of foreign tables managed by
!       <literal>postgres_fdw</literal> with a custom scan that fetches
!       remotely joined relations.
!       It shows the way to implement a custom scan node that performs
!       instead join nodes.
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </para>
   <para>
!   Right now, only scan and join are supported to have fully integrated cost
!   based query optimization performing on custom scan API.
!   You might be able to implement other stuff, like sort or aggregation, with
!   manipulation of the planned tree, however, extension has to be responsible
!   to handle this replacement correctly. Here is no support by the core.
   </para>
  
   <sect1 id="custom-scan-spec">
--- 48,68 ----
      <term><xref linkend="postgres-fdw"></term>
      <listitem>
       <para>
!       The custom scan in this module replaces a local join of foreign tables
!       managed by <literal>postgres_fdw</literal> with a scan that fetches
!       remotely joined relations. It demonstrates the way to implement a custom
!       scan node that runs in place of join nodes.
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </para>
   <para>
!   Currently, only scan and join are fully supported with integrated cost-based
!   query optimization using the custom scan API. You might be able to implement
!   other operations, like sort or aggregation, by manipulating the planned
!   tree; however, the extension is then responsible for handling this
!   replacement correctly. There is no support in the core.
   </para>
  
   <sect1 id="custom-scan-spec">
***************
*** 72,80 ****
    <sect2 id="custom-scan-register">
     <title>Registration of custom scan provider</title>
     <para>
!     The first job for custom scan provider is registration of a set of
!     callbacks with a unique name. Usually, it shall be done once on
!     <literal>_PG_init()</literal> entrypoint of module loading.
  <programlisting>
  void
  register_custom_provider(const CustomProvider *provider);
--- 70,78 ----
    <sect2 id="custom-scan-register">
     <title>Registration of custom scan provider</title>
     <para>
!     The first task for a custom scan provider is the registration of a set of
!     callbacks with a unique name. Usually, this is done once upon module
!     loading in the <literal>_PG_init()</literal> entrypoint.
  <programlisting>
  void
  register_custom_provider(const CustomProvider *provider);
***************
*** 90,105 ****
    <sect2 id="custom-scan-path">
     <title>Submission of custom paths</title>
     <para>
!     The query planner finds out the best way to scan or join relations from
!     the various potential paths; combination of a scan algorithm and target
!     relations.
!     Prior to this selection, we list up all the potential paths towards
!     a target relation (if base relation) or a pair of relations (if join).
!     The <literal>add_scan_path_hook</> and <literal>add_join_path_hook</>
!     allows extensions to add alternative scan paths in addition to built-in
!     ones.
      If custom-scan provider can submit a potential scan path towards the
!     supplied relation, it shall construct <literal>CustomPath</> object
      with appropriate parameters.
  <programlisting>
  typedef struct CustomPath
--- 88,102 ----
    <sect2 id="custom-scan-path">
     <title>Submission of custom paths</title>
     <para>
!     The query planner finds the best way to scan or join relations from various
!     potential paths, each a combination of a scan algorithm and target
!     relations. Prior to this selection, we list all of the potential paths
!     towards a target relation (if it is a base relation) or a pair of relations
!     (if it is a join). The <literal>add_scan_path_hook</> and
!     <literal>add_join_path_hook</> allow extensions to add alternative scan
!     paths in addition to built-in paths.
      If custom-scan provider can submit a potential scan path towards the
!     supplied relation, it shall construct a <literal>CustomPath</> object
      with appropriate parameters.
  <programlisting>
  typedef struct CustomPath
***************
*** 110,118 ****
      List       *custom_private;     /* can be used for private data */
  } CustomPath;
  </programlisting>
!     Its <literal>path</> is common field for all the path nodes to store
!     cost estimation. In addition, <literal>custom_name</> is the name of
!     registered custom scan provider, <literal>custom_flags</> is a set of
      flags below, and <literal>custom_private</> can be used to store private
      data of the custom scan provider.
     </para>
--- 107,115 ----
      List       *custom_private;     /* can be used for private data */
  } CustomPath;
  </programlisting>
!     Its <literal>path</> is a common field for all the path nodes to store
!     a cost estimation. In addition, <literal>custom_name</> is the name of
!     the registered custom scan provider, <literal>custom_flags</> is a set of
      flags below, and <literal>custom_private</> can be used to store private
      data of the custom scan provider.
     </para>
***************
*** 125,132 ****
          It informs the query planner this custom scan node supports
          <literal>ExecMarkPosCustomScan</> and
          <literal>ExecRestorePosCustomScan</> methods.
!         Also, custom scan provider has to be responsible to mark and restore
!         a particular position.
         </para>
        </listitem>
       </varlistentry>
--- 122,129 ----
          It informs the query planner this custom scan node supports
          <literal>ExecMarkPosCustomScan</> and
          <literal>ExecRestorePosCustomScan</> methods.
!         Also, the custom scan provider has to be responsible to mark and
!         restore a particular position.
         </para>
        </listitem>
       </varlistentry>
***************
*** 135,141 ****
        <listitem>
         <para>
          It informs the query planner this custom scan node supports
!         backward scan.
          Also, custom scan provider has to be responsible to scan with
          backward direction.
         </para>
--- 132,138 ----
        <listitem>
         <para>
          It informs the query planner this custom scan node supports
!         backward scans.
          Also, custom scan provider has to be responsible to scan with
          backward direction.
         </para>
***************
*** 148,157 ****
    <sect2 id="custom-scan-plan">
     <title>Construction of custom plan node</title>
     <para>
!     Once <literal>CustomPath</literal> got choosen by query planner,
!     it calls back its associated custom scan provider to complete setting
!     up <literal>CustomScan</literal> plan node according to the path
!     information.
  <programlisting>
  void
  InitCustomScanPlan(PlannerInfo *root,
--- 145,154 ----
    <sect2 id="custom-scan-plan">
     <title>Construction of custom plan node</title>
     <para>
!     Once a <literal>CustomPath</literal> has been chosen by the query planner,
!     it calls back to the associated custom scan provider to complete
!     setting up the <literal>CustomScan</literal> plan node according to the
!     path information.
  <programlisting>
  void
  InitCustomScanPlan(PlannerInfo *root,
***************
*** 160,180 ****
                     List *tlist,
                     List *scan_clauses);
  </programlisting>
!     Query planner does basic initialization on the <literal>cscan_plan</>
!     being allocated, then custom scan provider can apply final initialization.
!     <literal>cscan_path</> is the path node that was constructed on the
!     previous stage then got choosen.
      <literal>tlist</> is a list of <literal>TargetEntry</> to be assigned
      on the <literal>Plan</> portion in the <literal>cscan_plan</>.
      Also, <literal>scan_clauses</> is a list of <literal>RestrictInfo</> to
!     be checked during relation scan. Its expression portion shall be also
      assigned on the <literal>Plan</> portion, but can be eliminated from
      this list if custom scan provider can handle these checks by itself.
     </para>
     <para>
      It often needs to adjust <literal>varno</> of <literal>Var</> node that
!     references a particular scan node, after conscruction of plan node.
!     For example, Var node in the target list of join node originally
      references a particular relation underlying a join, however, it has to
      be adjusted to either inner or outer reference.
  <programlisting>
--- 157,177 ----
                     List *tlist,
                     List *scan_clauses);
  </programlisting>
!     The query planner does basic initialization on the <literal>cscan_plan</>
!     being allocated, then the custom scan provider can apply final
!     initialization. <literal>cscan_path</> is the path node that was
!     constructed in the previous stage and then chosen.
      <literal>tlist</> is a list of <literal>TargetEntry</> to be assigned
      on the <literal>Plan</> portion in the <literal>cscan_plan</>.
      Also, <literal>scan_clauses</> is a list of <literal>RestrictInfo</> to
!     be checked during a relation scan. Its expression portion will also be
      assigned on the <literal>Plan</> portion, but can be eliminated from
      this list if custom scan provider can handle these checks by itself.
     </para>
     <para>
      It often needs to adjust <literal>varno</> of <literal>Var</> node that
!     references a particular scan node, after construction of the plan node.
!     For example, Var node in the target list of the join node originally
      references a particular relation underlying a join, however, it has to
      be adjusted to either inner or outer reference.
  <programlisting>
***************
*** 183,191 ****
                       CustomScan *cscan_plan,
                       int rtoffset);
  </programlisting>
!     This callback is optional if custom scan node is a vanilla relation
!     scan because here is nothing special to do. Elsewhere, it needs to
!     be handled by custom scan provider in case when a custom scan replaced
      a join with two or more relations for example.
     </para>
    </sect2>
--- 180,188 ----
                       CustomScan *cscan_plan,
                       int rtoffset);
  </programlisting>
!     This callback is optional if the custom scan node is a vanilla relation
!     scan because there is nothing special to do. Otherwise, it needs to
!     be handled by the custom scan provider, as when a custom scan replaced
      a join with two or more relations for example.
     </para>
    </sect2>
***************
*** 193,200 ****
    <sect2 id="custom-scan-exec">
     <title>Execution of custom scan node</title>
     <para>
!     Query execuror also launches associated callbacks to begin, execute and
!     end custom scan according to the executor's manner.
     </para>
     <para>
  <programlisting>
--- 190,197 ----
    <sect2 id="custom-scan-exec">
     <title>Execution of custom scan node</title>
     <para>
!     The query executor also launches the associated callbacks to begin, execute
!     and end the custom scan according to the executor's manner.
     </para>
     <para>
  <programlisting>
***************
*** 202,217 ****
  BeginCustomScan(CustomScanState *csstate, int eflags);
  </programlisting>
      It begins execution of the custom scan on starting up executor.
!     It allows custom scan provider to do any initialization job around this
!     plan, however, it is not a good idea to launch actual scanning jobs.
      (It shall be done on the first invocation of <literal>ExecCustomScan</>
      instead.)
      The <literal>custom_state</> field of <literal>CustomScanState</> is
!     intended to save the private state being managed by custom scan provider.
!     Also, <literal>eflags</> has flag bits of the executor's operating mode
!     for this plan node.
!     Note that custom scan provider should not perform anything visible
!     externally if <literal>EXEC_FLAG_EXPLAIN_ONLY</> would be given,
     </para>
  
     <para>
--- 199,214 ----
  BeginCustomScan(CustomScanState *csstate, int eflags);
  </programlisting>
      It begins execution of the custom scan on starting up executor.
!     It allows the custom scan provider to do any initialization job around this
!     plan, however, it is not a good idea to launch the actual scanning jobs.
      (It shall be done on the first invocation of <literal>ExecCustomScan</>
      instead.)
      The <literal>custom_state</> field of <literal>CustomScanState</> is
!     intended to save the private state being managed by the custom scan
!     provider. Also, <literal>eflags</> has flag bits of the executor's
!     operating mode for this plan node. Note that the custom scan provider
!     should not perform anything visible externally if
!     <literal>EXEC_FLAG_EXPLAIN_ONLY</> is given.
     </para>
  
     <para>
***************
*** 219,229 ****
  TupleTableSlot *
  ExecCustomScan(CustomScanState *csstate);
  </programlisting>
!     It fetches one tuple from the underlying relation or relations if join
      according to the custom logic. Unlike <literal>IterateForeignScan</>
!     method in foreign table, it is also responsible to check whether next
      tuple matches the qualifier of this scan, or not.
!     A usual way to implement this method is the callback performs just an
      entrypoint of <literal>ExecQual</> with its own access method.
     </para>
  
--- 216,226 ----
  TupleTableSlot *
  ExecCustomScan(CustomScanState *csstate);
  </programlisting>
!     It fetches one tuple from the underlying relation or relations, if joining,
      according to the custom logic. Unlike <literal>IterateForeignScan</>
!     method in foreign table, it is also responsible to check whether the next
      tuple matches the qualifier of this scan, or not.
!     The usual way to implement this method is the callback performs just an
      entrypoint of <literal>ExecQual</> with its own access method.
     </para>
  
***************
*** 232,240 ****
  Node *
  MultiExecCustomScan(CustomScanState *csstate);
  </programlisting>
!     It fetches multiple tuples from the underlying relation or relations if
!     join according to the custom logic. Pay attention the data format (and
!     the way to return also) depends on the type of upper node.
     </para>
  
     <para>
--- 229,237 ----
  Node *
  MultiExecCustomScan(CustomScanState *csstate);
  </programlisting>
!     It fetches multiple tuples from the underlying relation or relations, if
!     joining, according to the custom logic. Pay attention to the data format
!     (and the way to return it) since it depends on the type of upper node.
     </para>
  
     <para>
***************
*** 242,248 ****
  void
  EndCustomScan(CustomScanState *csstate);
  </programlisting>
!     It ends the scan and release resources privately allocated.
      It is usually not important to release memory in per-execution memory
      context. So, all this callback should be responsible is its own
      resources regardless from the framework.
--- 239,245 ----
  void
  EndCustomScan(CustomScanState *csstate);
  </programlisting>
!     It ends the scan and releases resources privately allocated.
      It is usually not important to release memory in per-execution memory
      context. So, all this callback should be responsible is its own
      resources regardless from the framework.
***************
*** 257,263 ****
  ReScanCustomScan(CustomScanState *csstate);
  </programlisting>
      It restarts the current scan from the beginning.
!     Note that parameters of the scan depends on might change values,
      so rewinded scan does not need to return exactly identical tuples.
     </para>
     <para>
--- 254,260 ----
  ReScanCustomScan(CustomScanState *csstate);
  </programlisting>
      It restarts the current scan from the beginning.
!     Note that the parameters the scan depends on may change values,
      so rewinded scan does not need to return exactly identical tuples.
     </para>
     <para>
***************
*** 276,282 ****
  RestorePosCustom(CustomScanState *csstate);
  </programlisting>
      It rewinds the current position of the custom scan to the position
!     where <literal>MarkPosCustomScan</> saved before.
      Note that it is optional to implement, only when
      <literal>CUSTOM__SUPPORT_MARK_RESTORE</> is set.
     </para>
--- 273,279 ----
  RestorePosCustom(CustomScanState *csstate);
  </programlisting>
      It rewinds the current position of the custom scan to the position
!     that <literal>MarkPosCustomScan</> saved before.
      Note that it is optional to implement, only when
      <literal>CUSTOM__SUPPORT_MARK_RESTORE</> is set.
     </para>
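
As a rough illustration of the flow the documentation above describes (register a set of callbacks under a unique name, then let the executor drive begin, execute and end), here is a self-contained sketch. It does not include any PostgreSQL headers, and every Demo* and demo_* name is hypothetical; in particular, the layout of DemoProvider is an assumption and not the CustomProvider struct from the patch. The execute callback applies its own qualifier before returning a row, which mirrors the point that ExecCustomScan is responsible for checking whether the next tuple matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All Demo* names are hypothetical; they only mirror the shape described in the docs. */
typedef struct DemoScanState
{
    const int *rows;    /* stand-in for the underlying relation */
    size_t     nrows;
    size_t     next;    /* private scan position */
    int        min_val; /* stand-in qualifier: keep rows >= min_val */
} DemoScanState;

typedef struct DemoProvider
{
    const char *name;                           /* unique provider name */
    void      (*begin)(DemoScanState *state);   /* set up only, no scanning */
    const int *(*exec)(DemoScanState *state);   /* next qualifying row, or NULL */
    void      (*end)(DemoScanState *state);     /* finish the scan */
} DemoProvider;

#define DEMO_MAX_PROVIDERS 8
static const DemoProvider *demo_registry[DEMO_MAX_PROVIDERS];
static size_t demo_nproviders = 0;

/* Register once at load time; returns 0 on success, -1 on duplicate name or full table. */
static int
demo_register_provider(const DemoProvider *provider)
{
    for (size_t i = 0; i < demo_nproviders; i++)
        if (strcmp(demo_registry[i]->name, provider->name) == 0)
            return -1;
    if (demo_nproviders == DEMO_MAX_PROVIDERS)
        return -1;
    demo_registry[demo_nproviders++] = provider;
    return 0;
}

/* Look a provider up by the name that would be stored in the path or plan node. */
static const DemoProvider *
demo_lookup_provider(const char *name)
{
    for (size_t i = 0; i < demo_nproviders; i++)
        if (strcmp(demo_registry[i]->name, name) == 0)
            return demo_registry[i];
    return NULL;
}

static void
demo_begin(DemoScanState *state)
{
    state->next = 0;
}

/* The callback itself applies the qualifier before returning a row. */
static const int *
demo_exec(DemoScanState *state)
{
    while (state->next < state->nrows)
    {
        const int *row = &state->rows[state->next++];

        if (*row >= state->min_val)
            return row;
    }
    return NULL;
}

static void
demo_end(DemoScanState *state)
{
    state->next = state->nrows;
}

static const DemoProvider demo_provider = {"demo_scan", demo_begin, demo_exec, demo_end};

/* Drive one full scan through the registered callbacks; returns the row count, or -1. */
static int
demo_run_scan(const char *name, DemoScanState *state)
{
    const DemoProvider *p = demo_lookup_provider(name);
    int         count = 0;

    if (p == NULL)
        return -1;
    p->begin(state);
    while (p->exec(state) != NULL)
        count++;
    p->end(state);
    return count;
}
```

Calling demo_register_provider(&demo_provider) once and then demo_run_scan("demo_scan", &state) walks the same begin, execute, end sequence that the executor section describes; a lookup by an unregistered name returns -1 rather than running anything.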