We had a YAML file that mapped each physical datasource to the concrete loader that the generic loader acts as a facade for. We're now moving to an HCatalog-based solution that handles that mapping as well as the logical-to-physical resolution. Basically, the mappings are stored in a DB.
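
A rough sketch of that kind of lookup, in Python for brevity. This is only an illustration under stated assumptions, not the actual setup: the registry contents, the HDFS path, the loader class name, and the `resolve` helper are all hypothetical. The idea is that the user only ever sees a logical name such as 'SampleTable', and the generic loader consults a mapping (a file or a DB table) to find the physical location and the concrete loader to delegate to.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class PhysicalSource:
    """Where a logical table really lives and which loader reads it."""
    hdfs_path: str      # hypothetical HDFS location, never shown to the user
    loader_class: str   # hypothetical concrete loader the facade delegates to


# Hypothetical mapping; it could equally be read from a YAML file or a DB.
REGISTRY = {
    "SampleTable": PhysicalSource(
        hdfs_path="/data/exports/sample_table/current",
        loader_class="com.example.pig.DelimitedLoader",
    ),
}


def resolve(location, registry=REGISTRY):
    """Resolve a logical URI such as 'scheme://SampleTable' to its physical source."""
    parsed = urlparse(location)
    # For 'scheme://SampleTable' the table name ends up in the netloc part.
    table = parsed.netloc or parsed.path.lstrip("/")
    try:
        return registry[table]
    except KeyError:
        raise LookupError(f"no physical mapping for logical table {table!r}") from None
```

With something like this in place, the exporter only has to update the mapping when it writes new files, and the load statement the user types never changes.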

On Tue, Dec 11, 2012 at 8:20 AM, Prashant Kommireddi <[email protected]> wrote:

> Thanks Bill. Any ideas on how to hide the location of HDFS files from the
> end user?
>
> On Tue, Dec 11, 2012 at 9:42 PM, Bill Graham <[email protected]> wrote:
>
>> I think the latter would be better. Since the LoadFunc would be decoupled
>> from the data exporter you could schedule the exporting independent of the
>> loading. We do something similar, without the $query part.
>>
>> On Tue, Dec 11, 2012 at 1:10 AM, Prashant Kommireddi <[email protected]> wrote:
>>
>> > I was working on a LoadFunc and needed some ideas/second opinion on the
>> > best way to do this:
>> >
>> > 1. We use an API to download data from database as flat-files.
>> >    - A query is given with table name and fields required to extract
>> >      data
>> > 2. Once 1. is done upload data to HDFS
>> > 3. Upload the schema file to HDFS
>> > 4. LoadFunc to read the schema file and parse data
>> >
>> > A strict requirement is to hide the details of the location of these
>> > HDFS files from the user issuing the pig query. For a user it could
>> > look as simple as:
>> >
>> > A = load 'scheme://SampleTable' using CustomLoader('$query');
>> >
>> > User here only issues the load statement on table with a query and API
>> > calls for importing from database could happen in the background.
>> >
>> > What would be the best way to do this? Is it better to do the above as
>> > part of LoadFunc, or would it rather be beneficial to do it separate
>> > and somehow communicate the location from API import to LoadFunc?
>> >
>> > Thanks,
>> >
>> > Prashant

-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
[email protected] going forward.*
