Re: Using COPY to import large xml file

2018-06-26 Thread Anto Aravinth
Thanks a lot, everyone. After playing around with a small dataset, I was able to create datasets that are easy to load with COPY. Creating datasets of around 50GB took about 2 hours (I can definitely improve on this). For 54M records, COPY took around 35 minutes! Awesome. :) :) Meanwhile, I understood a few thi
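For a rough sense of scale, the figures quoted in this thread can be turned into rows-per-second rates. This is only back-of-the-envelope arithmetic on the numbers reported by the poster (54M records in about 35 minutes with COPY, and 10 million records in about 2 hours with per-row inserts from Node.js); it ignores the roughly 2 hours spent preparing the datasets, and the two runs may not be directly comparable.

```python
# Rough throughput comparison using only the figures reported in this thread.
copy_rows, copy_minutes = 54_000_000, 35
insert_rows, insert_hours = 10_000_000, 2

copy_rate = copy_rows / (copy_minutes * 60)        # rows per second with COPY
insert_rate = insert_rows / (insert_hours * 3600)  # rows per second with INSERTs
speedup = copy_rate / insert_rate

print(f"COPY:   ~{copy_rate:,.0f} rows/s")
print(f"INSERT: ~{insert_rate:,.0f} rows/s")
print(f"COPY was roughly {speedup:.1f}x faster")
```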

Re: Using COPY to import large xml file

2018-06-25 Thread Tim Cross
Anto Aravinth writes:
> Thanks a lot. But I ran into a lot of challenges! It looks like the SO data contains
> a lot of tabs within itself, so the tab delimiter didn't work for me. I thought
> I could give a special delimiter, but it looks like PostgreSQL's COPY allows only
> one character as the delimiter :(
>
> Sad, I g

Re: Using COPY to import large xml file

2018-06-25 Thread Adrian Klaver
On 06/25/2018 07:25 AM, Anto Aravinth wrote:
Thanks a lot. But I ran into a lot of challenges! It looks like the SO data contains a lot of tabs within itself, so the tab delimiter didn't work for me. I thought I could give a special delimiter, but it looks like PostgreSQL's COPY allows only one character as the delimiter

Re: Using COPY to import large xml file

2018-06-25 Thread Nicolas Paris
2018-06-25 17:30 GMT+02:00 Anto Aravinth :
> On Mon, Jun 25, 2018 at 8:54 PM, Anto Aravinth <anto.aravinth@gmail.com> wrote:
>> On Mon, Jun 25, 2018 at 8:20 PM, Nicolas Paris wrote:
>>> 2018-06-25 16:25 GMT+02:00 Anto Aravinth :
>>> Thanks a lot. But I ran into a lot

Re: Using COPY to import large xml file

2018-06-25 Thread Anto Aravinth
On Mon, Jun 25, 2018 at 8:54 PM, Anto Aravinth wrote:
> On Mon, Jun 25, 2018 at 8:20 PM, Nicolas Paris wrote:
>> 2018-06-25 16:25 GMT+02:00 Anto Aravinth :
>>> Thanks a lot. But I ran into a lot of challenges! It looks like the SO data
>>> contains a lot of tabs within itself, so the tab delimite

Re: Using COPY to import large xml file

2018-06-25 Thread Anto Aravinth
On Mon, Jun 25, 2018 at 8:20 PM, Nicolas Paris wrote:
> 2018-06-25 16:25 GMT+02:00 Anto Aravinth :
>> Thanks a lot. But I ran into a lot of challenges! It looks like the SO data contains
>> a lot of tabs within itself, so the tab delimiter didn't work for me. I thought
>> I could give a special delimiter, but

Re: Using COPY to import large xml file

2018-06-25 Thread Nicolas Paris
2018-06-25 16:25 GMT+02:00 Anto Aravinth :
> Thanks a lot. But I ran into a lot of challenges! It looks like the SO data contains
> a lot of tabs within itself, so the tab delimiter didn't work for me. I thought
> I could give a special delimiter, but it looks like PostgreSQL's COPY allows only
> one character as the delimit

Re: Using COPY to import large xml file

2018-06-25 Thread Anto Aravinth
Thanks a lot. But I ran into a lot of challenges! It looks like the SO data contains a lot of tabs within itself, so the tab delimiter didn't work for me. I thought I could give a special delimiter, but it looks like PostgreSQL's COPY allows only one character as the delimiter :( Sad, I guess the only way is to insert or do a th
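One way around embedded tabs, sketched here in Python rather than the poster's Node.js: in COPY's default text format, a literal tab, newline, carriage return, or backslash inside a field can be written as a backslash escape, so the single-character delimiter no longer collides with the data. The helper names below are made up for illustration, and this is not necessarily what the poster ended up doing. COPY's CSV format with quoted fields is another option.

```python
def escape_copy_text(value):
    """Escape one field for PostgreSQL's COPY text format (None becomes \\N)."""
    if value is None:
        return r"\N"
    return (
        value.replace("\\", "\\\\")   # backslash first, so later escapes survive
             .replace("\t", "\\t")    # literal tab -> backslash + t
             .replace("\n", "\\n")    # literal newline -> backslash + n
             .replace("\r", "\\r")    # literal carriage return -> backslash + r
    )


def to_copy_line(fields):
    """Join escaped fields with the default tab delimiter and end the row."""
    return "\t".join(escape_copy_text(f) for f in fields) + "\n"


# A body containing a tab and a newline still yields exactly three fields.
line = to_copy_line(["42", "code:\tprint('hi')\nmore", None])
```

After escaping, the only raw tabs left in the line are the two field separators, and the only raw newline is the row terminator.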

Re: Using COPY to import large xml file

2018-06-24 Thread Christoph Moench-Tegeder
## Anto Aravinth (anto.aravinth@gmail.com):
> Sure, let me try that. I have a question here: COPY usually works when you
> move data from files to your Postgres instance, right? Now in node.js,
> processing the whole file, can I use COPY
> programmatically like COPY Stackoverflow ?
> Because
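COPY does not have to read from a file on the server: `COPY ... FROM STDIN` accepts data streamed by the client. A minimal sketch in Python follows, assuming a psycopg2-style cursor that exposes `copy_expert(sql, file)`; the function name, table name, and column names are placeholders, and a Node.js client would need its own equivalent.

```python
import csv
import io


def copy_rows(cursor, table, columns, rows):
    """Stream rows to the server via COPY ... FROM STDIN in CSV format.

    `cursor` is assumed to expose psycopg2's copy_expert(sql, file).
    `table` and `columns` must be trusted identifiers, never user input.
    """
    buf = io.StringIO()
    # Embedded commas, quotes, and newlines get quoted; tabs are plain data here.
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    sql = f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"
    cursor.copy_expert(sql, buf)
    return sql
```

This version buffers all rows in memory, so for a dump of this size it would be called once per batch of rows rather than once for the whole file.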

Re: Using COPY to import large xml file

2018-06-24 Thread Tim Cross
On Mon, 25 Jun 2018 at 11:38, Anto Aravinth wrote:
> On Mon, Jun 25, 2018 at 3:44 AM, Tim Cross wrote:
>> Anto Aravinth writes:
>> > Thanks for the response. I'm not sure how long this tool takes for
>> > the 70GB data.
>> >
>> > I used node to stream the XML files into insert

Re: Using COPY to import large xml file

2018-06-24 Thread Anto Aravinth
On Mon, Jun 25, 2018 at 3:44 AM, Tim Cross wrote:
> Anto Aravinth writes:
> > Thanks for the response. I'm not sure how long this tool takes for
> > the 70GB data.
> >
> > I used node to stream the XML files into inserts, which was very slow.
> > Actually, the XML contains 40 million

Re: Using COPY to import large xml file

2018-06-24 Thread Tim Cross
Anto Aravinth writes:
> Thanks for the response. I'm not sure how long this tool takes for
> the 70GB data.
>
> I used node to stream the XML files into inserts, which was very slow.
> Actually, the XML contains 40 million records, out of which 10 million took
> around 2 hrs using nodejs.

Re: Using COPY to import large xml file

2018-06-24 Thread Christoph Moench-Tegeder
## Adrien Nayrat (adrien.nay...@anayrat.info):
> I used this tool :
> https://github.com/Networks-Learning/stackexchange-dump-to-postgres

That will be awfully slow: this tool commits each INSERT on its own, see loop in https://github.com/Networks-Learning/stackexchange-dump-to-postgres/blob/mast
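To illustrate the point about per-row commits, here is a minimal sketch of the usual alternative when staying with INSERT: group rows into batches and commit once per batch. It assumes a DB-API 2.0 style connection (psycopg2 is one example); the function names and the default batch size are arbitrary choices for illustration and are not taken from the tool linked above.

```python
from itertools import islice


def batched(rows, size):
    """Yield lists of up to `size` rows from any iterable."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch


def insert_in_batches(conn, sql, rows, batch_size=10_000):
    """Run many INSERTs but commit once per batch instead of once per row.

    `conn` is assumed to be a DB-API 2.0 connection; returns the commit count.
    """
    commits = 0
    cur = conn.cursor()
    for batch in batched(rows, batch_size):
        cur.executemany(sql, batch)   # one statement execution per row in the batch
        conn.commit()                 # a single commit covers the whole batch
        commits += 1
    return commits
```

With 40 million rows and the default batch size this would issue 4,000 commits instead of 40 million. COPY, as discussed in the rest of the thread, avoids per-row statements altogether.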

Re: Using COPY to import large xml file

2018-06-24 Thread Adrian Klaver
On 06/24/2018 08:25 AM, Anto Aravinth wrote:
Hello Everyone, I have downloaded the Stack Overflow posts XML (contains all SO questions to date). The file is around 70GB. I want to import the data in that XML into my table. Is there a way to do so in Postgres?

It is going to require some wor

Re: Using COPY to import large xml file

2018-06-24 Thread Anto Aravinth
Thanks for the response. I'm not sure how long this tool takes for the 70GB data. I used node to stream the XML files into inserts, which was very slow. Actually, the XML contains 40 million records, out of which 10 million took around 2 hrs using nodejs. Hence, I thought I will use the COPY comma

Re: Using COPY to import large xml file

2018-06-24 Thread Adrien Nayrat
On 06/24/2018 06:07 PM, Anto Aravinth wrote:
> Thanks for the response. I'm not sure how long this tool takes for the
> 70GB data.

As I remember, it took several hours. I can't remember whether the XML conversion or the inserts took longer.

> I used node to stream the XML files into inserts

Re: Using COPY to import large xml file

2018-06-24 Thread Adrien Nayrat
On 06/24/2018 05:25 PM, Anto Aravinth wrote:
> Hello Everyone,
>
> I have downloaded the Stack Overflow posts XML (contains all SO questions to
> date). The file is around 70GB. I want to import the data in that XML into my
> table. Is there a way to do so in Postgres?
>
> Thanks,
> Anto.

H

Using COPY to import large xml file

2018-06-24 Thread Anto Aravinth
Hello Everyone, I have downloaded the Stack Overflow posts XML (contains all SO questions to date). The file is around 70GB. I want to import the data in that XML into my table. Is there a way to do so in Postgres? Thanks, Anto.
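As a starting point for a file this large, the XML can be parsed as a stream and rewritten as CSV that COPY can load, without ever holding the whole document in memory. The sketch below uses Python's standard library; it assumes each record is a `row` element whose fields are XML attributes, and the attribute names `Id` and `Body` in the usage example are assumptions about the dump layout, not something confirmed in this thread.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_csv(xml_source, csv_out, columns):
    """Stream <row .../> elements to CSV without loading the whole file."""
    writer = csv.writer(csv_out, lineterminator="\n")
    count = 0
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)            # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "row":
            # Missing attributes become empty fields (NULL to COPY in CSV mode).
            writer.writerow([elem.get(col, "") for col in columns])
            count += 1
            root.clear()               # drop processed rows so memory stays flat
    return count


# Tiny in-memory stand-in for the real dump file.
sample = io.BytesIO(
    b'<posts><row Id="1" Body="a&#9;b" /><row Id="2" Body="x,y" /></posts>'
)
out = io.StringIO()
written = rows_to_csv(sample, out, ["Id", "Body"])
```

The resulting CSV could then be loaded with `COPY ... WITH (FORMAT csv)`. For the real file, `xml_source` would be a path or a file opened in binary mode and `csv_out` a file opened with `newline=""`.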