On Mon, Jun 25, 2018 at 8:54 PM, Anto Aravinth <anto.aravinth....@gmail.com> wrote:
>
> On Mon, Jun 25, 2018 at 8:20 PM, Nicolas Paris <nipari...@gmail.com>
> wrote:
>
>>
>> 2018-06-25 16:25 GMT+02:00 Anto Aravinth <anto.aravinth....@gmail.com>:
>>
>>> Thanks a lot. But I did run into a lot of challenges! It looks like the SO
>>> data contains a lot of tabs within itself, so the tab delimiter didn't work
>>> for me. I thought I could use a special delimiter, but it looks like
>>> PostgreSQL COPY allows only a single character as the delimiter :(
>>>
>>> Sadly, I guess the only way is to use INSERT or to do a thorough
>>> serialization of my data into something that COPY can understand.
>>>
>>
>> The easiest way would be:
>> xml -> csv -> \copy
>>
>> By csv, I mean regular quoted CSV (simply wrap each CSV field in double
>> quotes, and escape any double quotes it contains with another double
>> quote).
>>
>
> I tried, but no luck. Here is a sample of the CSV I wrote from my XML
> converter:
>
> 1 "Are questions about animations or comics inspired by Japanese
> culture or styles considered on-topic?" "pExamples include a
> href=""http://www.imdb.com/title/tt0417299/"" rel=""nofollow""Avatar/a, a
> href=""http://www.imdb.com/title/tt1695360/"" rel=""nofollow""Korra/a and,
> to some extent, a href=""http://www.imdb.com/title/tt0278238/""
> rel=""nofollow""Samurai Jack/a. They're all widely popular American
> cartoons, sometimes even referred to as ema
> href=""https://en.wikipedia.org/wiki/Anime-influenced_animation""
> rel=""nofollow""Amerime/a/em./p
>
>
> pAre questions about these series on-topic?/p
>
> " "pExamples include a href=""http://www.imdb.com/title/tt0417299/""
> rel=""nofollow""Avatar/a, a href=""http://www.imdb.com/title/tt1695360/""
> rel=""nofollow""Korra/a and, to some extent, a
> href=""http://www.imdb.com/title/tt0278238/"" rel=""nofollow""Samurai
> Jack/a. They're all widely popular American cartoons, sometimes even
> referred to as ema
> href=""https://en.wikipedia.org/wiki/Anime-influenced_animation""
> rel=""nofollow""Amerime/a/em./p
>
>
> pAre questions about these series on-topic?/p
>
> " "null"
>
> The schema of my table is:
>
> CREATE TABLE so2 (
>     id INTEGER NOT NULL PRIMARY KEY,
>     title varchar(1000) NULL,
>     posts text,
>     body TSVECTOR,
>     parent_id INTEGER NULL,
>     FOREIGN KEY (parent_id) REFERENCES so1(id)
> );
>
> and when I run:
>
> COPY so2 from '/Users/user/programs/js/node-mbox/file.csv';
>
> I get:
>
> ERROR: missing data for column "body"
> CONTEXT: COPY so2, line 1: "1 "Are questions about animations or comics
> inspired by Japanese culture or styles considered on-top..."
>
> I'm not sure what I'm missing. I'm also not sure whether the above CSV is
> breaking because I have newlines within my content, and the error message
> makes this very hard to debug.
>
>>
>> PostgreSQL's COPY CSV parser is one of the most robust I have ever
>> tested.
>>
>
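
The "xml -> csv -> \copy" route suggested above can be sketched in Python. This is only a minimal illustration, not the converter used in the thread: the helper name rows_to_csv and the sample row are hypothetical, and the row merely mirrors the five columns of so2. It relies on the standard csv module, which wraps a field in double quotes when it contains the delimiter, a quote, or a newline, doubles any embedded quotes, and writes None as an empty unquoted field.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (hypothetical helper) into regular quoted CSV text."""
    # newline="" keeps the writer's own line endings untouched.
    buf = io.StringIO(newline="")
    # QUOTE_MINIMAL quotes only fields that need it and doubles embedded
    # double quotes; None becomes an empty, unquoted field.
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample row shaped like so2: id, title, posts, body, parent_id.
sample = [
    (
        1,
        "Title with\ttab",
        '<p>Body with "quotes"\nand a newline</p>',
        "body text",
        None,  # intended as NULL for parent_id
    ),
]

csv_text = rows_to_csv(sample)
print(csv_text)
```

Note that COPY without options reads PostgreSQL's text format, in which double quotes are not special and every newline ends a row; that would be consistent with the "missing data for column" error on line 1. A file like the one above needs CSV mode, for example COPY so2 FROM '/Users/user/programs/js/node-mbox/file.csv' WITH (FORMAT csv); plus a DELIMITER option if the fields are separated by something other than a comma. In CSV mode an unquoted empty field is read as NULL by default, whereas a quoted "null" is read as the literal string, which an INTEGER column will reject.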