On Wed, 22 May 2019, Adrian Klaver wrote:
A sample of the data you are cleaning up.
Adrian, et al.:
I have it working properly now. Both org_id and person_id numbers are
prepended to each row in the appropriate table and they are unique because
each series begins one greater than the max(*_id
>
> A sample of the data you are cleaning up.
>
> I think what people are trying to wrap there head around is how 800
> lines in the file is being split into two subsets: the organization data
> and the people data. In particular how that is being done to preserve
> the relationship between organiz
On 5/22/19 10:53 AM, Rich Shepard wrote:
On Wed, 22 May 2019, Francisco Olarte wrote:
Also, when I speak of "unique identifier" I'm not speaking of the one if
your FINAL tables, I assume you would have at least the *_id field as
PKEY, so nothing else needed, but the one in your SOURCE data set
On Wed, 22 May 2019, Jeremy Finzel wrote:
Are you saying your database already has an organizations table, and this
data file is appending to it with all-brand-new organizations and people?
Jeremy,
The database has both organizations and people tables (among others) which
are already populate
On Wed, May 22, 2019 at 12:53 PM Rich Shepard
wrote:
> On Wed, 22 May 2019, Francisco Olarte wrote:
>
> > Also, when I speak of "unique identifier" I'm not speaking of the one if
> > your FINAL tables, I assume you would have at least the *_id field as
> > PKEY, so nothing else needed, but the on
On Wed, 22 May 2019, Francisco Olarte wrote:
Also, when I speak of "unique identifier" I'm not speaking of the one if
your FINAL tables, I assume you would have at least the *_id field as
PKEY, so nothing else needed, but the one in your SOURCE data set (it can
be anything, like the row number i
On Wed, May 22, 2019 at 11:07 AM Rich Shepard
wrote:
> On Wed, 22 May 2019, Francisco Olarte wrote:
>
> > You are not reading what we write to you. Note YOU AND ONLY YOU are the
> > one speaking of PK. We are speaking of "unique identifier" ( that would
> > be, IIRC, "candidate keys", you can pee
Rich:
On Wed, May 22, 2019 at 6:07 PM Rich Shepard wrote:
> On Wed, 22 May 2019, Francisco Olarte wrote:
> > You are not reading what we write to you. Note YOU AND ONLY YOU are the
> > one speaking of PK. We are speaking of "unique identifier" ( that would
> > be, IIRC, "candidate keys", you can
On Wed, 22 May 2019, Francisco Olarte wrote:
You are not reading what we write to you. Note YOU AND ONLY YOU are the
one speaking of PK. We are speaking of "unique identifier" ( that would
be, IIRC, "candidate keys", you can peek any as your PK, or even introduce
a new synthetic one with a seque
Rich:
On Wed, May 22, 2019 at 4:38 PM Rich Shepard wrote:
> Not necessary; see below. Also, these data come from a regulator and
> provided as an Excel spreadsheet. If they were extracted from a database
> then that was very poorly designed because there's no consistency in how
> fields/columns
Rich:
On Wed, May 22, 2019 at 2:40 PM Rich Shepard wrote:
> > I'm curious, what org_id do you put (manually) to the people? you must
> > have some way to match it ( like, say, "I have an organization line,
> > followed by lines for people in this organization"
> The first data entered was in sma
On Wed, 22 May 2019, Adrian Klaver wrote:
So does the people data have an organization attribute?
Adrian,
Yes. It's the FK to the organization table.
If so why not just assign the org_id while cleaning up the data?
That's what I thought to do based on your suggestion yesterday. It would
m
On 5/22/19 7:38 AM, Rich Shepard wrote:
On Wed, 22 May 2019, Jeremy Finzel wrote:
There's absolutely no need to use anything beyond SQL here, though you
could if you want to.
Jeremy,
This is a new experience for me so I didn't think of a SQL solution.
I really wonder how much we are just ta
>
> Each row in the source file (exported from the spreadsheet as .csv and
> renamed to .txt for processing in emacs and awk) is a mixture of attributes
Absolutely no need to rename the .csv for those tools.
> that belong in either or both of the organization and people tables in my
> database. An
On Wed, 22 May 2019, Jeremy Finzel wrote:
There's absolutely no need to use anything beyond SQL here, though you
could if you want to.
Jeremy,
This is a new experience for me so I didn't think of a SQL solution.
I really wonder how much we are just talking past each other simply because
we
On Wed, May 22, 2019 at 7:40 AM Rich Shepard
wrote:
> On Wed, 22 May 2019, Francisco Olarte wrote:
>
> > I'm curious, what org_id do you put (manually) to the people? you must
> > have some way to match it ( like, say, "I have an organization line,
> > followed by lines for people in this organiz
On Wed, 22 May 2019, Francisco Olarte wrote:
I'm curious, what org_id do you put (manually) to the people? you must
have some way to match it ( like, say, "I have an organization line,
followed by lines for people in this organization"
Francisco,
The first data entered was in small chunks so
Jeremy:
On Tue, May 21, 2019 at 11:58 PM Jeremy Finzel wrote:
> Then take Francisco's suggestion, only use an md5 of the organization fields
> to create yourself a unique identifier. Then you can use ctid (unique
> internal identifier for each row) to join back. You use SQL like this:
Sadly
On Tue, May 21, 2019 at 7:24 PM Rich Shepard wrote:
> > From how you say it, I assume you have some data in your original
> > dumps which can relate boths, lets assume it's org_name, but may be an
> > org-code. If you do not have it it means you cannot match people to
> > orgs in your data, all is
On Tue, May 21, 2019 at 12:24 PM Rich Shepard
wrote:
> On Tue, 21 May 2019, Francisco Olarte wrote:
>
> > From how you say it, I assume you have some data in your original
> > dumps which can relate boths, lets assume it's org_name, but may be an
> > org-code. If you do not have it it means you c
On Tue, 21 May 2019, Will Hartung wrote:
You can either use a script for the entire process, or, convert the people
table to INSERT statements that have a SELECT for the foreign key as part
of the insert.
INSERT INTO PEOPLE (id, name, org_id) VALUES (1, ‘Bob’, (SELECT org_id
FROM org WHERE org_
> On May 21, 2019, at 9:56 AM, Rich Shepard wrote:
>
> I could insert all new rows into the organizations table, then produce a
> list of the org_id and org_name to manually insert the foreign key in the
> related people table. Time and effort intense.
You can either use a script for the entire
On 5/21/19 12:27 PM, Rich Shepard wrote:
On Tue, 21 May 2019, Michael Lewis wrote:
For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.
Michael,
The org_id will not exist until I run the insert script.
Else, load all th
On Tue, 21 May 2019, Adrian Klaver wrote:
The other way is to create the org_id for each organization ahead of time
and put it into the data file. Either way you have create the org_id for
the FK relationship, it is just a matter of where and when.
Adrian,
I had thought of that but overlooked
On Tue, 21 May 2019, Michael Lewis wrote:
For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.
Michael,
The org_id will not exist until I run the insert script.
Else, load all the data with empty ID column on person tabl
On 5/21/19 10:22 AM, Rich Shepard wrote:
On Tue, 21 May 2019, Adrian Klaver wrote:
Well you are not going to know the org_id until the organization table
is loaded, which means something like:
1) First run through file load the organizations table.
2) Build a mapping of org_id to organizatio
On Tue, 21 May 2019, Francisco Olarte wrote:
From how you say it, I assume you have some data in your original
dumps which can relate boths, lets assume it's org_name, but may be an
org-code. If you do not have it it means you cannot match people to
orgs in your data, all is lost.
Francisco,
For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.
Else, load all the data with empty ID column on person table,then just
update the person table afterward and drop the org name column.
Perhaps I am missing something.
On Tue, 21 May 2019, Adrian Klaver wrote:
Well you are not going to know the org_id until the organization table is
loaded, which means something like:
1) First run through file load the organizations table.
2) Build a mapping of org_id to organization.
3) Run through data file again and loa
Rich:
On Tue, May 21, 2019 at 6:56 PM Rich Shepard wrote:
> I'm cleaning and formatting a 800-line data file to be inserted into a
> database. Some of the input file fields will be inserted into an
> 'organizations' table letting postgres assign sequential org_id numbers.
> Other fields will be i
On 5/21/19 9:56 AM, Rich Shepard wrote:
I'm cleaning and formatting a 800-line data file to be inserted into a
database. Some of the input file fields will be inserted into an
'organizations' table letting postgres assign sequential org_id numbers.
Other fields will be inserted into a separate 'p
I'm cleaning and formatting a 800-line data file to be inserted into a
database. Some of the input file fields will be inserted into an
'organizations' table letting postgres assign sequential org_id numbers.
Other fields will be inserted into a separate 'people' table associated with
each organiz
32 matches
Mail list logo