The aim of this exercise is to combine the sample database, click-tracking 
information from a test website and application, and information from users' 
social networks.

The sample database contains the following fields and is made up of 500 records.

        first_name, last_name, company_name, address, city, county,
        state, zip, phone1, phone2, email, web

Here are the instructions:

1) Download the US500 database from http://www.briandunning.com/sample-data/

2) Use the exchange portion of the telephone numbers (the middle three digits) 
as the proxy for "user clicked on and expressed interest in this topic". 
Identify groups of users that share topic interests (exchange numbers match).

3) Provide an API that takes an e-mail address as input and returns the e-mail 
addresses of other users that share that interest.

4) Extend that API to return users within a certain "distance" N of that 
interest. For example, if the original user has an interest in group 236, and N 
is 2, return all users with interests in 234 through 238.

5) Identify and rank the states with the largest groups, and (separately) the 
largest number of groups.

6) Provide one or more demonstrations that the API works. These can be via a 
testing framework, and/or a quick-and-dirty web or command-line client, or 
simply by driving it from a browser and showing a raw result.
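
For illustration, steps 2 through 5 could be sketched roughly as below. This is 
only one possible reading of the instructions, not a reference solution. It 
assumes each record is a dict with "email", "phone1" and "state" keys, that only 
phone1 is used, and that numbers are 10-digit US numbers. The function names 
(exchange_of, group_by_exchange, related_emails, rank_states) are made up for 
this sketch, and it is written for Python 3.

```python
from collections import defaultdict
import re


def exchange_of(phone):
    """Return the middle three digits of a 10-digit US number as an int."""
    digits = re.sub(r"\D", "", phone)  # "504-621-8927" -> "5046218927"
    if len(digits) != 10:
        raise ValueError("expected a 10-digit phone number: %r" % phone)
    return int(digits[3:6])


def group_by_exchange(records):
    """Map each exchange number to the e-mails of the users who share it."""
    groups = defaultdict(set)
    for record in records:
        groups[exchange_of(record["phone1"])].add(record["email"])
    return groups


def related_emails(records, email, n=0):
    """E-mails of other users whose exchange is within n of this user's."""
    groups = group_by_exchange(records)
    own = [ex for ex, emails in groups.items() if email in emails]
    found = set()
    for ex in own:
        for neighbour in range(ex - n, ex + n + 1):
            found |= groups.get(neighbour, set())
    found.discard(email)  # the user is not their own match
    return sorted(found)


def rank_states(records):
    """Rank states by largest single group, and by number of groups."""
    # state -> exchange -> number of users (assumed reading of step 5)
    sizes = defaultdict(lambda: defaultdict(int))
    for record in records:
        sizes[record["state"]][exchange_of(record["phone1"])] += 1
    by_largest = sorted(sizes, key=lambda s: (-max(sizes[s].values()), s))
    by_count = sorted(sizes, key=lambda s: (-len(sizes[s]), s))
    return by_largest, by_count
```

With N = 0, related_emails returns only exact exchange matches (step 3). With 
N = 2 and a user in group 236, it returns users in groups 234 through 238 
(step 4). Ties in rank_states are broken alphabetically by state, which is an 
arbitrary choice.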


I was able to import the data with the code below, but I know there is a better 
method using the csv module. The code below just reads lines; I'd like to split 
each line into individual fields (columns) and assign primary and foreign keys 
in order to solve the challenge. What's the best way to accomplish this?

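        import sys

        class fetch500(object):
            """Reads the sample data file and echoes it line by line."""

            def __init__(self, path='us-500.csv'):
                # Read every raw line; fields are not split into columns yet.
                with open(path) as us_500_file:
                    self.lines = us_500_file.readlines()
                for line in self.lines:
                    # Each line already ends with a newline, so write it as-is.
                    sys.stdout.write(line)

        # Instantiating the class runs the import once and prints the file.
        data_import = fetch500()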
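
For comparison, a minimal sketch of the same import using the csv module might 
look like the following. It assumes the file's header row matches the field list 
given earlier, and it treats the e-mail address as a primary-key-like lookup, 
which is only one possible choice. The helper names (load_records, 
index_by_email) and the one-row sample are made up for illustration, and the 
sketch is written for Python 3.

```python
import csv
import io


def load_records(csv_file):
    """Parse an open CSV file into a list of dicts keyed by the header row."""
    return list(csv.DictReader(csv_file))


def index_by_email(records):
    """Use the e-mail address as a primary-key-like lookup for each record."""
    return {record["email"]: record for record in records}


# Made-up one-record stand-in for us-500.csv so the sketch runs on its own.
SAMPLE = io.StringIO(
    'first_name,last_name,company_name,address,city,county,'
    'state,zip,phone1,phone2,email,web\n'
    '"Ann","Example","Example, Inc","1 Main St","Springfield","Sample",'
    '"IL","62701","217-236-0001","217-236-0002","ann@example.com",'
    '"http://www.example.com"\n'
)

records = load_records(SAMPLE)
by_email = index_by_email(records)
print(by_email["ann@example.com"]["company_name"])  # Example, Inc
```

Unlike splitting each line on commas by hand, csv.DictReader copes with quoted 
fields that contain commas, such as the company name above. To read the real 
file, pass load_records a file opened with open('us-500.csv', newline='').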