On 01/12/2010 02:03, javivd wrote:
On Nov 30, 11:43 pm, Tim Harig<user...@ilthio.net> wrote:
On 2010-11-30, javivd<javiervan...@gmail.com> wrote:
I now have a case in which another file has been provided (besides the
database) that tells me in which columns of the file each variable is,
because there is no blank or tab character separating the
variables; they are stuck together. This second file specifies each
variable name and its position:
VARIABLE NAME POSITION (COLUMN) IN FILE
var_name_1 123-123
var_name_2 124-125
var_name_3 126-126
..
..
var_name_N 512-513 (last positions)
I am unclear on the format of these positions. They do not look like
what I would expect from absolute references in the data. For instance,
123-123 could only contain one byte, which could change for different
encodings and for how you mark line endings. Frankly, the use of the
word "column" in the header suggests that the data *is* separated by
line endings rather than absolute position and the position refers to
the line number. In which case, you can use splitlines() to break up
the data and then address the proper line by index. Nevertheless,
you can use file.seek() to move to an absolute offset in the file,
if that really is what you are looking for.
I work in a survey research firm. The data I'm talking about has a lot
of 0-1 variables, meaning yes or no answers to a lot of questions, so only
one character position (not byte) is needed, which explains the 123-123
kind of positions for a lot of variables.
And no, MRAB, it's not a similar problem (at least as I understood
it). I have to associate the positions this file gives me with the
variable names this file gives me for those positions.
Thank you both, and sorry for my English!
You just have to parse the second file to build a list (or dict)
containing the name, start position and end position of each variable:
variables = [("var_name_1", 123, 123), ...]
and then work through that list, extracting the data between those
positions in the first file and putting the values in another list (or
dict).
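As a rough sketch of the parsing step, assuming each line of the second
file looks like "var_name_1 123-123" (name, whitespace, start-end) and
that the header and ".." filler lines can simply be skipped. The function
name parse_layout is made up for illustration:

```python
def parse_layout(lines):
    """Return a list of (name, start, end) tuples from layout lines."""
    variables = []
    for line in lines:
        parts = line.split()
        # Skip blank lines, the header and filler lines such as "..".
        if len(parts) < 2 or "-" not in parts[1]:
            continue
        start, _, end = parts[1].partition("-")
        if not (start.isdigit() and end.isdigit()):
            continue
        # Anything after the range, e.g. "(last positions)", is ignored.
        variables.append((parts[0], int(start), int(end)))
    return variables
```

Since iterating over an open text file yields its lines, the result of
open() on the second file can be passed straight to parse_layout().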
You also need to check whether the positions are 1-based or 0-based
(Python uses 0-based).
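To illustrate the difference, here is a small sketch of the extraction
step for one line (record) of the data file. It assumes the end position
is inclusive, as "123-123" for a single character suggests. The names
extract and one_based are invented for this example:

```python
def extract(record, variables, one_based=True):
    """Slice one record into a dict of {name: value}.

    variables is a list of (name, start, end) tuples with an
    inclusive end position.
    """
    # Shift 1-based positions down so they match Python's 0-based slices.
    offset = 1 if one_based else 0
    values = {}
    for name, start, end in variables:
        # The slice end is exclusive, hence the + 1.
        values[name] = record[start - offset : end - offset + 1]
    return values


# Made-up record and layout, for illustration only.
layout = [("q1", 1, 1), ("q2", 2, 2), ("age", 3, 4)]
print(extract("10423", layout))
# {'q1': '1', 'q2': '0', 'age': '42'}
```

If the positions turn out to be 0-based, pass one_based=False. Trying it
on a variable whose value you already know is a quick way to find out
which convention the file uses.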