Steven,

As I head out the door, I will sketch it.

Given a data.frame populated with N rows and any number of columns, you want to break it into training and test data sets.

In a data.frame, you can refer to a row by an index such as 5 or 2019. You can ask for the number of rows it currently holds. You can also create an array/vector of length N that says which random rows of the N you want and which you don't.

For the purposes of this task, you choose random numbers in the range of N and either keep the numbers as indices or use them to mark True/False in the vector. You then ask for a new data.frame made by indexing the existing one using the vector. You can then negate the vector and ask for a second new data.frame by indexing the original with the negated vector. Something close to that.

Or, you can simply add the vector as a new column in the data.frame in some form. It would then mark which rows are to be used for which purpose. Later, when using the data, you include a CONDITION that row X is true, or whatever.

-----Original Message-----
From: Tutor <tutor-bounces+avigross=verizon....@python.org> On Behalf Of Steven D'Aprano
Sent: Friday, December 28, 2018 11:12 PM
To: tutor@python.org
Subject: Re: [Tutor] decomposing a problem

On Fri, Dec 28, 2018 at 10:39:53PM -0500, Avi Gross wrote:

> I will answer this question then head off on vacation.

You wrote about 140 or more lines, but didn't come close to answering the question: how to randomly split data from a dictionary into training data and reserved data.

--
Steve
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
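
The splitting steps described at the top of this message can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: it uses a plain list of row dictionaries as a stand-in for a data.frame, and the names split_rows, mark_rows, test_fraction, seed and is_test are made up for the example.

```python
import random


def split_rows(rows, test_fraction=0.25, seed=None):
    """Randomly split a list of rows into (train, test, mask).

    Hypothetical helper: `rows` stands in for a data.frame with N rows.
    """
    rng = random.Random(seed)           # a private generator, so a seed gives a repeatable split
    n = len(rows)                       # the number of rows currently in existence
    n_test = round(n * test_fraction)   # how many rows to reserve for testing

    # Choose random row numbers in the range of N, without repeats.
    test_indices = set(rng.sample(range(n), n_test))

    # Turn the chosen indices into a True/False vector of length N.
    mask = [i in test_indices for i in range(n)]

    # Index the rows with the vector, then with its negation.
    test = [row for row, flag in zip(rows, mask) if flag]
    train = [row for row, flag in zip(rows, mask) if not flag]
    return train, test, mask


def mark_rows(rows, mask, column="is_test"):
    """Alternative: keep one table and add the vector as a new column."""
    return [{**row, column: flag} for row, flag in zip(rows, mask)]


if __name__ == "__main__":
    rows = [{"id": i, "value": i * i} for i in range(20)]

    train, test, mask = split_rows(rows, test_fraction=0.25, seed=2018)
    print(len(train), "training rows,", len(test), "test rows")

    # Later, when using the data, filter on the condition stored in the column.
    marked = mark_rows(rows, mask)
    test_again = [row for row in marked if row["is_test"]]
    print(len(test_again), "rows marked for testing")
```

If the data really lives in a pandas DataFrame df and mask is a boolean NumPy array, the corresponding indexing would be df[mask] and df[~mask]; the pure-Python version above just avoids assuming any particular library.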