Marco,
Thanks. I am learning Selenium for the automation: I want to test web sites for keyboard and mouse interaction and record the results. That, at least, is the long-term goal. In the short term, I will have a look at your suggestion.

From: Marco Mistroni <mmistr...@gmail.com>
Sent: Sunday, 27 January 2019 9:46 PM
To: mhysnm1...@gmail.com
Cc: tutor@python.org
Subject: Re: [Tutor] Web scraping using selenium and navigating nested dictionaries / lists.

Hi, my 2 cents. Have a look at Scrapy for scraping. Selenium is a very good tool to learn, but it is mainly for automating UAT of GUIs. Scrapy will scrape for you, and you can automate it via cron. It's the same stuff I am doing at the moment.
HTH

On Sun, Jan 27, 2019, 8:34 AM <mhysnm1...@gmail.com> wrote:

All,

Goal of new project: I want to scrape all the books I have purchased from Audible.com. Eventually I want to export this as a CSV file or maybe JSON; I have not got that far yet. The reasoning behind this is to learn Selenium for my work and to get the list of books I have purchased, killing two birds with one stone. The work focus is to see whether Selenium can automate some of the testing I have to do and collect useful information from the web page for my reports. That part of the goal is in the future, as I need to build up my Python skills first.

Thus far, I have been successful in logging into Audible and showing the library of books. I am able to store the table of books and want to use BeautifulSoup to extract the relevant information. The information I want from the table is:

* Author
* Title
* Date purchased
* Length
* Is the book in a series (there is a link for this)
* Link to the page storing the publish details
* Download link

Hopefully this gives you enough information on what I am trying to achieve at this stage. As I learn more about what I am doing, I am adding possible extra tasks, such as verifying whether I have already downloaded the book via iTunes.
Learning goals:

Using the BeautifulSoup structure that I have extracted from the page source for the table, I want to navigate the tree structure. BeautifulSoup provides ways to reach children, siblings and parents. This is where I get stuck with programming logic. BeautifulSoup does provide the find_all method plus selectors, which I do not want to use for this exercise, as I want to learn how to walk a tree starting at the root and visiting each node of the tree. Then I can look at the attributes for the tag as I go. I believe I have to set up a recursive loop or function call, but I am not sure how to do this.

Pseudo code:

Build table structure.
Start at the root node.
Check to see if there are any children.
Pass first child to function.
Print attributes for tag at this level.
In function, check for any sibling nodes.
If they exist, call function again.
If no siblings, then start at first sibling and get its child.

This is where I get stuck. Each sibling can have children, and they can have siblings. So how do I ensure I visit each node in the tree?

Any tips or tricks for this would be appreciated, as I could use this in other situations.

Sean
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
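
One possible way to turn the pseudo code in the quoted message into working code is sketched below. It is only an illustration, not the method anyone in the thread used. It assumes the bs4 package is installed, and the sample markup, the SAMPLE_HTML name and the walk function are all hypothetical stand-ins for the real Audible table. The point it tries to show is that siblings need no special handling: looping over a node's children already visits every sibling at that level, and the recursive call takes care of each child's own children.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical sample markup standing in for the real library table.
SAMPLE_HTML = """
<table id="library">
  <tr class="book"><td class="title">Book One</td><td class="author">A. Writer</td></tr>
  <tr class="book"><td class="title">Book Two</td><td class="author">B. Writer</td></tr>
</table>
"""


def walk(node, depth=0, visited=None):
    """Visit node and every tag below it, depth first, in document order.

    Returns a list of (depth, tag name, attributes) tuples.
    """
    if visited is None:
        visited = []

    # Handle the current node: record it and print its attributes.
    visited.append((depth, node.name, dict(node.attrs)))
    print("  " * depth + node.name, node.attrs)

    # The loop covers all siblings at the next level down;
    # the recursive call covers each child's own subtree.
    for child in node.children:
        if isinstance(child, Tag):  # skip text nodes, which have no attributes
            walk(child, depth + 1, visited)

    return visited


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
root = soup.table  # attribute access, so no find_all or selectors are needed
visited = walk(root)
```

The recursion stops by itself when a tag has no child tags, because the loop body then never runs. BeautifulSoup also offers a built-in .descendants generator that yields the same nodes without hand-written recursion, which may be handy once the exercise is done.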