"Crawling"... is that the same as "parsing"? I have limited knowledge of databases in general; most of my exposure comes from SQL.
On Fri, Oct 26, 2018, 5:18 AM Frau Silvia Sánchez <lailah...@gmail.com> wrote:
>
> Hi Bruce,
>
> Sounds like an interesting problem to solve. And it sounds like something
> that might be useful for me too at some moment. Unfortunately, my knowledge
> of databases is about null. But I'm always ready to learn and help, so I'd
> love to know more if that's okay for you.
> As for knowing someone that is good at it, I'm sorry but I don't.
>
> Kind regards,
> Silvia
>
> On Thu, 25 Oct 2018 at 22:00, bruce <badoug...@gmail.com> wrote:
>
>> Hi.
>>
>> Got an issue. this is waaaay off topic. And I apologize. If more than
>> a few object, would the moderator please kill the thread. i wouldn't
>> have posted, but the list has been kind of "slow" lately, and.. well..
>> I have no tech/cool people to turn to!
>>
>> I'm working on a crawling project. The overall project is geared
>> towards crawling a number of college sites (~400) to get class data,
>> as well as the required book data. The process targets the colleges,
>> does the fetch/parse, and stores the data into a mysql db. A similar
>> process occurs for the book data.
>>
>> My issue. Please don't laugh. Make sure you're not drinking your
>> bourbon.. the crawl for the bookdata.. takes ~2-3 days.. running a
>> bunch of processes on a number of cheap digitalocean servers. The
>> process generates ~720K "sections" across the colleges (for the book
>> section/ISBN data).
>>
>> I know there are people/resources who are "good" with this. I just
>> don't "know" any of them that I can talk with!
>>
>> If you guys know of anyone, or have any thoughts/ideas/etc.. I'd
>> appreciate the opportunity to discuss/chat/talk/etc..
>>
>> And yeah, this process is really crude, but it more or less works.. If
>> I had clones of me, I'd implement queues, and test out other things to
>> speed up the overall processing time.
>>
>> thanks for reading..
>>
>> and again.. to the moderator.. if a few people object to this, feel
>> free to kill the thread!
>>
>> thanks guys!
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org