"Crawling"....is that the same as "Parsing"?....I have limited knowledge of
databases in general; most of my exposure comes from SQL.

On Fri, Oct 26, 2018, 5:18 AM Frau Silvia Sánchez <lailah...@gmail.com>
wrote:

>
> Hi Bruce,
>
> Sounds like an interesting problem to solve. And it sounds like something
> that might be useful for me too at some point. Unfortunately, my knowledge
> of databases is about null. But I'm always ready to learn and help, so I'd
> love to know more if that's okay with you.
> As for knowing someone who is good at it, I'm sorry, but I don't.
>
> Kind regards,
> Silvia
>
>
>
> On Thu, 25 Oct 2018 at 22:00, bruce <badoug...@gmail.com> wrote:
>
>> Hi.
>>
>> Got an issue. This is waaaay off topic, and I apologize. If more than
>> a few object, would the moderator please kill the thread? I wouldn't
>> have posted, but the list has been kind of "slow" lately, and.. well..
>> I have no tech/cool people to turn to!
>>
>> I'm working on a crawling project. The overall project is geared
>> towards crawling a number of college sites (~400) to get class data,
>> as well as the required book data. The process targets the colleges,
>> does the fetch/parse, and stores the data into a MySQL db. A similar
>> process occurs for the book data.
>>
>> My issue. Please don't laugh. Make sure you're not drinking your
>> bourbon.. the crawl for the book data takes ~2-3 days, running a
>> bunch of processes on a number of cheap DigitalOcean servers. The
>> process generates ~720K "sections" across the colleges (for the book
>> section/ISBN data).
>>
>> I know there are people/resources who are "good" with this. I just
>> don't "know" any of them that I can talk with!
>>
>> If you guys know of anyone, or have any thoughts/ideas/etc., I'd
>> appreciate the opportunity to discuss/chat/talk/etc.
>>
>> And yeah, this process is really crude, but it more or less works.. If
>> I had clones of me, I'd implement queues, and test out other things to
>> speed up the overall processing time.
>>
>> thanks for reading..
>>
>> and again.. to the moderator.. if a few people object to this, feel
>> free to kill the thread!
>>
>> thanks guys!
>> _______________________________________________
>> users mailing list -- users@lists.fedoraproject.org
>> To unsubscribe send an email to users-le...@lists.fedoraproject.org
>> Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
>> List Archives:
>> https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
>>