Hi Jeremy,

I hope the following will explain it better. First of all, though, I don't 
know yet whether this violates any licences; I'm still checking.

Some Facts:-
1- TiddlyWiki has multiple "Data Sources" (Communities) around the internet 
(this GG, TW5 Dev GG, TW5 Doc GG, GitHub Discussions, Stack Overflow 
Questions, TW5 on Reddit, ...... etc)
2- TiddlyWiki is the BEST Personal Information Management System we 
know/used (this is a FACT for me, but to prove it to others, TW should at 
least manage its own information).

Now, in a few words: my goal is to EXPORT all TiddlyWiki data/info from 
those data sources (by scraping/crawling) and IMPORT it into a unified 
TiddlyWiki System (converting data units into tiddlers) so we can start 
to use it and build on it.
In other words, let TiddlyWiki own its data that is scattered around the 
Internet, and build a "TW Information Portal" that shows the EXTREME power 
of TW5 over its own data.
NOTE that this will NOT replace any of the original Data Sources; it will 
only complement them and provide a "Portal" that we can build on.

So, at the end of the 1st phase, we'll have a single TiddlyWiki system 
(Node.js) containing something like the following (ALL the data below will 
be extracted from the sources, with NO Human Intervention at this phase):

Imaginary TW5 Google Group Tiddler (we'll have > 120K of them)
===================================
title: "GG TW5 ID pDlJDdWZNHQ"
tags: [[2021]] [[Conversation]] [[Message]] [[TiddlyWiki Google Group]]
custom-field-gg-title: "The History Show of the GG Community"
custom-field-gg-url: "https://groups.google.com/g/tiddlywiki/c/pDlJDdWZNHQ"
custom-field-gg-author: "Taacees"
custom-field-gg-date: "01 Aug 2021"
text: "Message Body"
..... Any other info in separate "Custom Fields"
AND a separate tiddler for each reply, with a link to the main Question
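To make the target format a bit more concrete, here is a minimal sketch in Go of how one scraped message could be written out as ".tid" text. It assumes the usual ".tid" layout ("name: value" header lines, a blank line, then the body) with unquoted values, and that tags containing spaces are wrapped in [[...]]. The Tiddler type, the RenderTid function and the shortened field names (gg-url, gg-author) are hypothetical and only for illustration; they are not taken from the actual scrapers. Multi-line field values are not handled.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Tiddler is a hypothetical in-memory form of one scraped message.
type Tiddler struct {
	Title  string
	Tags   []string
	Fields map[string]string // extra fields, e.g. "gg-url" (illustrative names)
	Text   string
}

// formatTags joins tags into a space-separated list, wrapping any tag
// that contains whitespace in [[...]].
func formatTags(tags []string) string {
	parts := make([]string, 0, len(tags))
	for _, tag := range tags {
		if strings.ContainsAny(tag, " \t") {
			parts = append(parts, "[["+tag+"]]")
		} else {
			parts = append(parts, tag)
		}
	}
	return strings.Join(parts, " ")
}

// RenderTid serialises a tiddler as .tid text: header lines, a blank
// line, then the body. Extra fields are sorted by name so that the
// output is deterministic.
func RenderTid(td Tiddler) string {
	var b strings.Builder
	fmt.Fprintf(&b, "title: %s\n", td.Title)
	if len(td.Tags) > 0 {
		fmt.Fprintf(&b, "tags: %s\n", formatTags(td.Tags))
	}
	names := make([]string, 0, len(td.Fields))
	for name := range td.Fields {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Fprintf(&b, "%s: %s\n", name, td.Fields[name])
	}
	b.WriteString("\n")
	b.WriteString(td.Text)
	return b.String()
}

func main() {
	msg := Tiddler{
		Title: "GG TW5 ID pDlJDdWZNHQ",
		Tags:  []string{"2021", "Conversation", "TiddlyWiki Google Group"},
		Fields: map[string]string{
			"gg-url":    "https://groups.google.com/g/tiddlywiki/c/pDlJDdWZNHQ",
			"gg-author": "Taacees",
		},
		Text: "Message Body",
	}
	fmt.Println(RenderTid(msg))
}
```

Sorting the extra fields is only a convenience: it keeps the generated files stable between runs, which should make diffs easier to read if the output is kept under version control.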

Imaginary GitHub Discussion Tiddler
===================================
title: "GitHub Discussions ID 5924"
tags: [[2021]] [[Discussion]] [[TiddlyWiki GitHub Discussions]]
custom-field-github-title: "Bitmap editor - should we use pointer events?"
custom-field-github-url: "https://github.com/Jermolene/TiddlyWiki5/discussions/5924"
custom-field-github-author: "BurningTreeC"
custom-field-github-date: "01 Aug 2021"
text: "Message Body"
..... Any other info in separate "Custom Fields"
AND a separate tiddler for each reply, with a link to the main Question
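One possible way to link each reply tiddler back to its main question is to store the title of the question tiddler in a field on the reply. The Go sketch below illustrates that idea only; the Post type, the helper names and the "parent" field name are assumptions made for this example, not part of the real scrapers.

```go
package main

import "fmt"

// Post is a hypothetical scraped message. ParentID is empty for the
// opening post of a conversation and set for every reply.
type Post struct {
	ID       string
	ParentID string
	Body     string
}

// titleFor builds a tiddler title from a per-source prefix and a post ID,
// e.g. "GitHub Discussions ID" + "5924".
func titleFor(prefix, id string) string {
	return prefix + " " + id
}

// linkFields returns the extra fields that tie a reply to its question.
// Opening posts get no "parent" field.
func linkFields(prefix string, p Post) map[string]string {
	fields := map[string]string{}
	if p.ParentID != "" {
		fields["parent"] = titleFor(prefix, p.ParentID)
	}
	return fields
}

func main() {
	question := Post{ID: "5924", Body: "Message Body"}
	reply := Post{ID: "5924-1", ParentID: "5924", Body: "Reply Body"}

	for _, p := range []Post{question, reply} {
		fmt.Println(titleFor("GitHub Discussions ID", p.ID), linkFields("GitHub Discussions ID", p))
	}
}
```

With a field like this in place, the replies to a question could then be listed inside TiddlyWiki with a filter on that field, for example something like [field:parent[GitHub Discussions ID 5924]].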

Imaginary Reddit Question/Comment Tiddler
=========================================
title: "Reddit Question ID onx6qn"
tags: [[2021]] [[Reddit]] [[Question]]
custom-field-reddit-title: "Newbie Question: Editing a field in a template"
custom-field-reddit-url: "https://www.reddit.com/r/TiddlyWiki5/comments/onx6qn/newbie_question_editing_a_field_in_a_template/"
custom-field-reddit-author: "u/OneDiscombobulated83"
custom-field-reddit-date: "01 Aug 2021"
text: "Question Body"
..... Any other info in separate "Custom Fields"
AND a separate tiddler for each reply, with a link to the main Question

Imaginary Stack Overflow Question/Answer Tiddler
================================================
title: "StackOverflow Answer ID 34693482"
tags: [[2016]] [[Stack Overflow]] [[Reply]] [[Answer]]
custom-field-stackoverflow-title: "tiddlywiki: can't save changes in QWebView"
custom-field-stackoverflow-date: "9 Jan 2016"
custom-field-stackoverflow-url: "https://stackoverflow.com/questions/34693482/tiddlywiki-cant-save-changes-in-qwebview"
text: "Question Body"
..... All other info in separate "Custom Fields"
AND a separate tiddler for each answer, with a link to the main Question
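Since the year tags and date fields in the examples above are meant to come straight from the source with no human intervention, here is a small Go sketch of how a scraped date string could be turned into a year tag and into a 17-digit UTC timestamp (YYYYMMDDHHMMSSmmm), the form TiddlyWiki uses for its own created/modified fields. The function names are hypothetical, and the input layout is assumed to be "day, short month name, year" as in the examples; real sources will likely need more layouts.

```go
package main

import (
	"fmt"
	"time"
)

// parseSourceDate parses dates such as "01 Aug 2021" or "9 Jan 2016".
// The result is midnight UTC on that day.
func parseSourceDate(s string) (time.Time, error) {
	return time.Parse("2 Jan 2006", s)
}

// yearTag returns the four-digit year, usable as a tag such as "2021".
func yearTag(d time.Time) string {
	return d.Format("2006")
}

// twTimestamp formats d as a 17-digit UTC timestamp: YYYYMMDDHHMMSSmmm.
func twTimestamp(d time.Time) string {
	u := d.UTC()
	return u.Format("20060102150405") + fmt.Sprintf("%03d", u.Nanosecond()/1e6)
}

func main() {
	for _, s := range []string{"01 Aug 2021", "9 Jan 2016"} {
		d, err := parseSourceDate(s)
		if err != nil {
			fmt.Println("cannot parse:", s, err)
			continue
		}
		fmt.Println(s, "->", yearTag(d), twTimestamp(d))
	}
}
```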

I hope this makes the idea clearer, but it will be easier to see once I 
can show some example "tid" files.

As for the tools, I'm developing the scrapers/crawlers using the 
following Go libraries:

   - gocolly/colly <https://github.com/gocolly/colly>
   - chromedp/chromedp <https://github.com/chromedp/chromedp>
   
Regards

On Sunday, August 1, 2021 at 2:41:26 PM UTC [email protected] wrote:

> Hi Taacees
>
> Thank you! The project looks intriguing, could you explain a little more? 
> From the GitHub repo I understand that you’re trying to mine the Google 
> Group archive for useful information, but I’d like to understand more about 
> your goals and the techniques you are using.
>
> Best wishes
>
> Jeremy.
>
>
> On 1 Aug 2021, at 04:27, Taacees ME <[email protected]> wrote:
>
> Hi Everyone,
>
> Following Jeremy's advice 
> <https://twitter.com/Jermolene/status/1420658644211978240>, kindly accept 
> the invitation to celebrate our GG community reaching the count of 
> 25K conversations (almost 😉).
> Get your popcorn, soda and favorite audio tracks, and enjoy the 16-year 
> history of our Community (try to forget any cons of the "Google Group" 😉)
>
> ‎The History Show of TiddlyWiki GG Community 
> <https://www.youtube.com/playlist?list=PLh3C3gr79VC1UxmXoGtaTyU1iPaw_WJg_>
>
> Regards

-- 
You received this message because you are subscribed to the Google Groups 
"TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/cbc51e1d-8506-444b-9d71-a7448cd17709n%40googlegroups.com.