Hi all,

This week I've been working on the schema. This is what I came up
with:

Conversations:
* id
* account
* type (1:1, group, channel)
* entity_identifier (id of contact, group or channel)

Messages:
* id
* conversation_id
* timestamp
* sender_id (contact id, if available)
* unknown_sender_name (sender name, if id not available, e.g. in irc)
* type (text, event, file, voice clip)
* content (text content or media location)
* importance
* state

The idea is that from a contact, group or channel in the contact list,
the plugin is able to get a persistent "conversation", and from that,
get all the messages. "entity_identifier" sets the bridge between a
contact (or room, or channel) and a "conversation".
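
To make the lookup concrete, here is a rough sketch of the two tables
and of the contact -> conversation -> messages path, using Python's
sqlite3 module only for illustration (the plugin itself would not be
written in Python). The column types, the UNIQUE constraint and the
helper names (open_history, get_conversation, get_messages) are my
assumptions, not settled parts of the design.

```python
import sqlite3

# Column names follow the lists above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE conversations (
    id                INTEGER PRIMARY KEY,
    account           TEXT NOT NULL,
    type              TEXT NOT NULL CHECK (type IN ('1:1', 'group', 'channel')),
    entity_identifier TEXT NOT NULL,
    UNIQUE (account, type, entity_identifier)
);
CREATE TABLE messages (
    id                  INTEGER PRIMARY KEY,
    conversation_id     INTEGER NOT NULL REFERENCES conversations(id),
    timestamp           INTEGER NOT NULL,
    sender_id           TEXT,   -- contact id, if available
    unknown_sender_name TEXT,   -- used when no id exists, e.g. in irc
    type                TEXT NOT NULL,
    content             TEXT NOT NULL,
    importance          INTEGER,
    state               INTEGER
);
"""


def open_history(path=":memory:"):
    """Open a database and create the two tables (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def get_conversation(conn, account, conv_type, entity_identifier):
    """Return the id of the persistent conversation, creating it on first use."""
    key = (account, conv_type, entity_identifier)
    conn.execute(
        "INSERT OR IGNORE INTO conversations (account, type, entity_identifier) "
        "VALUES (?, ?, ?)",
        key,
    )
    row = conn.execute(
        "SELECT id FROM conversations "
        "WHERE account = ? AND type = ? AND entity_identifier = ?",
        key,
    ).fetchone()
    return row[0]


def get_messages(conn, conversation_id):
    """Return (sender_id, unknown_sender_name, content) rows, oldest first."""
    return conn.execute(
        "SELECT sender_id, unknown_sender_name, content FROM messages "
        "WHERE conversation_id = ? ORDER BY timestamp, id",
        (conversation_id,),
    ).fetchall()
```

The same two calls work for a contact, a group or a channel; only the
type and entity_identifier values change.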

(It's weird that today it doesn't feel natural anymore to simply focus
on contacts instead of conversations, at least UX-wise. AFAIK, all
modern chat services now display conversations in the main window. Am I
alone here?)

The messages table is supposed to be agnostic to the protocol and to the
number of people in the chat. Direct one-to-one and multi-user chats
have the same structure.

I'm not yet sure how to properly handle multimedia messages, but I think
I should be focusing on text messages for now anyway. Text messages are
stored in HTML in the messages table, but in plain text in the full-text
search table.
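
A minimal sketch of that split, again in Python only for illustration:
the HTML goes into the messages table untouched, while a tag-stripped
copy goes into an FTS5 table keyed by the same rowid. The reduced
messages table, the table name messages_fts, the column name body and
the helper names are assumptions of mine. It uses a regular FTS5 table
rather than an external-content one, and it needs an SQLite build with
FTS5 enabled.

```python
import sqlite3
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Drop the tags and join the text, so 'te<b>xt</b>' becomes 'text'."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def open_history(path=":memory:"):
    """Create a reduced messages table plus the plain-text FTS5 index."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT NOT NULL);
        CREATE VIRTUAL TABLE messages_fts USING fts5(body);
        """
    )
    return conn


def store_message(conn, html):
    """Store the HTML as-is and index its plain-text form under the same rowid."""
    cur = conn.execute("INSERT INTO messages (content) VALUES (?)", (html,))
    message_id = cur.lastrowid
    conn.execute(
        "INSERT INTO messages_fts (rowid, body) VALUES (?, ?)",
        (message_id, html_to_text(html)),
    )
    return message_id


def search(conn, query):
    """Return the ids of messages whose plain text matches the FTS5 query."""
    rows = conn.execute(
        "SELECT rowid FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]
```

With this layout, a search for "text" finds a message stored as
"te<b>xt</b>", and tag names such as "b" never end up in the index.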

Is there anything obviously unsound in this schema? I'm sure it will have
to improve before the project is complete, but I think this is a good
starting point.

Next week I'll be working on the storage façade, and I will probably send
a few more patches regarding the Qt5 and KF5 migration!

Paulo

On 06-06-2017 11:34, Pali Rohár wrote:
> Hi!
> 
> On Tuesday 06 June 2017 11:05:26 Paulo Lieuthier wrote:
>> Hi, Pali!
>>
>> Really sorry for the delay! I will do my best to send my reports on
>> Saturdays.
> 
> Ok.
> 
>> I'm keeping my work on Github [1]. I'm trying to keep my commits very
>> organized, so you can easily track my progress, and maybe cherry-pick
>> bugfixes if they are worth it.
>>
>> That's what I did this week:
>>
>> * I got the XML-based plugin to build and work on top of the kf5 branch.
>> It took some time for me to grok it, but now I'm more familiar with the
>> codebase. There is a bug when creating the history XML files, which I
>> fixed in [2] (should I send a patch to review board?).
> 
> Yes, please send patches which are ready to review board.
> 
>> * I have done some research about SQLite full-text search capabilities.
>> I think it is a great option. Metadata is stored separately from the text
>> indexes, and the indexes are automatically updated when some record
>> changes [3]. Also, there is built-in support for relevancy-based ordering
>> and stemming [5].
> 
> Note that the message is in HTML format, but searching needs to be done
> on the text part, not on HTML tags. So if there is "te<b>xt</b>", then it
> must be possible to search for "text".
> 
>> We haven't discussed much about what should be kept or removed from the
>> current plugins, but I don't think there is much to bikeshed here: I must
>> aim for a simple and fast storage with a complete schema that works for
>> all protocols.
>>
>> Speaking of schema, that's my main goal for this week. I believe it will
>> be similar to what I sent previously, but not quite. I'm thinking of
>> something more general, that is not tied to text or simple groups
>> (multimedia messages and channels must be first-class).
>>
>> I'm looking forward to getting feedback.
>>
>> Paulo
>>
>> [1] https://github.com/paulolieuthier/kopete/commits/history-plugin
>> [2] https://github.com/paulolieuthier/kopete/commit/a901f0455f228a467dd9e5f8dd990535b9da5873
>> [3] https://sqlite.org/fts5.html#external_content_tables
>> [4] https://sqlite.org/fts5.html#_summary_of_technical_differences_
>> [5] https://sqlite.org/fts5.html#tokenizers
> 
