Motivation

   The motivation for ZEPPELIN-2619 is to change the note storage
structure. Previously we stored each note as {noteId}/note.json; we’d like to
change this to {note_name}_{note_id}.zpln. There are several reasons for
this change.


   1. {noteId}/note.json is not scalable. We put all notes in one root folder
      in a flat structure. When the Zeppelin server starts, it has to read
      every note.json to get the note name and build the note folder
      structure, because the note name used to build the notebook menu is
      stored inside note.json. This becomes a nightmare when there are a
      large number of notes.

   2. {noteId}/note.json is not maintainable. It is difficult for a
      developer/administrator to find a note file based on the note name.

   3. {noteId}/note.json has no folder structure. Currently Zeppelin has to
      build the folder structure in memory from the note names, which
      is a big overhead.


New Approach

   As mentioned above, I propose to change the note storage structure to
{note_name}_{note_id}.zpln. note_name can contain folders, e.g.
folder_1/mynote_abcd.zpln
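
To make the naming concrete, here is a minimal Python sketch of the mapping
from note name and note id to a file path. note_file_path is a hypothetical
helper written only for illustration; it is not Zeppelin code, and the real
implementation may handle leading slashes or special characters differently.

```python
def note_file_path(note_name: str, note_id: str) -> str:
    """Build the proposed relative path {note_name}_{note_id}.zpln.

    Hypothetical helper: note_name may contain folders separated by "/".
    """
    # Assumption: paths are relative to the notebook root, so surrounding
    # slashes in the note name are dropped.
    name = note_name.strip("/")
    if not name or not note_id:
        raise ValueError("note name and note id must be non-empty")
    return f"{name}_{note_id}.zpln"


print(note_file_path("folder_1/mynote", "abcd"))
```

With the example from this mail, a note named folder_1/mynote with id abcd
maps to folder_1/mynote_abcd.zpln.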

This kind of note storage structure could bring several benefits.

   1. We don’t need to load all notes when Zeppelin starts. We just need to
      list each folder to get the note name and note_id.

   2. It is much more maintainable, because it is easy to find the note file
      based on the note name.

   3. It already has a folder structure, which can be mapped directly to the
      note folder structure.
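
The first benefit can be sketched as follows, in Python for brevity. This is
only an illustration under stated assumptions: list_notes is a hypothetical
function, and it assumes the note id never contains an underscore, so the
last "_" in the file name separates the note name from the note id. No note
file is opened; only directory listings are read.

```python
import os


def list_notes(root: str) -> dict:
    """Return {note_id: note_name} using file names only (hypothetical sketch)."""
    suffix = ".zpln"
    notes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffix):
                continue  # ignore files that are not notes
            stem = filename[: -len(suffix)]
            # Assumption: the note id has no underscore, so split on the last one.
            name, sep, note_id = stem.rpartition("_")
            if not sep or not name or not note_id:
                continue  # file name does not match {note_name}_{note_id}
            rel_dir = os.path.relpath(dirpath, root)
            folder = "" if rel_dir == "." else rel_dir.replace(os.sep, "/") + "/"
            notes[note_id] = folder + name
    return notes
```

For a root folder that contains folder_1/mynote_abcd.zpln, this returns a
mapping from abcd to folder_1/mynote, which is enough to build the notebook
menu without parsing any note content.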


Side Effect

This approach only works for file system storage, which means we have to
drop support for MongoNotebookRepo. I think that is OK because I haven’t seen
any users discuss it in the community, so I assume no one is using it.


This is the overall design; any comments and feedback are welcome. Thanks.


Here's the Google Doc; you can also comment there.

https://docs.google.com/document/d/126egAQmhQOL4ynxJ3AQJQRBBLdW8TATYcGkDL1DNZoE/edit?usp=sharing
