Jiabao Sun created ZEPPELIN-5718:
------------------------------------

             Summary: Notebook lost due to non-atomic file writes.
                 Key: ZEPPELIN-5718
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5718
             Project: Zeppelin
          Issue Type: Improvement
          Components: zeppelin-zengine
    Affects Versions: 0.10.1, 0.9.0
            Reporter: Jiabao Sun


A Zeppelin notebook file may be left behind as an xxx.zpln.tmp file and be 
lost when Zeppelin restarts.

 
org.apache.zeppelin.notebook.FileSystemStorage#writeFile

The code shows that when we need to change a .zpln file, we delete it first 
and then rename the .zpln.tmp file to .zpln. This sequence is not atomic, so 
there may be cases where the file has been deleted but the temporary file has 
not yet been renamed.
{code:java}
  public void writeFile(final String content, final Path file, boolean writeTempFileFirst,
      Set<PosixFilePermission> permissions) throws IOException {
    FsPermission fsPermission;
    if (permissions == null || permissions.isEmpty()) {
      fsPermission = FsPermission.getFileDefault();
    } else {
      // FsPermission expects a 10-character string because of the leading
      // directory indicator, i.e. "drwx------". The JDK toString method returns
      // a 9-character string, so prepend a leading character.
      fsPermission = FsPermission.valueOf("-" + PosixFilePermissions.toString(permissions));
    }
    callHdfsOperation(new HdfsOperation<Void>() {
      @Override
      public Void call() throws IOException {
        InputStream in = new ByteArrayInputStream(content.getBytes(
            zConf.getString(ZeppelinConfiguration.ConfVars.ZEPPELIN_ENCODING)));
        Path tmpFile = new Path(file.toString() + ".tmp");
        IOUtils.copyBytes(in, fs.create(tmpFile), hadoopConf);
        fs.setPermission(tmpFile, fsPermission);
        fs.delete(file, true);
        fs.rename(tmpFile, file);
        return null;
      }
    });
  }
{code}
 

BTW, VFSNotebookRepo has the same problem: moveTo deletes the destination 
before renaming. For local files, should we consider not using VFS2?

org.apache.commons.vfs2.provider.AbstractFileObject#moveTo
{code:java}
public void moveTo(final FileObject destFile) throws FileSystemException {
    if (canRenameTo(destFile)) {
        if (!getParent().isWriteable()) {
            throw new FileSystemException("vfs.provider/rename-parent-read-only.error", getName(),
                    getParent().getName());
        }
    } else {
        if (!isWriteable()) {
            throw new FileSystemException("vfs.provider/rename-read-only.error", getName());
        }
    }

    if (destFile.exists() && !isSameFile(destFile)) {
        destFile.deleteAll();
        // throw new FileSystemException("vfs.provider/rename-dest-exists.error", destFile.getName());
    }

    if (canRenameTo(destFile)) {
        // issue rename on same filesystem
        try {
            attach();
            // remember type to avoid attach
            final FileType srcType = getType();

            doRename(destFile);

            FileObjectUtils.getAbstractFileObject(destFile).handleCreate(srcType);
            destFile.close(); // now the destFile is no longer imaginary. force reattach.

            handleDelete(); // fire delete-events. This file-object (src) is like deleted.
        } catch (final RuntimeException re) {
            throw re;
        } catch (final Exception exc) {
            throw new FileSystemException("vfs.provider/rename.error", exc, getName(), destFile.getName());
        }
    } else {
        // different fs - do the copy/delete stuff

        destFile.copyFrom(this, Selectors.SELECT_SELF);

        if ((destFile.getType().hasContent()
                && destFile.getFileSystem().hasCapability(Capability.SET_LAST_MODIFIED_FILE)
                || destFile.getType().hasChildren()
                        && destFile.getFileSystem().hasCapability(Capability.SET_LAST_MODIFIED_FOLDER))
                && fileSystem.hasCapability(Capability.GET_LAST_MODIFIED)) {
            destFile.getContent().setLastModifiedTime(this.getContent().getLastModifiedTime());
        }

        deleteSelf();
    }

}
{code}
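
For local files, one possible alternative to VFS2 is plain java.nio. The 
following is only a minimal sketch, not Zeppelin code: the class 
AtomicNoteWriter and the method writeAtomically are hypothetical names, and 
UTF-8 is assumed in place of the configured ZEPPELIN_ENCODING. It writes the 
temp file in the same directory and then moves it over the target without 
deleting the target first. Note that the Files.move Javadoc says replacing an 
existing target with ATOMIC_MOVE is implementation specific; on POSIX file 
systems it is expected to behave like rename(2) and replace the target.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not part of Zeppelin.
final class AtomicNoteWriter {

  private AtomicNoteWriter() {
  }

  static void writeAtomically(Path file, String content) throws IOException {
    Path dir = file.toAbsolutePath().getParent();
    // Create the temp file next to the target so the move stays on one file system.
    Path tmp = Files.createTempFile(dir, file.getFileName().toString(), ".tmp");
    try {
      // Assumption: UTF-8 stands in for the configured encoding.
      Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
      // No delete beforehand: the old file stays in place until the move succeeds.
      // Throws AtomicMoveNotSupportedException if the file system cannot do this.
      Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
    } finally {
      // No-op after a successful move; cleans up the temp file on failure.
      Files.deleteIfExists(tmp);
    }
  }
}
```

With this ordering, a crash before the move leaves the old .zpln file intact 
plus a stray temp file, and a crash after the move leaves the new .zpln file. 
There should be no point at which only the temp file exists.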



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
