[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

wangxiaojing Thu, 16 Oct 2014 20:48:07 -0700

Github user wangxiaojing commented on the pull request:

    https://github.com/apache/spark/pull/2765#issuecomment-59462998
  
    Hi @jerryshao, I am changing the code to use this parameter to control the 
searching depth. However, if the depth is greater than 1, filtering by the 
ignore time is not reasonable: when a new file appears in a second-level 
subdirectory, the modification time of the first-level subdirectory does not 
change. For example:
    The streaming job monitors the directory /tmp/
    The directory structure is:
     2014-10-16 19:17 /tmp/spark1
     2014-10-16 19:17 /tmp/spark1/spark2
    
    A file is then created in /tmp/spark1/spark2:
    
     2014-10-16 19:17 /tmp/spark1
     2014-10-16 19:18 /tmp/spark1/spark2
     2014-10-16 19:18 /tmp/spark1/spark2/file
    
    If the ignore time is used for filtering, the first-level subdirectory is 
always ignored, so the new file beneath it is never found. Can you give me 
some advice?
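
The behaviour described above can be reproduced on a local filesystem. The following is a minimal Python sketch, assuming POSIX-like directory semantics (HDFS may differ in detail). The function name and the temporary directory layout are illustrative only and are not part of the pull request.

```python
import os
import tempfile


def mtimes_after_nested_create():
    """Create a file two levels down and report which directory mtimes changed."""
    with tempfile.TemporaryDirectory() as root:
        level1 = os.path.join(root, "spark1")
        level2 = os.path.join(level1, "spark2")
        os.makedirs(level2)

        # Backdate both directories so that any later update is unambiguous.
        old = 1_000_000_000  # arbitrary timestamp (assumption, year 2001)
        os.utime(level2, (old, old))
        os.utime(level1, (old, old))

        # Creating a file adds an entry to level2 only.
        with open(os.path.join(level2, "file"), "w") as f:
            f.write("data")

        return {
            "level1_changed": os.stat(level1).st_mtime != old,
            "level2_changed": os.stat(level2).st_mtime != old,
        }


if __name__ == "__main__":
    print(mtimes_after_nested_create())
```

Only the directory that directly contains the new file gets a fresh modification time, so a filter that skips directories older than the ignore time would skip `spark1` and never reach `spark2`.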




