Re: libhdfs process fork problem

2012-03-23 Thread Brian Bockelman
Hi Tareq, This is because libhdfs will keep a bit of state data (especially if the master was connected to HDFS). Three suggestions: 1) [Likely to work] Fork the children first, then do any HDFS actions in the master. Alternately, don't have the master do any HDFS actions; have it fork a chil
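A minimal sketch of the first suggestion, fork the children first and only then let the master touch HDFS, is shown below. It is an illustration in Python rather than libhdfs code: `connect` is a hypothetical stand-in for whatever call opens the HDFS connection (in a C program using libhdfs that would be something like `hdfsConnect`), and `fork_workers_then_connect` is a name invented for this example.

```python
import os


def fork_workers_then_connect(num_workers, connect):
    """Fork every worker before the parent opens its own connection.

    `connect` is a hypothetical zero-argument callable standing in for the
    real HDFS connect call; it must return a non-None handle on success.
    """
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: forked before the parent held any client state,
            # so it opens a fresh connection of its own.
            handle = connect()
            # _exit skips the parent's cleanup handlers in the child.
            os._exit(0 if handle is not None else 1)
        pids.append(pid)

    # Parent: no more forks follow, so connecting here cannot leak
    # client state into a child.
    parent_handle = connect()

    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return parent_handle, exit_codes
```

The ordering is the whole point: every `os.fork()` happens before the parent's `connect()`, so no child inherits a half-initialized client.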

Re: libhdfs process fork problem

2012-03-27 Thread Brian Bockelman
pplication and it worked.. for some reason > it doesn't work in the other application that I'm working on. > Since I'm not connecting from master then I cannot really apply any of your > advice.. do you have other ideas? > Thanks > > On Fri, Mar 23, 2012 at 7:25

Tracking Replication errors

2009-09-09 Thread Brian Bockelman
Hey everyone, We're going through a review of our usage of HDFS (it's a good thing! - we're trying to get "official"). One reviewer asked a good question that I don't know the answer to - could you help? To quote, "What steps do you take to ensure the block rebalancing produces non-cor

Re: Tracking Replication errors

2009-09-09 Thread Brian Bockelman
he block. Is the CRC kept in the NN? Any specific reason why not, beyond decreasing the memory footprint? Brian Also, there is a thread in the datanode that periodically verifies the CRC of existing blocks. dhruba On Wed, Sep 9, 2009 at 7:27 PM, Brian Bockelman wrote: Hey everyone,

Re: Description of HDFS metrics

2009-11-30 Thread Brian Bockelman
Hey Suresh, If these are HDFS metrics, it sure would be interesting to post this to an outside-readable wiki. Thanks! Brian On Nov 30, 2009, at 6:56 PM, Suresh Srinivas wrote: > I have updated the metrics twiki that I got from Konstantin > http://twiki.corp.yahoo.com/view/Grid/ClusterMetrics

Re: HDFS Blockreport question

2010-04-06 Thread Brian Bockelman
Hey Jay, I think, if you're experienced in implementing transfer protocols, it is not difficult to implement the HDFS wire protocol. As you point out, they are subject to change between releases (especially between 0.20, 0.21, and 0.22) and basically documented in fragments in the java source

Re: HDFS-1150: Comment and review

2010-05-14 Thread Brian Bockelman
Hey Jakob, If I understand correctly, then two plausible ways of attacking a client are: 1) Screw up the client's DNS resolution somehow (reference the rather serious DNS hijacks that have gone on) and trick it into talking to an attacker's node. 2) Perform a MITM attack. Since we are basing this

Re: how can we know the statistics of namenode and datanode using API's

2010-06-11 Thread Brian Bockelman
Hey Vidur, Do you need to access it directly in Java, or can you just parse them out of (for example) Ganglia? There are JMX statistics, but I am not familiar enough with JMX at the code level to give you decent advice. Brian On Jun 11, 2010, at 12:39 AM, Vidur Goyal wrote: > Hi All, > > I

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
Hi Thanh, The scan period is the period that hadoop *attempts* to complete an entire node scan. That is, if it's set to 3 weeks, HDFS will try to scan each block once every 3 weeks. Obviously, depending on the bandwidth you have made available to the scanning thread, you can specify impossibl
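The point about specifying an impossible period can be checked with back-of-the-envelope arithmetic: a full scan within the period needs at least (bytes stored on the node) / (period) of sustained scan bandwidth. The sketch below assumes illustrative numbers only; the 24 TB node size and the 8 MiB/s scanner throughput are assumptions made up for the example, not values taken from this thread or from HDFS defaults.

```python
def required_scan_rate(total_bytes, scan_period_seconds):
    """Bytes per second the scanner must sustain to cover every block once."""
    return total_bytes / scan_period_seconds


def scan_period_is_feasible(total_bytes, scan_period_seconds,
                            scanner_bytes_per_second):
    """True if the available scanner bandwidth can finish within the period."""
    needed = required_scan_rate(total_bytes, scan_period_seconds)
    return needed <= scanner_bytes_per_second


THREE_WEEKS = 3 * 7 * 24 * 3600       # 1,814,400 seconds
SCANNER_RATE = 8 * 1024 * 1024        # assumed scanner throughput: 8 MiB/s

# Assumed 24 TB node: needs roughly 13.2 MB/s, more than the assumed rate.
big_node_ok = scan_period_is_feasible(24 * 10**12, THREE_WEEKS, SCANNER_RATE)

# Assumed 1 TB node: needs roughly 0.55 MB/s, well within the assumed rate.
small_node_ok = scan_period_is_feasible(1 * 10**12, THREE_WEEKS, SCANNER_RATE)
```

When the required rate exceeds what the scanning thread is allowed to use, the configured period is only a target and some blocks will go longer than that between scans.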

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
tempt* to complete an *entire* node scan, > you mean for example, if a node has 100 block files, it will > try to verify all 100 blocks every 3 weeks? > That is, on average, a block is scanned every (3 weeks / 100 time interval)? > > Thanks > Thanh > > > On Wed, Oct 13, 20

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
. At some point, I'd really like to figure out what percentage of our blocks actually get scanned at our site, I suspect some go very long without a scan. Brian > Thanh > > On Wed, Oct 13, 2010 at 7:18 PM, Brian Bockelman wrote: > >> Hi Thanh, >> >> That is c

Re: DataBlockScanner scan period

2010-11-23 Thread Brian Bockelman
On Nov 23, 2010, at 7:41 PM, Thanh Do wrote: > sorry for digging up this old thread. > > Brian, is this the reason you want to add a "data-level" scan > to HDFS, as in HDFS-221. > > It seems to me that a very rarely read block could > be silently corrupted, because the DataBlockScanner > never

Fwd: problems with fuse-dfs

2011-03-01 Thread Brian Bockelman
Sorry, resending on hdfs-dev; apparently I'm not on -user. Begin forwarded message: > From: Brian Bockelman > Date: March 1, 2011 6:24:28 PM CST > Cc: hdfs-user > Subject: Re: problems with fuse-dfs > > > Side note: Do not cross post to multiple lists. It annoys

Re: problems with fuse-dfs

2011-03-02 Thread Brian Bockelman
me/hadoop/hadoop/hadoop-0.20.2/src/c++/libhdfs/ > /usr/local/hadoop/hadoop-0.20.2/src/c++/libhdfs/ > > Now, I cannot understand why the changes to the libhdfs code are not > reflected. > Are those libraries on the linker's path? > Thanks again for your help. > > Regards

Re: libhdfs not getting compiled

2011-03-17 Thread Brian Bockelman
Hi Aastha, Try using "ldd" against the fuse_dfs executable, and see where you are pulling libhdfs.so from. It may be it is linking from the "wrong one". Brian On Mar 17, 2011, at 3:24 PM, Aastha Mehta wrote: > Hello, > > I am working on a project involving hdfs and fuse-dfs API on top of it.

Re: libhdfs not getting compiled

2011-03-18 Thread Brian Bockelman
so that libhdfs.so.0 > and libjvm.so files above had links to the correct object files. > > Could you please tell what could the problem be now? > > Thanks, > > Aastha. > > On 18 March 2011 03:08, Brian Bockelman wrote: > >> Hi Aastha, >> >> Try usi

Re: Merging Namenode Federation feature (HDFS-1052) to trunk

2011-03-21 Thread Brian Bockelman
On Mar 21, 2011, at 6:08 PM, Sanjay Radia wrote: > > On Mar 14, 2011, at 10:57 AM, Sanjay Radia wrote: > >> >> On Mar 12, 2011, at 8:43 AM, Allen Wittenauer wrote: >> >>> >>> To me, this series of changes looks like it is going to make >>> running a grid much much harder for very little

Re: questions regarding fuse_dfs_read

2011-09-07 Thread Brian Bockelman
Hi Aastha, A read-ahead buffer is a common technique to trade higher bandwidth for lower latency for a number of common read patterns. Your OS does something similar (a much more advanced technique though). By reading ahead, HDFS is betting that your reads have a pattern to them. I think the 1
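The read-ahead idea can be sketched in a few lines. This is a generic illustration, not the fuse-dfs implementation: `ReadAheadReader`, `read_at`, and the `backend_reads` counter are names invented for the example, and the backing store is an in-memory `io.BytesIO` standing in for a remote file.

```python
import io


class ReadAheadReader:
    """Serve small reads from a larger buffer fetched in one backend call."""

    def __init__(self, raw, read_ahead_size):
        self.raw = raw                      # seekable file-like object
        self.read_ahead_size = read_ahead_size
        self.buffer = b""
        self.buffer_offset = 0              # file offset of buffer[0]
        self.backend_reads = 0              # how often the backend was hit

    def read_at(self, offset, length):
        end = offset + length
        buffered_end = self.buffer_offset + len(self.buffer)
        hit = self.buffer_offset <= offset and end <= buffered_end
        if not hit:
            # Miss: fetch more than was asked for, betting that the
            # next reads continue sequentially from here.
            self.raw.seek(offset)
            self.buffer = self.raw.read(max(length, self.read_ahead_size))
            self.buffer_offset = offset
            self.backend_reads += 1
        start = offset - self.buffer_offset
        return self.buffer[start:start + length]


data = bytes(range(256)) * 4                # 1024 bytes of sample data
reader = ReadAheadReader(io.BytesIO(data), read_ahead_size=256)
chunks = [reader.read_at(i * 64, 64) for i in range(8)]
```

Eight sequential 64-byte reads cost two backend calls here instead of eight. The bet loses for random access, where each miss fetches 256 bytes to serve 64.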

Re: corrupted edits log after power failure

2011-09-22 Thread Brian Bockelman
Hi Gabi, I'd be a bit scared of that backup strategy; what happens if the TCP connection gets cut suddenly during curl? What happens if there's a TCP corruption? Such things have happened before. Personally, we have the SNN merge the edits every 15 minutes. If it hasn't happened in 30 minut

[jira] Created: (HDFS-838) libhdfs causes a segfault due to race condition

2009-12-16 Thread Brian Bockelman (JIRA)
Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 Reporter: Brian Bockelman The first libhdfs operation that is performed is not thread-safe; this is because the creation of a JVM is not protected by a mutex. We have been able to trigger this by doing the following: 1) Start a few
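The class of bug described here, one-time initialization that is not protected by a mutex, and the usual remedy can be sketched as follows. This is a Python analogue for illustration only, not the libhdfs fix: `get_vm`, `_create_vm`, and `creation_count` are hypothetical names, and in C the counterpart would be a `pthread_mutex_t` (or `pthread_once`) around the JVM creation.

```python
import threading

_vm_lock = threading.Lock()
_vm = None
creation_count = 0          # counts how often the expensive init ran


def _create_vm():
    """Hypothetical stand-in for the one-time JVM creation."""
    global creation_count
    creation_count += 1
    return object()


def get_vm():
    """Return the shared instance, creating it at most once."""
    global _vm
    # Check and creation happen under one lock, so two threads making
    # their first call at the same time cannot both create an instance.
    with _vm_lock:
        if _vm is None:
            _vm = _create_vm()
        return _vm


results = []
threads = [threading.Thread(target=lambda: results.append(get_vm()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads can both observe `_vm is None` and both run the creation step, which is the race the report describes for the first libhdfs operation.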

[jira] Created: (HDFS-856) Hardcoded replication level for new files in fuse-dfs

2009-12-28 Thread Brian Bockelman (JIRA)
/fuse-dfs Reporter: Brian Bockelman Priority: Minor In fuse-dfs, the number of replicas is always hardcoded to 3 in the arguments to hdfsOpenFile. We should use the setting in the hadoop configuration instead.

[jira] Created: (HDFS-857) Incorrect type for fuse-dfs capacity can cause "df" to return negative values on 32-bit machines

2009-12-28 Thread Brian Bockelman (JIRA)
e/HDFS-857 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Reporter: Brian Bockelman Priority: Minor Attachments: HDFS-857.patch On sufficiently large HDFS installs, the casting of hdfsGetCapacity to a long may cau
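The effect described here can be reproduced without a 32-bit machine by forcing a value through a signed 32-bit integer, which is the width of `long` on typical 32-bit Linux builds. The sketch uses Python's `ctypes` purely as a demonstration; the 3 GB capacity is an arbitrary example value, and `as_signed_32bit` is a helper name made up for the illustration.

```python
import ctypes


def as_signed_32bit(value):
    """Keep only the low 32 bits and reinterpret them as a signed integer."""
    return ctypes.c_int32(value).value


def as_signed_64bit(value):
    """Same reinterpretation with a 64-bit signed type."""
    return ctypes.c_int64(value).value


capacity = 3_000_000_000                 # example: 3 GB, above 2**31 - 1

truncated = as_signed_32bit(capacity)    # wraps around to a negative number
preserved = as_signed_64bit(capacity)    # large enough to hold the value
```

Any capacity above 2**31 - 1 bytes no longer fits in a 32-bit `long`, and whenever bit 31 of the truncated value is set the result is negative, which is what "df" then reports. Holding the value in a 64-bit type avoids the problem.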

[jira] Created: (HDFS-858) Incorrect return codes for fuse-dfs

2009-12-28 Thread Brian Bockelman (JIRA)
Incorrect return codes for fuse-dfs --- Key: HDFS-858 URL: https://issues.apache.org/jira/browse/HDFS-858 Project: Hadoop HDFS Issue Type: Bug Reporter: Brian Bockelman Priority: Minor

[jira] Created: (HDFS-859) fuse-dfs utime behavior causes issues with tar

2009-12-28 Thread Brian Bockelman (JIRA)
Reporter: Brian Bockelman Priority: Minor When trying to untar files onto fuse-dfs, tar will try to set the utime on all the files and directories. However, setting the utime on a directory in libhdfs causes an error. We should silently ignore the failure of setting a utime

[jira] Created: (HDFS-860) fuse-dfs truncate behavior causes issues with scp

2009-12-28 Thread Brian Bockelman (JIRA)
fuse-dfs truncate behavior causes issues with scp - Key: HDFS-860 URL: https://issues.apache.org/jira/browse/HDFS-860 Project: Hadoop HDFS Issue Type: Bug Reporter: Brian Bockelman

[jira] Created: (HDFS-861) fuse-dfs does not support O_RDWR

2009-12-28 Thread Brian Bockelman (JIRA)
fuse-dfs does not support O_RDWR Key: HDFS-861 URL: https://issues.apache.org/jira/browse/HDFS-861 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Reporter: Brian

[jira] Resolved: (HDFS-429) fuse-dfs implement posix truncate functionality

2011-02-22 Thread Brian Bockelman (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Bockelman resolved HDFS-429. -- Resolution: Fixed Looking at the current trunk, this is already implemented. Couldn't f

[jira] Resolved: (HDFS-430) create posix-like (as far as we can) layer for Linux on top of libhdfs

2011-02-22 Thread Brian Bockelman (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Bockelman resolved HDFS-430. -- Resolution: Later This Wish (not really a bug) appears to be abandoned. Closing - can reopen