Matthew Dillon wrote:
:Went from 10->15, and it took quite a bit longer into the backup before
:the problem cropped back up.

Jumping right into it: there is another post after this one, but I'm going to try to reply inline here:

    Try 30 or longer.  See if you can make the problem go away entirely.
    Then fall back to 5 and see if the problem resumes at its earlier
    pace.

I'm sure 30 will either push the issue further out or make it go away entirely, but can any of the developers here say what this timer actually does? i.e., how does changing it affect the performance of the disk subsystem (aside from allowing it to work at all, of course)?

Once I'm done replying to this message, I'll set the sysctl to 30 and start testing.
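
For anyone who wants to poke at the value from code rather than with the sysctl(8) utility, here is a minimal sketch using sysctlbyname(3); the OID string in it is just a placeholder, not the actual knob we're discussing:

/*
 * Minimal sketch: read (and optionally set) an integer sysctl by name
 * on FreeBSD via sysctlbyname(3).  The OID string below is a
 * placeholder, not the actual knob being tuned in this thread.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
    const char *oid = "kern.example.timeout";   /* placeholder OID */
    int value = 0;
    size_t len = sizeof(value);

    if (sysctlbyname(oid, &value, &len, NULL, 0) == -1) {
        perror("sysctlbyname (read)");
        return 1;
    }
    printf("%s = %d\n", oid, value);

    /*
     * To change it (needs root), pass the new value instead:
     *
     *     int newval = 30;
     *     sysctlbyname(oid, NULL, NULL, &newval, sizeof(newval));
     */
    return 0;
}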

    It could be temperature related.  The drives are being exercised
    a lot, they could very well be overheating.  To find out add more
    airflow (a big house fan would do the trick).

Temperature is a good thought, but here is my current physical setup:

- 2U chassis
- multiple fans in the case
- in my lab (which is essentially beside my desk)
- the case has no lid
- it is 64 degrees F in this area, with A/C and circulating fans
- hard drives are separated relatively well inside the case
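
That said, rather than guessing from airflow, I can read the drive temperatures directly. A rough sketch of doing that by shelling out to smartctl (assumes smartmontools is installed; /dev/ada0 is a placeholder device name, adjust for the actual drives):

/*
 * Hedged sketch: dump the SMART attribute table (which includes the
 * drive temperature) by shelling out to smartctl from smartmontools.
 * Assumes smartmontools is installed; /dev/ada0 is a placeholder.
 */
#include <stdio.h>

int
main(void)
{
    FILE *p = popen("smartctl -A /dev/ada0", "r");
    char line[512];

    if (p == NULL) {
        perror("popen");
        return 1;
    }
    while (fgets(line, sizeof(line), p) != NULL)
        fputs(line, stdout);
    return pclose(p) == -1 ? 1 : 0;
}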

    It could be that errors are accumulating on the drives, but it seems
    unlikely that four drives would exhibit the same problem.

That's what I'm thinking. All four drives are exhibiting the same errors... or, for all intents and purposes, the machine is coughing up the same errors for all four drives.

    Also make sure the power supply can handle four drives.  Most power
    supplies that come with consumer boxes can't under full load if you
    also have a mid or high-end graphics card installed.  Power supplies
    that come with OEM slap-together enclosures are not usually much better.

I currently have a 550W PSU in the 2U chassis, which, again, is sitting open. I have other hardware running in worse conditions on lower-wattage PSUs that doesn't exhibit this behavior. I need to determine whether this problem is SATA, ZFS, the motherboard, or code.

    Specifically, look at the +5V and +12V amperage maximums on the power
    supply, then check the disk labels to see what they draw, then
    multiply by 2.  e.g. if your power supply can do 16A or more @ +12V
    and you have four drives each taking ~2A @ +12V (and typically ~half
    that at 5V), that's 4x2x2 = 16A @ +12V and you would probably be ok.

I'm well within spec, even after checking volts and amps with a meter. The power supply is delivering ample power to each device.
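
For anyone following the math, here is a rough sketch of the budget check Matthew describes; the drive and PSU figures in it are placeholders, not the actual numbers off my labels:

/*
 * Rough sketch of the +12V budget check Matthew describes:
 * per-drive draw (from the label) x number of drives x 2, compared
 * against the PSU's +12V rail rating.  All figures here are
 * placeholders, not the real numbers from my drives or PSU.
 */
#include <stdio.h>

int
main(void)
{
    const double drive_amps_12v = 2.0;   /* assumed ~2A @ +12V per drive */
    const int    ndrives        = 4;
    const double safety_factor  = 2.0;   /* spin-up / worst-case margin  */
    const double psu_amps_12v   = 20.0;  /* placeholder +12V rail rating */

    double needed = drive_amps_12v * ndrives * safety_factor;

    printf("needed %.1fA @ +12V, rail rated %.1fA -> %s\n",
        needed, psu_amps_12v,
        needed <= psu_amps_12v ? "probably ok" : "marginal");
    return 0;
}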

    To test, remove two of the four drives, reformat the ZFS to use just 2,
    and see if the problem reoccurs with just two drives.

... I knew that was going to come up... my response is "I worked so hard to get this system with ZFS all configured *exactly* how I wanted it".

To test, I'm going to flip it to 30 per Matthew's recommendation and see how far that takes me. At this time, I'm only testing by backing up one machine on the network. If it fails, I'll clock the time and then 'reformat' with two drives.

Is there a technical reason this may work better with only two drives?

Is anyone interested enough that remote login access would be helpful?

Steve
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
