from:"Zhihui Zhang"

Anything special with kmem_map and mb_map?

1999-07-19 Thread Zhihui Zhang


I have been wondering about this for some time.  There are many kernel
submaps: exec_map, clean_map, etc.  But if you look at the code in
vm_map_find(), we have to call splvm() for kmem_map and its submap mb_map,
but not for other kernel submaps.  So is there anything special about
these two kernel submaps?

Thanks for any help.

-Zhihui




To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message

understanding code related to forced COW for debugger

1999-07-20 Thread Zhihui Zhang



I have tried to understand the following code in vm_map_lookup() without
much success:

    if (fault_type & VM_PROT_OVERRIDE_WRITE)
        prot = entry->max_protection;
    else
        prot = entry->protection;

    if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
        (entry->eflags & MAP_ENTRY_COW) &&
        (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0) {
        RETURN(KERN_PROTECTION_FAILURE);
    }

At first, it seemed to me that if you want to write to a COW page, you must
have OVERRIDE_WRITE set.  But later I found that when wired_count is
non-zero, we are actually simulating a page fault, not handling a real one.
Anyway, I do not know how the above code (1) prevents a debugger from
writing to binary code, or (2) forces a COW when a debugger writes other
data.

I also have some questions on wiring a page:

(1)  According to the mlock(2) man page, a wired page can still cause
protection-violation faults.  But in the same vm_map_lookup(), we have
the following code:

    if (*wired)
        prot = fault_type = entry->protection;

and the comment says "get it for all possible accesses".  As I understand
it, we wire a page by simulating a page fault (no matter whether it is the
kernel or a user who is wiring the page).

(2)  Can the kernel wire a page of a user process without that user's
request (by calling mlock)?

Any help is appreciated.
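
The protection checks quoted in this message can be restated as a small
user-space model.  This is only a sketch for reasoning about the order of
the tests, not the kernel code: the SK_* flag values, struct sk_entry and
sk_lookup() are hypothetical names, and the generic "requested access must
be within prot" test is an assumed step that the excerpts above do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch; not the kernel's definitions. */
enum {
    SK_PROT_READ           = 0x1,
    SK_PROT_WRITE          = 0x2,
    SK_PROT_EXECUTE        = 0x4,
    SK_PROT_OVERRIDE_WRITE = 0x8
};

/* Hypothetical stand-in for the map entry fields used in the excerpts. */
struct sk_entry {
    int  protection;      /* current protection */
    int  max_protection;  /* ceiling that protection may be raised to */
    int  wired_count;     /* non-zero while the entry is wired */
    bool cow;             /* models the MAP_ENTRY_COW flag */
};

/* Returns 0 and stores the effective protection, or -1 for a
 * protection failure. */
static int sk_lookup(const struct sk_entry *e, int fault_typea, int *prot_out)
{
    int fault_type = fault_typea &
        (SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE);
    int prot;

    assert(e != NULL && prot_out != NULL);

    /* With the override flag, judge the access against max_protection. */
    prot = (fault_typea & SK_PROT_OVERRIDE_WRITE)
        ? e->max_protection : e->protection;

    /* Assumed step: the requested access must be allowed by prot. */
    if ((fault_type & prot) != fault_type)
        return -1;

    /* Quoted test: a write to a wired COW entry fails without override. */
    if (e->wired_count && (fault_type & SK_PROT_WRITE) && e->cow &&
        (fault_typea & SK_PROT_OVERRIDE_WRITE) == 0)
        return -1;

    /* Quoted test: a wired entry falls back to its normal protection. */
    if (e->wired_count != 0)
        prot = e->protection;

    *prot_out = prot;
    return 0;
}
```

In this model, a write to a read/execute entry fails unless the override
flag is set and max_protection includes write access.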




Re: understanding code related to forced COW for debugger

1999-07-21 Thread Zhihui Zhang

On Wed, 21 Jul 1999, Matthew Dillon wrote:

>
> The VM_PROT_OVERRIDE_WRITE flag is only used for user-wired pages, so
> it does not affect 'normal' page handling.   Look carefully at the
> vm_fault() code (vm/vm_fault.c line 212), that lookup only occurs
> with VM_PROT_OVERRIDE_WRITE set if the normal lookup fails and the
> user has wired the page.
>
> So if a normal lookup fails and this is a user-wired page, we try
> the lookup again with VM_PROT_OVERRIDE_WRITE, presumably to handle
> a faked copy-on-write fault for the debugger.  This results in the
> following:
>
> First, we temporarily increase the protections to make the page *appear*
> writeable.  Note: only 'appear' writeable, not actually be writeable.
>
>     if (fault_type & VM_PROT_OVERRIDE_WRITE)
>         prot = entry->max_protection;
>     else
>         prot = entry->protection;

To allow a debugger to write to the TEXT area of a program, the
max_protection field must be set to include VM_PROT_WRITE by the loader.
Am I right?

>     *wired = (entry->wired_count != 0);
>     if (*wired)
>         prot = fault_type = entry->protection;
>
> I'm pretty sure this piece is simply reverting the mess that the
> copy-on-write stuff does for the debugger.  entry->protection is what
> we normally want to use.

Both vm_fault_wire() and vm_fault_user_wire() have a non-zero wired_count
in the related map entry before calling vm_fault().  This is arranged by
their callers, vm_map_pageable() and vm_map_user_pageable().  Since you are
talking about the user wiring case, I take it that for the kernel wiring
case the above code should prevent any further fault on the page after this
simulated one.  Therefore, a kernel-wired page will never cause
protection-violation faults, while a user-wired page can, as the mlock(2)
man page says.  Since mlock(2) is used by the user, this makes sense to me.

Thanks for your response.

-Zhihui


Questions on new-bus source code

1999-08-06 Thread Zhihui Zhang


In the FreeBSD new-bus architecture, all devices are linked into a device
tree. The root of the tree is root_bus, which has a child called nexus0 added
during the device configuration phase.  I have two questions about this
new-bus code:

(1) What is the purpose of this "nexus0" device?  Its parent (root_bus) does
not declare the probe method, so probing nexus0 can only return ENXIO for
us (from error_method()). 

(2) I guess that the probe process of all devices on the tree is triggered
by root_bus_configure() in subr_bus.c.  It is done from top to bottom,
i.e. the probe process should be propagated down the device tree from
root_bus. Am I right? How does this tree structure achieve the dynamic
feature of device configuring (adding/removing devices on the fly)? 

Having a big picture often helps to understand the details more readily.

Any help is appreciated.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Configuration mechanism of PCI bus

1999-08-09 Thread Zhihui Zhang


Even with "PCI System Architecture, 4th edition" at hand, I still have
some problems understanding the code in isa/pcibus.c.  Please point out
any misunderstanding I may have in the following:

(1) First, you can not modify the address port at 0xcf8 without a FULL
32-bit write.  The routine pci_cfgopen() seems to rely on this fact.

(2) The constant CONF1_ENABLE_MSK includes the 4 higher bus number bits, so
only 4 bits can be used as the bus number and we can have at most 16 PCI
buses. 

(3) The variable "mode1res" seems to refer to any residue left by the BIOS
in the address port.  If it is non-zero, we will try to find a device using
configuration mechanism 1. 

(4) The magic constant 0xf870ff excludes many devices.  How is it chosen? 
I guess those excluded devices are not important or not supported by
FreeBSD.  It seems to me that if pci_cfgcheck() finds at least one device,
then the configuration mechanism is regarded as correctly detected.
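
For reference when reading points (1) and (2), here is a minimal sketch of
how a configuration mechanism 1 address word is usually composed before it
is written to port 0xcf8 as one 32-bit value.  The layout follows the
generic PCI convention (bit 31 enable, bits 23-16 bus, bits 15-11 device,
bits 10-8 function, bits 7-2 register); how isa/pcibus.c treats bits 30-24
may differ.  The names conf1_address() and SK_CONF1_ENABLE are made up for
this illustration, and no port I/O is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of the address word enables configuration cycles. */
#define SK_CONF1_ENABLE 0x80000000u

/* Compose the 32-bit word for configuration mechanism 1. */
static uint32_t conf1_address(unsigned bus, unsigned slot,
                              unsigned func, unsigned reg)
{
    assert(bus < 256 && slot < 32 && func < 8 && reg < 256);
    return SK_CONF1_ENABLE
        | ((uint32_t)bus  << 16)   /* bits 23-16: bus number */
        | ((uint32_t)slot << 11)   /* bits 15-11: device (slot) number */
        | ((uint32_t)func << 8)    /* bits 10-8:  function number */
        | ((uint32_t)reg & ~3u);   /* bits 7-2:   dword-aligned register */
}
```

The whole word has to go out in a single 32-bit write, which matches the
observation in point (1).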

Any help is appreciated.

----------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Create a dump image of kernel

1999-08-13 Thread Zhihui Zhang


Can anyone tell me how to modify the config file to build a kernel that
creates a dump image whenever it panics?  Currently I have to use the dumpon
command after system bootup.  But this command does not help when the
panic happens during bootup, i.e., when you have no chance to
issue the dumpon command. Thanks.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Re: Create a dump image of kernel

1999-08-13 Thread Zhihui Zhang

On Fri, 13 Aug 1999, Andrzej Bialecki wrote:

> On Fri, 13 Aug 1999, Zhihui Zhang wrote:
> 
> > 
> > Can anyone tell me how to modify the config file to build a kernel that
> > creates dump image whenever it panics. Currently I have to use dumpon
> > command after system bootup.  But this command does not work when the
> > panic happens during the bootup time, i.e., when you have no chance to
> > issue the dumpon command. Thanks.
> 
> This is a common problem recently, it seems.. See my recent postings to
> this group (or was it -current?).
> 
> Andrzej Bialecki

It is on the -current list.  The subject is "is dumpon/savecore broken?".  I
read your postings there. It seems we can use remote GDB to debug a kernel
that panics even before it probes the devices.  I hope it is easy to learn
how to use it from the handbook.  Thanks. 

-Zhihui


Need help with kernel trace

1999-08-14 Thread Zhihui Zhang


I think it helps to understand a routine in the kernel code if I know how
the routine is called and what parameters are being passed to it. To get
such information, I decided to simulate a panic whenever that routine is
called.  For example, I want to know how link() in vfs_syscalls.c is called
and what parameters are being passed to it.  I added a sysctl variable named
"debug.link_panic" and, at the very beginning of link(), I added the
following statement:

  if (link_panic) panic("link() is called");

As expected, the system panics whenever I set debug.link_panic to 1 and
issue an ln command at the prompt.

Now the problem is how to use the coredump to get the information I am
interested in. The following script records what I tried: 

now5# cd /usr/crash
now5# gdb -k -s /usr/crash/kernel.gdb kernel.4 vmcore.4
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
IdlePTD 3588096

kernel symbol `gd_curpcb' not found.
(kgdb) where
No stack.

I expected to have a stack so that I could trace how link() is called
step by step. But it seems that I can not do so. 

The kernel was configured with "config -g" and installed with "make install"
after doing "strip -g kernel".  The file kernel.gdb was copied from the
directory /usr/src/sys/compile/DDB to /usr/crash before the kernel was
stripped.  /var/crash is too small, so I modified the file /etc/rc so that
savecore saves core dumps under /usr/crash.  The system is running
FreeBSD 3.2-RELEASE.

Any help is appreciated.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




kernel symbol `gd_curpcb' not found

1999-08-16 Thread Zhihui Zhang


I have tried to debug a kernel by simulating a panic, without success. I
have read the handbook and searched the mailing list.  I even tried not
stripping the debug kernel at all. Still I get the above message and I do
not know how to go on.  The following are the commands that I used: 

now5# gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
..
This GDB was configured as "i386-unknown-freebsd".
(kgdb) symbol-file /kernel
Reading symbols from /kernel...done.
(kgdb) exec-file kernel.6
(kgdb) core-file vmcore.6
IdlePTD 3600384
kernel symbol `gd_curpcb' not found.
(kgdb) where
No stack.

Thanks for any help.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Kernel debugging questions

1999-08-19 Thread Zhihui Zhang


I am using FreeBSD 4.0 and have two questions on kernel debugging:

(1) Can I specify /usr/src/sys/compile/MYKERN/kernel.debug as the kernel
to boot from manually without copying that file under /?  It seems I can
not do so.  I guess the reason is that the /usr is not mounted at that
time.

(2) After bootup, I try the following to debug the live system (after
reading some pages of the book "Panic! Unix system crash dump analysis"):

now4# gdb -k /kernel.debug /dev/mem
(kgdb) run
Starting program: /kernel.debug 

Program terminated with signal SIGABRT, Aborted.
The program no longer exists.
You can't do that without a process to debug.

Is there something wrong?  I did the same thing with the postmortem
coredump files and got similar messages.  Maybe I am using gdb in a wrong
way. 

Any help is appreciated.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Re: Kernel debugging questions

1999-08-19 Thread Zhihui Zhang

On Fri, 20 Aug 1999, Greg Lehey wrote:

> You can't control the execution of the kernel, you can just look at
> the way things are.  With the core dump, you at least have the
> advantage that things won't change while you look at them; you can't
> even do that with /dev/mem.  The other alternative is remote serial
> debugging, where you *can* influence the execution of the kernel, for
> example by setting breakpoints.  But remember that the kernel is
> already running when you attach to it, so you don't say 'run', you say
> 'c[ontinue]'.

Thanks for your response.  I could not have thought of those points myself. 
However, on page 7 of the book "Panic! Unix system crash dump analysis",
it says that a debugger named kadb in SunOS can load the real kernel
during boot and treat the latter like a great, big, user program, stepping
through its execution, examining and modifying values on the fly. 

It seems to me that FreeBSD does not have such a debugger. Maybe ddb can
do so, but it works with assembly. 

-Zhihui


Serial cable

1999-08-20 Thread Zhihui Zhang


Hi, Rich:

Can you find a serial cable for me?  I need to connect two PCs together
via RS232 ports.  


Thanks.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Questions for vnconfig

1999-08-21 Thread Zhihui Zhang


I have successfully used vnconfig to add a swap file and to mount disk image
files. However, I am still not sure about the following two things:

(1) What does the count in "pseudo-device vn count" stand for?  My guess
is that if it is 2, then we can use /dev/vn0x and /dev/vn1x. If it is 1,
then we can only use /dev/vn0x. The x stands for one of those eight
partitions [a-h] in one slice. 

(2) For /dev/vn0[a-h], which one from a-h should I use for which purpose? 

Any help is appreciated.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




What does unp stand for?

1999-08-22 Thread Zhihui Zhang


In file uipc_usrreq.c, there are many routines beginning with unp_. For
example, unp_connect(), unp_bind(), etc. What does unp stand for?  

Thanks. 

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




FreeBSD FIFO implementation

1999-08-23 Thread Zhihui Zhang


While looking at the FIFO implementation, I understand that a FIFO is
implemented as a socket.  But I am not sure where the data in a FIFO
is stored (mbuf or filesystem buf structure?) and how it manages the
read/write pointers.  Can anyone give me a general picture of this?

Any help is appreciated.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Help with remote debugging (gdb -k)

1999-08-30 Thread Zhihui Zhang


After reading the handbook and some postings in the mailing list archive,
I still can not make remote debugging work.  I basically did the following
on FreeBSD-current 4.0 (A is the debugging machine, B is the target): 

(1) Build a debug kernel (options DDB and BREAK_TO_DEBUGGER) on box A.
The sio flag I used is 0x90 (I also tried 0x80).  Ftp the file /kernel to
box B and rename it to /kernel.A.

(2) Boot the kernel /kernel.A on box B with -d option:

>>FreeBSD/i386 boot
Default: 0:wd(0,a)/boot/loader
boot: /kernel.A -d
Debugger("Boot flags requested debugger")
Stopped at 0xc0252c27: movl $0, 0xc031ed98
db> gdb
Next trap will enter GDB remote protocol mode
db> s

(3) On machine A, go to the compile directory:

#gdb -g kernel.debug

(kgdb) target remote /dev/cuaa0

 Remote debugging using /dev/cuaa0
 Ignoring packet error, continuing...
 Ignoring packet error, continuing...
 Couldn't establish connection to remote target
 Malformed response to offset query, timeout

The serial cable is null-modem and has been tested with kermit. It is
connected to /dev/ttyd0 (com 1) of machine B and com 2 of machine A.

I did not do "strip -x" because I assume this is done by FreeBSD 4.0
automatically and the file kernel.debug is the one with symbols.

Any help is appreciated.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--





Re: Help with remote debugging (gdb -k)

1999-08-30 Thread Zhihui Zhang


On Mon, 30 Aug 1999, Zhihui Zhang wrote:

> 
> After reading the handbook and some postings in the mailing list archive. 
> I still can not make remote debugging work.  I basically did the following
> on FreeBSD-current 4.0 (A is debugging machine, B is the target): 
> 
> (1) Build a debug kernel (options DDB and BREAK_TO_DEBUGGER) on box A.
> The sio flag I used is 0x90 (I also tried 0x80).  Ftp the file /kernel to
> box B and renamed as /kernel.A
> 
> (2) Boot the kernel /kernel.A on box B with -d option:
> 
> >>FreeBSD/i386 boot
> Default: 0:wd(0,a)/boot/loader
> boot: /kernel.A -d
> Debugger("Boot flags requested debugger")
> Stopped at 0xc0252c27: movl $0, 0xc031ed98
> db> gdb
> Next trap will enter GDB remote protocol mode
> db> s
> 
> (3) On machine A, go to the compile directory:
> 
> #gdb -g kernel.debug
> 
> (kgdb) target remote /dev/cuaa0
> 
>  Remote debugging using /dev/cuaa0
>  Ignoring packet error, continuing...
>  Ignoring packet error, continuing...
>  Couldn't establish connection to remote target
>  Malformed response to offset query, timeout
> 
> The serial cable is null-modem and has been tested with kermit. It is
> connected to /dev/ttyd0 (com 1) of machine B and com 2 of machine A.
> 
> I did not do "strip -x" because I assume this is done by FreeBSD 4.0
> automatically and the file debug.kernel is the one with symbols.
> 
> Any help is appreciated.
> 

I have just found the reason.  I should specify the local serial port of
the debugging machine.  So I should use: 

(kgdb) target remote /dev/cuaa1  <-- do not use /dev/cuaa0

Now everything works fine.

-Zhihui




Re: Help with remote debugging (gdb -k)

1999-08-30 Thread Zhihui Zhang

> 
> On Mon, 30 Aug 1999, Zhihui Zhang wrote:
> 
> > (3) On machine A, go to the compile directory:
> > 
> > #gdb -g kernel.debug
> 
> -g?
> 
This is a typo.  It should be "gdb -k kernel.debug".  I have just posted
another message pointing out my mistakes.  Thanks for your response.

-Zhihui




Re: Problems with FIFO open in non-blocking mode?

1999-09-06 Thread Zhihui Zhang


On Mon, 6 Sep 1999, Alex Povolotsky wrote:

> Hello!
> 
> The following program
> 
> #include <stdio.h>
> #include <fcntl.h>
> 
> main() {
>   int control;
>   if ((control = open("STATUS",O_WRONLY|O_NONBLOCK))<0) {
>   perror("Could not open STATUS ");
>   exit(1);
>   }
>   printf("STATUS ready\n");
>   close(control);
>   return(0);
> }
> 
> fails to run (STATUS is pre-created FIFO file) with error "Device not
> configured", which seems kinda odd for me.
> 
> However, when FIFO is opened with O_RDWR and O_NONBLOCK, every attempt 
> to select(2) its handler for writing doesn't wait until someone opens
> FIFO for reading, but instead FIFO is ready to write at every select.
> 
> Is it a bug or a feature?
> 

I answered a similar question some time ago.  You can search the mailing
list archive for it.  Basically, you need to read "Advanced Programming in
the UNIX Environment" by Stevens.  I can not remember every detail
right now.  The "device not configured" error is expected. 
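
"Device not configured" is the message for ENXIO, which a write-only,
non-blocking open of a FIFO returns when no process has the FIFO open for
reading.  The following is a minimal sketch of that behaviour; the scratch
path under /tmp and the helper names open_fifo_writer() and
fifo_open_demo() are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a FIFO for writing without blocking.
 * Returns a descriptor, or -1 with errno set. */
static int open_fifo_writer(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_NONBLOCK);
}

/* Create a scratch FIFO, try the write-only open with no reader, then
 * try again once a reader exists.  Stores the errno of the first attempt
 * in *first_errno; returns 0 if the second attempt succeeded. */
static int fifo_open_demo(int *first_errno)
{
    char path[64];
    int rfd, wfd, result = -1;

    assert(first_errno != NULL);
    snprintf(path, sizeof(path), "/tmp/fifo-sketch-%ld", (long)getpid());
    unlink(path);
    if (mkfifo(path, 0600) != 0)
        return -1;

    *first_errno = 0;
    wfd = open_fifo_writer(path);             /* nobody is reading yet */
    if (wfd < 0)
        *first_errno = errno;
    else
        close(wfd);

    rfd = open(path, O_RDONLY | O_NONBLOCK);  /* returns immediately */
    if (rfd >= 0) {
        wfd = open_fifo_writer(path);         /* a reader exists now */
        if (wfd >= 0) {
            result = 0;
            close(wfd);
        }
        close(rfd);
    }
    unlink(path);
    return result;
}
```

So the writer either has to wait for a reader to appear or retry the open
later; the error by itself does not indicate a broken FIFO.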

-Zhihui




The usage of MNT_RELOAD

1999-09-08 Thread Zhihui Zhang


The flag MNT_RELOAD is not documented in the mount man pages.  From the
source code, I find that it is always used along with MNT_UPDATE, which can
be specified by the user (-u option).  Can anyone explain the usage of
MNT_RELOAD for me?  It does not seem to be used normally.

Any help is appreciated.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Re: The usage of MNT_RELOAD

1999-09-08 Thread Zhihui Zhang

On Wed, 8 Sep 1999, Luoqi Chen wrote:

> > The flag MNT_RELOAD is not documented in mount manpages.  From the source
> > code, I find that it is always used along with MNT_UPDATE which can be
> > speficied by user (-u option).  Can anyone explain the usage of MNT_RELOAD
> > for me?  It seems not to be used normally.
> > 
> It is created almost exclusively for fsck (and similar programs) to update
> the in core image of the superblock (of / in single user mode) after the
> on disk version has been modified.
> 

Does fsck have to run on a MOUNTED filesystem?  If so, your answer makes
sense to me: if fsck modifies the on-disk copy of the superblock, it does
not have to unmount and then remount the filesystem; it only needs to
reload the superblock from disk. 

-Zhihui 


Using gdb with fork()

1999-09-08 Thread Zhihui Zhang


I am using gdb 4.18 on FreeBSD-current.  The program being debugged
consists of two small files: test1.c and test2.c.  The main() in test1.c
has a call to fork() and for the child process case, it will call a
routine, say test(), in test2.c. 

I used "set follow-fork-mode child", "break fork", and "step" to try to
reach the source in test2.c, without success.  The program is compiled
with "cc -g test1.c test2.c" and I run gdb with "gdb a.out".

If there is no fork(), a call from test1.c to a routine in test2.c will
bring up the source of test2.c if I step into that routine.  Why does it
not work with fork()?  Am I missing something?

Thanks for any help.
 
------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




How to follow child process in gdb

1999-09-08 Thread Zhihui Zhang

On Wed, 8 Sep 1999, Kip Macy wrote:

> You need to detach from your current process and attach to the spawned
> process. It might make it easier to attach in a timely fashion if you put
> a 3 second sleep in right after the fork. This would all be easiest using
> something like DDD where DDD will tell you what other processes are
> running with the same name, and allow you to attach to them through the
> GUI.

In dbx on a Sun workstation, all I need to do to follow a child process
after fork() is to use the following command in advance:

(dbx)dbxenv follow_fork_mode child

Your response suggests that I can not achieve the same result simply by
using (I am using gdb 4.18):

(gdb)set follow-fork-mode child

I have to use attach and detach to do so.  Does that mean I have to
display the pid of the new process in order to follow it?  And I would have
to modify the child process so that it waits until I can attach to it.
That will not be as easy.
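
The attach-based workaround discussed here can be kept fairly small.  In
the sketch below the child would call a helper right after fork(); the
helper prints its pid and spins until a flag is set from the debugger.  The
names debugger_attached and wait_for_debugger() are invented for this
example, and the timeout is only a safety net so the child does not hang
forever if nobody attaches.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Set to non-zero from the debugger once it has attached. */
static volatile int debugger_attached = 0;

/* Announce our pid, then wait until the flag is set or max_seconds
 * have passed.  Returns the number of seconds spent waiting. */
static int wait_for_debugger(int max_seconds)
{
    int waited = 0;

    assert(max_seconds >= 0);
    fprintf(stderr, "pid %ld is waiting for a debugger\n", (long)getpid());
    while (!debugger_attached && waited < max_seconds) {
        sleep(1);
        waited++;
    }
    return waited;
}
```

A child process would call wait_for_debugger(60) as its first statement;
after attaching to the printed pid, setting debugger_attached to 1 from the
debugger lets the child continue under its control.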

-Zhihui


> 
>   
> 
> On Wed, 8 Sep 1999, Zhihui Zhang wrote:
> 
> > 
> > I am using gdb 4.18 on FreeBSD-current.  The program being debugged
> > consists of two small files: test1.c and test2.c.  The main() in test1.c
> > has a call to fork() and for the child process case, it will call a
> > routine, say test(), in test2.c. 
> > 
> > I use "set follow-fork-mode child", "break fork", "step" command trying to
> > access the source in test2.c without success.  The program is compiled
> > with "cc -g test1.c test2.c" and I run gdb with "gdb a.out".
> > 
> > If there is no fork(), a call from test1.c to a routine in test2.c will
> > bring up the source of test2.c if I step that routine.  Why it does not
> > work with fork()?  Am I missing something?
> > 




Let a daemon process print a message

1999-09-13 Thread Zhihui Zhang


Can anyone tell me how to let a daemon process print a message to the
console?  Adding printf() does not work (I wonder if a daemon process
has been cut off from stdout).  Thanks for any help.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Re: Let a daemon process print a message

1999-09-13 Thread Zhihui Zhang

On Mon, 13 Sep 1999, Brian Mitchell (ISSATL) wrote:

> syslog() with the proper facility is probably the best way to do this.
> Another possibility is opening /dev/console, but I think that will aquire
> a controlling terminal.
> 
> On Mon, 13 Sep 1999, Zhihui Zhang wrote:
> 
> > 
> > Can anyone tell me how to let a daemon process print a message to the
> > console?  Adding printf() does not work (I wonder if a daemon process
> > has been cut of relationship with stdout).  Thanks for any help.
> > 

I have tested syslog().  I found that:  (1) The log messages go into
/var/log/messages and appear on the console only after I log in (as
root).  (2) The LOG_INFO priority does not cause the messages to appear on
the console or to be written to the file /var/log/messages. 

Can anyone explain the reason for me?  Thanks a lot.
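
One plausible factor is that syslog priorities are ordered and LOG_INFO is
less severe than LOG_NOTICE or LOG_ERR, so a rule in syslogd's
configuration that only admits the more severe levels would drop it.  Where
each priority ends up is decided by that configuration, not by the calling
program.  The sketch below shows a daemon logging at two priorities and
uses the standard LOG_MASK/LOG_UPTO macros to illustrate the ordering; the
helper names passes_mask() and daemon_report() are made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <syslog.h>

/* Non-zero if a message at `priority` passes a mask that admits
 * everything up to and including `threshold`. */
static int passes_mask(int priority, int threshold)
{
    return (LOG_MASK(priority) & LOG_UPTO(threshold)) != 0;
}

/* Log one message at two priorities under the daemon facility.
 * LOG_CONS asks for a console fallback if the message can not be
 * handed to the log daemon. */
static void daemon_report(const char *ident, const char *msg)
{
    assert(ident != NULL && msg != NULL);
    openlog(ident, LOG_PID | LOG_CONS, LOG_DAEMON);
    syslog(LOG_ERR, "%s", msg);    /* more severe */
    syslog(LOG_INFO, "%s", msg);   /* less severe, may be filtered out */
    closelog();
}
```

Comparing what a test program logs at LOG_ERR and at LOG_INFO against the
rules in /etc/syslog.conf is a quick way to check this.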

-Zhihui


NFS authentication

1999-09-13 Thread Zhihui Zhang


I am wondering where the NFS authentication is done in FreeBSD. Is it done
by the NFS daemon mountd (or other daemon) or within the kernel?  Can
anyone give me a pointer?  Thanks a lot. 

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Multiple routes to the same destination

1999-09-17 Thread Zhihui Zhang


As stated in the 4.4 BSD book (page 423), 4.4 BSD does not support multiple
routes to the same destination (identical key and mask). Does the radix
tree code in FreeBSD 4.0 have the same limitation?  I am wondering if
there is already a solution for this. 

Any help is appreciated.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




Metablock caching & negative block #

1999-05-12 Thread Zhihui Zhang


It seems to me that metablocks such as the filesystem superblock and cylinder
group control blocks are associated with the device vnode.  The indirect
blocks are associated with the file that uses them to find data blocks.  These
indirect blocks are identified by negative block numbers.  This limits the
max file size to 2^31 * 2^9, because we need one bit in the block
number to cope with negative block numbers.  When I first understood
this I thought it was cool, because it allows buffering both kinds of data in
the same way while still letting us tell them apart.

Now my question is why we must associate these (double, triple) indirect
blocks with the file using them.  If these indirect blocks could be handled
like other metablocks (superblocks, cylinder group control blocks), we could
save one bit and make the max file size 2^32 * 2^9. 
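
To make the two figures concrete, the arithmetic can be written out as
below.  The sketch simply takes the message's own numbers (31 or 32 usable
block-number bits and 2^9-byte units) at face value; max_file_bytes() is a
name invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with `block_bits` bits of block number when each
 * block number names a unit of 2^unit_shift bytes. */
static uint64_t max_file_bytes(unsigned block_bits, unsigned unit_shift)
{
    assert(block_bits + unit_shift < 64);
    return (uint64_t)1 << (block_bits + unit_shift);
}
```

With these numbers, 2^31 * 2^9 is 2^40 bytes (1 TB) and 2^32 * 2^9 is 2^41
bytes (2 TB), so the extra bit would double the limit.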

By the way, all other metablocks seem to be delay-written. In other words,
they are not written synchronously.  What happens if the system crashes
before their updates go to disk?  I read on the mailing list that FreeBSD
metadata I/O is conservative.  Can anyone describe this a little bit for
me?

Any help is appreciated.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




VOP_LEASE(...) or (void)VOP_LEASE(...)?

1999-05-13 Thread Zhihui Zhang


VOP_LEASE(...) always returns 0 so there is no actual need to check its
return value. But it still has a return value.  So should we use
(void)VOP_LEASE(...) instead of just VOP_LEASE(...)? 

BTW, I guess that the practice of modifying
default_vnodeop_p[VOFFSET(vop_lease)] in nfs_init() is a hack. Why don't
we use

   { &vop_lease_desc,  (vop_t *) nqnfs_vop_lease_check }, 

instead of 

   { &vop_lease_desc,  (vop_t *) vop_null },

in nfsv2_vnodeop_entries[] in file nfs_vnops.c?

Thanks for any help.

------
Zhihui Zhang.  Please visit http://www.freebsd.org
--




cylinder group and special device

1999-05-17 Thread Zhihui Zhang


Can anyone answer the following two questions for me:

(1) Does a cylinder group in FFS have to begin at a cylinder boundary?

(2) If we read a block via a special device name (/dev/xxx), will the
block be buffered as normal file data and used when we need the block
again?

Thanks for any help.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




What does VOP_WHITEOUT() do?

1999-05-20 Thread Zhihui Zhang


Can anyone tell me what VOP_WHITEOUT() does?  I can not find it in the
hypertext manual pages. 

Thanks.

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




open a file for read and write

1999-05-21 Thread Zhihui Zhang


If I want to read and write a file, I can do it in two ways:

(1) Open the file as read and write, using one file descriptor.
(2) Open the file as read only and open it again as write only, using a
total of two file descriptors.

Method (2) is clearer in its logic and uses a few more resources (file
descriptors).  Other than that, are there any performance reasons for
doing so?  Method (2) is used in the source file mkfs.c when we open a
special device file to create a file system.
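
For comparison, here is a small sketch of method (2) on an ordinary file.
It says nothing about performance; it only shows the one behavioural
difference that is certain: two open() calls yield two descriptors with
independent file offsets and access modes.  The helper name
write_then_read_back() is an assumption of this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write msg through a write-only descriptor, then read it back through
 * a separate read-only descriptor on the same file.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t write_then_read_back(const char *path, const char *msg,
                                    char *buf, size_t buflen)
{
    int wfd, rfd;
    ssize_t n = -1;
    size_t len;

    assert(path != NULL && msg != NULL && buf != NULL && buflen > 0);
    len = strlen(msg);

    wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return -1;
    rfd = open(path, O_RDONLY);
    if (rfd >= 0) {
        if (write(wfd, msg, len) == (ssize_t)len) {
            /* The write moved only wfd's offset; rfd still reads
             * from offset 0 without any lseek(). */
            n = read(rfd, buf, buflen - 1);
            if (n >= 0)
                buf[n] = '\0';
        }
        close(rfd);
    }
    close(wfd);
    return n;
}
```

With a single O_RDWR descriptor the same sequence would need an lseek()
back to the start before the read, because reads and writes would share one
offset.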

Thanks for any help.

-Zhihui

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




mmap of a network buffer

1999-05-21 Thread Zhihui Zhang


I really do not know how to describe the problem, but a friend here asked
me how to mmap a network buffer so that there is no need to copy the data
from user space to kernel space. We are not sure whether FreeBSD can
create a device file (mknod) for a network card.  If it can, we could use the
mmap() call, because mmap() requires a file descriptor; we assume
that the file descriptor can be acquired by opening the network device.
If this is infeasible, is there another way to accomplish the same goal?

Thanks for any enlightenment.

-Zhihui

--
Zhihui Zhang.  Please visit http://www.freebsd.org
--




A bug in namei cache?

1999-05-25 Thread Zhihui Zhang


Suppose you want to mv a directory file (with subdirectories) to another
name (it is like grafting a subtree to another point), the namecache
associated with the source directory file will be purged by calling
cache_purge() (done in ufs_rename()?).  However, the routine cache_purge() 
does not purge cache entries recursively down the subtree.  Will this
result in a lot of stale entries in the namecache? FreeBSD 3.1 no longer
allows stale entries in the namei cache (FreeBSD 2.2.8 does). 

Thanks for any help.


Re: File system gets too fragmented ???

1999-05-27 Thread Zhihui Zhang


> 
> It might help somewhat if a file that grows by a fragment can allocate
> the free fragment immediately preceding it instead of being relocated
> to a fresh block.  I don't know if FFS does this or not.
> 

Really? FFS allocates free fragments using a bitmap, so it should be able to
find free fragments anywhere.

-Zhihui




Re: A bug in namei cache? (stale entries)

1999-05-27 Thread Zhihui Zhang

On 27 May 1999, Ville-Pertti Keinonen wrote:

> zzh...@cs.binghamton.edu (Zhihui Zhang) writes:
> 
> > Suppose you want to mv a directory file (with subdirectories) to another
> > name (it is like grafting a subtree to another point), the namecache
> > associated with the source directory file will be purged by calling
> > cache_purge() (done in ufs_rename()?).  However, the routine cache_purge() 
> > does not purge cache entries recursively down the subtree.  Will this
> > result in a lot of stale entries in the namecache? FreeBSD 3.1 no longer
> 
> The name cache only caches component names, not paths, so the entries
> are still valid.
> 

Thanks for your reply. I understand now that the namecache only acts on
individual component names, not on the entire pathname.  The following is
based on my understanding: 

Suppose you have a directory hierarchy a -> b -> c.  Directories a and b
contain the following files:

a: ., .., a1, a2, a3, b   (a1, a2, a3 are not directory files)
b: ., .., b1, b2, b3, c   (b1, b2, b3 are not directory files)

If I do a "mv a a_new", then cache entries for a, a1, a2, a3, b will be
purged from the cache. Although b is purged from the namecache, we can
still find it by other means (e.g. ufs_ihashget() called by ffs_vget()). 
So the entries for b1, b2, b3, c are still useful.  So the namei cache
will not contain any stale entries. 

Am I right? 

-Zhihui


Algorithm used to delete part of a file

1999-05-28 Thread Zhihui Zhang


I am wondering what will happen to the underlying data blocks and indirect
blocks of a file if I delete a part of the file - how these blocks are
re-organized. I have no idea which source code I should look into to
understand this.  Maybe I should read the source code for vi or another
editor.  I hope someone can suggest a better way to understand this or
briefly describe the algorithm.

Any help is appreciated.

-Zhihui


Re: Algorithm used to delete part of a file

1999-05-28 Thread Zhihui Zhang

On Fri, 28 May 1999, Christopher R. Bowman wrote:

> It is difficult to understand if you are talking at the file system layer
> (because you mention data and indirect blocks) or the application layer (you
> mention looking at the vi code).  At the file system layer you don't delete
> blocks in the middle of a file.  You can append to the file, thus allocating
> new data blocks (and perhaps indirect blocks if they are needed) that will be
> added to the end of the file.  Or you can truncate a file, thus freeing the
> data blocks (and perhaps indirect blocks) at the end of the file.  When you
> truncate a file the data blocks are returned to a list of free blocks, and
> when a block is later reused for another purpose it is either written to in
> its entirety or zero filled, and then partially filled with your data (if you
> don't write the entire block).  In either case blocks are never added or
> removed except at the end of the file, thus blocks never have to be
> "re-organized."  They are simply allocated or freed.  If this is the level of
> your interest then looking at vi source code won't help you.

Thanks for your valuable information. This explains why I have not found
any routines in the files under /ufs/ffs and /ufs/ufs that re-organize the
on-disk image of a file in that way. If a middle part of a file is
deleted, then all the remaining part of the file must be read by an editor
(such as vi) and written out to another place before the file length is
truncated. This algorithm does not seem very efficient. But a disk is not
like memory, where we can simply modify pointers to point to new locations,
so I guess there may be no better way to do this.  If you have any
ideas about why this is not done by the filesystem itself, please let me
know.

Thanks for your help.

Zhihui


Re: Algorithm used to delete part of a file

1999-05-29 Thread Zhihui Zhang

On Sat, 29 May 1999, Duncan Barclay wrote:

> Primarily the file system is a "block" orientated storage media where a
> "block" is the fragment size or a file system block. Addressing in the
> filesystem is done on a block by block basis. As each block is a number of
> bytes we cannot use byte addressing to simply move pointers around.
>
> If you find the papers written by Rob Pike on the editor "Sam" under Plan-9 he
> goes into a lot of detail about algorithms for removing/adding bytes into a
> storage area with block addressing.
>
> 
Thanks.  I have found a paper named "the text editor sam" by Rob Pike in
1987 at http://plan9.bell-labs.com/cm/cs/papers.html.

-Zhihui




question about vnode and inode locking

1999-05-30 Thread Zhihui Zhang


It seems to me that we can lock at the vnode layer AND at the inode layer. 
Since an inode is always associated with a vnode, and is accessed via its
vnode, I do not see the reason why we should lock the inode after having
locked the vnode.  Can anyone help me with this? 

Thanks a lot.

-Zhihui





Accessing special device files

1999-05-31 Thread Zhihui Zhang


I wrote a small program to read/write each FreeBSD partition via special
device file names, e.g. /dev/wd0s2e, /dev/rwd0s2e, etc. I have two
questions about doing this:

(1) If I try to read() on these files, the buffer size must be given in
multiples of 512 (sector size).  Otherwise, I will get an EINVAL error. 
Why is this the case?  Does the same thing happen to the write() system
call?

(2) When I use lseek() on these device files, it returns the correct offset
for me, but it does not actually work. I read in a recent posting that
you can't expect lseek(fd, 0, SEEK_END) to work unless the file descriptor
is associated with a regular file because file size information is not
available at that level.  Does this apply to all kinds of lseek(), including
SEEK_SET and SEEK_CUR?  Or maybe the offset must also be given in a multiple
of 512 for some reason.  If I give lseek(fd, 8193, SEEK_SET), will it
actually do lseek(fd, 8192, SEEK_SET)?

Thanks for any help.


Re: Accessing special device files

1999-06-01 Thread Zhihui Zhang

On Tue, 1 Jun 1999, Wes Peters wrote:

> 
> ???
> 
> dd verifies the behavior you report:
> 
> r...@homer# dd if=/dev/rwd0s2b of=/dev/null bs=1
> dd: /dev/rwd0s2b: Invalid argument
> ...
> r...@homer# dd if=/dev/rwd0s2b of=/dev/null bs=512
> ^C18805+0 records in
> ...
> 
> w...@homer$ ls -l /dev/*wd0s2a
> crw-r-  1 root  operator3, 0x0003 Apr  1 11:10 /dev/rwd0s2a
> brw-r-  1 root  operator0, 0x0003 Apr  1 11:10 /dev/wd0s2a
> 
> The rwd device is clearly a character-special device, the wd device a
> block special.  Character devices can always be read byte-at-a-time,
> by definition.  When did the semantics of this change?
> 

I have verified, from the source code point of view, the requirement that
this character device must be read in multiples of 512 (the disk involved
is an IDE drive):

When we call the read(int d, void *buf, size_t nbytes) system call, the
argument nbytes is passed on to the iov_len field of an iov structure (see
file sys_generic.c). Later, the routine vn_read() in file vfs_vnops.c is
called via the fileops structure, and the uio structure is passed along.
vn_read() will call spec_read() via VOP_READ() because we are talking
about a raw device file. spec_read() will call wdread() via the cdevsw
table.  wdread() will call physio(), where b_bcount of a buffer is set to
iov_len.  The routine wdstrategy() invoked by physio() will check whether
bp->b_bcount % DEV_BSIZE != 0.  If it detects a request size that is not
a multiple of 512, it will set b_error = EINVAL. This error will be picked
up by physio() and returned.

Thanks for your help.

-Zhihui


The choice of MAXPHYS

1999-06-03 Thread Zhihui Zhang


The value of MAXPHYS is chosen to be 64K for the maximum raw I/O transfer
size. I am wondering why it is not set larger.  The maxcontig value of FFS
defaults to 16, which means 16*8192 or 128K bytes (twice as big as
64K). If we raise the value of MAXPHYS, we can put more data blocks of a
big file contiguously on the disk (perhaps even more than 16 blocks to
achieve better performance). Am I right? Is there any limit on the value
of MAXPHYS?

Any help is appreciated.

-Zhihui




Re: allocate file blocks contiguously

1999-06-06 Thread Zhihui Zhang


> 
> For more info about maxcontig, you can refer to the well-known
> paper of McKusick et al about Fast File System.  It is a parameter
> that is hardware dependent.  You can't get performance just by
> increasing its value.  Unfortunately, I don't have on-line version
> of that paper.
> 
> 
> --Farshid
> 
> On Wed, 2 Jun 1999, Zhihui Zhang wrote:
> 
> > 
> > In FFS, there is a parameter called maxcontig (default to 16) that
> > determines the number of blocks we can allocate contiguously for a single
> > file.  What is its optimal value? I mean, if we allocate ALL the data
> > blocks of a very big file contiguously, will its I/O performance be
> > improved greatly?  It seems to me this number may also be limited by
> > system buffering capability (MAXPHYS?) and underlying hardware controller.
> > Can anyone give me some hints on the choice of the value of maxcontig?
> > 

I read the paper at http://docs.FreeBSD.org/44doc/, which is basically the
same as in the 4.4 BSD book (p276).

My feeling is that if we allocate ALL the data blocks of a big file
contiguously, this will lead to "too much localization" as described in
the paper (or the book). However, this may be good for this big file if
the system buffering capability and hardware allow it (at the cost of
other files?) 

Regards,

Zhihui






help with I/O optimization with object

1999-06-07 Thread Zhihui Zhang


While studying the file ufs_readwrite.c, I see routines like uiomoveco()
that call vm_uiomove() in vm_map.c.  I am almost sure that these are new
in FreeBSD 3.x. The comment in ffs_read() says "not a VM based I/O
requests"  == "not headed for the buffer cache". This does not make sense
to me although I understand something about VMIO buffers and non-VMIO
buffers. I hope someone can explain the basic ideas of I/O optimization
with VM object (relating to the OBJ_OPT flag and the global variable
vfs_ioopt) so that I can understand the code more easily.

Any help is appreciated. 


Re: help with I/O optimization with object

1999-06-07 Thread Zhihui Zhang

On Mon, 7 Jun 1999, Zhihui Zhang wrote:

> 
> While studying the file ufs_readwrite.c, I see routines like uiomoveco()
> that call vm_uiomove() in vm_map.c.  I am almost sure that these are new
> in FreeBSD 3.x. The comment in ffs_read() says "not a VM based I/O
> requests"  == "not headed for the buffer cache". This does not make sense
> to me although I understand something about VMIO buffers and non-VMIO
> buffers. I hope someone can explain the basic ideas of I/O optimization
> with VM object (relating to the OBJ_OPT flag and the global variable
> vfs_ioopt) so that I can understand the code more easily.
> 

After searching the mailing list archive for some time and tracing down
who calls vm_uiomove(), it seems to me that this is the zero copy read
stuff used to read data into the current process' address space.  However,
I do not know when it can be useful or any more details.

-Zhihui


What is FTW?

1999-06-09 Thread Zhihui Zhang


In the FAQ of FreeBSD 2.X, 13.12. Alternative layout policies for
directories, there is the following sentence: 

Most filesystems are created from archives that were created by a depth
first search (aka ftw). 

What does ftw stand for (my guess is File Tree Walk)? Can anyone give me
examples of programs that create archives from a file tree in a depth
first way? Do these programs rebuild the file tree from the archive exactly
as it was created?

Any help is appreciated.


Re: What is FTW?

1999-06-10 Thread Zhihui Zhang

On Wed, 9 Jun 1999, Zhihui Zhang wrote:

> 
> In the FAQ of FreeBSD 2.X, 13.12. Alternative layout policies for
> directories, there is the following sentence: 
> 
> Most filesystems are created from archives that were created by a depth
> first search (aka ftw). 
> 
> What does ftw stand for (My guess is File Tree Walk)? Can anyone give me
> examples of programs that create archives from a file tree in a depth
> first way? Do these programs rebuild the file tree from archive exactly as
> they were created?
> 

I have just found that ftw does stand for File Tree Walk and there is a C
library routine named ftw() (XPG4 standard) in AIX and HP-UX.  However, I
cannot find the same routine in the FreeBSD manual pages.  Maybe it is not
supported by FreeBSD.

-Zhihui


The clean and dirty buffer list of a vnode

1999-06-11 Thread Zhihui Zhang

What does VOP_FREEBLKS() do?

1999-06-19 Thread Zhihui Zhang


I find in the routine ffs_blkfree() there is a new statement saying:

VOP_FREEBLKS(ip->i_devvp, fsbtodb(fs, bno), size);
  
which calls spec_freeblks() in file spec_vnops.c.  The routine
spec_freeblks() looks simple.  When D_CANFREE is set, it gets an empty
buffer and calls the strategy routine for the buffer.  Since B_READ is not
set, we must be calling the strategy routine to write some data.  But where
is the data for the buffer?  Why do we call VOP_FREEBLKS() at the time we
are going to free the blocks?  BTW, this vnode operation is not listed in
the man pages.

Any help is appreciated.


Jeroen Ruigrok/Asmodai's project

1999-06-22 Thread Zhihui Zhang


Dear Jeroen Ruigrok/Asmodai:

I received your email concerning your documentation project a week ago. I
tried to respond a couple of times, but I could not reach your private
email address. I have written a much longer email. Anyway, I am afraid
that being a one year old newbie I could not help as much as you expect. I
appreciate all the help I have received from you and others on this list. 
 

Difference between msync() and fsync()

1999-06-23 Thread Zhihui Zhang


After we mmap a file, we can write back the dirty pages of the file either
by calling msync() or fsync(). After reading the source code, it seems to
me that they actually do the same thing.  msync() will eventually call
VOP_FSYNC() as fsync() does. Since msync() has already called the routine
vm_object_page_clean() to write back the dirty pages of the file,
VOP_FSYNC() really does not have much left to do except update the inode.

So is there any real difference between msync() and fsync() on mmapped
files? Or are they simply provided to do the same thing in an alternate
way?

Thanks for any help.


Implementation of mmap() in FreeBSD

1999-06-26 Thread Zhihui Zhang

Re: RE: Implementation of mmap() in FreeBSD

1999-06-28 Thread Zhihui Zhang

> 
> Because we can't realign the data in the pages without doing a buffer
> copy.  To force mmap() to align the data to the start of the page requires
> it to allocate memory and copy the in-core disk cache to the new memory.
> 
> This is extremely wasteful of cpu and memory.  The current UNIX mmap
> implementation is able to simply map the existing in-core disk cache
> directly to the process - no buffer copying is required at all, and 
> it is extremely memory efficient.

I guess you are talking about VMIO buffers where the pages are found and
registered into the buffer header during allocbuf().  When we do I/O on
VMIO buffers using the conventional system call method, we specify UIO_NOCOPY
to instruct uiomove() not to copy the data.

> Programmers who use mmap() expect it to be as close to optimal as
> possible. 

I wrote a program to test mmap() today. It turns out that a user can
modify the part of the mmapped area that is within the system-returned
area but not part of the user-specified area.

As I understand it, there are two access paths to a file: conventional I/O
through read/write system calls and memory-mapped I/O.  Both of them
converge at the vnode read and write routine (VOP_READ() and VOP_WRITE()).
This should give us the opportunity to guard against illegal memory-mapped
I/O writes made by the user.

Maybe we can add some fields in the vm_object to record the real or
user-specified area which can be passed to the vnode read and write
routine. In the vnode I/O routine, we should be able to limit the write to
only the original part of the area specified by the user.  This practice
should not incur any performance loss.

-Zhihui






Re: RE: Implementation of mmap() in FreeBSD

1999-06-28 Thread Zhihui Zhang


On Mon, 28 Jun 1999, Matthew Dillon wrote:

> :> it is extremely memory efficient.
> :
> :I guess you are talking about VMIO buffers where the pages are found and
> :registered into the buffer header during allocbuf().  When we do I/O on
> :VMIO buffers using the conventional system call method, we specify UIO_NOCOPY
> :to instruct uiomove() not to copy the data.
> 
> UIO_NOCOPY is used to handle a degenerate case in the VFS/BIO vs VM
> interaction for I/O, it has nothing to do with the read() or write()
> syscall per se, nor is it related to the mmap code.
> 
> :> Programmers who use mmap() expect it to be as close to optimal as
> :> possible. 
> :
> :I wrote a program to test mmap() today. It turns out that a user can
> :modify the part of the mmapped area that is within the system-returned
> :area but not part of the user-specified area.
> :
> :As I understand it, there are two access paths to a file: conventional I/O
> :through read/write system calls and memory-mapped I/O.  Both of them
> :converge at the vnode read and write routine (VOP_READ() and VOP_WRITE()). 
> :This should give us the opportunity to guard against illegal memory-mapped
> :I/O writes made by the user. 
> 
> They converge in the VMIO page cache.

By converge, I mean VOP_GETPAGES() and VOP_PUTPAGES() will call VOP_READ()
and VOP_WRITE() just as the read() and write() system calls do.

> 
> :Maybe we can add some fields in the vm_object to record the real or
> :user-specified area which can be passed to the vnode read and write
> :routine. In the vnode I/O routine, we should be able to limit the write to
> :only the original part of the area specified by the user.  This practice
> :should not incur any performance loss.
> :
> :-Zhihui
> 
> mmap bypasses the vnode.  What you propose will not work because even if
> the VM object is process-specific, the pages underlying the VM object are
> not.  If several processes are mmap()ing overlapping portions of the file,
> they are *sharing* the pages.  So even though they are not sharing the 
> VM object, the VM system will not be able to tell which process modified
> the page, and therefore any byte-ranged limits specified in the VM object
> will be useless.

This is a good point!  I have never thought of it before.  Thanks.

-Zhihui




A way to crash system (3.1 & 3.2) with floppy

1999-06-28 Thread Zhihui Zhang


Suppose you have a *write-protected* DOS floppy and you do:

# mount -t msdos /dev/fd0 /floppy  <-- this is OK

# cp somefile /floppy  <-- a lot of error messages

# umount /floppy   <-- crash

Now the system tries to sync the dirty buffers and fails.  You have to
press a key to reboot. 

Is something wrong here, or does FreeBSD simply not handle this case
gracefully?

Thanks for any help.


reason for slow user-user memory copy

1999-07-01 Thread Zhihui Zhang


A graduate student here implemented a mmap() interface to a TCP/IP network
card.  He notices that it takes much longer to copy from the mmap()'ed
area to another user area than it takes to copy the same amount of data
from kernel space to user space. The students here have no idea why this
could be possible.  I hope someone on this list can give us a hint. Below
is a part of his original email.  He uses the rdtsc instruction to do the
timing.


Well I have implemented a memory mapped interface for the user in Linux
using the DEC 21140 Tulip ethernet card. Thus the user has access to the
buffers, but when I did a memcpy from the RX buffer to the user variable,
it took an extraordinary amount of time, approx 70 microsec for 1460
bytes... whereas the original scheme takes 25 microsec for the same data
when it does a memcpy_to_iovec in tcp_recvmsg().

I am confused by these unexpected timings. More than 80% of the time is
spent doing the memcpy.
---

Thanks for your help.


Re: reason for slow user-user memory copy

1999-07-01 Thread Zhihui Zhang


On Thu, 1 Jul 1999, David Greenman wrote:

> >A graduate student here implements a mmap() interface to a TCP/IP network
> >card.  He notices that it takes much longer to copy from the mmap()'ed
> >area to another user area than it takes to copy the same amount of data
> >from kernel space to user space. The students here have no idea why this
> >could be possible.  I hope someone on this list can give us a hint. Below
> >is a part of his original email.  He uses rdtsc instruction to do the
> >timing. 
> >
> >
> >Well I have implemented a memory mapped interface for the user in Linux
> >using the DEC 21140 Tulip ethernet card. Thus the user has access to the
> >buffers, but when I did a memcpy from the RX buffer to the user variable,
> >it took an extraordinary amount of time, approx 70 microsec for 1460
> >bytes... whereas the original scheme takes 25 microsec for the same data
> >when it does a memcpy_to_iovec in tcp_recvmsg().
> >
> >I am confused by these unexpected timings. More than 80% of the time is
> >spent doing the memcpy.
> >---
> 
> If the mapping is being done via a device mapping, then the region will
> be marked non-cacheable.
> 
> -DG

I remember that he said he created a character device /dev/tulip to
represent the network card. Actually, his work borrowed a lot from the
Cornell U-Net project (now the basis of VIA?). Can we change the
corresponding page table (directory) entries to be cacheable as needed?

-Zhihui 





Overwrite an executable file that is running

1999-07-06 Thread Zhihui Zhang


For a big executable file that is being run by the OS, not all of its
contents may be loaded into memory.  Meanwhile, the developer gets
impatient and wants to create a new version of the same file.  He could
modify the makefile to output the new version to a different file name,
but this is tedious. The new version should not overwrite the older
version of the file while it is being run. My question is: how does FreeBSD
prevent this from happening?  Can anyone point out for me where in the
source code this is handled?

Thanks a lot.


Wrong comment in VM code?

1999-07-08 Thread Zhihui Zhang


At the beginning of the file vm_object.c, we have the following comment:

The only items within the object structure which are modified after time
of creation are:

reference count    locked by object's lock
pager routine      locked by object's lock

But at the end of vnode_pager_setsize(), we modify the size field.  So at
least three items can be modified after creation.  Am I right?

Thanks for any help.


Help with PCI code understanding

1999-07-15 Thread Zhihui Zhang

Can someone outline the initialization process of PCI devices in
FreeBSD?  I know much of the basic PCI material introduced in the book
"PCI System Architecture".  I just want to know how each driver is
registered into some linker set and how its probe routine gets called.  In
other words,  I want to know the major data structures and routines and
their relationship. I wonder if there is already a document somewhere.

Any help is appreciated.

-Zhihui




Buffer emergence reserve

2001-04-18 Thread Zhihui Zhang



While looking at the code in vfs_bio.c, I notice the existence of low and
high free buffer counters. The comments say they are there to give some
special processes like the buf daemon access to an emergency reserve.  I just
don't get the reason for having this emergency reserve.  Do we allocate
buffers in an interrupt environment? Do we need extra buffers in order to
free buffers?  Please shed some light on this for me.  Thanks.

-Zhihui



Re: Buffer emergence reserve

2001-04-18 Thread Zhihui Zhang



Thanks! I am wondering whether the free VM page reserve exists for a similar
reason, i.e., to clean dirty pages you need more pages. Probably not;
that is for interrupt routines that cannot block.

On Wed, 18 Apr 2001, Alfred Perlstein wrote:

> * Zhihui Zhang <[EMAIL PROTECTED]> [010418 09:18] wrote:
> > 
> > While looking at the code in vfs_bio.c, I notice the existence of low and
> > high free buffer counters. The comments say they are there to give some
> > special processes like the buf daemon access to an emergency reserve.  I just
> > don't get the reason for having this emergency reserve.  Do we allocate
> > buffers in an interrupt environment? Do we need extra buffers in order to
> > free buffers?  Please shed some light on this for me.  Thanks.
> 
> It's really a simple issue of:
> 
>   "sometimes to clean a buffer we need one or more buffers"
> 
> Think of some random data block at the far end of a large file.
> 
> If the indirect blocks aren't in memory you will need to bring
> them in to lookup the location of the buffer you're writing
> because buffers use logical offsets rather than physical ones.
> 
> -- 
> -Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
> Represent yourself, show up at BABUG http://www.babug.org/
> 



cv_wait() or sv_wait() in FreeBSD

2001-04-23 Thread Zhihui Zhang



Do we have condition/synchronization variable support in FreeBSD? If
not, is there any alternative mechanism to use in the kernel? Thanks.

-Zhihui



shared versus exclusive lock

2001-05-24 Thread Zhihui Zhang



According to my reading of kern_lock.c, it does support shared locks.
However, we are still using LK_EXCLUSIVE mode more often than necessary.
If I want to look up a directory or to read a buffer, I should be able to
use an LK_SHARED lock. So far I have found only a few places that use
LK_SHARED, such as vn_read(). Is there any reason behind this?  If I want
to change this in my code, is there anything I should pay special
attention to? Thanks.

-Zhihui



Confusion with mknod() and devfs

2001-06-21 Thread Zhihui Zhang



There is the following comment inside ufs_mknod():

/*
 * Remove inode, then reload it through VFS_VGET so it is
 * checked to see if it is an alias of an existing entry in
 * the inode cache.  
 */ 

I really can not understand it. For each new disk inode, we call
ufs_vinit() from ffs_vget() and ufs_vinit() calls addaliasu() to add the
vnode to the alias list. So why reload?  The alias vnode is already
handled after it calls ufs_makeinode().

Since DEVFS is in use, will it prevent a user from creating alias names for
the same device?  If so, there is no need to handle aliases in the kernel.

According to the red daemon book, alias vnodes are used to make cache
coherent (vp as a key).  But getblk() stuff does not seem to check it.  
This makes me feel the code is there for historical reasons.

Thanks for any clarification.

-Zhihui



Re: Confusion with mknod() and devfs

2001-06-23 Thread Zhihui Zhang

On Fri, 22 Jun 2001, Terry Lambert wrote:

> Zhihui Zhang wrote:
> > According to the red daemon book, alias vnodes are used to make cache
> > coherent (vp as a key).  But getblk() stuff does not seem to check it.
> > This makes me feel the code is there for historical reasons.
> 
> The "BSD 4.4" book was written about a system without a
> unified VM and buffer cache.  The aliases it is talking
> about are the buffers hung off a file vnode and the
> buffers hung off a device vnode, from which that file
> was being read.

I think you misunderstood me. I was talking about a device with
more than one name.  So we can have more than one vnode for the same
device. (If there is more than one name for the same device in the same FS,
they can share the vnode; otherwise, they cannot.)

Specifically, I fail to understand why we reload the inode in
ufs_mknod():

/*
 * Remove inode, then reload it through VFS_VGET so it is
 * checked to see if it is an alias of an existing entry in
 * the inode cache.
 */
vput(*vpp);
(*vpp)->v_type = VNON;
/* Save this before vgone() invalidates ip. */
ino = ip->i_number;  
vgone(*vpp);
error = VFS_VGET(ap->a_dvp->v_mount, ino, vpp); 

I wonder whether, with DEVFS in use, special device aliases can no longer
exist, because device nodes are created by the kernel instead of by
administrators.

-Zhihui

> The reason getblk() doesn't check it is that the cache is
> maintained as coherent, so there's no need, since the
> check is intended to permit explicit coherency operations
> to take place, when necessary.  There is a lot of "missing"
> code you aren't seeing that is referenced by the book.
> 
> It is still possible to create aliases, but they are done
> by having multiple vm_object_t's pointing to the same data
> blocks as backing objects.  This only occurs in the case
> of stacking VFS's with a non-trivial relationship (e.g.
> where the backing object contents would not be the same
> between layers).  It can also occur to some small extent
> in the NFS client FS case.
> 
> -- Terry
> 


Re: Confusion with mknod() and devfs

2001-06-24 Thread Zhihui Zhang

On Sat, 23 Jun 2001, Terry Lambert wrote:

> Zhihui Zhang wrote:
> > I think you got me wrong. I was talking about a device
> > with more than one names.  So we can have more than one
> > vnode for the same device. (If there is more than one name
> > to the same device in the same FS, they can share the vnode,
> > otherwise, they cannot.)
> 
> This is not how it works.  The specfs/devfs will return
> the same vnode.
> 
> A "special device" file type in the traditional sense is
> a major/minor/{block|character} tuple.
> 
> The entry in an FS that references this is _not_ where
> the vnode comes from, it's a hint to tell the system to
> get the vnode from a single place, instead (specfs in a
> traditional system, vfs in a less traditional system).
> 
> 
> > Specifically, I fail to understand why we reload the inode
> > in ufs_mknod():
> 
> Because when you make the node, you may have an existing
> open reference to the same major/minor/{block|character}
> tuple, and you don't want to duplicate it in the ihash
> cache.
> 

Thanks.  But I still don't get it.  The ihash is keyed on i_dev (the
device on which the filesystem is mounted) and i_number.

If two names in one filesystem refer to the same device, then their
inode numbers must be different.

If two names from different filesystems refer to the same device, then
their i_dev values are different even if their inode numbers happen to be
the same.

So I do not see how we can avoid duplicate entries in the ihash cache.
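
To make the two cases concrete, here is a minimal user-space sketch of the
key comparison as I understand it.  The struct and function names are made
up for illustration and are not the kernel's; the only thing taken from the
discussion is that the hash key is the (i_dev, i_number) pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the (i_dev, i_number) pair used as the hash key. */
struct ikey {
    uint32_t dev;   /* device the filesystem is mounted on (i_dev) */
    uint32_t ino;   /* inode number within that filesystem (i_number) */
};

/* Two inodes share a hash entry only if both fields match. */
bool
ikey_equal(struct ikey a, struct ikey b)
{
    return a.dev == b.dev && a.ino == b.ino;
}

/* Returns true when neither aliasing case produces equal keys. */
bool
aliases_have_distinct_keys(void)
{
    /* Case 1: two names on the same filesystem, hence two different inodes. */
    struct ikey a = { .dev = 1, .ino = 100 };
    struct ikey b = { .dev = 1, .ino = 101 };
    /* Case 2: a name on another filesystem that happens to reuse inode 100. */
    struct ikey c = { .dev = 2, .ino = 100 };

    assert(ikey_equal(a, a));   /* sanity check: a key matches itself */
    return !ikey_equal(a, b) && !ikey_equal(a, c);
}
```

In both cases the keys differ, which is why the lookup by (i_dev, i_number)
alone does not seem able to detect that the names refer to the same device.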

-Zhihui


trace a library call

2001-06-27 Thread Zhihui Zhang



Suppose I write a program that calls sbrk(). How can I trace into the
function sbrk()? In this particular case, I want to know whether
sbrk() calls the function in file lib/libstand/sbrk.c or sys/sbrk.S.
Sometimes it is nice to see what system call is eventually called as well.
I know dynamic linking may make this hard. But is there a way to do
this? Thanks.

-Zhihui



Re: does data overflow in pipes

2001-06-27 Thread Zhihui Zhang



I guess the kernel will block a process that tries to write more data than
the pipe can accommodate. Or, if you are using non-blocking I/O, the write
will return an error.
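
The non-blocking case can be checked from user space.  The following is only
a sketch under my own assumptions (the function name and the 512-byte chunk
size are arbitrary); it relies on nothing beyond the standard pipe(2),
fcntl(2) and write(2) calls.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Fill a pipe through a non-blocking write end and report how many bytes
 * the kernel accepted before refusing more.  Returns -1 if the setup fails.
 * On return, *last_errno holds the errno of the write that was refused.
 */
long
fill_pipe_nonblocking(int *last_errno)
{
    int fds[2];
    char chunk[512] = {0};
    long total = 0;
    ssize_t n;

    assert(last_errno != NULL);
    if (pipe(fds) == -1)
        return -1;

    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Nobody reads from fds[0], so the pipe buffer eventually fills up. */
    while ((n = write(fds[1], chunk, sizeof(chunk))) > 0)
        total += n;

    *last_errno = errno;    /* save it before close() can change it */
    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Once the pipe is full, the next write() fails with EAGAIN and the data
already in the pipe is left untouched; nothing is overwritten.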

-Zhihui

On Wed, 27 Jun 2001, Manas Bhatt wrote:

> hi all,
>  pipes use only direct blocks to store data. so
> depending on the blocksize, a total of
> 10*blocksize bytes can be written in one go, but what happens
> if a writer process tries to write more than 10*blocksize
> of data in one go? Does the kernel overwrite the
> data in the pipe or not? if yes, why? if not, then how
> does it allow the writer to write more than 10*blocksize of
> data?
>  if someone can direct me to the implementation
> (source files), it would be great.
> thanks
> --manas
> 



Re: trace a library call

2001-06-28 Thread Zhihui Zhang



sbrk() is not supported in FreeBSD as a system call (see file
vm/vm_mmap.c). However, sbrk(0) can reflect the latest end of the heap. I
am interested in how sbrk() interacts with malloc(). I know my question is
too specific.  Thanks for your answer. I did learn a lesson: mixing
abstraction layers is really bad.

-Zhihui

On Thu, 28 Jun 2001, Terry Lambert wrote:

> Zhihui Zhang wrote:
> > 
> > Suppose I write a program that calls sbrk(). How can I trace into the
> > function sbrk()? In this particular case, I want to know whether
> > sbrk() calls the function in file lib/libstand/sbrk.c or sys/sbrk.S.
> > Sometimes it is nice to see what system call is eventually called as well.
> > I know dynamic linking may make this hard. But is there a way to do
> > this? Thanks.
> 
> sbrk() is a system call, not a library call.  It has a
> stub that just loads a register with the call ID and
> does an INT 0x80.
> 
> You can't "trace into" it, since you are in a user space
> program.
> 
> If you want to see how it works, the sources are in /sys;
> but all it does is add pages to the end of the address
> space, in the heap.
> 
> If you are having problems with it, you are probably using
> sbrk() and malloc() in the same program.  Don't do that;
> malloc() traditionally calls sbrk() to get pages, so you
> will have the same effect as trying to use fopen() and
> open() in the same program: mainly, that fd manipulation
> routines can close/open/etc. fd's out from under file
> pointers.  In the sbrk() case, there can be attempts to
> (re)map pages to regions where they don't really belong.
> 
> -- Terry
> 



Re: trace a library call

2001-06-28 Thread Zhihui Zhang



I am sorry. It turns out that when the argument is zero, sbrk() does not
enter the kernel.  If it did, the call would return "not supported".

-Zhihui

On Thu, 28 Jun 2001, Zhihui Zhang wrote:

> 
> sbrk() is not supported in FreeBSD as a system call (see file
> vm/vm_mmap.c). However, sbrk(0) can reflect the latest end of the heap. I
> am interested in how sbrk() interacts with malloc(). I know my question is
> too specific.  Thanks for your answer. I did learn a lesson: mixing
> abstraction layers is really bad.
> 
> -Zhihui
> 
> On Thu, 28 Jun 2001, Terry Lambert wrote:
> 
> > Zhihui Zhang wrote:
> > > 
> > > Suppose I write a program that calls sbrk(). How can I trace into the
> > > function sbrk()? In this particular case, I want to know whether
> > > sbrk() calls the function in file lib/libstand/sbrk.c or sys/sbrk.S.
> > > Sometimes it is nice to see what system call is eventually called as well.
> > > I know dynamic linking may make this hard. But is there a way to do
> > > this? Thanks.
> > 
> > sbrk() is a system call, not a library call.  It has a
> > stub that just loads a register with the call ID and
> > does an INT 0x80.
> > 
> > You can't "trace into" it, since you are in a user space
> > program.
> > 
> > If you want to see how it works, the sources are in /sys;
> > but all it does is add pages to the end of the address
> > space, in the heap.
> > 
> > If you are having problems with it, you are probably using
> > sbrk() and malloc() in the same program.  Don't do that;
> > malloc() traditionally calls sbrk() to get pages, so you
> > will have the same effect as trying to use fopen() and
> > open() in the same program: mainly, that fd manipulation
> > routines can close/open/etc. fd's out from under file
> > pointers.  In the sbrk() case, there can be attempts to
> > (re)map pages to regions where they don't really belong.
> > 
> > -- Terry
> > 
> 
> 



Max DMA size

2000-07-06 Thread Zhihui Zhang



Can anyone tell me what factors determine the max DMA size (DMA counter on
each controller or PCI bus related)? What is the typical max DMA size for
a SCSI disk connected to a PCI bus? It seems to be much larger than
MAXPHYS (128K). If so, does it mean we are not using the full potential of
DMA? And what would go wrong if we enlarged MAXPHYS?

Any help is appreciated.

-Zhihui




KLD, kernel threads, zone allocator

2000-07-17 Thread Zhihui Zhang



I am writing a KLD that causes a kernel fault each time I run the 'ps'
command after 'make unload'.  The KLD has a system call that creates several
kernel threads by calling kthread_create(). During unload, I set a flag for
each thread so that it will call exit1() upon wakeup (the threads sleep on a
timeout).  Before the last thread calls exit1(), it wakes up the KLD unload
process so that 'make unload' can finish. Is there anything wrong with this,
or is there a better solution?

I also use vm_zone to allocate some data structures within the KLD. When
unloading, I can use zfree() to free them, but I cannot free the zone header
itself with free(some_zone, M_ZONE), because M_ZONE is defined as
*static* in vm_zone.c.  I wonder if this will cause a memory leak after
loading and unloading the KLD several times.

Finally, I want to know how to save the panic screen without writing
it down by hand.  Is there any information on debugging at the db> prompt
after a fault?

Any help is appreciated.

-Zhihui




memory type and its size

2000-07-20 Thread Zhihui Zhang



Must kernel memory of the same type (e.g., M_TEMP) be allocated
(using malloc()) with the same size (or range of sizes)?  BTW, how can I
display mbuf cluster usage information? Thanks.

-Zhihui





Re: memory type and its size

2000-07-21 Thread Zhihui Zhang

On Thu, 20 Jul 2000, Zhihui Zhang wrote:

> 
> Does kernel memory of the same type (e.g., M_TEMP) must be allocated
> (using malloc()) with the same (range of) size?  BTW, how to display mbuf
> cluster usages info. Thanks.

A memory type can have memory blocks of different sizes.  Use netstat -m
to display mbuf cluster usage.

-Zhihui


Re: KLD, kernel threads, zone allocator

2000-07-21 Thread Zhihui Zhang



On Mon, 17 Jul 2000, Zhihui Zhang wrote:

> 
> I am writing a KLD that gives me kernel fault each time I run 'ps' command
> after 'make unload'.  The KLD has a system call to create several kernel
> threads by calling kthread_create(). During unload, I set flags to each
> threads so that they will call exit1() upon wakeup (sleep on a timeout).  
> Before the last thread calls exit1(), it wakeup the kld unload process so
> that make 'unload' can finish. Is there anything wrong or better
> solutions?
> 
> I also use vm_zone to allocate some data structes within the KLD. When
> unloading, I can use zfree() to free them except the zone header that I
> can not free(some_zone, M_ZONE).  This is because M_ZONE is defined as
> *static* in vm_zone.c I wonder if this will cause memory leak after
> several loading and unloading the KLD.
> 
> Finally, I want to know how to save the panic screen without hand writing
> it down.  Any info on debugging under db> after fault?
> 
> Any help is appreciated.

Thanks to those who have helped me privately.  It is not a good idea to
use the zone allocator in a KLD.  You must clean up everything before
unloading the KLD. Kernel threads can be reparented to initproc to avoid
the 'ps' panic.

-Zhihui




bridge driver and Yamaha YMF 724

2000-08-08 Thread Zhihui Zhang



Does 4.1-Release support the YAMAHA PCI Audio Controller YMF 724? I have
tried the suggestion given in the pcm man page without success. By the way,
what are the "card with bridge driver support" and the "PnP card" mentioned
in the pcm man page?

Thanks for your help.

-Zhihui




recompiling boot blocks & serial console

2000-08-10 Thread Zhihui Zhang



I want to set up a serial console on a FreeBSD 4.1 box. I followed the
instructions at http://www.mostgraveconcern.com/freebsd/ and tried the
following:

# cd /sys/boot/i386/boot2
# make clean
# make

I got "cannot open ../btx/lib/crt0.o".  What happened?  Besides, I want to
use another FreeBSD box as the console.  Can I use kermit as the terminal
program?  If so, can I also configure it for normal login use?

Any help is appreciated.

-Zhihui




Re: recompiling boot blocks & serial console

2000-08-10 Thread Zhihui Zhang

On Thu, 10 Aug 2000, Mike Smith wrote:

> > 
> > I want to set up a serial console on a freebsd 4.1 box. I follow the
> > instructions at http://www.mostgraveconcern.com/freebsd/.  I tried to do
> > the following:
> 
> Put 
> 
> -h
> 
> in /boot.config.  Now you have a serial console.

Yes!  Two more quick questions: how do I change the baud rate? Can kermit
capture the output?  (I use kermit on the other FreeBSD machine.)

-Zhihui


Re: recompiling boot blocks & serial console

2000-08-10 Thread Zhihui Zhang


On Thu, 10 Aug 2000, Mike Smith wrote:

> > On Thu, 10 Aug 2000, Mike Smith wrote:
> > 
> > > > 
> > > > I want to set up a serial console on a freebsd 4.1 box. I follow the
> > > > instructions at http://www.mostgraveconcern.com/freebsd/.  I tried to do
> > > > the following:
> > > 
> > > Put 
> > > 
> > > -h
> > > 
> > > in /boot.config.  Now you have a serial console.
> > 
> > Yes!  Two more quick questions: how to change baud rate? Can kermit
> > capture the output?  (I use kermit on the other FreeBSD machine).
> 
> You need to recompile the bootblocks to change the baudrate; set 
> BOOT_COMCONSOLE_SPEED in /etc/make.conf, then do:
> 
> # cd /sys/boot
> # make clean cleandepend
> # make depend && make && make install
> # disklabel -B 
> 
Done!  I use Windows 98 HyperTerminal right now because I do not know
which Unix terminal program can capture its output into a file.  Thanks!

BTW, the web page should tell readers to do "make cleandepend" and "make
depend".

-Zhihui






Digital Technical Journal - zhihui

2000-08-15 Thread Zhihui Zhang



Kanad:

I remember you subscribed to some journal a while ago. Was it the "Digital
Technical Journal"?  I found two papers on VAXcluster filesystem design in
No. 5, September 1987. If so, and you happen to have kept that issue, please
lend it to me for a while.  Thanks.

Regards,

-Zhihui

-
FreeBSD - The Power To Serve (http://www.freebsd.org)
-




kernel debugging on 4.1-release

2000-08-22 Thread Zhihui Zhang



I am trying to trace a system call using remote debugging and have found
something that I cannot explain (the relevant source is ffs_write()):

case 1:
---

443 if (object)
(gdb) break 430
Breakpoint 6 at 0xc0289cea: file ../../ufs/ufs/ufs_readwrite.c, line 430.
(gdb) c
Continuing.

Breakpoint 6, ffs_write (ap=0xc64f5e70) at
../../ufs/ufs/ufs_readwrite.c:438
438 p = uio->uio_procp;

In the above case, even though I set breakpoint 6 at line 430, gdb stops at
line 438.

case 2:
---

(gdb) print p->p_limit
$1 = (struct plimit *) 0x

In the above case, the statement has just used p->p_limit to do some
comparison and yet gdb says its value is -1.  The statement using it is:

        if (vp->v_type == VREG && p &&
            uio->uio_offset + uio->uio_resid >
            p->p_rlimit[RLIMIT_FSIZE].rlim_cur) {

Are these bugs in gdb, or am I doing something wrong?  I notice that
4.1-release installs KLD files at the same time you install the kernel. In
the past, I only copied the file kernel.debug to the target machine.  Do I
have to copy those .ko files to the target machine as well?

Any help is appreciated.

-Zhihui




delayed write question

2000-08-24 Thread Zhihui Zhang



I am wondering what exactly happens if a delayed write goes wrong. It
seems to me that the kernel just clears the error flag and marks the
buffer as a delayed write again.  This gives the buffer a second chance.
But how many chances can a buffer get before the write is abandoned?

This may not seem serious on a local filesystem, but consider the NFS
case: if a delayed write to an NFS server fails, how many times will we
retry? My understanding is that the user program will not notice these
retries or aborts until it closes the file.  Am I right?  Please clarify
this for me.

Before 4.0, if we write something to a write-protected floppy, the system
will panic. Obviously, this panic does not happen on 4.0+. So I guess that
the retries must have a limit.

Any help is appreciated.

-Zhihui




Where is PType in /stand/sysinstall defined?

2000-08-27 Thread Zhihui Zhang



In the FDISK-like menu of /stand/sysinstall, the PType (partition type)
column is given values like 1,2,3,4,6.  While the subtype field is
well-defined (e.g., 0xa5 = freebsd), I can not find where the partition
type is explained. I also tried PCguide in vain.  Can somebody explain
this to me?  Is it useful or some obsolete feature? Thanks.

-Zhihui




tr command in DDB

2003-12-25 Thread Zhihui Zhang

Hi,

I always like the command "db> tr 123" in DDB.  Is there an equivalent
command in gdb?  Thanks.

-Zhihui

-- 
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: vfs.vmiodirenable undocumented

2001-07-11 Thread Zhihui Zhang



100% agreed. In this particular vmiodirenable case, you can search the
mailing list archive and will find that people discussed it at least a
year ago. Plus, if you still do not understand it, read the book "The
Design and Implementation of the 4.4BSD Operating System".  Anyway, when
you get something for free, you should be grateful and not complain about
its quality, because you have not paid for it.

-Zhihui

On Wed, 11 Jul 2001, Jordan Hubbard wrote:

> From: Sheldon Hearn <[EMAIL PROTECTED]>
> Subject: Re: vfs.vmiodirenable undocumented 
> Date: Wed, 11 Jul 2001 11:21:24 +0200
> 
> > I'm very concerned with the fact that this style of response has become
> > commonly accepted within the FreeBSD community.
> 
> I would have to disagree that this is an area of concern.
> 
> Let's take it from the other perspective: There are a lot of clueless
> individuals out there who just don't understand the volunteer nature
> of open source and think it's fine to walk up and post criticisms on
> the bulletin board without any truly helpful suggestions, or to demand
> work of volunteers rather than offering to ASSIST them in their
> efforts.  It happens all the time, and each time it does it serves to
> disillusion the volunteers just a little bit more as they wonder just
> why they're doing this for such an ungrateful pack of cretins.
> 
> In such instances, I'd much rather have the volunteer vent a little
> steam and perhaps feel better rather than bottle it up until one day
> it just becomes all too much and they walk away from the project
> entirely.  I'm not being alarmist or dramatic in painting that picture
> either because it's happened more times than I like to think about.
> 
> It's also the case that people tend to only really learn lessons when
> they're hard lessons, and if getting a public spanking (albeit a mild
> one in this case) is what it takes to really drive the point home then
> I'll be the first to hand out paddles.  Some people, like Mr Xu here,
> are even more resistant to clue transfer than most (just read the
> archives) and, if anything, Bruce was being rather admirably
> restrained with his response.
> 
> In short, your approach may be a fine one for conducting sensitivity
> training at the Oh Shamalu Spiritual Center, but I'm not sure it's
> appropriate here.  This is the freebsd-hackers mailing list, and if
> you can't take a little engineering heat then this is probably the
> wrong place for you.  Not everything in life needs to be "kinder and
> gentler", to borrow words from George Bush, and I suspect the folks
> who run police academies and military training programs would be the
> first to agree with me. :)
> 
> - Jordan
> 



SPARE_USRSPACE

2001-07-17 Thread Zhihui Zhang



Can anyone tell me why FreeBSD has 256 bytes of spare space in the user
area?  Thanks.

-Zhihui



Re: KLD Programming

2001-07-18 Thread Zhihui Zhang

Yes. But it is not easy. Look at the code in vfs_vnops.c. You can let a user
process open a file and then pass the file descriptor into the kernel via a
special system call. Search the mailing list archive and you will find
discussions on how to add a new system call.

-Zhihui

On Wed, 18 Jul 2001, suid wrote:

>  
>  Godday.
> 
>  I'm quite new to KLD-programming and have a question:
> 
>   Is it possible to read/write to files from a module without 
>   too much effort, but still staying in kernelspace? 
> 
> 
>  /suid-
> 
> 


Re: using syscalls in a module (stack problem ?)

2001-07-23 Thread Zhihui Zhang



Just out of curiosity, Linux's kernel stack is one page. Where in the
kernel source code does it say that we have a two-page kernel stack
instead of a one-page one?

-Zhihui


On Mon, 23 Jul 2001, Eugene L. Vorokov wrote:

> > > I call this function with (curproc, PATH_MAX+1), and everything is fine
> > > when I have just a few local variables defined in the caller (it all
> > > works on MOD_LOAD only). However, if I have 2 buffers, 4096 bytes each,
> > > as local variables and then try to allocate userspace memory the same
> > > way, kernel crashes - sometimes inside mmap(), sometimes a bit later.
> > > 
> > > Why could this happen ? Is it related to possible stack overflow ?
> > 
> > Yes.  The kernel stack is only two pages; you absolutely must not use 
> > large local variables in the kernel.
> 
> I see. But I still can define them using "static", right ?
> 
> Regards,
> Eugene
> 
> 



Re: cluster size

2001-07-23 Thread Zhihui Zhang



You must be asking why the mbuf cluster size is chosen as 2048, right? It
is probably a tradeoff between memory efficiency and speed.

-Zhihui

On Mon, 23 Jul 2001, [iso-8859-1] vishwanath pargaonkar wrote:

> Hi,
> in freebsd can we change the cluster size from 2048
> bytes.If yes how can we do that?
> do we have to configure in some file?
> 
> TIA
> vishwanath
> 
> 
> 



Re: using syscalls in a module (stack problem ?)

2001-07-23 Thread Zhihui Zhang



Makes sense.  But there are other things in the UPAGES.
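
As a back-of-the-envelope check of the numbers quoted below, here is a small
sketch.  The two constants are the i386 values mentioned in this thread, and
the function name is my own; it treats the whole UPAGES region as stack,
which is optimistic because the u-area shares those pages.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the thread for i386; treated here as assumptions. */
#define SKETCH_UPAGES     2
#define SKETCH_PAGE_SIZE  4096

/*
 * Bytes left in the UPAGES region after `locals` bytes of local variables.
 * A negative result means the locals alone overflow the region.  The real
 * budget is smaller, since other data lives in the same pages.
 */
long
upages_bytes_left(size_t locals)
{
    long region = (long)SKETCH_UPAGES * SKETCH_PAGE_SIZE;

    assert(locals <= ((size_t)1 << 30));    /* keeps the cast below safe */
    return region - (long)locals;
}
```

With two 4096-byte local buffers, upages_bytes_left(2 * 4096) is 0, so even
before counting the other contents of the UPAGES there is no room left for
call frames.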

-Zhihui

On Mon, 23 Jul 2001, Weiguang SHI wrote:

> I guess this is it (/usr/src/sys/i386/i386/locore.s):
> 
> /* now running relocated at KERNBASE where the system is linked to run */
> begin:
>         /* set up bootstrap stack */
>         movl    _proc0paddr,%esp        /* location of in-kernel pages */
>         addl    $UPAGES*PAGE_SIZE,%esp  /* bootstrap stack end location */
> 
> where UPAGES is defined as 2 in
> /usr/src/sys/compile/MYKERNEL/machine/param.h
> 
> #define UPAGES  2   /* pages of u-area */
> 
> Regards,
> Weiguang
> 
> >From: Zhihui Zhang <[EMAIL PROTECTED]>
> >To: [EMAIL PROTECTED], "Eugene L. Vorokov" <[EMAIL PROTECTED]>
> >CC: [EMAIL PROTECTED]
> >Subject: Re: using syscalls in a module (stack problem ?)
> >Date: Mon, 23 Jul 2001 12:07:47 -0400 (EDT)
> >
> >
> >Just out of curiosity, Linux's kernel stack is one page. Where in the
> >kernel source code that says that we can have two pages instead of one
> >page kernel stack?
> >
> >-Zhihui
> >
> >
> >On Mon, 23 Jul 2001, Eugene L. Vorokov wrote:
> >
> > > > > I call this function with (curproc, PATH_MAX+1), and everything is fine
> > > > > when I have just a few local variables defined in the caller (it all
> > > > > works on MOD_LOAD only). However, if I have 2 buffers, 4096 bytes each,
> > > > > as local variables and then try to allocate userspace memory the same
> > > > > way, kernel crashes - sometimes inside mmap(), sometimes a bit later.
> > > > >
> > > > > Why could this happen ? Is it related to possible stack overflow ?
> > > >
> > > > Yes.  The kernel stack is only two pages; you absolutely must not use
> > > > large local variables in the kernel.
> > >
> > > I see. But I still can define them using "static", right ?
> > >
> > > Regards,
> > > Eugene
> > >
> > >
> >
> >



Re: cluster size

2001-07-25 Thread Zhihui Zhang




On Tue, 24 Jul 2001, Terry Lambert wrote:

> Zhihui Zhang wrote:
> > > Hi,
> > > in freebsd can we change the cluster size from 2048
> > > bytes.If yes how can we do that?
> > > do we have to configure in some file?
> > 
> > You must be asking why the mbuf cluster size is chosen as 2048, right? It
> > is probably a tradeoff between memory efficient and speed.
> 
> Ask yourselves:
> 
>   "What is the minimum cluster size I would have to have
>to be able to contain the maximum MTU worth of data,
>yet remain an even multiple of sizeof(mbuf) -- 256
>bytes?"

A dumb question: why an even multiple and not an odd one?

-Zhihui



Re: cluster size

2001-07-25 Thread Zhihui Zhang



I see.  It has something to do with the power-of-two allocator we are
using inside the kernel.
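
The constraint described in the quoted reply can be written down as a small
search.  This is only a sketch: the function name is mine, and the numbers I
plug in (a 1500-byte Ethernet MTU, a 256-byte mbuf, a 4096-byte page) are
assumptions, not values read from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest multiple of msize that can hold mtu bytes and divides page_size
 * without a remainder.  Returns 0 if no such size exists within one page.
 */
size_t
min_cluster_size(size_t mtu, size_t msize, size_t page_size)
{
    assert(msize > 0 && page_size > 0);
    for (size_t size = msize; size <= page_size; size += msize) {
        if (size >= mtu && page_size % size == 0)
            return size;
    }
    return 0;
}
```

For those inputs, 1536 and 1792 are large enough but do not divide 4096
evenly, so min_cluster_size(1500, 256, 4096) comes out as 2048.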

-Zhihui

On Wed, 25 Jul 2001, Bosko Milekic wrote:

> 
> On Wed, Jul 25, 2001 at 01:51:51PM -0400, Zhihui Zhang wrote:
> > 
> > 
> > On Tue, 24 Jul 2001, Terry Lambert wrote:
> > 
> > > Zhihui Zhang wrote:
> > > > > Hi,
> > > > > in freebsd can we change the cluster size from 2048
> > > > > bytes.If yes how can we do that?
> > > > > do we have to configure in some file?
> > > > 
> > > > You must be asking why the mbuf cluster size is chosen as 2048, right? It
> > > > is probably a tradeoff between memory efficient and speed.
> > > 
> > > Ask yourselves:
> > > 
> > >   "What is the minimum cluster size I would have to have
> > >to be able to contain the maximum MTU worth of data,
> > >yet remain an even multiple of sizeof(mbuf) -- 256
> > >bytes?"
> > 
> > A dumb question: why even not odd multiple?
> > 
> > -Zhihui
> 
>   It actually has to do with the fact that 2K is the only size equal to
> or greater than the maximum MTU worth of data that can be multiplied to a page
> size without any leftover (in other words, page size modulo 2K is zero).
> 
> -- 
>  Bosko Milekic
>  [EMAIL PROTECTED]
> 
> 



Re: cluster size

2001-07-27 Thread Zhihui Zhang



I thought doing a memory free is always safe in an interrupt context. Now
it seems doing an allocation of memory is safe too.  Does MCLGET() call
vm_page_alloc() or malloc() eventually?  If so, it might block.

-Zhihui

On Thu, 26 Jul 2001, Terry Lambert wrote:

> Bosko Milekic wrote:
> > > > Er, wouldn't that be the only way for cards to refill their DMA
> > > > receive buffers?
> > >
> > > Look at the Tigon II and FXP drivers.  The allocations in
> > > the macros turn into m_get, not m_clusterget.
> > 
> > From if_fxp.c (fxp_add_rfabuf(), sometimes called from fxp_intr()):
> > 
> > MGETHDR(...);  <-- get mbuf
> > if (m != NULL) {
> > MCLGET(...); <-- get cluster
> > ...
> > }
> 
> Yes, I had misread things.  Alfred pointed this out to me in
> person, earlier.  I had been reading the jumbogram code,
> which uses a separate buffer space, and then just incorrectly
> assumed.
> 
> Thanks for getting the correction into the list archives, so
> that future readers will be less confused: you spared me
> having to do the same.
> 
> -- Terry
> 



Allocate a page at interrupt time

2001-08-01 Thread Zhihui Zhang



FreeBSD cannot allocate from the PQ_CACHE queue in an interrupt
context. Can anyone explain to me why this is the case?

Thanks,

-Zhihui



Allocate a page at interrupt time

2001-08-03 Thread Zhihui Zhang



FreeBSD cannot allocate from the PQ_CACHE queue in an interrupt context.
Can anyone explain to me why this is the case?


Thanks,
  
-Zhihui



ata0-master: non aligned DMA transfer attempted

2001-08-24 Thread Zhihui Zhang



I wrote a program that writes directly to a raw device. Although the
program runs OK, the system prints messages like:

ata0-master: non aligned DMA transfer attempted

What exactly is happening here? Is there a problem in my program?

Thanks.

-Zhihui



Re: ata0-master: non aligned DMA transfer attempted

2001-08-26 Thread Zhihui Zhang



Thanks for your reply. I used gdb and found that the buffer address is
not 16-byte aligned. This raises the question of how to align a
statically allocated data structure properly. Using a union seems to
align the data on a long boundary (or even long long?), but that is not
16-byte alignment.

union {
    my_data_structure_t xyz;
    long pad;
};

The natural alignment seems to work only on primitive data types. If you
define:

unsigned char sector_buf[512];

It will not always be aligned on a 512-byte boundary; even 16-byte
alignment is not guaranteed.  Is there a way to achieve this?
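
One possible answer, sketched under assumptions: with a C11 compiler the
alignment can be requested directly using alignas from <stdalign.h> (GCC
also offers __attribute__((aligned(16))) as an extension). The definition
of my_data_structure_t below is a hypothetical stand-in, and is_aligned()
is a helper invented for this illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* C11: request 16-byte alignment for a statically allocated sector buffer. */
static alignas(16) unsigned char sector_buf[512];

/* Hypothetical stand-in for the structure mentioned in the post. */
typedef struct {
    int  a;
    char b[100];
} my_data_structure_t;

/* Aligning one member raises the alignment of the whole union. */
static union {
    my_data_structure_t xyz;
    alignas(16) unsigned char pad;
} aligned_data;

/* Returns nonzero if p is aligned; alignment must be a power of two. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Calling is_aligned(sector_buf, 16) is a quick way to confirm the result
without reaching for gdb.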


-Zhihui


On Fri, 24 Aug 2001, Julian Elischer wrote:

> Zhihui Zhang wrote:
> > 
> > I wrote a program that writes directly to a raw device. Although the
> > program runs OK, the system prints messages like:
> > 
> > ata0-master: non aligned DMA transfer attempted
> Make sure your DMA buffer is aligned on a 64-byte boundary
> (a page would be best)
> and that you are transferring an exact multiple of 512 bytes.
> 
> The DMA hardware on some machines cannot handle a buffer with less than
> 16-byte alignment (some cannot handle odd alignment; it's a bit hardware
> dependent).
> 
> So be safe and align your buffers.
> 
> 
> When it detects that it cannot do DMA, it uses PIO instead, so your data is
> still transferred...
> 
> > 
> > What exactly happens here? Is there any problem in my program?
> > 
> > Thanks.
> > 
> > -Zhihui
> > 
> 
> -- 
> Julian Elischer [EMAIL PROTECTED]
> hard at work in a very strange country (USA)!
> presently in San Francisco
> 




Re: ata0-master: non aligned DMA transfer attempted

2001-08-27 Thread Zhihui Zhang




On Sun, 26 Aug 2001, Julian Elischer wrote:

> Zhihui Zhang wrote:
> > 
> > Thanks for your reply. I used gdb and found that the buffer address is
> > not 16-byte aligned. This raises the question of how to align a
> > statically allocated data structure properly. Using a union seems to
> > align the data on a long boundary (or even long long?), but that is not
> > 16-byte alignment.
> > 
> > union {
> >     my_data_structure_t xyz;
> >     long pad;
> > };
> > 
> > The natural alignment seems to work only on primitive data types. If you
> > define:
> > 
> > unsigned char sector_buf[512];
> > 
> > It will not always be aligned on a 512-byte boundary; even 16-byte
> > alignment is not guaranteed.  Is there a way to achieve this?
> 
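> Unfortunately not, except to allocate N+16 bytes and align it yourself
> using a second variable:
> 
> x = malloc(buffersize + 16);
> /* round up to the next 16-byte boundary (needs <stdint.h>) */
> y = (char *)(((uintptr_t)x + 15) & ~(uintptr_t)15);
> ... 
> write(fd, y, buffersize);
> ...
> free(x);    /* free the original pointer, not the aligned one */
> exit(0);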
> 
> 
> You may experiment to see what alignment your hardware needs...
> 2?, 4?, 6?, 16?
> 
> when does the message happen?

I believe that message is from ata_dmasetup():

if (((uintptr_t)data & scp->alignment) || (count & scp->alignment)) {
    ata_printf(scp, device, "non aligned DMA transfer attempted\n");
    return -1;
}

The user address obtained by static allocation is not 16-byte aligned. The
kernel routine physio() grabs a physical buffer to do DMA, but it still
uses the user's address.  The KVA associated with the buffer is not used.
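
A userland sketch of the same test, paired with the over-allocate and
round-up approach Julian describes. It assumes that scp->alignment holds a
mask (alignment minus one, e.g. 15), which the kernel code above suggests
but this thread does not confirm. The names dma_ok() and alloc_aligned()
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors the kernel check: both address and length must clear the mask. */
static int dma_ok(const void *data, size_t count, uintptr_t mask)
{
    return ((uintptr_t)data & mask) == 0 && (count & mask) == 0;
}

/*
 * Over-allocate, then round the address up to the next multiple of
 * alignment (a power of two). *raw receives the pointer for free().
 */
static void *alloc_aligned(size_t size, uintptr_t alignment, void **raw)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    *raw = malloc(size + alignment - 1);
    if (*raw == NULL)
        return NULL;
    return (void *)(((uintptr_t)*raw + alignment - 1) & ~(alignment - 1));
}
```

A caller would pass the aligned pointer to write() and the raw pointer to
free(). Where posix_memalign() is available, it does the rounding for you.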

-Zhihui

