Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Ivan Voras
Jona Joachim wrote:

> I don't think it would be a good idea to use SQLite for this purpose.
> First of all using the file system is the Unix way of doing things. It's
> efficient and easy to use, it's transparent and user friendly. You can
> simply run vi to inspect a text file but you can't do this with an
> sqlite database. You have to learn sqlite to do it.

That particular barrier to entry is very low; the learning curve is shallow.

In the end, it's a tradeoff between speed in the general case and ease
of use (from the developer side), vs convenience in the extreme cases.

> Furthermore I don't think the pkg_* tools are slow. They are quite fast
> IMO. If you let pkg_info print the entire list of installed ports it's
> only slow because of your line-buffered console. Just redirect the
> output to a file and you'll see that it's blazing fast. If I compare it

Err, nope. :) I see Eric has provided the numbers.

> for example to Debian's apt-get/apt-cache commands it's much faster.

AFAIK Debian has the same dichotomy FreeBSD has: a tree of text files
used by dpkg, and a binary database (cache) used by apt.

> portupgrade is very slow, that's true. First of all it's written in Ruby
> which is not one of the fastest languages but there is another thing
> that slows it down considerably, which is rebuilding its database.

In my (limited) experience, this sort of task should not depend much on
the speed of the language. The most CPU-intensive task portupgrade does
is resolving dependencies, and on a running system this is a DAG forest
of about 500 nodes. I know portupgrade has some highly suboptimal code in
it (if I understand the code correctly, there's a brute-force check for
cyclic dependencies in it!), but still, in itself, I think the choice of
Ruby isn't performance-critical.
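
To make the point about graph size concrete, here is a minimal sketch of dependency ordering with cycle detection done in a single depth-first pass, so no separate brute-force cycle check is needed. It is written in Python for brevity; the package names and the `install_order` helper are hypothetical and are not taken from portupgrade.

```python
# Hypothetical dependency map: package -> packages it depends on.
deps = {
    "portupgrade": {"ruby", "ruby-bdb"},
    "ruby-bdb": {"ruby", "db4"},
    "ruby": set(),
    "db4": set(),
}

def install_order(graph):
    """Depth-first topological sort; raises ValueError on a dependency cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, in progress, finished
    state = {node: WHITE for node in graph}
    order = []

    def visit(node):
        if state.get(node, WHITE) == GREY:
            # We came back to a node that is still being processed: a cycle.
            raise ValueError(f"dependency cycle through {node}")
        if state.get(node, WHITE) == BLACK:
            return
        state[node] = GREY
        for dep in sorted(graph.get(node, ())):
            visit(dep)
        state[node] = BLACK
        order.append(node)                # all dependencies are already listed

    for node in sorted(graph):
        visit(node)
    return order

order = install_order(deps)
print(order)
```

Each node and edge is visited once, so for a graph of a few hundred nodes the work is negligible in any language.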

> Furthermore I think it would be a very bad idea to include sqlite in
> base. There is already a lot of third party stuff in base. The
> philosophy of the BSDs is to provide and maintain an entire OS. This is
> quite the opposite of how a GNU/Linux system is designed. Both ways have
> their pros and cons. An advantage of the BSD way of doing things is that
> the developers know the code very well and have control over the quality
> of the code. If you include 3rd party software into the FreeBSD base
> system you make the FreeBSD project depend on the people that wrote that
> code. Of course you could fork it but the FreeBSD developers are not
> necessarily familiar with the code. Security patches would have to be
> merged all the time and a lot of communication between the two projects
> is needed.

I think this line of reasoning was made invalid by the continuing
inclusion of sendmail, bind, expat (xml parser!), etc.

Not that I don't realize this increases the maintenance burden, but
including a "frozen" branch of a library, which is supported but won't
be changed for ages, isn't going to increase it much.

Offloading much of the "smarts" to a database would also permit easier
reimplementation of a portupgrade-like tool in C, since the heavy parsing
/ regex facilities scripting languages offer won't be used as much.

But yes, it's a heavy departure from "the unix way".

> I think the best way to go would be to use only folder hierarchies and
> text files and write a library in C that provides portupgrade
> functionality. The code under src/usr.sbin/pkg_install/lib/ would be a
> good base for this. Then you could use a frontend program that makes use
> of this library. This frontend could be a CLI program or a GUI based
> program.

The issue in this thread (at least for me) is performance and
reliability, and creating a C wrapper around the current situation won't
solve either.





Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Stanislav Sedov
On Fri, 11 May 2007 02:10:05 +0200
Ivan Voras <[EMAIL PROTECTED]> mentioned:

> - I think it's time to give up on using BDB+directory tree full of text
> files for storing the installed packages database, and I propose all of
> this be replaced by a single SQLite database. SQLite is public domain
> (can be slurped into base system), embeddable, stores all data in a
> single file, lightweight, fast, and can be used to do fancy things such
> as reporting.

What is the reason to use an SQL-based database? Will you perform direct
queries on the database? The packaging system is for ordinary users, not SQL
geeks, so they should not have to use SQL for managing packages. So a
simple set of hashes will suffice for our needs. I agree with Julian that we
should have a backup of the packaging database in plain text format, and
a utility to rebuild it. This way we can always restore the database if
something goes wrong. Furthermore, that should not have a great impact
on performance, since we don't have to rebuild it every day.

>
> - A quick test confirms that the current bsdtar will happily ignore any
> extra data at the end of a tgz/tbz archive, so package metadata can be
> embedded there, thus conserving existing infrastructure and being fast
> to parse. I suggest encoding this metadata in a sane and easy to parse
> XML structure.
>
> I cannot currently actively participate in implementing proposed things,
> but I can give advice on sqlite, database and xml schemas if anyone
> wants to...
>

Why use XML for that? It's a format that is hard to parse and hard to
read, and I personally see no benefit in using it. If you're suggesting
XML, a simple bracket-structured format (like bind's config) will fit our
needs much better (easier to parse and read, with the same benefits as
XML). Also we might consider YAML, though I like this idea much less.

--
Stanislav Sedov
ST4096-RIPE




Re: DPS Initial Ideas

2007-05-12 Thread Michel Talon
On Fri, May 11, 2007 at 10:01:46PM -0400, Mike Meyer wrote:
> In <[EMAIL PROTECTED]>, Michel Talon <[EMAIL PROTECTED]> typed:
> > One of the most obvious being that the sqlite database can be edited
> > as easily as a pure textfile using the sqlite3 program
> 
> Huh? They can? With a pure textfile, if vi is busted, I can use ed. If
> ed is also busted, I can use sed. What do I use on an sqlite database
> if sqlite3 is busted?

Answering both you and Bill Moran:

- first, i don't suppose sqlite3 is busted, since i suppose it is in the
  base system and it works by definition. Your hypothesis is akin to
  asking: what do i do to edit my config files if vi and ed are busted?
  Moreover, if sqlite3 gets really busted i can import a copy and hope it
  works; it requires very few libraries and other files, not much more
  than vi, plus the sqlite3 library, of course. The combined size of
  sqlite3 and libsqlite3 is less than 400k.
- second, if i am sql allergic, it takes one command to export the table
  to a plain file, each row on a line, each field separated by | or
  anything else of my choice. Exactly the same tools that you have
  mentioned can be used to edit this file, and then one command loads
  it back into the database.
- so what are the benefits? They are that non sql impaired people can
  make good use of the power of sql queries to simplify their work, and
  this without reducing the possibilities of sql impaired people.
  Moreover, one can use general tools such as graphical sql tools to
  present the contents of the database to the end user in a pleasant way
  if desired. And finally, it may be that the transactional properties
  of sqlite can be used to gain better reliability.
- is the cost of including sqlite in the base system so high that
  the above benefits are insufficient? Personally i don't know, but i
  think some discussion is at least in order.
- and finally, to answer one of Bill's critiques: why sqlite rather than
  a Berkeley database? Precisely because sqlite offers a lot of
  facilities that Berkeley db doesn't offer, such as export and import
  to and from csv files and auto documentation of the table contents,
  whereas hand editing a Berkeley db in fact requires programming and
  knowledge of the database api.
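
The export / edit / reload round trip from the second point can be sketched as follows. This uses Python's sqlite3 module instead of the sqlite3 shell, and the `packages` table, its columns and the sample rows are assumptions made up for illustration, not a proposed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT, origin TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [("bash", "3.2.17", "shells/bash"), ("vim", "7.0.224", "editors/vim")],
)

def export_table(conn, sep="|"):
    """Dump the table as text: one row per line, fields separated by sep."""
    rows = conn.execute("SELECT name, version, origin FROM packages ORDER BY name")
    return "\n".join(sep.join(row) for row in rows) + "\n"

def import_table(conn, text, sep="|"):
    """Replace the table contents with the rows found in text."""
    rows = [line.split(sep) for line in text.splitlines() if line]
    with conn:  # single transaction: the table is replaced entirely or not at all
        conn.execute("DELETE FROM packages")
        conn.executemany("INSERT INTO packages VALUES (?, ?, ?)", rows)

dump = export_table(conn)
edited = dump.replace("3.2.17", "3.2.25")  # stands in for an edit with vi, ed or sed
import_table(conn, edited)
```

The intermediate `dump` is ordinary text, so any line-oriented tool can edit it before it is loaded back.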

Anyway, i have read that Marc Espie is envisioning using sqlite3 for the
OpenBSD package system, and that he is very satisfied with what he has
seen up to now. If this enters production, perhaps this will confer
BSD legitimacy on such practices ... Seriously, the FreeBSD package
system is in great need of a profound overhaul; pretending it works well
is complete denial of reality. I hope that young people working on
summer code projects will infuse *new* ideas, and not spend their
vacations polishing inadequate tools.


-- 

Michel TALON

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


[NEW PORT] ports-mgmt/pkg - smart tool for managing FreeBSD ports

2007-05-12 Thread Andy Kosela

Hi all,

I would like to present a new utility for dealing with the ports
system. The main goal of this project is to provide one common tool
for managing ports and packages instead of relying on many
applications (pkg_add, pkg_delete, pkg_info, pkg_version etc.).
It is actually a smart wrapper, written in /bin/sh, around the
previously mentioned applications. It also uses the external tool
portmaster, also written in /bin/sh, by Doug Barton, to work with ports
compiled from source. The pkg tool automates upgrading installed
packages, outputs valuable information about packages/ports and overall
simplifies working with the FreeBSD Ports Collection. Unlike
portupgrade, it uses no external databases; simplicity and minimalism
are its main goals.

You can test the latest version by installing the package from here:
http://home.si.rr.com/pyn/pf/pkg-1.1.tbz

I submitted pkg-1.0 to the ports tree with send-pr a few days ago. It
is awaiting approval...
http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/112572

Feel free to send any suggestions, new ideas and of course bug
reports...
Thank you,

Andy Kosela
Pythagoras Foundation


Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Roman Divacky
> cyclic dependencies in it!), but still, in itself, I think the choice of
> Ruby isn't performance-critical.

ruby2.0 will come with a virtual machine which should speed up things. ruby2.0
is expected "soon enough" (2008?)


Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Michel Talon
Ivan Voras wrote:

> In my (limited) experience, this sort of task should not depend much on
> the speed of the language. The most CPU-intensive task portupgrade does
> is resolving dependencies, and on a running system this is a DAG forest
> of about 500 nodes. I know portupgrade has some highly suboptimal code in
> it (if I understand the code correctly, there's a brute-force check for
> cyclic dependencies in it!), but still, in itself, I think the choice of
> Ruby isn't performance-critical.

If i remember well, portupgrade uses a reasonable algorithm to sort the
DAG in question. But from time to time, it runs
    make depends
or something of the sort in some ports directory, and this is the slow
step which kills any package manager whatsoever, be it written in super
fast C or super slow ruby. This is because as soon as you run "make" in
such a directory it has to read and interpret the 4000-line file
bsd.port.mk. This takes 1/10 s on my old laptop and is perhaps 5 times
faster on a powerful machine.

Add to that the natural slowness of ruby, the fact that portupgrade
constantly does queries in the Berkeley pkgdb.db, portsdb.db etc. while
it could cache the whole content in memory without any problem on modern
machines, and you have something which is slow as molasses.

Then you have more structural problems such as the "guessing facilities"
of portupgrade, which aim at guessing the new origin of a port
whose old origin has disappeared. To do that it counts similarities in
names, adds some snake oil and other satanic ingredients and comes out
with a guess. For some time i was impressed by the exactness of these
guesses, but recently i have seen incredibly hilarious results. As a
consequence i think that portupgrade is a completely inadequate tool
to maintain a FreeBSD machine in an automated way, in spite of the
remarkable insight which is coded in it, and that Debian-like systems
run circles around FreeBSD for ease of maintenance. Yes, the FreeBSD
system is very good for initial installation of software and the
flexibility it allows in doing that. But it sucks completely as soon as
one wants to upgrade stuff. The best way by far is to wipe out every port
and reinstall. This will take 1/3 of the time it takes to run
portupgrade -a, with a much better chance of success and coherency at
the end.

Note that these innocent-looking facts have their little consequences:
FreeBSD was in position 11 on Distrowatch in 2005, it is now in position
17 and falling like a brick.


-- 

Michel TALON



Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Philippe Laquet

Stanislav Sedov wrote:
> On Fri, 11 May 2007 02:10:05 +0200
> Ivan Voras <[EMAIL PROTECTED]> mentioned:
>
>> - I think it's time to give up on using BDB+directory tree full of text
>> files for storing the installed packages database, and I propose all of
>> this be replaced by a single SQLite database. SQLite is public domain
>> (can be slurped into base system), embeddable, stores all data in a
>> single file, lightweight, fast, and can be used to do fancy things such
>> as reporting.
>
> What is the reason to use an SQL-based database? Will you perform direct
> queries on the database? The packaging system is for ordinary users, not SQL
> geeks, so they should not have to use SQL for managing packages. So a
> simple set of hashes will suffice for our needs. I agree with Julian that we
> should have a backup of the packaging database in plain text format, and
> a utility to rebuild it. This way we can always restore the database if
> something goes wrong. Furthermore, that should not have a great impact
> on performance, since we don't have to rebuild it every day.

I agree with Stan ;)

"Fast and improved" package utilities mainly use some indexed Berkeley
DB combined with flat files, don't they? I, and maybe many other
FreeBSD users, use light systems for efficiency and easier management. If
we use some database system it will require disk space, resources for
the DB to run, dependencies and so on... And we may also be exposed to a
"that DB is better" war ;)

>> - A quick test confirms that the current bsdtar will happily ignore any
>> extra data at the end of a tgz/tbz archive, so package metadata can be
>> embedded there, thus conserving existing infrastructure and being fast
>> to parse. I suggest encoding this metadata in a sane and easy to parse
>> XML structure.
>>
>> I cannot currently actively participate in implementing proposed things,
>> but I can give advice on sqlite, database and xml schemas if anyone
>> wants to...
>
> Why use XML for that? It's a format that is hard to parse and hard to
> read, and I personally see no benefit in using it. If you're suggesting
> XML, a simple bracket-structured format (like bind's config) will fit our
> needs much better (easier to parse and read, with the same benefits as
> XML). Also we might consider YAML, though I like this idea much less.

XML could be an alternative for ordering packages; it can be parsed with
some limited dependencies like Perl. Could the userland tools to manage
packages be based on that language? It is well known by many
users, quite simple, and required by many other packages, so the whole
system won't be much heavier. Couldn't a Perl XML parser be a good choice?

* PERL-DB for managing packages databases
* PERL-XML for parsing categories, dependencies ...

Perl also gives, in most cases, good performance.

This is solely my humble opinion ;)

> --
> Stanislav Sedov
> ST4096-RIPE



Re: DPS Initial Ideas

2007-05-12 Thread Peter Jeremy
On 2007-May-12 11:09:35 +0200, Michel Talon <[EMAIL PROTECTED]> wrote:
>- first i don't suppose sqlite3 is busted, since i suppose it is in the
>  base system and it works by definition.

It can happen that base system utilities become unusable for various
reasons: maybe an installworld went wrong, maybe someone accidentally
deleted a shared library, maybe the disk developed an inconvenient
bad sector.  If sqlite3 is being used solely for ports management
then it may be a reasonable assumption that sqlite3 is always working
but if (as has been suggested) SQLite is used for base system config
files then it is essential that those files be repairable.

> Your hypothesis is akin to asking: what
>do i do to edit my config files if vi and ed are busted?

Use emacs :-)

Seriously, you can probably get away with sh builtins for most purposes:
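while IFS= read -r line    # keep leading whitespace and backslashes intact
do
  case "$line" in
    foo*) echo bar ;;        # replace any line starting with "foo"
    *) echo "$line" ;;       # pass every other line through unchanged
  esac
done < file > file.tmp
mv file.tmp file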

> Moreover if
>sqlite3 gets really busted i can import a copy and hope it works,

I agree that sqlite3 is good in this respect.  "Import" implies that
you are able to get the system to a point where it can communicate
externally without needing whatever tool is broken.

>- second, if i am sql allergic, it takes one command to export the table
>  to a straight file,

This is only usable if the schema is designed so that the tables are
reasonably independent.  It's certainly possible to design something
that could not be usefully exported table-by-table and edited.

>- is the cost of including sqlite in the base system so high that  
>the above benefits are insufficient? Personally i don't know, but i
>think some discussion is at least in order.

I agree that this topic is worth discussing.  There is a very high bar
for including new utilities in the base system because every time a
new utility is added, the maintenance effort goes up.  So far I
haven't seen anything that would make me say "SQLite should be imported".

>- and finally to answer one of Bill's critiques, why sqlite rather than 
>a Berkeley database? Precisely because sqlite offers a lot of facilities
>that Berkeley db doesn't offer, such as export and import to and from 
>csv files,

This is a function of sqlite3(1) rather than the SQLite database itself.
It wouldn't be that difficult to write a tool to convert a BDB into
a flat file of key,value pairs.

> auto documentation of the table contents,

This is a big plus for SQL.

> while it requires
>in fact programming and knowledge of the api of the database to hand
>edit the Berkeley db.

Very trivial effort - if we had a need for it, someone could write the
necessary few dozen lines and commit it.  The downside is that since
BDB isn't self documenting, a flat file may not be any use.

-- 
Peter Jeremy




Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Stanislav Sedov
On Sat, 12 May 2007 14:14:39 +0200
Philippe Laquet <[EMAIL PROTECTED]> mentioned:
> >
> XML could be an alternative for ordering packages; it can be parsed with
> some limited dependencies like Perl. Could the userland tools to manage
> packages be based on that language? It is well known by many
> users, quite simple, and required by many other packages, so the whole
> system won't be much heavier. Couldn't a Perl XML parser be a good choice?
>
> * PERL-DB for managing packages databases
> * PERL-XML for parsing categories, dependencies ...
>
> Perl also gives, in most cases, good performance.
>

I agree that there are a lot of ready-made tools for parsing XML, but why
not use a much simpler language that can be parsed by sed or awk in a few
lines? It would not require dependencies at all and would be much simpler
(read: better). The entire FreeBSD ideology is to be as simple as possible
but powerful.
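
As a rough illustration of how little code such a format needs, here is a sketch that parses a line-oriented "key: value" record. The field names and the format itself are hypothetical; nothing of the sort has been specified in this thread. Python is used only to keep the sketch self-contained.

```python
# Hypothetical package record in a line-oriented "key: value" format.
SAMPLE = """\
# comment lines and blank lines are ignored
name: bash
version: 3.2.17
origin: shells/bash

depends: gettext libiconv
"""

def parse_record(text):
    """Parse 'key: value' lines into a dict."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    return record

pkg = parse_record(SAMPLE)
depends = pkg["depends"].split()
```

With the same format, a single field can be pulled out from the shell with something like awk -F': ' '$1 == "version" {print $2}'.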

--
Stanislav Sedov
ST4096-RIPE




Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Ivan Voras
Stanislav Sedov wrote:
> On Fri, 11 May 2007 02:10:05 +0200
> Ivan Voras <[EMAIL PROTECTED]> mentioned:
> 
>> - I think it's time to give up on using BDB+directory tree full of text
>> files for storing the installed packages database, and I propose all of
>> this be replaced by a single SQLite database. SQLite is public domain
>> (can be slurped into base system), embeddable, stores all data in a
>> single file, lightweight, fast, and can be used to do fancy things such
>> as reporting.
> 
> What is the reason to use an SQL-based database? Will you perform direct
> queries on the database? The packaging system is for ordinary users, not SQL
> geeks, so they should not have to use SQL for managing packages. So a

It's not SQL I'm interested in, it's the "additional" features:

- performance
- transaction safety ("commit all changes or none")
- constraints (like "unique" keys - sqlite unfortunately doesn't support
foreign keys)
- concurrent access (allowing multiple portupgrades to run at the same time)
- easy interface to C programs

If a BDB variety or some other storage layer can achieve these things,
I'll likely support them.
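
A minimal sketch of the transaction-safety and constraint points, using Python's sqlite3 module. The two-table schema, the `register` helper and the package names are assumptions made up for illustration. The point shown is that a failed step undoes everything done earlier in the same transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT NOT NULL)")
conn.execute("CREATE TABLE files (path TEXT UNIQUE NOT NULL, package TEXT NOT NULL)")
conn.commit()

def register(conn, name, version, paths):
    """Record a package and its files atomically; returns True on success."""
    try:
        with conn:  # commits on success, rolls back if any statement fails
            conn.execute("INSERT INTO packages VALUES (?, ?)", (name, version))
            conn.executemany(
                "INSERT INTO files VALUES (?, ?)", [(p, name) for p in paths]
            )
    except sqlite3.IntegrityError:
        return False
    return True

ok_first = register(conn, "bash", "3.2.17", ["/usr/local/bin/bash"])
# The second package claims a path already owned by bash: the UNIQUE
# constraint fires and the whole transaction, including the row already
# inserted into packages, is rolled back.
ok_second = register(
    conn, "fakesh", "1.0", ["/usr/local/bin/fakesh", "/usr/local/bin/bash"]
)
names = [row[0] for row in conn.execute("SELECT name FROM packages")]
```

After the second call the database contains only the bash rows, so a half-registered package never becomes visible.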

I know "Sleepycat" BDB implementations boast "transaction processing",
but can they offer this across multiple stores / databases at the same
time (i.e. like one transaction includes updates to multiple tables)?
Efficient (performance-wise) storage would probably need to use more
than one store, at least to index data by different keys.






Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Ivan Voras
Stanislav Sedov wrote:

> I agree that there are a lot of ready-made tools for parsing XML, but why
> not use a much simpler language that can be parsed by sed or awk in a few
> lines?

Because of mindshare. Young people know SQL and XML, but not grep.





FBSD on HP Pavillion dv6000 Family

2007-05-12 Thread Maslan

Hi all,

I want to install FreeBSD 6.2 on my new laptop instead of Windows Vista,
but after some googling I found that almost nothing will work.
Are there any resources/links for drivers, even if still untested? I would
like to help.

Thanks


--
I'm Searching For Perfection,
So Even If U Need Portability U've To Use Assembly ;-)
http://libosdk.berlios.de


Re: Experiences with 7.0-CURRENT and vmware.

2007-05-12 Thread Bert JW Regeer


On May 10, 2007, at 5:54 AM, Darren Reed wrote:


> [...]
>
>> But if if_em is probing, it suggests a VMware
>> change rather than a FreeBSD change, which you may be able to revert by
>> telling it to expose a Lance-style device as opposed to an Intel device.
>
> There's no way to choose the type of card vmware emulates.

I always set my VMware to expose an Intel e1000 card, which gets
probed correctly by almost all systems, even Windows with the Intel
drivers installed. In your .vmx file you should find a line like this:

ethernet0.virtualDev="e1000"

If VMware is presenting an em device, removing that line should make it
default back to the lnc driver.

That being said, I have had no performance problems with the em
driver on FreeBSD 6.0 in VMware, and have not had the timeout
problems you mentioned.

> Darren


Bert JW Regeer



Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Bert JW Regeer


On May 12, 2007, at 5:14 AM, Philippe Laquet wrote:


> Stanislav Sedov wrote:
>> On Fri, 11 May 2007 02:10:05 +0200
>> Ivan Voras <[EMAIL PROTECTED]> mentioned:
>>
>>> - I think it's time to give up on using BDB+directory tree full of text
>>> files for storing the installed packages database, and I propose all of
>>> this be replaced by a single SQLite database. SQLite is public domain
>>> (can be slurped into base system), embeddable, stores all data in a
>>> single file, lightweight, fast, and can be used to do fancy things such
>>> as reporting.
>>
>> What is the reason to use an SQL-based database? Will you perform direct
>> queries on the database? The packaging system is for ordinary users, not SQL
>> geeks, so they should not have to use SQL for managing packages. So a
>> simple set of hashes will suffice for our needs. I agree with Julian that we
>> should have a backup of the packaging database in plain text format, and
>> a utility to rebuild it. This way we can always restore the database if
>> something goes wrong. Furthermore, that should not have a great impact
>> on performance, since we don't have to rebuild it every day.
>
> I agree with Stan ;)
>
> "Fast and improved" package utilities mainly use some indexed Berkeley
> DB combined with flat files, don't they? I, and maybe many other
> FreeBSD users, use light systems for efficiency and easier management. If
> we use some database system it will require disk space, resources for
> the DB to run, dependencies and so on... And we may also be exposed to a
> "that DB is better" war ;)




SQLite is compiled into the program, and as such does not require any
resources other than one file handle and some CPU time when querying.
The file is stored on disk, and no separate process needs to be
running to query it. Maybe I misunderstood what you were trying to say.
SQLite will require fewer resources than flat text files, since with
SQLite you open one file once and then process it, instead of what is
currently happening: having to open and close hundreds of files depending
on how many ports are installed. In this regard, SQLite is like BDB,
except that SQLite uses standards-compliant SQL statements to get data.
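
The difference in access pattern can be sketched like this. It is only an illustration of "many small files" versus "one file, one query": the directory layout with a `VERSION` file per package is hypothetical (it is not the real /var/db/pkg layout), as is the `packages` table, and no timing claim is made.

```python
import os
import sqlite3
import tempfile

NAMES = [f"pkg{i}" for i in range(200)]  # hypothetical installed packages

def versions_from_tree(root):
    """One open/read/close per package, as with a directory of text files."""
    result = {}
    for name in os.listdir(root):
        with open(os.path.join(root, name, "VERSION")) as fh:
            result[name] = fh.read().strip()
    return result

def versions_from_db(path):
    """One database file opened once, one query."""
    conn = sqlite3.connect(path)
    try:
        return dict(conn.execute("SELECT name, version FROM packages"))
    finally:
        conn.close()

with tempfile.TemporaryDirectory() as tmp:
    # Build the same data twice: as a tree of small files and as one database.
    tree = os.path.join(tmp, "tree")
    for name in NAMES:
        os.makedirs(os.path.join(tree, name))
        with open(os.path.join(tree, name, "VERSION"), "w") as fh:
            fh.write("1.0\n")
    db_path = os.path.join(tmp, "pkg.db")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
    conn.executemany("INSERT INTO packages VALUES (?, ?)", [(n, "1.0") for n in NAMES])
    conn.commit()
    conn.close()

    from_tree = versions_from_tree(tree)   # 200 files opened
    from_db = versions_from_db(db_path)    # 1 file opened
```

Both functions return the same mapping; only the number of files touched differs.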





>> --
>> Stanislav Sedov
>> ST4096-RIPE





I understand many of the gripes about using a database and
having to import yet another code base into the FreeBSD base;
however, as one of the young ones, knowing sed/awk/grep and SQL, I
prefer SQL over having to process hundreds of text files using text
processing tools. It saddens me each time I run one of the pkg_*
tools that needs to parse the flat file structure, since it takes so
long. I have friends running Ubuntu and their apt-get returns results
much faster.

In a world where hard drives are becoming more reliable, and
automatically relocate sectors that go bad, do we really have to
worry about database corruption as much? I feel that many of the
failures people fear would harm a text-based "storage" system as
well. If one block drops out, tools may not be able to parse the
files. Create a backup copy of the database after each successful
transaction? There are ways to battle data corruption.

Using BDB is not a real option either. I cannot even count the
number of times that the BDB database that portupgrade created has
become corrupt because I accidentally ran two portupgrades at the same
time, or remembered that I did not want to upgrade something and
hit Ctrl+C. Given the experience I got from running SVN with BDB as the
back-end database to store my data, I say no thanks. In that case I
would much rather stick with the flat text files than go with a
database.


Bert JW Regeer


Re: DPS Initial Ideas

2007-05-12 Thread Kris Kennaway
On Sat, May 12, 2007 at 11:09:35AM +0200, Michel Talon wrote:

> Seriously, the FreeBSD package
> system is in great need of a profound overhaul, pretending it works well
> is complete denial of reality. I hope that young people working on 
> summer code projects will infuse *new* ideas, and not spend their
> vacations polishing inadequate tools.

I know that this is your belief, but please try to avoid grasping at
straws: there are elements in your argument that are along the lines
of "The FreeBSD package system is broken and needs to be fundamentally
changed.  Rewriting it to use SQLite is a fundamental change.
Therefore rewriting it to use SQLite will fix the problems."

First figure out what specific problems need to be solved, then figure
out how to solve them, not the other way around.  So far I have seen
little discussion of how SQLite is necessary and sufficient for fixing
fundamental issues.  The argument in favour of SQL seems to boil down
to "It's SQL!  You can do more complex queries...if you wanted to".

Without a clear demonstration of how this would solve a problem
associated with package management, it is not very compelling and
basically reduces to change for the sake of change.

As I discussed in my email yesterday, there are serious issues to be
solved.  Some of them can be solved by improving the storage backend
of the package database to use a database; but this is in progress
using existing tools.

Given that this work is happening (or at least will be happening, I am
not sure when the SoC officially starts), the best thing is for
interested people to work with Garrett to help him achieve the goals
of his project.

Kris




Re: DPS Initial Ideas

2007-05-12 Thread Ivan Voras
David Naylor wrote:

> 
> I am looking at a hybrid approach to storing the package metadata, a 
> combination of SQLite and compressed text files.  I am hoping to create a 
> situation where if either gets corrupted it can be created from the other.  

... throwing away transaction safety, as it means updating 2 completely
unrelated (and unrelatable) data stores.

(I'm not against your work, I'm just pointing out an area where you need
to be extra careful - some kind of 2pc protocol to update both sides may
be required).
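
To make the concern concrete, here is a minimal Python sketch of a prepare-then-commit update across the two stores. Everything in it is hypothetical and only for illustration: the `packages` table, the one-line-per-package text file, and the function name `record_install`. It is a simplification, not a full 2PC protocol.

```python
import os
import sqlite3
import tempfile


def record_install(db_path, text_path, name, version):
    """Record one package in both an SQLite table and a flat text file.

    Hypothetical layout: table packages(name, version) and a text file
    holding one "name version" line per package.
    """
    # Prepare the text side: stage the new file contents next to the
    # real file without touching the real file yet.
    old = ""
    if os.path.exists(text_path):
        with open(text_path) as f:
            old = f.read()
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(text_path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(old + f"{name} {version}\n")
        f.flush()
        os.fsync(f.fileno())

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS packages ("
            "name TEXT PRIMARY KEY, version TEXT NOT NULL)"
        )
        # Prepare the database side: the INSERT runs inside a
        # transaction that is not visible to others until commit().
        conn.execute("INSERT INTO packages VALUES (?, ?)", (name, version))
        # Commit both sides: database first, then an atomic rename.
        conn.commit()
        os.replace(tmp_path, text_path)
    except Exception:
        # Undo whatever was staged; rollback is a no-op after commit.
        conn.rollback()
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    finally:
        conn.close()
```

Note the remaining gap: a crash between `commit()` and `os.replace()` leaves the database ahead of the text file, so a real design would still need a recovery pass at startup that reconciles the two.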





Re: DPS Initial Ideas

2007-05-12 Thread Ivan Voras
Kris Kennaway wrote:

> First figure out what specific problems need to be solved, then figure
> out how to solve them, not the other way around.  So far I have seen
> little discussion of how SQLite is necessary and sufficient for fixing
> fundamental issues.  The argument in favour of SQL seems to boil down
> to "It's SQL!  You can do more complex queries...if you wanted to".

I've posted some general ideas (resulting from my experience using the
package / port system, not developing for it):

1. speed and simplicity of querying (single query vs traversing a tree
of text files)
2. formal data constraints (UNIQUE, CHECK)
3. transaction safety (a consequence of which is the ability to run
concurrent installs / updates)
4. easy interface for 3rd-party tools

I admit again that I didn't develop anything with the package / ports
subsystems, so there might be other, bigger problems not solvable by
sqlite, but I believe the features above could at least solve
performance problems.
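
As an illustration of points 1 to 3, here is a minimal sketch using Python's standard `sqlite3` module. The schema (tables `packages` and `depends`) and the helper names `install` and `required_by` are assumptions made up for this example, not a proposed layout for the real package database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: UNIQUE and CHECK express the formal constraints.
conn.executescript("""
CREATE TABLE packages (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE,
    version TEXT NOT NULL CHECK (length(version) > 0)
);
CREATE TABLE depends (
    pkg_id INTEGER NOT NULL REFERENCES packages(id),
    dep_id INTEGER NOT NULL REFERENCES packages(id),
    UNIQUE (pkg_id, dep_id),
    CHECK (pkg_id <> dep_id)
);
""")


def install(conn, name, version, deps=()):
    """Register a package and its dependency edges in one transaction."""
    with conn:  # commits on success, rolls back on any exception
        cur = conn.execute(
            "INSERT INTO packages (name, version) VALUES (?, ?)",
            (name, version),
        )
        pkg_id = cur.lastrowid
        for dep in deps:
            row = conn.execute(
                "SELECT id FROM packages WHERE name = ?", (dep,)
            ).fetchone()
            if row is None:
                raise LookupError(f"missing dependency: {dep}")
            conn.execute(
                "INSERT INTO depends (pkg_id, dep_id) VALUES (?, ?)",
                (pkg_id, row[0]),
            )


def required_by(conn, name):
    """One query instead of a walk over per-package text files."""
    rows = conn.execute(
        "SELECT p.name FROM depends d "
        "JOIN packages p ON p.id = d.pkg_id "
        "JOIN packages q ON q.id = d.dep_id "
        "WHERE q.name = ? ORDER BY p.name",
        (name,),
    )
    return [r[0] for r in rows]


install(conn, "libiconv", "1.9.2")
install(conn, "gettext", "0.16.1", deps=["libiconv"])
print(required_by(conn, "libiconv"))
```

If an install fails halfway (a duplicate name, or a dependency that is not registered), the `with conn:` block rolls the whole registration back, so no half-written entry is left behind.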

(I also agree there's no point in changing the ports infrastructure
itself, just the package tracking database in base system).





Re: DPS Initial Ideas

2007-05-12 Thread Kris Kennaway
On Sat, May 12, 2007 at 11:25:58PM +0200, Ivan Voras wrote:
> Kris Kennaway wrote:
> 
> > First figure out what specific problems need to be solved, then figure
> > out how to solve them, not the other way around.  So far I have seen
> > little discussion of how SQLite is necessary and sufficient for fixing
> > fundamental issues.  The argument in favour of SQL seems to boil down
> > to "It's SQL!  You can do more complex queries...if you wanted to".
> 
> I've posted some general ideas (resulting from my experience using the
> package / port system, not developing for it):
> 
> 1. speed and simplicity of querying (single query vs traversing a tree
> of text files)
> 2. formal data constraints (UNIQUE, CHECK)
> 3. transaction safety (a consequence of which is the ability to run
> concurrent installs / updates)
> 4. easy interface for 3rd-party tools
> 
> I admit again that I didn't develop anything with the package / ports
> subsystems, so there might be other, bigger problems not solvable by
> sqlite, but I believe the features above could at least solve
> performance problems.

That is the "sufficient" part but not the "necessary part".  1) and 3)
are solvable using existing tools.  2 and 4 not so much, but you
haven't described what problems they solve.

Homework for SQLite advocates: write a 1-page essay on the following
topic.

"Rewriting the package tools to use SQLite will solve problem(s) 
that exist in the current system.  Compare and contrast to other
possible solutions including Berkeley DB."

> (I also agree there's no point in changing the ports infrastructure
> itself, just the package tracking database in base system).

Well, I was talking about both.

Kris





Re: DPS Initial Ideas

2007-05-12 Thread Matthew Jacob


> Seriously, the FreeBSD package system is in great need of a profound 
> overhaul, pretending it works well is complete denial of reality. I 
> hope that young people working on summer code projects will infuse 
> *new* ideas, and not spend their vacations polishing inadequate tools.


Hmm? Works fine for me and many others who are more than casual users.

I think Kris has it right when he asks you to state what problems need 
to be addressed and *then* how you would address them, rather than 
picking a way to address problems first.


respectfully

-matt

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: DPS Initial Ideas

2007-05-12 Thread Michel Talon
On Sat, May 12, 2007 at 03:33:02PM -0400, Kris Kennaway wrote:
> On Sat, May 12, 2007 at 11:09:35AM +0200, Michel Talon wrote:
> 
> > Seriously, the FreeBSD package
> > system is in great need of a profound overhaul, pretending it works well
> > is complete denial of reality. I hope that young people working on 
> > summer code projects will infuse *new* ideas, and not spend their
> > vacations polishing inadequate tools.
> 
> I know that this is your belief, but please try to avoid grasping at
> straws: there are elements in your argument that are along the lines
> of "The FreeBSD package system is broken and needs to be fundamentally
> changed.  Rewriting it to use SQLite is a fundamental change.
> Therefore rewriting it to use SQLite will fix the problems."
> 

Really, I don't think this way at all. I think that *perhaps* SQLite
may be marginally better than a Berkeley database for solving part of the
problem, not much more. What I reacted to was the conservatism which 
pervades the community as soon as someone floats the idea of using a new tool. 


> First figure out what specific problems need to be solved, then figure
> out how to solve them, not the other way around.  So far I have seen
> little discussion of how SQLite is necessary and sufficient for fixing
> fundamental issues.  The argument in favour of SQL seems to boil down
> to "It's SQL!  You can do more complex queries...if you wanted to".

No, for me the main argument is that SQL is more familiar to many people than 
running a perl script to connect to a Berkeley database. I have also heard
that SQLite is more performant, but I would have to see it to believe it.

> 
> Without a clear demonstration of how this would solve a problem
> associated with package management, it is not very compelling and
> basically reduces to change for the sake of change.

I think that a lot of changes are necessary, and it seems they will happen. So 
*perhaps* it may be beneficial in this sea of changes to consider a minor 
change, moving from a more traditional Berkeley database to SQLite.

> 
> As I discussed in my email yesterday, there are serious issues to be
> solved.  

I think some of the issues have nothing to do with the database question.
Some of the issues are entirely trivial to solve. One of the worst offenders
for misbehaviour of the package system is the constant changes in the port
origins and the poor standardisation of the package names. Once it is
clear that these name changes bring nothing to the table but 
introduce a lot of confusion both for end users and automated programs,
things will be easier.

It may be that borrowing from Debian the idea of "abstract" dependencies
which can be fulfilled by several concrete packages may also simplify
the dependency problem. For example tomcat may depend on "java" and java
may be fulfilled either by diablo-jdk15 or jdk15. This way when you change
from diablo-jdk15 to jdk15 you don't need to change anything in tomcat.
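
A minimal Python sketch of the idea, using the tomcat and java example from above. The `PROVIDES` and `DEPENDS` tables and both function names are hypothetical; this is not how Debian or the ports tree actually stores the information.

```python
# Hypothetical metadata: each concrete package lists the abstract
# names it provides; dependencies may name an abstract package.
PROVIDES = {
    "diablo-jdk15": {"java"},
    "jdk15": {"java"},
}
DEPENDS = {"tomcat": ["java"]}


def providers(abstract, installed):
    """Return the installed packages that satisfy a dependency name."""
    return sorted(
        pkg for pkg in installed
        if abstract == pkg or abstract in PROVIDES.get(pkg, ())
    )


def unmet(package, installed):
    """List dependencies of `package` that nothing installed satisfies."""
    return [d for d in DEPENDS.get(package, []) if not providers(d, installed)]
```

With this shape, `unmet("tomcat", {"diablo-jdk15"})` and `unmet("tomcat", {"jdk15"})` are both empty, so switching JDKs requires no change to the tomcat entry.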

Another feature that Debian has, and which may happily complete the previous
one, is the specification of necessary dependencies with a version number
in a certain range (this obviously requires a reasonable standardisation of
version numbers, so that comparison of -0.99 to 
-1.0-rc doesn't depend on arcane rules). This way you don't need
to change dependencies which are in the correct range, even if a more recent
version exists. This mechanism has been imported in NetBSD pkgsrc.
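
A sketch of such a range check in Python, under a strong assumption: versions are purely numeric and dot-separated. Suffixes such as "-rc" are rejected rather than guessed at, which is exactly the standardisation question raised above. The function names are made up for this example.

```python
import re


def parse_version(text):
    """Turn a dotted numeric version such as "1.10.2" into a tuple of ints.

    Assumption: only digits and dots are allowed. Trailing ".0" parts
    are dropped so that "1.0" and "1.0.0" compare equal.
    """
    if not re.fullmatch(r"\d+(\.\d+)*", text):
        raise ValueError(f"unsupported version format: {text!r}")
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def in_range(version, minimum=None, maximum=None):
    """True if minimum <= version < maximum; either bound may be omitted."""
    v = parse_version(version)
    if minimum is not None and v < parse_version(minimum):
        return False
    if maximum is not None and v >= parse_version(maximum):
        return False
    return True
```

A dependency declared as "at least 0.9, below 1.0" stays satisfied by 0.99 even after 1.0 is released, so the dependent package needs no change.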

Another feature which has proven useful in Debian is keeping track of the
packages which have been requested by the end user and those which have been
installed as dependencies. This is the difference between apt-get and
aptitude. Apparently people are very happy to be able to remove not only
a package they have requested, but also all its dependencies (which are
not required by another program) at one stroke. This also helps in case
some big package requires dependency A, but after upgrade, they have changed
their mind and require alternative dependency B. With this mechanism, after
upgrade A disappears, while without it you will have both an upgraded version
of A and B. I have observed on my machine that this is an important cause 
of steadily growing bloat of the package tree.
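
The bookkeeping this needs is small. Here is a Python sketch, assuming a hypothetical in-memory dependency map and a set of explicitly requested packages; the function name `orphans` is invented for the example.

```python
def orphans(depends, explicit, removing=()):
    """Return the packages that nothing explicitly requested still needs.

    depends:  dict mapping every installed package to its direct dependencies
    explicit: set of packages the user asked for by name
    removing: explicitly requested packages the user now wants gone
    """
    wanted = set(explicit) - set(removing)
    keep = set()
    stack = list(wanted)
    # Everything reachable from a still-wanted package must stay.
    while stack:
        pkg = stack.pop()
        if pkg in keep:
            continue
        keep.add(pkg)
        stack.extend(depends.get(pkg, ()))
    return set(depends) - keep
```

Called with `removing={"kde"}` it returns kde plus every dependency that only kde pulled in. Called with no `removing` after an upgrade that switched a package from dependency A to B, it returns A.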

To answer the slowness problem in registering installed packages, one may
think about making use of the INDEX file. In fact all the information that
is necessary to fill the dependency entries is contained in INDEX, and
accessible here in milliseconds with any tool such as awk. It so happens that
the ports system doesn't make any use of the INDEX file and systematically
recomputes the dependencies through recursive make invocations which are very
time consuming. Of course this requires up to date INDEX, or a mechanism to
keep INDEX continually up to date.
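
A rough Python equivalent of the awk approach follows. The column positions are an assumption: the sketch takes the package name from the first '|'-separated field and the run-time dependency list from the ninth, and both are parameters so they can be adjusted to the real file layout.

```python
def load_run_depends(lines, name_field=0, rdeps_field=8):
    """Map each package name to its list of run-time dependencies.

    Assumption: `lines` holds '|'-separated INDEX records, with the
    package name in column `name_field` and a space-separated list of
    dependencies in column `rdeps_field` (both zero-based).
    """
    table = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= max(name_field, rdeps_field):
            continue  # skip short or malformed records
        table[fields[name_field]] = fields[rdeps_field].split()
    return table
```

Passing an open file object for the INDEX file builds the whole table in a single pass, after which registering a package is a dictionary lookup rather than a recursive make invocation.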


Part of the registration is also filling the +REQUIRED_BY files of the
dependencies of a package when one installs a package.  If this package has a
lot of dependencies this means opening

Re: DPS Initial Ideas

2007-05-12 Thread Kris Kennaway
On Sat, May 12, 2007 at 11:44:22PM +0200, Michel Talon wrote:
> On Sat, May 12, 2007 at 03:33:02PM -0400, Kris Kennaway wrote:
> > On Sat, May 12, 2007 at 11:09:35AM +0200, Michel Talon wrote:
> > 
> > > Seriously, the FreeBSD package
> > > system is in great need of a profound overhaul, pretending it works well
> > > is complete denial of reality. I hope that young people working on 
> > > summer code projects will infuse *new* ideas, and not spend their
> > > vacations polishing inadequate tools.
> > 
> > I know that this is your belief, but please try to avoid grasping at
> > straws: there are elements in your argument that are along the lines
> > of "The FreeBSD package system is broken and needs to be fundamentally
> > changed.  Rewriting it to use SQLite is a fundamental change.
> > Therefore rewriting it to use SQLite will fix the problems."
> > 
> 
> Really, I don't think this way at all. I think that *perhaps* SQLite
> may be marginally better than a Berkeley database for solving part of the
> problem, not much more. What I reacted to was the conservatism which 
> pervades the community as soon as someone floats the idea of using a new tool. 

It seems to me that you do not appreciate the reasons behind this
conservatism.  A very important one is that we have two students who
have committed to spending their summer working on improving the
existing pkg_tools in ways that will solve some of the real problems
we are facing, and the project we have agreed upon is that they will
be using existing tools rather than rewriting from scratch as part of
a not-yet-defined larger project.

To some extent it is the timing here that is most unfortunate.  If
SQLite had been raised as a viable alternative a few months ago it
would have made a great project idea, but instead we have committed to
improving our existing tools, and the barrier for throwing out these
plans is therefore very high.  The burden of proof is therefore set
much higher than "SQL is awesome and buzzword-compliant and might be
better".

> It may be that borrowing from Debian the idea of "abstract" dependencies
> which can be fulfilled by several concrete packages may also simplify
> the dependency problem. For example tomcat may depend on "java" and java
> may be fulfilled either by diablo-jdk15 or jdk15. This way when you change
> from diablo-jdk15 to jdk15 you don't need to change anything in tomcat.
> 
> Another feature that Debian has, and which may happily complete the previous
> one, is the specification of necessary dependencies with a version number
> in a certain range (this obviously requires a reasonable standardisation of
> version numbers, so that comparison of -0.99 to 
> -1.0-rc doesn't depend on arcane rules). This way you don't need
> to change dependencies which are in the correct range, even if a more recent
> version exists. This mechanism has been imported in NetBSD pkgsrc.

We actually have both of these features in ports, but they have not
been pushed down into packages.  I think it will be relatively simple
to do so, without requiring a rewrite from scratch.

> Another feature which has proven useful in Debian is keeping track of the
> packages which have been requested by the end user and those which have been
> installed as dependencies. This is the difference between apt-get and
> aptitude. Apparently people are very happy to be able to remove not only
> a package they have requested, but also all its dependencies (which are
> not required by another program) at one stroke. This also helps in case
> some big package requires dependency A, but after upgrade, they have changed
> their mind and require alternative dependency B. With this mechanism, after
> upgrade A disappears, while without it you will have both an upgraded version
> of A and B. I have observed on my machine that this is an important cause 
> of steadily growing bloat of the package tree.

This one could also be added to the existing tools.

> To answer the slowness problem in registering installed packages, one may
> think about making use of the INDEX file. In fact all the information that
> is necessary to fill the dependency entries is contained in INDEX, and
> accessible here in milliseconds with any tool such as awk. It so happens that
> the ports system doesn't make any use of the INDEX file and systematically
> recomputes the dependencies through recursive make invocations which are very
> time consuming. Of course this requires up to date INDEX, or a mechanism to
> keep INDEX continually up to date.

The problem is that maintaining the INDEX is expensive and/or tricky.
p5-FreeBSD-Portindex comes close but seems to have some wrinkles.

> Part of the registration is also filling the +REQUIRED_BY files of the
> dependencies of a package when one installs a package.  If this package has a
> lot of dependencies this means opening, editing and closing a large number of
> files. This is expensive. One may imagine using a database containing the
> global dependency i

Re: DPS Initial Ideas

2007-05-12 Thread Ivan Voras
Kris Kennaway wrote:

> It seems to me that you do not appreciate the reasons behind this
> conservatism.  A very important one is that we have two students who
> have committed to spending their summer working on improving the
> existing pkg_tools in ways that will solve some of the real problems
> we are facing, and the project we have agreed upon is that they will
> be using existing tools rather than rewriting from scratch as part of
> a not-yet-defined larger project.

So change their project :)

I'm only half-serious but SoC hasn't officially started yet and Google's
ok with projects' goals being modified. Of course, the students should
decide.





SoC

2007-05-12 Thread Duane Whitty
Garrett,

Sounds like you're involved in a cool project.  What kind of
community collaboration/involvement would be helpful to you?

Once, a long, long time ago, I wrote quite a bit of bdb 1.85
code.  At that time it WAS the current version :)  I might
actually remember a bit if I start working with it again.
But what would be most useful to you?

And if I may ask about a design decision: Why did you choose
a hash structure?  Perhaps if you have time you could give
a little more info but whatever fits your schedule.

Good luck on your project.

Duane


Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))

2007-05-12 Thread Duane Whitty
On Thursday, 10 May 2007 at 20:20:42 -0700, Garrett Cooper wrote:
> David Naylor wrote:
> >Dear Jordan
> >
> >Recently I stumbled across a document you wrote in 2001, entitled "FreeBSD 
> >installation and package tools, past, present and future".  I find FreeBSD 
> >appealing and I would like to contribute to its success, and as your 
> >article describes, the installation and packaging system is lacking.  
> >Since the installation system is being tackled under a SoC project I am 
> >hoping to give the packaging system a go.  
> >
> >I was hoping you could help me with an update about the situation with 
> >pkg.  I have searched the FreeBSD mailing lists and have found little 
> >information on the package system.  Once I have a (much more) complete 
> >understanding of the packaging system (and providing there is work to be 
> >done) I would like to write up a proposal to solve the problems, and 
> >perhaps provide some innovative new capabilities.  
> >
> >After that I will gladly contribute what I can to this (possible) project 
> >and hopefully further and improve FreeBSD.  Any assistance or information 
> >you can give will be greatly appreciated.  
> >
> >I look forward to your reply.  
> >
> >David
> 
> Yipes. The name of the game is to get something working in the base 
> system, instead of dragging in multiple 3rd party packages, with 
> licensing schemes that may not be aligned with the BSD license.
> 
> SQL's great, SQL's wonderful for db use, but the problem is that 
> supporting it from my POV would cause a lot more grief and waiting than 
> having me wait a few months to get a BDB compatible scheme out the door.
> 

I'm a little out of practice; however, perhaps the routines that manipulate
the ports meta-data could be made sufficiently agnostic about how the data
is stored to facilitate experimentation with different back-ends at a
later time.  Just a thought and perhaps I'm way off.
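
One way to picture this is a small interface that the tools code against, with interchangeable back-ends behind it. The Python sketch below is purely illustrative: the class names, the two methods, and the `packages` table are all invented for the example, and the dictionary class merely stands in for a key/value store such as a hash database.

```python
from abc import ABC, abstractmethod
import sqlite3


class PackageStore(ABC):
    """Hypothetical minimal interface the package tools would use."""

    @abstractmethod
    def add(self, name, version):
        """Record (or update) an installed package."""

    @abstractmethod
    def version_of(self, name):
        """Return the installed version, or None if not installed."""


class DictStore(PackageStore):
    """In-memory stand-in for a key/value back-end."""

    def __init__(self):
        self._data = {}

    def add(self, name, version):
        self._data[name] = version

    def version_of(self, name):
        return self._data.get(name)


class SQLiteStore(PackageStore):
    """The same interface on top of an SQLite table."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS packages ("
            "name TEXT PRIMARY KEY, version TEXT NOT NULL)"
        )

    def add(self, name, version):
        with self._conn:  # one transaction per update
            self._conn.execute(
                "INSERT OR REPLACE INTO packages VALUES (?, ?)",
                (name, version),
            )

    def version_of(self, name):
        row = self._conn.execute(
            "SELECT version FROM packages WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None
```

Code written against `PackageStore` does not change when the back-end is swapped, which is what would make later experiments cheap.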

Duane

> If only Oracle didn't make BDB 3.x non-BSD license friendly though.. 
> that would be nice..
> 
> -Garrett