Creating a database suitable for LDAP implementations might be of more general use. This sounds like a very good fit for db.apache.org (even if the entire project is not).

- robert

On Friday, September 12, 2003, at 03:07 AM, Alex Karasulu wrote:

Wow, you guys are getting real deep. Perhaps we should take these technical
conversations over to the ldapd dev list for now. This place is for
incubator stuff, not tech stuff, so we'll continue at ldapd-devel after this
bridge email.


BTW, I think BerkeleyDB was faster than the JDBC-based implementation, but
not jump-out-at-you faster like we had expected. The backend design using
BDB is pretty much the same as with JDBM. The difference was the JNI
performance degradation - yes, all that copying from crossing the Java/C
barrier back and forth slowed us down. BDB is great for C, but a pure Java
implementation like JDBM is best.


Once we tried JDBM we were very happy with the results. Performance went
through the roof. And BTW, an RDBMS is a hog as an LDAP server backing
store. All the SQL overhead makes it so. I have to agree with the OpenLDAP
folks - they know what they're talking about.


Alex

-----Original Message-----
From: Robb Penoyer [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 11, 2003 8:57 PM
To: [EMAIL PROTECTED]
Subject: Re: Official Apache Directory Project Proposal Submission


Hi Jim,


The original pre-release versions of LDAPd were implemented with a
BerkeleyDB backend, with custom index management etc., much like the
OpenLDAP article you reference. Those early designs did have a contracted
backend store interface defined (thank you, Mr magic - Alex), and indeed
there is a basic SQL backend implementation in alpha 0.7. Taking into
perspective the second reference you provided, to IBM: we measured the
performance of a very strongly tuned Oracle database against the BerkeleyDB
implementation and found virtually no performance difference. Admittedly,
these were not formal tests, but the conditions were exactly the same
(hardware, test cases, etc.).


We moved to a default backend based upon the JDBM implementation Alex
referenced earlier. The performance improvement was staggering, to say the
least. LDAP is by nature geared toward high performance read operations, so
a pure indexing mechanism turned out to be ideal. With JDBM everything
amounts to an index in relational terms. This would absolutely become a
problem for a standard transactional application, primarily because the
biggest struggle for our JDBM implementation is how to handle duplicate
entries - it costs us performance.
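
To make the duplicate-entry problem concrete, here is a minimal sketch in
Python (not LDAPd's Java code, and not the JDBM API). It assumes one common
way of allowing duplicates in a sorted index: store (value, entry id) pairs
so that every key is unique. The class and method names are hypothetical.

```python
import bisect


class DuplicateKeyIndex:
    """Sorted index from attribute value to entry ids, duplicates allowed.

    Hypothetical illustration only: duplicates are handled by sorting
    (value, entry_id) pairs, so each stored key is unique.
    """

    def __init__(self):
        self._pairs = []  # kept sorted at all times

    def add(self, value, entry_id):
        pair = (value, entry_id)
        pos = bisect.bisect_left(self._pairs, pair)
        # Skip the insert if this exact pair is already indexed.
        if pos == len(self._pairs) or self._pairs[pos] != pair:
            self._pairs.insert(pos, pair)

    def lookup(self, value):
        # (value,) sorts before every (value, entry_id) pair, so this
        # lands on the first duplicate for the requested value.
        pos = bisect.bisect_left(self._pairs, (value,))
        ids = []
        # The extra scan over duplicates is the cost mentioned above.
        while pos < len(self._pairs) and self._pairs[pos][0] == value:
            ids.append(self._pairs[pos][1])
            pos += 1
        return ids


index = DuplicateKeyIndex()
index.add("Smith", 7)
index.add("Jones", 3)
index.add("Smith", 2)
smiths = index.lookup("Smith")
```

Every lookup on a value with many duplicates has to walk all of them, which
is one reason duplicates hurt more here than in a unique-key index.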


Here is where I see it fall out:
RDBMS: hard to beat for pure transactional power. If you have to store,
retrieve, update and delete, all the operations will generally work within
the same performance envelope.


OODBMS: hard to beat for pure synergy between logic layer and storage; I
admit a personal weakness with these types of databases.

Hierarchical Databases: great for maintaining a complete picture of
complex models (many-to-many parent-child relationships). This is likely
where stored procedures prove important in RDBMS technology (as you pointed
out).


I'll say it again: LDAP is NOT a database, it just needs one. That's what
IBM was saying. The OpenLDAP folks retrofitted BerkeleyDB. We went a
different way. The only real distinction on this front is that we
recognized very early that the nature of the backend store will dictate
the performance of your LDAP server "in the context it is designed for".
Meaning, it is likely that an RDBMS backend associated with LDAP will give
better overall performance for a heavily modified directory information
tree, but will never outperform the raw search capabilities of a purely
indexed backend for searches. So we leave it up to the implementors. At one
time we spoke about actually measuring all this stuff and providing
guidelines - it's something we would love to get work going on --- hint.
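
As a rough illustration of what leaving the store up to the implementors
can look like, here is a minimal Python sketch of a pluggable backend
contract. It is not the actual LDAPd interface: BackendStore,
InMemoryBackend and their methods are hypothetical names, and DN handling
is simplified to exact string matching.

```python
from abc import ABC, abstractmethod


class BackendStore(ABC):
    """Hypothetical contract every backing store would have to satisfy."""

    @abstractmethod
    def add(self, dn, attributes):
        """Store an entry under its distinguished name."""

    @abstractmethod
    def get(self, dn):
        """Return the entry's attributes, or None if it does not exist."""

    @abstractmethod
    def search(self, attribute, value):
        """Return the DNs of entries whose attribute equals value."""


class InMemoryBackend(BackendStore):
    """Toy implementation; a real one might wrap JDBM, BDB or an RDBMS."""

    def __init__(self):
        self._entries = {}

    def add(self, dn, attributes):
        self._entries[dn] = dict(attributes)

    def get(self, dn):
        return self._entries.get(dn)

    def search(self, attribute, value):
        # Linear scan; an indexed backend would avoid this.
        return sorted(
            dn for dn, attrs in self._entries.items()
            if attrs.get(attribute) == value
        )


backend = InMemoryBackend()
backend.add("uid=alex,dc=example,dc=com", {"sn": "Karasulu"})
backend.add("uid=robb,dc=example,dc=com", {"sn": "Penoyer"})
matches = backend.search("sn", "Penoyer")
```

The server core would talk only to the abstract contract, so swapping the
store changes the performance profile without touching protocol code.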


Turning over to how LDAP could impact database technologies - let's bring
it up a layer above the data store. A solid, protocol-compliant LDAP
implementation requires some truly industrial strength mechanisms (beyond
data storage): schema management, access controls, protocol
encoding/decoding, search optimizers, providers and on and on and on. If
this sounds similar to a database implementation, stop wondering. The core
of an LDAP server performs the same basic functions as a database
management system.


What if you added the missing pieces, for example a SQL parser, a JDBC
driver, a transaction manager, stored procs - but kept the LDAP protocol
requirements of forwarding and authoritative areas? You are now in a
scenario where one LDAP server is representing a database. What kind of
database? The one you chose: Berkeley, Oracle, SQL Server (yuck), DB2, JDBM.


Now add more LDAP protocol stuff: replication and referrals. You can now
have a set of LDAP servers acting in concert as one database. What storage
mechanism? How about one RDBMS, one JDBM, and one OODBMS, each configured
with a schema designed specifically to take advantage of its performance
characteristics. You now have the best of all worlds - one interface.
Pretty cool, huh?
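
One way to picture several stores behind one interface is routing each
request to whichever backend is authoritative for the longest matching
naming context. The Python sketch below assumes that approach; SuffixRouter
and the backend labels are hypothetical, and the DN normalization is
deliberately naive (it ignores escaped commas and multi-valued RDNs).

```python
def normalize(dn):
    """Naive DN normalization: lowercase and strip spaces around commas."""
    return ",".join(part.strip().lower() for part in dn.split(","))


class SuffixRouter:
    """Maps naming contexts (DN suffixes) to the backend that owns them."""

    def __init__(self):
        self._contexts = {}

    def register(self, suffix, backend_name):
        self._contexts[normalize(suffix)] = backend_name

    def route(self, dn):
        target = normalize(dn)
        best = None
        for suffix in self._contexts:
            # Match the context entry itself or anything beneath it.
            if target == suffix or target.endswith("," + suffix):
                # Prefer the most specific (longest) naming context.
                if best is None or len(suffix) > len(best):
                    best = suffix
        # None means no local backend is authoritative; a server could
        # answer with a referral in that case.
        return self._contexts[best] if best is not None else None


router = SuffixRouter()
router.register("dc=example,dc=com", "rdbms")
router.register("ou=people,dc=example,dc=com", "jdbm")
owner = router.route("uid=alex, ou=People, dc=example, dc=com")
```

Here the read-heavy people subtree lands on the index-style store while the
rest of the tree stays on the relational one.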

Robb


At 01:49 AM 9/12/2003 +0200, you wrote:
Alex and Brian,

Regarding the relationship between RDBMS and LDAP...

I believe this document says why RDBMS is wrong for LDAP:

http://www.openldap.org/faq/data/cache/378.html

On the other hand IBM have implemented LDAP in DB2. See:

http://www.research.ibm.com/journal/sj/392/shi.html

Since reading that I have got quite carried away designing
and implementing hierarchical structures in RDBMSs.
Mainly designing actually, but the demo referenced from
my signature below implements hierarchical categories
of contacts using the same principles.

I understand this project is not just about the protocol but
about the directory. It seems to me that it is very valuable
to have a single DBMS that supports both relational and
hierarchical structures as efficiently as possible. (In fact
I would suggest not just hierarchies but directed graphs,
i.e. a child can have one or more parents.)

If the IBM way (that I adapted) turns out to be one of the
best (following project design) then one thing that is important
is that you can efficiently add, remove and traverse nodes
in a tree represented by lots of small RDB records.
This becomes important for deep hierarchies. I guess
stored procedures might help in a standard RDBMS.
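
For illustration, here is a minimal Python sketch of a hierarchy stored as
many small relational records, using SQLite from the standard library. It
is not the schema from the IBM paper: the node and edge tables and the
sample data are assumptions, and the traversal relies on a recursive query
(SQLite 3.8.3 or later) rather than a stored procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per parent/child link, so a child may have several parents.
    CREATE TABLE edge (
        parent_id INTEGER NOT NULL REFERENCES node(id),
        child_id  INTEGER NOT NULL REFERENCES node(id),
        PRIMARY KEY (parent_id, child_id)
    );
""")
conn.executemany(
    "INSERT INTO node (id, name) VALUES (?, ?)",
    [(1, "contacts"), (2, "family"), (3, "work"), (4, "alice")],
)
conn.executemany(
    "INSERT INTO edge (parent_id, child_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4), (3, 4)],  # alice sits under two categories
)


def descendants(conn, root_id):
    """Return the names of all nodes reachable below root_id."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT child_id FROM edge WHERE parent_id = ?
            UNION  -- UNION (not UNION ALL) drops nodes reached twice
            SELECT edge.child_id
            FROM edge JOIN subtree ON edge.parent_id = subtree.id
        )
        SELECT node.name
        FROM node JOIN subtree ON node.id = subtree.id
        ORDER BY node.name
        """,
        (root_id,),
    )
    return [name for (name,) in rows]


below_contacts = descendants(conn, 1)
```

Adding or removing a link is a single-row insert or delete on the edge
table; the cost shows up in traversal, which grows with the depth of the
hierarchy.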

I might be interested in getting involved.

Regards,

Jim Wright

--
Recently completed - Child Brain Injury Trust Admin System
http://cbitdemo.paneris.org/

Urgently seeking paid work
Java, Linux, XML and much more.
http://be.webz.cz/




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



