-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08.11.2012 17:37, SMoratinos wrote:
> 3 / The Uploader publish things encrypted by namespace "a1" public
> key, it publish thinks to "goodfiles" namespace.

We'd have to avoid that, see below.
> 4 / The namespace owner "a1" will continuously search for
> publications for namespace "a1". This suppose to be the default
> functionning of gnunet, a namespace search content for his own
> namespace. He find the new content and index only it's link to him
> database.

Your wording is quite unfortunate. I know what you mean: "publications
intended for inclusion in a1", not "publications made in a1", because
only a1's owner can publish in a1. Searching your own namespace makes
no sense - only you can publish things in it anyway. And if you do
share its private key, then it is just the same as the global
namespace.

> But how the downloader discover the "a1" namespace ? The problem is
> the same as the beginning.

Out-of-band. How do you know about TPB? You learn eventually, somehow.
Maybe google for it. Sadly, unless the database contents are published
on a public web-site, they won't be googleable (and you won't be able
to discover the database's existence initially by finding its
contents), and publishing them like that will attract attention,
censorship, and may threaten the publisher's anonymity (OTOH, the
publisher may not be the same entity as the database maintainer;
still, who'd want to put himself in danger like that? On a regular
basis?)

> If there is no central server, the link index database must be in
> all the network.

Yes, each user who regularly uses the database will maintain his own
copy (a closed community of users may opt to share one copy, but that
is their business).

> All peers are a Database Owner, all the peer on the network have a
> copy of the Link Index Database.

No, only one node (or a group of collaborating nodes) is the owner,
able to publish the updated database in the a1 namespace.

> When a database is updated, this update is propagated over the
> network.

Yes, since everyone will [eventually] get a copy. For efficiency the
database won't be monolithic, so you'd only need to download a couple
of megabytes to update an existing copy.
> Search became local ! This is no longer the search result which is
> propagated but this is the Link Index Database.

Yes, although your local database will always be somewhat older than
the one the database maintainer has (depending on how often you're
able to update it).

> Finally, I'm notified by all the network from updates. New
> contents are indexes by all the network continuously.

Yes, but GNUnet kind of does that already. The database is only
different in that it's moderated and allows discovering new stuff
without foreknowledge.

> The drawback is that all peers must persit this database, and have
> enough disk space but few Gbits is not a problem ?

Yes (also, see above about database sharing).

> And yes Gnunet doesn't work like this now.

Yes. Some things are pending features (DSA, always-namespace
searches), others are yet to be designed (basically what this
discussion is).

> So what about that ?

OK, i took some time to actually read [1] instead of just glancing
over the points.

Normal (global) search is like this (based on the GHM 2010 talk):

A K-block for keyword K contains payload R (the payload is a CHK and
metadata). H(K) is the hash of the keyword. R is encrypted with a
symmetric key derived from H(K), i.e. the encrypted payload is
E_{H(K)} (R). K is used to generate an RSA private/public key pair
{PRIV_K, PUB_K}. The publisher then appends PUB_K to E_{H(K)} (R),
and signs all that with PRIV_K to produce B, which is the K-block. B
can be produced by anyone; you only need to know the keyword.

The query initiator goes through the same steps - computes
{PRIV_K, PUB_K}, then produces the hash of the public key H(PUB_K)
and sends it as a query.

Also, anyone who somehow gets a K-block can check its signature
(verify that E_{H(K)} (R) + PUB_K was signed with PRIV_K; you only
need PUB_K itself for that). Anyone who has the K-block can take the
PUB_K part of it, hash it to produce the same H(PUB_K), remember that
hash, and then match incoming queries against it.
If it matches, then the query initiator asked for the corresponding
K-block.

Since all that stuff depends only on K, attackers can pre-compute
PUB_K and H(PUB_K) for any K's that they want to censor/monitor, but
the worst thing they can do is refuse to re-transmit a query for a K
they know. Or send valid K-blocks with garbage payload (yeah, that's
actually worse than dropping queries). But that is where namespaces
come in.

Now, what [1] proposes:

All K-blocks are published under _some_ namespace. The corner case is
the global namespace, the private key for which is not a secret, so
anyone can share in that namespace.

PUB_N is the public key of a namespace, PRIV_N is the private key of
a namespace. H(K+PUB_N) is the hash of the combination of keyword K
and the public key of the namespace (as opposed to H(K), which is the
hash of only the keyword). R is encrypted with a symmetric key
derived from H(K+PUB_N), i.e. the encrypted payload is
E_{H(K+PUB_N)} (R). A DSA private/public key pair {PRIV_K, PUB_K} is
mathematically derived from PRIV_N and H(K+PUB_N) (as opposed to
generating it from K only). The publisher then appends PUB_K to
E_{H(K+PUB_N)} (R), and signs all that with PRIV_K to produce B,
which is the K-block. B can NOT be produced by anyone; you need to
know the keyword and the private key of the namespace (it can be
produced by anyone for the global namespace, as its PRIV_N is common
knowledge - you just need to know/guess the keyword).

The query initiator computes PUB_K (but not PRIV_K, as that requires
knowledge of PRIV_N) from PUB_N and H(K+PUB_N). PUB_N for the global
namespace is known by everyone; how to learn PUB_N for other
namespaces is not relevant here. Then the initiator produces the hash
of the public key H(PUB_K) and sends it as a query.

Anyone who somehow gets a K-block can check its signature (verify
that E_{H(K+PUB_N)} (R) + PUB_K was signed with PRIV_K; you only need
PUB_K itself for that).
Anyone who has the K-block can take the PUB_K part of it, hash it to
produce the same H(PUB_K), remember that hash, and then match
incoming queries against it. If it matches, then the query initiator
asked for the corresponding K-block.

Since that stuff depends on K and PUB_N, attackers can pre-compute
PUB_K and H(PUB_K) for any pairs of K's and PUB_N's that they want to
censor/monitor, but the worst thing they can do is refuse to
re-transmit a query for K and PUB_N combinations they know. Obviously
the size of the list of forbidden H(PUB_K) that they must keep is
multiplied by N (the number of namespaces they want to censor)
compared to the size it had when only K was used for queries. But
they cannot forge K-blocks, since that requires knowing PRIV_K,
which, unlike in the RSA case, is not known to everyone (except for
the global namespace - they can still poison keywords with garbage
K-blocks there).

Now, back to the matter at hand. We know that searching in a
namespace will be secure. The remaining problem is pushing
publications to the anonymous database maintainer.

Your diagram mostly matches what i think, by the way, but after
thinking a bit i would now prefer the term "database maintainer", not
"database owner" ("owning" is not an activity we should be concerned
with, "maintaining" is).

Also, it doesn't really matter how the uploader's namespace is named.
While it will be beneficial for people to be able to query it
directly (doing normal namespace searches, once they learn of that
namespace), it's not a requirement that this namespace is advertised
in any way - people will eventually get CHKs for files in that
namespace from the database, without ever knowing the uploader's
namespace.

Anyway, the problem is that for normal K-block publication we have a
common shared "secret" - the keyword, known to both publisher and
query initiator. The cryptography builds on that. For reverse
publications we have nothing of the sort.
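For concreteness, here is a toy model of the keyword-derived key
construction described above. It is NOT GNUnet's actual cryptography:
it uses SHA-512 for H, a deliberately insecure toy group, and a
Schnorr-style signature in place of DSA. But it preserves the
property that matters: only the holder of PRIV_N can derive PRIV_K
and sign, while anyone knowing K and PUB_N can derive the same PUB_K
(and thus the query hash H(PUB_K)) and verify signatures.

```python
import hashlib

# Toy group parameters - deliberately small and INSECURE, for illustration.
P = 2**127 - 1   # a Mersenne prime
Q = P - 1        # exponents are reduced mod P-1 (Fermat's little theorem)
G = 3

def H(*parts: bytes) -> int:
    h = hashlib.sha512()
    for part in parts:
        h.update(part)
    return int.from_bytes(h.digest(), "big")

def i2b(n: int) -> bytes:
    return n.to_bytes(64, "big")

def keygen(seed: bytes):
    priv = H(seed) % Q
    return priv, pow(G, priv, P)

# The proposal's derivation: a per-keyword keypair from the namespace keypair.
def derive_priv_k(priv_n: int, hk: int) -> int:
    return (priv_n * hk) % Q         # requires PRIV_N: only the owner can sign

def derive_pub_k(pub_n: int, hk: int) -> int:
    return pow(pub_n, hk, P)         # anyone with PUB_N and K can compute this

def schnorr_sign(priv: int, msg: bytes):
    k = H(b"nonce", i2b(priv), msg) % Q      # deterministic nonce
    r = pow(G, k, P)
    e = H(i2b(r), msg) % Q
    return e, (k - priv * e) % Q

def schnorr_verify(pub: int, msg: bytes, sig) -> bool:
    e, s = sig
    r = (pow(G, s, P) * pow(pub, e, P)) % P  # reconstructs g^k
    return H(i2b(r), msg) % Q == e

# Publisher (namespace owner) builds the K-block for keyword K:
priv_n, pub_n = keygen(b"namespace a1")
K = b"some keyword"
hk = H(K, i2b(pub_n)) % Q                    # plays the role of H(K+PUB_N)
payload = b"E_{H(K+PUB_N)}(R), i.e. the already-encrypted CHK+metadata"
priv_k = derive_priv_k(priv_n, hk)
pub_k = derive_pub_k(pub_n, hk)
sig = schnorr_sign(priv_k, payload + i2b(pub_k))

# Query initiator knows only K and PUB_N, yet derives the very same PUB_K
# (without ever learning PRIV_K), and would send H(PUB_K) as the query:
pub_k_query = derive_pub_k(pub_n, H(K, i2b(pub_n)) % Q)
assert pub_k_query == pub_k

# Any intermediary holding the K-block can verify it with PUB_K alone:
assert schnorr_verify(pub_k, payload + i2b(pub_k), sig)
```

The real DSA derivation differs in detail, but the division of
knowledge is the same: producing PRIV_K needs PRIV_N, producing PUB_K
needs only PUB_N.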
The database maintainer does not have any search criteria other than
his own PUB_N or things derived from it. So it will go like this:

The uploader will publish a K-block in the global namespace. It will
be a normal K-block, findable by a normal global namespace search.
The difference is that it will correspond to a keyword K that is
computed simply as H(PUB_N) (PUB_N being the public key of the
database maintainer's namespace), and then it will go through all the
normal perturbations K goes through. It won't be something people
will normally find or search for. Also, it will be easily guessable,
since PUB_N will, at some point, be well known to everyone. So
censoring this K-block (not re-transmitting queries for it) will be
as easy for adversaries as censoring a K-block for any other keyword
they know in advance and that doesn't change over time. It's not much
worse than normal K-block censorship, but for normal publications you
can (and will) use multiple different K's, and my hope is that at
least some of them will not be well known in advance (they will be
made known outside of GNUnet, at the moment of publication or shortly
after it). This H(PUB_N)-as-K will be a sitting duck, and won't
change for a long time.

Anyway, the database maintainer will make a query using that keyword.
I expect that the database maintainer will have to update the
namespace key pair every now and then to avoid running out of Bloom
filter space (statistically it's large, but running the same search
again and again, getting tons of results, and then filtering them
might give too many false positives).

Another point is that this K-block will have the link (not a CHK,
that's a link to the file; what's the acronym for namespace links?
NSK? i forgot...) to the uploader's own namespace, in which the file
must _also_ be published.
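The reverse-publication keyword trick above can be sketched in a few
lines. Everything here is illustrative: the payload field names are
made up, and SHA-512 stands in for GNUnet's hash. The point is that
the keyword requires no shared secret - it is derived from the
maintainer's public key, which anyone (maintainer, uploader, and
unfortunately also a censor) can compute.

```python
import hashlib

# Hypothetical well-known namespace public key of the database maintainer.
pub_n = b"<database maintainer's PUB_N>"

# Uploader side: the keyword is simply H(PUB_N). The K-block payload
# carries a link to the uploader's OWN namespace plus database-specific
# metadata (field names invented for this sketch).
keyword = hashlib.sha512(pub_n).hexdigest()
payload = {
    "uploader_namespace_link": "<link to the uploader's namespace>",
    "category": "books",   # database-specific info; format to be agreed upon
}

# Maintainer side - and, equally, an adversary's side: the very same
# keyword is derivable from PUB_N alone, which is why this K-block is a
# censorship "sitting duck".
maintainer_keyword = hashlib.sha512(pub_n).hexdigest()
assert maintainer_keyword == keyword
```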
This is also akin to [2] (i'm still not sure how [2] will be
implemented; the idea that _i_ want to see in action is that finding
_any_ K-block in the global namespace for stuff that is _also_
published in a non-global namespace should allow you to learn that
non-global namespace - right now you need to find that non-global
namespace first, _then_ search in it).

If the upload goes through, that namespace will later be used by the
uploader to update his links in the database: the database maintainer
will search for the K of that K-block in the uploader's namespace,
and update the database from the search results -> a much narrower
search, easier to do. And/or the database maintainer will just search
the root element of that namespace, and update the database from the
search results (i.e. the uploader will be able not only to update
already published links, but also to publish new ones much faster).
Again, the root element of that namespace will have to have a
special, agreed-upon format.

The K-block in question will also carry some database-specific
information (category, etc., although THAT kind of info should not be
tied to this database implementation; it should be made part of the
GNUnet/libextractor specs and be usable by everyone).

Encrypting things with the PUB_N of the database namespace doesn't
really give us anything (also, it's not a good idea to fill the net
with K-blocks that only one node in the whole network can ever
decode). The database maintainer can't specify "only decodeable by my
private key" as a search criterion. So instead these uploads will be
public, with extra metadata attached. The database maintainer will
need a good machine to filter out spam and fakes, though.
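Putting the pieces together - the well-known discovery keyword plus
the narrower per-uploader namespace re-scans - one update pass by the
maintainer might look roughly like this. The search functions are
hypothetical stand-ins for GNUnet FS searches, and the spam/fake
moderation step is deliberately omitted.

```python
import hashlib

def update_once(pub_n, db, search_global, search_namespace):
    """One polling pass by the database maintainer (sketch only).

    search_global / search_namespace are hypothetical stand-ins for
    GNUnet FS keyword and namespace searches.
    """
    keyword = hashlib.sha512(pub_n).hexdigest()   # the agreed K = H(PUB_N)

    # 1. Discover new uploader namespaces via the well-known global keyword.
    for result in search_global(keyword):
        ns = result.get("uploader_namespace")
        if ns:
            db["known_namespaces"].add(ns)

    # 2. Re-scan known uploader namespaces - much narrower searches
    #    (e.g. each namespace's root element, in the agreed-upon format).
    for ns in sorted(db["known_namespaces"]):
        for entry in search_namespace(ns):
            db["links"][entry["chk"]] = entry     # moderation omitted here

# Usage with stub search functions:
db = {"known_namespaces": set(), "links": {}}
stub_global = lambda kw: [{"uploader_namespace": "ns-uploader-1"}]
stub_ns = lambda ns: [{"chk": "CHK-123", "title": "example file"}]
update_once(b"<maintainer's PUB_N>", db, stub_global, stub_ns)
assert db["links"]["CHK-123"]["title"] == "example file"
```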
Also, i think i should note that if you're thinking of mimicking
torrent indexers (moderated collections of well-categorized links) in
GNUnet this way, you should remember that you can't put ads in the
content of the database (or, rather, you can, but someone will just
automatically strip them off and re-publish the database without them
- and people will use that version instead). And you won't have a
web-site to show ads on, to accept donations on, or to sell stuff
(unless you want to just throw away anonymity/censorship resistance
and play whack-a-mole the same way TPB does right now; or the
database viewer and format will be proprietary, in which case you
won't see any cooperation from us - and that still won't work in the
long term). So these databases will have to be numerous, small, and
easy to moderate, and the moderators will have to work for free.

Or you'd have to go back to variant one, where the database only
contains things that the database maintainer (who could be a group of
persons) discovered on his own - that way the database maintainer
won't need to wade through swamps of spam/fakes targeted at him, just
through the loads of crap that will populate the global namespace. Or
cherry-pick other people's namespaces (which the maintainer will
learn passively, on his own). At GHM 2010 Grothoff also mentioned
using a web of trust to create chains of namespaces to reduce spam -
that could be used to great effect (the database maintainer will
[eventually] learn a number of passively trusted namespaces, and will
be able to restrict [some] searches to the WoT that spreads from
these namespaces).

By the way, i'm not discussing the way moderators will communicate
with each other and synchronize database updates among themselves.
They may or may not be anonymous to one another, and may or may not
use GNUnet for this.
[1] https://gnunet.org/bugs/view.php?id=2564
[2] https://gnunet.org/bugs/view.php?id=2185

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (MingW32)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iQEcBAEBAgAGBQJQnAziAAoJEOs4Jb6SI2Cw0v4H/01qB6N785PxBodCV84S4EDx
CxUALMRTtChoGmByPo4uGfUdAYBKTy2mFQtdsDaPMzsH74ZTGFcbw4mLLhCwV0aA
bsxNeRORYac79/rqm05oh4/6VMsU5feFucaWakvgVkyM6/EmzUwKHv7PDd5kcbjZ
pMNMPqOLMtxaAopMHdXGYDzmCNEw04D/4CyuazQNyofU+rkbRV3dKyW1he960DpG
+PyHMkqDty34KQZIOGHj5M7ngAZZn3OIuqlTsc6Ywf3M/SnstIF70fbe9uqQeYl4
NGpIMOd64E4nZJC2ffxprSEbHh+6n/SHSe1R3qT1JTRHytchVNEqumfa9M3IqwQ=
=EbP7
-----END PGP SIGNATURE-----
_______________________________________________
GNUnet-developers mailing list
GNUnet-developers@gnu.org
https://lists.gnu.org/mailman/listinfo/gnunet-developers