Hey Francesco and others,

First, thanks for the direction.

I was thinking about using generic, readily available tools as much as possible.
Also, in education there is a whole argument for it not being an intercepting 
proxy (with or without SSL bump), since that
simplifies some aspects of the setup.

I will try to write the general specs of the project from my point of view.
Since the goal is education and not policy enforcement, we can start by defining 
the age of the kids as low as 5-6 or even lower.
Due to the age of the kids there is a baseline policy that must be enforced, i.e. 
a couple of standard, well-known categories.
With this in mind we need a DB setup that will host these categories and will 
be performant enough for high load, i.e. schools.
Since the law in most, if not all, countries prohibits nudity to a degree 
and also prohibits the demonstration of reproductive activities
in both animals and humans, it is pretty clear that any exception to these 
rules should only be possible for professional staff who are allowed by law
to open these doors in very special cases (which I know to exist).
There are also other activities and categories which are known to be harmful 
at specific ages and should be blocked by policy.
We can divide the filtering policy into levels: domains, URLs, and content inside 
a page or a dynamic app which is either embedded or sourced in some other way.
Domains and URLs are the best-known levels, they are commonly filtered, and many 
tools are available to enforce and block them.
The difficulty is with systems and sites that are not based on static 
content, and with others whose content is 
streamed inside a WebSocket or delivered by another method, such as content 
chunked over multiple URLs or customized requests and responses.

In this specific project I want to address only the basics: domains 
and maybe URLs.
Because of the above, and because the Internet is far ahead of 1985 or 
2000, the depth of the education session is restricted;
the proxy will be used only to demonstrate that there are bad actors on the 
Internet.
There are also other categories that many would probably like to add to the 
list, such as malware sites.

I believe the right way is to use a forward proxy which uses usernames 
to authenticate and identify the user.
This makes the whole setup a bit simpler to build, and it relies on the 
kids or teenagers actively participating
in the setup and agreeing to the terms of use, based on their trust in the 
teachers and parents.
We also need to show some trust in the kids so that they can be open during 
the session.

From my point of view the architecture should be something like this:
* Proxy
* DB (SQL or other)
* User web portal (app)
* Admin web portal (app)
* Block pages (static content with a touch of JS)
* A set of external helpers (auth, dstdomain matcher, time limit, DNS RBL 
checker)
* Audit system
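As a rough sketch of the dstdomain matcher helper: Squid's external ACL helpers receive whatever tokens the `external_acl_type` line passes (e.g. %LOGIN %DST) and answer "OK"/"ERR" per lookup. The per-user blacklist below is a placeholder for the real lists DB, and whether "OK" means allow or block depends on how the ACL is used in http_access; this sketch assumes `http_access allow` on a match:

```python
#!/usr/bin/env python3
"""Sketch of a Squid external ACL helper for dstdomain matching.

Squid sends "username domain" per line (per the external_acl_type
format) and expects "OK"/"ERR" per line.  The dict below is a
placeholder for the real lists DB or a cache in front of it.
"""
import sys

# Placeholder per-user blacklist; illustrative names only.
USER_BLACKLIST = {"eliezer": {"bad.example.com", "ads.example.net"}}

def blocked(user: str, domain: str) -> bool:
    """True when the domain or any parent domain is listed
    (dstdomain semantics: "example.com" also covers "www.example.com")."""
    listed = USER_BLACKLIST.get(user, set())
    parts = domain.split(".")
    return any(".".join(parts[i:]) in listed for i in range(len(parts)))

def main() -> None:
    for line in sys.stdin:
        try:
            user, domain = line.split()
        except ValueError:
            print("ERR", flush=True)
            continue
        print("ERR" if blocked(user, domain) else "OK", flush=True)

if __name__ == "__main__":
    main()
```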

The assumption is that only authenticated users can use the proxy, i.e. no 
username, no Internet, not even for Windows and AV updates.
We also assume that the admins of the proxy do not need to override the basic 
policies because they have access to unrestricted Internet elsewhere.
Authentication can be done using the existing tools with a MySQL DB which can 
be integrated with the web portal (not AD or LDAP).
The DB for the dstdomain/URL blacklists should be fast enough to allow 
near-real-time updates, on the order of a TTL of 5 to 10 seconds.
Every domain which should be blocked by the policy is a "must bump" one, while 
a domain allowed by the policy gets "no bump".
There are a couple of layers of blocklists and whitelists (first match wins, 
from left to right):
  top-level (never allowed), campus-wide customized blacklist (for testing), 
campus-wide customized whitelist (for testing), user customized blacklist, user 
customized whitelist, campus-wide blacklist, campus-wide whitelist
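The first-match-from-left-to-right order can be sketched as an ordered walk over the layers. The sets and domains below are just illustrative stand-ins for the real lists:

```python
"""First-match-wins lookup over the layered block/whitelists,
mirroring the left-to-right order described above."""

from typing import Optional

def evaluate(domain: str, layers: list[tuple[str, set[str]]]) -> Optional[str]:
    """layers is an ordered list of (verdict, domains); the first
    layer containing the domain wins.  None means no match."""
    for verdict, domains in layers:
        if domain in domains:
            return verdict
    return None

# Layer order from the text; contents are made-up examples.
layers = [
    ("block", {"never.example"}),                  # top-level (never allowed)
    ("block", set()),                              # campus-wide test blacklist
    ("allow", set()),                              # campus-wide test whitelist
    ("block", {"games.example"}),                  # user blacklist
    ("allow", {"games.example", "news.example"}),  # user whitelist
    ("block", set()),                              # campus-wide blacklist
    ("allow", set()),                              # campus-wide whitelist
]
```

Note how "games.example" ends up blocked even though it is also in the user whitelist: the user blacklist sits earlier in the order, and the first match wins.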

The user can manage their own lists via the web portal, but not the top-level 
and campus lists.
There is also a section in the web portal which allows the user to contact the 
content administrators about any of the non-user-customized
lists, such as the top-level and campus-wide ones.
The expectation from the content administrators is to really understand their 
interaction with the users and not just enforce the policy.
The content administrators are also required to have above-average technical 
knowledge of how the Internet works,
at both the IP level and the application level, e.g. TLS and firewall 
piercing.
The expectation is that all changes in any of the lists will be logged in the 
audit log.
Also, any "action" in the web portal will be logged in the audit log.
The audit is required by law, to prevent bad actions from being done in an 
unsupervised manner.
Because of this, the proxy structure and config are fixed and cannot be changed 
by anyone, not even the sysadmins.
To keep the system effective, the only way to access the DB is through an 
audited web portal.
Since the structure of the DBs is pretty simple, we can simplify access to 
them via a very simple API.
The API should include both single-entry actions (add/modify) and 
bulk actions (for big lists).
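A minimal sketch of such an API surface, with a dict standing in for the lists DB, a list for the audit table, and hypothetical function names; every action, single or bulk, appends an audit record:

```python
"""Sketch of the simple list API: single and bulk add/modify, with
every action appended to an audit log.  The dict and list below are
stand-ins for the real DB tables; all names are illustrative."""

import time

DB: dict[str, str] = {}    # domain -> verdict ("block"/"allow")
AUDIT: list[dict] = []     # append-only audit trail

def _audit(actor: str, action: str, detail: str) -> None:
    """Record who did what, and when."""
    AUDIT.append({"ts": time.time(), "actor": actor,
                  "action": action, "detail": detail})

def put_entry(actor: str, domain: str, verdict: str) -> None:
    """Single-entry add/modify."""
    DB[domain] = verdict
    _audit(actor, "put", f"{domain}={verdict}")

def put_bulk(actor: str, entries: dict[str, str]) -> None:
    """Bulk action for big lists; one audit record per batch."""
    DB.update(entries)
    _audit(actor, "put_bulk", f"{len(entries)} entries")
```

In the real system these functions would sit behind the audited web portal, so the audit record is written in the same transaction as the list change.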

I believe that such a setup can be implemented with containers and in an HA 
architecture.

The actual DBs for the lists which I have considered are:
* MySQL/MariaDB
* PostgreSQL
* MSSQL
* SquidGuard
* ufdbGuard
* SquidBlocker
* DNS RBL

The limitation of SquidGuard, ufdbGuard, and DNS RBL services is that they 
need to recompile the lists before use.
For lists which should be recompiled every hour or so, we can create a CI/CD 
pipeline which compiles
the lists DB on a dedicated system and publishes the precompiled files in 
public storage, i.e. S3-compatible or git.
A list-change check can be done every 5 minutes for emergency updates, while 
routine updates happen periodically every hour.
The above idea can work with ufdbGuard, SquidGuard, or any DNS RBL 
system.

As for the user and campus dynamic lists, these should be stored and managed 
in a DB, such as a key-value store or any SQL database,
which doesn't require compilation to begin with.
If the dynamic lists DB is small enough per user or campus, it would be 
possible to use a TTL of 5-15 seconds on the dstdomain
external helper to reduce the number of "slow" queries against the DB.
The other option is to use some kind of RAM caching service such as Memcached 
or Redis and to cache the response per
domain per user for 300 seconds, i.e. the response for the user "eliezer" and 
"www.example.com" will be stored as "eliezer://www.example.com";
if the lists are small enough, it would probably even be simple to trigger a 
prefix cleanup for the whole "eliezer://" namespace.
Currently, none of the list DBs that I know about allow a query to check 
whether a dstdomain is in a category or a set of categories.
With such a service we can divide the user lookup into two separate 
searches/steps:
* customized dstdomain list match
* customized dstdomain to set-of-categories match

This can work for both whitelists and blacklists.
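The per-user cache keys and the prefix cleanup could be sketched like this, with a plain dict plus expiry times standing in for Memcached/Redis (the key scheme and 300-second TTL follow the text; the function names are made up):

```python
"""Sketch of the per-user, per-domain verdict cache: keys like
"eliezer://www.example.com" with a 300-second TTL, plus a prefix
sweep that drops a whole user's namespace when their lists change.
A dict with expiry timestamps stands in for Memcached/Redis."""

import time

CACHE: dict[str, tuple[float, str]] = {}   # key -> (expires_at, verdict)
TTL = 300.0

def cache_put(user: str, domain: str, verdict: str) -> None:
    CACHE[f"{user}://{domain}"] = (time.time() + TTL, verdict)

def cache_get(user: str, domain: str):
    """Return the cached verdict, or None on a miss or expiry."""
    key = f"{user}://{domain}"
    hit = CACHE.get(key)
    if hit is None or hit[0] < time.time():
        CACHE.pop(key, None)
        return None
    return hit[1]

def cache_drop_user(user: str) -> int:
    """Prefix cleanup for the whole "user://" namespace;
    returns the number of keys dropped."""
    prefix = f"{user}://"
    stale = [k for k in CACHE if k.startswith(prefix)]
    for k in stale:
        del CACHE[k]
    return len(stale)
```

With a real Redis backend the prefix sweep would use an incremental key scan rather than iterating everything, but the key scheme stays the same.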

Working with precompiled lists or a fast enough service would allow the system 
to work quickly and probably scale.

I have all the Squid knowledge required for such a system, but I need some help 
with the other moving parts.

I am open to any comments and suggestions about the setup, technical or 
otherwise.

Thanks,
Eliezer Croitoru
ngtech1...@gmail.com
+972-5-28704261


From: squid-users <squid-users-boun...@lists.squid-cache.org> On Behalf Of 
Francesco Chemolli
Sent: Friday, February 9, 2024 12:00 PM
To: Marcus Kool <marcus.k...@urlfilterdb.com>
Cc: squid-users@lists.squid-cache.org
Subject: Re: [squid-users] Squid as an education tool

Hi Eliezer, Marcus,
  what you describe seems very similar to a captive portal, just with a very 
dynamic allowlist policy.
I'm confident that it can be implemented with Squid, a few helpers, and a side 
webserver plus a small website.
In fact, it would probably be a nice project to release to the community if it 
were built to be generic enough.

On Fri, Feb 9, 2024 at 9:23 AM Marcus Kool <marcus.k...@urlfilterdb.com> 
wrote:
Hi Eliezer,

I am not aware of a tool that has all functionality that you seek so you 
probably have to make it yourself.
I know that you are already familiar with ufdbGuard for Squid to block access, 
but you can also use ufdbGuard for temporary access by including a 
time-restricted whitelist in the configuration file 
and doing a reload of the ufdbGuard configuration.  The reload does not 
interrupt the function of the web proxy or ufdbGuard itself.

Marcus

On 09/02/2024 03:41, ngtech1...@gmail.com wrote:
> Hey Everybody,
>
> I am just releasing the latest 6.7 RPMs and binaries while running couple 
> tests and I was wondering if this was done.
> As I am looking at proxy, in most cases it's being used as a policy enforcer 
> rather than an education tool.
> I believe in education as one of the top priorities compared to enforcing 
> policies.
> The nature of policies depends on the environment and the risks but 
> eventually understanding the meaning of the policy
> gives a lot to the cooperation of the user or an employee.
>
> I have yet to see a solution like the next:
> Each user has a profile/user which when receiving a policy block will be 
> prompted with an option to allow temporarily
> the specific site or domain.
> Also, I have not seen an implementation which allows the user to disable or 
> lower the policy strictness for a short period of time.
>
> I am looking for such implementations if those exist already to run education 
> sessions with teenagers.
>
> Thanks,
> Eliezer
>
> _______________________________________________
> squid-users mailing list
> squid-users@lists.squid-cache.org
> https://lists.squid-cache.org/listinfo/squid-users



-- 
    Francesco

