On 07/18/2011 11:44 AM, Robert Bonomi wrote:
 From owner-freebsd-questi...@freebsd.org  Mon Jul 18 03:55:59 2011
Date: Mon, 18 Jul 2011 10:55:58 +0200
From: Frank Bonnet<f.bon...@esiee.fr>
To: freebsd-questions@freebsd.org
Subject: Re: Tools to find "unlegal" files ( videos , music etc )

On 07/18/2011 10:45 AM, Polytropon wrote:
On Mon, 18 Jul 2011 10:38:22 +0200, Frank Bonnet wrote:
On 07/18/2011 10:10 AM, Polytropon wrote:
On Mon, 18 Jul 2011 09:55:09 +0200, Frank Bonnet wrote:
Hello

Does anyone know of a utility that I could pipe the output of the "find"
command to, in order to detect video, music, games, etc. files?

I need a tool that can "inspect" the contents of files, because many users
rename those files to "inoffensive" names :-)
One way could be to define a list of file extensions that
commonly match the content you want to track. Of course,
the file name does not directly correspond to the content,
but it often gives a good hint to search for *.wmv, *.flv,
*.avi, *.mp(e)g, *.mp3, *.wma, *.exe - and of course all
the variations of the extensions with uppercase letters.
Also consider *.rar and maybe *.zip for compressed content.

If file extensions have been manipulated (rare case), the
"file" command can still identify the correct file type.
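
For illustration, the "file" check could be scripted roughly as follows.
This is only a sketch, not something posted in the thread: the helper names
is_media_mime and check_file are made up, and it assumes a file(1) that
accepts -b and --mime-type (check the manpage if yours differs).

```shell
#!/bin/sh
# is_media_mime (hypothetical helper): succeed if the MIME type string
# given as $1 looks like audio or video content.
is_media_mime() {
    case "$1" in
        audio/*|video/*) return 0 ;;
        *)               return 1 ;;
    esac
}

# check_file (hypothetical helper): ask file(1) for the MIME type of the
# path in $1 and print "path<TAB>type" if it looks like media.
# Assumption: this file(1) supports -b (brief) and --mime-type.
check_file() {
    mime=$(file -b --mime-type -- "$1") || return 1
    if is_media_mime "$mime"; then
        printf '%s\t%s\n' "$1" "$mime"
    fi
}
```

Because the decision is based on what "file" reports and not on the name,
a video renamed to something like report.txt would still be listed. Archive
types (*.rar, *.zip) could be added to the case pattern in the same way.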




yes thanks , gonna try with the file command
You could make a simple script that lists "file" output for
all files (just to be sure because of possible suffix renaming)
for further inspection. Sometimes, you can also run "strings"
for a given file - maybe that can be used to identify typical
suspicious string patterns for a "strings + grep" combination
so less manual identification has to be done.


yes , my main problem is the huge number of files
but anyway I'm gonna first check files greater than 500 MB
it could be a good start
That's what 'find(1)' is for.  Something like (run as superuser):

  find / -exec ./inspect {} \; >> /tmp/suspects

with './inspect' being a trivial (executable!) shell-script:

     #!/bin/sh
     # quote "$1" so file names containing spaces reach file(1) intact
     file "$1" | awk -f ./inspect.awk

and './inspect.awk' is:

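           # $1 is the "name:" field printed by file(1); blank it so the
           # patterns below only see the description
           {file = $1; $1 = ""}
/regex1/  {printf("%s  %s\n", file, $0); next}
/regex2/  {printf("%s  %s\n", file, $0); next}
/regex3/  {printf("%s  %s\n", file, $0); next}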
   ...      ...
   ...      ...
           {next;}

where 'regex1', 'regex2', etc. are things to select 'files' of interest,
based on what 'file' reports.  The awk code strips out the file name, so
that the regex will match only against the 'file' output, with no false
positives against a substring in the file name itself.
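
The awk code above takes the name from the first whitespace-separated
field, so a file name containing spaces would be cut short.  A possible
variation, as a sketch only: split each line at the first ": " instead.
The function name filter_file_output is made up for this example, and it
assumes the usual "path: description" layout of 'file' output and that
the path itself does not contain ": ".

```shell
#!/bin/sh
# filter_file_output (hypothetical helper): read file(1) output lines of
# the form "path: description" on stdin and print those whose description
# matches the extended regex given as $1.
# Note: awk -v interprets backslash escapes in the regex string.
filter_file_output() {
    awk -v re="$1" '
    {
        sep = index($0, ": ")            # first ": " is taken as end of path
        if (sep == 0) next               # not a "path: description" line
        path = substr($0, 1, sep - 1)
        desc = substr($0, sep + 2)
        sub(/^ +/, "", desc)             # drop padding before the description
        if (desc ~ re) printf("%s  %s\n", path, desc)
    }'
}
```

It would be used as, e.g., file "$1" | filter_file_output 'MPEG|AVI|Audio'
where the regex is just an example and needs adapting to what 'file'
actually prints on your system.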

See the find(1) manpage for things you can put before the '-exec' param,
to filter by size, etc.  You can also limit the search to a specific
part of the filesystem tree, by replacing '/' with the name of the directory
hierarchy you want to search -- e.g. '/home' (if that's where all 'user'
files are) -- although, 'for completeness' (given the 'legal' issues)  you
may well want to run it over 'everything'.
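
As a rough sketch of such a filter (the function name large_files is made
up here, and the 'k'/'M' size suffixes are assumed to be supported by your
find(1); check the manpage), restricting the search to regular files above
a given size could look like:

```shell
#!/bin/sh
# large_files (hypothetical helper): list regular files under directory $1
# that are larger than size $2, e.g. "500M".
# Assumption: this find(1) accepts size suffixes such as k and M.
large_files() {
    find "$1" -type f -size "+$2" -print
}

# Example call (not run here):
#   large_files /home 500M
# The same tests can go in front of -exec in the earlier command:
#   find /home -type f -size +500M -exec ./inspect {} \; >> /tmp/suspects
```

Putting '-type f' and '-size' before '-exec' means './inspect' only gets
started for the files that pass those tests, which helps when the number
of files is huge.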


_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"

Thanks a lot for your help !
