In a message dated 12/10/2010 2:58:08 PM Pacific Standard Time,
jamesmikedup...@googlemail.com writes:
> my idea was that you will want to search pages that are already referenced
> by Wikipedia. In my work on Kosovo it would be very helpful,
> because there are lots of bad results on Google …
In a message dated 12/10/2010 1:10:26 PM Pacific Standard Time,
jamesmikedup...@googlemail.com writes:
> My point is we should index them ourselves. We should first have the pages
> used as references listed in an easy-to-use manner, and if
> possible we should cache them. If they are not cacheable …
In a message dated 12/10/2010 1:31:20 PM Pacific Standard Time,
jamesmikedup...@googlemail.com writes:
> If we prefer pages that can be cached and translated, and mark the
> others that cannot, then by natural selection we will in the long term
> replace the pages that are not allowed to be cached.
I'm in the process of creating a cleanup tool that checks archive.org and
webcitation.org. If a URL is not archived, the tool checks whether it is still
live; if it is, I request that WebCite archive it on demand, and the tool fills
in the archiveurl parameter of the cite templates.
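Roughly, the per-URL check would look something like this (a quick, untested
Python sketch; the archive.org availability endpoint is real, but the WebCite
submission URL and its parameters below are assumptions I still have to verify):

import requests

WAYBACK_API = "https://archive.org/wayback/available"
WEBCITE_ARCHIVE = "https://www.webcitation.org/archive"  # assumed endpoint

def existing_snapshot(url):
    # Ask the Wayback Machine whether a snapshot of this URL already exists.
    data = requests.get(WAYBACK_API, params={"url": url}, timeout=10).json()
    closest = data.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None

def is_live(url):
    # Treat any response below 400 as "the page is still up".
    try:
        r = requests.head(url, allow_redirects=True, timeout=10)
        return r.status_code < 400
    except requests.RequestException:
        return False

def process(url):
    snapshot = existing_snapshot(url)
    if snapshot:
        return snapshot  # goes straight into |archiveurl=
    if is_live(url):
        # Ask WebCite to archive the page on demand (parameter names assumed).
        requests.get(WEBCITE_ARCHIVE,
                     params={"url": url, "email": "bot@example.org"},
                     timeout=30)
    return None  # dead link, or freshly submitted: re-check on a later pass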
John
Well, let's backtrack.
The original question was: how can we exclude Wikipedia clones from the search?
My idea was to create a search engine that includes only refs from
Wikipedia in it.
Then the idea was to make our own engine instead of only using Google.
Let's agree that we first need a list of references.
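As a first cut at "our own engine", an off-the-shelf open-source indexer would
be enough. A minimal sketch using Whoosh, one such Python library (the schema
and field names here are only illustrative):

import os
from whoosh.index import create_in
from whoosh.fields import Schema, ID, TEXT
from whoosh.qparser import QueryParser

# One document per cached reference page.
schema = Schema(url=ID(stored=True, unique=True),
                title=TEXT(stored=True),
                body=TEXT)

os.makedirs("refindex", exist_ok=True)
ix = create_in("refindex", schema)

writer = ix.writer()
writer.add_document(url="http://example.org/some-reference",
                    title="Some reference",
                    body="full text of the cached page goes here")
writer.commit()

# Query the index the same way a small web front end would.
with ix.searcher() as searcher:
    query = QueryParser("body", ix.schema).parse("kosovo")
    for hit in searcher.search(query, limit=10):
        print(hit["url"], hit["title"])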
I know all about the aspects of programming and copyright; I thought I
answered the questions.
Of course I can program this myself, and we can use open-source
indexing tools for that. The translations are of course a separate
issue; they would be under the same restrictions as the source page.
If …
I am not talking about books, just webpages.
Let's take ladygaga.com as an example.
Wayback Machine:
http://web.archive.org/web/*/http://www.ladygaga.com
Google cache:
http://webcache.googleusercontent.com/search?q=cache:1720lEPHkysJ:www.ladygaga.com/+lady+gaga&cd=1&hl=de&ct=clnk&gl=de&client=firefox
I mean Google has copies, caches of items for searching.
How can Google cache this?
Archive.org has copyrighted materials as well.
We should be able to save backups of this material as well.
mike
In a message dated 12/9/2010 11:06:30 PM Pacific Standard Time,
jamesmikedup...@googlemail.com writes:
> Google does it, archive.org (the Wayback Machine) does it, so we can copy
> them for caching and searching, I assume. We are not changing the
> license, just preventing the information from disappearing.
In a message dated 12/9/2010 2:51:39 AM Pacific Standard Time,
jamesmikedup...@googlemail.com writes:
> yes it would be great. As I said, it could just include all pages
> listed as REF pages, and that would allow people to review the results
> and find pages that should not belong.
>
> We also …
On Dec 8, 2010, at 6:21 PM, Mike Dupont wrote:
> Sounds like we need to have a notable search engine that includes only
> "approved and allowed" sources; that would be nice to have.
Sounds like a great community project, Wiki Search!
Domas
Sent: Wednesday, December 08, 2010 7:58 PM
Subject: Re: [Foundation-l] excluding Wikipedia clones from searching
I thought about this more.
It would be to extract a list of all pages that are included as
references in the WP. We would use this for the search engine.
We should also make sure that all referenced pages (not merely linked ones)
are stored in archive.org or someplace permanent.
I wonder if there is some API to extract them.
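One candidate is MediaWiki's own API: prop=extlinks lists the external links
on a page. It does not distinguish links inside <ref> tags from other external
links, so some filtering would still be needed, but it is a start. A rough
sketch (untested; the page title is just for illustration):

import requests

API = "https://en.wikipedia.org/w/api.php"

def external_links(title):
    # Page through prop=extlinks until no continuation token is returned.
    links = []
    params = {"action": "query", "prop": "extlinks", "titles": title,
              "ellimit": "max", "format": "json"}
    while True:
        data = requests.get(API, params=params, timeout=10).json()
        for page in data["query"]["pages"].values():
            links.extend(el["*"] for el in page.get("extlinks", []))
        if "continue" not in data:
            return links
        params.update(data["continue"])

print(len(external_links("Kosovo")), "external links found")

Each of those URLs could then be fed both to the archiving pass and to the
search index.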
On Wednesday 08 December 2010 05:16 PM, Amir E. Aharoni wrote:
> I know that some Wikipedias customized Special:Search, adding other search
> engines besides Wikipedia's built-in one. I tried to see whether any Wikipedia
> added an ability to search using Google (or Bing, or Yahoo, or any other
> search engine) …
On 8 December 2010 11:46, Amir E. Aharoni wrote:
> For some time I used to fight this problem by adding
> "-site:wikipedia.org -site:wapedia.mobi -site:miniwiki.org" etc. to my search
> queries, but I hit a wall: Google limits the search string to 32 words, and
> today there are many more than that.
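Just to make the limit concrete: each "-site:" term counts against the 32-word
budget, so the real list of mirrors blows through it almost immediately
(illustrative Python only; the clone list is just the three sites from the
example above):

# The real list of mirrors is of course much longer than this.
CLONE_SITES = ["wikipedia.org", "wapedia.mobi", "miniwiki.org"]

def build_query(terms, clones=CLONE_SITES):
    query = " ".join(list(terms) + ["-site:" + s for s in clones])
    if len(query.split()) > 32:
        raise ValueError("Google ignores everything past 32 words")
    return query

print(build_query(["lady", "gaga", "discography"]))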
Sounds like we need to have a notable search engine that includes only
"approved and allowed" sources; that would be nice to have.
On 8 December 2010 15:26, Amir E. Aharoni wrote:
> Yes, but that may also exclude sites that are useful and original, but
> happen to mention Wikipedia.
Add -"quoted sentence from article intro" to the search?
- d.
If the copyright license has been followed, "-wikipedia" should exclude all
clones. However, material is often copied without crediting it to
Wikipedia.
Fred
User:Fred Bauder
> The "Google test" used to be a tool for checking the notability of a
> subject
> or to find sources about it. For some la
The "Google test" used to be a tool for checking the notability of a subject
or to find sources about it. For some languages it may be also used for
other purposes - for example in Hebrew, the spelling of which is not
established so well, it is very frequently used for finding the most common
spell