bug#52338: Crawler bots are downloading substitutes

2021-12-09 Thread Mathieu Othacehe


Hello Leo,

> +   (nginx-location-configuration
> + (uri "/robots.txt")
> + (body
> +   (list
> + "add_header  Content-Type  text/plain;"
> + "return 200 \"User-agent: *\nDisallow: /nar/\n\";"))

Nice, the bots are also accessing the Cuirass web interface, do you
think it would be possible to extend this snippet to prevent it?

Thanks,

Mathieu





bug#52338: Crawler bots are downloading substitutes

2021-12-09 Thread Tobias Geerinckx-Rice via Bug reports for GNU Guix

Mathieu Othacehe wrote:

> Hello Leo,
>
> > +   (nginx-location-configuration
> > + (uri "/robots.txt")


It's a micro-optimisation, but it can't hurt to generate ‘location 
= /robots.txt’ instead of ‘location /robots.txt’ here.



> > + (body
> > +   (list
> > + "add_header  Content-Type  text/plain;"
> > + "return 200 \"User-agent: *\nDisallow: /nar/\n\";"))


Use \r\n instead of \n, even if \n happens to work.

There are many ‘buggy’ crawlers out there.  It's in their own 
interest to be fussy whilst claiming to respect robots.txt.  The 
less you deviate from the most basic norm imaginable, the better.


I tested whether embedding raw \r\n bytes in nginx.conf strings 
like this works, and it seems to, even though a human would 
probably not do so.


> Nice, the bots are also accessing the Cuirass web interface, do you
> think it would be possible to extend this snippet to prevent it?


You can replace ‘/nar/’ with ‘/’ to disallow everything:

 Disallow: /

If we want crawlers to index only the front page (so people can 
search for ‘Guix CI’, I guess), that's possible:


 Disallow: /
 Allow: /$

Don't confuse ‘$’ with ‘supports regexps’.  Buggy bots might fall 
back to ‘Disallow: /’.
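
To make the `Disallow: /` plus `Allow: /$` combination concrete, here is a minimal Python sketch of how a well-behaved crawler might evaluate it. It assumes longest-match precedence with `Allow` winning ties, which is how Google documents its own crawler; other crawlers may behave differently. The function names are hypothetical and not taken from any crawler.

```python
import re


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regular expression.

    Only two special characters are handled: '*' (any run of characters)
    and a trailing '$' (end of path). Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (directive, pattern) pairs for one user agent.
    Assumption: the longest matching pattern wins, and Allow beats
    Disallow when two matching patterns have the same length.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if _pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


RULES = [("Disallow", "/"), ("Allow", "/$")]

front_page_ok = is_allowed(RULES, "/")
nar_ok = is_allowed(RULES, "/nar/abc")
```

A crawler that instead treats `/$` as a literal prefix never matches it, so only `Disallow: /` applies. That is the fallback behaviour described above.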


This is where it gets ugly: nginx doesn't support escaping ‘$’ in 
strings.  At all.  It's insane.


 # stackoverflow.com/questions/57466554
 geo $dollar { default "$"; }

 server {
   location = /robots.txt {
     return 200
       "User-agent: *\r\nDisallow: /\r\nAllow: /$dollar\r\n";
   }
 }


*Obviously.*
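
If the directive were generated rather than written by hand, the two quirks above (CRLF line endings and the `$dollar` workaround) could be applied mechanically. The following Python sketch is only an illustration: `nginx_robots_return` is a hypothetical helper, and it assumes a `geo $dollar` block like the one above exists in the same nginx configuration.

```python
def nginx_robots_return(lines, dollar_var="$dollar"):
    """Render an nginx `return 200 "...";` directive for a robots.txt body.

    Each line is terminated with the escape sequence \\r\\n, and every
    literal '$' is replaced by `dollar_var`, which is assumed to be an
    nginx variable that expands to '$'.
    Limitation: a '$' directly followed by a letter, digit or underscore
    would be read by nginx as part of a longer variable name.
    """
    rendered = []
    for line in lines:
        if '"' in line or "\\" in line:
            # Quoting rules for these characters are out of scope here.
            raise ValueError("quotes and backslashes are not supported")
        rendered.append(line.replace("$", dollar_var) + "\\r\\n")
    return 'return 200 "' + "".join(rendered) + '";'


directive = nginx_robots_return(["User-agent: *", "Disallow: /", "Allow: /$"])
print(directive)
```

The printed directive is the same text as the `return` statement in the configuration above, written on a single line.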

An alternative to that is to serve a real on-disc robots.txt.

Kind regards,

T G-R




bug#52393: pdf links not clickable when zathura is launched via xdg-open

2021-12-09 Thread bdju
guix (GNU Guix) f199427c1b6dd8e3428e25d4e15f604b3c90a3b7
Guix System

works: zathura file.pdf
doesn't work: xdg-open file.pdf

Both launch zathura since it's my default reader. In the xdg-open case,
clicking a link in a pdf seems to not do anything. If I launch with
zathura explicitly, it opens the link in my existing instance of
qutebrowser as I would expect.
No errors related to clicking the link seem to show up in a terminal.
When launched either way I do get a lot of errors saying
```
error: plugin: filetype already registered: application/pdf
```
and then similar ones for oxps, epub+zip, xml, etc. as well as a message
that libdjvu.so cannot be registered as a plugin. These seem like
non-fatal errors and are probably unrelated, but I thought I would
mention them anyway.





bug#52393: pdf links not clickable when zathura is launched via xdg-open

2021-12-09 Thread bdju
In my mimeapps.list file I have zathura set like this:
application/pdf=org.pwmt.zathura.desktop
application/octet-stream=org.pwmt.zathura.desktop;
application/pdf=org.pwmt.zathura-pdf-mupdf.desktop;

I noticed there are two other zathura.desktop files on my system.
~/.guix-profile/share/applications/org.pwmt.zathura.desktop
~/.guix-profile/share/applications/org.pwmt.zathura-djvu.desktop
~/.guix-profile/share/applications/org.pwmt.zathura-pdf-mupdf.desktop

Perhaps I'm using the wrong one here. I also noticed I have two
application/pdf lines, but the first one is from the Default
Applications section and the second is from Added Associations.

Deleting the first line then makes xdg-open use a browser to open the
pdf, so I don't think that's the right way to go.

I tried also copying the later application/pdf line and replacing the
first one in the file with it, so this way it's still in the file in
both sections. With this, the file opens in zathura again, but clicking
links still does nothing. I'm not sure of the significance of the
semicolon at the end. Some of my lines have it, and some don't.

I'll attach my mimeapps.list file in case it's helpful to look over in
its entirety. (file has been reverted to the state it was at the start
of troubleshooting since none of the changes so far fixed the issue)
[Default Applications]
x-scheme-handler/http=org.qutebrowser.qutebrowser.desktop
x-scheme-handler/https=org.qutebrowser.qutebrowser.desktop
x-scheme-handler/ftp=filezilla.desktop
x-scheme-handler/chrome=org.qutebrowser.qutebrowser.desktop
text/html=org.qutebrowser.qutebrowser.desktop
application/x-extension-htm=org.qutebrowser.qutebrowser.desktop
application/x-extension-html=org.qutebrowser.qutebrowser.desktop
application/x-extension-shtml=org.qutebrowser.qutebrowser.desktop
application/xhtml+xml=org.qutebrowser.qutebrowser.desktop
application/x-extension-xhtml=org.qutebrowser.qutebrowser.desktop
application/x-extension-xht=org.qutebrowser.qutebrowser.desktop
image/png=sxiv-usercreated-0.desktop;sxiv.desktop;
image/jpg=sxiv.desktop
image/gif=mpv.desktop
image/jpeg=sxiv.desktop
image/webp=sxiv.desktop
video/webm=mpv.desktop
application/pdf=org.pwmt.zathura.desktop
x-scheme-handler/magnet=userapp-transmission-gtk-W27CS0.desktop
inode/directory=ranger.desktop

[Added Associations]
x-scheme-handler/http=org.qutebrowser.qutebrowser.desktop
x-scheme-handler/https=org.qutebrowser.qutebrowser.desktop
x-scheme-handler/ftp=filezilla.desktop
x-scheme-handler/chrome=org.qutebrowser.qutebrowser.desktop
text/html=org.qutebrowser.qutebrowser.desktop
application/x-extension-htm=org.qutebrowser.qutebrowser.desktop
application/x-extension-html=org.qutebrowser.qutebrowser.desktop
application/x-extension-shtml=org.qutebrowser.qutebrowser.desktop
application/xhtml+xml=org.qutebrowser.qutebrowser.desktop
application/x-extension-xhtml=org.qutebrowser.qutebrowser.desktop
application/x-extension-xht=org.qutebrowser.qutebrowser.desktop
application/octet-stream=org.pwmt.zathura.desktop;
application/pdf=org.pwmt.zathura-pdf-mupdf.desktop;
x-scheme-handler/magnet=userapp-transmission-gtk-W27CS0.desktop;
text/plain=org.xfce.mousepad.desktop;mousepad.desktop;nvim.desktop;
application/json=mousepad.desktop;
text/x-csrc=mousepad.desktop;
image/png=sxiv-usercreated-0.desktop;
text/markdown=org.xfce.mousepad.desktop;
application/x-shellscript=org.xfce.mousepad.desktop;
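
On the semicolon question raised above: in the freedesktop.org MIME applications format, each value is a semicolon-separated list of desktop file IDs, so a trailing semicolon only terminates the list and `a.desktop` and `a.desktop;` mean the same thing. The Python sketch below parses a shortened excerpt of the file to show this. `parse_mimeapps` is a hypothetical helper, not part of xdg-utils, and the sample is trimmed for brevity.

```python
import configparser

SAMPLE = """\
[Default Applications]
application/pdf=org.pwmt.zathura.desktop
image/png=sxiv-usercreated-0.desktop;sxiv.desktop;

[Added Associations]
application/pdf=org.pwmt.zathura-pdf-mupdf.desktop;
"""


def parse_mimeapps(text):
    """Return {section: {mime type: [desktop file IDs]}} for a mimeapps.list."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep MIME type keys exactly as written
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        result[section] = {
            # Split on ';' and drop the empty item left by a trailing ';'.
            mime: [entry for entry in value.split(";") if entry]
            for mime, value in parser[section].items()
        }
    return result


parsed = parse_mimeapps(SAMPLE)
for section, associations in parsed.items():
    print(section, associations)
```

Keeping the two sections separate in the result makes it visible that the two `application/pdf` lines are not duplicates: one sits in `[Default Applications]` and the other in `[Added Associations]`.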


bug#43166: The issues.guix.gnu.org is hard to read in emacs-w3m.

2021-12-09 Thread Ricardo Wurmus
This is now fixed in mumi.  I tested the change in eww and in icecat.

This was easier to implement than when the bug was first reported.  Due
to later developments I could limit the use of “pre” to lines in the
“diff” context, so that message text can still be reflown.

Sorry for the delay, but there was no way for me to take enough time to
work on mumi before.  You are all still more than welcome to contribute
to mumi.

I’m reconfiguring the server hosting issues.guix.gnu.org now, so this
change will go live as soon as that’s done.

-- 
Ricardo





bug#52393: pdf links not clickable when zathura is launched via xdg-open

2021-12-09 Thread bdju
Okay, disregard. It seems a double-click is required rather than a
single click, and it takes about 5 seconds for anything to happen. I was
possibly clicking a different number of times between tests. Maybe I was
more desperate when I launched with zathura explicitly. It seems to work
the same both ways now.





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-09 Thread Ludovic Courtès
Hello!

Maxim Cournoyer writes:

> 374   connect(11, {sa_family=AF_UNIX, 
> sun_path="/var/run/dbus/system_bus_socket"}, 34) = 0

[...]

> 374   epoll_wait(5, [{events=EPOLLIN|EPOLLOUT|EPOLLHUP, data={u32=24802800, 
> u64=24802800}}], 20, -1) = 1
> 374   sendmsg(11, {msg_name=NULL, msg_namelen=0, 
> msg_iov=[{iov_base="l\1\0\1\0\0\0\0\1\0\0\0m\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\5\0\0\0Hello\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0",
>  iov_len=128}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 
> MSG_DONTWAIT|MSG_NOSIGNAL) = -1 EPIPE (Broken pipe)
> 374   gettid()  = 374
> 374   epoll_ctl(5, EPOLL_CTL_MOD, 11, {events=0, data={u32=24802800, 
> u64=24802800}}) = 0
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   gettid()  = 374
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   gettid()  = 374
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   epoll_ctl(5, EPOLL_CTL_DEL, 11, NULL) = 0
> 374   close(11) = 0
> 374   gettid()  = 374
> 374   epoll_wait(5,  
> 391   <... close resumed>)  = 0
> 391   madvise(0x7fd6c83dc000, 8368128, MADV_DONTNEED) = 0
> 391   exit(0)   = ?
> 391   +++ exited with 0 +++
> 374   <... epoll_wait resumed>[{events=EPOLLERR, data={u32=24768000, 
> u64=24768000}}], 17, -1) = 1
> 374   lseek(7, 0, SEEK_SET) = 0
> 374   read(7, "tty7\n", 63) = 5

As you pointed out on IRC, the initial ‘Hello’ method call above leads
to EPIPE, and we can see that elogind eventually closes its socket to
dbus-daemon *but* keeps doing its thing.
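
For readers unfamiliar with the EPIPE in the trace: it is what a sender sees when it writes to a stream socket whose peer has already closed the connection. The Python sketch below reproduces only that mechanism on a local socket pair, assuming Linux semantics. It says nothing about why dbus-daemon closed its end, and `send_after_peer_close` is a hypothetical name.

```python
import errno
import socket


def send_after_peer_close():
    """Send on a connected socket whose peer has closed; return the errno."""
    ours, peer = socket.socketpair()  # a connected pair of stream sockets
    peer.close()  # the other side goes away, as dbus-daemon's end did
    try:
        ours.send(b"Hello")
    except OSError as exc:
        return exc.errno
    finally:
        ours.close()
    return None


result = send_after_peer_close()
print(errno.errorcode.get(result, result))
```

The sender only learns about the closed peer when it tries to write, which matches the order of events in the strace output: connect succeeds, and the failure appears at sendmsg.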

Some interesting things to note…

First, to my surprise, elogind does not use the client library of the
‘dbus’ package:

--8<---cut here---start->8---
$ guix gc --references $(./pre-inst-env guix build elogind)|grep dbus
$ echo $?
1
--8<---cut here---end--->8---

(This is already the case in ‘master’ with v243.7.)  Instead, it has its
own implementation of the DBus protocol, in C, from systemd—we can’t
have enough sources of bugs and vulnerabilities.

Anyway, the “Hello” message is sent to the system bus asynchronously in
‘sd-bus.c’:

--8<---cut here---start->8---
static int bus_send_hello(sd_bus *bus) {
        _cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL;
        int r;

        assert(bus);

        if (!bus->bus_client)
                return 0;

        r = sd_bus_message_new_method_call(
                        bus,
                        &m,
                        "org.freedesktop.DBus",
                        "/org/freedesktop/DBus",
                        "org.freedesktop.DBus",
                        "Hello");
        if (r < 0)
                return r;

        return sd_bus_call_async(bus, NULL, m, hello_callback, NULL, 0);
}
--8<---cut here---end--->8---

A callback is called when a reply is received or an error arises:

--8<---cut here---start->8---
static int hello_callback(sd_bus_message *reply, void *userdata, sd_bus_error *error) {

[...]

fail:
        /* When Hello() failed, let's propagate this in two ways: first we return the error immediately here,
         * which is the propagated up towards the event loop. Let's also invalidate the connection, so that
         * if the user then calls back into us again we won't wait any longer. */

        bus_set_state(bus, BUS_CLOSING);
        return r;
}
--8<---cut here---end--->8---

It’s not clear from that whether the authors intended for the thing to
keep going in case of failure.  In our case it’s not helpful.

But why does dbus-daemon drop the connection in the first place?

To know that, we could change ‘dbus-root-service-type’ to run
dbus-daemon from a ‘--enable-verbose-mode’ build, and with the
‘DBUS_VERBOSE’ environment variable set to 1.

Looking at ‘dbus-server-socket.c’ it would seem that t

bug#52375: webkitgtk page crashes on core-updates-frozen

2021-12-09 Thread Jack Hill

On Wed, 8 Dec 2021, Maxim Cournoyer wrote:


> Hello!
>
> Would you be able to run it in GDB to gather a backtrace?  That may
> provide clues.
>
> Thanks!


Yes! I was able to get the following backtrace. Thanks to hikiko [0]
for tips on using GDB with WebKit browsers.


[0] https://eleni.mutantstargoat.com/hikiko/webkit-gdb/


$ gdb .midori-real 1979
GNU gdb (GDB) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from .midori-real...
(No debugging symbols found in .midori-real)
Attaching to program: 
/gnu/store/kz9inh53yvpzxv818jq6naiwp6ms85l0-midori-9.0/bin/.midori-real, 
process 1979
[New LWP 1981]
[New LWP 1983]
[New LWP 1984]
[New LWP 1985]
[New LWP 1986]
[New LWP 1987]
[New LWP 1988]
[New LWP 1992]
[New LWP 1993]
[New LWP 1994]
[New LWP 2001]

warning: Unable to find libthread_db matching inferior's thread library, thread 
debugging will not be available.
0x7f70beab5d6f in poll () from 
/gnu/store/2fk1gz2s7ppdicynscra9b19byrrr866-glibc-2.33/lib/libc.so.6
(gdb) c
Continuing.
[New LWP 2395]
[New LWP 2411]

Thread 1 "WebKitWebProces" received signal SIGABRT, Aborted.
0x7f70bea04030 in raise () from 
/gnu/store/2fk1gz2s7ppdicynscra9b19byrrr866-glibc-2.33/lib/libc.so.6
(gdb) bt
#0  0x7f70bea04030 in raise () from 
/gnu/store/2fk1gz2s7ppdicynscra9b19byrrr866-glibc-2.33/lib/libc.so.6
#1  0x7f70be9ee526 in abort () from 
/gnu/store/2fk1gz2s7ppdicynscra9b19byrrr866-glibc-2.33/lib/libc.so.6
#2  0x7f70c4089a52 in WebCore::makeGStreamerElement(char const*, char 
const*) [clone .cold] () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#3  0x7f70c63be439 in 
WebCore::MediaPlayerPrivateGStreamer::createVideoSink() () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#4  0x7f70c63c351f in 
WebCore::MediaPlayerPrivateGStreamer::createGSTPlayBin(WTF::URL const&) () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#5  0x7f70c63c442b in WebCore::MediaPlayerPrivateGStreamer::load(WTF::String 
const&) () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#6  0x7f70c5c7e19c in 
WebCore::MediaPlayer::loadWithNextMediaEngine(WebCore::MediaPlayerFactory 
const*) () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#7  0x7f70c5c7e76b in WebCore::MediaPlayer::load(WTF::URL const&, 
WebCore::ContentType const&, WTF::String const&) ()
   from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#8  0x7f70c570d73d in WebCore::HTMLMediaElement::loadResource(WTF::URL const&, 
WebCore::ContentType&, WTF::String const&) ()
   from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#9  0x7f70c570e3e1 in WebCore::HTMLMediaElement::loadNextSourceChild() () 
from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#10 0x7f70c54fc3c2 in WebCore::EventLoop::run() () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#11 0x7f70c558a80d in WebCore::WindowEventLoop::didReachTimeToRun() () from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#12 0x7f70c5bdaadc in WebCore::ThreadTimers::sharedTimerFiredInternal() () 
from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libwebkit2gtk-4.0.so.37
#13 0x7f70c2a830f5 in 
WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::{lambda(void*)#1}::_FUN(void*)
 ()
   from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libjavascriptcoregtk-4.0.so.18
#14 0x7f70c2a8333f in WTF::RunLoop::{lambda(_GSource*, int (*)(void*), 
void*)#1}::_FUN(_GSource*, int (*)(void*), void*) ()
   from 
/gnu/store/77vf3q1v7aa8h2av7s10fn95b0crh7zc-webkitgtk-with-libsoup2-2.34.1/lib/libjavascriptcoregtk-4.0.so.18
#15 0x7f70bef5236f in g_main_context_dispatch () from 
/gnu/store/qqs98rxwjrji6aaf6dqwp7q4m545g2sn-glib-2.70.0/lib/libglib-2.0.so.0
#16 0x7f70bef526e8 in g_main_context_iterate.co

bug#51787: GC takes more than 9 hours on berlin

2021-12-09 Thread Mathieu Othacehe


Hey,

New GC recap. The process that was started yesterday at 04:00 is
still running. I killed the GC that was started today at 04:00 to keep
things clear.

From yesterday 11:00 when I started monitoring it to today when I'm
writing this email, 20 hours have elapsed and the GC is still in the
same phase: removing recursively the /gnu/store/trash directory content.

It corresponds to the following snippet for those of you who would like
to have a look at the corresponding code:

--8<---cut here---start->8---
if (state.shouldDelete) {
    if (pathExists(state.trashDir)) deleteGarbage(state, state.trashDir); // > 20 hours
    try {
        createDirs(state.trashDir);
    } catch (SysError & e) {
        if (e.errNo == ENOSPC) {
            printMsg(lvlInfo, format("note: can't create trash directory: %1%") % e.msg());
            state.moveToTrash = false;
        }
    }
}
--8<---cut here---end--->8---

This is an early phase of garbage collection, where store items that
were moved to the trash directory by previous GC runs are actually
removed.

Stracing the guix-daemon process associated with the GC process clearly
shows what's going on:

--8<---cut here---start->8---
chmod("/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/rust-syntex-0.58.1",
 040755) = 0 <0.12>
openat(AT_FDCWD, 
"/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/rust-syntex-0.58.1",
 O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 13 <0.11>
fstat(13, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.07>
getdents64(13, 0x397a510 /* 3 entries */, 32768) = 80 <0.005059>
getdents64(13, 0x397a510 /* 0 entries */, 32768) = 0 <0.07>
close(13)   = 0 <0.08>
statx(AT_FDCWD, 
"/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/rust-syntex-0.58.1/index.html",
 AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_MODE|STATX_NLINK|STATX_SIZE, 
{stx_mask=STATX_BASIC_STATS|0x1000, stx_attributes=0, stx_mode=S_IFREG|0444, 
stx_size=10265, ...}) = 0 <0.23>
unlink("/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/rust-syntex-0.58.1/index.html")
 = 0 <0.13>
rmdir("/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/rust-syntex-0.58.1")
 = 0 <0.28>
statx(AT_FDCWD, 
"/gnu/store/trash/272ibwb38i0kcbcl3n9v0ka1rsmd1104-guix-web-site/de/packages/lofreq-2.1.5",
 AT_STATX_
--8<---cut here---end--->8---

Several syscalls are involved in cleaning the trash directory: chmod,
openat, statx, unlink and rmdir. This does not seem particularly wrong.
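
To illustrate why this pattern can be slow even though each call is cheap, here is a small Python sketch of a recursive delete that follows the same sequence (inspect, chmod, list, unlink, rmdir) and counts the file-system calls it issues. It is not the daemon's `deleteGarbage`; `delete_tree` and `demo` are hypothetical helpers, and the point is only that the work grows with the number of entries rather than with their size.

```python
import os
import stat
import tempfile


def delete_tree(path):
    """Delete `path` recursively and return how many file-system calls were made."""
    calls = 1
    st = os.lstat(path)  # inspect the entry without following symlinks
    if stat.S_ISDIR(st.st_mode):
        os.chmod(path, 0o755)  # make the directory writable before emptying it
        calls += 1
        with os.scandir(path) as entries:  # list the directory contents
            names = [entry.name for entry in entries]
        calls += 1
        for name in names:
            calls += delete_tree(os.path.join(path, name))
        os.rmdir(path)  # the directory is empty now
        calls += 1
    else:
        os.unlink(path)
        calls += 1
    return calls


def demo():
    """Build a tiny tree resembling one trash entry and delete it."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, "rust-syntex-0.58.1")
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "index.html"), "w") as handle:
        handle.write("<html></html>")
    calls = delete_tree(root)
    return calls, os.path.exists(root)


calls, still_exists = demo()
print(calls, still_exists)
```

A store item such as guix-web-site, with one small directory per package and language, multiplies this per-entry cost many times while freeing very little space per call.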

What is problematic though is that in 20 hours, the free space has
only increased from 9.6T to 9.7T in the store partition. As the GC lock is
preventing most of Berlin services from running, almost all the machine
IO is dedicated to removing this directory, as shown by iotop.
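
To put a rate on those figures, the following Python sketch converts them into space reclaimed per hour and per second. It assumes that "T" means TiB (as `df -h` would report) and takes the numbers at face value; since both are rounded to one decimal place, the result is only an order of magnitude.

```python
def reclaim_rate(before_tib, after_tib, hours):
    """Return (GiB per hour, MiB per second) of free space gained."""
    freed_gib = (after_tib - before_tib) * 1024
    gib_per_hour = freed_gib / hours
    mib_per_second = gib_per_hour * 1024 / 3600
    return gib_per_hour, mib_per_second


gib_per_hour, mib_per_second = reclaim_rate(9.6, 9.7, 20)
print(f"{gib_per_hour:.2f} GiB/h, {mib_per_second:.2f} MiB/s")
```

Under these assumptions the deletion reclaims roughly 5 GiB per hour, or about 1.5 MiB per second, which is consistent with a workload dominated by per-file metadata operations rather than by data volume.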

I'm not sure I understand why this removal is taking so long, but if
someone has an idea, I'm all ears. In the meantime, I plan to let the GC
run and keep monitoring it.

Thanks,

Mathieu





bug#52266: (no subject)

2021-12-09 Thread opalvaults (ry)
Probably best to close this as I won't be able to get back around to 
testing. I'm running into sof-firmware issues so I gotta tackle that. 
Thank you for your time!