pypubsub uses a slightly different protocol than svnpubsub, so you shouldn't 
really be testing against it (unless you want to support both protocols, in 
which case that'd be awesome - https://infra.apache.org/pypubsub.html can help 
you there).

The svnpubsub service we have is still located at svn.apache.org:2069.


On 2024/11/28 21:47:22 Daniel Sahlberg wrote:
> On 2023/07/03 09:37:01 Yasuhito FUTATSUKI wrote:
> > Hi,
> > 
> > On 2023/07/03 3:17, James McCoy wrote:
> > > Hi all,
> > > 
> > > Python 3.6 deprecated the asyncore and asynchat modules in favor of the
> > > asyncio module.  Per [PEP 594], that means that the modules
> > > will be removed in Python 3.12, thus breaking svnpubsub/client.py.
> > > 
> > > Is there someone familiar enough with these Python modules to work on
> > > updating svnpubsub?  What's our policy on minimum supported Python
> > > version for the tools?
> > 
> > Before it, nobody check (and/or report) that the scripts in
> > tools/server-side/svnpubsub can run with any version of Python 3.x.
> > 
> > https://cwiki.apache.org/confluence/display/SVN/Subversion%27s+Python+3+Support+Status
> > 
> > It seems that those tools in the section "Not Checked For Python 3 Yet"
> > are now unmantained.
> > 
> > > [PEP 594]: https://peps.python.org/pep-0594/#deprecated-modules
> > > 
> > > Cheers,
> > 
> > Cheers,
> > -- 
> > Yasuhito FUTATSUKI <futat...@yf.bsdclub.org>
> > 
> 
> I took some time today to try to figure out how it all connects. I made a 
> minimal client which I believe should be correct:
> [[[
> def cc(o):
>   print("commit " + repr(o))
> 
> def ec(url, type, o):
>   print("event on " + repr(url) + ": " + repr(type) + "\n" + repr(o))
> 
> def mc(o):
>   print("metadata " + repr(o))
> 
> import client
> c=client.MultiClient(["http://pubsub.apache.org:2069";], cc, ec, mc)
> c.run_forever()
> ]]]
> 
> Running on Python 3.11.2, I only get the following output:
> [[[
> event on 'http://pubsub.apache.org:2069': 'connected'
> None
> event on 'http://pubsub.apache.org:2069': 'closed'
> None
> event on 'http://pubsub.apache.org:2069': 'connected'
> None
> event on 'http://pubsub.apache.org:2069': 'closed'
> None
> ...
> ]]]
> 'closed' come immediately after 'connected'. Between each pair there is 
> approximately 30 seconds
> 
> I also tries settingup svnwcsub.py/svnwcsub.conf and I got it running but 
> with the same events as above.
> 
> Running curl on the same endpoint print a "stillalive" event every 5 seconds 
> (with the occational actual event inbetween):
> [[[
> $ curl http://pubsub.apache.org:2069
> {"stillalive": 1732781004.3811345}
> {"stillalive": 1732781009.3833125}
> ]]]
> 
> Comparing the svnpubsub and curl network traffic using tcpdump, I noted that 
> svnpubsub use HTTP/1.0 while curl use HTTP/1.1. I also saw some differences 
> between the format actual on-the-wire-format by pubsub.apache.org 
> (Transfer-Encoding: chunked, so each record is a tuple consisting of [length] 
> and [json] separated and terminated by \r\n) and the expected formt (record 
> separator \0 and no length field).
> 
> Checking the code for pypubsub [1] it seems the \0 can be added depending on 
> the User-Agent header but it seems to always use Transfer-Encoding: chunked.
> 
> I've tested the below changes:
> 
> [[[
> Index: client.py
> ===================================================================
> --- client.py   (revision 1922187)
> +++ client.py   (working copy)
> @@ -97,7 +97,8 @@
>        self.handle_error()
>        return
> 
> -    self.push(('GET %s HTTP/1.0\r\n\r\n' % resource).encode('ascii'))
> +    self.push(('GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' %
> +              (resource, host + ":" + str(port))).encode('ascii'))
> 
>    def handle_connect(self):
>      self.event_callback('connected', None)
> @@ -113,12 +114,20 @@
>    def found_terminator(self):
>      if self.skipping_headers:
>        self.skipping_headers = False
> -      # Each JSON record is terminated by a null character
> -      self.set_terminator(b'\0')
> +      # pypubsub is using Transfer-Encoding: chunked so each message is
> +      # 0xLength\r\n{json...}\n\r\n
> +      #             ^^^^^^^^^^ <- This is the Length
> +      # Set terminator to the trailing \n plus the second \r\n to catch the
> +      # whole message
> +      self.set_terminator(b'\n\r\n')
>      else:
> -      record = b"".join(self.ibuffer)
> +      record = b"".join(self.ibuffer).decode()
>        self.ibuffer = []
> -      self.parser.feed(record.decode())
> +      # Cut out the length before parsing the json
> +      index = record.find('\r\n')
> +      if index:
> +        record = record[index+2:]
> +      self.parser.feed(record)
> 
>    def collect_incoming_data(self, data):
>      # Remember the last time we saw activity
> ]]]
> 
> With this, the client seems to work. I'd appreciate if someone with Python 
> knowledge can review it before I commit.
> 
> The next step will be to rewrite for asyncio, but I'd like to have a 
> functional client as a baseline.
> 
> Cheers,
> Daniel
> 
> [1] https://github.com/apache/infrastructure-pypubsub/blob/master/pypubsub.py
> 

Reply via email to