Re: How to create a file on users XP desktop

2007-10-08 Thread Julius
On Monday 08 October 2007 17:11:25 Tim Golden wrote:
> [EMAIL PROTECTED] wrote:
> > On Oct 8, 9:19 am, goldtech <[EMAIL PROTECTED]> wrote:
> >>> from win32com.shell import shell, shellcon
> >>> desktop = shell.SHGetFolderPath (0, shellcon.CSIDL_DESKTOP, 0, 0)
> >>> 
> >>
> >> Tim,
> >>
> >> How did you learn Win32com?
> >>
> >> Other than the O'Reilly book, I've never found a lot of
> >> documentation.
> >>
> >> Trying to browse COM in PythonWin is tough - there's tons of stuff in
> >> there. I've never been able to find the Win32com classes, methods,
> >> usage examples when I browse COM in PythonWin.
> >>
> >> For example where is, shell.SHGetFolderPath and shellcon.CSIDL_DESKTOP
> >> officially documented?
> >>
> >> Did you learn from using Visual C++ or VB? How did you learn this
> >> stuff?
> >>
> >> Thanks,
> >> Lee G.
> >
> > Pretty much the only place to learn stuff that's not in the PyWin32
> > docs is on one of the MSDN sites. Yes, that can suck. Here's the
> > general page: http://msdn2.microsoft.com/en-us/default.aspx
> >
> > You can google for them too to get direct links to the MSDN page.
> >
> > The ActiveState Python (AKA ActivePython) has an IDE that allows you
> > to browse the COM module. It also has a help file that allows you to
> > browse the PyWin32 docs locally. I think you can download that without
> > downloading ActivePython.
> >
> > Mike
>
> FWIW, the pywin32 distribution itself also comes with a local
> .chm file. But aside from that, there have been several abortive
> attempts -- including by Mike & myself! -- to get some kind of
> online help going for pywin32, but nothing's really gained traction,
> and we've all got more interesting things to be doing...
>
> One point to bear in mind is that, more or less, the pywin32 stuff
> just wraps the MS API really closely, mostly doing just enough
> of the messy plumbing to present the API "objects" as Python
> objects. That's to say: find out how to do it from a C++, VB
> or Delphi tutorial, and translating it into Python often isn't hard.
>
> As it happens I've been using Windows APIs for a few years,
> so I have a bit of a head start. But I've answered quite
> a few questions on python-win32 by putting the subject line
> into Google, picking a likely-looking response and translating
> it into Python.
>
> In this case ("How to create a file on users XP desktop") the
> question was too broad and tended to throw up user-oriented
> answers. I tried a few permutations, including limiting the
> search to msdn.microsoft.com, none of which showed much on the
> first couple of pages. A search of the pywin32.chm files does
> point in the right direction, but the fact is that the shell
> functionality exposed by Windows which does this kind of
> stuff is non-intuitive.
>
> While I think everyone agrees that the Windows side of Python
> could benefit from more and better docs, the general answer to
> "How do I do X in Python under Windows?" is: "How do I do X under Windows?"
>
> TJG
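
Putting the pieces of this thread together, a minimal sketch of creating a
file on the user's desktop. The pywin32 call is the one quoted above; the
fallback branch and the "hello.txt" name are illustrative assumptions so
the sketch also runs where pywin32 isn't available:

```python
import os

def desktop_path():
    # The canonical Windows call is SHGetFolderPath with CSIDL_DESKTOP,
    # exactly as in the pywin32 snippet quoted above. The except branch
    # is only a portable stand-in (assumes the conventional ~/Desktop).
    try:
        from win32com.shell import shell, shellcon  # pywin32, Windows only
        return shell.SHGetFolderPath(0, shellcon.CSIDL_DESKTOP, 0, 0)
    except ImportError:
        return os.path.join(os.path.expanduser("~"), "Desktop")

# Build a path for a new file on the user's desktop (not written here).
path = os.path.join(desktop_path(), "hello.txt")
```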

-- 
http://mail.python.org/mailman/listinfo/python-list


Urllib.request vs. Requests.get

2021-12-07 Thread Julius Hamilton
Hey,

I am currently working on a simple program which scrapes text from webpages
via a URL, then segments it (with Spacy).

I’m trying to refine my program to use just the right tools for the job,
for each of the steps.

Requests.get works great, but I've seen people use urllib.request.urlopen()
in some examples. It appealed to me because it seemed lower level than
requests.get, so it makes the program feel leaner, purer, and more
direct.

However, requests.get works fine on this url:

https://juno.sh/direct-connection-to-jupyter-server/

But urllib returns a “403 forbidden”.

Could anyone please comment on what the fundamental differences are between
urllib vs. requests, why this would happen, and if urllib has any option to
prevent this and get the page source?

Thanks,
Julius
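
One likely culprit here is urllib's default User-Agent header. A sketch of
the workaround (the header value below is illustrative, and the network
call itself is left commented out):

```python
import urllib.request

url = "https://juno.sh/direct-connection-to-jupyter-server/"

# urllib sends a default User-Agent of "Python-urllib/3.x", which some
# servers reject with 403 Forbidden; requests sends its own identifier,
# which this server happens to accept. Supplying a browser-like header
# usually gets urllib through as well.
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
# html = urllib.request.urlopen(req).read().decode("utf-8")  # network call
```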


HTML extraction

2021-12-07 Thread Julius Hamilton
Hey,

Could anyone please comment on the purest way to simply strip HTML tags
and keep the text they surround?

I know Beautiful Soup is a convenient tool, but I’m interested to know what
the most minimal way to do it would be.

People say you usually don't use regex for a non-regular language like
HTML, so I was thinking about using XPath or lxml, which seem like very
pure, universal tools for the job.

I did find an example for doing this with the re module, though.

Would it be fair to say that to just strip the tags, Regex is fine, but you
need to build a tree-like object if you want the ability to select which
nodes to keep and which to discard?

Can xpath / lxml do that?

What are the chief differences between xpath / lxml and Beautiful Soup?

Thanks,
Julius
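
A minimal sketch using only the standard library's html.parser, one of
several possible approaches (not necessarily the "purest"): subclass
HTMLParser and keep only the data between tags.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    # Minimal tag stripper: handle_data receives the text between tags;
    # the tags themselves are simply never collected.
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def text(self):
        return "".join(self.parts)

p = TextExtractor()
p.feed("<p>Hello <b>world</b></p>")
print(p.text())  # -> Hello world
```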


Short, perfect program to read sentences of webpage

2021-12-08 Thread Julius Hamilton
Hey,

This is something I have been working on for a very long time. It’s one of
the reasons I got into programming at all. I’d really appreciate if people
could input some advice on this.

This is a really simple program which extracts the text from webpages and
displays them one sentence at a time. It’s meant to help you study dense
material, especially documentation, with much more focus and comprehension.
I actually hope it can be of help to people who have difficulty reading. I
know it’s been of use to me at least.

This is a minimally acceptable way to pull it off currently:

deepreader.py:

import sys
import requests
import html2text
import nltk

url = sys.argv[1]

# Get the html, pull out the text, and sentence-segment it in one line of code
sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

# Activate an elementary reader interface for the text
for index, sentence in enumerate(sentences):

    # Print the sentence
    print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence + "\n")

    # Wait for user key-press
    x = input("\n> ")


EOF



That’s it.

A lot of refining is possible, and I’d really like to see how some more
experienced people might handle it.

1. The HTML extraction is not perfect. It doesn’t produce as clean text as
I would like. Sometimes random links or tags get left in there. And the
sentences are sometimes randomly broken by newlines.

2. Neither is the segmentation perfect. I am currently researching
developing an optimal segmenter with tools from Spacy.
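
For point 1, a rough cleanup pass can collapse the stray mid-sentence
newlines before segmenting. This is a sketch, not part of the program
above, and the regex approach is an assumption:

```python
import re

def clean(text):
    # Collapse single newlines inside paragraphs (a common html2text
    # artifact) while preserving blank-line paragraph breaks.
    paragraphs = re.split(r"\n\s*\n", text)
    return "\n\n".join(re.sub(r"\s*\n\s*", " ", p).strip() for p in paragraphs)

print(clean("A broken\nsentence.\n\nNext paragraph."))
```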

Brevity is greatly valued. Anything that makes the program work better is
hugely appreciated, but if someone can do it in very few lines of code,
that's especially appreciated.

Thanks very much,
Julius


Advanced ways to get object information from within python

2021-12-23 Thread Julius Hamilton
Hello,

I would like to significantly increase my abilities to find the information
I am seeking about any Python object I am using from within Python. I find
this to be a really essential skill set. After reading documentation, it
really helps to get under the hood at the command line and start testing
your own competence by examining all the methods and classes, and their
arguments and return types and so on.

I was hoping someone could help me fill in more details about what I
currently know.

I'd like to use Scrapy as an example, since it's a library I'm currently
learning.

import scrapy

I assume I'll start with "dir", as it's the most convenient.

dir(scrapy) shows this:

['Field', 'FormRequest', 'Item', 'Request', 'Selector', 'Spider',
'__all__', '__builtins__', '__cached__', '__doc__', '__file__',
'__loader__', '__name__', '__package__', '__path__', '__spec__',
'__version__', '_txv', 'exceptions', 'http', 'item', 'link',
'linkextractors', 'selector', 'signals', 'spiders', 'twisted_version',
'utils', 'version_info']

I wish there was a convenient way for me to know what all of these are. I
understand "dir" shows everything from the namespace - so that includes
methods which are present only because they are imported from other
modules, by this module.

Let's assume at minimum I know that I should be able to call all these
"attributes" (I believe this is what Python calls them - an attribute can
be anything, a method, a variable, etc. But then, how to distinguish
between this general notion of an "attribute" vs a specific attribute of a
class? Or is that called a "property" or something?)

I can confirm that every single name in the above list works when I call it
from scrapy, like this:

>>> scrapy.Field


>>> scrapy.utils


But I can't conveniently iterate over all of these to see all their types,
because dir() returns a list of strings. How can I iterate over all
attributes?

I can't use "getattr" because that requires you to enter the name of what
you're looking for. I would like to spit out all attributes with their
types, so I can know what my options are in more detail than dir() provides.
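
For what it's worth, getattr accepts exactly the strings that dir()
returns, so the two compose. A sketch with a stand-in object (scrapy is
not assumed to be installed here):

```python
import types

def describe(obj):
    # dir() yields attribute *names* as strings; getattr() resolves each
    # string back to the live attribute, whose type we can then inspect.
    return {name: type(getattr(obj, name)).__name__ for name in dir(obj)}

# Stand-in object with a couple of attributes of different types.
info = describe(types.SimpleNamespace(x=1, words=["a"]))
print(info["x"], info["words"])  # int list
```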

This is basically a dead-end for me until someone can illuminate this
strategy I'm pursuing, so now I'd like to focus on inspect and help.

inspect.getmembers is useful in principle, but I find the results to give
information overload.

This is just an excerpt of what it returns:

pprint.pprint(inspect.getmembers(scrapy))
[('Field', ),
 ('Selector', ),
 ('Spider', ),
 ('__all__',
  ['__version__',
   'version_info',
   'twisted_version',
   'Spider',

Why does it just list the name and type for some classes, but for others
goes on to a sublist? __all__ does not list any type in adjacent angle
brackets; it just goes on to list some attributes without any information
about what they are. Can I suppress sublists from being printed with
inspect.getmembers? Or can I recursively require sublists to also display
their type?

Lastly, the "help" function.

I find "help" to similarly be a situation of information overload. Again,
it starts with a list of "package contents". I'm not sure I see the use of
this long list of names, without much description of what they are. Next,
it lists "classes", but I don't understand:

builtins.dict(builtins.object)
scrapy.item.Field
parsel.selector.Selector(builtins.object)
scrapy.selector.unified.Selector(parsel.selector.Selector,
scrapy.utils.trackref.object_ref)

What determines the order of these classes - the order in which they appear
in the source code? What about the indentation? builtins.dict() is a Python
builtin. Then why is it listed inside of Scrapy's "help" - are all builtins
necessarily listed inside a class or just the builtins it specifically
imported or inherited?

My best guess is the most indented lines are what is actually written in
the class, the lines above are just listing the inheritance? So
scrapy.item.Field inherits the Python dictionary class, and it does this
because that way you can treat the class like a dictionary sometimes, using
dictionary methods and so on?
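
That guess matches the usual pattern: the indented names in the help
output are the method resolution order, so Field subclasses dict. A
hypothetical sketch of the same pattern (mirroring, not reproducing,
scrapy.item.Field):

```python
class Field(dict):
    # Hypothetical sketch: subclassing dict means every instance supports
    # ordinary dictionary access, exactly as the MRO in the help output
    # (Field -> builtins.dict -> builtins.object) suggests.
    pass

f = Field(default=0, serializer=str)
print(f["default"])          # plain dict lookup works on the subclass
print(isinstance(f, dict))   # True
```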

class Field(builtins.dict)
 |  Container of field metadata
 |
 |  Method resolution order:
 |  Field
 |  builtins.dict
 |  builtins.object
 |
 |  Data descriptors defined here:

What are data descriptors?

I understand IDEs tend to just print 

Re: Pandas or Numpy

2022-01-23 Thread Julius Hamilton
Hey,


I don't know for certain, but in case you don't get other good answers:
I'm pretty sure NumPy is more of a mathematical (numerical-array) library,
while Pandas is definitely aimed at handling spreadsheet-like tabular data.


So maybe both.


Julius

On Sun 23. Jan 2022 at 18:28, Chris Angelico  wrote:

> On Mon, 24 Jan 2022 at 04:10, Tobiah  wrote:
> >
> > I know very little about either.  I need to handle score input files
> > for Csound.  Each line is a list of floating point values where each
> > column has a particular meaning to the program.
> >
> > I need to compose large (hundreds, thousands, maybe millions) lists
> > and be able to do math on, or possibly sort by various columns, among
> other
> > operations.  A common requirement would be to do the same math operation
> > on each value in a column, or redistribute the values according to an
> > exponential curve, etc.
> >
> > One wrinkle is that the first column of a Csound score is actually a
> > single character.  I was thinking if the data types all had to be the
> > same, then I'd make a translation table or just use the ascii value
> > of the character, but if I could mix types that might be a smidge better.
> >
> > It seems like both libraries are possible choices.  Would one
> > be the obvious choice for me?
> >
>
> I'm not an expert, but that sounds like a job for Pandas to me. It's
> excellent at handling tabular data, and yes, it's fine with a mixture
> of types. Everything else you've described should work fine (not sure
> how to redistribute on an exponential curve, but I'm sure it's not
> hard).
>
> BTW, Pandas is built on top of Numpy, so it's kinda "both".
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>
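
A sketch of the Pandas approach ChrisA describes, with made-up score
values and column names. The point is that the single-character first
column and the float columns coexist, and whole-column math and sorting
are one-liners:

```python
import pandas as pd

# Hypothetical Csound score rows: ("statement", instrument, start, duration).
rows = [("i", 1.0, 0.0, 4.0), ("i", 1.0, 4.0, 2.0)]
df = pd.DataFrame(rows, columns=["stmt", "instr", "start", "dur"])

df["dur"] = df["dur"] * 0.5    # same math operation on every value in a column
df = df.sort_values("start")   # sort rows by any column
```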


thanks

2016-01-26 Thread Kawuma Julius
Thank you for developing us. We love you, God bless you the more. I am Julius
in Uganda, learning computers.


Eric3 Help/Tutorial

2006-02-19 Thread Julius Lucks
Does anyone know of Help docs/tutorials that explain Eric3's plethora
of features?

http://www.die-offenbachs.de/detlev/eric3.html

Thanks,

Julius



Re: How to Read Bytes from a file

2007-03-06 Thread Matthias Julius
"Gabriel Genellina" <[EMAIL PROTECTED]> writes:

> En Fri, 02 Mar 2007 08:22:36 -0300, Bart Ogryczak
> <[EMAIL PROTECTED]> escribió:
>
>> On Mar 1, 7:36 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> wrote:
>>> Thanks Bart.  That's perfect.  The other suggestion was to precompute
>>> count1 for all possible bytes, I guess that's 0-256, right?
>>
>> 0-255 actually. It'd be worth it if accessing a dictionary with
>> precomputed values were significantly faster than calculating the
>> lambda, which I doubt. I suspect it actually might be slower.
>
> Dictionary access is highly optimized in Python. In fact, using a
> precomputed dictionary is about 12 times faster:

Why use a dictionary and not a list?

Matthias
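
A sketch of the list variant: precompute the bit count for every possible
byte value once, then index the list directly (the bit-counting task is
inferred from the thread; the sample data is made up):

```python
# One entry per possible byte value (0-255), indexed directly by the
# byte itself instead of going through a dict lookup.
bit_count = [bin(b).count("1") for b in range(256)]

data = bytes([0b1011, 0xFF, 0])
total = sum(bit_count[b] for b in data)
print(total)  # 11
```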

Re: Is numeric keys of Python's dictionary automatically sorted?

2007-03-08 Thread Matthias Julius
"John" <[EMAIL PROTECTED]> writes:

> I am coding a radix sort in Python and I think that Python's dictionary
> may be a good choice for the buckets.
>
> The only problem is that dictionary is a mapping without order. But I just 
> found that if the keys are numeric, the keys themselves are ordered in the 
> dictionary.
>
> part of my code is like this:
> radix={}
> for i in range(256):
> radix[i]=[]

I wonder why nobody has suggested the use of a list:

radix = [[] for i in range(256)]

Matthias
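
Extending that idea, a sketch of a byte-at-a-time LSD radix sort built on
exactly that list of 256 buckets (an illustration, not the OP's code):

```python
def radix_sort_bytes(values):
    # LSD radix sort on non-negative integers, one byte per pass,
    # using a plain list of 256 buckets instead of a dict.
    max_val = max(values, default=0)
    shift = 0
    while max_val >> shift:
        buckets = [[] for _ in range(256)]
        for v in values:
            buckets[(v >> shift) & 0xFF].append(v)
        # Concatenating the buckets in order keeps the sort stable.
        values = [v for b in buckets for v in b]
        shift += 8
    return values

print(radix_sort_bytes([513, 2, 70000, 1]))  # [1, 2, 513, 70000]
```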


Re: optimizing large dictionaries

2009-01-16 Thread Matthias Julius
Per Freem  writes:

> the only 'twist' is that my elt is an instance of a class (MyClass)
> with 3 fields, all numeric. the class is hashable, and so
> my_dict[elt] works well.  the __repr__ and __hash__ methods of my
> class simply return str() representation of self, 

which just calls __str__().  I guess you are aware of that but you
could call self.__str__() directly.  Maybe that saves something when
you do that 10 million times.

> while __str__ just makes every numeric field into a
> concatenated string:
>
> class MyClass:
>
>     def __str__(self):
>         return "%s-%s-%s" % (self.field1, self.field2, self.field3)
>
>     def __repr__(self):
>         return str(self)
>
>     def __hash__(self):
>         return hash(str(self))

Maybe it would be faster to numerically combine the three fields
instead of hashing the string representation.

Matthias
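
A sketch of that suggestion: hash the numeric fields as a tuple instead of
formatting and hashing a string. The field names follow the quoted class;
__eq__ is added because a custom __hash__ should agree with equality for
dictionary lookups to work:

```python
class MyClass:
    # Sketch: combine the three numeric fields directly, skipping the
    # intermediate "%s-%s-%s" string entirely.
    def __init__(self, f1, f2, f3):
        self.field1, self.field2, self.field3 = f1, f2, f3

    def __eq__(self, other):
        return (self.field1, self.field2, self.field3) == \
               (other.field1, other.field2, other.field3)

    def __hash__(self):
        # Tuples of numbers hash cheaply and consistently with __eq__.
        return hash((self.field1, self.field2, self.field3))

d = {MyClass(1, 2, 3): "x"}
print(d[MyClass(1, 2, 3)])  # x
```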