On Wed, Oct 30, 2013 at 12:21 PM, <jonas.thornv...@gmail.com> wrote:
> I am searching for the program or algorithm that makes the best possible
> of completly (diffused data/random noise) and wonder what the state of art
> compression is.
>
> I understand this is not the correct forum but since i think i have an
> algorithm that can do this very good, and do not know where to turn for
> such question i was thinking to start here.
>
> It is of course lossless compression i am speaking of.
> --
> https://mail.python.org/mailman/listinfo/python-list
>> I am searching for the program or algorithm that makes the best possible of
>> completly (diffused data/random noise) and wonder what the state of art
>> compression is.

None. If the data to be compressed is truly homogeneous random noise as you
describe (for example, a 100 MB file read from a cryptographically secure
random bit generator such as /dev/random on *nix systems), the
state-of-the-art lossless compression is zero and will remain that way for
the foreseeable future.

There is no lossless algorithm that will reduce truly random (high-entropy)
data by any significant margin. In classical information theory, such an
algorithm can never be invented. The counting argument is short: there are
2^n distinct files of n bits but only 2^n - 1 files shorter than that, so no
lossless scheme can map every n-bit input to a shorter output.
See: Kolmogorov complexity.

Real-world data is rarely completely random. You would have to test various
algorithms on the data set in question. Small things such as non-obvious
statistical clumping can make a big difference in the compression ratio from
one algorithm to another. Data that looks "random" might not actually be
random in the entropy sense of the word.

>> I understand this is not the correct forum but since i think i have an
>> algorithm that can do this very good, and do not know where to turn for such
>> question i was thinking to start here.

Not to sound like a downer, but I would wager that the data you're testing
your algorithm on is not as truly random as you imply, or is not a large
enough body of test data to draw such conclusions from. Claiming otherwise is
akin to inventing a perpetual motion machine, an inertial propulsion engine,
or any other classically impossible device. (This only applies to truly
random data.)

-Modulok-
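For anyone who wants to check the claim above on their own machine, here is a
minimal sketch using only the standard library. It is an illustration, not a
benchmark: os.urandom stands in for reading /dev/random, and the 1 MB sample
size, the repeated "spam and eggs " pattern, and the helper name
compression_ratios are all assumptions made up for this example.

```python
import bz2
import lzma
import os
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return compressed_size / original_size for a few stdlib codecs.

    A ratio near (or above) 1.0 means the codec saved nothing.
    """
    codecs = {
        "zlib": lambda d: zlib.compress(d, 9),
        "bz2": lambda d: bz2.compress(d, 9),
        "lzma": lzma.compress,
    }
    return {name: len(fn(data)) / len(data) for name, fn in codecs.items()}


SIZE = 1_000_000  # hypothetical sample size: 1 MB

# High-entropy input from the operating system's CSPRNG.
random_data = os.urandom(SIZE)

# Low-entropy input: one short pattern repeated to roughly the same size.
pattern = b"spam and eggs "
structured_data = pattern * (SIZE // len(pattern))

for label, data in (("random", random_data), ("structured", structured_data)):
    ratios = compression_ratios(data)
    summary = ", ".join(f"{name}={ratio:.4f}" for name, ratio in ratios.items())
    print(f"{label:>10}: {summary}")
```

On the random sample, every codec should report a ratio of about 1.0, usually
a hair above it because of container overhead, while the repeated pattern
shrinks to a small fraction of its size. Running a candidate algorithm against
os.urandom output of several megabytes is a reasonable first sanity check
before drawing conclusions.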