Contact us via our online contact form . Online contact form
Want to be a reporter or would you like to buy a report for the best price?
Just Sign Up here!
Privacy guidelines License our content Help
When Jarek Duda invented an important new compression technique called asymmetric numeral systems (ANS) a few years ago, he wanted to make sure it would be available for anyone to use. So instead of seeking patents on the technique, he dedicated it to the public domain. Since 2014, Facebook, Apple, and Google have all created software based on Duda's breakthrough, reported Ars Technica.
But now Google is seeking a patent that would give it broad rights over the use of ANS for video compression. And Duda, a computer scientist at Jagiellonian University in Poland, isn't happy about it.
Google denies that it's trying to patent Duda's work. A Google spokesperson told Ars that Duda came up with a theoretical concept that isn't directly patentable, while Google's lawyers are seeking to patent a specific application of that theory that reflects additional work by Google's engineers.
But Duda says he suggested the exact technique Google is trying to patent in a 2014 email exchange with Google engineers—a view largely endorsed by a preliminary ruling in February by European patent authorities.
The European case isn't over, though, and Google is also seeking a patent in the United States.
We first started looking into this issue after we got an email about it from Duda back in March. After weeks of back-and-forth discussions, Google finally provided us with an on-the-record statement about the patent—albeit a very bland one. It stated that Google had included information about Duda's prior work in its application and that "we await and will respect the USPTO's determination."
But a few days later, Google sent a follow-up statement with a different tone.
"Google has a long-term and continuing commitment to royalty-free, open source codecs (e.g., VP8, VP9, and AV1) all of which are licensed on permissive royalty-free terms, and this patent would be similarly licensed."
Duda isn't convinced, though. "We can hope for their goodwill; however, there are no guarantees," he said in an email to Ars. "Patents licensed in 'permissive royalty-free terms' usually have a catch."
Duda wants the company to recognize him as the original inventor and legally guarantee that the patent will be available for anyone to use. Or better yet, stop pursuing the patent altogether.
ANS: Better, faster compression
Computers represent data using strings of ones and zeros. For example, the ASCII encoding scheme uses a seven-bit string to represent alphanumeric characters.
Data compression techniques represent data more compactly by exploiting the fact that symbols do not appear with equal frequency. In English text, for example, the character "e" appears much more often than "z" or "x." So rather than representing every character with seven bits, an efficient scheme might use three or four bits to represent the most common letters while using more than seven bits to represent the least common.
A standard way to do this is known as Huffman coding, which works well when dealing with symbols whose probabilities are inverse powers of two. Information theory says that the optimal encoding makes the length of each symbol (in bits) proportional to the negative logarithm of its probability. For example, suppose you're trying to encode the symbols A (P=1/2), B (P=1/4), C (P=1/8), and D (P=1/8). In that case, an optimal encoding might be A=0, B=10, C=110, D=111.
This is optimal because log2(1/2) is -1, so A should have a 1-bit representation, log2(1/4) is -2, so B should have a 2-bit representation, and log2(1/8)=-3, so C and D should have 3-bit representations.
But Huffman encoding doesn't do as good a job when symbol probabilities are not inverse powers of two. For example, if your symbols are E (P=1/3), F (P=1/3), G (P=1/6), and H (P=1/6), Huffman coding isn't so efficient. Information theory says that E and F should be represented by bit strings 1.584 bits long, while G and H should be represented by strings 2.584 bits long.
That's impossible with Huffman coding—any Huffman code will use too many bits to represent some symbols and too few to represent others. As a result, data compressed with Huffman coding techniques will often wind up being longer than it needs to be.
But it is possible to effectively represent symbols with a non-integer number of bits if you relax the requirement that each symbol be represented by a specific, discrete bit string. For example, a technique called arithmetic coding subdivides the real number line between 0 and 1 so that each symbol's share of the interval is proportional to the frequency with which the symbol is expected to appear in the data. The region corresponding to the first symbol is identified, then that region is sub-divided (again, with each symbol's share proportional to its frequency) to encode the second symbol, and so forth.
Once all symbols have been encoded, the system uses a long binary string (something like 0.1010010100111010110...) to represent the exact point on the number line corresponding to the encoded string. This approach achieves a level of compression that's close to the theoretical maximum. But because it involves multiplication of arbitrary-precision fractional values, the encoding and decoding steps are computationally expensive.
Duda's breakthrough was to develop an encoding scheme, called asymmetric numeral systems (ANS), that provides the best of both worlds. It can represent a string of symbols about as compactly as arithmetic coding. But the encoding and decoding steps are fast, like Huffman codes.
The technique has been picked up rapidly by people making real software. Facebook announced a new compression algorithm called ZStandard based on Duda's work in 2016. Apple incorporated ANS into its LZFSE compression algorithm around the same time. Google has incorporated ANS into its Draco library for compressing 3-D point clouds, as well as a new image compression format called Pik.
Google is patenting the use of ANS for video compression
Compression of images and video fundamentally works the same way as compression of text. Compression software looks for statistical patterns in an image—colors or shapes that occur much more frequently than average, for example. Video encoders often use mathematical transformations of the data to identify subtle regularities.
Then they compress the image by using shorter bit strings to represent patterns that show up more frequently. ANS-based algorithms can be used to encode image data from a video as easily as a string of alphanumeric symbols.
Duda didn't just develop the basic ideas for ANS; he has also been an evangelist for the technique. In January 2014, he posted to a video codec developers email list, suggesting that ANS could be used for video encoding formats like Google's VP9.
Paul Wilkins, a senior technologist involved in developing VP9, responded that "this is not something that we can retro fit to VP9 at this stage, but it is worth looking at for a future codec."
A couple of years later, Google filed an application for a patent called "mixed boolean-token ANS coefficient coding." Like any patent application, this one was dense with legal jargon. But the patent claims—the most important part, legally speaking—are fairly clear. The first one claims the concept of using an entropy decoder state machine that includes a Boolean ANS decoder and a symbol ANS decoder—both versions of ANS pioneered by Duda—to decode the stream of symbols. Those symbols represent video broken down into "frames, the frames having blocks of pixels." Those blocks of pixels, in turn, are represented by a sequence of transform coefficients.
Duda argues that this "invention" just applies ANS to a conventional video decoding pipeline. Most efficient video compression schemes represent video frames as blocks of pixels and use mathematical transformations to represent those blocks using symbols that can be compressed efficiently. The only significant innovation, in Duda's view, is that this patent claims the use of ANS to encode those symbols.
Over the last couple of months, we've repeatedly asked Google to put us in touch with a Google technology expert who can explain exactly what Google invented and how it goes beyond Duda's own work. Google never made someone like that available to us, so we can't explain how Google distinguishes its own invention from Duda's original work. But Duda's argument that Google's patent just applies ANS to a conventional video decoder seems pretty plausible.
Indeed, that's the conclusion the European patent office reached in a preliminary ruling on the topic. "The subject matter of claim 1 does not involve an inventive step," a February European Patent Office ruling said. The information Duda provided in that January 2014 email thread "would allow a skilled person to reach the invention without having to apply any inventive skills."
That obviously isn't an encouraging sign for Google. But the European patent process isn't over, and we're still waiting for a ruling from the US Patent and Trademark Office.