LLMs are surprisingly great at compressing images and audio, DeepMind researchers find

pavnilschanda@lemmy.world · 1 year ago

rubikcuber@programming.dev · 1 year ago

The research specifically looked at lossless algorithms, so the comparison is against gzip and similar tools:

“For example, the 70-billion parameter Chinchilla model impressively compressed data to 8.3% of its original size, significantly outperforming gzip and LZMA2, which managed 32.3% and 23% respectively.”
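For anyone wanting to see what "compressed to X% of its original size" means for the classical baselines, here is a minimal sketch using Python's standard `gzip` and `lzma` modules. The sample text is a made-up, highly repetitive input, so the percentages it prints will not match the article's figures, which come from the researchers' own benchmark data.

```python
import gzip
import lzma


def compression_ratio(raw: bytes, compressed: bytes) -> float:
    """Compressed size as a fraction of the original size (lower is better)."""
    return len(compressed) / len(raw)


# Hypothetical sample input: repetitive English text, which compresses far
# better than typical real-world data.
sample = ("The quick brown fox jumps over the lazy dog. " * 2000).encode("utf-8")

gzip_ratio = compression_ratio(sample, gzip.compress(sample))
# lzma.compress defaults to the .xz container, which uses the LZMA2 filter.
lzma_ratio = compression_ratio(sample, lzma.compress(sample))

print(f"gzip:  {gzip_ratio:.1%} of original size")
print(f"LZMA2: {lzma_ratio:.1%} of original size")
```

Both codecs are lossless, so decompressing the output gives back the input byte for byte.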

However, they do say that it’s not especially practical at the moment, given that gzip is a tiny executable compared to the many gigabytes of the LLM’s dataset.
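A rough way to put numbers on that practicality point is to add the size of the compressor itself to the compressed output. The sketch below is back-of-the-envelope only: the 2 bytes per parameter, the 100 KB gzip executable, and the 1 GB input are all assumptions for illustration, not figures from the article, and it assumes what has to ship alongside the data is the model's weights.

```python
def adjusted_ratio(raw_bytes: float, compressed_bytes: float, compressor_bytes: float) -> float:
    """Compressed size plus compressor size, as a fraction of the raw size."""
    return (compressed_bytes + compressor_bytes) / raw_bytes


GB = 1e9
raw = 1 * GB  # hypothetical 1 GB input

# Ratios quoted in the article.
LLM_RATIO = 0.083
GZIP_RATIO = 0.323

# Assumptions, not from the article:
model_bytes = 70e9 * 2   # 70B parameters stored at 2 bytes each
gzip_bytes = 100e3       # gzip executable on the order of 100 KB

llm_adjusted = adjusted_ratio(raw, LLM_RATIO * raw, model_bytes)
gzip_adjusted = adjusted_ratio(raw, GZIP_RATIO * raw, gzip_bytes)

# Input size at which the two adjusted totals would be equal, assuming the
# quoted ratios stayed constant at that scale (a strong assumption).
break_even_bytes = (model_bytes - gzip_bytes) / (GZIP_RATIO - LLM_RATIO)

print(f"LLM adjusted:  {llm_adjusted:.1f}x the raw size")
print(f"gzip adjusted: {gzip_adjusted:.1%} of the raw size")
print(f"break-even input: about {break_even_bytes / GB:.0f} GB")
```

Under these assumptions the model only pays for itself once you are compressing hundreds of gigabytes with it.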

NaibofTabr@infosec.pub · 1 year ago

Do you need the dataset to do the compression? Is the trained model not effective on its own?

Tibert@compuverse.uk · 1 year ago

Well, according to the article, a dataset is required, but not always the heavier one.

Though that doesn’t solve the speed issue: the LLM will take a lot more time to do the compression.

gzip can compress 1 GB of text in less than a minute on a CPU, while an LLM with 3.2 million parameters requires an hour to compress the same amount of data.
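As quick arithmetic on those two figures, here is a sketch that turns them into throughput. It assumes both timings refer to the same 1 GB of text, and it treats "less than a minute" as exactly 60 seconds, so the gzip number is a lower bound and the slowdown is a minimum.

```python
GB = 1e9  # bytes

# Timings from the quote; the gzip figure is an upper bound ("less than a minute").
gzip_seconds = 60
llm_seconds = 60 * 60

gzip_mb_per_s = GB / gzip_seconds / 1e6   # at least this fast
llm_mb_per_s = GB / llm_seconds / 1e6
slowdown = llm_seconds / gzip_seconds     # at least this much slower

print(f"gzip: at least {gzip_mb_per_s:.1f} MB/s")
print(f"LLM:  about {llm_mb_per_s:.2f} MB/s")
print(f"slowdown: at least {slowdown:.0f}x")
```

That is for a model with only 3.2 million parameters, so larger models would presumably be slower still.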
