Data Compression. RunLength Encoding. Data Compression. Huffman Coding. Lempel-Ziv Compression

Страницы работы

33 страницы (Word-файл)

Содержание работы

Data Compression

RunLength Encoding

RLE (RUN LENGTH ENCODING)

Aims to save money and time by reducing amount of data transmitted

Instead of sending <char> <char><char><char><char><char><char> Send ESC 7 <char> When data includes ESC, send ESC ESC

Text usually unsuitable for RLE only contains repeated space chars

more elegant; <char> acts as its own esc

Binary files better often contain repeated chars, especially NUL

RLE is used for encoding faxes

Data Compression

Huffman Coding

HUFFMAN CODING

ASCII encodes all characters with 7 bits

Characters occur with unequal frequencies f e =100 x fq

Use fewer bits to encode most-common

Makes boundaries between characters hard to find

Data Compression

Huffman Coding

HUFFMAN CODING

Consider an alphabet with only 4 letters

00

00

00

01

01

01

00

10

10

11

AAAABBBCCD

1

1

1

01

01

01

1

001

001

000

20 bits → 19 bits better compression with larger alphabets!

Data Compression

Huffman Coding

HUFFMAN CODING

Both ends must agree on the code set

Most efficient to create a code specifically for the data being sent Allows for different letter frequencies in different languages

Fax machines use a modified Huffman scheme. codes for sequences containing 1, 2, 3, .... , 63, 64 black or white dots 128, 192, ... (i.e. multiples of 64) dots

So 67-dot sequence would be sent as codes for 64 then 3

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

ZIP and UNIX Compress utility use (modified) L-Z compression

Codes are fixed-length (usually 12 or 16-bits )

7-bit ASCII for single characters + nine 0-bits Inefficient when sending single characters But after a while, very few single characters get sent

Extra codes for most common character sequences Sender creates extra codes based on the letter frequencies in the message Receiver constructs extra codes while decompressing the original message

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

the Remaining string

R denotes the unsent part of the message

Initially, R is the complete message

A code table relates character sequences to codes

a 1 b 2 . z 26 . th 27 . the 79 . then 158

Initially, code table just contains the alphabet Sender and receiver have the same table

New sequences are added when they are encountered sender and receiver both add the same codes easy for the sender!

L denotes the longest string of characters… starting from the first character of R occurring in the code table

L’ denotes L + the next character in R

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

…they then and there theorised that this was thus

Sender identifies L, sends the code for L to the receiver

Receiver receives code looks it up in the code table adds L to the message string

Sender identifies L’, & makes a new entry in the code table for L’

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

Receiver

Sender

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

t

h

e

y

_

t

h

e

n

_

a

n

d

_

t

h

e

r

e

_

t

h

e

o

r

i

s

e

d

_

t

h

a

t

_

t

h

i

s

_

w

a

s

_

t

h

u

s

t

h

R L L’

20

28

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

Receiver

Sender

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

e

e

h

y

_

t

h

e

n

_

a

n

d

_

t

h

e

r

e

_

t

h

e

o

r

i

s

e

d

_

t

h

a

t

_

t

h

i

s

_

w

a

s

_

t

h

u

s

t

h

R L L’

8

29

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

Receiver

Sender

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26 _ 27

y

e

y

_

t

h

e

n

_

a

n

d

_

t

h

e

r

e

_

t

h

e

o

r

i

s

e

d

_

t

h

a

t

_

t

h

i

s

_

w

a

s

_

t

h

u

s

t

h

e

R L L’

5

30

Data Compression

Lempel-Ziv Compression

LEMPEL-ZIV COMPRESSION

Receiver

Похожие материалы

Информация о работе

Тип:
Написанные программы на языках программирования
Размер файла:
3 Mb
Скачали:
0