SmartExpert

(118)

$30/per page/Negotiable

About SmartExpert

Levels Tought:
Elementary,Middle School,High School,College,University,PHD

Expertise:
Accounting,Business & Finance See all
Accounting,Business & Finance,Economics,English,HR Management,Math Hide all
Teaching Since: Apr 2017
Last Sign in: 56 Weeks Ago, 5 Days Ago
Questions Answered: 7570
Tutorials Posted: 7352

Education

  • BS,MBA, PHD
    Adelphi University/Devry
    Apr-2000 - Mar-2005

Experience

  • HOD ,Professor
    Adelphi University
    Sep-2007 - Apr-2017

Category > Programming Posted 06 Nov 2017 My Price 15.00

JAVA PROGRAM

JAVA PROGRAM

 

Normally text files are encoded in binary by translating each character to a fixed-length binary pattern. For example, ASCII uses an 8-bit string to encode each character, and Unicode uses a 16-bit string. Thus the number of bits to encode a file would be 8 * number of characters for an ASCII encoded file. One simple method for compressing text files is Huffman Encoding. Instead of using fixed-length bit patterns for the characters, the patterns vary in length, so that words that are more frequently used have shorter patterns and words that are less commonly used have longer patterns. Thus the overall number of bits to encode a file may be less than in the original coding scheme. In Huffman Encoding no character's bit pattern is a prefix for another character's. It is known where the bit pattern for one character ends and the bit pattern for the next character begins. Thus there's no need for spaces between the characters. Huffman encoding can also be done on words rather than characters.

To implement Huffman encoding we put all the characters (or words) into the leaves of a binary tree, with the more frequent characters higher up in the tree, and less frequent characters lower down in the tree. Every internal node in the tree has two children, and again, the characters are stored in the leaves. To find the bit pattern for a character, just trace the path from the root to the node containing that character and note the direction taken at each level. Start at the root and set the initial bit pattern to "". If you go left toward the leaf, add a '0' to the end of the bit pattern, and if you go right, add a '1' to the end of the bit pattern. When you get to the leaf, the bit pattern is the representation of the character in that leaf. Note that this ensures that no character's bit pattern is a prefix of another, as no path to a leaf is a prefix to another leaf's path.

To construct the tree, we begin by putting each character (or word) into a single-node tree by itself. The node contains the character and its frequency count. We combine two nodes together into a tree by creating a root (with empty key as only leaves have keys), setting the frequency count to be the sum of the frequency counts of the two nodes, making one node the left node of the root, and the other node the right node of the root (it doesn't matter which is which). Similarly, we can combine two trees (with multiple leaves) into one tree by making a new root (with no key), setting its frequency count to the sum of the counts in the roots of the two trees, and making the two trees the children of this new root.

To construct the whole tree, we start with the single-node trees, pick the two nodes with the smallest two frequency counts, and combine them into a single tree. We then repeat - always finding the two trees with the smallest frequency counts and combining them in to a single tree. This is repeated until there's only one tree, and that tree is used to create the bit patterns as noted above.

We can then use the bit patterns to encode the file, by replacing each character (or word) by its bit pattern. Again, there's no need for spaces being the patterns.

 

 

 

 

 

 

An example

Here's an example of using Huffman encoding on words. Given the input:

this is a test this is only a test in the the event of an

actual emergency you would be told where to tune on your radio dial

this is only a test

We would first find the frequency of the words:

to 1

is 3

test 3

your 1

told 1

a 3

emergency 1

dial 1

you 1

the 2

tune 1

where 1

in 1

of 1

would 1

event 1

an 1

radio 1

on 1

only 2

actual 1

this 3

be 1

and then assign bit patterns to the words as follows:

to 10100

is 001

test 1110

your 10010

told 01101

a 000

emergency 11000

dial 10011

you 01000

the 1000

tune 01011

where 11001

in 111110

of 111111

would 11110

event 01110

an 01111

radio 01001

on 01100

only 1011

actual 01010

this 1101

be 10101

In this example, the original file has 113 characters (not including spaces), for a total of 904 bits (using 8-bit ASCII), while the encoded file would have only 145 bits.

Program

For this assignment, write a Java program that will read in a file, count the word frequencies, and then assign Huffman encodings to all the (unique) words based on those frequencies. The program should print out the number of bits need to encoded the original file (you may exclude counting spaces), and the number of bits for the encoded file. Note that you need not encode file, but just say how many bits it would take. the program should read from standard input and write to standard output (can you read the file just once?). Run your program on the bible.The program should be efficient. State what ADT(s) you use and how they are implemented. Give the asymptotic running time of your program.

 

Before you start tell me what ADT(s)andData Structureyou use and how they are implemented.

Thank you very much.

 

 

Answers

(118)
Status NEW Posted 06 Nov 2017 06:11 PM My Price 15.00

JAV-----------A P-----------ROG-----------RAM----------- An-----------swe-----------r-----------

Attachments

1509991913-JAVA PROGRAM Answer.zip
Not Rated(0)