Forums on Хакер

Welcome! This is the archived version of the forums on «Хакер.Ru». It operates in read-only mode.

Software implementation of the LZP text compression algorithm (C#)

Software implementation of the LZP text compression algorithm (C#) - 2010-05-28 21:41:39.946666

mr.hankey

Posts: 2
Rating: 0
Joined: 2010-05-28 21:37:33.110000

Guys, please help. My term paper is due on Monday, and the software does not work as it should. Any help is welcome, whether advice or a consultation.
The gist: I wrote a program that follows the LZP algorithm. The result is a program that does not compress text files but instead makes them two to three times larger. Still, I have some hope, because the program was written without deviating from the algorithm. I don't think rewriting it from scratch makes sense; I want to optimize it, but so far without success. The reason I still have hope: test examples I picked myself show excellent results, compression by a factor of 8. But on real text and on small texts the compression turns negative.
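The sizes described above can be checked with simple arithmetic. The sketch below is only an estimate based on the token format that the Pack method in the listing further down writes: one flag byte plus one literal byte for an unmatched byte, and one flag byte plus a four-byte Int32 length for a match. The class and method names (LzpCostModel, EncodedSize, MatchShrinksOutput) are hypothetical and are not part of the posted project.

```csharp
using System;

// Hypothetical helper, not part of the posted project: estimates the output
// size of the token format written by Pack.
static class LzpCostModel
{
    public const int ContextBytes = 2; // raw bytes copied to the start of the output
    public const int LiteralCost = 2;  // flag byte + literal byte
    public const int MatchCost = 5;    // flag byte + 4-byte Int32 length

    // Output size in bytes for a stream with the given token counts.
    public static long EncodedSize(long literals, long matches)
    {
        return ContextBytes + literals * LiteralCost + matches * MatchCost;
    }

    // A match makes the output smaller than the input only when it replaces
    // more input bytes than the match token itself occupies.
    public static bool MatchShrinksOutput(int matchLength)
    {
        return matchLength > MatchCost;
    }
}
```

For example, EncodedSize(998, 0) gives 1998 bytes for a 1000-byte text in which nothing matches, which is roughly a doubling. A single 40-byte match, on the other hand, is stored in 5 bytes, a factor of 8 on that span. Under these assumptions, text that produces mostly literals and short matches would be expected to grow, while hand-picked input with long repeats would be expected to shrink.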

FileFormatException.cs


 using System;
 
 namespace TermPaper
 {
 
     class FileFormatException : Exception
     {
         public FileFormatException(string message)
             : base(message)
         {
         }
     }
 }

Everything about this piece of code is clear.


 using System;
 using System.Collections.Generic;
 using System.Linq;
 using System.Text;
 using System.IO;
 
 namespace TermPaper
 {
     class Program
     {
 
         #region Constants 
 
         private const int HashTableSize = 65536;
         private const int LzpMatchFlag = 1;
         private const int LzpNoMatchFlag = 0;
         private const int ContextLength = 2;
         private const int BitsInByte = 8;
 
         #endregion
 
         /// <summary>
         /// Compresses file
         /// </summary>
         /// <param name="inputFileName">Input file name</param>
         /// <param name="outputFileName">Output file name</param>
         private static void Pack(string inputFileName, string outputFileName)
         {
             Dictionary<ushort, int> hashTable = new Dictionary<ushort, int>(HashTableSize);
             // load entire file into the array of bytes
             byte[] input = File.ReadAllBytes(inputFileName);
 
             // FileMode.Create truncates an existing output file; OpenOrCreate would keep stale trailing bytes
             using (FileStream fs = new FileStream(outputFileName, FileMode.Create))
             {
                 int currentOffset = 0;
                 // output first 2 bytes - context
                 fs.Write(input, 0, currentOffset += ContextLength);
                 
                 while (currentOffset < input.Length)
                 {
                     ushort hash = (ushort)((input[currentOffset - ContextLength] << BitsInByte) + input[currentOffset - ContextLength + 1]);
                     if (hashTable.ContainsKey(hash))
                     {
                         int matchingLength = CompareSubArrays(input, currentOffset, hashTable[hash]);
                         if (matchingLength > 0)
                         {
                             // write the matching byte and the count of matching bytes
                             fs.WriteByte(LzpMatchFlag);
                             byte[] bytesToWrite = BitConverter.GetBytes(matchingLength);
                             fs.Write(bytesToWrite, 0, bytesToWrite.Length);
                             hashTable[hash] = currentOffset;
                             currentOffset += matchingLength;
                             continue;
                         }
                     }
                     // no hashtable record - create a new one
                     hashTable[hash] = currentOffset;
                     fs.WriteByte(LzpNoMatchFlag);
                     fs.WriteByte(input[currentOffset]);
                     currentOffset++;
                 }
             }
         }
 
         /// <summary>
         /// Decompresses file
         /// </summary>
         /// <param name="inputFileName">Input file name</param>
         /// <param name="outputFileName">Output file name</param>
         private static void Unpack(string inputFileName, string outputFileName)
         {
             Dictionary<ushort, int> hashTable = new Dictionary<ushort, int>(HashTableSize);
             // load entire file into the array of bytes
             byte[] input = File.ReadAllBytes(inputFileName);
 
             // this is ineffective, but who cares:)
             List<byte> outputBytes = new List<byte>();
 
             int currentInputOffset = 0;
             // output first 2 bytes - context
             AddBytes(outputBytes, input, 0, currentInputOffset += ContextLength);
             int outputOffset = ContextLength;
 
             while (currentInputOffset < input.Length)
             {
                 ushort hash = BitConverter.ToUInt16(
                     GetBytes(outputBytes, outputOffset - ContextLength, sizeof(ushort)), 0);
 
                 if (input[currentInputOffset] == LzpMatchFlag)
                 {
                     byte[] intBytes = ExtractIntBytes(input, currentInputOffset);
                     int matchingLength = BitConverter.ToInt32(intBytes, 0);
                     AddBytes(outputBytes, 
                         GetBytes(outputBytes, hashTable[hash], matchingLength),
                         0,
                         matchingLength);
 
                     currentInputOffset += intBytes.Length;
                     hashTable[hash] = outputOffset;
                     outputOffset += matchingLength;
                 }
                 else if (input[currentInputOffset] == LzpNoMatchFlag)
                 {
                     currentInputOffset++;
                     outputBytes.Add(input[currentInputOffset]);
                     hashTable[hash] = outputOffset;
                     outputOffset++;
                 }
                 else
                 {
                     throw new FileFormatException(string.Format("Wrong file format: unexpected byte at the offset {0}", currentInputOffset));
                 }
 
                 currentInputOffset++;
             }
             File.WriteAllBytes(outputFileName, outputBytes.ToArray());
         }
 
         /// <summary>
         /// Adds bytes to the list
         /// </summary>
         /// <param name="bytesList">Output list</param>
         /// <param name="inputArray">Array to take bytes from</param>
         /// <param name="offset">Input array offset</param>
         /// <param name="count">Bytes count</param>
         private static void AddBytes(List<byte> bytesList, byte[] inputArray, int offset, int count)
         {
             if (inputArray.Length < offset + count)
             {
                 throw new ArgumentException("Offset and count mismatch");
             }
             for (int i = 0; i < count; i++)
             {
                 bytesList.Add(inputArray[offset + i]);
             }
         }
 
         /// <summary>
         /// Gets bytes subarray from the given list
         /// </summary>
         /// <param name="bytesList">Given list</param>
         /// <param name="offset">List offset</param>
         /// <param name="count">Number of bytes</param>
         /// <returns></returns>
         private static byte[] GetBytes(List<byte> bytesList, int offset, int count)
         {
             if (bytesList.Count < offset + count)
             {
                 throw new ArgumentException("Offset and count mismatch");
             }
             byte[] result = new byte[count];
             for (int i = 0; i < count; i++)
             {
                 result[i] = bytesList[offset + i];
             }
             return result;
         }
 
         /// <summary>
         /// Extracts int subarray in the correct order from the input array
         /// </summary>
         /// <param name="input">Input array</param>
         /// <param name="offset">Offset in the input array</param>
         /// <returns></returns>
         private static byte[] ExtractIntBytes(byte[] input, int offset)
         {
             int sizeOfInt = sizeof(int);
             byte[] result = new byte[sizeOfInt];
             for (int i = sizeOfInt - 1; i >= 0; i--)
             {
                 result[i] = input[offset + i + 1];
             }
             return result;
         }
 
         /// <summary>
         /// Compares 2 subarrays in the given array
         /// </summary>
         /// <param name="input">Input array</param>
         /// <param name="inputOffset">First offset</param>
         /// <param name="hashTableOffset">Second offset</param>
         /// <returns>Match length</returns>
         private static int CompareSubArrays(byte[] input, int inputOffset, int hashTableOffset)
         {
             int matchingLength = 0;
             while ((inputOffset + matchingLength < input.Length) // bounds check
                 && (inputOffset > hashTableOffset + matchingLength) // do not intersect subarrays!
                 && (input[inputOffset + matchingLength] == input[hashTableOffset + matchingLength]))
             {
                 matchingLength++;
             }
             return matchingLength;
         }
 
         static void Main(string[] args)
         {
             Pack("Base.txt", "Compressed.txt");
             Unpack("Compressed.txt", "Uncompressed.txt");
         }
     }
 }

And that is the complete implementation of the LZP algorithm. I am also posting the program's project with a test example that shows excellent compression.
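One possible direction for the optimization mentioned in the first post is a more compact token layout. The sketch below is an assumption of this note, not the poster's method and not necessarily the canonical LZP encoding: it borrows the common LZ-family idea of grouping the flags of eight tokens into one control byte, and stores each match length as a 7-bit variable-length integer instead of a fixed Int32. The class name GroupedTokenWriter and its methods are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical token writer, not part of the posted project.
// Assumed layout: one control byte holds the flags of up to 8 tokens
// (bit i set = token i is a match), followed by the payload of those tokens:
// 1 byte per literal, a 7-bit variable-length integer per match length.
sealed class GroupedTokenWriter
{
    private readonly Stream output;
    private readonly List<byte> payload = new List<byte>();
    private int flags;      // flag bits of the current group
    private int tokenCount; // tokens collected in the current group

    public GroupedTokenWriter(Stream output)
    {
        this.output = output;
    }

    public void WriteLiteral(byte value)
    {
        payload.Add(value);
        EndToken(false);
    }

    public void WriteMatch(int length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException("length");
        uint rest = (uint)length;
        while (rest >= 0x80)
        {
            payload.Add((byte)((rest & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            rest >>= 7;
        }
        payload.Add((byte)rest); // last byte, continuation bit clear
        EndToken(true);
    }

    // Writes the current (possibly partial) group: control byte, then payload.
    public void Flush()
    {
        if (tokenCount == 0) return;
        output.WriteByte((byte)flags);
        output.Write(payload.ToArray(), 0, payload.Count);
        payload.Clear();
        flags = 0;
        tokenCount = 0;
    }

    private void EndToken(bool isMatch)
    {
        if (isMatch) flags |= 1 << tokenCount;
        tokenCount++;
        if (tokenCount == 8) Flush();
    }
}
```

With this layout a literal takes 1 byte plus one flag bit instead of 2 bytes, and a match shorter than 128 bytes takes 1 byte plus one flag bit instead of 5 bytes. Flush has to be called once after the last token, and a matching reader would have to treat the end of the input as the end of the last, possibly partial, group.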

Post #: 1

RE - 2010-05-28 21:43:58.306666

mr.hankey

Posts: 2
Rating: 0
Joined: 2010-05-28 21:37:33.110000

Project: http://ifolder.ru/17918364

Post #: 2
