 Compress a 192-byte string of "A" and "B" into 16 bytes (128 bits), but I don't know how to decode it

TSMussel
post Dec 15 2019, 08:12 PM, updated 5y ago

Casual
***
Junior Member
433 posts

Joined: Jun 2016


96-bit + 96-bit = 128-bit decimal.

If I have a 192-byte string made of two distinct characters, it would be a great saving to convert it into a pair of 96-bit numbers and add them together into one 128-bit number, instead of storing a 192-bit number.

But how do I decode the 128-bit number (16 bytes of data) back into the 192-bit number (without making the compressed size larger than 192 bits)?

This post has been edited by Mussel: Dec 15 2019, 08:12 PM
wKkaY
post Dec 16 2019, 03:38 PM

misutā supākoru
VIP
6,008 posts

Joined: Jan 2003
Take a simple 1-bit example:

x + y = z
0 + 1 = 1
1 + 0 = 1

If I tell you that z = 1, can you tell me what x and y are?

Generally you can't, unless there are some additional special properties of x and y that you know about, e.g. no overlapping bits, x always larger than y, etc.
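To see the same problem beyond one bit, here is a minimal Python sketch (an illustration added for this thread, not anyone's actual code; the helper name `pairs_by_sum` is made up). It brute-forces every ordered pair of small n-bit numbers and groups the pairs by their sum.

```python
from collections import defaultdict

def pairs_by_sum(bits):
    """Group every ordered pair (x, y) of `bits`-bit integers by x + y."""
    groups = defaultdict(list)
    for x in range(2 ** bits):
        for y in range(2 ** bits):
            groups[x + y].append((x, y))
    return groups

groups = pairs_by_sum(2)
# 4 * 4 = 16 input pairs, but only 7 distinct sums (0..6),
# so several different pairs must share the same sum.
print(len(groups))   # 7
print(groups[3])     # [(0, 3), (1, 2), (2, 1), (3, 0)]
```

The same counting applies to the original question: two 96-bit numbers give 2^192 possible pairs, but their sum can take fewer than 2^97 different values, so most sums correspond to a huge number of different pairs.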
narf03
post Dec 16 2019, 06:47 PM

Look at all my stars!!
*******
Senior Member
4,545 posts

Joined: Dec 2004
From: Metro Prima, Kuala Lumpur, Malaysia, Earth, Sol


Technically you can omit information that you do not need, or find a way to reduce the data. For example, instead of saving a date as text (16/12/2019), you can store the number of days since 1/1/1980, which is a much smaller number.

In your example, if the string that you want to encode is limited to "a" to "z", which is only 26 characters (no distinction between upper and lower case), then there are 26^192 ≈ 4.73x10^271 possibilities, so you will need a variable that can store that huge number. Each byte can store 256 possibilities (0-255; 2 bytes = 256x256, 4 bytes = 256^4), so you will need at least 113 bytes (256^113 ≈ 1.35x10^272) to store that.

So technically your data needs to be base 26 (a-z), and the compressed data will use at most base 256 (byte values 0-255). But if your data is already base 256, then there is not much you can do about it.
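The byte counts above can be checked with Python's arbitrary-precision integers. This is only a sketch of the counting argument; the function name `min_bytes` is invented for the example.

```python
def min_bytes(alphabet_size, length):
    """Smallest number of whole bytes that can index every possible string."""
    combos = alphabet_size ** length       # exact big integer
    bits = (combos - 1).bit_length()       # bits needed for indices 0 .. combos-1
    return (bits + 7) // 8                 # round up to whole bytes

print(min_bytes(26, 192))    # 113  (a-z, 192 characters)
print(min_bytes(2, 192))     # 24   (only "A" and "B")
print(min_bytes(256, 192))   # 192  (arbitrary bytes: no saving possible)
```

For the two-symbol string in the original question, the floor is 24 bytes, so 16 bytes cannot hold every possible input.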


This post has been edited by narf03: Dec 16 2019, 06:52 PM
TSMussel
post Dec 16 2019, 09:59 PM

Casual
***
Junior Member
433 posts

Joined: Jun 2016


QUOTE(wKkaY @ Dec 16 2019, 03:38 PM)
Take a simple 1-bit example:

x + y = z
0 + 1 = 1
1 + 0 = 1

If I tell you that z = 1, can you tell me what x and y are?

Generally you can't, unless there are some additional special properties of x and y that you know about, e.g. no overlapping bits, x always larger than y, etc.
*
Yes, it makes sense. Even though I might be able to code it as something like:

x1 + y1 = z1

x2 + y1 = z2

At least y1 is a common summand, but still, I cannot figure out which comes first and which comes second: x1 --> y1 or y1 --> x1.

And I have got the answer that adding two 96-bit numbers together as a decimal in C# will result in data loss (rounding error or overflow).

So thanks for your much needed clarification.
TSMussel
post Dec 16 2019, 10:04 PM

Casual
***
Junior Member
433 posts

Joined: Jun 2016


QUOTE(narf03 @ Dec 16 2019, 06:47 PM)
Technically you can omit information that you do not need, or find a way to reduce the data. For example, instead of saving a date as text (16/12/2019), you can store the number of days since 1/1/1980, which is a much smaller number.
*
Clever boy!

QUOTE(narf03 @ Dec 16 2019, 06:47 PM)
In your example, if the string that you want to encode is limited to "a" to "z", which is only 26 characters (no distinction between upper and lower case), then there are 26^192 ≈ 4.73x10^271 possibilities, so you will need a variable that can store that huge number. Each byte can store 256 possibilities (0-255; 2 bytes = 256x256, 4 bytes = 256^4), so you will need at least 113 bytes (256^113 ≈ 1.35x10^272) to store that.

So technically your data needs to be base 26 (a-z), and the compressed data will use at most base 256 (byte values 0-255). But if your data is already base 256, then there is not much you can do about it.

*
I think you mean 2^5? 2^5 is already enough to store 32 different characters?

Practically, encoding "a"..."z" needs only 5 bits per character; the remaining 3 bits of each byte can be shared with the next character.

(2^8 can store 256 different characters, if this is what you meant?)


narf03
post Dec 16 2019, 10:20 PM

Look at all my stars!!
*******
Senior Member
4,545 posts

Joined: Dec 2004
From: Metro Prima, Kuala Lumpur, Malaysia, Earth, Sol


QUOTE(Mussel @ Dec 16 2019, 10:04 PM)
Clever boy!
I think you mean 2^5? 2^5 is already enough to store 32 different characters?

Practically, encoding "a"..."z" needs only 5 bits per character; the remaining 3 bits of each byte can be shared with the next character.

(2^8 can store 256 different characters, if this is what you meant?)
*
2^5 = 32 possibilities, not 32 characters.

For example, say you need to store 3 characters and each of them can be a, b, or c. Then you will have 3^3 = 27 possibilities:
aaa
aab
aac
aba
abb
abc
aca
acb
acc
baa
bab
bac
bba
bbb
bbc
bca
bcb
bcc
caa
cab
cac
cba
cbb
cbc
cca
ccb
ccc

Those are all 27 possibilities of the data you need to store, so you can use 5 bits of data (2^5 = 32 possibilities at most), e.g. asc(0)=aaa, asc(26)=ccc.

So if each character can be a-z, then there will be 26^3 = 26*26*26 = 17576 possibilities (I'm not going to list them out), and you will need 15 bits (2^15 = 32768), as 14 bits aren't enough (2^14 = 16384).
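The numbering of the 27 strings above is just base-3 counting. Here is a rough Python sketch of that idea, treating the string as a number whose digits are positions in the alphabet. The function names `encode` and `decode` are made up for illustration, and the decoder assumes the string length is known separately.

```python
def encode(text, alphabet):
    """Treat `text` as a base-len(alphabet) number and return its index."""
    base = len(alphabet)
    value = 0
    for ch in text:
        value = value * base + alphabet.index(ch)
    return value

def decode(value, alphabet, length):
    """Inverse of encode(); `length` must be stored or known separately."""
    base = len(alphabet)
    chars = []
    for _ in range(length):
        value, digit = divmod(value, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))

print(encode("aaa", "abc"))   # 0
print(encode("ccc", "abc"))   # 26
print(decode(5, "abc", 3))    # abc

letters = "abcdefghijklmnopqrstuvwxyz"
print(encode("zzz", letters))               # 17575
print(encode("zzz", letters).bit_length())  # 15
```

Unlike adding two numbers together, this mapping gives every string its own index, which is why it can be decoded again.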

This post has been edited by narf03: Dec 16 2019, 10:27 PM
dstl1128
post Dec 23 2019, 02:12 PM

Look at all my stars!!
*******
Senior Member
4,463 posts

Joined: Jan 2003
QUOTE(Mussel @ Dec 15 2019, 08:12 PM)
96-bit + 96-bit = 128-bit decimal.

If I have a 192-byte string made of two distinct characters, it would be a great saving to convert it into a pair of 96-bit numbers and add them together into one 128-bit number, instead of storing a 192-bit number.

But how do I decode the 128-bit number (16 bytes of data) back into the 192-bit number (without making the compressed size larger than 192 bits)?
*
If it were just A & B (two symbols), each character can be represented by a single bit instead of a byte. So the raw size would be one-eighth (1/8) of the original: 192 bytes down to 24 bytes. (Excluding the need to store the mapping, e.g. 0->"A" and 1->"B".)

Using RLE (run-length encoding) might reduce it further.
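A minimal Python sketch of the one-bit-per-symbol idea, assuming the mapping 0->"A" and 1->"B" and that the original length is stored or known separately. The names `pack`, `unpack` and `run_lengths` are invented for this example.

```python
from itertools import groupby

def pack(text, zero="A", one="B"):
    """Pack a two-symbol string into bytes, 8 symbols per byte."""
    mapping = {zero: "0", one: "1"}
    bits = "".join(mapping[ch] for ch in text)   # KeyError on any other symbol
    n_bytes = (len(text) + 7) // 8
    return (int(bits, 2) if bits else 0).to_bytes(n_bytes, "big")

def unpack(data, length, zero="A", one="B"):
    """Inverse of pack(); `length` is the number of symbols originally packed."""
    bits = format(int.from_bytes(data, "big"), "b").zfill(length) if length else ""
    return "".join(zero if b == "0" else one for b in bits)

def run_lengths(text):
    """Run-length view of the string: (symbol, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

text = "AB" * 96                 # 192 characters
packed = pack(text)
print(len(packed))               # 24
print(unpack(packed, 192) == text)   # True
print(run_lengths("AAAABBBA"))   # [('A', 4), ('B', 3), ('A', 1)]
```

Whether RLE helps depends on the data: it pays off only when there are long runs of the same symbol. A string like "ABAB..." has 192 runs of length 1 and would get larger, not smaller.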

This post has been edited by dstl1128: Dec 23 2019, 02:13 PM
TSMussel
post Dec 23 2019, 08:19 PM

Casual
***
Junior Member
433 posts

Joined: Jun 2016


QUOTE(dstl1128 @ Dec 23 2019, 02:12 PM)
If it were just A & B (two symbols), each character can be represented by a single bit instead of a byte. So the raw size would be one-eighth (1/8) of the original: 192 bytes down to 24 bytes. (Excluding the need to store the mapping, e.g. 0->"A" and 1->"B".)

Using RLE (run-length encoding) might reduce it further.
*
Yeah, that's right!

 
