Create binary vectors with specific hamming distance

15 Ansichten (letzte 30 Tage)
Balázs Szabó
Balázs Szabó am 16 Feb. 2019
Kommentiert: Balázs Szabó am 16 Feb. 2019
Hi,
Let's say I have a binary vecor [0 0 1 0 0 1 0 0] and I want to create all possible vector that are a specified hamming distance from the original vector.
How can I do that?
(I was trying to create a mask that has only the specified number of bits as 1 and the rest is 0, but I wasn't able to get all the different such vectors)

Akzeptierte Antwort

John D'Errico
John D'Errico am 16 Feb. 2019
Bearbeitet: John D'Errico am 16 Feb. 2019
Why does this seem pretty easy, if you just think about what Hamming distance means?
In a binary vector, all that matters is the number of positions that are different. And in a binary ector, if a position is different at all, that just means that bit is flipped.
Given some binary vector, I'll call it B.
B = [0 0 1 0 0 1 0 0];
How many other vectors exist that lie at a Hamming distance of say 3? B has 8 elements. So if a different vector lies at aHamming distance of 3, then EXACTLY 3 elements have their bits flipped. So how many such vectors are there? nchoosek(8,3), which if my back of the envelope arithmetic is correct, is 56.
Those differences lie at positions
NB = numel(B);
NHD = 3;
NC = nchoosek(NB,NHD);
changes = nchoosek(1:NB,NHD)
1 2 3
1 2 4
1 2 5
...
which is an array with 56 rows, and 3 columns. Therefore, all vectors with a Hamming distance of exactly 3 can be written as
B3 = repmat(B,NC,1);
ind = sub2ind([NC,NB],repmat((1:NC)',1,NHD),changes);
B3(ind) = ~B3(ind);
So each row of B3 lies at a Hamming distance of 3 from B, and no such other vectors exist.
Note for long vectors, this computation may get EXTREMELY memory intensive, if the Hamming distance is at all close to half the length of B. For example, how many vectors lie at a Hamming distance of 50 from any binary vector of length 100?
nchoosek(100,50)
Warning: Result may not be exact. Coefficient is greater than 9.007199e+15 and is only accurate to 15 digits
> In nchoosek (line 92)
ans =
1.0089e+29
Too many to generate, that is how many.

Weitere Antworten (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by