This weekend I’ve been working on the iPod Shuffle randomness again… I’ve been quite busy lately, which is why I haven’t been posting.
In this second post I will finally be using an iPod Shuffle and will show the first results on its degree of randomness. These results are probably
not statistically sound, as I don’t have enough samples yet…
The Analyzer /dev/dsp
As you might remember, in the last iPod post I gave a program that converted WAV files to numbers by analyzing their content. I have now modified this program
to take its data from /dev/dsp instead of from a WAV file.
Data is read in small chunks and an FFT is run on each chunk. When the FFT has finished, I search for the frequency peak of the chunk. This peak
is used to determine which tone was in the original file.
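To make the peak search concrete, here is a minimal sketch in Python. It is not the analyzer's actual code: it uses a naive DFT as a stand-in for the FFT (same result, just slower), and the sample rate, chunk size and the name peak_frequency are assumptions made for the example. A synthetic tone stands in for a chunk read from /dev/dsp.

```python
import cmath
import math

SAMPLE_RATE = 8000   # assumed sampling rate in Hz (not stated in the post)
CHUNK_SIZE = 256     # assumed number of samples per chunk

def peak_frequency(samples, sample_rate=SAMPLE_RATE):
    """Return the frequency (Hz) of the strongest DFT bin in `samples`."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    # Skip bin 0 (DC offset) and only scan bins below the Nyquist frequency.
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        power = abs(acc) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n

# Synthetic 1000 Hz tone standing in for one chunk of recorded audio.
chunk = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
         for i in range(CHUNK_SIZE)]
print(peak_frequency(chunk))  # 1000.0
```

The frequency resolution is the sample rate divided by the chunk size (31.25 Hz with the values assumed here), so the tones have to be spaced further apart than that to be told apart.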
The test set
I’ve filled the iPod with exactly 256 audio files, so that each file corresponds to one possible value of a single byte; we will see why this matters later. Each of
these files has an ID number which is a prime number, so the IDs range from 2 to 1619 (the first 256 prime numbers). This numbering is used for error
detection, since a misread number will usually not be one of the 256 valid IDs. Here is a sample of the numbers 1531 and 1123; if anyone needs the full
set, just post a comment and I’ll be glad to send it.
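The list of IDs is easy to reproduce. The following sketch (the helper name first_primes is made up for this example, it is not one of the scripts from this post) builds the first 256 primes by trial division:

```python
def first_primes(count):
    """Return the first `count` primes, using trial division by earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no earlier prime up to its square root divides it.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

ids = first_primes(256)
print(ids[0], ids[-1], len(ids))  # 2 1619 256
```

Only 256 of the 1618 integers between 2 and 1619 are valid IDs, so a wrongly detected number lands on an invalid value most of the time and can be thrown away.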
The first results
After uploading the files to the iPod the test can begin. The output of the Shuffle is plugged into the Mic input of the PC. The iPod is put in the shuffle position,
the analyzer program is started and then the play button on the iPod is pressed (the iPod volume has to be set to the maximum). Some hours later, when the Shuffle’s
battery has run out (8 hours or so), we can have a look at the output file. The file contains 9322 numbers ranging from 2 to 1619. These numbers have
not yet been checked for errors.
Text output to stream
It’s time to check the previous output for errors and to convert it to a binary stream for further processing; this stream will be passed to a program
that analyzes its randomness. The range 2-1619 is converted to the range 0x00-0xFF (1 byte), because the randomness analyzer
takes 1-byte values. The conversion assigns each prime its zero-based position in the list of primes, for example 2 is 0, 3 is 1, 5 is 2, … , 71 is 19, … , 809 is 139, … and
1619 is 255. Any value in the input file that cannot be converted to this range is discarded and counted as a detection error. I’ve not
implemented error correction, but it will be done later.
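As an illustration of this conversion, here is a small Python sketch. It is not the real streamize program; the function names are invented for the example, and it assumes a zero-based mapping so that 1619, the 256th prime, lands on 0xFF.

```python
def first_primes(count):
    """Return the first `count` primes, using trial division by earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def streamize(numbers):
    """Map prime IDs to bytes 0..255; collect everything else as errors."""
    # Assumed mapping: each prime maps to its zero-based position in the list.
    index_of = {p: i for i, p in enumerate(first_primes(256))}
    out = bytearray()
    errors = []
    for n in numbers:
        if n in index_of:
            out.append(index_of[n])
        else:
            errors.append(n)   # not a valid ID: treated as a detection error
    return bytes(out), errors

data, errors = streamize([2, 71, 255, 1619, 803])
print(list(data), errors)  # [0, 19, 255] [255, 803]
```

Note that this check only catches misreads that fall on a non-prime; a misread that happens to hit another valid ID would pass through unnoticed.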
Randomness
At this point we have a binary file with the random data obtained from the iPod. We pass this file to ENT, a
program that applies a series of tests to a stream of bytes; these tests help determine how random the sequence is.
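To give an idea of what some of these tests measure, here is a rough Python sketch of three of the figures ENT reports: entropy, chi-square and arithmetic mean. It follows the textbook definitions and is not ENT's source code; the function name byte_stats is made up for the example.

```python
import math
from collections import Counter

def byte_stats(data):
    """Return (entropy in bits per byte, chi-square statistic, arithmetic mean)."""
    n = len(data)
    counts = Counter(data)
    # Shannon entropy: 8.0 when all 256 byte values are equally frequent.
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    # Chi-square against a uniform distribution over the 256 byte values.
    expected = n / 256
    chi_square = sum((counts.get(b, 0) - expected) ** 2 / expected
                     for b in range(256))
    # Mean byte value: 127.5 for a uniform distribution.
    mean = sum(data) / n
    return entropy, chi_square, mean

# A perfectly even stream: every byte value appears exactly 4 times.
print(byte_stats(bytes(range(256)) * 4))  # (8.0, 0.0, 127.5)
```

A perfectly even stream like this one gets the best possible entropy, but its chi-square of 0 is far lower than chance would produce. ENT's documentation reads a chi-square percentage above 99% or below 1% as a sign that the sequence is almost certainly not random, so a distribution that is too even is as suspicious as one that is too lumpy.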
A full example
Now I’ll show the process and the results obtained from this first test. At the end of this post you can get all the source code for the programs I’ve used; of
course these sources are GNU/GPL.
First of all we generate the files that will be uploaded to the iPod Shuffle. The whole_set.sh script uses the min_gen_number.sh script to generate WAV files
and then converts them to MP3 files:
# ./whole_set.sh
Those 256 files are uploaded to the Shuffle using gtkpod or anything else… Now we need a mini-jack to mini-jack cable to connect the iPod’s
headphone output to the Mic input of the computer. The iPod’s volume has to be high when playing back the files; also remember to put the iPod in the shuffle position.
# ./analyze_dsp "out$(date).txt"
( 8 hours later )
# cat "outSun Sep 4 21:39:14 CEST 2005.txt" | wc -l
9322
Now we have a file called outSun Sep 4 21:39:14 CEST 2005.txt. This file contains 9322 numbers,
some of which may be errors. Next we have to transform this output into a binary stream with values in the range 0-255.
# ./streamize "outSun Sep 4 21:39:14 CEST 2005.txt" Sunday.bin
ERRROR 255
ERRROR 525
ERRROR 489
ERRROR 485
ERRROR 235
ERRROR 803
# ls -l Sunday.bin
-rw-r--r-- 1 esteve users 9316 Sep 5 12:21 Sunday.bin
A binary file, Sunday.bin, has been created. As we can see, streamize has detected some errors, and these numbers have been discarded.
The size of the file is 9322 - 6 = 9316 bytes, where 6 is the number of errors. Now this binary file is used as input to ENT:
# ./ent Sunday.bin
Entropy = 7.999828 bits per byte.
Optimum compression would reduce the size
of this 9316 byte file by 0 percent.
Chi square distribution for 9316 samples is 2.22, and randomly
would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.4317 (127.5 = random).
Monte Carlo value for Pi is 3.110824742 (error 0.98 percent).
Serial correlation coefficient is 0.018126 (totally uncorrelated = 0.0).
We can compare this result with the one obtained by supplying ENT with a true random file, generated by timing radioactive decay events at
Hotbits:
# ./ent Hotbits
Entropy = 7.979351 bits per byte.
Optimum compression would reduce the size
of this 8192 byte file by 0 percent.
Chi square distribution for 8192 samples is 232.50, and randomly
would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 126.8322 (127.5 = random).
Monte Carlo value for Pi is 3.129670330 (error 0.38 percent).
Serial correlation coefficient is -0.016542 (totally uncorrelated = 0.0).
Sources
Update: By the way, this blog was one year old on the 2nd of September!! I forgot to post it.