summaryrefslogtreecommitdiff
path: root/src/prng/rand_r.c
diff options
context:
space:
mode:
authorRich Felker <dalias@aerifal.cx>2013-06-12 18:20:48 -0400
committerRich Felker <dalias@aerifal.cx>2013-06-12 18:20:48 -0400
commit20d01d83b5a13c77805976e7c520f566244ba3ff (patch)
tree3cfb70bf958e5b621b70c7099be59de534068b9f /src/prng/rand_r.c
parent4191d2447673e6b03722198871af0daf9390cb51 (diff)
downloadmusl-20d01d83b5a13c77805976e7c520f566244ba3ff.tar.gz
musl-20d01d83b5a13c77805976e7c520f566244ba3ff.tar.bz2
musl-20d01d83b5a13c77805976e7c520f566244ba3ff.tar.xz
musl-20d01d83b5a13c77805976e7c520f566244ba3ff.zip
improve the quality of output from rand_r
due to the interface requirement of having the full state contained in a single object of type unsigned int, it is difficult to provide a reasonable-quality implementation; most good PRNGs are immediately ruled out because they need larger state. the old rand_r gave very poor output (very short period) in its lower bits; normally, it's desirable to throw away the low bits (as in rand()) when using a LCG, but this is not possible since the state is only 32 bits and we need 31 bits of output. glibc's rand_r uses the same LCG as musl's, but runs it for 3 iterations and only takes 10-11 bits from each iteration to construct the output value. this partially fixes the period issue, but introduces bias: not all outputs have the same frequency, and many do not appear at all. with such a low period, the bias is likely to be observable. I tried many approaches to "fix" rand_r, and the simplest I found which made it pass the "dieharder" tests was applying this transformation to the output. the "temper" function is taken from mersenne twister, where it seems to have been chosen for some rigorous properties; here, the only formal property I'm using is that it's one-to-one and thus avoids introducing bias. should further deficiencies in rand_r be reported, the obvious "best" solution is applying a 32-bit cryptographic block cipher in CTR mode. I identified several possible ciphers that could be used directly or adapted, but as they would be a lot slower and larger, I do not see a justification for using them unless the current rand_r proves deficient for some real-world use.
Diffstat (limited to 'src/prng/rand_r.c')
-rw-r--r--src/prng/rand_r.c11
1 files changed, 10 insertions, 1 deletions
diff --git a/src/prng/rand_r.c b/src/prng/rand_r.c
index ef26dbde..638614c8 100644
--- a/src/prng/rand_r.c
+++ b/src/prng/rand_r.c
@@ -1,6 +1,15 @@
#include <stdlib.h>
+static unsigned temper(unsigned x)
+{
+ x ^= x>>11;
+ x ^= x<<7 & 0x9D2C5680;
+ x ^= x<<15 & 0xEFC60000;
+ x ^= x>>18;
+ return x;
+}
+
int rand_r(unsigned *seed)
{
- return (*seed = *seed * 1103515245 + 12345)/2;
+ return temper(*seed = *seed * 1103515245 + 12345)/2;
}