A couple of bogus prime sieves have shown up in the blogorama lately. After reading a very interesting paper about the authentic prime sieve algorithm, I became kind-of hooked on implementing it. The paper is great but the examples are all in Haskell, and I can’t really understand that too well. However I did manage to figure out the basic point.

The “wrong” prime sieve implementations all operate by accumulating a list of known primes, and then testing candidates by seeing whether they’re divisible by any of them. If not, then the candidate must be prime. It’s only necessary to test from 2 through sqrt(candidate) of course. Well maybe not “of course” but that’s the way it is anyway. Thus the time bound on that approach is O(n sqrt(n)), sort-of, since prime numbers are pretty uniformly distributed along the natural number line. (**Note:** the preceding claim about prime distribution is incorrect, but as the algorithm doesn’t take density of primes into account the time bound (in terms of numbers tested, not number of primes generated) is about right I think. Thanks Cybil!)

The “right” prime sieve is different (duhh). What it does is keep track of the composite numbers implied by each known prime. In other words, once (in the degenerate initial case) you decide that 2 is prime, you know that all multiples of 2 (starting with 4) must not be prime. As you work upwards through candidates, you know a candidate is prime if it is smaller than the smallest composite number you know about so far. This approach is the “right” approach because, if you do the “what’s the smallest composite?” part properly, you end up with an O(n log n) algorithm and that’s a lot better than the wrong way.

The way to do the “what’s the smallest composite” test is to keep the composite lists (that is, one list per known prime) in a priority queue. In this case, it’s possible to cheat a little with adding primes to the queue because it’s going to be the case that the first composite implied by a new known prime is always a bigger number than any other known composite. That’s probably provable but I don’t feel like doing that now.

Implementing the priority queue in a language like Erlang or Clojure (well one could cheat in Clojure but I didn’t want to) requires that it be done with some sort of functional data structure. In Java (and I did do a Java version but it’s ugly and boring) you can use an array list to store a “heap” for the queue, and inserts are super easy because you just drop your new composite stream in the next available slot of an ArrayList. In the trendy functional languages, however, you can’t do that.

What I came up with (probably after several billion other people did) was to use a functional (immutable) tree structure that I could rebuild when doing an “add” operation. The trick is to consider what’s going on with the ArrayList cheater approach. If you’re dropping something into the next available slot at the end of the heap, well what does that look like if you visualize the heap as a tree? You can work your way up the tree by repeatedly dividing that slot index by 2 until you get to the root. Each time, you know that if the remainder is an odd number you’re moving up a right link, and if it’s an even number you’re moving up a lefft link. Thus by looking at the binary representation of the heap slot index (backwards), you have a little “path” descriptor that’ll guide you straight from the top of the heap down to the next empty slot.

So in Erlang I represent the priority queue as a tuple with the current queue size, and the heap. The heap is also a tuple, containing the top node and the left and right children – recursively, also tuples. Each node is a tuple containing the prime number, the current known composite, and a counter to generate the next composite.

```
-module(pq).
-export([add/2, primes/1]).
% Primes
np(Prime) -> {Prime, Prime * Prime}.
next({Prime, M}) -> {Prime, Prime + M}.
cur({_Prime, V}) -> V.
```

The queue utilities:

```
% The priority queue
newQ() -> empty.
leaf(P) -> {P, nil, nil}.
sz(empty) -> 0;
sz({Sz, _T}) -> Sz.
% Find where to put a new prime (end of the queue)
path(N) -> path(N, []).
path(1, L) -> L;
path(N, L) -> path(N bsr 1, [N band 1 | L]).
% Add a newly-discovered prime
add(empty, Prime) -> {1, leaf(np(Prime))};
add({Sz, T}, Prime) -> {Sz + 1, add(T, Prime, path(Sz + 1))}.
% ... follow the path to the nil!
add(nil, Prime, []) -> leaf(np(Prime));
add({P, L, R}, Prime, [0 | Path]) -> {P, add(L, Prime, Path), R};
add({P, L, R}, Prime, [1 | Path]) -> {P, L, add(R, Prime, Path)}.
val(empty) -> nothing;
val({P, _L, _R}) -> cur(P);
val({_Sz, {P, _L, _R}}) -> cur(P).
```

Now there are two things to do when cranking through candidate primes. If your candidate is in fact a prime, you need to add it to the queue (in the form of one of those tuples). If it’s not a prime, you’ve “used up” the minimum known composite number so you need to get a new one. Well it turns out that of course your primes will sometimes generate overlapping composite numbers – both 2 and 3 will give you 12, for example. Thus when you need to “bump” the prime queue up to the next useful value, you need to keep working until the new minimum composite number is bigger than the one you just threw away.

To “bump” the queue, then, we’ll roll to the next composite provided by the prime at the top of the queue. But now of course the heap may not be a heap, so we need to “fix” it. The “bump-fix” needs to repeat until we’ve got a good new minimum.

Thus, the “bump” code:

```
% Gen next composite number from topmost prime, adjust heap
bump({Sz, T}, N) -> {Sz, bumpt(T, N)}.
bumpt({P, L, R} = T, V) ->
case cur(P) of
Vp when Vp =< V -> bumpt(fix({next(P), L, R}), V);
_ -> T
end.
% adjust heap by swapping primes down until it's a heap again
fix({_P, nil, nil} = T) -> T;
fix({P, {Pl, nil, nil}, nil} = T) ->
case {cur(P), cur(Pl)} of
{V, Vl} when V < Vl -> T;
_ -> {Pl, {P, nil, nil}, nil}
end;
fix({P, {Pl, Ll, Rl} = L, {Pr, Lr, Rr} = R} = T) ->
case {cur(P), cur(Pl), cur(Pr)} of
{V, Vl, Vr} when V < Vl, V < Vr -> T;
{_V, Vl, Vr} when Vl < Vr ->
{Pl, fix({P, Ll, Rl}), R};
_ ->
{Pr, L, fix({P, Lr, Rr})}
end.
```

The main “primes” routine (which just counts primes up to some limit) is therefore:

```
primes(Max) -> primes(newQ(), 2, Max).
primes(Q, N, Max) when N >= Max -> sz(Q);
primes(Q, N, Max) ->
primes(case val(Q) of N -> bump(Q, N); _ -> add(Q, N) end, N + 1, Max).
```

I did this up in Scheme, and it was kind-of messy and gross. Of course I don’t really know much Scheme, but the hurdle was the representation of the prime tuples and stuff like that. It took a lot of ugly little “(caddr foo)” functions.

Then I moved on to Clojure. Clojure is a neat-o Lisp dialect that runs on the Java VM. Here I’m not really exploting any Java stuff. I did however exploit some of the useful built-in “lazy” features:

```
(def countprimes (fn [Max] (let
[
; The representation of "composites from prime P" is a
; Clojure lazy sequence, made by mapping a function to
; multiply the prime by each of a lazy sequence, counting
; up from the prime.
countFrom (fn [Start] (iterate #(+ 1 %) Start))
composites (fn [Prime] (iterate #(+ Prime %) (* Prime Prime)))
; Tools for manipulating the queue
newQ #^{:size 0} [nil nil nil]
szQ (fn [Q] (get ^Q :size))
curMin (fn [[Top Left Right]] (if (nil? Top) nil (first Top)))
; Add a newly-discovered prime to the queue. This entails making
; a new "composites" sequence and dropping into a new leaf node
; at the end of the queue.
addPrime (let
[
; Binary representation of N, backwards
path (fn [N]
(reduce #(list* %2 %1) ()
(for [n (iterate #(quot % 2) N) :while (> n 1)] (rem n 2))
))
; Follow the path to the leaf
addr (fn addr [Np [Top Left Right] [LR & Rest]]
(if (nil? LR)
[Np nil nil]
(if (== 0 LR)
[Top (addr Np Left Rest) Right]
[Top Left (addr Np Right Rest)]
))
)
]
(fn [Prime Q] (let [newSz (+ 1 (szQ Q))]
(with-meta (addr (composites Prime) Q (path newSz)) {:size newSz})
)
))
; Get a new minimum composite, after detecting a non-prime
bumpUp (let
[
; Make sure the heap really is a heap. Note that there'll
; never be a node with nil on the left and non-nil on the
; right.
fix (fn fix [[Top Left Right :as Qfixed]]
(let [[LTop LLeft LRight] Left [RTop RLeft RRight] Right]
(if (and (nil? LTop) (nil? RTop))
Qfixed
(if (nil? RTop)
(if (<= (first Top) (first LTop))
Qfixed
[LTop (fix [Top LLeft LRight]) Right]
)
(if (and (<= (first Top) (first LTop)) (<= (first Top) (first RTop)))
Qfixed
(if (<= (first LTop) (first RTop))
[LTop (fix [Top LLeft LRight]) Right]
[RTop Left (fix [Top RLeft RRight])]
))))
))
]
(fn [Comp Q] (let [sz (szQ Q)]
(with-meta (loop [[Top Left Right :as Qbumped] Q]
(if (< Comp (first Top))
Qbumped
(recur (fix [(rest Top) Left Right]))
)
) {:size sz})
)
))
; It's either a new prime, or it isn't.
tryNumber (fn [Q Candidate] (let [min (curMin Q)]
(if (nil? min)
(addPrime Candidate Q)
(if (< Candidate min)
(addPrime Candidate Q)
(bumpUp Candidate Q)
))
))
]
(get ^(reduce tryNumber newQ (range 2 Max)) :size)
)))
```

Because Clojure supports some "decomposition" forms that make it easy to chop arguments up (though not quite as easy as in Erlang), it's somewhat neater and less noisy than my amateur Scheme version. Also in Clojure I can carry the queue size around as a meta property of the queue heap, though there's some weirdness about that which I don't completely understand. (Note - I cleaned up the Clojure source a little bit just now, and added some comments.)

The Erlang version is faster than the Clojure version, if anybody cares. I strongly suspect that both are much faster than the "check for prime factors" wrong sieve implementations. Note that in these the operations are all adds, compares, and multiplies - there are no divisions or square roots to compute.

**Update:** Thanks, qebab, for the hint, and I've now gotten rid of the useless multiplications in the composite iterations.