\section{Discussion} \label{sec:discussion} \newcommand{\ec}[0]{\emph{ec}\xspace} \newcommand{\mg}[0]{\emph{mg}\xspace} In this section \ec and \mg will refer to the 2 different implementations that we compare. They stand for elliptic curve and multiplicative group. So lets first talk about what our expectations will be. The size of the 2 types of messages, for \mg and \ec, are slightly different. We use a 2048 bit multiplicative group so we have 256 byte messages. Ed25519 has a little below 255 bit group elements giving us less then 32 byte messages. For convenience and because we're using Koblitz's method we only use 31 byte for the message part. But to counter this we do 8 mixes at a time to go up to 248 byte messages. Still 8 bytes short, but this makes the timings more comparable. Note that these separate runs of the ed25519 protocol can trivially be parallelized, possibly making it even more interesting by comparison. But as we are interested in a straight up comparison this implementation does not parallelize multiple mixes. This implementation uses a prefix to the message data that is the destination id. This takes up $20$ bytes of each message, as it is the SHA1 hash of the public key of the receiver.So the payload would become $236$ and $228$ bytes for \mg and \ec respectively. \subsection{Precomputation precomputation} Right away we can see a large difference between \ec and \mg, with \ec being more than $2$ times slower. Most likely this can be explained by 2 factors. The first being that this step includes generating all the randomness needed for the \cmix run. It generates $R$ and $S$ and the permutation $\pi$. For Ed25519 it tries a $Y$-value and if it doesn't map onto the curve we try a different one. On average only half of the points would be valid. \begin{equation} \label{eq:prepre} \sum_{x=0}^{\infty} n \cdot \left( \frac{1}{2}\right)^x =\ 2n \end{equation} Another slowing factor might be the memory fragmentation. The way that libgcrypt operates is that each object is it's own opaque pointer to an internal data structure. So there is no way to consolidate memory into a single block. This wouldn't be a factor if both algorithms would use the same amount of allocations, but now that we do 8 simultaneous mixes for \ec we have 8 times the amount of the allocations for the \ec algorithm. This impacts the performance quite heavily \subsection{Precomputation mixing} \label{sec:premix} Another big time loss is attributed to the wire format chosen. because we only send over the Y coordinate, we need to convert the coordinate from projected to affine coordinate systems. This takes up extra time which can be eliminated if we choose. Unfortunately if we want to do this using libgcrypt we need to send over all 3 projected coordinates as there is no way to use only the $Y$ and $Z$ coordinates to reconstruct the curve point in libgcrypt. \subsection{Precomputation postprocessing} In this stage we only do a `simple' cryptographic operation on each of the points to determine the node's decryption share. There are no extra allocations and no unneeded inversions, and we can see \ec is $2.5$ times faster. \subsection{Realtime precomputation} Here is a large time saver, namely \ec is almost $13$ times faster. Additionally it is a time saver in the realtime portion of the protocol. This will massively impact perceived latency by the clients. \ec is much faster because the we only need to multiply the incoming values with 2 precomputed values in the precomputation phase, the inverse of the session key $K^{-1}$, and the value of $R$. These operations are faster in \ec than they are in \mg. \subsection{Realtime mix and realtime postcomputation} The problem with these 2 phases is that the duration of these phases are relatively short and therefore are more unreliable to derive any information from. For instance the Realtime mix operation seems to be slower in \ec. This could be the case and could be again explained by the serialization choices made for elliptic curve points. However the difference is so small with respect to the clock resolution that it is risky to claim this. It could be that these phases are calling more functions and we are just measuring the indirection overhead of the \cmix library implementation. It could also be due to memory fragmentation. The \ec implementation is uses smaller elements, allocating an element size buffer has a bigger chance to succeed within just cleared memory, but subsequent buffers might not find free bytes in the neighborhood. I try to minimize this by allocating all the elements I need at a time. However I don't have much control over for instance the protobuf library. I use the arena allocator from libprotobuf\cite{protobuf-arena} but have no idea how large these arenas are and how the cache locality is for the \cmix network messages. In short there is not many conclusions we can derive from these results. However these phases seem to be similar between the 2 implementations. One interesting thing here is the Realtime phase in node 3 of both \ec and \mg. It takes quite some more time to remove the R and S than I expected that being said it seems to take about 3 times as long for \mg than \ec and that gives us some insight how long it takes to invert an element in both implementations. \subsection{Precomputation phase} The complete precomputation phase on average takes $22.98$ seconds for \ec versus $175.49$ seconds for \mg. So for elliptic curve to be worth it we would need to make it a bit faster than it now is. But I fully believe it is possible. Again removing the unnecessary inversions would help. Using a different cryptographic library as backend for the ed25519 cryptography would have a much greater impact. How useful libgcrypt may have been for experimenting and debugging certain aspects of the application, it's cryptographic routines might not be as optimized as say an ed25519-donna. That being said ed25519-donna does not implement all the primitives I need to implement all the \cmix primitives. But creating a custom ed25519 library or using one that does support all the primitives \cmix needs would make a huge impact on the performance. This also holds for the multiplicative group implementation of libgcrypt but I expect that both algorithms would have the same level optimization within the library, because both implementation use the underlying big integer library and operations. \subsection{Realtime phase} The complete realtime phase on average takes $2.06$ seconds for \ec versus $75.81$ seconds for \mg. So in the Realtime phase using ed25519 is really worth it.