1 files changed, 8 insertions, 19 deletions
diff --git a/content/discussion.tex b/content/discussion.tex
index 397f43d..34bdc4e 100644
--- a/content/discussion.tex
+++ b/content/discussion.tex
@@ -1,24 +1,11 @@
 \section{Discussion}
 \label{sec:discussion}
-\newcommand{\ec}[0]{\emph{ec}\xspace}
 
-\newcommand{\mg}[0]{\emph{mg}\xspace}
-
-In this section \ec and \mg will refer to the 2 different implementations that we compare. They stand for elliptic curve and multiplicative group.
-
-So lets first talk about what our expectations will be.
-The size of the 2 types of messages, for \mg and \ec, are slightly different.
-We use a 2048 bit multiplicative group so we have 256 byte messages.
-Ed25519 has a little below 255 bit group elements giving us less then 32 byte messages.
-For convenience and because we're using Koblitz's method we only use 31 byte for the message part. But to counter this we do 8 mixes at a time to go up to 248 byte messages. Still 8 bytes short, but this makes the timings more comparable.
-
-Note that these separate runs of the ed25519 protocol can trivially be parallelized, possibly making it even more interesting by comparison. But as we are interested in a straight up comparison this implementation does not parallelize multiple mixes.
-
-This implementation uses a prefix to the message data that is the destination id. This takes up $20$ bytes of each message, as it is the SHA1 hash of the public key of the receiver.So the payload would become $236$ and $228$ bytes for \mg and \ec respectively.
+We will now discuss the results in more detail, zooming in on all the separate phases and see if we can either verify our expectations or explain any deviations.
 
 \subsection{Justification af claims}
 \label{sec:microbench}
-To justify the claims made in the rest of the discussion section I created a micro benchmark on both implementations, \ec and \mg. Please note that these are raw operation on their respective elements and thus does not take into account the difference in size. Therefore the \ec results should effectively be multiplied by $8$ to have a straight up comparison.
+To justify the claims made in the rest of the discussion section a micro benchmark was created for both implementations, \ec and \mg. Please note that these are raw operation on their respective elements and thus does not take into account the difference in size. Therefore the \ec results should effectively be multiplied by $8$ to have a straight up comparison.
  
 \begin{description}
 \item[Random] The time it took to generate 10000 random group elements / curve points.
@@ -43,14 +30,16 @@ Right away we can see a large difference between \ec and \mg, with \ec being mor
 \sum_{x=0}^{\infty} n \cdot \left( \frac{1}{2}\right)^x =\ 2n
 \end{equation}
 
-A quick benchmark done with ``valgrind --tool=callgrind'' shows us that for the entire run $70\%$ of time is spend in getting random bytes trough the $libgcrypt$ api for \ec. But ``only'' $46\%$ is spend gathering random bytes for the \mg
+A quick benchmark done with ``valgrind --tool=callgrind''\cite{ValgrindCallgrind} shows us that for the entire run $70\%$ of time is spend in getting random bytes trough the $libgcrypt$ api for \ec. But ``only'' $46\%$ is spend gathering random bytes for the \mg
 
 Another slowing factor might be the memory fragmentation. The way that libgcrypt operates is that each object is it's own opaque pointer to an internal data structure. So there is no way to consolidate memory into a single block. This wouldn't be a factor if both algorithms would use the same amount of allocations, but now that we do 8 simultaneous mixes for \ec we have 8 times the amount of the allocations for the \ec algorithm. This impacts the performance quite heavily 
 
 \subsection{Precomputation mixing}
 \label{sec:premix}
 
-Another big time loss is attributed to the wire format chosen. because we only send over the Y coordinate, we need to convert the coordinate from projected to affine coordinate systems. This takes up extra time which can be eliminated if we choose. Unfortunately if we want to do this using libgcrypt we need to send over all 3 projected coordinates as there is no way to use only the $Y$ and $Z$ coordinates to reconstruct the curve point in libgcrypt.
+Another big time loss is attributed to the wire format chosen. because we send over the affine $x$ and  $y$ coordinate for \ec, we need to convert the coordinate from projected to affine coordinate systems. This takes up extra time which can be eliminated if we choose. Unfortunately if we want to do this using libgcrypt we need to send over all 3 projected coordinates as there is no way to use only the $Y$ and $Z$ coordinates to reconstruct the curve point in libgcrypt.
+
+Another downside to fixing this by sending projected coordinates is that there is no way of knowing how many bits a projected coordinate is, so the allocation of buffer space in the wire format container will become dependent on the point input. Something which is not necessary for the \mg implementation because it's buffer size will always be the size of the group element. This has to be exposed via the generic API if you to be able to send projected coordinates, because the application binaries, the nodes and clients, are responsible for all the network transmissions, not the \cmix libraries.
 
 \subsection{Precomputation postprocessing}
 
@@ -58,11 +47,11 @@ In this stage we only do a `simple' cryptographic operation on each of the point
 	
 \subsection{Realtime precomputation}
 
-Here is a large time saver, namely \ec is almost $13$ times faster. Additionally it is a time saver in the realtime portion of the protocol. This will massively impact perceived latency by the clients. The cryptographic primitives in this step are $1$ inversion and $1$ multiplication or point add. All though inversion are still very slow compared to adding 2 points in Ed25519, it is still considerably faster than a inversion in 2048 bit multiplicative group as we can see in section \ref{sec:microbench}. 
+this phase is a large time saver, namely \ec is almost $13$ times faster. Additionally it is a time saver in the realtime portion of the protocol. This will massively impact perceived latency by the clients. The cryptographic primitives in this step are $1$ inversion and $1$ multiplication or point add. All though inversion are still very slow compared to adding 2 points in Ed25519, it is still considerably faster than a inversion in 2048 bit multiplicative group as we can see in section \ref{sec:microbench}. 
 
 \subsection{Realtime mix and realtime postcomputation}
 
-In the last 2 phases \ec is considerably slower than \mg. The unnecessary inversion in the wire protocol might be to blame for this. It is hard to say for certain but taking into account the operations done in those phases and section \ref{sec:microbench} it is the most likely cause.
+In the last 2 phases \ec is considerably slower than \mg. This seemed surprising. On closer inspection with ``valgrind --tool=callgrind''\cite{ValgrindCallgrind} we find that almost 94\% of the time was spent in the $gcry\_mpi\_ec\_get\_affine$ function call. Which does inversions which we already know are slow from our micro benchmark. To reiterate 1 of these inversions could be avoided if we would change the wire format to not send over the affine Y coordinate. 
 
 That being said, the phases take around $1$s per node. And the time saved during the realtime precomputation phase still makes this worth while.