| author | Dennis Brentjes <dennis@brentj.es> | 2018-06-26 23:16:41 +0200 |
|---|---|---|
| committer | Dennis Brentjes <dennis@brentj.es> | 2018-08-18 14:12:53 +0200 |
| commit | 23968a760efa6e03e8d47fbff108ec5aae010fe3 (patch) | |
| tree | 814aa46a2a753bab7aee12ee99a95e3dd5672bf4 /content | |
| parent | d0c805d47d54c20d34795cc8bcf5b80cfa03c31e (diff) | |
| download | thesis-23968a760efa6e03e8d47fbff108ec5aae010fe3.tar.gz thesis-23968a760efa6e03e8d47fbff108ec5aae010fe3.tar.bz2 thesis-23968a760efa6e03e8d47fbff108ec5aae010fe3.zip | |
Small revision of the thesis.
Diffstat (limited to 'content')
| -rw-r--r-- | content/cmix.tex | 8 |
| -rw-r--r-- | content/cmix_additions.tex | 2 |
| -rw-r--r-- | content/conclusion.tex | 8 |
| -rw-r--r-- | content/discussion.tex | 33 |
| -rw-r--r-- | content/implementation.tex | 4 |
| -rw-r--r-- | content/introduction.tex | 4 |
6 files changed, 26 insertions, 33 deletions
diff --git a/content/cmix.tex b/content/cmix.tex
index 3af780d..eac6840 100644
--- a/content/cmix.tex
+++ b/content/cmix.tex
@@ -37,7 +37,7 @@ A major downside of these classic mix network is the amount of public key operat
 \section{\cmix}
-
+\label{sec:cmix}
 \cmix is a new anonymity mix network\cite{cMix}. Just like any other mix network it aims to provide anonymity by hiding the timing information of messages, i.e. the difference in time between a message leaving the client and arriving at its destination. A \cmix network is a fixed network consisting of $N$ nodes: there is a fixed network order and all clients know which computer represents each node in the network. It uses ElGamal encryption and relies heavily on the homomorphic properties of ElGamal.
@@ -94,7 +94,7 @@ where:
 \item $R_i$ is the R vector of node $i$
 \end{itemize}
-Whenever the result reaches the first node again it uses its permutation function $\pi$ on the incoming vector of values and multiplies that result with the encryption of $S$. It sends its result to the next node which does the same. The last node gets the following result.
+Whenever the result reaches the first node again, phase 2 of the precomputation starts. This is a mix phase: the node applies its permutation function $\pi$ to the incoming vector of values and multiplies that result with the encryption of $S$. It sends its result to the next node, which does the same. The last node gets the following result.
 \begin{align}
 &\mathcal{E}_E( \nonumber \\
@@ -114,7 +114,7 @@ where:
 \item $\pi_i$ is the permutation function of node $i$
 \end{itemize}
-The third part of the precomputation is decrypting this final value. Each node can perform part of the decryption with his private key part of $E$. in combination with the encryption specific random value which is used in ElGamal it is called the decryption share. When it multiplies the above value with the decryption share you remove your part of the encryption. So when passing your result to the next node each node can multiply it's decryption share with their input. After the last node performs this action the last nodes has the decrypted value of \eqref{form:EPiRS}. It stores this for use in the realtime phase.
+The third part of the precomputation decrypts this final value. Each node can perform part of the decryption with its private key share of $E$; combined with the encryption-specific random value used in ElGamal, this is called the decryption share. Multiplying the above value with a decryption share removes that node's part of the encryption, so each node multiplies its decryption share into its input before passing the result on to the next node. After the last node performs this step it holds the decrypted value of equation \eqref{form:EPiRS}, which it stores for use in the realtime phase.
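As an aside on the hunk above: the multiplicative homomorphism of ElGamal that \cmix leans on can be sketched in a few lines. The sketch below is a toy Python example with an invented modulus, generator and variable names; it only illustrates the property itself and is not the thesis's libgcrypt back-end.

```python
# Toy ElGamal over Z_p*, illustrating the multiplicative homomorphism cMix
# relies on: the component-wise product of two ciphertexts decrypts to the
# product of the two plaintexts. Parameters are illustrative only.
import secrets

p = 2 ** 127 - 1     # toy Mersenne-prime modulus (not a real-world choice)
g = 3                # toy generator

def keygen():
    x = secrets.randbelow(p - 2) + 1            # private key
    return x, pow(g, x, p)                      # (private, public)

def encrypt(pub, m):
    k = secrets.randbelow(p - 2) + 1            # per-message randomness
    return pow(g, k, p), (m * pow(pub, k, p)) % p

def decrypt(priv, ct):
    c1, c2 = ct
    return (c2 * pow(pow(c1, priv, p), -1, p)) % p

def ct_mul(ct_a, ct_b):
    # Multiply two ciphertexts component-wise: the result is a valid
    # encryption of the product of the underlying plaintexts.
    return (ct_a[0] * ct_b[0]) % p, (ct_a[1] * ct_b[1]) % p

priv, pub = keygen()
r, s = 1234567, 7654321            # stand-ins for a node's random values
assert decrypt(priv, ct_mul(encrypt(pub, r), encrypt(pub, s))) == (r * s) % p
print("E(r) * E(s) decrypts to r * s")
```

Each node multiplying its own encrypted contribution into the running product, as described in the precomputation hunks above, works exactly because of this property.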
 \subsection{Realtime phase}
 \label{sec:realpre}
@@ -129,7 +129,7 @@ M_{out} = M_{in} \cdot K_{ci}^{-1} \cdot R_j
 \vspace{-1em}
 where:
 \begin{itemize}[label=]
-\item $M_{in}$ is one of the input messages $E$
+\item $M_{in}$ is one of the input messages
 \item $R_j$ is the corresponding $r$ of $M_{in}$'s position
 \item $K_{ci}$ is the shared key of $M_{in}$'s position
 \end{itemize}
diff --git a/content/cmix_additions.tex b/content/cmix_additions.tex
index b377c03..ee7edb6 100644
--- a/content/cmix_additions.tex
+++ b/content/cmix_additions.tex
@@ -1,6 +1,6 @@
 \section{\cmix additions}
 \label{sec:cmixaddtions}
-So the base protocol still has some issues, thankfully these issues can be addressed at the cost of some speed and clarity. Because it would not be safe to use \cmix in the wild without these attack mitigations. This implementation adds the extra messages needed as this results in a more realistic benchmark.
+The base protocol still has some issues\cite{galteland2016attacks}. Thankfully these issues can be addressed at the cost of some speed and clarity, and doing so is necessary because it would not be safe to use \cmix in the wild without these attack mitigations. This implementation adds the extra messages needed, as this results in a more realistic benchmark.
 \subsection{Tagging attack}
 \label{sec:tagging}
diff --git a/content/conclusion.tex b/content/conclusion.tex
index 5718746..befeb8d 100644
--- a/content/conclusion.tex
+++ b/content/conclusion.tex
@@ -4,7 +4,13 @@ The big picture shows using elliptic curve for \cmix can be very beneficial. Esp
 So in general using elliptic curve for \cmix shows promise. However there is more research to be done. Writing optimized backends for both algorithms and rerunning the tests to see if the differences still hold is one of them, and one of the fundamental things that need to be checked in further research. The goal of this research was feasibility, and this research shows just that. Now somebody with knowledge of writing fast and constant-time cryptography implementations can pick up the topic of writing specialized backends and retest the algorithm.
-Another point to be taken into consideration is that this is the happy flow of the protocol. Checks like the tagging attack mitigation talked about in section \ref{sec:tagging} are implemented in a ``null'' fashion. Meaning the functions are in place and are being called but the implementation just returns a ``null'' byte as hash of the things it supposed to hash. Therefore the hash checking code is not in place. This was a deliberate choice, as these checks would not meaningfully affect the timings. The hashing algorithm scales linearly with the input size and would be the same over the 2 protocols. and checking it would also scale linearly with the input. Therefore it would be a constant time addition which would differ with a factor 8 between the 2 algorithms. In fact it would only pollute the timing results of the other cryptographic operations. However the protocol therefore needs some work to incorporate the hash checking where necessary. Therefore complying to the protocol standard.
+Another point to be taken into consideration is that this is the happy flow of the protocol. Checks like the tagging attack mitigation discussed in section \ref{sec:tagging} are implemented in a ``null'' fashion: the functions are in place and are being called, but the implementation just returns a ``null'' byte as the hash of the things it is supposed to hash. The hash checking code is therefore not in place. This was a deliberate choice, as these checks would not meaningfully affect the timings. The hashing algorithm scales linearly with the input size and would be the same for the two protocols, and checking the hash would also scale linearly with the input. It would therefore be a constant-time addition, and in fact it would only pollute the timing results of the other cryptographic operations. However, the protocol still needs some work to incorporate the hash checking where necessary and thereby comply with the protocol standard.
+
+Another interesting avenue to take is to simulate real network traffic. There are frameworks\cite{zhang2015survey} to this end, but none of the more popular and established ones work on the application layer, and adapting the current framework to work on the network level, or routing network traffic over the application layer, is too much work to be in scope for this research.
+
+And finally, there is still room to run this benchmark on separate machines: a dedicated server for each of the nodes and 500 separate clients running the protocol. The benchmark framework is capable of this type of setup. All the node communication is done over TCP + SSL, and the communication to the statistics collection daemon over plain TCP. But again, writing the additional tooling to distribute the executables and automation scripts would take too long for the current research.
+
+\newpage
diff --git a/content/discussion.tex b/content/discussion.tex
index e27fcd6..397f43d 100644
--- a/content/discussion.tex
+++ b/content/discussion.tex
@@ -16,7 +16,8 @@ Note that these separate runs of the ed25519 protocol can trivially be paralleli
 This implementation uses a prefix to the message data that is the destination id. This takes up $20$ bytes of each message, as it is the SHA1 hash of the public key of the receiver. So the payload would become $236$ and $228$ bytes for \mg and \ec respectively.
-\subsection{justification af claims}
+\subsection{Justification of claims}
+\label{sec:microbench}
 To justify the claims made in the rest of the discussion section I created a micro benchmark of both implementations, \ec and \mg. Please note that these are raw operations on their respective elements and thus do not take the difference in size into account. Therefore the \ec results should effectively be multiplied by $8$ for a straight-up comparison.
 \begin{description}
@@ -26,10 +27,10 @@ To justify the claims made in the rest of the discussion section I created a mic
 \end{description}
 \noindent The results of a micro benchmark on the multiplicative group.
-\VerbatimInput{results/mb_mg.txt}
+\vspace{-\parsep}\vspace{-\parsep}\VerbatimInput{results/mb_mg.txt}
 \noindent The results of a micro benchmark on the elliptic curve Ed25519.
-\noin\VerbatimInput{results/mb_ec.txt}
+\vspace{-\parsep}\vspace{-\parsep}\VerbatimInput{results/mb_ec.txt}
 This clearly shows that adding curve points is much faster than multiplying 2048-bit numbers in a group, but it also shows how slow inverting is. This gives us insight into how the results we see in the above tables came to be.
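To make the shape of such a micro benchmark concrete, here is a minimal timing sketch in plain Python for the multiplicative-group side: it times multiplication and inversion modulo a freshly generated 2048-bit prime. The prime generation, iteration count and timing loop are assumptions made for illustration; the numbers in results/mb_mg.txt and results/mb_ec.txt come from the thesis's own libgcrypt-based micro benchmark, not from this script.

```python
# Minimal micro-benchmark sketch: time modular multiplication and modular
# inversion with a 2048-bit prime modulus. Illustrative only.
import secrets
import time

def is_probable_prime(n, rounds=40):
    # Miller-Rabin primality test with random bases.
    if n < 2 or n % 2 == 0:
        return n == 2
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def random_prime(bits=2048):
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

def bench(label, op, iterations=2000):
    start = time.perf_counter()
    for _ in range(iterations):
        op()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed / iterations * 1e6:.1f} us/op")

p = random_prime()
a = secrets.randbelow(p - 1) + 1
b = secrets.randbelow(p - 1) + 1

bench("2048-bit modular multiplication", lambda: (a * b) % p)
bench("2048-bit modular inversion", lambda: pow(a, -1, p))
```

A comparable harness for the Ed25519 side would need a curve library for point addition and is left out here; the point of the sketch is only the overall structure of such a measurement.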
@@ -40,7 +41,9 @@ Right away we can see a large difference between \ec and \mg, with \ec being mor
 \begin{equation}
 \label{eq:prepre}
 \sum_{x=0}^{\infty} n \cdot \left( \frac{1}{2}\right)^x =\ 2n
-\end{equation}
+\end{equation}
+
+A quick benchmark done with ``valgrind --tool=callgrind'' shows that for the entire run $70\%$ of the time is spent getting random bytes through the libgcrypt API for \ec, but ``only'' $46\%$ is spent gathering random bytes for \mg. Another slowing factor might be memory fragmentation. The way libgcrypt operates, each object is its own opaque pointer to an internal data structure, so there is no way to consolidate memory into a single block. This would not be a factor if both algorithms used the same number of allocations, but because we do 8 simultaneous mixes for \ec we have 8 times the number of allocations for the \ec algorithm. This impacts the performance quite heavily.
@@ -55,28 +58,12 @@ In this stage we only do a `simple' cryptographic operation on each of the point
 \subsection{Realtime precomputation}
-Here is a large time saver, namely \ec is almost $13$ times faster. Additionally it is a time saver in the realtime portion of the protocol. This will massively impact perceived latency by the clients.
-
-\ec is much faster because the we only need to multiply the incoming values with 2 precomputed values in the precomputation phase, the inverse of the session key $K^{-1}$, and the value of $R$. These operations are faster in \ec than they are in \mg.
+Here is a large time saver: \ec is almost $13$ times faster. Additionally it is a time saver in the realtime portion of the protocol, which will massively impact the latency perceived by the clients. The cryptographic primitives in this step are $1$ inversion and $1$ multiplication or point addition. Although an inversion is still very slow compared to adding 2 points in Ed25519, it is considerably faster than an inversion in the 2048-bit multiplicative group, as we can see in section \ref{sec:microbench}.
 \subsection{Realtime mix and realtime postcomputation}
-The problem with these 2 phases is that the duration of these phases are relatively short and therefore are more unreliable to derive any information from. For instance the Realtime mix operation seems to be slower in \ec. This could be the case and could be again explained by the serialization choices made for elliptic curve points. However the difference is so small with respect to the clock resolution that it is risky to claim this. It could be that these phases are calling more functions and we are just measuring the indirection overhead of the \cmix library implementation.
-
-It could also be due to memory fragmentation. The \ec implementation is uses smaller elements, allocating an element size buffer has a bigger chance to succeed within just cleared memory, but subsequent buffers might not find free bytes in the neighborhood. I try to minimize this by allocating all the elements I need at a time. However I don't have much control over for instance the protobuf library. I use the arena allocator from libprotobuf\cite{protobuf-arena} but have no idea how large these arenas are and how the cache locality is for the \cmix network messages.
-
-In short there is not many conclusions we can derive from these results. However these phases seem to be similar between the 2 implementations. One interesting thing here is the Realtime phase in node 3 of both \ec and \mg. It takes quite some more time to remove the R and S than I expected that being said it seems to take about 3 times as long for \mg than \ec and that gives us some insight how long it takes to invert an element in both implementations.
-
-\subsection{Precomputation phase}
-
-The complete precomputation phase on average takes $22.98$ seconds for \ec versus $175.49$ seconds for \mg. So for elliptic curve to be worth it we would need to make it a bit faster than it now is. But I fully believe it is possible. Again removing the unnecessary inversions would help. Using a different cryptographic library as backend for the ed25519 cryptography would have a much greater impact. How useful libgcrypt may have been for experimenting and debugging certain aspects of the application, it's cryptographic routines might not be as optimized as say an ed25519-donna. That being said ed25519-donna does not implement all the primitives I need to implement all the \cmix primitives. But creating a custom ed25519 library or using one that does support all the primitives \cmix needs would make a huge impact on the performance.
-
-This also holds for the multiplicative group implementation of libgcrypt but I expect that both algorithms would have the same level optimization within the library, because both implementation use the underlying big integer library and operations.
-
-\subsection{Realtime phase}
-
-The complete realtime phase on average takes $2.06$ seconds for \ec versus $75.81$ seconds for \mg. So in the Realtime phase using ed25519 is really worth it.
-
+In the last 2 phases \ec is considerably slower than \mg. The unnecessary inversion in the wire protocol might be to blame for this; it is hard to say for certain, but taking into account the operations done in those phases and section \ref{sec:microbench}, it is the most likely cause.
+That being said, these phases take around $1$s per node, and the time saved during the realtime precomputation phase still makes this worthwhile.
diff --git a/content/implementation.tex b/content/implementation.tex
index d6dbef4..d0ec21a 100644
--- a/content/implementation.tex
+++ b/content/implementation.tex
@@ -35,11 +35,11 @@ So to use the homomorphic property of ElGamal we need to directly encode user in
 \subsubsection{Elligator}
-An algorithm that might spring to mind first is the Elligator algorithm\cite{Bernstein:2013:EEP:2541806.2516734} to map curve points to strings indistinguishable from random strings. However this is the exact reverse of the operation I need in \cmix. Elligator works by converting curve points to a representative, a random string. However not every possible string of that size is an representative of a curve point. This makes it impossible to use the Elligator reverse mapping for this problem.
+An algorithm that might spring to mind first is the Elligator algorithm\cite{Bernstein:2013:EEP:2541806.2516734}, which maps curve points to strings indistinguishable from random strings. However, this is the exact reverse of the operation I need in \cmix. Elligator works by converting curve points to a representative, a random-looking string of bytes, but not every possible string of that size is a representative of a curve point. This makes it impossible to use Elligator for the reverse mapping in this problem.
 \subsubsection{Koblitz's method}
-There exists no deterministic and reversible mapping from integer values to elliptic curve points. However there is a probabilistic algorithm called the Koblitz's method\cite{Koblitz}. It starts by choosing a number that will be used as a ``stride''. The string you want to encode as a point has to be interpreted as a group element and multiplied by the stride. The result of this multiplication must also be a group element. This means your message space is made smaller by a factor ``stride''. Now you can check if your result has a corresponding $x$ coordinate, in the case of Ed25519. If it has one you are done, if it doesn't you add $1$ to your current trial $y$ coordinate and check again. This continues up until your trial $y$ coordinate reaches the value of $first\_trial\_y\_coordinate + stride$. If you haven't found a suitable $y$ coordinate by now this mapping algorithm fails.
+There does exist an algorithm to map arbitrary strings to curve points: a probabilistic algorithm called Koblitz's method\cite{Koblitz}. It starts by choosing a number that will be used as a ``stride''. The string you want to encode as a point is interpreted as a group element and multiplied by the stride; the result of this multiplication must also be a group element, which means your message space is made smaller by a factor ``stride''. Now you can check whether your result has a corresponding $x$ coordinate, in the case of Ed25519. If it has one you are done; if it doesn't, you add $1$ to your current trial $y$ coordinate and check again. This continues until your trial $y$ coordinate reaches the value of $first\_trial\_y\_coordinate + stride$. If you haven't found a suitable $y$ coordinate by then, this mapping algorithm fails.
 You can map a point back to the original string by dividing the y coordinate by the ``stride''. This works because of integer division and because we only add a value less than the stride to the original number.
diff --git a/content/introduction.tex b/content/introduction.tex
index abe04b8..252a831 100644
--- a/content/introduction.tex
+++ b/content/introduction.tex
@@ -2,10 +2,10 @@ Showing that one piece of software is faster than another is somewhat of an art. Trying to keep as many of the variables the same while varying the one you are interested in is not easy. Especially when the implementations are not of the same algorithm. This is the case for the \cmix mix network, where you can choose between doing ElGamal in either multiplicative groups or in elliptic curves. This makes benchmarking the fundamental performance differences between these two difficult and interesting. This paper and companion framework implementation focuses on providing a fair comparison between the two, by providing a common interface for the \cmix primitives that can be implemented by different cryptographic back-ends.
-To start things of, and to see if there is an benefit in using ElGamal over elliptic curves we implemented a elliptic curve and multiplicative group back-end with the same underlying cryptographic primitive library that supported both.
+To start things off, and to see if there is a benefit in using ElGamal over elliptic curves, we implemented an elliptic curve and a multiplicative group back-end with the same underlying cryptographic primitive library, which supports both.
 We will try to show how \cmix works and how it needs to be implemented to be safe against some of the known attacks, touching on some of the implementation details of problems that needed to be solved and what kind of impact they have on the run time of the algorithms. We will also discuss how we timed the applications and gathered data, keeping in mind the limitations of the platform used. Finally we hope to show that both these implementations of \cmix scale linearly in the number of clients that participate in a run, and that an elliptic curve implementation of \cmix could be an interesting alternative to the multiplicative group with respect to run time.
-The paper is structured as follows. First we will talk about other anonymity networks in section \ref{sec:anon}. Section \ref{sec:cmix} will talk about the \cmix network and how it works and why it works with ElGamal. Followed by \ref{sec:implementation}; which talks about some implementation specific things. Section \ref{sec:cmixaddtions} will talk about some flaws in the original \cmix protocol and discusses how to fix them. Then we talk about the results in section \ref{sec:results}, followed by the discussion of the results in section \ref{sec:discussion}. Final thought and further research ideas are in the conclusion section \ref{sec:conclusion}.
\ No newline at end of file
+The paper is structured as follows. First we will talk about other anonymity networks in section \ref{sec:anon}. Section \ref{sec:cmix} will talk about the \cmix network, how it works and why it works with ElGamal. This is followed by section \ref{sec:implementation}, which talks about some implementation-specific things. Section \ref{sec:cmixaddtions} will talk about some flaws in the original \cmix protocol and discuss how to fix them. Then we talk about the results in section \ref{sec:results}, followed by the discussion of the results in section \ref{sec:discussion}. Final thoughts and further research ideas are in the conclusion, section \ref{sec:conclusion}.
\ No newline at end of file
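For reference, the Koblitz-style encoding described in the implementation.tex hunk above can be sketched as follows. This toy version probes the x coordinate of a small short-Weierstrass curve rather than the y coordinate of Ed25519, and the prime, curve coefficients, stride and helper names are invented purely for illustration.

```python
# Toy Koblitz-style encoding: map a small integer message to a point on the
# short-Weierstrass curve y^2 = x^3 + a*x + b over F_p by scaling the message
# with a "stride" and probing consecutive x coordinates until one lies on the
# curve. Decoding is integer division by the stride.
p = 10007              # toy prime field (p % 4 == 3, so sqrt is a single pow)
a, b = 2, 3            # toy curve coefficients
STRIDE = 64            # shrinks the usable message space by this factor

def sqrt_mod(n):
    # Square root modulo p, valid because p % 4 == 3; returns None if n is
    # not a quadratic residue.
    root = pow(n, (p + 1) // 4, p)
    return root if (root * root) % p == n else None

def encode(m):
    # Assumes m * STRIDE + STRIDE <= p so the probing stays inside the field.
    for j in range(STRIDE):
        x = m * STRIDE + j
        y = sqrt_mod((x ** 3 + a * x + b) % p)
        if y is not None:
            return x, y
    raise ValueError("no curve point found within the stride")

def decode(point):
    # Integer division removes the probing offset j, which is < STRIDE.
    return point[0] // STRIDE

msg = 123
point = encode(msg)
assert decode(point) == msg
print("encoded", msg, "as", point)
```

Because every probe adds at most STRIDE - 1 to the scaled message, integer division by the stride recovers the original value, which mirrors the decoding step described in the hunk.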
