Sunday, October 29, 2023

Theory that really DOES apply: Security.

I recently read and wrote a review of

                               Math for Security by Daniel Reilly.

(For the review see here. It will appear in SIGACT News at some later point. For the amazon link see here. Disclosure: Lance and I are Amazon Affiliates.)

The book had great example of using THEORY for PRACTICAL problems of security (NOT what you think as you will see later). Since I am always surprised (possibly because of my ignorance) when theory REALLY applied to PRACTICE I asked the author some questions which he answered. Our conversation is below.

BILL: I was surprised that a book called Math for Security didn't have crypto in it. Why was that?

DANIEL:
There's a plethora of excellent material that already covers the topic very well. Cracking codes and analyzing ciphers is the first thing most people think of when they think of the relationship between math and security. There are so many other topics where a little creativity and math can open up new avenues and approaches to unique problems we face in security. As an analyst and consultant, a very small percentage of my time goes into cryptographic systems, but I'm regularly asked other questions.  I'm hoping to show that anyone can apply a little math to start making better (or at least more informed) decisions in many of these areas, not just cryptography.

BILL: Many theorists dream of the day when someone comes along to REALLY use their stuff. For example---

DANIEL: (Cuts Bill off) That's funny. As an analyst I'm always hoping to define a theory which explains some system I've been analyzing. Something like Kim Rossmo's formula in Forensics
(see here). Understanding what your data is telling you is good, but discovering why, that's the pinnacle of accomplishment! I guess the grass is always greener on the other side of the fence!

BILL: The first time I saw the problem of determining the Betweeness Centrality of a node is, it was in a paper showing that it was APSP-hard (so likely not in subcubic time).  Hence I was AMAZED that you use it FOR REAL. How hard was it to take the THEORY that you found in the literature and APPLY it.

DANIEL: For the graph theory in the book, and in general, I think it comes down to some basic statistics and understanding the underlying system being modeled. Betweenness centrality is a great example. It's really easy to calculate, but what it means in practice is wholly dependent on what generated the data. A high Betweenness score in a social network is the result of a different underlying structure than in a computer network. This is one reason graph theory made sense as the place to start. There are so many problems that can be represented as a graph with very little change to the methodology needed, but it does show the need to be flexible in your interpretation of theory.
For the programming side, NetworkX is really well designed so I didn't have to do much to implement the algorithms. That made it easy to focus on showing off how the library could be used when analyzing a real data set. Of course, whenever you're taking a general theory and applying it to a specific problem, you'll always have some pieces you need to build. The pieces that glue the theory
and the data together in a useful way. For me, that is the fun part. The art behind the science. You can give 5 analysts the same problem and get back 10 possible approaches!

BILL: I recently looked at the Complexity of the Art Gallery Problem- it is ER-Complete (see Wikipedia entry on Existential Theory of the Reals) which is between NP and PSPACE. Hence I was AMAZED that you use it for REAL. How hard was it to take the THEORY that you found in the literature and APPLY it. For example, the Guards, unlike the Who (see here) cannot see for miles and miles and miles and miles and miles.  (NOTE: Spell check thinks that NP is a word, but PSPACE is not a word. Odd!)

DANIEL: You sound like a broken record. But still a good question. The Art Gallery Project was the one that kicked off the idea to write the book in the first place. I was looking for a method to analyze physical security layouts and I came across a paper describing the problem as it related to security camera placement. That led me to the original paper and then Fisk's paper using the greedy coloring algorithm. I think I read one or two more pieces about how greedy coloring is implemented in NetworkX, but that was it. Once you understand the basic problem formulation, and Fisk's method of solving for n-vertex polygons, you quickly move out of what the theory was designed for. The Art Gallery project therefore represents what I think is a more typical scenario. I'll often start with a
theoretical solution that makes a lot of simplifying assumptions and then remove the simplifications a little at a time by adding in other theoretical bits (such as adding a function to compute a more realistic guess at what area a guard can protect). Sometimes I will find complimentary research on modeling something like walking speed. Other times I rely on my experiences and best guess (like assigning the space for people at an event). At it's core though, the algorithm solving the layout is the same one suggested by Fisk and implemented in NetworkX.

BILL: Anything else you want to add?

DANIEL: There is a quote I love that gets thrown around in System Dynamics a lot It's better for a model to be useful, than accurate I think that idea applies really well when you're building proof-of-concept systems. For example, a model that is very accurate at predicting an adversary's behavior may not be very useful if it is too complex and expensive to use. Ultimately, the best model is the one that is most useful for solving the problem at hand. This means that it is important to consider other factors than accuracy, such as interpretability, cost, and ease of use, when developing your proofs. It's important to remember the goal is not to perfectly model the system your studying, but to model enough of the key components to get a useful response.

4 comments:

  1. Just ordered a copy. I wonder if you can speak to whether the book describes actual incidents or products in which these math tools have been used? Or is the extent of the applicability python code in the book? I don't doubt the author, I would just like a means to verify that it is actually used, and not just in the book or academic papers.

    ReplyDelete
    Replies
    1. (Bill here) The book discusses actual applications in passing as motivation. Not sure if that counts. Here is a quote that I think addresses this- in the Summary part of the Art Gallery Chapter:

      You now have all the code necessary to solve the core of the art gallery problem, including methods for handling real-world constraints such as limited budgets and effective range for sensors. We've also discussed the power of the Triangle library and combined it with Shapely and Network X to model the problem with both a geometric and graph representation. Perhaps most important of all, you should now be comforable explaining the Art Gallery Problem theory and discussing the practical constraings applied within the applications.

      Delete
    2. Thanks for the comment. It will still be interesting to read the book, but I suppose I can contact the author to ask more details on that. This is background research for my own book, “Practical Math for Programmers”, in which I require sufficient proof that the math is actually used in production to include the topic in the book.

      Delete
  2. "Ultimately, the best model is the one that is most useful for solving the problem at hand." Indeed. https://blogdredd.blogspot.com/2023/10/on-origin-of-genetic-constant-5.html

    ReplyDelete