Entropy has a formal definition: the minimum expected number of bits needed to represent the output of a distribution. But I view information as a more abstract concept of which entropy is just one instantiation. When you think about concepts like conditional information, mutual information, and symmetry of information, the idea of an underlying distribution tends to fade away and you begin to think of information itself as an entity in its own right. And when you look at Kolmogorov complexity, often called algorithmic information theory, the measure is over strings, not distributions, yet it exhibits many of the same concepts and relationships found in the entropy setting.
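To make those notions concrete, here is a minimal Python sketch (my own illustration, not anything from Shannon's work) that computes entropy, conditional entropy, and mutual information for a small made-up joint distribution, and checks the symmetry I(X;Y) = I(Y;X):

```python
import math
from collections import Counter

# Made-up joint distribution p(x, y) of two binary variables, purely for illustration.
joint = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.1,
    (1, 1): 0.4,
}

def entropy(dist):
    """H = -sum p log2 p: the minimum expected number of bits to encode a draw."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Marginals p(x) and p(y)
px, py = Counter(), Counter()
for (x, y), p in joint.items():
    px[x] += p
    py[y] += p

h_x, h_y, h_xy = entropy(px), entropy(py), entropy(joint)

# Conditional entropy H(X|Y) = H(X,Y) - H(Y), and mutual information two ways.
h_x_given_y = h_xy - h_y
h_y_given_x = h_xy - h_x
mi_xy = h_x - h_x_given_y
mi_yx = h_y - h_y_given_x   # symmetry: I(X;Y) = I(Y;X)

print(f"H(X) = {h_x:.3f}, H(Y) = {h_y:.3f}, H(X,Y) = {h_xy:.3f}")
print(f"I(X;Y) = {mi_xy:.3f} = I(Y;X) = {mi_yx:.3f}")
```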
Computational Complexity owes much to Shannon's information theory. We can use information theory to prove lower bounds on communication protocols and circuits, and even upper bounds for algorithms. Last spring the Simons Institute for the Theory of Computing ran a semester-long program on Information Theory, including a workshop on Information Theory in Complexity Theory and Combinatorics. Beyond theory, relative entropy, also known as Kullback–Leibler divergence, plays an important role in measuring the effectiveness of machine learning algorithms.
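As a rough illustration of that last point, here is a short Python sketch of relative entropy (the distributions P and Q below are hypothetical, just standing in for a true distribution and a model's estimate):

```python
import math

def kl_divergence(p, q):
    """Relative entropy D(P || Q) = sum_x p(x) log2(p(x) / q(x)).

    Roughly, the extra bits paid on average when samples from P are
    encoded with a code optimized for Q. Assumes q(x) > 0 wherever
    p(x) > 0; otherwise the divergence is infinite.
    """
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical example: a true distribution P and a model's estimate Q.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(f"D(P||Q) = {kl_divergence(p, q):.4f} bits")
print(f"D(Q||P) = {kl_divergence(q, p):.4f} bits")  # note: not symmetric
```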
We live in an age of information, growing dramatically every year. How do we store information, how do we transmit it, how do we learn from it, and how do we keep it secure and private? Let's celebrate the centenary of the man who gave us the framework to study these questions and so much more.