The information theory comment is correct but I’m going to elaborate a little bit with some potentially useful info.
For a binary number with N bits, you can represent 2^N values. Easy example: There are 2^8 = 256 possible values that can be represented by an 8-bit value. You can go the other way, too, and ask “how many bits do I need to represent a given value?” by using some math.
2^N = 256
Log2(2^N) = Log2(256)
N = Log2(256) = Log(256)/Log(2) = 8
And you can use this to figure out how many bits you need to represent an arbitrary number of values: N = Log2(1000) = Log(1000)/Log(2) ≈ 9.966, which rounds up to 10 bits. That makes intuitive sense because a 10-bit number has 1024 values.
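If you want to check that arithmetic yourself, here is a minimal Python sketch. The helper name bits_needed is made up for illustration; it uses integer arithmetic instead of rounding a floating-point log, which avoids surprises near exact powers of two.

```python
import math

def bits_needed(num_values: int) -> int:
    """Smallest whole number of bits that can distinguish num_values values."""
    if num_values < 1:
        raise ValueError("num_values must be at least 1")
    # (num_values - 1).bit_length() is an exact integer form of ceil(log2(num_values)).
    return (num_values - 1).bit_length()

print(math.log2(256))     # 8.0
print(math.log2(1000))    # about 9.966
print(bits_needed(256))   # 8
print(bits_needed(1000))  # 10
```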
To get to the theoretical limit you’re asking about we do the same thing. For an arbitrary number x, how many trits M do we need to represent it in trinary and how many bits N do we need to represent it in binary? What is the ratio of bits to trits?
x = 3^M and x = 2^N
Log3(x) = M
Log(x) / Log(3) = M
Log2(x) = N
Log(x) / Log(2) = N
And then we can just take that ratio N/M to figure out how many bits N we need to represent M trits:
R = N/M = [Log(x)/Log(2)] * [Log(3)/Log(x)] = Log(3)/Log(2) ≈ 1.585
To be able to reversibly encode an N-trit ternary number into an M-bit binary number, the number of possible N-trit numbers must be less than or equal to the number of possible M-bit binary numbers. Otherwise there would be two input ternary numbers that would map to the same binary number (pigeonhole principle). Which means that 3^N <= 2^M. Taking the base-2 log of both sides gives M >= N * log2(3), i.e. at least log2(3) ≈ 1.585 bits per trit.
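To make the pigeonhole bound concrete, here is a rough Python sketch that finds the smallest M with 3^N <= 2^M and packs trits into a single integer by treating them as base-3 digits. The function names (min_bits_for_trits, pack_trits, unpack_trits) are hypothetical, and base-3 packing is just one straightforward reversible encoding, not necessarily the one anyone in this thread has in mind.

```python
import math

def min_bits_for_trits(num_trits: int) -> int:
    """Smallest M such that 3**num_trits <= 2**M (exact integer arithmetic)."""
    return (3 ** num_trits - 1).bit_length()

def pack_trits(trits):
    """Encode a list of trits (each 0, 1 or 2) as one integer, read as base 3."""
    value = 0
    for t in trits:
        if t not in (0, 1, 2):
            raise ValueError("each trit must be 0, 1 or 2")
        value = value * 3 + t
    return value

def unpack_trits(value, num_trits):
    """Inverse of pack_trits: recover num_trits trits from the integer."""
    trits = []
    for _ in range(num_trits):
        value, t = divmod(value, 3)
        trits.append(t)
    return trits[::-1]

for n in (1, 5, 20):
    m = min_bits_for_trits(n)
    print(n, m, m / n)            # bits per trit gets closer to log2(3) as n grows

print(math.log(3) / math.log(2))  # about 1.585
```

For example, 5 trits fit in 8 bits because 3^5 = 243 <= 256, which works out to 1.6 bits per trit.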
You should read it as log₂(3), because logₓ(y) = logₙ(y)/logₙ(x); it represents the minimal number of bits (yes, it's not an integer) required to represent 3 states.
In general, in order to store a thing with N possible values you need log_2(N) bits of information. This is actually really simple to see if N is a power of two. If you have four things, how many yes/no questions do you need to exactly determine one of the four things? The answer is two (divide the four into two groups of two, ask which group the object is in. Then, when you have that group, split into two groups of one, and ask the next question)[1].
So, for powers of two, it's obviously log_2(N).
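Now we just extrapolate. For three things, the limit is log_2(3) ≈ 1.585 questions on average. For a single item you sometimes ask just one question and sometimes must ask two (depending on where you split the group), which averages 5/3 ≈ 1.67 if all three are equally likely; the average only approaches log_2(3) when you identify many items at once. Either way, the formula is the same.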
log(3)/log(2) is just a way of writing log_2(3) (since the value of log3/log2 is the same regardless of which base the log is in).
Note that two isn't special. It only shows up because we've limited ourselves to yes/no questions. If you wanted to know how many three-answer multiple-choice questions you would need to exactly determine one object out of a set of 4, the answer is log_3(4) ≈ 1.26.
[1] You can think of any byte, word, double word, quad word as simply a path in a binary tree of all numbers in the range 0..2^N-1 (where N is bit width). The path uniquely identifies the number.
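The question-counting argument can be sketched in a few lines of Python. Both helper names here (info_content and worst_case_questions) are invented for this example; the first is the log formula, the second counts how many whole questions you need in the worst case when every question has the same number of possible answers.

```python
import math

def info_content(num_items, num_answers=2):
    """log base num_answers of num_items, via the change-of-base formula."""
    return math.log(num_items) / math.log(num_answers)

def worst_case_questions(num_items, num_answers=2):
    """Smallest whole q with num_answers**q >= num_items."""
    questions = 0
    reach = 1  # how many items q questions can tell apart
    while reach < num_items:
        reach *= num_answers
        questions += 1
    return questions

print(info_content(4), worst_case_questions(4))        # two yes/no questions for 4 things
print(info_content(3), worst_case_questions(3))        # about 1.585 in theory, 2 in the worst case
print(info_content(4, 3), worst_case_questions(4, 3))  # about 1.26 in theory, 2 in the worst case
```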
This is the equation for converting radix. You can look up optimal radix choice for more, I think?
For fun, consider how much space a base 10 digit takes in binary. It takes 4 bits with slack; the exact figure is about 3.32 bits, which is log(10)/log(2).
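As a quick illustration of where the slack goes, this small Python sketch compares the whole number of bits needed for a run of decimal digits against the 3.32 figure. The helper bits_for_digits is a made-up name for this example.

```python
import math

def bits_for_digits(num_digits: int) -> int:
    """Smallest whole number of bits that can hold any num_digits-digit decimal value."""
    return (10 ** num_digits - 1).bit_length()

print(math.log(10) / math.log(2))  # about 3.32

for d in (1, 2, 3, 9):
    bits = bits_for_digits(d)
    print(d, bits, bits / d)       # bits per digit shrinks toward 3.32 as d grows
```

One digit alone needs 4 bits, but three digits fit in 10 bits because 1000 <= 1024, which is about 3.33 bits per digit.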
Another way to write that number is as the log base 2 of 3. The log base 2 of the number of possible states gives you the number of bits of information in a value chosen from that many possible states; here, a trit has 3 possible states.