Two common features in programming languages are subtyping and type operators. Subtyping defines a preorder (often a partial order) on types that allows substituting values of a subtype anywhere a supertype is expected. The standard notation for the subtyping relation is A <: B, meaning "A is a subtype of B" (also "B is a supertype of A"). Type operators are types that are parameterized by one or more types. They are often called 'generics' and the notation for these is usually something like X<A>, where X is the type operator and A is the instantiating type.
In type systems with both subtyping and type operators, the question arises as to how the two interact: for a type operator X parameterized by a single type and two types A and B, when is X<A> <: X<B>? Since the only difference between X<A> and X<B> is the instantiating type, the answer depends on the subtyping relationship between A and B. There are four possibilities:
For the invariant case, A <: B and B <: A implies A = B by antisymmetry if the subtyping relation is a partial order. When the type operator is parameterized by more than one type, the variance is given for each parameter separately e.g. the type operator might be covariant with its first type parameter and invariant with its second.
So we know the four possible ways that our question can be answered, but how do we figure out which case applies? One type being a subtype of another is essentially saying that the subtype can be converted into the supertype; that there is a function taking the subtype as a parameter and returning the supertype. So to figure out when one instantiation of a type operator is a subtype of another instantiation we can try writing such a function. Let's use the following interface:
interface X<T1,T2> { T2 f(T1 t1); }
And try to figure out when X<A,B> <: X<C,D> for four types A, B, C, and D. To do this we will assume that we have functions to convert between A/C and B/D and write a conversion from X<A,B> to X<C,D>:
X<C,D> XABtoXCD(X<A,B> x) { return (C c) -> BtoD(x.f(CtoA(c))); }
So we need the conversions from B to D and C to A. Converting these back into subtyping constraints, we get that X<A,B> <: X<C,D> when C <: A and B <: D. In other words, X is contravariant with its first parameter and covariant with its second.
X has a natural variance for both of its parameters, making it nice to use with a programming language that has declaration-site variance annotations where the variance of each type parameter is given with the definition of the type operator. Here is how that looks in Scala:
trait X[-T1,+T2] { def f(t1: T1): T2 }
Annotating variance this way allows using subtyping with the type operator without having to add variance annotations everywhere that the type operator is used. Usually, though, type operators have more complicated interfaces that are naturally invariant with their parameters. In these cases, use-site variance annotations provide more flexibility by allowing the variance of type parameters to be specified based on how they are used in a given context. One way of thinking about this is that a subset of the type operator's interface is carved out to contain only the parts that are compatible with the variance annotations given, and the type checker will enforce that only those parts of the interface are used in that context. Here is what use-site variance annotations look like in Java:
void g(X<? super A, ? extends B> x) { ... }
Use-site variance annotations also make bivariance more practical, whereas it has little use with declaration-site variance.