Why do we need Java generics?
Generics introduce a new level of compile-time type safety. This in itself makes them one of the most impactful updates to Java’s core and, arguably, brings a sane amount of type safety to the Collections framework. The most common bug they are meant to address is a nasty ClassCastException thrown when we believe an object is of one type when really it’s not. With generics, this is caught at compile time. Additionally, they improve code readability by relieving the programmer of explicit casts.
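To make the difference concrete, here is a minimal sketch. The class and method names (CastFailureDemo, rawListFailsAtRuntime, firstName) are made up for illustration. The raw list accepts anything and only blows up when we cast; the parameterized list rejects the bad element at compile time and needs no cast at all.

```java
import java.util.ArrayList;
import java.util.List;

class CastFailureDemo {
    // Pre-generics style: a raw List accepts any object, so the mistake
    // only shows up at run time, when the cast is executed.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List names = new ArrayList();
        names.add("Rita");
        names.add(42);                     // compiles fine on a raw type
        try {
            for (Object o : names) {
                String s = (String) o;     // explicit cast; fails on the Integer
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic style: the compiler checks every add and inserts the cast for us.
    static String firstName() {
        List<String> names = new ArrayList<>();
        names.add("Rita");
        // names.add(42);                  // would not compile: incompatible types
        return names.get(0);               // no explicit cast needed
    }
}
```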
Quick overview of terminology
Here, MimicList
A parameterized type depends on an existing generic type definition. If I tried to initialize MimicList
The code below demonstrates why we can’t use primitives as type parameters. After the compiler strips the type information (type erasure, discussed later), it has to insert casts to ensure that we are working with the parameterized type we initialized the object with, or throw an exception if we can’t, say, put the result of list.get(0) into an Integer.
becomes..
The compiler erases all type information from parameterized types and instead adds explicit casts to raw types. At the bytecode level, a generic class and a raw class look exactly the same. One consequence is that objects at runtime do not contain information about their generic arguments (although the information is still present on fields, methods, constructors, and extended classes and interfaces). A benefit is that, since all type information is erased, only one version of the generic class needs to be stored in the bytecode for all possible type arguments (in contrast to C++ templates, where each type gets its own version).
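To make the erasure step concrete, here is a rough sketch. ErasureDemo and its method names are hypothetical, and firstErased is a hand-written approximation of what the compiler generates, not actual compiler output. It also hints at why primitives don't work: after erasure the element is just an Object, and an int is not an Object.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // What we write: the element type is part of the signature.
    static Integer firstGeneric(List<Integer> list) {
        return list.get(0);
    }

    // Roughly what is left after erasure: a raw List and an inserted cast.
    @SuppressWarnings("rawtypes")
    static Integer firstErased(List list) {
        return (Integer) list.get(0);
    }

    // Both lists share one runtime class; the type arguments are gone.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }
}
```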
The type information is completely erased, making generics non-reifiable. Bridge methods are also added on a case-by-case basis.
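If you want to see a bridge method for yourself, reflection can show it. This is a small sketch with a hypothetical Rank class: we only write compareTo(Rank), but after erasure the Comparable interface expects compareTo(Object), so the compiler adds a synthetic method that forwards to ours.

```java
import java.lang.reflect.Method;

class Rank implements Comparable<Rank> {
    private final int level;

    Rank(int level) { this.level = level; }

    @Override
    public int compareTo(Rank other) {
        return Integer.compare(level, other.level);
    }
}

class BridgeDemo {
    // Counts the compiler-generated bridge methods declared directly on a class.
    static int bridgeCount(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    // A raw call goes through compareTo(Object), i.e. through the bridge.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareRaw(Comparable a, Object b) {
        return a.compareTo(b);
    }
}
```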
Extending generics
I briefly alluded to the fact that we can easily extend an existing generic class with a new generic class without much complication. In the example above, type T is an unbounded type parameter that acts as a type placeholder for MimicList. If we initialize MimicList as a raw type, then the underlying superclass will also be treated as a raw class.
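As an illustration, here is a hypothetical CountingList that stands in for a class like MimicList; the name and the choice of ArrayList as the superclass are my assumptions. It passes its own T straight up to the superclass, and the rawUsage method shows what happens when the type argument is dropped.

```java
import java.util.ArrayList;

// Hypothetical generic subclass: T is forwarded to the generic superclass.
class CountingList<T> extends ArrayList<T> {
    private int adds = 0;

    @Override
    public boolean add(T item) {
        adds++;
        return super.add(item);
    }

    int addCount() {
        return adds;
    }

    // Used as a raw type, the ArrayList superclass is raw as well,
    // so nothing stops us from mixing element types.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int rawUsage() {
        CountingList raw = new CountingList();
        raw.add("text");
        raw.add(42);
        return raw.size();
    }
}
```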
Raw Classes
A raw class is basically a generic class used without any type arguments. Any inner classes of a raw class will also be raw. The only exception is a static nested class. It is not considered raw, because it does not depend on the outer class’s type parameters. It’s not even part of that instance, since it’s just a static.
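A small sketch of the difference, using made-up classes Outer, Inner, and Nested: through a raw Outer the inner class loses T and hands back Object, while the static nested class never knew about T in the first place.

```java
class Outer<T> {
    private final T value;

    Outer(T value) { this.value = value; }

    // Non-static inner class: it shares the enclosing instance's T.
    class Inner {
        T get() { return value; }
    }

    // Static nested class: it cannot see T at all.
    static class Nested {
        String describe() { return "independent of T"; }
    }
}

class RawInnerDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object viaRaw() {
        Outer raw = new Outer("hello");        // raw type
        Outer.Inner inner = raw.new Inner();   // the inner class is raw too
        return inner.get();                    // static type is Object
    }

    static String viaParameterized() {
        Outer<String> typed = new Outer<>("hello");
        Outer<String>.Inner inner = typed.new Inner();
        return inner.get();                    // static type is String
    }
}
```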
In this example, we have no need for a named placeholder because we are not referencing it anywhere. In such cases we can just replace it with ? to mean the same thing. It has the same effect as declaring an unbounded type parameter: any reference type is accepted. We still have the same type safety guarantees as if we had used a Java identifier instead. A wildcard without bounds is called an unbounded wildcard.
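Here is one way that could look, with hypothetical method names. Both methods accept a list of any reference type; the first names a placeholder it never uses again, the second just writes ?.

```java
import java.util.List;

class UnboundedWildcard {
    // T is declared but never referenced again, so the name adds nothing.
    static <T> int countNullsNamed(List<T> items) {
        int nulls = 0;
        for (T item : items) {
            if (item == null) {
                nulls++;
            }
        }
        return nulls;
    }

    // Same contract with an unbounded wildcard.
    static int countNulls(List<?> items) {
        int nulls = 0;
        for (Object item : items) {   // all we know is that elements are Objects
            if (item == null) {
                nulls++;
            }
        }
        return nulls;
    }
}
```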
Variance
Covariant, Invariant, and Contravariant. These concepts are the building blocks of subtyping in modern languages. In Java, generics are invariant by default. Just because class Y is a subclass of class X does not mean that SomeGeneric
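A quick sketch of invariance in practice (InvarianceDemo and its methods are hypothetical names). Integer is a subclass of Number, yet a list of Integer cannot be assigned to a list of Number; a wildcard is what relaxes the rule.

```java
import java.util.ArrayList;
import java.util.List;

class InvarianceDemo {
    // Accepts a list of Number or of any subtype of Number.
    static double total(List<? extends Number> numbers) {
        double sum = 0;
        for (Number n : numbers) {
            sum += n.doubleValue();
        }
        return sum;
    }

    static double demo() {
        List<Integer> ints = new ArrayList<>();
        ints.add(1);
        ints.add(2);
        // List<Number> numbers = ints;   // does not compile: generics are invariant
        return total(ints);               // fine: the wildcard allows the subtype
    }
}
```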
Bounded Type Parameters
Type bounds are expressed with the extends or super keyword. If you want to restrict a type to a given class or its subtypes, you use the extends keyword (covariance). If you need to limit it to a given class or its supertypes, you use the super keyword (contravariance); note that super is only available on wildcards, not on type parameter declarations. A type parameter is not limited in how many bounds it can specify. You can only have one class bound (since multiple inheritance is not allowed in Java), but you can have an unlimited number of interface bounds! Later we’ll discuss why this makes our code much more flexible without decreasing type safety.
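For instance, a type parameter with one class bound and one interface bound might look like this. Bounds and max are hypothetical names; the class bound has to come first, and interface bounds are chained with &.

```java
import java.util.List;

class Bounds {
    // T must be a Number (class bound) and comparable to itself (interface bound).
    static <T extends Number & Comparable<T>> T max(List<T> values) {
        T best = values.get(0);
        for (T value : values) {
            if (value.compareTo(best) > 0) {
                best = value;
            }
        }
        return best;
    }
}
```

Integer and Double both satisfy these bounds. A String would be rejected for not being a Number, even though it is Comparable.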
History of Type erasure
Type erasure exists because Sun wanted to keep binary compatibility with older versions of Java (version 1.4 and below) when Java 5 introduced generics. It’s also basically the reason raw classes are still allowed. There is no excuse to use a raw class in new code when you have wildcards: in the worst case, the unbounded wildcard type should fit any scenario. Bridge methods are quite useful since they let us use generic types as raw types and, more importantly, allow us to use parameterized types in function calls after their type parameters are erased by .. type erasure. Is that redundant enough for you?? Unfortunately, because Java generics are non-reified, there are two exceptions where raw types must be used in new code:
- Class literals, e.g. List.class; a parameterized List cannot be followed by .class
- instanceof operand, e.g. o instanceof Set; a parameterized Set is not accepted there
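Both exceptions in one short sketch (RawRequired and its methods are made-up names):

```java
import java.util.List;
import java.util.Set;

class RawRequired {
    // A class literal only exists for the raw name; there is one List class at run time.
    static Class<?> listClass() {
        return List.class;
    }

    // The runtime check can only see the erased type, so we test against the raw Set.
    static boolean isSet(Object o) {
        return o instanceof Set;
    }
}
```

The unbounded wildcard form, o instanceof Set<?>, is also accepted in the instanceof case, since it asks for nothing that erasure has thrown away.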
Sometimes you need to use a raw type or an explicit unchecked typecast. Whether it’s for immovable things like legacy code or practical purposes like unit/mock testing, there are acceptable scenarios where we might want to forgo strong compile-time safety. To do so, we have to annotate the piece of code with @SuppressWarnings("unchecked").
It’s recommended you place the annotation as close to the offending line as possible. We could have placed it before the function definition, but then all unchecked warnings in that function would be ignored, not just the initial conversion of List
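Here is a sketch of that narrow placement. LegacyBridge and legacyNames are hypothetical, with legacyNames standing in for some pre-generics API we cannot change. The annotation sits on the single local variable declaration, so any other unchecked operation in firstName would still produce a warning.

```java
import java.util.ArrayList;
import java.util.List;

class LegacyBridge {
    // Stand-in for an old API that still returns a raw List.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List legacyNames() {
        List names = new ArrayList();
        names.add("Rita");
        return names;
    }

    static String firstName() {
        // Only this one unchecked conversion is silenced.
        @SuppressWarnings("unchecked")
        List<String> names = legacyNames();
        return names.get(0);
    }
}
```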
Wildcards
Wildcard types can be pretty confusing, so I’ll just give a simple overview. They are very useful when we want to introduce some type flexibility into our functions and collections while keeping all the compile-time safety that generics provide.
One practice that takes advantage of flexible wildcards is the PECS principle.
Producer Extends, Consumer Super principle
The idea behind PECS is super simple, but it’s not intuitive just from reading that title. In fact, if you dig deep enough, it’s a very complicated topic dealing with variance (covariance and contravariance). But it’s actually very simple if you just think of it in terms of type safety.
Let’s start with an abstract type Soldier and some concrete classes
The “Producer” collection
A collection whose element type is the specified type or a class that extends it. When we have a wildcard type of <? extends Soldier>, it means:
- You are guaranteed that when you read from this collection, the object will be at least a Soldier
- You cannot put anything inside this collection, except for null. That’s because null can technically be of any type.
- This is called a Producer collection because it produces data. This is from the point of view of the collection
The reason you can’t put anything inside a Producer collection is that it would break the type safety guarantees. Let’s give an example:
Why are none of these legal? Because the allSoldiers reference can point to a collection of Soldier, Rita, Cage, or Kimmel, and at compile time we don’t know which one it’s going to be. For type safety, the compiler cannot allow us to add an object which might or might not cause a cast exception. All the compiler knows is that whichever type we initialize allSoldiers with, its elements must at least be Soldiers.
This argument is similar. Can you spot the pattern here? Yes, it’s all about ensuring type safety. At compile time, we can only be sure that the elements of allSoldiers will be, at worst, of type Soldier. At run time, it could point to a list of Kimmels for all we know.
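A minimal sketch of a Producer in action. The class bodies below are my own assumption (only the names Soldier, Rita, Cage, and Kimmel come from the text), and ProducerDemo and rollCall are hypothetical. Reading works for any squad, while the commented-out adds are the kind of thing the compiler refuses.

```java
import java.util.List;

// Assumed hierarchy: an abstract Soldier with three concrete subclasses.
abstract class Soldier {
    abstract String name();
}

class Rita extends Soldier {
    String name() { return "Rita"; }
}

class Cage extends Soldier {
    String name() { return "Cage"; }
}

class Kimmel extends Soldier {
    String name() { return "Kimmel"; }
}

class ProducerDemo {
    // squad may be a List<Soldier>, List<Rita>, List<Cage>, or List<Kimmel>.
    static String rollCall(List<? extends Soldier> squad) {
        StringBuilder names = new StringBuilder();
        for (Soldier s : squad) {            // reading: always at least a Soldier
            names.append(s.name()).append(' ');
        }
        // squad.add(new Rita());            // does not compile: squad might be a List<Kimmel>
        // squad.add(new Kimmel());          // does not compile: squad might be a List<Rita>
        return names.toString().trim();
    }
}
```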
The “Consumer” collection
A collection whose element type is the provided class or one of its supertypes. When we have a wildcard type of <? super Soldier>, it means:
- You are guaranteed that you can write a Soldier (or any subtype of Soldier) to this collection, because its element type must be Soldier or one of its supertypes
- As a consequence, we can only initialize the generic with a reference whose type argument is Soldier or a supertype
- This is called a Consumer collection because it consumes data. This is from the point of view of the collection
- The only type you are guaranteed to get back when you read from this collection is Object
Why does ? super Rita give us flexible type safety? In the example code above, it’s clear that we can only initialize maybeRitas with Rita or a supertype of class Rita, so it can only be a list of Rita, Soldier, or Object. Let’s say we then want to add something to this collection. We are confident that if we add an instance of Rita to maybeRitas, it is guaranteed to be a subtype of whatever type we initialized the list with (Rita, Soldier, Object). But you can’t, for example, add a String, Integer, or Cage to maybeRitas: the compiler rejects these no matter which of those three initializations was chosen.
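And the Consumer side as a sketch. The hierarchy is again my assumed version, repeated so the snippet compiles on its own; ConsumerDemo and enlist are hypothetical names. Writing a Rita is always safe, but reading only gives us Object back.

```java
import java.util.List;

// Assumed hierarchy, repeated so this snippet stands alone.
abstract class Soldier {
    abstract String name();
}

class Rita extends Soldier {
    String name() { return "Rita"; }
}

class Cage extends Soldier {
    String name() { return "Cage"; }
}

class ConsumerDemo {
    // maybeRitas may be a List<Rita>, List<Soldier>, or List<Object>.
    static Object enlist(List<? super Rita> maybeRitas) {
        maybeRitas.add(new Rita());          // safe for all three possibilities
        // maybeRitas.add(new Cage());       // does not compile: it might be a List<Rita>
        // maybeRitas.add("not a soldier");  // does not compile either
        // Rita r = maybeRitas.get(0);       // does not compile: reads are only Object
        return maybeRitas.get(maybeRitas.size() - 1);
    }
}
```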
This discussion could go further into the differences between extends and super, not only at initialization but also in their implications for the methods of the initialized objects. I might delve into that later, but this seems like a nice overview of Java’s attempt to bring more compile-time type safety to the language.