Why project Jigsaw / JPMS?

Java's package management system always seemed simple and effective to me. It is heavily used by the JDK itself. We have been using it to mimic the concept of namespaces and modules.

What is Project Jigsaw (aka Java Platform Module System) trying to fill in?

From the official site:

The goal of this Project is to design and implement a standard module system for the Java SE Platform, and to apply that system to the Platform itself and to the JDK.


Jigsaw and OSGi are trying to solve the same problem: how to allow coarser-grained modules to interact while shielding their internals.

In Jigsaw's case, the coarser-grained modules include Java classes, packages, and their dependencies.

Here's an example: Spring and Hibernate. Both depend on the third-party JAR CGLIB, but they use different, incompatible versions of it. What can you do if you rely on the standard JDK? Including the version that Spring wants breaks Hibernate, and vice versa.

But, if you have a higher-level model like Jigsaw you can easily manage different versions of a JAR in different modules. Think of them as higher-level packages.

If you build Spring from the GitHub source you'll see it, too. They've redone the framework so it consists of several modules: core, persistence, etc. You can pick and choose the minimal set of module dependencies that your application needs and ignore the rest. It used to be a single Spring JAR, with all the .class files in it.

Update: Five years later - Jigsaw might still have some issues to resolve.

AFAIK the plan is to make the JRE more modular, i.e. to have smaller JARs that are optional, so that you can download or upgrade only the functionality you need.

It's meant to make the JRE less bloated and give you the option of dropping legacy modules that most people probably don't use.

Based on Mark Reinhold's keynote speech at Devoxx Belgium, Project Jigsaw is going to address two main pain points:

  1. Classpath
  2. Massive Monolithic JDK

What's wrong with Classpath?

We all know about JAR Hell. The term describes the various ways in which the classloading process can end up not working. The best-known limitations of the classpath are:

  • It's hard to tell if there are conflicts. Build tools like Maven can do a pretty good job based on artifact names, but if two artifacts have different names and the same contents, there can still be a conflict.
  • The fundamental problem with JAR files is that they are not components. They're just a bunch of file containers that are searched linearly. The classpath is a way to look up classes regardless of which components they're in, which packages they're in, or their intended use.
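To make the "searched linearly" point concrete, here is a minimal sketch that asks a class loader for every location providing a given class. The class name `ClasspathDuplicates` and its method are made up for illustration; the only real API used is `ClassLoader.getResources`. If more than one location comes back, only the first one is actually loaded and the rest are silently shadowed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of any library.
final class ClasspathDuplicates {

    /** Returns every location from which the given class could be loaded. */
    static List<URL> locationsOf(String className, ClassLoader loader) {
        // Class files are looked up as resources: com.acme.Foo -> com/acme/Foo.class
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "java.lang.Object";
        List<URL> hits = locationsOf(name, ClassLoader.getSystemClassLoader());
        System.out.println(name + " found in " + hits.size() + " location(s):");
        for (URL url : hits) {
            System.out.println("  " + url);
        }
        if (hits.size() > 1) {
            System.out.println("Conflict: only the first entry is actually loaded.");
        }
    }
}
```

Run it with a class name from your own application to see whether two JARs on the classpath ship the same class.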

Massive Monolithic JDK

The big monolithic nature of JDK causes several problems:

  • It doesn't fit on small devices. Small IoT-type devices may have processors capable of running an SE-class VM, but they do not necessarily have the memory to hold all of the JDK, especially when the application uses only a small part of it.
  • It's even a problem in the cloud. The cloud is all about optimizing the use of hardware; if you have thousands of images that each contain the whole JDK while the applications use only a small part of it, that is a waste.

Modules: The Common Solution

To address the above problems, we treat modules as a fundamental new kind of Java program component. A module is a named, self-describing collection of code and data. Its code is organized as a set of packages containing types, i.e., Java classes and interfaces; its data includes resources and other kinds of static information.

To control how its code refers to types in other modules, a module declares which other modules it requires in order to be compiled and run. To control how code in other modules refers to types in its packages, a module declares which of those packages it exports.

The module system locates required modules and, unlike the class-path mechanism, ensures that code in a module can only refer to types in the modules upon which it depends. The access-control mechanisms of the Java language and the Java virtual machine prevent code from accessing types in packages that are not exported by their defining modules.

Apart from being more reliable, modularity could improve performance. When code in a module refers to a type in a package then that package is guaranteed to be defined either in that module or in precisely one of the modules read by that module. When looking for the definition of a specific type there is, therefore, no need to search for it in multiple modules or, worse, along the entire class path.
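Since the JDK itself is modularized, the "self-describing" part can be observed at runtime on Java 9 or later. The following is only a sketch: `ModuleInspector` and its method names are invented for this example, while `Module.getDescriptor`, `ModuleDescriptor.requires`, and `Module.isExported` are standard APIs.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class ModuleInspector {

    /** Names of the modules that the given module declares it requires. */
    static Set<String> requiredModules(Module module) {
        Set<String> names = new TreeSet<>();
        ModuleDescriptor descriptor = module.getDescriptor();
        if (descriptor != null) { // null for the unnamed module (plain classpath code)
            for (ModuleDescriptor.Requires r : descriptor.requires()) {
                names.add(r.name());
            }
        }
        return names;
    }

    /** True if the module exports the package to every other module. */
    static boolean exportsToEveryone(Module module, String packageName) {
        return module.isExported(packageName);
    }

    public static void main(String[] args) {
        // java.sql is skipped silently if the runtime image does not contain it.
        for (String name : new String[] {"java.base", "java.sql"}) {
            ModuleLayer.boot().findModule(name).ifPresent(m ->
                System.out.println(name + " requires " + requiredModules(m)));
        }
        Module base = Object.class.getModule();
        System.out.println("java.util exported: "
            + exportsToEveryone(base, "java.util"));
        System.out.println("jdk.internal.misc exported: "
            + exportsToEveryone(base, "jdk.internal.misc"));
    }
}
```

The second check is the interesting one: `jdk.internal.misc` exists inside `java.base`, but it is not exported to everyone, so ordinary code cannot compile against it.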

JEPs to Follow

Jigsaw is an enormous project that has been ongoing for quite a few years. It has an impressive number of JEPs, which are great places to learn more about the project. Some of these JEPs are as follows:

  • JEP 200: The Modular JDK: Use the Java Platform Module System (JPMS) to modularize the JDK
  • JEP 201: Modular Source Code: Reorganize the JDK source code into modules, enhance the build system to compile modules, and enforce module boundaries at build time
  • JEP 261: Module System: Implement the Java Platform Module System, as specified by JSR 376, together with related JDK-specific changes and enhancements
  • JEP 220: Modular Run-Time Images: Restructure the JDK and JRE run-time images to accommodate modules and to improve performance, security, and maintainability
  • JEP 260: Encapsulate Most Internal APIs: Make most of the JDK's internal APIs inaccessible by default but leave a few critical, widely-used internal APIs accessible, until supported replacements exist for all or most of their functionality
  • JEP 282: jlink: The Java Linker: Create a tool that can assemble and optimize a set of modules and their dependencies into a custom run-time image as defined in JEP 220

Closing Remarks

In the initial edition of The State of the Module System report, Mark Reinhold describes the specific goals of the module system as follows:

  • Reliable configuration, to replace the brittle, error-prone class-path mechanism with a means for program components to declare explicit dependences upon one another, along with
  • Strong encapsulation, to allow a component to declare which of its public types are accessible to other components, and which are not.

These features will benefit application developers, library developers, and implementors of the Java SE Platform itself directly and, also, indirectly, since they will enable a scalable platform, greater platform integrity, and improved performance.

This article explains in detail the problems which both OSGi and JPMS/Jigsaw try to solve:

"Java 9, OSGi and the Future of Modularity" [22 SEP 2016]

It also goes thoroughly into the approaches of both OSGi and JPMS/Jigsaw. As of now, it appears that the authors list almost no practical pros for JPMS/Jigsaw compared with the mature (16-year-old) OSGi.

For the sake of argument, let's assert that Java 8 (and earlier) already has a "form" of modules (jars) and module system (the classpath). But there are well-known problems with these.

By examining the problems, we can illustrate the motivation for Jigsaw. (The following assumes we are not using OSGi, JBoss Modules, etc, which certainly offer solutions.)

Problem 1: public is too public

Consider the following classes (assume both are public):

com.acme.foo.db.api.UserDao
com.acme.foo.db.impl.UserDaoImpl

At Foo.com, we might decide that our team should use UserDao and not use UserDaoImpl directly. However, there is no way to enforce that on the classpath.

In Jigsaw, a module contains a module-info.java file which allows us to explicitly state what is public to other modules. That is, public has nuance. For example:

// com.acme.foo.db.api.UserDao is accessible, but
// com.acme.foo.db.impl.UserDaoImpl is not
module com.acme.foo.db {
    exports com.acme.foo.db.api;
}

Problem 2: reflection is unbridled

Given the classes in #1, someone could still do this in Java 8:

Class<?> c = Class.forName("com.acme.foo.db.impl.UserDaoImpl");
Object obj = c.getConstructor().newInstance();

That is to say: reflection is powerful and essential, but if unchecked, it can be used to reach into the internals of a module in undesirable ways. Mark Reinhold has a rather alarming example. (The SO post is here.)

In Jigsaw, strong encapsulation offers the ability to deny access to a class, including reflection. (This may depend on command-line settings, pending the revised tech spec for JDK 9.) Note that because Jigsaw is used for the JDK itself, Oracle claims that this will allow the Java team to innovate the platform internals more quickly.
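As a rough illustration of that reflection gate (Java 9 or later), the sketch below compares reflecting into our own class with reflecting into `java.lang.String`. `ReflectionGate`, `Secret`, and `token` are made-up names; `Field.trySetAccessible` and `Module.isOpen` are standard APIs. What it prints for `String` depends on the JDK version and on command-line flags such as `--add-opens`, so no output is shown here.

```java
import java.lang.reflect.Field;

// Hypothetical demo class, not part of any library.
final class ReflectionGate {

    /** Stand-in for an implementation class with private state. */
    static final class Secret {
        private String token = "s3cret";
    }

    /** True if the target's package is open for deep reflection to the caller's module. */
    static boolean isOpenTo(Class<?> target, Class<?> caller) {
        return target.getModule().isOpen(target.getPackageName(), caller.getModule());
    }

    /** Tries to make a private field accessible; returns false instead of throwing. */
    static boolean canBreakInto(Class<?> target, String fieldName) {
        try {
            Field field = target.getDeclaredField(fieldName);
            // Returns false when the module system forbids the access.
            return field.trySetAccessible();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Same module as the caller: always allowed.
        System.out.println("Secret.token accessible: "
            + canBreakInto(Secret.class, "token"));
        // Different module: allowed only if java.lang is open to us.
        System.out.println("java.lang open to us: "
            + isOpenTo(String.class, ReflectionGate.class));
        System.out.println("String.value accessible: "
            + canBreakInto(String.class, "value"));
    }
}
```

For a private member of a class in another named module, the last two lines should always agree: deep reflection succeeds exactly when the package has been opened to the caller's module.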

Problem 3: the classpath erases architectural relationships

A team typically has a mental model about the relationships between jars. For example, foo-app.jar may use foo-services.jar which uses foo-db.jar. We might assert that classes in foo-app.jar should not bypass "the service layer" and use foo-db.jar directly. However, there is no way to enforce that via the classpath. Mark Reinhold mentions this here.

By comparison, Jigsaw offers an explicit, reliable accessibility model for modules.
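One way the layering could be written down, as a sketch only: the module names `com.acme.foo.services` and `com.acme.foo.app` and the package `com.acme.foo.services.api` are assumptions made for illustration, and the qualified export (`exports ... to ...`) is a stricter variation on the plain `exports` shown under Problem 1. Each declaration would live in its own module-info.java.

```java
// foo-db: only the service layer is allowed to see the DAO API
module com.acme.foo.db {
    exports com.acme.foo.db.api to com.acme.foo.services;
}

// foo-services: reads foo-db and publishes its own API
module com.acme.foo.services {
    requires com.acme.foo.db;
    exports com.acme.foo.services.api;
}

// foo-app: reads only the service layer
module com.acme.foo.app {
    requires com.acme.foo.services;
    // Even with "requires com.acme.foo.db" added here, the qualified
    // export above would keep com.acme.foo.db.api inaccessible.
}
```

With this in place, a class in foo-app that imports `com.acme.foo.db.api.UserDao` fails to compile, which is exactly the rule the team previously kept only in its head.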

Problem 4: monolithic run-time

The Java runtime lives in the monolithic rt.jar. On my machine, it is 60+ MB with 20k classes! In an age of microservices, IoT devices, etc., it is undesirable to have CORBA, Swing, XML, and other libraries on disk if they aren't being used.

Jigsaw breaks up the JDK itself into many modules; e.g. java.sql contains the familiar SQL classes. There are several benefits to this, but a new one is the jlink tool. Assuming an app is completely modularized, jlink generates a distributable run-time image that is trimmed to contain only the modules specified (and their dependencies). Looking ahead, Oracle envisions a future where the JDK modules are compiled ahead-of-time into native code. Though jlink is optional, and AOT compilation is experimental, they are major indications of where Oracle is headed.
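To get a feel for "only the modules specified (and their dependencies)", here is a simplified sketch that walks the `requires` graph of the JDK's own modules. It is not how jlink is implemented (jlink also deals with services, among other things); `ModuleClosure` is a made-up name, and the code relies only on the standard `ModuleFinder.ofSystem()` and `ModuleDescriptor` APIs.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class ModuleClosure {

    /** A root system module plus everything it requires, directly or indirectly. */
    static Set<String> closureOf(String rootModule) {
        ModuleFinder systemModules = ModuleFinder.ofSystem();
        Set<String> seen = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(rootModule);
        while (!pending.isEmpty()) {
            String name = pending.pop();
            if (!seen.add(name)) {
                continue; // already visited
            }
            Optional<ModuleReference> ref = systemModules.find(name);
            if (!ref.isPresent()) {
                throw new IllegalArgumentException("Unknown module: " + name);
            }
            for (ModuleDescriptor.Requires r : ref.get().descriptor().requires()) {
                // "requires static" is a compile-time-only dependency: skip it
                if (!r.modifiers().contains(ModuleDescriptor.Requires.Modifier.STATIC)) {
                    pending.push(r.name());
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String root = args.length > 0 ? args[0] : "java.sql";
        Set<String> closure = closureOf(root);
        long total = ModuleFinder.ofSystem().findAll().size();
        System.out.println(root + " needs " + closure.size()
            + " of " + total + " system modules: " + closure);
    }
}
```

The set it prints is roughly what would have to go into a trimmed run-time image for an app whose only JDK dependency is the given module.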

Problem 5: versioning

It is well-known that the classpath does not allow us to use multiple versions of the same jar: e.g. bar-lib-1.1.jar and bar-lib-2.2.jar.

Jigsaw does not address this problem; Mark Reinhold states the rationale here. The gist is that Maven, Gradle, and other tools represent a large ecosystem for dependency management, and another solution will be more harmful than beneficial.

It should be noted that other solutions (e.g. OSGi) do indeed address this problem (and others, aside from #4).

Bottom Line

Those are some key points for Jigsaw, motivated by specific problems.

Note that explaining the controversy between Jigsaw, OSGi, JBoss Modules, etc is a separate discussion that belongs on another Stack Exchange site. There are many more differences between the solutions than described here. What's more, there was sufficient consensus to approve the Public Review Reconsideration Ballot for JSR 376.