Java EE 8 High Performance
上QQ阅读APP看书,第一时间看更新

From JPA to the database

JPA let's you represent your database in plain Java. Said otherwise it let's you represent the relational model as an object model. It is very common for the development and maintenance but at some point, and in particular when you will validate your performances, you will need to check what the mapper (JPA implementation) is doing and how it does translate your object code/model to the relational one (SQL).

When you check the JPA caller code, you often have something like the following:

final Quote quote = entityManager.find(Quote.class, id);
....
entityManager.persist(quote);

For more complex queries, it is like the following:

final Number count = entityManager.createQuery("select count(q) from Quote q", Number.class);

I will not deal with named queries versus this kind of query in this part, but what is important here is that the model is object/Java-based. Even the JPQL query is related to an object and not plain SQL.

This leads to the main role of the JPA provider: translating all the code from the object/Java model to the relational/SQL model.

To understand this, we will configure the JPA provider of our server to log what it does. Since we are using GlassFish, we need to configure EclipseLink, which is the JPA provider. To do so, we just add the following properties in the persistence unit:

<property name="eclipselink.logging.level" value="FINEST"/>
<property name="eclipselink.logging.logger" value="JavaLogger"/>

This configuration will activate Eclipselink to log at FINEST level of the logger a lot of information. To see these information, we need to ensure the FINEST log level is written somewhere and not skipped as it is done by default. To do that, you need to configure the EclipseLink logger level to FINEST as well. This way Eclipselink would log with a level the logger would output. You can do it in GlassFish add this line to your logging.properties:

org.eclipse.persistence.level = FINEST

Note that if we use the maven plugin that we set up in Chapter 1, Money – The Quote Manager Application to run GlassFish, it will fallback on JVM logging.properties and you will need to either modify it from $JAVA_HOME/jre/lib/logging.properties or set another one when launching the server. Here is the potential content to activate logging in the console:

# output configuration - console here
java.util.logging.ConsoleHandler.level = FINEST
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter

# global configuration (default)
.level = INFO
.handlers = java.util.logging.ConsoleHandler

# eclipselink specific logging level
org.eclipse.persistence.level = FINEST

Finally, to use this file when launching the server, simply set the system property, java.util.logging.config.file (assuming you put the file in src/main/glassfish/conf/logging.properties), as follows:

MAVEN_OPTS="-Djava.util.logging.config.file=src/main/glassfish/conf/logging.properties" mvn package embedded-glassfish:run

The logger name uses this pattern:

org.eclipse.persistence.session./file:<path to the webapp>/WEB-INF/classes/_<entity simple name in lowercase>.[sql|query]

Now, if you start the server, you have a few more lines:

...
Sep 09, 2017 5:21:51 PM org.eclipse.persistence.session./file:/home/rmannibucau/dev/quote-manager/target/quote-manager-1.0-SNAPSHOT/WEB-INF/classes/_quote.sql
FINE: SELECT ID, NAME, VALUE FROM QUOTE WHERE (NAME = ?)
bind => [1 parameter bound]
...
Sep 09, 2017 5:41:53 PM org.eclipse.persistence.session./file:/home/rmannibucau/dev/quote-manager/target/quote-manager-1.0-SNAPSHOT/WEB-INF/classes/_quote.sql
FINE: INSERT INTO QUOTE (ID, NAME, VALUE) VALUES (?, ?, ?)
bind => [3 parameters bound]
....
Sep 09, 2017 5:44:26 PM org.eclipse.persistence.session./file:/home/rmannibucau/dev/quote-manager/target/quote-manager-1.0-SNAPSHOT/WEB-INF/classes/_quote.sql
FINE: SELECT t1.ID, t1.NAME FROM QUOTE_CUSTOMER t0, CUSTOMER t1 WHERE ((t0.quotes_ID = ?) AND (t1.ID = t0.customers_ID))
bind => [1 parameter bound]

These lines are generated by our JPA provider (EclipseLink here) every time a query is issued to the database. The queries use bound parameters. This is interesting at two levels. The first one is about the security and intends to prevent SQL injections - note that for security reasons as well, the values are not logged by default eclipselink.logging.parameters can be set to true in your persistence unit properties if you want to see them instead of the number of bound parameters only. The second interesting consequence is directly linked to the performance and the fact that the provider can use prepared statements instead of creating a statement every time it creates a query. Combined with a datasource pool which can most of the time cache these prepared statements, it makes pretty cheap to execute statement compared to an implementation which would create them each time it is needed.

Depending on your JPA provider, you need to change the properties to activate the query logging. Hibernate and OpenJPA use other properties and logger names, for instance. Alternatively, some containers or JDBC drivers will let you configure it at another level. For instance, in Apache TomEE you can set LogSQL=true in your DataSource resource directly.

What is interesting to see is the effect of what we write in Java on the SQL side.

The INSERT case is straightforward and directly converts the JPA model to the corresponding SQL statement to insert all the values into the corresponding database:

INSERT INTO QUOTE (ID, NAME, VALUE) VALUES (?, ?, ?)

SELECT is a direct binding too, which selects all the columns with a clause on the idenfitier of the entity:

SELECT ID, NAME, VALUE FROM QUOTE WHERE (ID = ?)

Here, the role of the JPA provider is quite obvious; it makes the link to SQL, which means the following:

  • Convert the JPA API and JPQL to the current SQL. Note that in all the JPA providers, there is a notion of database SQL language so that they can handle the database specifics (such as the column types or the pagination). EclipseLink calls it platform, Hibernate, dialect and OpenJPA, dictionary.
  • Handle Java to database mapping: database column names are converted to field names, table names to class names, and so on.

However, if you look closer to the logs when you query a quote through the JAX-RS endpoint, you may be surprised:

SELECT t1.ID, t1.NAME FROM QUOTE_CUSTOMER t0, CUSTOMER t1 WHERE ((t0.quotes_ID = ?) AND (t1.ID = t0.customers_ID))

Where does it come from? If you investigate a bit, you will quickly identify this line in the JAX-RS layer:

json.setCustomerCount(ofNullable(quote.getCustomers()).map(Collection::size).orElse(0));

What does it do? It just sets the number of customers linked to Quote. Which part triggers this additional query? A simple call on the relationship collection triggers it. In our case, it is size():

quote.getCustomers().size();

Since the relationship between Quote and Customer is lazy, this simple line will trigger an additional query with EclipseLink. What is interesting is that if you check the JAX-RS resource, it is not @Transactional and this query may fail depending on the JPA provider, as lazy handling must be done in a transaction.

The provider is clever enough to not trigger any query and just call  getCustomers() . But it will do when calling any method of the returned collection such as  size() here. Depending on the provider, null may or may not be possible, which is why the original code assumes it can be null.

We will discuss about modelling in another chapter but the obvious solution to make the relationship eager is not a real solution, since you will slowly load all your object graphs everywhere, doing which can lead to performance issues and even memory issues. So try to resist this temptation.

While you are playing with the JPA and SQL, I recommend that you disable EclipseLink's default shared cache, which easily hides queries (later on, we will discuss why to disable it even in production). This can be done with the following property added to your persistence unit:

<property name="eclipselink.cache.shared.default" value="false"/>