Java EE

Variants of SEDA

17th April 2016 Architecture, EDA, JMS, Uncategorised

SEDA (Staged Event Driven Architecture) was described in the Ph.D. thesis of Matt Welsh. I recently read his blog post that describes the various ways SEDA has been perceived and implemented. One thing that comes to my mind after reading it is that SEDA, like many other patterns and architecture models, has been used in more and less correct ways.

That is because SEDA is a design model and architecture pattern, one that can be used together with other models like SOA. In fact some ESBs use SEDA as a processing model, for example Mule ESB or ServiceMix (in ServiceMix it is one of a few available processing models). But SEDA can also be used elsewhere, for instance with Oracle Service Bus, and in more than one way:

  1. SEDA using messaging – here we would add a queue in front of the service that consumes event messages, possibly using a SOAP/JMS binding. We can then add additional stages for events processed by this first service.
  2. Throttling capabilities of OSB – here we do not have a real thread pool, but using WebLogic’s Work Managers we achieve a similar goal; threads will be taken from the WebLogic thread pool. This processing model works with every routing action.

I also used a variant of the first model, where services were exposed as SOAP/HTTP services, published a message on a queue and then sent a reply that the message had been queued. This way we can still use SOAP/HTTP messages and not force other systems to use messaging – be it JMS, AMQP, STOMP or some other protocol.

When done right, SEDA improves a system’s scalability and extensibility. Scalability is higher because if a peak of messages happens, they will be queued up to the limit of the storage quota. Messages that cannot be processed on arrival at a queue will wait for their turn. With the addition of reliable messaging, like JMS persistent delivery with an exactly-once guarantee, we get a scalable and reliable message processing mechanism. Whether to add further queues depends on the requirements of the specific domain or application. It may be viable to add queues not only for scalability but also from an extensibility point of view. We would use topics here, or publish the event to multiple queues, but the main point is that adding another event consumer in this model is possible and quite easy.

SEDA may also be a good fit for asynchronous processing of requests with long processing times, combined with NIO2 and the Servlet 3.x asynchronous processing model. In this case we would accept a request at some endpoint, like a Spring controller method, and then invoke an asynchronous method to do the backend processing. The HTTP thread could then process another request for another, potentially synchronous, endpoint. The backend processing service would process incoming requests and invoke completion handlers as it finishes (using Spring infrastructure). Here we would have at least two queues – the HTTP thread queue and the backend processing service queue. Both could be managed and tuned for higher scalability.
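A minimal sketch of that idea, assuming Spring MVC with an @Async-enabled backend service (all class, method and executor names here are illustrative, not taken from a real project):

```java
@RestController
public class ImportController {

    @Autowired
    private BackendProcessingService backendService;

    // The HTTP thread returns immediately; the DeferredResult is completed later
    // by the backend executor - the second "queue" in this setup.
    @RequestMapping(value = "/imports", method = RequestMethod.POST)
    public DeferredResult<ResponseEntity<String>> importData(@RequestBody String payload) {
        DeferredResult<ResponseEntity<String>> result = new DeferredResult<>(30_000L);
        backendService.process(payload).whenComplete((id, ex) -> {
            if (ex != null) {
                result.setErrorResult(ex);
            } else {
                result.setResult(ResponseEntity.ok(id));
            }
        });
        return result;
    }
}

@Service
public class BackendProcessingService {

    // Runs on a separately sized thread pool, independent of the HTTP thread pool.
    @Async("backendExecutor")
    public CompletableFuture<String> process(String payload) {
        // ... long running backend work ...
        return CompletableFuture.completedFuture("accepted");
    }
}
```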

When considering SEDA it is important to remember that request processing will be slower than processing with one thread and no queues on the way. This is quite obvious, as dispatching, consuming and confirming event messages has its performance impact. SEDA can also affect a system’s maintainability, as event messages may get stuck in some queues or get redirected to dead letter queues, and this must be monitored. Fortunately it is not a big problem, especially on OSB (we can use alerts).

SEDA can be used incorrectly, just like EDA, CEP or SOA. Once again, architecture design is probably the most important thing in a system’s development.

Have a nice day !

EDA in Java EE 7

3rd April 2016 EDA, Java EE, JMS, Software development

Building EDA in Java EE 7 or Spring is easy. Let’s create a simple EDA routing example using JMS 2.0, EJB 3.2 and CDI 1.1. We will focus on the messaging part, so we will not use anything fancy to route messages. We will publish messages to a single queue – the event channel. Messages will be received by an EDA component which will route them according to a header. Each message will be published to the appropriate event queue based on the header value and will be received by the appropriate component monitoring that queue. The example uses queues, but it can easily be modified to use topics.

A few important things:

  • What is important is to keep the architecture clean: an event generator must not know anything about event consumers. The event instance should be created using an EventBuilder or some other helper class, so that the event source passes the required information and does not know anything about the details of EDA event construction (see the sketch after this list).
  • An event consumer that does business processing must get the business information from the event and trigger processing. Do not implement business code in the event consumer, and do not propagate EDA-related details down to the domain services layer.
  • It is important to do routing based on a header if possible, as it is more lightweight.
  • Use the appropriate session mode (transacted or not, with the required acknowledgement mode).
  • Remember that messages can and will arrive out of order.
  • Use one or a few Dead Letter Queues and send messages that failed to be processed a few times to the related DLQ. The example code does not do this.
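A tiny sketch of the builder idea from the first bullet (Event, EventBuilder and the publisher are hypothetical helpers, not a real API):

```java
// The event source supplies only business data; the builder hides how the EDA
// event (headers, payload format) is actually constructed.
Event event = EventBuilder.newEvent("RESOURCE_STORED")
        .withAttribute("resourceId", resource.getId())
        .withAttribute("title", resource.getTitle())
        .build();
eventPublisher.publish(event);
```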

EDA - simple routing

 

Publishing a message using an EJB is super easy:
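A minimal sketch of such a publisher (the queue JNDI name, header name and payload type are assumptions, not taken from the original project):

```java
import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.JMSException;
import javax.jms.Queue;
import javax.jms.TextMessage;

@Stateless
public class EventPublisherBean {

    // Container-managed JMSContext: connection, session and producer are handled for us.
    @Inject
    private JMSContext jmsContext;

    // The event channel queue; the JNDI name is illustrative.
    @Resource(lookup = "java:/jms/queue/EventChannel")
    private Queue eventChannel;

    public void publishEvent(String eventType, String payload) throws JMSException {
        // Create the event message and set the routing header.
        TextMessage message = jmsContext.createTextMessage(payload);
        message.setStringProperty("eventType", eventType);
        // Send to the event channel; the send joins the caller's JTA transaction.
        jmsContext.createProducer().send(eventChannel, message);
    }
}
```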

Important parts are discussed below.

Creating a JMSContext that manages the JMS Session and ConnectionFactory for us. It must NOT be created in session transacted mode in order for it to join the JTA transaction. There are cases where we want to manage transactions ourselves, or send the message whether or not other parts of request processing failed. In those cases we could use a separate transaction (e.g. by setting TransactionAttribute to REQUIRES_NEW in the CMT model, or using the JTA 1.2 Transactional annotation), or we could create a transacted session and commit the local JMS transaction in the publisher code. In most cases we want to publish events as part of some other processing in a transaction, so we must join the existing JTA transaction.

Thanks to AUTO_ACKNOWLEDGE mode the session will not be transacted and will join the existing JTA transaction. We can use this to span XA transactions if required.

Creating the message and setting the routing header:
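The relevant fragment of the sketch above:

```java
TextMessage message = jmsContext.createTextMessage(payload);
// Header that the routing broker will inspect.
message.setStringProperty("eventType", eventType);
```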

and finally sending the message:
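```java
jmsContext.createProducer().send(eventChannel, message);
```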

Events can be sent fully asynchronously in Java EE. In the code example above the client code will block until the JMS provider acknowledges the message. Starting from Java EE 7 we can send the message and continue processing.

Publishing a message from a CDI bean is similar; a notable detail is the use of @Transactional, a JTA 1.2 annotation that allows CDI beans to work in a transaction context (but not in the case of life cycle methods – there EJBs still have to be used).
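A sketch of the CDI variant (again, queue and header names are assumptions):

```java
import javax.annotation.Resource;
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.JMSException;
import javax.jms.Queue;
import javax.jms.TextMessage;
import javax.transaction.Transactional;

@ApplicationScoped
public class CdiEventPublisher {

    @Inject
    private JMSContext jmsContext;

    @Resource(lookup = "java:/jms/queue/EventChannel")
    private Queue eventChannel;

    // JTA 1.2 @Transactional lets this CDI bean join (or start) a JTA transaction,
    // just like a CMT EJB would.
    @Transactional
    public void publishEvent(String eventType, String payload) throws JMSException {
        TextMessage message = jmsContext.createTextMessage(payload);
        message.setStringProperty("eventType", eventType);
        jmsContext.createProducer().send(eventChannel, message);
    }
}
```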

Messages will be received by a Message Driven Bean. One of the strengths of the MDB component is that messages are delivered asynchronously with the possibility to scale the MDB resource share (thread pool, component instance pool). The EJB container takes care of other things that we would have to think about when using a CDI bean, like transaction management and the related message acknowledgement. A lot of configuration can be done declaratively using @ActivationConfigProperty. But I don’t intend to compare CDI with EJB here, so let’s see the code:
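A sketch of the routing MDB (the EventRouter collaborator and destination names are assumptions; only the broker name comes from the post):

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.inject.Inject;
import javax.jms.JMSException;
import javax.jms.JMSRuntimeException;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationLookup",
                                  propertyValue = "java:/jms/queue/EventChannel"),
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue")
})
public class SimpleRoutingEventMessageBroker implements MessageListener {

    // The MDB does not process or route messages itself - it only reads the header
    // and delegates, so routing strategies can be changed and tested separately.
    @Inject
    private EventRouter eventRouter;

    @Override
    public void onMessage(Message message) {
        try {
            String eventType = message.getStringProperty("eventType");
            eventRouter.route(eventType, message);
        } catch (JMSException e) {
            throw new JMSRuntimeException(e.getMessage(), e.getErrorCode(), e);
        }
    }
}
```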

The MDB implements MessageListener to clearly state it is a JMS MDB, as MDBs can also connect to other resource manager types. The onMessage method will be invoked to process each message, and each message will be processed separately, even though messages will be fetched in batches from the JMS provider. The default acknowledgment mode for an MDB component is AUTO_ACKNOWLEDGE, with DUPS_OK_ACKNOWLEDGE allowed to be set.

SimpleRoutingEventMessageBroker can be a source of other events. How you process events is system specific. In some cases simple XML or JSON routing specifications will be sufficient, in other cases a Business Rules engine will help to trigger events. CEP (Complex Event Processing) is a different league.

As the comment in the code above states, the MDB component that gets messages from the event channel should not take care of message processing or routing itself. Message processing should be executed in a dedicated component, so that we can use different routing strategies and the component will be easier to maintain. Testing the routing code separately from JMS can also be a plus – although we can test it with JMS, such a test will be slower than regular JUnit tests.

The event consumer implementation is similar to the SimpleRoutingEventMessageBroker MDB:
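A sketch of such a consumer (the queue, consumer class and domain service names are hypothetical):

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.inject.Inject;
import javax.jms.JMSException;
import javax.jms.JMSRuntimeException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationLookup",
                                  propertyValue = "java:/jms/queue/ResourceEvents"),
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue")
})
public class ResourceEventConsumer implements MessageListener {

    @Inject
    private ResourceService resourceService;

    @Override
    public void onMessage(Message message) {
        try {
            // Extract business data from the event and delegate to the domain service;
            // no business logic lives in the consumer itself.
            String payload = ((TextMessage) message).getText();
            resourceService.handleResourceEvent(payload);
        } catch (JMSException e) {
            throw new JMSRuntimeException(e.getMessage(), e.getErrorCode(), e);
        }
    }
}
```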

The message consumer CDI component is another story – see the code below:
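A sketch of the CDI consumer (names and the one second timeout are assumptions):

```java
import javax.annotation.Resource;
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import javax.jms.JMSConsumer;
import javax.jms.JMSContext;
import javax.jms.JMSException;
import javax.jms.JMSRuntimeException;
import javax.jms.Message;
import javax.jms.Queue;
import javax.jms.TextMessage;
import javax.transaction.Transactional;

@ApplicationScoped
public class CdiEventConsumer {

    @Inject
    private JMSContext jmsContext;

    @Resource(lookup = "java:/jms/queue/ResourceEvents")
    private Queue eventQueue;

    // Must be called by some other component (a scheduler, another service);
    // unlike an MDB it does not receive messages on its own.
    @Transactional
    public String receiveEvent() {
        try (JMSConsumer consumer = jmsContext.createConsumer(eventQueue)) {
            // Always give receive() a timeout so the caller is not blocked forever.
            Message message = consumer.receive(1000);
            return message instanceof TextMessage ? ((TextMessage) message).getText() : null;
        } catch (JMSException e) {
            throw new JMSRuntimeException(e.getMessage(), e.getErrorCode(), e);
        }
    }
}
```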

We must receive messages manually, and we can and should give a timeout for the receive operation. This component is also transactional, but it must be called by some other component in order to receive messages.

We can test the code above using Arquillian. In the case of both publishers we will test that the message is not sent if an exception is thrown.

The second test is similar, so if you’re interested, check GitHub.

In some cases it may make more sense to use a lightweight integration framework like Spring Integration or Apache Camel to create the internal EDA component. The most efficient tool should be used to do the job.

Have a nice day !

Batch processing for Java Platform – partitions and error handling

13th March 2016 Batch, Software development

In the previous post I wrote about the basic capabilities of Batch processing for the Java Platform. Now I have tried out partitions and error handling. Let’s look at error handling first. With chunk style steps we have the following options:

  • Skipping a chunk step when a skippable exception is thrown.
  • Retrying a batch step when a retryable exception is thrown. The transaction for the current step will be rolled back unless the exception is also configured as a no-rollback exception.
  • No rollback for a given exception class.

Processing applies to exceptions thrown from all phases (read, process, write) and from the checkpoint commit. We can specify which exception classes are to be included in each option – that is, whether a given exception class is a skippable exception. We can also exclude some exception classes, and the batch processing engine will use a nearest-class algorithm to choose the appropriate strategy. If we include exception class A as skippable, this class has two subclasses B and C, and we configure an exclude rule for class C, then exceptions A and B will cause the engine to skip the step, while exception C will not cause the engine to skip step execution. If no other strategy is configured, the job execution will be marked as FAILED for C.

Let’s see an example:
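A sketch of the relevant JSL fragment (the exception class is from the post; the artifact ref names and item-count are assumptions):

```xml
<jsl:step id="resource-processor">
    <jsl:chunk item-count="3">
        <jsl:reader ref="knowledgeResourceReader" />
        <jsl:processor ref="knowledgeResourceProcessor" />
        <jsl:writer ref="knowledgeResourceWriter" />
        <jsl:skippable-exception-classes>
            <jsl:include class="pl.spiralarchitect.kplan.batch.InvalidResourceFormatException" />
        </jsl:skippable-exception-classes>
    </jsl:chunk>
</jsl:step>
```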

Here we skip the step for the pl.spiralarchitect.kplan.batch.InvalidResourceFormatException exception. We add code that throws this exception to the factory method for KnowledgeResource:
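A sketch of such a factory method (the line format, field names and exception constructor are assumptions; the output later in the post suggests a title and a publication year):

```java
public static KnowledgeResource fromLine(String line) {
    String[] parts = line.split(";");
    if (parts.length != 2) {
        // Configured as a skippable exception, so a malformed line
        // only skips this item instead of failing the whole job.
        throw new InvalidResourceFormatException("Invalid resource line: " + line);
    }
    return new KnowledgeResource(parts[0].trim(), Integer.parseInt(parts[1].trim()));
}
```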

In the test file we make one line invalid.

The step for this line will be skipped.

For Batchlet style steps we need to handle exceptions ourselves. If an exception gets out of the step, the job will be marked as failed.

Another, more advanced feature is partitioning of steps. Partitioning is available for both chunk and batchlet style steps. Consider the example XML below:
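A sketch of the partitioned step (step and artifact names partly from the post, otherwise assumptions):

```xml
<jsl:step id="resource-processor" next="path-decision">
    <jsl:chunk item-count="3">
        <jsl:reader ref="knowledgeResourceReader" />
        <jsl:processor ref="knowledgeResourceProcessor" />
        <jsl:writer ref="knowledgeResourceWriter" />
    </jsl:chunk>
    <jsl:partition>
        <!-- A partition mapper bean could be referenced here instead of a static plan. -->
        <jsl:plan partitions="2" threads="2" />
        <jsl:collector ref="knowledgeResourcePartitionCollector" />
        <jsl:analyzer ref="knowledgeResourcePartitionAnalyzer" />
    </jsl:partition>
</jsl:step>
```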

In this configuration we specify that there are to be two partitions and two threads to process them, so one thread per partition. This configuration can also be specified using a partition mapper, as the comment in the XML configuration snippet describes.

The partition collector’s role is to gather data from each partition for the analyzer. There is a separate collector for each thread.

The partition analyzer collects data from all partitions and runs on the main thread. It can also decide on the batch status value to be returned.

In order to understand how this works it may be helpful to look at the algorithm descriptions in chapter 11 of the JSR spec. An important detail described there is that for each partition the step’s intermediary data is stored in the StepContext. With this knowledge we can create a super simple collector – keeping in mind that we could process the intermediary result here, we just don’t need to:
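A minimal sketch of such a collector, assuming the writer stores its chunk result as transient user data in the StepContext (see the writer sketch below):

```java
import java.io.Serializable;
import javax.batch.api.partition.PartitionCollector;
import javax.batch.runtime.context.StepContext;
import javax.inject.Inject;
import javax.inject.Named;

@Named
public class KnowledgeResourcePartitionCollector implements PartitionCollector {

    // Each partition thread gets its own StepContext (and its own collector instance).
    @Inject
    private StepContext stepContext;

    @Override
    public Serializable collectPartitionData() throws Exception {
        // Forward the intermediary data to the analyzer running on the main thread;
        // we could process it here, we just don't need to.
        return (Serializable) stepContext.getTransientUserData();
    }
}
```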

This collector will be executed for each step with the result returned by the writer step – you can find the modified KnowledgeResourceWriter below:
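A sketch of the modified writer, under the same assumption that the chunk result is handed over via StepContext transient user data (the KnowledgeResource type is from the post, the rest is illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.batch.runtime.context.StepContext;
import javax.inject.Inject;
import javax.inject.Named;

@Named
public class KnowledgeResourceWriter extends AbstractItemWriter {

    @Inject
    private StepContext stepContext;

    @Override
    public void writeItems(List<Object> items) throws Exception {
        List<KnowledgeResource> written = new ArrayList<>();
        for (Object item : items) {
            KnowledgeResource resource = (KnowledgeResource) item;
            System.out.println("storing resource: " + resource);
            written.add(resource);
        }
        // Make the chunk's result visible to the partition collector on this thread.
        stepContext.setTransientUserData(written);
    }
}
```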

Then we can write the code for a super simple analyzer:
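A sketch of the analyzer (the KnowledgeResources holder and the "STORE" parameter are assumptions based on the rest of the post):

```java
import java.io.Serializable;
import java.util.List;
import javax.batch.api.partition.AbstractPartitionAnalyzer;
import javax.batch.runtime.context.JobContext;
import javax.inject.Inject;
import javax.inject.Named;

@Named
public class KnowledgeResourcePartitionAnalyzer extends AbstractPartitionAnalyzer {

    // Runs on the main thread, so it is safe to aggregate results here.
    @Inject
    private KnowledgeResources knowledgeResources;

    @Inject
    private JobContext jobContext;

    @Override
    @SuppressWarnings("unchecked")
    public void analyzeCollectorData(Serializable data) throws Exception {
        if (data == null) {
            return;
        }
        // Same aggregation the writer used to do before partitioning.
        knowledgeResources.addAll((List<KnowledgeResource>) data);
        // Parameter previously set in the writer, used by the decider / next step.
        jobContext.setTransientUserData("STORE");
    }
}
```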

KnowledgeResourcePartitionAnalyzer adds all resources to the same list the writer step used to. It also sets the parameters used by the next step, which were previously set in the writer.

Now when we execute the modified example we see some strange output.

Yikes !

1. We did not tell each partition what data it is to analyze, nor did we put this logic (deciding what data should be processed) in the step definition itself.

2. “storing resource: KnowledgeResource [title=Hadoop 2 and Hive, published= 2015]” is repeated four times! – this is because:

“The collector is invoked at the conclusion of each checkpoint for chunking type steps and again at the end of partition; it is invoked once at the end of partition for batchlet type steps.”

Back to the lab then

To fix the first problem we simply split the file in the first step into two files – one for each partition. We will create two files statically (articles1.txt and articles2.txt). Each partition will read one file – the file name will be configured (a simplification for the demo – in a real life application this would probably be a bit more complicated). So we need to change the file reading implementation and the configuration.

The important lines in the configuration above are:

* <jsl:property name="fileName" value="articles1.txt"/> and <jsl:property name="fileName" value="articles2.txt"/> – here we configure the file name for each partition

* <jsl:property name="fileName" value="#{partitionPlan['fileName']}"/> – here we configure the file name for the reader component; this needs to be done to transfer the configured file name from the partition plan to the step properties (an inconvenience ; ) – JSR-352 is young).

The reader component will get the fileName injected (a cool feature of the young JSR).

And it will use this file name to construct the resource path:
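A fragment of the reader illustrating this (the rest of the reader class is omitted; the classpath-based path construction is an assumption):

```java
// Injected from the partition plan property configured in the JSL above.
@Inject
@BatchProperty(name = "fileName")
private String fileName;

@Override
public void open(Serializable checkpoint) throws Exception {
    // The partition-specific file name is used to build the resource path.
    reader = new BufferedReader(new InputStreamReader(
            getClass().getResourceAsStream("/" + fileName)));
}
```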

This fixes the first problem.

To fix the second problem we need to know whether this is the end of the partition. We can check it in a few ways:

* check if we have processed the given number of elements – this would require some service that monitors all partition processing

* check the thread name – partitions are executed on separate threads; a poor solution, as threads may be pooled, so this won’t work

* check if we have already processed a given element (using some hash)

I haven’t found any API that would tell whether partition execution is done and this is the last call to the collector. Maybe this last call is an error in JBeret (JBoss’ implementation of Batch processing for the Java platform).

We will try the last solution – we could check whether we already processed the chunk in the analyzer, but since this is a more technical detail (the important thing was to find out why we got a duplicated element), we will just handle it in the data structure for KnowledgeResources – we simply replace the List with a Set:

After these changes we get the correct number of processed entries 🙂

Take a look at this post on Roberto Cortez’s blog regarding Java EE 7 batch processing.

Have a nice day ! Code is on github.

 

Trying out Java EE Batch

7th March 2016 Batch, Java EE, Software development

I decided to try Java EE Batch, then Spring Batch and finally Spring for Hadoop with Pig. Java EE Batch is a relatively new specification, but for a number of use cases it will be sufficient. That sentence says it all – I think Java EE Batch is OK. Nothing more yet. As you will see, you have to do a lot of things yourself, and doing other things is just cumbersome.

So what is Java EE Batch, a.k.a. JSR 352 (Batch Processing for the Java Platform)? It is a Java EE family specification, heavily inspired by Spring Batch, describing a framework for batch job processing. It allows jobs to be described in the Job Specification Language (JSL) – only an XML format is supported. I guess that for batch apps this may be OK. I can live with it 😉

Java EE Batch specifies two types of processing:

  • Chunked – here we have 3 phases – reading, processing and writing. Chunked steps can be restarted from some Checkpoint.
  • Batchlet – this is a free form step – do whatever is required.

Java EE Batch has many features like:

  • Routing processing using decision steps
  • Grouping steps into flows
  • Partitioned steps – step instances to be executed in parallel. A partitioned step allows a job to be split across multiple threads in a more technical way. We can specify a partition mapper, a partition reducer and other partition specific elements. I won’t go into detail on this now.
  • Splits – splits specify flows to be executed in parallel (there can be a few flow definitions), so this is a more process or business oriented parallelization of work.
  • Exception handling:
    • Skipping steps
    • Retrying steps
  • JobContext to pass batch job data to steps

CDI is used and supported. But with all of this, specifying job context parameters is inconvenient (in 2005 I would have said that it’s OK ;)), and we must do a lot ourselves – like moving files, reading them and passing intermediate results. It would be nice to have some utilities to help with standard batch related chores.

OK, let’s see an example – below you can find a JSL definition for a demo batch job. You can build this XML with ease using the Eclipse XML editor (or some plugins for NetBeans) – just choose the XML from schema option and then select XML Catalog and the jobXML_1_0.xsd entry:

(screenshot: choosing the jobXML_1_0.xsd entry from the XML Catalog)

First we move the file from the import location to some temporary location so it can be processed without any problem:

next="resource-router" specifies the next step to be invoked – there is a possibility for a typo here. So, an inconvenience.

<jsl:batchlet ref="initializeResourceImport" /> – specifies a batchlet (free form job) step that will move the file. The instance of the class for this step will be instantiated as a CDI bean named “initializeResourceImport”:
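A sketch of that batchlet (the import and work directory paths and the file name are assumptions):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import javax.batch.api.AbstractBatchlet;
import javax.inject.Named;

@Named("initializeResourceImport")
public class InitializeResourceImport extends AbstractBatchlet {

    @Override
    public String process() throws Exception {
        // Move the import file to a working directory so it can be processed safely.
        Path importFile = Paths.get("import", "articles.txt");
        Path workDir = Paths.get("workDir");
        Files.createDirectories(workDir);
        Files.move(importFile, workDir.resolve(importFile.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        return "COMPLETED";
    }
}
```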

Here we move the file to a workDir location.

Next we start a flow and execute another step type – a chunked step. The next step id is given in the next attribute.

Each of the phases is implemented by a CDI bean. So reading looks like this:
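A sketch of the reader (file location, checkpoint format and class name are assumptions):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Serializable;
import javax.batch.api.chunk.AbstractItemReader;
import javax.inject.Named;

@Named
public class KnowledgeResourceReader extends AbstractItemReader {

    private BufferedReader reader;
    private long lineNumber;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        reader = new BufferedReader(new InputStreamReader(
                getClass().getResourceAsStream("/articles.txt")));
        // Restart support: skip the lines already processed before the last checkpoint.
        if (checkpoint != null) {
            long alreadyRead = (Long) checkpoint;
            for (long i = 0; i < alreadyRead; i++) {
                reader.readLine();
            }
            lineNumber = alreadyRead;
        }
    }

    @Override
    public Object readItem() throws Exception {
        // One invocation reads a single line, which is passed on to the processor.
        String line = reader.readLine();
        if (line != null) {
            lineNumber++;
        }
        return line;
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        return lineNumber;
    }

    @Override
    public void close() throws Exception {
        if (reader != null) {
            reader.close();
        }
    }
}
```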

We open and close the stream in this step, and we provide a checkpoint definition in order to be able to restart from some point. A single invocation of this reader only reads a single line, which is passed down to the process part of the step. It would be easier if the Batch framework provided some reading utilities and let us worry about doing the important stuff with data rather than about opening and properly closing files. But the foundation functionality provided now is also required, and features that make Batch Processing for the Java Platform more convenient will hopefully be added in the next versions of the spec.

The next phase is processing the line of data read from the file – the only thing done here is the conversion from String to a KnowledgeResource instance:
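A sketch of the processor (the fromLine factory method is an assumed helper):

```java
import javax.batch.api.chunk.ItemProcessor;
import javax.inject.Named;

@Named
public class KnowledgeResourceProcessor implements ItemProcessor {

    @Override
    public Object processItem(Object item) throws Exception {
        // Only conversion happens here: a text line becomes a KnowledgeResource.
        return KnowledgeResource.fromLine((String) item);
    }
}
```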

And finally we can write the data:
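A sketch of the writer (the add method on the KnowledgeResources holder is an assumption):

```java
import java.util.List;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.inject.Inject;
import javax.inject.Named;

@Named
public class KnowledgeResourceWriter extends AbstractItemWriter {

    // Application scoped holder, used here only to keep the demo (and its tests) simple.
    @Inject
    private KnowledgeResources knowledgeResources;

    @Override
    public void writeItems(List<Object> items) throws Exception {
        // The writer phase receives the whole chunk produced by the processor.
        for (Object item : items) {
            KnowledgeResource resource = (KnowledgeResource) item;
            System.out.println("storing resource: " + resource);
            knowledgeResources.add(resource);
        }
    }
}
```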

This is also simple processing, in order to have some fun with the Java EE Batch framework without worrying about more job related details. The important part here is that this phase receives a list of objects that are the results of the processing phase.

Note the use of an application scoped CDI bean of class KnowledgeResources – this is done this way in order to ease testing. Of course in a real life batch job I would not think about using an application scoped bean without any special handling – it’s like having a state field in a Servlet where requests store data 😉 A very bad idea. This job passes a result of processing that will be used by the Decider instance to route processing. This simplish demo uses a hard coded value.

After processing data in the resource-processor step we proceed to the decision step – path-decision. This step is also implemented by a CDI bean named pathDecider.

Depending on what value is returned from the Decider instance, some route will be chosen. Steps that are to be invoked next are designated by id in the to attribute. Java based configuration would allow us to get rid of some typos here.

Again the implementation is simplish:
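A sketch of the decider (the bean name and the STORE value come from the post; the rest is illustrative):

```java
import javax.batch.api.Decider;
import javax.batch.runtime.StepExecution;
import javax.inject.Named;

@Named("pathDecider")
public class PathDecider implements Decider {

    @Override
    public String decide(StepExecution[] executions) throws Exception {
        // A real decider would inspect the step executions or job data;
        // this demo simply always routes to the STORE path.
        return "STORE";
    }
}
```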

The Decider chooses one of two steps (OK, it does choose the STORE step every time…):
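A sketch of the decision element in the JSL (the target step ids are assumptions):

```xml
<jsl:decision id="path-decision" ref="pathDecider">
    <jsl:next on="STORE" to="resource-store" />
    <jsl:next on="PUBLISH" to="resource-publish" />
</jsl:decision>
```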

The CDI beans for both steps are similar:

An important note about running the test – we have to use a stateless EJB to start the batch job because of class loader ordering – Arquillian does not see the Java EE Batch resources:
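A sketch of such a starter EJB (the job XML name is an assumption):

```java
import java.util.Properties;
import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;
import javax.ejb.Stateless;

@Stateless
public class BatchImportService {

    // Started from an EJB so that the batch artifacts are visible to the right
    // class loader when the test runs inside the container.
    public long startImport() {
        JobOperator jobOperator = BatchRuntime.getJobOperator();
        return jobOperator.start("resource-import", new Properties());
    }
}
```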

So we use BatchImportService to start the batch. In a real app we would probably use Claim Check like processing in order to pass data to an SFTP server and trigger processing using a JMS message or a Web Service call. We could also monitor the SFTP with some service (Oracle Service Bus, for example, can monitor SFTP and start processing).

And this is it – a simple evaluation of the basic features of Java EE Batch (or Batch Processing for the Java Platform). The next thing would be to add partitioned steps or splits. In the meantime – the code is on github.

Apart from batch applications we could use stream or event processing. The chosen architecture depends on the requirements of the system under construction. If order is important but processing delay is not, then batch processing is worth checking. If we need to react more quickly, or we want to continue processing data even though processing of some part of it failed, and we want high reliability, then messaging may be the way to go. But all this is a subject for another article 🙂

Have a nice day and a lot of fun with Java EE !

JPA 2.1 – Converters

20th February 2016 Java EE, JPA, Software development

Converting values between the object data model and the relational data model in JPA 2.1 is super easy. Here are the steps:

1. Declare a converter for the value pair that you want to be converted. Let’s say that we want to convert boolean values to integers; then we declare the converter like this:
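A sketch of such a converter (the class name is illustrative):

```java
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;

@Converter
public class BooleanToIntegerConverter implements AttributeConverter<Boolean, Integer> {

    @Override
    public Integer convertToDatabaseColumn(Boolean attribute) {
        // Object model -> relational model
        return Boolean.TRUE.equals(attribute) ? 1 : 0;
    }

    @Override
    public Boolean convertToEntityAttribute(Integer dbData) {
        // Relational model -> object model
        return dbData != null && dbData == 1;
    }
}
```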

The converter has to implement javax.persistence.AttributeConverter<ObjectModelType, RelationalModelType>. This interface declares two methods with very clear names – at first sight one can tell what they are responsible for.

Instead of using the annotation you can of course use the XML descriptor.

2.a Declare the Converter class to be applied automatically – just add the autoApply attribute set to true to the Converter annotation:
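```java
@Converter(autoApply = true)
public class BooleanToIntegerConverter implements AttributeConverter<Boolean, Integer> {
    // ... same methods as above ...
}
```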

2.b If the Converter is not applied automatically, then annotate the entity, mapped superclass or embeddable class attribute with the javax.persistence.Convert annotation (or add the appropriate XML element to the XML descriptor):
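For example (the entity is illustrative; the locked attribute is the one discussed below):

```java
import javax.persistence.Convert;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Account {

    @Id
    private Long id;

    // Explicitly pick the converter for this attribute.
    @Convert(converter = BooleanToIntegerConverter.class)
    private Boolean locked;
}
```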

The converter attribute specifies the Converter class to be used for the conversion.

If you annotate an attribute that is not a basic type or an element collection of basic types, then you must also add the attributeName attribute to the annotation to tell JPA which attribute you want the converter to be applied to.

The Convert annotation can also exclude or overwrite the default conversion, e.g. if we have defined some other converter like:
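A sketch of such an auto-applied converter (name and Y/N mapping are assumptions):

```java
// Auto-applied to every Boolean attribute that has no explicit @Convert.
@Converter(autoApply = true)
public class BooleanToYesNoConverter implements AttributeConverter<Boolean, String> {

    @Override
    public String convertToDatabaseColumn(Boolean attribute) {
        return Boolean.TRUE.equals(attribute) ? "Y" : "N";
    }

    @Override
    public Boolean convertToEntityAttribute(String dbData) {
        return "Y".equals(dbData);
    }
}
```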

then this converter would not be applied to the aforementioned locked field, because the other converter is specified directly on the field.

In order to exclude an attribute from conversion, the Convert annotation must have the disableConversion attribute set to true.

Some hints:

According to the JPA 2.1 specification, point 11.1.10 Convert Annotation, the Convert annotation should not be applied to some attributes, such as id attributes, Enumerated and Temporal (annotated) types, relationship attributes or version attributes.

The Convert annotation can be applied to an entity class that extends a mapped superclass in order to overwrite the conversion mapping; see point 11.1.10, Example 11: Override conversion mappings for attributes inherited from a mapped superclass.

When using a converter in a Java SE environment, one needs to specify the converter class in persistence.xml, the same as with entity classes.

That’s it ! Have a nice day 🙂

Setting up Arquillian for Java EE testing

30th January 2016 Java EE, Testing

This post is about aliens. There will be a new X-Files series. I watched some episodes; I’m not a big fan, but this show had its good sides. In Java, aliens have been present for some time – I don’t even remember when they landed. My first third-degree encounter with them was around 2012. The alien that I saw looked a little bit like this:

(Arquillian logo)

Arquillian (a difficult name – but hey! he’s an alien) is a platform for running tests of Java EE components inside Java EE, CDI and Servlet containers. There are modules available for functional testing (Graphene / Drone), persistence layer testing and others. It can manage embedded containers, manage containers by itself, or attach to an external container started in some other way. Arquillian creates deployment archives for these containers and provides us with test runners. Available containers are listed in the modules section of the Arquillian home page and in the reference guide. The modules section is more up to date, but it does not provide a complete guide on how to set up the test environment.

Arquillian lets you test different component types depending on the container used, but most of the time you will probably test with a full Java EE container or a Servlet container. The server can even be downloaded by a Maven plugin during the build; this will be useful for the demo now. I think I may reuse this setup for the next posts about Java EE, but then I will use an external Wildfly or some other Java EE server. I would like to build a knowledge management application in Java EE using Domain Driven Design.

Why would I need module testing? I prefer to test integrated modules, as a lot of bugs are related to errors in cooperation between modules. With rigorous unit testing (that is, tests that test a single unit of code) and careful design it may be possible to achieve comparable quality – after all, every component provides an API that can be tested. But then again, a unit test that covers the case where a component throws InvalidArgumentException will not prepare us for actually handling this exception in client components. Yes, it can be handled in the client component (or some higher level component up the call stack), but you also must detect the cases where you forgot to take care of it – and this is where rigorous unit testing comes into consideration. In real life projects it’s sometimes not so easy. After all, software development is about providing software that delivers business value with the required quality and architecture metrics. Having this in mind, I prefer to create automated functional (e2e) tests (always, for non-trivial projects) and then tests that are more module or unit tests, depending on the component. Testing integrated code often requires a container – like a Spring or Java EE container. This is where Arquillian comes in.

Getting back to our alien – setting up Arquillian is not easy. My setup is not the most up to date one. I tried to update it, spent some time fighting with the Wildfly / Arquillian configuration and decided that switching to the Arquillian Universe instead of the older setup is not the most important thing. After all we will be using Wildfly 8.1 and Java EE 7 – good enough 😉

The first thing is to set up the Maven configuration:

1. Set up integration test plugin

An important detail here is to set up system property variables, as these will be used by Arquillian and passed to Wildfly. We will use standalone-full.xml so that we can test APIs like JMS that aren’t available in the standalone config.

2. Download and configure Wildfly

3. Dependencies

Here is the setup in the parent pom – notice:

  • the jboss-javaee-all-7.0 dependency – we do not use the javaee-api dependency as it contains only the API without implementation (look at the warning here)
  • arquillian-bom – this POM contains all dependencies required for Arquillian; we import it so that we do not have to specify the full dependency configuration ourselves

and core module:

Now we can set up Wildfly in standalone-full.xml. The important part is overwriting the port values here – originally the HTTP management port conflicts with an NVidia application on Windows. We may also overwrite the HTTP port (but we are not really required to do so).

Finally, the setup in arquillian.xml, a configuration file used by Arquillian. We just need to tell Arquillian the number of the management port (set up in standalone-full.xml):
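A sketch of such an arquillian.xml (the container qualifier and the port value are assumptions matching a typical WildFly managed setup):

```xml
<arquillian xmlns="http://jboss.org/schema/arquillian">
    <container qualifier="wildfly" default="true">
        <configuration>
            <!-- Must match the management port overridden in standalone-full.xml -->
            <property name="managementPort">10090</property>
            <property name="serverConfig">standalone-full.xml</property>
        </configuration>
    </container>
</arquillian>
```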

In order to create a test we need to specify the deployment that Arquillian will work with. We do it using the ShrinkWrap module:
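A sketch of a deployment method (archive name and package are illustrative):

```java
@Deployment
public static WebArchive createDeployment() {
    return ShrinkWrap.create(WebArchive.class, "module-test.war")
            // Classes under test plus their collaborators.
            .addPackages(true, "pl.spiralarchitect.kplan")
            // CDI needs a beans.xml marker to activate bean discovery.
            .addAsWebInfResource(EmptyAsset.INSTANCE, "beans.xml");
}
```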

Now we are ready to do Java EE module testing:
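A sketch of a module test using that deployment (the service under test and its methods are assumptions):

```java
import javax.inject.Inject;
import org.jboss.arquillian.container.test.api.Deployment;
import org.jboss.arquillian.junit.Arquillian;
import org.jboss.shrinkwrap.api.ShrinkWrap;
import org.jboss.shrinkwrap.api.asset.EmptyAsset;
import org.jboss.shrinkwrap.api.spec.WebArchive;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(Arquillian.class)
public class KnowledgeResourceServiceIT {

    // Deployment as shown in the previous snippet.
    @Deployment
    public static WebArchive createDeployment() {
        return ShrinkWrap.create(WebArchive.class, "module-test.war")
                .addPackages(true, "pl.spiralarchitect.kplan")
                .addAsWebInfResource(EmptyAsset.INSTANCE, "beans.xml");
    }

    // The test method runs inside WildFly, so real container services can be injected.
    @Inject
    private KnowledgeResourceService service;

    @Test
    public void shouldStoreKnowledgeResource() {
        service.store(new KnowledgeResource("Hadoop 2 and Hive", 2015));
        Assert.assertEquals(1, service.findAll().size());
    }
}
```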

Example code at github.

Have a nice day 😉

JPA 2.1 – generating database for unit and integration tests

24th January 2016 JPA

Generating a DB for unit tests is pretty easy in JPA 2.1 – or rather, as easy as it was, but a lot more portable and with more management options. We have a few new entries available in persistence.xml (section 8.2.1.9 Properties of the JPA 2.1 spec); those relevant to DB creation are:

  • javax.persistence.schema-generation.create-script-source – points the JPA provider to the script used when creating the DB
  • javax.persistence.schema-generation.drop-script-source – points the JPA provider to the script used when tearing down the DB
  • javax.persistence.sql-load-script-source – points the JPA provider to the script that loads the data
  • javax.persistence.schema-generation.database.action – what action should be performed on the database: is the JPA provider to create the database, drop it, do both (drop-and-create), or maybe none?
  • javax.persistence.schema-generation.scripts.action – as above, but used for the generation of script files
  • javax.persistence.schema-generation.create-source – what source of information the JPA provider should use when creating the database: metadata, script, script-then-metadata or metadata-then-script
  • javax.persistence.schema-generation.drop-source – as above, but for dropping the database
  • javax.persistence.schema-generation.scripts.create-target – target location for the generated create script
  • javax.persistence.schema-generation.scripts.drop-target – target location for the generated drop script

Example configuration:
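A sketch of such a configuration (persistence unit name, entity class and script file names are assumptions; the property set matches the description below):

```xml
<persistence-unit name="kplan-test" transaction-type="RESOURCE_LOCAL">
    <class>pl.spiralarchitect.kplan.domain.KnowledgeResource</class>
    <properties>
        <!-- H2 in Oracle compatibility mode -->
        <property name="javax.persistence.jdbc.driver" value="org.h2.Driver"/>
        <property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:kplan;MODE=Oracle"/>
        <!-- Create and drop the database from entity metadata -->
        <property name="javax.persistence.schema-generation.database.action" value="drop-and-create"/>
        <property name="javax.persistence.schema-generation.create-source" value="metadata"/>
        <property name="javax.persistence.schema-generation.drop-source" value="metadata"/>
        <!-- Also generate the scripts into the project root (handy for writing the load script) -->
        <property name="javax.persistence.schema-generation.scripts.action" value="drop-and-create"/>
        <property name="javax.persistence.schema-generation.scripts.create-target" value="create.sql"/>
        <property name="javax.persistence.schema-generation.scripts.drop-target" value="drop.sql"/>
        <!-- Test data loaded from the classpath (src/main/resources) -->
        <property name="javax.persistence.sql-load-script-source" value="data-load.sql"/>
    </properties>
</persistence-unit>
```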

In the example above an H2 database is created in Oracle compatibility mode. Scripts are generated in the root folder of the project but are not used to create or drop the database – I used them to make creating the load script easier. The load script is read from src/main/resources 🙂 Hibernate will also read it from the root folder, but EclipseLink refuses to do so.

An interesting option is using metadata to create the DB and then a DDL script to add some specific details or stored procedures (yes, you could do it e.g. in Hibernate using hibernate.cfg.xml, but the current solution is more portable).

An important detail – notice that all classes are listed in the Persistence Unit. Some providers support class detection when the PU is bootstrapped in a Java SE environment, some do not. It is safer to list the classes. The code that uses unit tests and bootstraps the database can be downloaded from GitHub.

If portability does not mean losing some functionality, then why not use this option? On the other hand, if the price would be to sacrifice some required and well designed, written and maintained functionality, then a better solution would be to encapsulate the provider dependent part rather than use the portable mechanism.

Have en dejlig dag!

JPA 2.1 named entity graphs in Hibernate and EclipseLink

21st January 2016 Java EE, JPA

JPA 2.1 enables us to use named entity graphs that describe which part of an entity graph should be loaded. Instead of using one default graph described by FetchType attributes on relation fields and basic fields (in the latter case code weaving is required), we can now define a few named graphs. Should I mention Fetch Joins used in JPQL and the Criteria API? Well, I don’t find these use cases equal:

  • With Fetch Joins you have to duplicate the query or use tricks like storing the core part of the query in one field and then concatenating the case specific part. On the other hand you can do advanced filtering, grouping and ordering.
  • With the Criteria API you can do the same, but you can extract the common query part to some method.
  • With both the Criteria API and JPQL, in order to cache results in the L2 cache you must enable the query cache.

The use cases for named entity graphs are the ones where we do not need advanced filtering or grouping and we do not want to use the query cache. Entities fetched with an entity graph are cached in the L1 cache if the EntityManager cache is used. So how can we use named entity graphs?

Let’s say we are given the following tables mapped to entities:

(diagram: Customer entity with its related Phone and Order entities)

On the Customer entity we can specify named entity graphs:
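A sketch of such a definition (the graph names come from the post; the attribute and relation names are assumptions):

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedAttributeNode;
import javax.persistence.NamedEntityGraph;
import javax.persistence.NamedEntityGraphs;
import javax.persistence.OneToMany;

@Entity
@NamedEntityGraphs({
    @NamedEntityGraph(name = Customer.CUSTOMERS_WITH_PHONES,
        attributeNodes = {
            @NamedAttributeNode("name"),
            @NamedAttributeNode("phones")
        }),
    @NamedEntityGraph(name = Customer.CUSTOMER_DATA,
        attributeNodes = {
            @NamedAttributeNode("name")
        })
})
public class Customer {

    public static final String CUSTOMERS_WITH_PHONES = "CRT.CUSTOMERS_WITH_PHONES";
    public static final String CUSTOMER_DATA = "CRT.CUSTOMER_DATA";

    @Id
    private Long id;

    private String name;

    @OneToMany(mappedBy = "customer")
    private List<Phone> phones;

    @OneToMany(mappedBy = "customer")
    private List<Order> orders;
}
```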

In the code snippet above there are two named entity graphs:

  • CRT.CUSTOMERS_WITH_PHONES – this one fetches customer data and the customer’s phones
  • CRT.CUSTOMER_DATA – this one fetches only basic customer data

These graphs are simple, which makes it even easier to show some properties. Every entity graph is a set of:

  • attributeNodes – define the fields/properties that are to be fetched; they can reference subgraphs. Here we use the NamedAttributeNode annotation whose value is the name of the field/property that is to be fetched
  • subgraphs – define graphs for related entities, element collections and embedded classes
  • subclassSubgraphs – define subgraphs for subclasses

@NamedEntityGraph also defines the includeAllAttributes attribute that tells the JPA provider to load all entity fields/properties; subgraphs still have to be specified.

Now that we have the graphs defined, we can use them in two ways:

  • as fetch graphs – in this case, if an attribute is not specified in the named entity graph (including subgraphs) it is treated as if it were lazily loaded
  • as load graphs – in this case the default fetch type is used for attributes that are not specified in the named entity graph (or one of its subgraphs)

Let’s test it !

A test with the Hibernate JPA provider:
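A sketch of such a test (a fragment; it assumes a JUnit test with an entityManagerFactory already set up):

```java
@Test
public void shouldFetchOnlyCustomerData() {
    EntityManager newEntityManager = entityManagerFactory.createEntityManager();

    List<Customer> customers = newEntityManager
            .createQuery("SELECT c FROM Customer c", Customer.class)
            // fetch graph: attributes missing from the graph behave as if they were LAZY
            .setHint("javax.persistence.fetchgraph",
                     newEntityManager.getEntityGraph(Customer.CUSTOMER_DATA))
            .getResultList();

    PersistenceUnitUtil unitUtil = entityManagerFactory.getPersistenceUnitUtil();
    Assert.assertFalse(unitUtil.isLoaded(customers.get(0), "phones"));
    Assert.assertFalse(unitUtil.isLoaded(customers.get(0), "orders"));
}
```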

We specify the fetch graph to be used by setting a hint on the query instance. As the hint value we specify the named entity graph retrieved from the EntityManager – newEntityManager.getEntityGraph(Customer.CUSTOMER_DATA). Orders and phones are not fetched. Note that we use the “javax.persistence.fetchgraph” hint – so the fields/properties not specified in the graph will be treated as lazily loaded.

Let’s test the graph with phones:

Here we are using a fetch graph, but this time the named entity graph includes phones and they are loaded.

How about load graph ?

In the test above the named entity graph does not specify phones or orders, but the load graph hint is used, so the default fetch type applies. And both orders and phones are not fetched.

How about the finder method? Let’s see:
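For example (a sketch; customerId is assumed to be a known identifier):

```java
Map<String, Object> hints = new HashMap<>();
hints.put("javax.persistence.loadgraph",
          newEntityManager.getEntityGraph(Customer.CUSTOMERS_WITH_PHONES));

// find() accepts hints directly, so the graph applies to the single query issued here.
Customer customer = newEntityManager.find(Customer.class, customerId, hints);
```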

Named entity graphs work the same way here. Additionally, with query logging enabled we can see that the query for the customer is done once.

If we would like to use named entity graphs with one-to-one relations or basic fields/properties, then we will be disappointed:

Both Hibernate and EclipseLink will fetch the additional fields in the case of one-to-one mappings and basic properties, and they do this in accordance with the JPA spec.

I think I’d rather stick to Fetch Joins and the Criteria API.

The code with examples is available on GitHub.

Cheers !

JPA 2.1 – constructor expressions

21st December 2015 Java EE, JPA

In cases where we fetch lists of data we should fetch as little data as is reasonable. What is a reasonable amount of data? If we fetch a list with 25 records and display 3 out of 4 entity properties, then we can fetch the whole entity. But if the amount of data is controlled by the user (say up to about 100 records per page), or better, something like a value list holder with an in-memory cache is implemented, and if there can be several users, then we should really design our domain services to return the data needed for displaying the view and performing further operations, and not much more. An example of bad design would be a list that displays 5 properties from 2 entities that together have over 15 properties. In this case we fetch up to ten properties without a reason. Some of them may be required to e.g. display a details view, but not all of them.

In JPA we can solve this issue by querying for individual fields and then working with Object arrays or Tuple objects. We can also use Value Objects constructed using constructor expressions. This gives us easier refactoring and maintenance – it is much easier to work with a Value Object than with an array. But prior to JPA 2.1 we were limited to JPQL constructor expressions. Let’s say we have a Customer entity that references an Order entity and a few others, and we want to display a list of a customer’s orders (represented by the customer name) with the order date and id. In JPQL we could do:
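A sketch of such a query (the CustomerOrderVO class, its package and the field names are assumptions):

```java
List<CustomerOrderVO> orders = entityManager.createQuery(
        "SELECT NEW pl.spiralarchitect.crm.CustomerOrderVO(c.name, o.id, o.orderDate) "
      + "FROM Customer c JOIN c.orders o WHERE c.id = :customerId",
        CustomerOrderVO.class)
    .setParameter("customerId", customerId)
    .getResultList();
```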

This does not solve cases where we must use the Criteria API or native queries. Starting from JPA 2.1 we can use constructor expressions there too!

The same query as above, written with the Criteria API (JPA 2.1):
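A sketch under the same assumptions as the JPQL version:

```java
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<CustomerOrderVO> query = cb.createQuery(CustomerOrderVO.class);

Root<Customer> customer = query.from(Customer.class);
Join<Customer, Order> order = customer.join("orders");

// construct() takes the Value Object class and the selections used as constructor arguments.
query.select(cb.construct(CustomerOrderVO.class,
        customer.get("name"), order.get("id"), order.get("orderDate")));
query.where(cb.equal(customer.get("id"), customerId));

List<CustomerOrderVO> orders = entityManager.createQuery(query).getResultList();
```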

Here we use the CriteriaBuilder.construct method, which takes the class to be constructed and the constructor parameters.

If we need to make a native query, then we must use the new @ConstructorResult annotation:
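A sketch of such a mapping (the column aliases and the CustomerOrderVO constructor are assumptions):

```java
@SqlResultSetMapping(
    name = "CustomerOrderVOMapping",
    classes = @ConstructorResult(
        targetClass = CustomerOrderVO.class,
        columns = {
            // The order must match the CustomerOrderVO constructor parameters.
            @ColumnResult(name = "customer_name"),
            @ColumnResult(name = "order_id", type = Long.class),
            @ColumnResult(name = "order_date")
        }))
@Entity
public class Customer { /* ... */ }
```

The mapping is then referenced by name as the second argument of entityManager.createNativeQuery(sql, "CustomerOrderVOMapping").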

@ConstructorResult also requires the target class and an array of @ColumnResult annotations that point to the column names from the query that will become constructor parameter values. The order of the @ColumnResult annotations is important and must match the Value Object constructor.

Have a nice day 😉

How to avoid repeating cachable DB queries in JPA

30th August 2015 JPA

JPA has two levels of cache:

  • Persistence Context level cache (L1 cache) – according to the JPA specification, an instance of a Persistence Context can have at most one reference to an instance of a given entity representing a row in the database.
  • Second level cache (L2 cache) – this cache is shared and entries in it live longer. The internal construction of this cache can of course vary between JPA providers – there can be subtypes of this cache like an entity cache, a relations cache or a query cache.

You can read about those in many articles, including this great article by Caren McDonald.

There are two less obvious things related to the use of the L1 cache. One is given straight in Caren McDonald’s article:

The L1 cache is related to the Persistence Context, not the transaction.

The second, not so obvious thing is that when you do the following criteria query:
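For instance (a sketch querying a Customer by id):

```java
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Customer> query = cb.createQuery(Customer.class);
Root<Customer> root = query.from(Customer.class);
query.where(cb.equal(root.get("id"), customerId));

// Executed against the database every time, even within one Persistence Context.
Customer customer = entityManager.createQuery(query).getSingleResult();
```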

or a JPQL query, then the query will be executed against the database each time it gets called. This is because you would need to use the query cache and the L2 cache to avoid those queries.

In order to avoid querying the database when looking for an entity with a given identifier more than once within the scope of one Persistence Context, you must use EntityManager.find:
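A minimal sketch:

```java
// Served from the L1 cache on repeated calls within the same Persistence Context.
Customer customer = entityManager.find(Customer.class, customerId);
```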

Shorter, more readable and more effective. Of course we could make the criteria query version shorter and use some utility methods – the point is that it would still execute the query. Of course in many cases it won’t matter much, as you would have to query for the given id several times for it to matter. But when designing a generic utility or some shared repository, be aware of this mechanism.