Introduction
A high-performance data access layer requires a lot of knowledge about database internals, JDBC, JPA, and Hibernate, and this article summarizes some of the key techniques you can use to optimize your enterprise application.
1. SQL statement logging
Whenever you use a framework that generates SQL statements on your behalf, you should always validate the effectiveness and efficiency of each statement. A test-time assertion mechanism is even better because it lets you catch N+1 query issues before you even commit your code.
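As an illustration, here is a minimal sketch of such a test-time assertion using Hibernate's Statistics API; it assumes hibernate.generate_statistics is enabled, that entityManagerFactory and entityManager are available in the test, and it uses a hypothetical Post entity:

import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

Statistics statistics = entityManagerFactory
        .unwrap(SessionFactory.class)
        .getStatistics();
statistics.clear();

List<Post> posts = entityManager
        .createQuery("select p from Post p", Post.class)
        .getResultList();

// If eager mappings or accidental lazy initialization generated extra
// statements (the N+1 problem), this count would be higher than 1
assertEquals(1, statistics.getPrepareStatementCount());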
2. Connection management
Because database connections are expensive, you should always use a connection pooling mechanism. And because the number of connections is limited by the capabilities of the underlying database cluster, you must release connections as fast as possible.
As with any performance tuning, you should always measure and set the right pool size rather than guess it. A tool like FlexyPool can help you find the right size even after you have deployed the application into production.
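As a sketch, here is how a pool could be sized explicitly instead of relying on defaults. HikariCP is used here only as one popular example of a connection pool, and the values are placeholders you would need to validate by measuring:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:5432/app"); // placeholder URL
config.setUsername("app");
config.setPassword("secret");
// The pool size should come from measurements, not guesses
config.setMaximumPoolSize(10);
// Fail fast instead of queueing requests forever when the pool is exhausted
config.setConnectionTimeout(3000);

HikariDataSource dataSource = new HikariDataSource(config);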
3. JDBC batching
JDBC batching allows us to send multiple SQL statements in a single database roundtrip. The performance gain is significant on both the driver and the database side. PreparedStatements are very good candidates for batching, and some database systems (e.g. Oracle) support batching only for prepared statements.
Because the JDBC API defines a distinct API for batching (e.g. PreparedStatement.addBatch and PreparedStatement.executeBatch), if you generate statements manually, you need to know from the very beginning whether you are going to use batching or not. With Hibernate, you can switch to batching with a single configuration property.
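The snippet below is only a sketch: the first part shows the plain JDBC batching API mentioned above (assuming an open connection and a list of hypothetical Post objects), and the second part shows the Hibernate configuration properties that enable batching:

import java.sql.PreparedStatement;
import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

// Plain JDBC batching: group the inserts and send them in one roundtrip
try (PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO post (id, title) VALUES (?, ?)")) {
    for (Post post : posts) {
        ps.setLong(1, post.getId());
        ps.setString(2, post.getTitle());
        ps.addBatch();
    }
    ps.executeBatch();
}

// With Hibernate, batching is a matter of configuration
Map<String, Object> properties = new HashMap<>();
properties.put("hibernate.jdbc.batch_size", "30");
properties.put("hibernate.order_inserts", "true");
properties.put("hibernate.order_updates", "true");
EntityManagerFactory emf =
        Persistence.createEntityManagerFactory("my-persistence-unit", properties);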
4. Statement caching
Statement caching is one of the least-known performance optimizations that you can easily take advantage of. Depending on the underlying JDBC driver, you can cache PreparedStatements on the client side (the driver) or on the database side (either the parsed syntax tree or even the execution plan).
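For example, the PostgreSQL JDBC driver exposes statement caching through connection parameters; the values below are only illustrative, so check the documentation of the driver you actually use:

import java.sql.Connection;
import java.sql.DriverManager;

// prepareThreshold switches to server-side prepared statements right away,
// while preparedStatementCacheQueries sizes the client-side statement cache
String url = "jdbc:postgresql://localhost:5432/app"
        + "?prepareThreshold=1"
        + "&preparedStatementCacheQueries=512";
Connection connection = DriverManager.getConnection(url, "app", "secret");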
5. Hibernate identifiers
If you use Hibernate, the IDENTITY generator is not a good choice because it disables JDBC batching.
The TABLE generator is even worse because it uses a separate transaction for fetching a new identifier, which puts pressure on the underlying transaction log, as well as on the connection pool, since a separate connection is required every time a new identifier is needed.
SEQUENCE is the right choice, and even SQL Server supports sequences since the 2012 version. For SEQUENCE identifiers, Hibernate offers optimizers such as pooled and pooled-lo, which can reduce the number of database roundtrips required for fetching a new entity identifier value.
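A minimal sketch of a SEQUENCE identifier (the entity and sequence names are illustrative):

import javax.persistence.*;

@Entity
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "post_seq")
    // With allocationSize > 1, recent Hibernate versions pick the pooled
    // optimizer, so one sequence call covers multiple identifier values
    @SequenceGenerator(name = "post_seq", sequenceName = "post_sequence", allocationSize = 10)
    private Long id;

    private String title;
}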
6. Choosing the right column types
You should always use the right column types on the database side. The more compact the column type is, the more entries can be accommodated in the database working set, and indexes will better fit into memory. For this purpose, you should take advantage of database-specific types (e.g. inet for IPv4 addresses in PostgreSQL), especially since Hibernate is very flexible when it comes to implementing a new custom Type.
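As a small illustration of the compact-column-type idea (not the inet example, which requires a custom Type), an enum can be stored as a SMALLINT ordinal instead of a VARCHAR label; the entity below is hypothetical:

import javax.persistence.*;

@Entity
public class Post {

    public enum Status { DRAFT, PENDING, APPROVED }

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    // Two bytes per row instead of a variable-length string
    @Enumerated(EnumType.ORDINAL)
    @Column(columnDefinition = "smallint")
    private Status status;
}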
7. Relationships
Hibernate comes with many relationship mapping types, but not all of them are equal in terms of efficiency. Unlike queries, collections are less flexible since they cannot be easily paginated, meaning that we cannot use them when the number of child associations is rather high. For this reason, you should always question whether a collection is really necessary; an entity query might be a better alternative in many situations.
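For instance, instead of mapping a large @OneToMany collection, the children can be fetched with a query that supports pagination (the PostComment entity and postId variable are illustrative):

import java.util.List;

List<PostComment> comments = entityManager
        .createQuery(
            "select pc from PostComment pc " +
            "where pc.post.id = :postId " +
            "order by pc.id",
            PostComment.class)
        .setParameter("postId", postId)
        .setFirstResult(0)   // page offset
        .setMaxResults(20)   // page size
        .getResultList();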
8. Inheritance
When it comes to inheritance, the impedance mismatch between object-oriented languages and relational databases becomes even more apparent. JPA offers SINGLE_TABLE, JOINED, and TABLE_PER_CLASS to deal with inheritance mapping, and each of these strategies has pluses and minuses.
SINGLE_TABLE performs the best in terms of SQL statements, but we lose on the data integrity side since we cannot use NOT NULL constraints.
JOINED addresses the data integrity limitation at the cost of more complex statements. As long as you don't use polymorphic queries or @OneToMany associations against base types, this strategy is fine. Its true power comes from polymorphic @ManyToOne associations backed by a Strategy pattern on the data access layer side.
TABLE_PER_CLASS should be avoided since it does not render efficient SQL statements.
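The following sketch illustrates the JOINED strategy together with a polymorphic @ManyToOne association against the base type (all class names are illustrative, and each class would normally live in its own file):

import java.time.LocalDateTime;
import javax.persistence.*;

@Entity
@Inheritance(strategy = InheritanceType.JOINED)
public abstract class Topic {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private String title;
}

@Entity
public class Post extends Topic {
    private String content;
}

@Entity
public class Announcement extends Topic {
    private LocalDateTime validUntil;
}

@Entity
public class TopicStatistics {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    // Polymorphic @ManyToOne against the base type
    @ManyToOne(fetch = FetchType.LAZY)
    private Topic topic;

    private long views;
}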
9. Persistence Context size
When using
JPA and Hibernate, you should always mind the Persistence Context size. For
this reason, you should never bloat it with tons of managed entities. By
restricting the number of managed entities, we gain better memory management,
and the default dirty checking mechanism is going
to be more efficient as well.
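For example, when processing many entities in a single transaction, a common sketch is to flush and clear the Persistence Context periodically so that it never holds more than one batch of managed entities (the entities list and batch size are illustrative):

int batchSize = 50;
for (int i = 0; i < entities.size(); i++) {
    entityManager.persist(entities.get(i));
    if (i > 0 && i % batchSize == 0) {
        entityManager.flush(); // push pending changes to the database
        entityManager.clear(); // detach managed entities to free memory
    }
}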
10. Fetching only what’s necessary
Fetching
too much data is probably the number one cause for data access layer
performance issues. A common mistake is to use entity queries exclusively, even for read-only projections.
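Here is a sketch of a read-only DTO projection instead, assuming a hypothetical PostSummary class with a matching constructor:

import java.util.List;

List<PostSummary> summaries = entityManager
        .createQuery(
            "select new com.example.PostSummary(p.id, p.title) " +
            "from Post p " +
            "where p.createdOn >= :since",
            PostSummary.class)
        .setParameter("since", since)
        .getResultList();

// Only the id and title columns are fetched, instead of whole managed entities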
11. Caching
Relational
database systems use many in-memory buffer structures to
avoid disk access. Database caching is very often
overlooked. We can lower response time significantly by properly tuning
the database engine so that the working set resides in memory and is not
fetched from disk all the time.
Application-level caching is not optional for many enterprise applications. It can reduce response time while offering a read-only secondary store for when the database is down for maintenance or because of some serious system failure.
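For application-level caching, a minimal sketch is Hibernate's second-level cache; this assumes a cache provider such as Ehcache is configured and hibernate.cache.use_second_level_cache is enabled, and the entity is illustrative:

import javax.persistence.*;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private String title;
}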
12. Concurrency control
The choice
of transaction isolation level is of paramount importance when it comes to
performance and data integrity. For multi-request web flows, to avoid lost
updates, you should use optimistic locking with detached entities or an
EXTENDED Persistence Context.
To avoid optimistic locking false positives, you can use versionless optimistic concurrency control or split entities into multiple ones based on write-based property sets.
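A minimal sketch of optimistic locking with a @Version attribute (the entity is illustrative); a conflicting concurrent update makes the flush fail with an OptimisticLockException instead of silently losing the update:

import javax.persistence.*;

@Entity
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private int quantity;

    // Incremented on every update and checked in the WHERE clause of the UPDATE
    @Version
    private int version;
}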
13. Unleash database query capabilities
Just
because you use JPA or Hibernate, it does not mean that you should not use
native queries. You should take advantage of Window Functions, CTEs (Common Table Expressions), CONNECT BY, and PIVOT.
These
constructs allow you to avoid fetching too much data just to transform it later
in the application layer. If you can let the database do the processing, you
can fetch just the end result, therefore, saving lots of disk I/O and
networking overhead. To avoid overloading the Master node, you can use database
replication and have multiple Slave nodes available so that data-intensive
tasks are executed on a Slave rather than on the Master.
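For example, a native query with a window function can fetch only the latest comment per post instead of pulling every comment into the application (table and column names are illustrative):

import java.util.List;

List<Object[]> latestComments = entityManager
    .createNativeQuery(
        "select * from ( " +
        "  select pc.*, " +
        "         row_number() over ( " +
        "           partition by pc.post_id order by pc.created_on desc " +
        "         ) as rn " +
        "  from post_comment pc " +
        ") pc_ranked " +
        "where pc_ranked.rn = 1")
    .getResultList();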
14. Scale up and scale out
Relational
databases do scale very well.
If Facebook, Twitter, Pinterest, or StackOverflow can scale their database systems, there is a good chance you can scale an enterprise application to meet its particular business requirements.
Database
replication and sharding are very good ways to increase throughput, and you
should totally take advantage of these battle-tested architectural patterns to
scale your enterprise application.
Conclusion
A
high-performance data access layer must resonate with the underlying database
system. Knowing the inner workings of a relational database and the data access
frameworks in use can make the difference between a high-performance enterprise
application and one that barely crawls.
There are
many things you can do to improve the performance of your data access layer, and
I’m only scratching the surface here. If you want to read more on this
particular topic, you should check my High-Performance Java
Persistence book as well. With over 450 pages, this book explains
all these concepts in great detail.