Configured in the ColdFusion administrator under the "Caching" section, server
caching controls how much data is stored in memory and when. An effective server
caching strategy can relieve stress on resources such as databases, CPUs, and
file systems while dramatically improving application performance. This article
explores server caching in ColdFusion, introducing its different pieces and
providing examples of how server caching works and its benefits.
Template Cache
All ColdFusion templates are compiled into PCode before
execution. This compilation process can be resource intensive and slow down an
application. To avoid compiling a template on every request, ColdFusion caches
its PCode in memory the first time it is called. If the cache becomes
full, ColdFusion purges templates on a first-in, first-out basis to
accommodate new requests. As a result, the next time a purged template is
called, it must be recompiled. This purging is referred to as a cache pop and
can be seen when monitoring CFSTAT.
If the CP/Sec (cache pops per second) counter is ever greater than 0, Allaire
recommends increasing the "Template Cache Size" setting. By default, "Template
Cache Size" is set to 1024 kilobytes, but a good rule of thumb is to set it to
two to five times the total size of the application's templates. Note that this
setting is a maximum limit; the memory is not allocated until it is needed. Each
template is cached only once, even if it is included in several other templates.
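As a rough illustration of the sizing rule of thumb, the sketch below totals the size of the templates in one directory and prints a range of two to five times that figure. The directory path and variable names are hypothetical, and only a single directory is scanned, so an application spread across several directories would need its totals added together.

```cfml
<!--- Hypothetical path; point this at your own template directory. --->
<CFSET TemplateDir = "C:\inetpub\wwwroot\myapp">

<!--- List the .cfm files in that one directory (subdirectories are not walked). --->
<CFDIRECTORY Action="List" Directory="#TemplateDir#" Filter="*.cfm" Name="qTemplates">

<!--- Sum the file sizes, which CFDIRECTORY reports in bytes. --->
<CFSET TotalBytes = 0>
<CFLOOP Query="qTemplates">
    <CFIF qTemplates.Type IS "File">
        <CFSET TotalBytes = TotalBytes + qTemplates.Size>
    </CFIF>
</CFLOOP>

<!--- Convert to kilobytes and apply the two-to-five-times rule of thumb. --->
<CFSET TotalKB = Ceiling(TotalBytes / 1024)>
<CFSET LowKB = TotalKB * 2>
<CFSET HighKB = TotalKB * 5>

<CFOUTPUT>
Total template size: #TotalKB# KB<BR>
Suggested "Template Cache Size": #LowKB# to #HighKB# KB
</CFOUTPUT>
```

The result is only a starting point; CP/Sec under realistic load remains the real test of whether the setting is large enough.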
Trusted Cache
Though a template's PCode may be stored in the template
cache, ColdFusion checks the actual file to see if it has been modified after it
was cached. This check may increase I/O wait and can be avoided by turning on
trusted cache, also in the ColdFusion administrator.
With trusted cache enabled, ColdFusion serves templates only from the template
cache, even if the template file itself has been modified. This can be
problematic if developers expect to see changes as soon as files are modified.
To introduce modified templates into the cache without restarting the ColdFusion
server, disable trusted cache and make a request to each modified template.
Trusted cache can then be turned back on.
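One way to make those requests without visiting each page by hand is a small maintenance template that fetches each changed page with CFHTTP. This is only a sketch: the URL list is hypothetical, trusted cache is assumed to be already disabled in the administrator, and the pages are assumed to be safe to request with no form or URL parameters.

```cfml
<!--- Hypothetical list of templates that changed in this release. --->
<CFSET ModifiedTemplates = "http://localhost/myapp/index.cfm,http://localhost/myapp/catalog.cfm">

<!--- Request each one so ColdFusion sees the newer file and recompiles it. --->
<CFLOOP List="#ModifiedTemplates#" Index="TemplateURL">
    <CFHTTP URL="#TemplateURL#" Method="GET">
    <CFOUTPUT>Requested #TemplateURL#<BR></CFOUTPUT>
</CFLOOP>
```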
Database Connection Caching
To avoid the highly expensive task of opening and closing a
connection to the database for every request, ColdFusion caches database
connections by default. This means that the connection to the database is only
opened once for many requests, thereby dramatically improving performance.
If you are connecting to a clustered database configuration, it may be necessary
to disable connection caching to allow failover to function properly. This can
be accomplished by unchecking "Maintain Database Connections" in the
attributes of the data source, but doing so will significantly degrade performance.
To avoid unused connections to the database remaining open for long periods of
time, the "Limit cached database connection inactive time" setting can
be adjusted. It is also possible to manually release all data source connections
from the "Verify Data Source" section of the ColdFusion administrator.
Query Caching
Query caching greatly increases performance as result sets
are retrieved from memory rather than from the database. Developers should
consider caching queries whenever possible.
For example, the following query will be cached for two hours:
<CFQUERY Name="MyQuery" DataSource="dsn" CachedWithin="#CreateTimeSpan(0,2,0,0)#">
SELECT * FROM Inventory WHERE InventoryId = 2
</CFQUERY>
While query caching is controlled by code, the maximum number of
cached queries is set in the ColdFusion administrator. Starting with
ColdFusion 4.5, it became possible to cache more than 100 queries at a
time. The number of queries that can be cached is now limited only by the amount
of memory available on the server. Because the size of result sets, the amount of
available memory, and the use of cached queries all vary between applications, this
setting should be tested under expected load for optimal performance.
Improving scalability is about finding and removing bottlenecks that restrict
the growth of a system. The most common bottlenecks for Web systems include:
1. Insufficient network bandwidth.
2. Insufficient CPU resources.
3. Inability to get data to/from the database.
Solving each problem seems simple:
1. Call the ISP.
2. Add servers.
3. Add more database server(s).
Unfortunately, the cost of adding database server(s), both in terms of money and
administrative overhead, is very high, so it pays to get the most out of your
existing investment in database hardware and software. Enter query caching.
Query caching is designed to accomplish two goals:
1. Decrease the time between a page request and the page view.
2. Reduce the amount of work generated for the database server for each page view.
Implementing query caching is very simple. For example, examine the query below,
which might be used to retrieve a list of states for a <SELECT> list.
Before:
<CFQUERY Name="qStates" DataSource="#Request.DSN#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>
After:
<CFQUERY
Name="qStates" DataSource="#Request.DSN#"
CachedWithin="#CreateTimeSpan(0,1,0,0)#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>
You just empowered ColdFusion to hold on to the results of
that query for up to an hour. ColdFusion will now stop repeatedly asking the
database for the results of this query. In fact, ColdFusion will reuse the
results of that query for up to an hour before asking the database for that
result set again. The database is now relieved of the duty of fetching these
rows and sorting them (which usually involves creating and dropping a temporary
table) for each request to that page.
Before you get too excited, there are a few details to consider:
- There is a limit to the number
of queries you can cache. This number is configured in the ColdFusion
Administrator under "Caching." In ColdFusion 4.01, you cannot set
this number higher than 100. This limitation was removed in ColdFusion 4.5,
but that is not an invitation to set the value to 30,000. Caching too many
result sets will cause memory starvation and heavy virtual memory paging,
negating the benefits.
- If you have a dynamic query,
such as "SELECT * FROM Catalog WHERE CatalogNumber = #val(FORM.CatalogNumber)#",
each permutation of that query counts as one cached query. Therefore, query
caching should only be used for commonly accessed result sets. The CFML
Language Reference defines a distinct result set by stating "…the
current query must use the same SQL statement, data source, query name, user
name, password, and DBTYPE. Additionally, for native drivers it must have
the same DBSERVER and DBNAME (Sybase only)."
- There is no easy way to
invalidate a cached result set once you detect that it is out of date.
Therefore, the time span used for the result set cache should
be chosen carefully. However, if you have a result set that is accessed four
times per second, setting a timeout as low as a minute reduces the load on
the database (for that query) by a factor of 240.
- You cannot use query caching
for parameterized queries (queries using <CFQUERYPARAM>).
Parameterized queries should be used for common non-cached queries, since
they allow the query plan to be reused on some database systems (such as
Oracle), and they are virtually immune to malicious query editing, as
documented in Allaire Security Bulletin ASB99-04 (http://www.allaire.com/handlers/index.cfm?ID=8728&Method=Full).
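To make two of the points above concrete, the sketch below pairs a frequently requested, rarely changing query cached for one minute with a user-driven lookup that is left uncached and parameterized instead. The table, column, and query names are hypothetical. At four requests per second, a one-minute cache means roughly one of every 240 requests (4 x 60) reaches the database.

```cfml
<!--- Hot, rarely changing result set: cache it for one minute. --->
<CFQUERY Name="qFeatured" DataSource="#Request.DSN#"
         CachedWithin="#CreateTimeSpan(0,0,1,0)#">
    SELECT CatalogNumber, Title
    FROM Catalog
    WHERE Featured = 1
    ORDER BY Title
</CFQUERY>

<!--- Per-user lookup: every value would be its own cache entry,
      so leave it uncached and bind the value with CFQUERYPARAM. --->
<CFPARAM Name="FORM.CatalogNumber" Default="0">
<CFQUERY Name="qItem" DataSource="#Request.DSN#">
    SELECT CatalogNumber, Title
    FROM Catalog
    WHERE CatalogNumber = <CFQUERYPARAM Value="#Val(FORM.CatalogNumber)#" CFSQLType="CF_SQL_INTEGER">
</CFQUERY>
```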
Keep in mind that cached queries are also global to the server, so
if you use the same query with the same name in multiple pages, the cached
result set is shared between those pages.
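Because a cached result set is matched on the query name and the SQL statement, among other things, one way to make sure several pages really do share a single cached copy is to keep the query in one file and include it wherever it is needed. The file name below is hypothetical.

```cfml
<!--- qStates.cfm (hypothetical shared file): the single definition of this query. --->
<CFQUERY Name="qStates" DataSource="#Request.DSN#"
         CachedWithin="#CreateTimeSpan(0,1,0,0)#">
    SELECT StateCode
    FROM States
    ORDER BY StateCode
</CFQUERY>
```

Any page that needs the list can then pull it in with <CFINCLUDE Template="qStates.cfm">, which avoids two pages drifting apart in query name or SQL text and quietly creating two separate cache entries.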
The repetitive nature of Web pages often causes database servers to work very hard
at producing the same result sets over and over again. However, by using the
Query Caching capability included with ColdFusion, a significant amount of work
can easily be moved from the database server to the ColdFusion servers. This
allows for a much higher ratio of ColdFusion servers to database servers,
enhances the performance and scalability of your Web system, and maximizes your
investment in your database servers.
When designing a server caching strategy, it is
important to take into account available server memory and the need for fully
dynamic information. If poor application performance is an issue, these settings
may make a world of difference. During implementation, testing should take place
to ensure that the application and server react as expected. Be sure to monitor
server memory and verify that application data is correct.