When a Web application runs too slowly, it is important to identify which
specific actions are taking the most time. In other words, is the entire Web
application slow, or do certain actions take longer than others?
To discover exactly how long HTTP request actions (such as POST and GET) are
taking by URI, use the W3C Extended Log File Format property time-taken. When
you select this property using the extended properties dialog box in IIS,
time-taken appears as a column in the W3C log for the Web application,
indicating how many milliseconds an ASP page took to complete processing.
While this information may seem redundant (you probably already know which
pages are slow), it provides a reliable metric. This data quantifies how quickly
pages are processed and confirms suspected bottlenecks. It also exposes
inconsistencies, such as a "slow" page that runs fast in certain instances
depending on request parameters or the query string. In cases where the
performance problems are indeed isolated to a particular page or set of pages,
you can proceed with analyzing the ASP code for optimization opportunities. The
Code Matters section below lists some steps you can take in that area.
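As a rough sketch of how the time-taken column can be put to work, the Python script below averages it per method and URI. The log lines are invented sample data, and the field list assumes that cs-method, cs-uri-stem, and time-taken were all selected in the extended properties dialog box; the script reads the column order from the #Fields directive rather than assuming fixed positions.

```python
from collections import defaultdict

# Invented sample data in W3C Extended Log File Format (not from a real server).
SAMPLE_LOG = """\
#Version: 1.0
#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status time-taken
2005-03-01 10:00:01 GET /default.asp - 200 120
2005-03-01 10:00:02 GET /report.asp id=7 200 4800
2005-03-01 10:00:03 POST /login.asp - 302 310
2005-03-01 10:00:04 GET /report.asp id=9 200 150
"""

REQUIRED = ("cs-method", "cs-uri-stem", "time-taken")


def summarize_time_taken(lines):
    """Return {(method, uri): (count, average_ms, max_ms)} for W3C log lines."""
    fields = []
    samples = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive names the columns of the entries that follow it.
            fields = line.split()[1:]
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or entries before any #Fields line
        row = dict(zip(fields, line.split()))
        if not all(name in row for name in REQUIRED):
            continue  # a needed property was not logged
        key = (row["cs-method"], row["cs-uri-stem"])
        samples[key].append(int(row["time-taken"]))
    return {
        key: (len(values), sum(values) / len(values), max(values))
        for key, values in samples.items()
    }


summary = summarize_time_taken(SAMPLE_LOG.splitlines())
# List the slowest actions first, by average time-taken.
for (method, uri), (count, avg_ms, max_ms) in sorted(
    summary.items(), key=lambda item: item[1][1], reverse=True
):
    print(f"{method:5} {uri:15} n={count} avg={avg_ms:.0f}ms max={max_ms}ms")
```

Keeping the maximum next to the average is deliberate: a page with a modest average but a large maximum is the kind of inconsistency that usually traces back to particular request parameters.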
Requests/sec: The number of responses completed per second
Requests Executing: The number of requests currently executing
Request Wait Time: The number of milliseconds the most recent request waited in
the queue
Request Execution Time: The number of milliseconds the most recent request took
to execute
Requests Queued: The number of requests queued up waiting to be processed
If you see Requests Queued greater than 0 for extended periods, together with a
low number of Requests Executing, a high Request Wait Time, and a low
Requests/sec value, you should consider changing the worker process
configuration to produce a higher level of concurrent or asynchronous request
processing.
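The rule of thumb above can be written down as a small check over a window of counter samples. This is only a sketch: the AspCounters class is a hypothetical container, and the three thresholds are placeholder values chosen for illustration, since what counts as "low" or "high" depends on the application and the hardware.

```python
from dataclasses import dataclass


@dataclass
class AspCounters:
    """One sample of the counters listed above (hypothetical container)."""
    requests_per_sec: float
    requests_executing: int
    request_wait_time_ms: int
    requests_queued: int


# Placeholder thresholds, assumed for illustration only; tune them per site.
LOW_REQUESTS_PER_SEC = 10.0
LOW_REQUESTS_EXECUTING = 5
HIGH_WAIT_TIME_MS = 1000


def suggests_concurrency_problem(samples):
    """True when every sample in the window shows the queued-but-idle pattern."""
    if not samples:
        return False
    return all(
        s.requests_queued > 0
        and s.requests_executing <= LOW_REQUESTS_EXECUTING
        and s.request_wait_time_ms >= HIGH_WAIT_TIME_MS
        and s.requests_per_sec <= LOW_REQUESTS_PER_SEC
        for s in samples
    )


window = [
    AspCounters(2.0, 1, 3000, 12),
    AspCounters(3.5, 2, 2500, 9),
    AspCounters(1.0, 1, 4100, 15),
]
print(suggests_concurrency_problem(window))
```

Requiring the pattern in every sample of the window is one way to express "for extended periods"; a single queued request during a traffic spike should not trigger a configuration change.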
Configuring IIS 6 Worker Processes
IIS 6 offers vast improvements over IIS 5 in terms of Web application process
configuration options. While IIS 5 offered the low (in process), medium, and
high isolation modes, IIS 6 lets you define the number of worker processes and
set performance options (such as bandwidth throttling and automatic worker
process recycling based on volume, resource consumption, or inactivity). It
also offers a dynamic process spawning feature named Web Gardening, which
causes each new request to be handled by a new process until the number of
processes reaches the configured maximum. Once that maximum is reached,
requests are distributed round-robin across the instances.
It is important to note that Web Gardening does not work on applications with
stateful sessions. The reason is that each process has its own copy of the
application state, so values cached in the session by one process are not
visible to the other instances in the Garden, and a request handled by a
different process can find its session data missing.
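The session problem is easy to reproduce in miniature. The sketch below is a simulation, not IIS code: WorkerProcess is a hypothetical stand-in for a worker process, each instance keeps its own private session dictionary, and requests are handed out round-robin.

```python
from itertools import cycle


class WorkerProcess:
    """Hypothetical stand-in for one worker process in a Web Garden."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # in-process session store, private to this worker

    def handle(self, session_id, key, value=None):
        """Store value under key if given, then return what this worker sees."""
        session = self.sessions.setdefault(session_id, {})
        if value is not None:
            session[key] = value
        return session.get(key)


workers = [WorkerProcess(f"worker-{i}") for i in range(1, 4)]
round_robin = cycle(workers)

# Request 1 stores a value in the session on whichever worker receives it.
first = next(round_robin)
first.handle("session-abc", "user", "alice")

# Request 2 carries the same session ID but lands on the next worker.
second = next(round_robin)
seen = second.handle("session-abc", "user")

print(f"{first.name} stored 'alice'; {second.name} sees {seen!r}")
```

The second worker has never seen the session, so the lookup comes back empty even though the client sent the same session ID.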
I'll focus on an example with which I have personal experience: using IIS 6 to
migrate an Active Server Pages Web application that was originally implemented
in IIS 5. This involves creating a set of application pools. I'll use four for
the example, each with a corresponding Web application, in effect a 1:1 ratio
of dedicated app pool to Web app. The configuration requires a load balancer in
front of the Web applications to round-robin across the URLs of the
application, each of which is distinguished by its TCP port number on the same
IP address or host name. The steps are:
1. Create four application pools, leaving the max number of worker processes
   set to 1 (Gardening off).
2. Create four Web applications, one for each application pool.
3. Use the same virtual directory for all four Web applications.
4. Assign each Web application a unique TCP port.
There are other settings involved, but I'm intentionally leaving them out to
illustrate the approach from a high level, and to keep you awake (unless you're
reading this to put yourself to sleep, in which case you're probably reading
this paragraph from a lucid twilight dream state).
In short, this type of configuration can offer a higher level of concurrent
request processing because you now have four processes concurrently handling
requests for the same Web site. It's a scale-up approach within a single server
that can also be scaled out by repeating the configuration on multiple servers.
Also, unlike Web Gardening, requests maintain affinity with the Web application
first accessed, so a stateful session will work. In fact, the application I
converted from IIS 5 uses the ASP session to store an identity object that is
incrementally loaded by a series of page requests.
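To make the routing behavior concrete, here is a small model of round-robin assignment with affinity. It is a sketch under assumptions: the host name and port numbers are hypothetical, StickyRoundRobin is an illustrative class rather than any load balancer's API, and how a real load balancer identifies a returning client (a cookie or the source address, for example) is left out.

```python
from itertools import cycle

# Hypothetical URLs: one per Web application, distinguished by TCP port.
BACKENDS = [
    "http://web01:8001",
    "http://web01:8002",
    "http://web01:8003",
    "http://web01:8004",
]


class StickyRoundRobin:
    """New clients are assigned round-robin; returning clients keep their backend."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)
        self._assigned = {}

    def route(self, client_id):
        if client_id not in self._assigned:
            self._assigned[client_id] = next(self._next_backend)
        return self._assigned[client_id]


balancer = StickyRoundRobin(BACKENDS)
for client in ["alice", "bob", "alice", "carol", "bob"]:
    print(client, "->", balancer.route(client))
```

Because every request from a given client reaches the same Web application, and each application pool runs a single worker process, the in-process ASP session stays intact for the life of the visit.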
The Code Matters
A myriad of factors can be targets for improvement when you need to optimize a
Web application's code. For now, I'll focus on a common area that often leaves
great opportunity for improvement: data access, specifically from a relational
DBMS.
At a high level, design patterns such as DAO (Data Access Object) and DVO (Data
Value Object), with counterparts in J2EE such as Transfer Object, are all
pretty much designed to make the most efficient use of available memory and
network bandwidth while abstracting the data source. To be more specific, let's
use a common scenario: a page whose script calls an object to request data, and
that object internally uses a data access object to query a database.
As I mentioned earlier, using the W3C extended log parameter time-taken helps
identify slow-running actions. If these actions are indeed associated with a
Data Access Object behind a Business Object Model component, the request
parameters used to set SQL query parameters back in the DAO should be analyzed
and compared with the primary and secondary index configurations of the
database object. If the columns those parameters filter on are not indexed, and
the database uses table scans instead of index scans, performance can degrade
suddenly as the volume of data in the table grows.
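One way to check whether a query parameter is backed by an index is to ask the database for its access plan. The sketch below uses SQLite through Python's standard sqlite3 module purely as a stand-in, because it runs anywhere; the orders table and idx_orders_customer index are hypothetical, and other database products expose their plans through their own tools and syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def access_plan(connection, sql, params):
    """Return SQLite's description of how it would execute the statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the readable detail


before = access_plan(conn, QUERY, (42,))

# Index the column that the request parameter filters on.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = access_plan(conn, QUERY, (42,))

print("without index:", before)
print("with index:   ", after)
```

The first plan reports a scan of the whole table; the second reports a search that uses idx_orders_customer. The exact wording of the plan text varies between SQLite versions.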
A technology-agnostic approach is to use prepared statements rather than
dynamic SQL. This can greatly speed up queries by allowing data access layer
components (ADO / java.sql) to cache SQL access plans. Prepared statements,
when used with connection pooling, are considerably faster than dynamic SQL
because access plans are reused for the life span of the cached prepared
statement.
This means the SQL optimizer does not need to go back and figure out which
index it should use for optimal access every time the SQL executes. The access
path is calculated once, then reused every time the same statement is executed;
statements are cached in connections, and connections are pooled. The
efficiency of the SQL is still important, however; poorly written SQL
statements, even if prepared, can result in table scans if close attention is
not paid to indexing. Also, be sure to use the connection pooling features
available in the platform, and avoid application-centric "home-grown" pooling
mechanisms at all costs.
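The difference between the two styles shows up in the statement text itself. The sketch below again uses Python's sqlite3 module as a stand-in for ADO or java.sql; the products table is hypothetical, and cached_statements is the sqlite3 option that sets how many compiled statements a connection keeps. No timings are shown because the gain depends heavily on the database and the driver.

```python
import sqlite3

# Each connection keeps a cache of compiled statements, keyed by SQL text.
conn = sqlite3.connect(":memory:", cached_statements=64)
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO products (category, name) VALUES (?, ?)",
    [("book", "SQL Basics"), ("book", "ASP in Practice"), ("tool", "Profiler")],
)


def names_dynamic(connection, category):
    """Dynamic SQL: the text changes with every value, so nothing is reused."""
    sql = (
        "SELECT name FROM products WHERE category = '"
        + category
        + "' ORDER BY name"
    )
    return [row[0] for row in connection.execute(sql)]


PREPARED_SQL = "SELECT name FROM products WHERE category = ? ORDER BY name"


def names_prepared(connection, category):
    """Prepared style: constant text plus a bound parameter, so it can be cached."""
    return [row[0] for row in connection.execute(PREPARED_SQL, (category,))]


print(names_dynamic(conn, "book"))
print(names_prepared(conn, "book"))
print(names_prepared(conn, "tool"))
```

Both functions return the same rows. The prepared version sends identical statement text for every category, which is what lets the connection reuse the compiled statement instead of parsing and planning it again.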
In this article, I've discussed only a few methods for assessing and addressing
performance issues. There are plenty more, but this is the way I start when I
look for an application problem.
Copyright © 2005 Ziff Davis Media Inc. All Rights Reserved. Originally
appearing in Dev Source.