Directing Web Traffic

Does your site handle traffic effectively? Will it scale up? Site-analysis and load-balancing tools can help you anticipate and fix problems.

by Rohit Bafna

As the commercial Internet matures, two key issues are quickly converging. First, user traffic is increasing exponentially, and second, Web sites are becoming fundamental components of many businesses' market presence. Companies serious about how customers experience their Web sites cannot afford to ignore the interdependence of these issues. Set aside content and aesthetics for one moment and ask yourself two simple questions:

-- What is the experience I deliver to visitors via my Web site?

-- How will that experience be affected if my Web-site traffic doubles?

Many people spot-check the performance of their Web site locally. They fire up Netscape, enter their home-page URL, and time how long the page takes to appear. The results of simple experiments like this might be encouraging, but they can be deceiving for a number of reasons.

First, many people forget to flush their browser's cache. Caching is a mechanism used by a browser to speed up your Web surfing. The browser simply stores frequently requested HTML pages and images--such as your company's home page and logo--locally on your computer. In fact, if you frequent a site on the Web, what you see in your browser may well have been pulled completely from data cached locally.

Also, most spot checks don't dig deeply enough. It's worth your while to test more than how long it takes to access your home page. Think about how a customer might typically travel around your Web site, then follow the same path yourself. Depending on how your site is set up, the time it takes to view a series of pages will probably be very different from the total time it takes to access the same pages individually. Many Web servers also cache pages and images, and the response time for some pages may be significantly different than for others.

Local spot checks also can be misleading because the path you take to a resource on your own Web site from within your company's intranet is probably a lot shorter, with fewer hops, than the path your customers take, most likely through a busy Internet service provider (ISP). If your visitors reside within a corporate intranet, their requests are probably coming through a proxy server as well, resulting in even more hops.

Proxy servers sit between a corporate intranet and the public Internet. When a user wants to visit your Web site, the request first goes to the proxy server, which makes the request to your Web site. The proxy server collects the data from your Web server and forwards it to the user who requested it. Some ISPs use proxy servers to route Internet requests.

A third factor that can distort spot checks is the speed of your own network connection. If your corporate intranet is connected to a T1 line with a capacity of 1.544 Mbps, you can retrieve data from your Web server much faster than an end user using a 28.8-Kbps modem over a noisy phone line. Multiplying the time it takes you to retrieve your home page on a T1 line by the ratio of the two connection speeds gives you only a rough estimate of the difference in performance. And if your Web site attracts a diverse group of users, you'll have to provide acceptable performance for users with 28.8-Kbps, 14.4-Kbps, and 9.6-Kbps modems.
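To make the arithmetic concrete, here is a minimal Python sketch of the best-case transfer time for a single page at each of the connection speeds mentioned above. The page size (PAGE_BYTES) is a hypothetical figure chosen for illustration, the T1 rate is taken as 1.544 Mbps, and the calculation ignores latency, protocol overhead, and line noise, so real-world times will be longer.

```python
# Best-case transfer time for one page at several connection speeds.
# PAGE_BYTES is a hypothetical page weight, not a measured value.
PAGE_BYTES = 60_000

LINK_SPEEDS_BPS = {
    "T1": 1_544_000,
    "28.8-Kbps modem": 28_800,
    "14.4-Kbps modem": 14_400,
    "9.6-Kbps modem": 9_600,
}


def transfer_seconds(size_bytes, bits_per_second):
    """Ideal transfer time; ignores latency, overhead, and line noise."""
    return size_bytes * 8 / bits_per_second


def estimate_all(size_bytes=PAGE_BYTES):
    """Map each connection type to its ideal transfer time in seconds."""
    return {
        name: transfer_seconds(size_bytes, bps)
        for name, bps in LINK_SPEEDS_BPS.items()
    }


if __name__ == "__main__":
    for name, seconds in estimate_all().items():
        print(f"{name:>16}: {seconds:6.1f} s")
```

Under these assumptions, a page that arrives in a fraction of a second over a T1 line takes the better part of a minute over the slowest modem.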

The good news is that some Web-server vendors have the foresight to record your visitors' timing information for you. The bad news is that it's usually hard to get at the information. If your Web server records the information in its logs, you will see for each entry in the log the time it took to transfer the data for that request. Keep in mind that timing data collected by your Web server is raw information. To make sense of the data--especially on a busy site--you'll probably want to use a log-analysis tool that breaks apart the information in a more meaningful way (see "Stalking the Elusive Usage Data," Web Watch, April 1996 IW).
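As a rough illustration of what such a breakdown involves, the Python sketch below averages transfer times per page. It assumes a simplified, hypothetical log layout in which each line holds a requested path followed by the transfer time in seconds. Real server logs differ from vendor to vendor, so the parsing step would need to be adapted.

```python
from collections import defaultdict


def mean_transfer_times(log_lines):
    """Return {path: mean seconds} for lines of the assumed form '<path> <seconds>'."""
    totals = defaultdict(lambda: [0.0, 0])  # path -> [sum of seconds, request count]
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        path, seconds = parts
        try:
            value = float(seconds)
        except ValueError:
            continue  # skip lines whose timing field is not a number
        totals[path][0] += value
        totals[path][1] += 1
    return {path: total / count for path, (total, count) in totals.items()}


if __name__ == "__main__":
    sample = ["/index.html 2.0", "/index.html 4.0", "/search 10.0"]
    for path, mean in mean_transfer_times(sample).items():
        print(f"{path}: {mean:.1f} s average")
```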

If you really want to find out how your end user experiences a visit to your site, call a friend on the other side of the country and have him or her call up your site at different times of the day and time how long it takes to retrieve your home page and a couple of pages after that. You may be surprised at the numbers.

Gauging Changes
 
Hopefully, your friend confirms that your Web site is performing at acceptable speeds. But things on the Net have an odd way of not scaling linearly. Will performance continue to be acceptable when twice as many people visit your site? Fortunately, there are tools on the market to help you analyze your setup, simulate possible increases in usage, pinpoint scaling bottlenecks in your system, and even help you build a site with the ability to scale up to meet ever-increasing traffic.

The first thing you should study is how your site currently is being used. The more you know about your site's usage, the better equipped you will be to fine-tune it and prepare for upgrades. The log-analysis tools on the market today, such as E.G. Software's AuditTrack, Interse's Market Focus 2, I/Pro's NetLine, and net.Genesis's net.Analysis Pro 2.0, can help you understand usage patterns and tell you what areas of your Web site attract the most traffic. Moreover, these tools can tell you how different parts of your site are being used proportionately. This is important because the way your Web server reacts to twice as many people searching your product database may be quite different from the way it reacts to twice as many people requesting the home page.

These tools also give you the ability to look at trends in usage and performance at a site. Analysis of usage trends for the past three or six months can tell you if your traffic is doubling every quarter, month, or week, and how usage patterns are changing over time.
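The doubling question can be answered with a little arithmetic once you have a series of traffic counts. The Python sketch below assumes steady exponential growth between the first and last counts in the series, which is a simplification. The function name and the sample figures are illustrative only.

```python
import math


def doubling_time(counts):
    """Periods needed for traffic to double, assuming steady exponential
    growth between the first and last counts. Returns None if traffic
    is flat or falling, or if there are too few data points."""
    if len(counts) < 2 or counts[0] <= 0 or counts[-1] <= counts[0]:
        return None
    periods = len(counts) - 1
    growth_per_period = (counts[-1] / counts[0]) ** (1 / periods)
    return math.log(2) / math.log(growth_per_period)


if __name__ == "__main__":
    monthly_hits = [100_000, 130_000, 170_000, 220_000]  # hypothetical data
    print(f"Traffic doubles about every {doubling_time(monthly_hits):.1f} months")
```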

Simulated Action
 
Once armed with a thorough understanding of the current picture, you can use one of a few load-testing tools to simulate different scenarios of change and growth. Most of these tools provide a relatively good degree of flexibility in creating realistic scripts of how users move around your site. Once you've recorded a few common scenarios, the tool can simulate any number of simultaneous users performing the same actions.

The more sophisticated simulation tools let you watch in real time how your server's performance degrades as more simulated users visit your site. By intentionally overloading your Web site and carefully watching the real-time meters, you can identify the number of users at which your site's performance falls below reasonable levels. Use this information, coupled with trend information from an analysis tool, to estimate when it will be time to upgrade.
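The estimate described above can be sketched in a few lines of Python. The sketch assumes you have a list of (simulated users, response time) pairs from a load test and a per-period growth factor from your trend analysis. The function names, the sample measurements, and the 3-second limit are all hypothetical.

```python
import math


def max_acceptable_users(measurements, limit_seconds):
    """Largest tested user count before response time first exceeds the limit."""
    best = None
    for users, seconds in sorted(measurements):
        if seconds > limit_seconds:
            break
        best = users
    return best


def periods_until_capacity(current_users, capacity_users, growth_per_period):
    """Periods until current traffic grows to capacity, assuming exponential growth."""
    if growth_per_period <= 1 or current_users <= 0:
        return None  # no growth, or no meaningful starting point
    if current_users >= capacity_users:
        return 0.0  # already at or past capacity
    return math.log(capacity_users / current_users) / math.log(growth_per_period)


if __name__ == "__main__":
    # Hypothetical load-test results: (simultaneous users, seconds per page)
    results = [(10, 0.5), (50, 1.2), (100, 3.5), (200, 9.0)]
    capacity = max_acceptable_users(results, limit_seconds=3.0)
    print("Capacity:", capacity, "users")
    print("Periods until upgrade:", periods_until_capacity(25, capacity, 1.3))
```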

Load-testing software is an invaluable tool when trying to determine exactly where your Web server has bottlenecks. Server bottlenecks can occur in different areas: server software, memory, CPU, disk speed, and network bandwidth. If memory appears to be the problem, your performance problems could be averted by installing more of it. If your network bandwidth is the limiting factor, you may have to upgrade your network connection.

One load-testing simulation tool is SQA LoadTest, which checks all components of a browser-based application, including Java applets, JavaScript, and plug-ins that run on the client and server. Another tool, WebLoad from RadView Software Ltd., assesses Web-application performance under user-defined variable system loads by running tests that can be generated remotely from multiple client workstations.

Load-Balancing Tools
 
At some point, traffic to a Web server may get too high for one computer to handle effectively. The obvious solution to handling a heavy traffic load is to put another computer to work. A simple solution, called a mirrored site, takes all the Web-site resources on the existing server, mirrors them onto another computer, then splits the traffic. This is, in fact, how a lot of the larger sites currently handle millions of hits per day. There are a number of ways to split the traffic, but two common methods are round-robin DNS (Domain Name System) and dynamic IP (Internet Protocol) redirection.

IP addresses (for example, 206.45.67.100) are based on a hierarchy that helps one computer find another computer on the network quickly. In round-robin DNS systems, the Web-server name is associated with a list of IP addresses. Each IP address on the list maps to a different computer, and each computer contains a mirrored version of the Web site. Whenever a request is received, the Web-server name is translated to the next IP address on the list. By translating Web-server names to IP addresses in this round-robin fashion, requests can be load-balanced to multiple computers.
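The rotation itself is simple, as the following Python sketch shows. It is a toy model of the idea rather than a real name server: the class name, the host name, and the addresses (taken from a range reserved for documentation) are all assumptions, and in practice caching of name lookups along the way makes the split less even than this suggests.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy resolver that hands out a name's addresses in rotation."""

    def __init__(self, records):
        # records maps a server name to its list of mirror addresses
        self._cycles = {name: cycle(addresses) for name, addresses in records.items()}

    def resolve(self, name):
        """Return the next address on the list for this name."""
        return next(self._cycles[name])


if __name__ == "__main__":
    resolver = RoundRobinResolver(
        {"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
    )
    for _ in range(4):
        print(resolver.resolve("www.example.com"))
```

After the last address on the list has been handed out, the resolver starts again at the first.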

In the second common method, dynamic IP redirection, the main Web server takes all requests for the Web site's home page, but the visitor's browser is redirected to another URL to satisfy the request. The magic here is that the redirected URL could be on the same computer as the main server or on any one of several back-end mirror computers. The main server redirects the traffic to back-end Web servers based on their current loads.
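The decision the main server makes can be sketched as follows in Python. This is a simplified illustration, not any vendor's implementation: it assumes each back-end server reports a single load figure, picks the lowest, and answers with an HTTP 302 redirect. The host names and load values are hypothetical.

```python
def choose_backend(loads):
    """Return the back-end host with the lowest reported load."""
    if not loads:
        raise ValueError("no back-end servers available")
    return min(loads, key=loads.get)


def redirect_response(path, loads):
    """Build an HTTP redirect (status code, headers) to the least-loaded mirror."""
    host = choose_backend(loads)
    return 302, {"Location": f"http://{host}{path}"}


if __name__ == "__main__":
    # Hypothetical load figures reported by each mirror (lower is better)
    current_loads = {"www1.example.com": 0.8, "www2.example.com": 0.3}
    print(redirect_response("/index.html", current_loads))
```

How load is measured (CPU use, open connections, response time) is a design choice; the sketch only assumes that lower numbers mean a less busy server.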

Many original megasites had to create their own load-balancing software. But with load balancing becoming an issue for more Web sites, out-of-the-box solutions have come to market. A tool from HydraWEB Technologies provides fault-tolerant load-balancing across multiple servers, intelligently routing large volumes of HTTP requests to the most-available server to optimize performance. Cisco Systems also offers Internet scaling solutions.

The mind-boggling increase in Web traffic bodes well for Internet commerce. To keep pace with such explosive growth, serious Web site owners must understand current usage, traffic, and performance trends to ensure that their customers' Web experience remains as satisfying as their experience of any other facet of their business.

 
 