AN ASSESSMENT USING WWW SERVER SOFTWARE
SUKANYA SURANAUWARAT
HIDEO TANIGUCHI
Graduate School of Information Science and Electrical Engineering,
Kyushu University, Fukuoka 812-8581, Japan
ABSTRACT
Improving an operating system’s support for softwaremaintenance is, we believe, vital to our goal of reducingthe significant sum spent on adapting existing software tochanging user requirements, especially improving theperformance of software. Therefore, we proposed theidea that by increasing an operating system’s abilities toobserve a software’s execution behavior and evolve itsexecution behavior using observed results, an operatingsystem could adapt existing software to changing userrequirements without making any modifications to thesoftware. We have already integrated the above abilitiesinto CPU and disk scheduling mechanisms in an operatingsystem. In this paper, we verify the usefulness of our ideausing existing software like a WWW (World Wide Web)server, by examining its performance when using both ourCPU and disk scheduling mechanisms.
KEY WORDS: Software Maintenance, WWW Server,
Operating System, Behavior, Predict
1. INTRODUCTION
The magnitude of software maintenance cost is estimatedto comprise at least 50% of overall software life-cyclecost [1][2][3]. And a large portion of this cost, over 50%,is spent on changes to accommodate changing userrequirements. Changes in user requirements areinevitable. Software models part of reality, and realitychanges. So the software has to change too. Keep this inmind, our goal is to reduce the significant sum spent onchanging user requirements, especially performanceenhancement of software. This goal could be achieved byusing the following idea.
Our idea is that by increasing an operating system’sabilities to observe a software’s execution behavior andevolve its execution behavior using observed results, anoperating system could adapt existing software tochanging user requirements without making anymodifications to the software. In other words, by usingthese abilities, an operating system could optimize asoftware’s execution behavior allowing user requirements
to be satisfied without any modifications to the existingsoftware.
We have already integrated the above abilities intoresource scheduling mechanisms in an operating systemsuch as CPU and disk scheduling mechanisms. We havealso verified the usefulness of our idea in some casesusing existing software, i.e., a WWW (World Wide Web)server. That is, we evaluated the performance of a WWWserver when using either scheduling mechanism by itself.In either case, our mechanism alters the executionbehavior of a WWW server by giving preferential use ofthe resource (i.e., the CPU resource or the disk drive) toany process that is predicted to be a server processhandling an HTML (HyperText Markup Language) filerequest. This allows users to view the first data to displayon browsers, the text data stored in an HTML file, and thegeneral layout of a WWW page in a timely manner duringperiods of high demand. As a result, the user requirement,the enhancement of response time when there is a heavydemand on the server, can be satisfied without makingany changes to the existing server software.
In this paper, we present experimental evaluation of ouridea by examining the performance of a WWW serverwhen using both our CPU and disk schedulingmechanisms. And our result shows that the meanresponse time is improved as much as 30%; thispercentage of improvement is better than when usingeither mechanism by itself (both of which are about 20%).The remainder of this paper is organized as follows.Section 2 provides background information on the WWW.Section 3 is a brief overview of our CPU and diskscheduling mechanisms. Section 4 describes ourexperiment and explains the results we obtained. Section6 offers our conclusion and future work.
2. THE WWW
The WWW is based on the client-server model. That is,users access WWW pages provided by WWW servers viaWWW clients, mainly browsers. WWW pages arewritten in HTML and stored on a disk as text files calledHTML files. An HTML file contains text data that users
will view and HTML tags that specify structure for thetext data as well as formatting hints. Because an HTMLfile uses a text representation, non-text data such asimages are not included directly in the HTML file.Instead, a tag is placed in the HTML file to specify theplace at which an image should be inserted and the sourceof the file that stores it (Image file). HTML files andImage files account for more than 90% of the totalrequests to a server [4][5]. Therefore, the WWW page inthis paper consists of an HTML file and an Image file.WWW clients. A user can access the information on theWWW by using a browser, such as Netscape Navigator,Internet Explorer, or Mosaic. When the user selects aWWW page to retrieve (usually, by clicking a mouse on ahyperlink), the browser creates a request to be sent to thecorresponding WWW server and then waits for a response.When the response arrives, the browser interprets andprocesses the HTML file sent back by the server, and thendisplays the text data for the user to view. During theinterpretation, if the WWW page also contains other typesof data such as an image, then the browser will request theImage file from the WWW server.
WWW servers. The purpose of a WWW server is toprovide WWW pages to WWW clients that request them.The server software we used is Apache version 1.2.5 [6],the pre-forking model server. In this model, a masterprocess pre-forks a pool of child server processes tohandle requests. However, the master process does nothandle any part of the request. In this paper, we refer toeach child server process as a server process.
3. OVERVIEW
In this section, we will briefly describe our CPU and diskscheduling policies and their aspects of implementation[7][8] in order to provide sufficient understanding to therest of the paper.
3.1 SCHEDULING POLICIES
As the demand placed on a WWW server grows, thenumber of simultaneous requests it must handle increases.As a result, users see slower response times duringperiods of high demand. In other words, it takes a longertime for the first data to display on browsers, the text datastored in an HTML file, to start displaying when a WWWserver is accessed by many browsers simultaneously.This situation could be one in which it is most desirablefor users to improve the response time of a WWW server,since they tend to get frustrated if it takes a long time tostart viewing a WWW page. Hence, in such a situation,our scheduling goal is to display the text data for the userto view as soon as possible while other types of data suchas an image are coming in, and also to allow the user tostop loading if the page is not sufficiently interesting to
warrant waiting. A method to achieve this goal isdescribed below.
Most processes, except the currently executing process(i.e., process in the run state), are in one of two queues: aready queue or a sleep queue. Processes that are waitingfor the CPU to become available (i.e., processes in theready state) are placed on a ready queue, whereasprocesses that are blocked awaiting an event (i.e.,processes in the wait state) are located on a sleep queueassociated with the event. When a process is blockedawaiting an event to happen such as the completion of itsI/O request, if the desired I/O device (e.g., a disk drive) isavailable, the request can be serviced immediately. If thatdevice is being used by any other process, then therequest will be put into the I/O queue for that device. Byreducing the time a process waits for the CPU to becomeavailable in the ready queue or the time its I/O requestswait to be serviced in the I/O queue, we can reduce theprocessing time of a process, which results in an enhancedresponse time if that process is a server process handlingan HTML file request. For this reason, we proposed thefollowing scheduling policies.
(1) When a CPU becomes bottlenecked, any server
process handling an HTML file request will bemoved to the head of the ready queue.(2) When an I/O device becomes bottlenecked, any
I/O request generated by any server processhandling an HTML file request will be moved tothe head of the I/O queue for that device.Note that the bottleneck mentioned in this paper is thesituation in which the resources are being used and thereis more than one process waiting to use the resources.
3.2 ASPECTS OF IMPLEMENTATION
When implementing the proposed scheduling policiesfocused on the bottleneck of a CPU and a disk drive, wehad two problems: how to detect which processes areserver processes handling HTML file requests, and how tooperate the ready queue and the I/O queue.
To deal with the first problem, we first need to knowwhat a server process handling an HTML file request islike. So, we logged the execution behavior of a WWWserver in terms of process identifier, process state andtime. And based on this log, we created the predictedexecution behavior called PFS (Program Flow Sequence)for each server process. PFS is a sequence of entriesdescribing process state and time spent. We analyzed thebehavior of server processes based on PFS and found thata server process handling an HTML file request is aprocess that waits for a request from a browser, and thetime it waits for a request is relatively long compared with
the time waiting for other kinds of events to happen (e.g.,the completion of an I/O request) in the wait state. Also,after accepting a request, it tends to change between runstate and wait state a number of times. The number ofchanges is proportional to the size of the file it handles,i.e., an HTML file (which is usually smaller than anImage file). In other words, any server process handlingan HTML file request has two characteristics: afterwaiting for a long time in the wait state (characteristic 1),it tends to cycle between run state and wait state a numberof times but fewer times than that of a server processhandling an Image file request (characteristic 2).
Next, we introduced two parameters into our schedulingmechanisms in order to determine which processes havethe above two characteristics (i.e., to determine whichprocesses are server processes handling HTML filerequests): long wait threshold (its value is denoted bySLP) and run state/wait state threshold (its value isdenoted by RW). If the time spent by a process in thewait state before moving to the run state is more than SLP,and the number of times the process changes between runstate and wait state is less than RW, then we determinethat it is a server process handling an HTML file request.By these two parameters, we can detect which processappears to be a server process handling an HTML filerequest. SLP and RW are automatically predicted andupdated every time period based on the predictedexecution behavior of each server process, i.e., PFS ofeach server process. We note that we cannot fix SLP andRW at some values due to random accesses from users (inthe case of SLP), and the changing execution behavior ofthe server and the size of the HTML files it handles (inthe case of RW). Also, PFS is created and updated everytime period based on the log we collected when theWWW server is running. As mentioned, a log is asequence of entries describing process identifier, processstate and time.
To deal with the second problem, our CPU and diskscheduling mechanisms give preferential use of the CPUresource and the disk drive to any process that is predictedbased on its behavior to be a server process handling anHTML file request, by moving it and its I/O requests tothe head of the waiting queue (i.e., the ready queue orthe I/O queue). In other words, our schedulingmechanisms put any process that has characteristic 1 andalso its I/O requests at the head of the waiting queue; andwhen that process loses characteristic 2, it and its I/Orequests will be scheduled normally, i.e., that process andits I/O requests will be respectively put into the readyqueue and the I/O queue using the routines provided bythe operating system, which in our case are theroundrobin and the disksort routines. Roundrobin entersprocesses into the ready queue on a round-robin basiswhile disksort enters I/O requests into the I/O queue in acyclic, ascending, cylinder order as described below.
An I/O queue is made up of one or two lists of requestsordered by cylinder number. The request at the front ofthe first list indicates the current position of the drive. If asecond list is present, it is made up of requests that liebefore the current position. Each new request is sortedinto either the first or the second list, according to therequest’s location. When the heads reach the end of thefirst list, the drive begins servicing the other list. Figure1(a) shows an example of an I/O queue with requests forI/O to blocks on cylinders 75, 30 and 120 when therequest at cylinder 100 is being serviced. For comparison,Figures 1(b) and (c) show respectively how disksort andour disk scheduling mechanism enter the requests from aserver process handling an HTML file request oncylinders 140 and 80, in addition to the requestsmentioned in Figure 1(a). Note that the request at thehead of the queue is the one that is being serviced. So, wedecided to enter the request of any server processhandling an HTML file request right after the one at thehead. If there is more than one request generated byserver processes handling HTML file requests, then theywill be put into the queue on a first-come-first-servedbasis. In the same way, if there is more than one serverprocess handling an HTML file request that is ready torun, then they will be put at the head of the ready queueon a first-come-first-served basis.
disksort(a)I/O queue1001203075disksort(b)I/O queue100120140307580our mechanism(c)I/O queue100140801203075Figure 1. Examples of how disksort and our diskscheduling mechanism enter I/O requests into thequeue.
4. EXPERIMENTAL EVALUATION
In this section, we present an experiment designed toexamine the performance of a WWW server when usingboth our CPU and disk scheduling mechanisms. We startwith a description of the experimental setup, and proceedto present the results of the experiment.
4.1 EXPERIMENTAL SETUP
Our CPU and disk scheduling mechanisms areimplemented in BSD/OS version 2.1. The software usedfor the WWW server and the browser in our experimentwas Apache version 1.2.5 and Netscape Navigator version
3.04 respectively. The WWW server ran on a personalcomputer with a 233 MHz AMD-K6 processor and MB of memory, while browsers ran on three personalcomputers, each with a 200MHz Intel Pentium Proprocessor and MB of memory. All machines wererunning on BSD/OS version 2.1 and were connected by aprivate 10 Mb/s Ethernet. Also, our experiment wasconducted in single user mode, and the operating system’sI/O buffer cache in the server machine and each browser’scache were disabled during the experiment in order toclearly see the effect of our scheduling mechanisms.In our experiment, the WWW server was accessed bythree browsers from each of the three machines randomlyat the same time; this number can cause bottleneck of theCPU and the disk drive, which is indicated by the non-zero length of the ready queue and the I/O queue, duringthe short period of simultaneous accesses [7][8]. Also, allbrowsers accessed 18 unique URLs (URL — UniformResource Locator) all of which have the same content, anHTML file (1,772 bytes) and an Image file (43,770 bytes).For each URL, we measured the 5 trial times of time1 andtime2. Time1 is the time from requesting a WWW pageuntil text data starts displaying. Time2 is the time fromrequesting a WWW page until image data displayscompletely. We will refer to the averages of 5 trial timesof time1 and time2 as response time of text data andresponse time of image data respectively. During theexperiment, SLP and RW were automatically predictedand updated based on PFS every 500 milliseconds. Andour previous works [7][9] have already showed that SLPand RW are effectively predicted and updated by ourscheduling mechanisms.
4.2 EXPERIMENTAL RESULTS
Figure 2(a) illustrates the maximum, the minimum, themean, and the median response times of text data from 18URLs. Note that the median values are calculated directlyfrom time1, since it can be skewed if we calculate it fromthe response times of text data, each of which is theaverage of the 5 trial times of time1. For comparison, wealso show the results when using the original schedulingmechanisms (i.e., roundrobin and disksort). Also, theresults for image data shown in Figure 2(b) are plotted inthe same way.
Figure 2(a) shows that the maximum, the minimum, themean, and the median response times of text data areimproved 20%, 92%, 30% and 26% respectively. These
figures are calculated by a−b
a×100% where a and b arethe maximum (or the minimum or the mean or themedian) response times of text data when using theoriginal scheduling mechanisms and when using oursrespectively. According to the result, our schedulingmechanisms produce a good improvement for responsetime of text data. In other words, by using our scheduling
mechanisms the time from requesting a WWW page untiltext data starts displaying is reduced. Besides, the meanresponse time when using both of our schedulingmechanisms simultaneously is improved (30%) more thanwhen using either mechanism by itself (both of which areabout 20%). However, the percentage of improvementwhen using both our CPU and disk schedulingmechanisms at the same time is not the sum of thepercentage when using either mechanism by itself. Thiscould be because of the correlation between our twoscheduling mechanisms. That is, the operation of movinga process or an I/O request on one queue can affect theoperation on the other queue. For example, if a processwhich is moved to the head of the ready queue by ourCPU scheduling mechanism generates an I/O request,then the likelihood that its I/O request will beconsequently put into the front of the I/O queue is high.As a result, the effect of moving I/O requests to the headof the I/O queue by our disk scheduling mechanism willbe diminished.
6original scheduling mechanisms)ce54.78our scheduling mechanismss( em43.85it es3n2.54ops21.79eR1.4111.040.5200.04maxminmeanmedian(a) Response time of text data12original scheduling mechanisms)ce10our scheduling mechanismss( e88.128.487.887.84mit6.447.03 es6nops4eR2.733.3220maxminmeanmedian(b) Response time of image dataFigure 2. The experimental results when theWWW server is accessed by multiple browsersrandomly at the same time while SLP and RW arepredicted/updated automatically.
However, the price to be paid for reducing the time fromrequesting a WWW page until text data starts displayingis that we reduce the fairness of the system, whichconsequently affects the amount of time it takes fromrequesting a WWW page until image data displays
completely. If this effect is small, it is acceptable. Forexample, it is acceptable if the image data displayscompletely while the users are reading the text data thatdisplays faster. On the other hand, if the effect on theresponse time of image data is big, then users might getfrustrated and not wait for the whole page to displaycompletely. Therefore, care must be taken to ensure thatthe resulting unfairness does not outweigh theperformance gains obtained. And our result in Figure2(b) shows that the worst or the maximum response timeof image data when using our mechanisms is only 4%slower than when using the original schedulingmechanisms. In addition, the minimum, the mean and themedian response times of image data when using ourmechanisms are not so different from when not using ours.According to the result, response time of image data paysa small penalty under our scheduling mechanisms.Therefore, any WWW server that experiences a lot ofsimultaneous accesses from users would benefit from ourresource scheduling mechanisms.
5. RELATED WORK
The work described in this paper relates mainly to thearea of CPU and disk scheduling in operating systems.
5.1 CPU SCHEDULING
Traditional operating systems control the sharing of theCPU resources among processes using a fixed schedulingpolicy based on the utilization of a computer system suchas a real-time or a time-sharing system. Real-timesystems’ scheduling policies are usually only available inreal-time operating systems, and not in general purposeoperating systems in which time-sharing systems’scheduling policies are used. However, the advent ofmultimedia applications on PCs and workstations hascalled for new scheduling paradigms to support real-timein systems with conventional time-sharing schedulers.One simple approach to do this, which has been adoptedby many general-purpose operating systems such asSolaris, Linux and Windows NT, is to provide fixedpriorities that are higher in priority than regular prioritiesto real-time applications. Another approach is to schedulebased on proportion and/or period [10][11][12]. Anotherapproach is based on hierarchical scheduling with severalscheduling classes and with each application beingassigned to one of these classes for the entire duration ofits execution [13][14][15].
However, none of the above approaches is trying toschedule based on behavior of a process. As aconsequence, in some cases, this can hinder an effectiveuse of the CPU resource or can extend the processing timeof a process unnecessarily. For example, in UNIX basedoperating systems, several processes of the same priority
may be ready to run if they could use the CPU if it wereavailable. Since only one process can be running at atime, the rest will have to wait in the ready queue until theCPU is free and rescheduled on a round-robin basis. Inthe case of WWW servers, when the number of serverprocesses needed to handle the simultaneous requestsincreases, if a server process handling an HTML filerequest is put at the end of the queue, then it takes alonger time for the text data to show up on browsers. Asa result, users experience slower response times.
5.2 DISK SCHEDULING
The simplest form of disk scheduling is First Come FirstServed (FCFS), that schedules requests in the order oftheir arrival. Since the access schedule thus derived isindependent of the relative positions of the requested dataon disk, FCFS scheduling can incur significant seek timeand rotational latency. Therefore, many schedulingpolicies concentrated on minimizing seek time such asShortest Seek Time first (SSTF), SCAN, LOOK, andV(R), and those concentrated on minimizing rotationallatency such as Shortest Latency Time First (SLTF) havebeen proposed in order to achieve higher performance[16][17][18][19]. In other words, these policies attemptto service I/O requests with the minimum mechanicalmotion. However, they are less concerned about eachrequest individually, which is what our policy does. As aconsequence, the problem similar to that when usingtraditional CPU scheduling policies occurs when usingthese disk scheduling policies. That is, when a WWWserver is accessed by a lot of users simultaneously, thelikelihood that an I/O request generated by any serverprocess handling an HTML file request will be put at theend of the queue is high, which results in userexperiencing a slower response.
In addition to CPU and disk scheduling, one function inWindows 98, which users can get “faster program startup” as performance enhancement [20], uses an ideasimilar to ours. That is the function improves theperformance of a user’s programs based on the previoususage without making any changes to the programs. Inother words, the function creates a log file to determinewhich programs a user runs most frequently. All suchfrequently used files are then placed in a single locationon the user’s hard disk, which further reduces the timeneeded to start those programs [21]. However, thefunction does not alter the execution behavior ofprograms based on the previous usage which is what ouridea does. So, by using the function in Windows 98, theoperating system can control a user’s programs moreefficiently until a user’s programs start up (i.e., theoperating system can locate and load a user’s programsfaster), but it cannot execute or run a user’s programsmore efficiently.
6. CONCLUSION
This paper examined the performance of a WWW serverwhen using the proposed CPU and disk schedulingmechanisms. Our scheduling mechanisms controls theallocation of a CPU and a disk drive based on thebehavior of WWW server processes rather than based ona fixed policy used in traditional operating systems, inwhich the utilization of a computer system such as a real-time or a time-sharing system is a major concern. Andour experimental result when the WWW server isaccessed randomly by multiple requests at the same timeshows that by using our disk scheduling mechanism theresponse time, the time from requesting a WWW pageuntil text data starts displaying, can be reduced. To bemore specific, the mean and the median response time areimproved 19% and 42% respectively. Moreover, theeffect of unfairness due to our policy of giving favorabletreatment to server processes handling HTML filerequests on the response times of other types of data,which in our case is image data, are relatively small.Therefore, any WWW server that experiences a lot ofsimultaneous requests from users would benefit from ourscheduling mechanisms.
Future work will measure the performance of a WWWserver when the operating system’s I/O buffer cache inthe server machine and each browser’s cache are enabled.Also, we will evaluate the usefulness of our idea usingother existing software applications.
ACKNOWLEDGEMENT
We would like to thank James Michael Perry for hisassistance in proofreading this paper.
REFERENCES
[1] N. Ford and M. Woodroffe, Introducing softwareengineering (Prentice-Hall, 1994).
[2] R. Pressman, Software engineering: a practitioner’sapproach (McGraw-Hill, 1992).
[3] I. Sommerville, Software engineering (Addison-Wesley, 1992).
[4] M. Arlitt and C. Williamson, Internet Web servers:workload characterization and performance implications,IEEE/ACM Trans. Networking, 5(5), 1997, 631-5.[5] C. Cunha, A. Bestavros, and M. Crovella,Characteristics of WWW client-based traces, TechnicalReport BU-CS-95-010, Computer Science Department,Boston University, 1995.[6] http://www.apache.org/
[7] S. Suranauwarat and H. Taniguchi, Evaluation of aprocess scheduling policy for a WWW server based onits contents, IEICE Trans. Inf. & Syst., E83-D(9), 2000,1752-1761.
[8] S. Suranauwarat and H. Taniguchi, A diskscheduling mechanism for a WWW server, Proc. 2001International Conference on Advances in Infrastructurefor Electronic Business, Science, and Education on theInternet, 2001. (to appear)
[9] S. Suranauwarat and H. Taniguchi, Operatingsystems support for the evolution of software: anevaluation using WWW server software, Proc. 2000International Symposium on Principles of SoftwareEvolution, 2000, 292-301.
[10] C. Waldspurger and W. Weihl, Lottery scheduling:flexible scheduling proportional-share resourcemanagement, Proc. 1st USENIX Symposium onOperating Systems Design and Implementation, 1994, 1-11.
[11] C. Waldspurger and W. Weihl, Stride scheduling:deterministic proportional-share resource management,Technical Report MIT/LCS/TM-528, MIT laboratory forcomputers science, 1995.
[12] M. Jones, D. Rosu, and M. Rosu, CPU reservationsand time constraints: efficient, predictable scheduling ofindependent activities, Proc. 16th ACM Symposium onOperating Systems Principles, 1997, 198-211.
[13] D. Golub, Operating system support for coexistenceof real-time and conventional scheduling, TechnicalReport CMU-CS-94-212, School of Computer Science,Carnegie Mellon University, 1994.
[14] B. Ford and S. Susarla, CPU inheritance scheduling,Proc. 2nd USENIX Symposium on Operating SystemsDesign and Implementation, 1996, 91-106.
[15] P. Goyal, X. Guo, and H. Vin, A hierarchical CPUscheduler for multimedia operating systems, Proc. 2ndUSENIX Symposium on Operating Systems Design andImplementation, 1996, 107-121.
[16] H. Deitel, An introduction to operating systems(Addison-Wesley, 1990).
[17] A. Silberschatz and P. Galvin, Operating systemconcepts (John Wiley & Sons, 1997).
[18] A. Worthington, Scheduling algorithms for moderndisk drives, Proc. 1994 Conference. on Measurement andModeling of Computer Systems, 1994, 241-251.
[19] R. Geist, A continuum of disk scheduling algorithms,ACM Trans. Comput. Syst., 5(1), 1987, 77-92.
[20] http://www.microsoft.com/Windows98/guide/Win98/Features/Faster.asp
[21] http://www.microsoft.com/Windows98/usingwindows/maintaining/articles/811Nov/MNTfoundation2a.asp
因篇幅问题不能全部显示,请点此查看更多更全内容
Copyright © 2019- 517ttc.cn 版权所有 赣ICP备2024042791号-8
违法及侵权请联系:TEL:199 18 7713 E-MAIL:2724546146@qq.com
本站由北京市万商天勤律师事务所王兴未律师提供法律服务