A Tomcat worker is a Tomcat instance that is waiting to execute servlets on behalf of some web server. For example, we can have a web server such as Apache forwarding servlet requests to a Tomcat process (the worker) running behind it. This scenario is a very simple one; in fact, one can configure multiple Tomcat workers to serve servlets on behalf of a given web server. The reasons for such a configuration can be:
There are probably more reasons for having multiple workers, but this list should be enough. Tomcat workers are defined in a properties file named workers.properties, and this tutorial explains how to work with it. This document was originally part of Tomcat: A Minimalistic User's Guide written by Gal Shachor, but has been split off for organizational reasons.
Workers are defined to the Tomcat web server plugin using a properties file (a sample file named workers.properties is available in the conf/ directory). The file contains entries of the following form:
worker.list=<a comma separated list of worker names>
When starting up, the web server plugin instantiates the workers whose names appear in the worker.list property. These are also the workers to which you can map requests.
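As a minimal sketch of the entry described above, the following line asks the plugin to instantiate two workers. The names worker1 and worker2 are illustrative assumptions, not names required by JK.

```properties
# Workers the plugin instantiates at startup; requests can be mapped only to these.
# "worker1" and "worker2" are hypothetical names chosen for this example.
worker.list=worker1, worker2
```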
Each named worker should also have a few entries that provide additional information about it. This information includes the worker's type and other related worker information. The following worker types currently exist (JK 1.2.5):
ajp12: forwards requests to out-of-process Tomcat workers using the ajpv12 protocol.
ajp13: forwards requests to out-of-process Tomcat workers using the ajpv13 protocol.
jni: forwards requests to in-process Tomcat workers using JNI.
lb: a load-balancing worker that manages several "real" workers.
A worker of a certain type is defined with the following property format:
worker.<worker name>.type=<worker type>
Here <worker name> is the name assigned to the worker and <worker type> is one of the four types listed above. A worker name may not contain any spaces; a good convention is to follow the Java variable naming rules.
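To make the type property concrete, here is a sketch that declares one worker of each of the four types. The worker names (local, remote, fast, loadbalancer) are assumptions picked for illustration.

```properties
# One worker per type; every name below is hypothetical.
worker.local.type=ajp13
worker.remote.type=ajp12
worker.fast.type=jni
worker.loadbalancer.type=lb
```

Only the workers that also appear in worker.list are instantiated directly by the plugin.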
After defining the workers you can also specify properties for them. Properties are specified in the following manner:
worker.<worker name>.<property>=<property value>
The ajp12 typed workers forward requests to out-of-process Tomcat workers using the ajpv12 protocol over TCP/IP sockets. The ajp12 worker properties are:
host sets the host where the Tomcat worker is listening for ajp12 requests.
port sets the port where the Tomcat worker is listening for ajp12 requests.
lbfactor is used when working with a load balancer worker; it is the load-balancing factor for the worker. We'll see more on this in the lb worker section.
Notes: In the ajpv12 protocol, connections are created, used and then closed at each request. The default port for ajp12 is 8007.
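The ajp12 properties above might be combined as in the following sketch. The worker name remote, the host name and the lbfactor value are assumptions for illustration; the port is the ajp12 default mentioned in the note.

```properties
# Hypothetical ajp12 worker named "remote".
worker.remote.type=ajp12
# Host where Tomcat listens for ajp12 requests (placeholder host name).
worker.remote.host=tomcat.example.org
# Default ajp12 port.
worker.remote.port=8007
# Only meaningful when the worker is managed by an lb worker; value is illustrative.
worker.remote.lbfactor=5
```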
The ajp13 typed workers forward requests to out-of-process Tomcat workers using the ajpv13 protocol over TCP/IP sockets. The main differences between ajpv12 and ajpv13 are that:
Note that ajp13 is now the only out-of-process protocol supported by Tomcat 4.0.x, 4.1.x and 5. The ajp13 worker accepts the following properties:
host sets the host where the Tomcat worker is listening for ajp13 requests.
port sets the port where the Tomcat worker is listening for ajp13 requests.
lbfactor is used when working with a load balancer worker; it is the load-balancing factor for the worker. We'll see more on this in the lb worker section.
cachesize is useful when you're using JK in multithreaded web servers such as Apache 2.0, IIS and Netscape. They benefit the most from setting this value to a higher level (such as the estimated average number of concurrent users for Tomcat). If cachesize is not set, connection cache support is disabled.
cache_timeout should be used with cachesize to specify how long JK should keep an open socket in the cache before closing it. This property should be used to reduce the number of threads on the Tomcat side. Under heavy load some web servers, for example Apache, create many children/threads to handle the load and destroy them only when the load decreases. Each child may open an ajp13 connection if it has to forward a request to Tomcat, creating a new ajp13 thread on the Tomcat side. The problem is that once an ajp13 connection is created, the child won't drop it until it is killed. And since the web server keeps its children/threads running to handle high load, even if a child/thread handles only static content, you could end up with many unused ajp13 threads on the Tomcat side.
socket_keepalive should be used when there is a firewall between your web server and the Tomcat engine that tends to drop inactive connections. This flag tells the operating system to send KEEP_ALIVE messages on inactive connections (the interval depends on global OS settings, generally 120 minutes), and thus prevents the firewall from cutting the connection. The problem with a firewall cutting inactive connections is that sometimes neither the web server nor Tomcat is informed of the cut, so neither can handle it.
socket_timeout tells the web server to cut an ajp13 connection after some time of inactivity. When an endpoint is chosen for a request and the assigned socket is open, the socket is closed if it has not been used for the configured time. It's a good way to ensure that no overly old threads stay alive on the Tomcat side, at the extra cost of reopening the socket the next time a request is forwarded. This property is very similar to cache_timeout but also works in non-cache mode.
Notes: The default port for ajp13 is 8009.
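A sketch of an ajp13 worker using the properties described above follows. The worker name local and every value except the default port are assumptions for illustration; the two timeouts are assumed here to be expressed in seconds.

```properties
# Hypothetical ajp13 worker named "local".
worker.local.type=ajp13
worker.local.host=localhost
# Default ajp13 port.
worker.local.port=8009
worker.local.lbfactor=50
# Enable the connection cache (illustrative size).
worker.local.cachesize=10
# Close cached sockets that stay unused this long (assumed to be seconds).
worker.local.cache_timeout=600
# Ask the OS to send KEEP_ALIVE on idle connections, e.g. across a firewall.
worker.local.socket_keepalive=1
# Drop connections idle for this long, even in non-cache mode (assumed to be seconds).
worker.local.socket_timeout=300
```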
The load-balancing worker does not really communicate with Tomcat workers. Instead it is responsible for the management of several "real" workers. This management includes:
The overall result is that workers managed by the same lb worker are load-balanced (based on their lbfactor and current user session) and also provide fallback for one another, so the death of a single Tomcat process will not "kill" the entire site. The following table specifies properties that the lb worker can accept:
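As a sketch of an lb worker, the fragment below balances two ajp13 workers. The property name balanced_workers is taken from the JK 1.2.x documentation rather than from the text above, and the worker names, hosts and lbfactor values are assumptions.

```properties
# Only the lb worker needs to be listed; it manages the "real" workers itself.
worker.list=loadbalancer

# Two hypothetical ajp13 workers with equal load-balancing factors.
worker.worker1.type=ajp13
worker.worker1.host=node1.example.org
worker.worker1.port=8009
worker.worker1.lbfactor=1

worker.worker2.type=ajp13
worker.worker2.host=node2.example.org
worker.worker2.port=8009
worker.worker2.lbfactor=1

# The lb worker and the workers it manages (balanced_workers is the JK 1.2.x name).
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=worker1, worker2
```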
With JK 1.2.x, new load-balancing and fault-tolerance support has been added via two new properties, local_worker_only and local_worker. Let's take an example environment: a cluster with two nodes (worker1 + worker2), running a web server + Tomcat tandem on each node, and a load balancer in front of the nodes.
The local_worker flag on worker1 and worker2 tells the lb_worker which connections are going to the local worker. If local_worker is an int and is not 0, it is set to JK_TRUE and the worker is marked as a local worker; otherwise it is set to JK_FALSE. If at least one worker is marked as a local worker, the lb_worker is in local worker mode. All local workers are moved to the beginning of the internal worker list in lb_worker during validation.
This means that if a request with a session id comes in, it is routed to the appropriate worker. If this worker is down, the request is sent to the first local worker that is not in error state. If a request without a session comes in, it is routed to the first local worker. If all local workers are in error state, the local_worker_only flag becomes important. If it was set to an int other than 0, it is set to JK_TRUE; otherwise it is set to JK_FALSE. When it is JK_TRUE, the request gets an error response. When it is JK_FALSE, lb_worker tries to route the request to another balanced worker.
If one of the workers was in error state and has recovered, nothing changes. The local worker will be checked for requests without a session id (and for requests with a session on itself), and the other worker will only be checked if a request with a session id of that worker comes in.
Why do we need such complex behavior? We need a graceful shutdown of a node for maintenance. The balancer in front periodically queries a special port on each node. If we want to remove a node from the cluster, we switch off this port. The load balancer can't connect to it and marks the node as down. But we don't move the sessions to another node. In this environment it is an error if the balancer sends a request without a session to an apache+mod_jk+tomcat whose port is switched off. And if the load balancer determines that a node is down, no other node is allowed to send a request without a session to it. Only requests with old sessions on the switched-off node are routed to this node.
After some time nobody uses the old sessions and the sessions time out. Then nobody uses this node, because all sessions are gone and the node is unreachable without a session id in the request. If someone uses a session that has timed out, our servlet system sends a redirect response without a session id to the browser. This is necessary for me, because on a switched-off node apache and tomcat can still be up and running, but they are in an old state and should only be asked for valid old sessions. After the last session has timed out, I can update the node etc. without killing sessions or moving them to another node. Sometimes we have a lot of big objects in our sessions, so it would be really time consuming to move them.
The defaults are still local_worker: 0 and local_worker_only: 0.
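The two-node environment described above might look like the following sketch, written from the point of view of node 1. The host names and the lb worker name router are assumptions; balanced_workers is the JK 1.2.x property name and does not appear in the text above.

```properties
# Hypothetical workers.properties on node 1.
worker.list=router

# worker1 runs on this node, so it is flagged as the local worker.
worker.worker1.type=ajp13
worker.worker1.host=node1.example.org
worker.worker1.port=8009
worker.worker1.lbfactor=1
worker.worker1.local_worker=1

# worker2 runs on the other node and is not local here.
worker.worker2.type=ajp13
worker.worker2.host=node2.example.org
worker.worker2.port=8009
worker.worker2.lbfactor=1
worker.worker2.local_worker=0

# With local_worker_only set, requests without a session get an error
# response when every local worker is in error state.
worker.router.type=lb
worker.router.balanced_workers=worker1,worker2
worker.router.local_worker_only=1
```

On node 2 the local_worker values would presumably be swapped, so that each node prefers its own Tomcat.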
The jni worker opens a JVM inside the web server process and executes Tomcat within it (that is, in-process). Following that, messages to and from the JVM are passed using JNI method calls. This makes the jni worker faster than the out-of-process workers, which need to communicate with the Tomcat workers by writing AJP messages over TCP/IP sockets.
Note: Since the JVM is multithreaded, the jni worker should be used only within multithreaded servers such as AOLServer, IIS, Netscape and Apache 2.0.
Since the jni worker opens a JVM, it accepts many properties that it forwards to the JVM, such as the classpath. The jni worker properties are:
class_path is the classpath used by the in-process JVM. This should point to all of Tomcat's jar files as well as any class or other jar file that you want to add to the JVM. To have JSP compile support, remember to also add Javac to the classpath. This can be done in Java2 by adding tools.jar to the classpath. In JDK1.xx you should just add classes.zip. The class_path property can be given in multiple lines. In this case the JK environment concatenates all the classpath entries, putting the path delimiter (":" or ";") between them.
bridge indicates the kind of Tomcat you'll use via JNI. For now, the bridge property can be tomcat32 or tomcat33. Tomcat 3.2.x is deprecated but still present on some distributions, such as iSeries. By default the bridge type is set to tomcat33.
cmd_line is the command line that is handed over to Tomcat's startup code. The cmd_line property can be given in multiple lines. In this case the JK environment concatenates all the cmd_line entries, putting spaces between them.
jvm_lib is the full path to the JVM implementation library. The jni worker uses this path to load the JVM dynamically.
stdout is the full path to where the JVM writes its System.out.
stderr is the full path to where the JVM writes its System.err.
ms sets the initial heap size for the JVM.
mx sets the maximal heap size for the JVM.
sysprops sets the system properties for the JVM.
ld_path sets the additional dynamic libraries path (similar in nature to LD_LIBRARY_PATH).
Notes: Under Linux it seems that processes can't update their own LD_LIBRARY_PATH, so you'll have to update it BEFORE launching the web server.
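The jni properties above could be put together roughly as follows. This is only a sketch: the worker name localjni, all paths, the heap sizes, the cmd_line arguments and the system property are assumptions for illustration and must be adapted to a real installation.

```properties
# Hypothetical in-process worker named "localjni".
worker.localjni.type=jni
worker.localjni.bridge=tomcat33

# class_path given on several lines; JK joins the entries with the path delimiter.
worker.localjni.class_path=/path/to/tomcat/lib/some-tomcat.jar
worker.localjni.class_path=/path/to/jdk/lib/tools.jar

# cmd_line given on several lines; JK joins the entries with spaces.
worker.localjni.cmd_line=-config
worker.localjni.cmd_line=/path/to/tomcat/conf/server.xml

# JVM library to load dynamically, and files receiving System.out and System.err.
worker.localjni.jvm_lib=/path/to/jre/lib/libjvm.so
worker.localjni.stdout=/path/to/logs/jvm.stdout
worker.localjni.stderr=/path/to/logs/jvm.stderr

# Initial and maximal heap size (illustrative values).
worker.localjni.ms=10
worker.localjni.mx=100

# A system property passed to the JVM, and an extra dynamic library path.
worker.localjni.sysprops=java.compiler=NONE
worker.localjni.ld_path=/path/to/extra/native/libs
```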
You can define "macros" in the property files. These macros let you define properties and later use them while constructing other properties. This is very useful when you want to change your Java home, Tomcat home or OS path separator.
Since coping with workers.properties on your own is not an easy thing to do, a sample workers.properties file is bundled along with JK. You can also find here a sample workers.properties defining:
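As a sketch of how macros can be used, the fragment below defines a few values once and references them with the $(name) syntax used by JK property files. The macro names, the paths and the worker name inprocess are assumptions for illustration.

```properties
# Values defined once (hypothetical locations).
workers.tomcat_home=/var/tomcat3
workers.java_home=/opt/java
# OS path separator: "/" on Unix, "\" on Windows.
ps=/

# Later properties are built from the macros above.
worker.inprocess.type=jni
worker.inprocess.class_path=$(workers.tomcat_home)$(ps)classes
worker.inprocess.class_path=$(workers.java_home)$(ps)lib$(ps)tools.jar
worker.inprocess.stdout=$(workers.tomcat_home)$(ps)logs$(ps)inprocess.stdout
```

Changing the Java or Tomcat location then means editing a single line instead of every property that embeds the path.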