Wednesday, July 29, 2015

CPU -- MONITORING, PERFORMANCE & TUNING


The central processing unit (CPU) of a computer is the piece of hardware that carries out the instructions of a computer program. It performs the basic arithmetic, logical, and input/output operations of a computer system. The CPU is like the brain of the computer - every instruction, no matter how simple, has to go through it.

A typical CPU has a number of components:

1. ALU - the arithmetic logic unit, which performs simple arithmetic and logical operations.

2. CU - the control unit, which manages the various components of the computer. It reads and interprets instructions from memory and transforms them into a series of signals that activate other parts of the computer. The control unit calls upon the arithmetic logic unit to perform the necessary calculations.

3. Cache - CPU caching keeps recently (or frequently) requested data in a place where it is easily accessible, avoiding the delay associated with reading data from RAM.



What is CPU Clock Speed ?

A processor's clock speed measures one thing -- how many times per second the processor has the opportunity to do something.

Ex. A 2.3 GHz processor's clock ticks 2.3 billion times per second, while a 2.6 GHz processor's clock ticks 2.6 billion times per second. All things being equal, the 2.6 GHz chip should be approximately 13 percent faster.
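That "all things being equal" caveat matters: the arithmetic below compares clock rates only, assuming equal work per tick (same architecture, same instructions per cycle), which rarely holds exactly in practice.

```python
# Rough clock-speed comparison. This ignores IPC, cache, and memory
# differences -- it is only the "all things being equal" arithmetic.

def ticks_per_second(ghz):
    """Convert a clock rate in GHz to clock ticks per second."""
    return ghz * 1e9

def relative_speedup(base_ghz, faster_ghz):
    """Percentage speedup of faster_ghz over base_ghz, all else equal."""
    return (faster_ghz / base_ghz - 1) * 100

# A 2.6 GHz chip vs a 2.3 GHz chip:
print(round(relative_speedup(2.3, 2.6)))  # ~13 percent
```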



What is CPU Caching ?
CPU caching keeps recently (or frequently) requested data in a place where it is easily accessible. This avoids the delay associated with reading data from RAM.
                 
  • A CPU cache places a small amount of memory directly on the CPU. This memory is much faster than the system RAM because it operates at the CPU's speed rather than the system bus speed. The idea behind the cache is that chip makers assume that if data has been requested once, there's a good chance it will be requested again. Placing the data on the cache makes it accessible faster.


WHY IS CACHE REQUIRED FOR BETTER PERFORMANCE ?

The CPU accesses data from memory, and it is connected to memory through the system bus. The clock speed of the CPU is much higher than the speed of the system bus. To complete any request, the CPU must fetch data from memory, which can only be reached via the system bus - this is where the bus speed comes into the picture. As a result, the request-processing power of the CPU is impacted.

To overcome this latency, the concept of CPU caching was introduced. The cache sits on the processor chip, stores recently or frequently requested data, and is many times faster to access than main memory. When the required data is already available in the cache, the CPU does not have to wait for it to arrive from memory, so request-processing speed increases.
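The idea can be sketched as a toy lookup cache: a "fetch" that first checks a small fast store before falling back to the slow backing memory. All the names here are illustrative - real CPU caches work on fixed-size lines in hardware, not Python dictionaries.

```python
# Toy illustration of caching: recently requested items are kept in a
# small dictionary so repeated requests skip the "slow" memory fetch.

MAIN_MEMORY = {addr: addr * 2 for addr in range(1024)}  # pretend RAM

cache = {}
hits = 0
misses = 0

def read(addr):
    global hits, misses
    if addr in cache:            # fast path: data already in the cache
        hits += 1
        return cache[addr]
    misses += 1                  # slow path: go out to main memory
    value = MAIN_MEMORY[addr]
    cache[addr] = value          # keep it around for the next request
    return value

read(42)   # first access: miss, fetched from memory
read(42)   # second access: hit, served from the cache
read(7)    # different address: miss
print(hits, misses)  # 1 2
```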

Typically there are now 3 layers of cache on modern CPU cores:

    L1 cache is very small and very tightly bound to the actual processing units of the CPU; it can typically fulfil data requests within 3 CPU clock ticks. L1 cache tends to be around 4-32 KB depending on CPU architecture, and is split between instruction and data caches.

    L2 cache is generally larger but a bit slower, and is generally tied to a CPU core. Recent processors tend to have 512 KB of cache per core, and this cache makes no distinction between instruction and data caches - it is a unified cache.

    L3 cache tends to be shared by all the cores present on the CPU and is much larger and slower again, but it is still a lot faster than going to main memory.


Note : CPU performance also largely depends on the sizes of the L1, L2 & L3 caches.
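One common way to quantify why this hierarchy matters is the average memory access time (AMAT): each level is checked in turn, and misses fall through to the next, slower level. The latencies below are illustrative round numbers, not figures for any specific CPU.

```python
# Average Memory Access Time (AMAT) sketch with illustrative latencies
# (in CPU clock ticks); real values vary widely by architecture.
L1_LATENCY, L2_LATENCY, L3_LATENCY, RAM_LATENCY = 3, 12, 40, 200

def amat(l1_hit, l2_hit, l3_hit):
    """AMAT for the given hit rates; each miss pays the next level too."""
    t = L1_LATENCY
    t += (1 - l1_hit) * L2_LATENCY
    t += (1 - l1_hit) * (1 - l2_hit) * L3_LATENCY
    t += (1 - l1_hit) * (1 - l2_hit) * (1 - l3_hit) * RAM_LATENCY
    return t

# With good hit rates the average stays close to L1 speed:
print(amat(0.95, 0.80, 0.90))   # about 4.2 ticks per access
# With no caches, every access pays the full chain down to RAM:
print(amat(0.0, 0.0, 0.0))      # 255 ticks per access
```

This is why even a small L1 with a high hit rate dominates the average: the expensive RAM trip is multiplied by the product of the miss rates.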



Performance Metrics in Terms of CPU Performance

Latency

The time that one system component spends waiting for another component in order to complete the entire task. Latency can be thought of as wasted time. In networking discussions, latency is defined as the travel time of a packet from source to destination.

Response time

The time between the submission of a request and the completion of the response.

response time = service time + wait time

Service time

The time between the initiation and completion of the response to a request.

Throughput

The number of requests processed per unit of time.

Wait time

The time between the submission of the request and the initiation of the response.
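These definitions can be made concrete with a few timestamped requests. The timestamps here are made up purely for illustration:

```python
# Computing the metrics above from request timestamps (in seconds).
# Each tuple: (submitted, started, completed) -- illustrative data.
requests = [
    (0.0, 0.1, 0.5),   # waited 0.1 s, serviced in 0.4 s
    (0.2, 0.5, 1.0),   # waited 0.3 s, serviced in 0.5 s
    (0.4, 1.0, 1.2),   # waited 0.6 s, serviced in 0.2 s
]

for submitted, started, completed in requests:
    wait = started - submitted        # submission -> initiation
    service = completed - started     # initiation -> completion
    response = completed - submitted  # submission -> completion
    # response time = service time + wait time
    assert abs(response - (wait + service)) < 1e-9

elapsed = max(c for _, _, c in requests) - min(s for s, _, _ in requests)
throughput = len(requests) / elapsed  # requests per unit of time
print(round(throughput, 2))  # 2.5 requests/second
```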



Response Time

Because response time equals service time plus wait time, you can increase performance in this area by:

    Reducing wait time

    Reducing service time

Understanding different aspects of CPU service time


Suppose an LPAR has 2 physical CPUs allocated to it (SMT disabled / single-threaded mode). Each CPU then processes one request at a time, so there is no wait time and the service time is minimal. This in turn improves the application response time.

In the other case, suppose the LPAR is assigned 0.4 CPU of entitlement and 2 virtual CPUs, with SMT-2 enabled. That means there are 2 threads per virtual CPU, and each virtual CPU is entitled to 20 ms per time cycle/core. If requests from both threads of virtual CPU 1 are queued up in the run queue at the same time, the thread with the higher priority is dispatched for execution first. The CPU dispatcher and scheduler decide when to give a timeslice to the other thread, as per the scheduling algorithms. If the primary physical CPU cannot provide the timeslice to that thread, a context switch happens and the request is executed by another physical CPU of the same pool. Either way, the service time increases, and this in turn increases the application response time.
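The priority-based dispatch described above can be sketched as a tiny run-queue simulation. This is purely illustrative - the thread names, priorities, and timings are invented, and a real dispatcher such as the PowerVM/AIX scheduler is far more involved:

```python
import heapq

# Toy run-queue: two threads of one virtual CPU queue up at the same
# time; the higher-priority thread is dispatched first, so the other
# thread's work completes a full timeslice later -- its service time
# (and hence its response time) grows.

TIMESLICE_MS = 20  # entitlement per dispatch cycle, as in the text

# (priority, name): lower number = higher priority, popped first
run_queue = [(1, "thread-A"), (2, "thread-B")]
heapq.heapify(run_queue)

clock_ms = 0
completion = {}
while run_queue:
    _, name = heapq.heappop(run_queue)  # dispatch highest priority
    clock_ms += TIMESLICE_MS            # thread runs its timeslice
    completion[name] = clock_ms

print(completion)  # {'thread-A': 20, 'thread-B': 40}
```

The lower-priority thread finishes at 40 ms instead of 20 ms even though its own work took only one timeslice - the extra 20 ms is exactly the queuing effect the paragraph describes.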