Monday, December 2, 2019

Understanding vm.nr_hugepages, vm.hugetlb_shm_group, kernel.shmall, kernel.shmmax, kernel.shmmni

I am setting up a new database server and want to make sure that all the OS limits are set properly. This requires a little calculation, and it is easy to confuse which values are divided by the normal page size and which by the huge page size.

I hope the explanation below helps in understanding these parameters and setting them correctly.



Getting the configured values for all the variables (this reads the persistent settings; sysctl -a shows what the running kernel is using)
grep -E '(shm|huge)' /etc/sysctl.conf | sort


Hugepage size used on the machine
grep -i "Hugepagesize" /proc/meminfo



vm.nr_hugepages       = Number of hugepages reserved, usable by the Unix group set in vm.hugetlb_shm_group. Make sure that nr_hugepages * Hugepagesize covers at least the memory allowed by kernel.shmall.
Note: Make sure there are enough HugePages to cover all your database SGAs


vm.hugetlb_shm_group  = Group ID of the Unix group which can use the hugepages defined under vm.nr_hugepages

kernel.shmall         = Total number of pages available for all databases to place their SGAs.
Note: It is a number of normal pages, not a number of hugepages. (The page size is shown by: getconf PAGE_SIZE)
Note: It is a page count, not a size in bytes.
Note: pga_aggregate_target is not part of this. PGA memory is taken from ordinary OS RAM.

kernel.shmmax         = SHMMAX is the maximum size of a single shared memory segment, in bytes.
Ideally, SGA_TARGET should fit in one shared memory segment at startup, i.e.
                        SGA_TARGET < SHMMAX
If SGA_TARGET > SHMMAX, Oracle will try to fit the SGA into multiple contiguous segments.
If it is not able to do so, it will use non-contiguous multi-segment allocation, in which
                        Oracle has to grab free memory segments fragmented between used spaces.
Note: It has nothing to do with the total SGA for all databases. The database will start even if this value
                        is smaller than the SGA

kernel.shmmni         = Maximum number of shared memory segments system-wide. The recommended value can vary between operating systems.
By default use 4096



General Notes:
Make sure that both the RAM and the HugePages are large enough to cover all your database SGAs.
Make sure the total SGA is less than the installed RAM and re-calculate HugePages.



Real world problem:
My server has 90G RAM. I will run 2 DBs on this server.

DB01 Requirements
-----------------
SGA = 12G
PGA = 4G


DB02 Requirements
-----------------
SGA = 7G
PGA = 10G


How can I set up the following parameters?
vm.nr_hugepages, vm.hugetlb_shm_group, kernel.shmall, kernel.shmmax


Solution:
Value for kernel.shmmni

I am setting the value to 4096, as that is what is recommended for the Red Hat version I am using.
kernel.shmmni = 4096



Calculate kernel.shmall
Find the total SGA required, which is 12 + 7 = 19G. It is used to calculate kernel.shmall (remember this value is a number of pages).

The page size we are using is 4k, so 19G / 4k = total number of pages required to hold the SGAs of both databases.

In this case it will be (19 * 1024 * 1024) / 4 = 4980736 pages

We can set kernel.shmall = 4980736    (Note: this is the minimum value; add a little more if you want to add more databases later on.
If you create another DB later, this parameter has to be changed again, and the matching HugePages change may need a restart of the server.)
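
The shmall arithmetic can be sketched as a small shell function. This is only an illustration: the function name shmall_pages is made up for this example, and the page size is passed in by hand rather than read from the machine.

```shell
# Hypothetical helper: convert a total SGA size into a page count for kernel.shmall.
# Arguments: total SGA in GiB, page size in bytes (the value of `getconf PAGE_SIZE`).
shmall_pages() {
    echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

# 19G of SGA with 4k (4096 byte) pages, as in the example above
shmall_pages 19 4096
```

On a real server the second argument would normally be "$(getconf PAGE_SIZE)" instead of a literal 4096.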



Calculate kernel.shmmax
SHMMAX is the maximum size of a single shared memory segment, in bytes.
The biggest SGA of the two databases is 12G.
In this case it will be 12 * 1024 * 1024 * 1024 = 12884901888

kernel.shmmax = 12884901888
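
Because shmmax is expressed in bytes, the 12G figure has to be multiplied by 1024 three times (GiB to MiB to KiB to bytes). A minimal sketch of that conversion, with shmmax_bytes being a name invented for this example:

```shell
# Hypothetical helper: size in bytes of the largest single SGA, for kernel.shmmax.
# Argument: SGA size in GiB.
shmmax_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Largest SGA on this server is 12G
shmmax_bytes 12
```

For 12G this works out to 12884901888 bytes.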



Calculate vm.nr_hugepages
Under shmall we allowed for 19G of SGA, so we need enough hugepages to hold 19G.
In our case the huge page size is 2048k.
In this case it will be (19 * 1024 * 1024) / 2048 = 9728

9728 hugepages cover exactly 19G with no headroom, so I will allocate 512 more hugepages (1G) for this to go smoothly. Hence the end result for the parameter will be 9728 + 512 = 10240

vm.nr_hugepages = 10240
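
The same hugepage calculation as a shell sketch. The function name nr_hugepages and the idea of passing the headroom as a third argument are my own choices for this example; the huge page size is the Hugepagesize value (in kB) from /proc/meminfo.

```shell
# Hypothetical helper: number of hugepages needed for a given total SGA.
# Arguments: total SGA in GiB, huge page size in kB, extra pages kept as headroom.
nr_hugepages() {
    sga_kb=$(( $1 * 1024 * 1024 ))
    # Round up so that a partly used page still gets a whole huge page
    echo $(( (sga_kb + $2 - 1) / $2 + $3 ))
}

# 19G of SGA, 2048k hugepages, 512 pages of headroom
nr_hugepages 19 2048 512
```

With the numbers from this post it gives 9728 + 512 = 10240.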



Calculate vm.hugetlb_shm_group
This is the group ID of the oracle user. Just run id oracle at the shell prompt and you will get the GID.

$ id oracle





Monday, May 20, 2019

Plant watering system using Raspberry Pi

I tried using a Raspberry Pi to water the plants while we were on holiday.

The Raspberry Pi is set up with a temperature sensor and waters once a day when the temperature is over a certain threshold.

[Image: Plants with a source of water]
[Image: Raspberry Pi setup for watering]
[Image: Whole setup]



Thursday, May 9, 2019

Increase the size of /boot on CentOS

I noticed that /boot on one of my test machines is nearly full, so I can no longer patch the machine and I am getting:

Total                                              9.0 MB/s |  95 MB  00:00:10
Running transaction check
Running transaction test


Transaction check error:
  installing package kernel-3.10.0-957.12.1.el7.x86_64 needs 23MB on the /boot filesystem

Error Summary
-------------
Disk Requirements:
  At least 23MB more space needed on the /boot filesystem.

#

This is a VirtualBox machine and I did the following to fix it.

Shut the machine down and add a new 4G disk to it.
Start it up.
Create a new partition using:

# fdisk /dev/sdb
n
p
rest as default

Select (default p): p
Partition number (1-4, default 1):
First sector (2048-8388607, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-8388607, default 8388607):
Using default value 8388607
Partition 1 of type Linux and of size 4 GiB is set

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
#


Format the new partition with an ext4 file system

mkfs.ext4 /dev/sdb1


Copy the existing contents of /boot to the new partition (using /boot/. so that hidden files are copied too)

mkdir -p /tmp/sdb1
mount /dev/sdb1 /tmp/sdb1
cp -a /boot/. /tmp/sdb1/


Get the UUID of the new partition

# blkid /dev/sdb1
/dev/sdb1: UUID="6ae400b9-ff66-4ed0-a550-77eaf29e8d7f" TYPE="ext4"
#


Unmount the new partition

# umount /tmp/sdb1/


Modify /etc/fstab so that the /boot entry uses the UUID of the new partition.
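
The post does not show the fstab entry itself, so the following is only a sketch of what such a line could look like. The helper name fstab_line and the "defaults 1 2" fields are assumptions on my part; the UUID is the one reported by blkid above. The existing /boot line in /etc/fstab would be replaced with (or commented out in favour of) the generated one.

```shell
# Hypothetical helper: build an /etc/fstab line that mounts a filesystem by UUID.
# Arguments: UUID, mount point, filesystem type.
# The "defaults 1 2" options are an assumption, not taken from the original post.
fstab_line() {
    printf 'UUID=%s %s %s defaults 1 2\n' "$1" "$2" "$3"
}

# UUID as reported by blkid for /dev/sdb1
fstab_line 6ae400b9-ff66-4ed0-a550-77eaf29e8d7f /boot ext4
```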



Unmount /boot and mount it back so that the new partition is mounted as /boot

# umount /boot
# mount /boot


# df -h
/dev/sdb1                  3.9G  194M  3.5G   6% /boot


Reinstall GRUB and update its configuration


[root@localhost ~]# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
[root@localhost ~]#
[root@localhost ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-957.10.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-957.10.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-957.5.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-957.5.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-693.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-693.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-afcc54177a674be69136d59c73eb35bf
Found initrd image: /boot/initramfs-0-rescue-afcc54177a674be69136d59c73eb35bf.img
done
[root@localhost ~]#


Saturday, April 27, 2019

Recover database after loss of control file

I lost the control file, and when trying to start the database I get the error below:

SQL> startup
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  8798312 bytes
Variable Size             339742616 bytes
Database Buffers          486539264 bytes
Redo Buffers                3780608 bytes
ORA-00205: error in identifying control file, check alert log for more info


SQL> 



You can see the error in the alert log as well.

2019-04-27T07:29:30.006425-04:00
ALTER DATABASE   MOUNT
2019-04-27T07:29:30.781317-04:00
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/app/oracle/oradata/orcl12c/control01.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
ORA-205 signalled during: ALTER DATABASE   MOUNT...
2019-04-27T07:29:30.932158-04:00
Errors in file /u01/app/oracle/diag/rdbms/orcl12c/orcl12c/trace/orcl12c_m000_12917.trc:
ORA-00202: control file: '/u01/app/oracle/oradata/orcl12c/control01.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
2019-04-27T07:29:32.069723-04:00
Checker run found 1 new persistent data failures


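Solution: Copying the second control file to the original location, naming it control01.ctl, and starting the database fixed the issue. (After the failed startup the instance is still running in NOMOUNT state, so shut it down before running startup again.)

The location of the second control file is in the alert log.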

  control_files            = "/u01/app/oracle/oradata/orcl12c/control01.ctl"
  control_files            = "/u01/app/oracle/fast_recovery_area/orcl12c/control02.ctl"



$ cp /u01/app/oracle/fast_recovery_area/orcl12c/control02.ctl /u01/app/oracle/oradata/orcl12c/control01.ctl

SQL> startup
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  8798312 bytes
Variable Size             339742616 bytes
Database Buffers          486539264 bytes
Redo Buffers                3780608 bytes
Database mounted.
Database opened.
SQL> 


Tuesday, March 12, 2019

The program cannot open the required dialog box because it cannot determine whether the computer named "BLABLA" is joined to a domain. Close this message, and try again.

I am connected to a SQL Server database from my machine using SQL Server Management Studio and wanted to add a user with Windows Authentication, but as soon as I click the Search button the session hangs for some time and finally reports the error below:

The error message is "The program cannot open the required dialog box because it cannot determine whether the computer named "BLABLA" is joined to a domain. Close this message, and try again."

There is a workaround for this issue and a fix.

The workaround can be done from your client SQL Server Management Studio, but for the fix you need access to the SQL Server machine to make a change in its firewall.

Workaround:
Use a command instead of the dialog to create the login from SQL Server Management Studio. An example is below:

create login "MYDOMAIN\test_user" from windows;
go

After this you should be able to refresh the Login view and grant required permissions to user.


Fix:
SQL Server Management Studio requires port 445 inbound on the server to perform this search. Add that port to the inbound firewall rules and enable the rule to get it fixed.


Other Checks Done:
I checked that the server is joined to the domain, which can be done from cmd or from the server properties

echo %userdomain%



Tuesday, February 26, 2019

Remove all old xauth entries

Simply log on to the server and issue the command below:

xauth list | cut -d' ' -f1 | xargs -I {} xauth remove {}

Wednesday, January 23, 2019

Medical Records

I like to solve problems, and one problem I encountered is that when you are sick and seeing a doctor, you need to have a few of your details handy, such as recent temperatures, medications taken, and blood pressure and blood glucose readings.

The doctor would like to see how each of these has changed over a period of time.

Sometimes you would also like to keep track of these things for your own records.

I have an Android phone, and to fix this issue I have created an Android application which stores this kind of information.

Now I can create profiles for all of my family members and keep their information separate, and when I am seeing a doctor I just pull up their profile and show it to the doctor. It is very convenient, and I hope you like it as well.

Below is the Google Play store link from where you can download it:

https://play.google.com/store/apps/details?id=com.appowl247.fever

I would really like your feedback; please feel free to comment or provide feedback in Google Play.

Wednesday, January 16, 2019

Friday, January 11, 2019

FATAL: OCI listening thread exited. Failed to get IP address of host.

I am getting an error when I try to connect to a RAC database server:


BLABLA(devel): sqlplus user/password@MY_TNS_NAME 

SQL*Plus: Release 11.2.0.2.0 Production on Fri Jan 11 14:11:21 2019

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

FATAL: OCI listening thread exited. Failed to get IP address of host.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics, Real Application Testing and Unified Auditing options

SQL>


As the message suggests, sqlplus is not able to determine the IP address of the local machine and shows this error.

It appears to be only a warning, since the connection still succeeds, and it can be fixed very easily by adding the hostname and IP address of the local machine to the /etc/hosts file.

Once that was done I tried connecting again and the message was gone.
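
To check whether the local hostname already has an entry before editing the file, something like the sketch below could be used. The function name has_hosts_entry is invented for this example, and it only looks at the names on each line, not at whether the IP address is correct.

```shell
# Hypothetical helper: succeed if a hosts file has a non-comment line naming the host.
# Arguments: hostname to look for, path to the hosts file (normally /etc/hosts).
has_hosts_entry() {
    awk -v h="$1" '
        $1 ~ /^#/ { next }                  # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # ignore trailing comments
                if ($i == h) found = 1      # field 1 is the IP, names start at field 2
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Example use on the client machine:
#   has_hosts_entry "$(hostname)" /etc/hosts || echo "no /etc/hosts entry for $(hostname)"
```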

BLABLA(devel): sqlplus user/password@MY_TNS_NAME    

SQL*Plus: Release 11.2.0.2.0 Production on Fri Jan 11 14:13:19 2019

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics, Real Application Testing and Unified Auditing options

SQL>