
AIX Production Readiness Checklist (PRC) - check list 2

Status
Not open for further replies.

passion7aix

Technical User
May 1, 2013
11
US
Hello Everyone,

Can anyone please provide a checklist for validating our newly built AIX LPARs? AIX is new in our environment, so I'm looking for a reference document or checklist to verify new LPARs.

I believe most companies have some kind of checklist for this. Please share any general reference checklist. (I know it depends on the environment, but I'm looking for a generalized document.)

I am trying to put things together so that it will be useful for new builds.

Thanks,
MJ
 
Alright,
Here is a nearly complete checklist for AIX, which I have been using for almost half a decade.

check /etc/issue.net
verify ssh is running and can connect to other servers
Validate the following settings are set in /etc/ssh/sshd_config
Check sudo package installation
check sudoers permission (should be root:root 440)
check all services are disabled in /etc/inetd.conf
check account expiration policy for accounts
check /etc/security/user for default user settings
Lock after retries value should be set to 0 in /etc/security/login.cfg
Add/validate that additional generic user IDs do not have password expiration
confirm UMASK=022 as default (set as default on a per user basis)
check sudo/ssh logging is enabled (verify local7.debug to /var/log/sshdlog and auth.debug to /var/adm/sudo.log)
make sure sendmail.cf is configured properly: DScratesmtp.cb.crateandbarrel.com and O Privacy=authwarnings,novrfy,noexpn,restrictqrun
make sure the ftpusers file is populated and includes root
rm any .rhosts and .netrc files, and comment out entries in /etc/hosts.equiv
Validate /etc/inet/ntp.conf:
NTP is running and synced with ntp server
Validate /etc/resolv.conf
Validate the given entries are in /etc/syslog.conf
Set up dump devices
Validate that the TCP/IP settings have been changed to the required values
Check file system sizes in rootvg
Set up the root profile settings
Check that the server time zone has been set to match PST
Validate TCP/IP Services
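
As a rough illustration of the "all services are disabled in /etc/inetd.conf" item, here is a minimal sketch that lists any entries still active. The function name active_inetd_services is made up for this example, and it assumes the usual inetd.conf layout where a disabled service is simply commented out with '#'.

```sh
#!/bin/sh
# Hypothetical helper: print the service name of every inetd.conf entry
# that is still active (neither blank nor commented out).
# $1 is the file to inspect; it defaults to /etc/inetd.conf.
active_inetd_services() {
    conf=${1:-/etc/inetd.conf}
    # Skip blank lines and lines whose first non-blank character is '#',
    # then print the first field (the service name) of what remains.
    awk '!/^[ \t]*(#|$)/ { print $1 }' "$conf"
}
```

If the function prints nothing, every entry in the file is commented out, which is what this checklist item expects.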


SARFARAZ AHMED SYED,
Sr. Systems Engineer
 
Remember to customize the settings for your environment and local time zone, and look out for other items that might not exactly match the list above, as I have a few different customizations for different clients.

SARFARAZ AHMED SYED,
Sr. Systems Engineer
 
@Sarfaraz Ahmed Syed,

Thanks for your prompt response. I really appreciate your help bhai! This really helps a lot.
I will add all these points to my checklist.

You may add any other missing items to this thread later, so that it will be helpful to other users like me.

 
Just want to add something

I believe "/etc/issue.net" is for Linux (I guess not for AIX). Please correct me if I'm wrong.

thanks
 
Ok now,

If you are using ssh to connect to other systems, which I am sure you will, then you set the Banner directive in the /etc/ssh/sshd_config file to "/etc/issue.net".

Generally you put a warning notice in it, the same as we do in the /etc/motd file.

Again, it depends on your client/company.

This is common to Linux and AIX.
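
To verify the banner setting from a script, something like the sketch below could be used. The function name sshd_banner_path is hypothetical; it only reads the config file and assumes the first Banner line is the one sshd honours.

```sh
#!/bin/sh
# Hypothetical helper: print the file named by the Banner directive in an
# sshd_config file. Prints nothing if Banner is absent, commented out,
# or set to "none". $1 defaults to /etc/ssh/sshd_config.
sshd_banner_path() {
    conf=${1:-/etc/ssh/sshd_config}
    # Keywords are case-insensitive; stop at the first Banner line found.
    awk 'tolower($1) == "banner" {
             if (tolower($2) != "none") print $2
             exit
         }' "$conf"
}
```

An empty result means no banner is configured, so the checklist item for /etc/issue.net would fail.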
 
Also,
Try this; it has everything you need, in fact more than you care about.

AIX has a tool called aixpert, which helps the SA with security setting configuration.

The command below writes the high-level security settings to a file without applying them (-n), which gives you all the things you need to tighten the security of the box.
# aixpert -l high -n -o /tmp/securityfile

Note: "/tmp/securityfile" is just the location I chose; you can specify any location you want.

It will give you more than anyone can provide you.
 
Hi AIXLogician,

Useful information, thank you.

I see "Setup dump devices" in the above checklist.
Do we need to do that?

I mean, when we installed AIX 7.1 on a new LPAR, the dump device was created automatically.
Please see the info below and let me know if I need to make any changes to keep my system stable.

#sysdumpdev -l
primary /dev/lg_dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump FALSE
dump compression ON
type of dump fw-assisted
full memory dump disallow

Can you please let me know whether the above settings are good?

on one LPAR
============
LPAR1# sysdumpdev -L
0453-039

Device name: /dev/lg_dumplv
Major device number: 10
Minor device number: 11
Size: 0 bytes
Date/Time: Mon Apr 22 10:27:36 EDT 2013
Dump status: -3
Type of dump: fw-assisted
dump failed or did not start
A previous dump was not logged.
Scanning device /dev/lg_dumplv for existing dump.


Can you please tell me how I can fix the above error? Why did the dump fail (did it not have enough space)?


another LPAR
============
LPAR2# sysdumpdev -L
0453-019 No previous dumps recorded.

Scanning device /dev/lg_dumplv for existing dump.




 
Hello Mahijeeth,
It depends. By default, AIX creates a dump device in rootvg.

You choose the dump device size by estimating the size of a dump on the currently running system.
For that you run #sysdumpdev -e (this gives you the estimated dump size in bytes for the currently running system, but note that this is for a compressed dump).
For example, say your estimated compressed dump is around 1.5 GB; you want to create a dump LV bigger than 1.5 GB, say around 2 GB. Also, in case the primary dump fails, you can have a secondary dump device of the same size. This is how you select the size of the dump LV.

You also have to look at the error report: is it generating an error saying there is not sufficient dump device space? In that case, too, you have to analyze and redefine the dump size.
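
The sizing rule above comes down to simple arithmetic, sketched here. The function name dump_lv_pps and the 30% default headroom are assumptions made for this example, not an IBM recommendation; the inputs are the estimated dump size in bytes and the physical partition (PP) size of the volume group in MB.

```sh
#!/bin/sh
# Hypothetical helper: number of physical partitions (PPs) needed to hold
# a dump of $1 bytes on a volume group whose PP size is $2 MB,
# with $3 percent headroom on top (assumed default: 30).
dump_lv_pps() {
    bytes=$1
    pp_mb=$2
    headroom=${3:-30}
    pp_bytes=$((pp_mb * 1024 * 1024))
    # Estimated size plus headroom, in bytes.
    target=$((bytes * (100 + headroom) / 100))
    # Round up to a whole number of PPs (ceiling division).
    echo $(( (target + pp_bytes - 1) / pp_bytes ))
}

# Example: a 1.5 GB estimate (1610612736 bytes) with 128 MB PPs
# dump_lv_pps 1610612736 128    # prints 16, i.e. 16 x 128 MB = 2 GB
```

On a live system the byte count would presumably be taken from the output of sysdumpdev -e and the PP size from lsvg rootvg; the exact output layout of those commands is not assumed here, so check it on your own AIX level before scripting around it.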
 
@ AIXLogician,
Thanks for your reply.

Earlier, I received the error below, but I have since increased the dump LV using the extendlv command.
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
E87EF1BE 0810150013 P O dumpcheck The largest dump device is too small.

Usually the dump status should be "0", right? I mean if it is successful.
Dump status: -3 --> failed
I was asking in what cases a dump fails.
I am just curious whether my LPAR has the correct settings or not.

#sysdumpdev -l
primary /dev/lg_dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump FALSE
dump compression ON
type of dump fw-assisted
full memory dump disallow

thank you
 
Hmmm,

Yes, that error is because your dump device was too small.

If you want to add a secondary dump device, go ahead and do it.

I assume you have POWER6 or later hardware. If not, change the "type of dump" to "traditional" using the sysdumpdev -t command.
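
For checking the result of the last dump from a script, the "Dump status:" line of sysdumpdev -L can be picked out as in this sketch. The function name last_dump_status is hypothetical, and the parsing is based only on the output format shown earlier in this thread.

```sh
#!/bin/sh
# Hypothetical helper: read `sysdumpdev -L` output on stdin and print the
# numeric dump status. Return code: 0 if the status is 0,
# 1 if it is non-zero, 2 if no "Dump status:" line was found.
last_dump_status() {
    # The status value is the last field of the "Dump status:" line.
    dump_status=$(awk '/^Dump status:/ { print $NF; exit }')
    if [ -z "$dump_status" ]; then
        echo "no dump recorded"
        return 2
    fi
    echo "$dump_status"
    [ "$dump_status" -eq 0 ]
}

# Intended use on AIX (not run here):
#   sysdumpdev -L 2>&1 | last_dump_status
```

With the LPAR1 output posted above this prints -3 and returns 1; with the LPAR2 output it reports that no dump is recorded.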
 
Yes, you are right. We have POWER6 hardware, so I believe it is OK to go with
type of dump fw-assisted

I'm planning to add a secondary dump device in case I see the same error again.
Is it a must to have a secondary dump device on prod systems?

Thanks...for response.
 
Thank you @ AIXLogician

One last question regarding dumps.

Output of lsvg -l rootvg:
lg_dumplv sysdump 67 67 1 open/syncd N/A
livedump jfs2 16 16 1 open/syncd /var/adm/ras/livedump

I know that lg_dumplv is a system dump device (used to store the dump).
I am not sure about the livedump LV:
/dev/livedump 0.25 0.25 1% 4 1% /var/adm/ras/livedump

Can you please let me know the use of the "/var/adm/ras/livedump" filesystem?
How can I make use of it? In what cases will it be filled with dump files, and how do I generate/control them?

Thanks



 
Ok,

As the term itself indicates, a live dump captures dump data while the system is live. Live dumps are small dumps that do not require a system restart; they can be taken in place of a full system dump while your system keeps running.

So, unlike system dumps, which are written to a dedicated dump device, live dumps are written to a file system. When you install the operating system, a file system is created to contain live dumps. By default, live dumps are placed in the /var/adm/ras/livedump directory. You can change the directory using the dumpctrl command (though I recommend you do not change it).

To analyze a dump you can use the kernel debugger (kdb); for how to debug, refer to the link below from IBM.
 
A couple of things to bear in mind for dumps...

If you only have SAN storage and you lose all access to it, then you may not get any errpt information or a dump when things go wrong, because none of it can be written to storage that is no longer available.

If you use sysdumpdev -e to get an idea of the memory in use, and so the potential dump size, and base your dump device size on this, bear in mind what happens when something goes mad, starts eating all the memory and paging space, and the system pages itself into oblivion. By the time the system finally fails, the dump will be near the size of the total available memory plus the total available paging space, so the dump device may be way too small and you can only hope for a partial dump.

HTH
 