FTP Issues

We have an applications analyst with a script that ftps from server A to server B and extracts data from server B, daily starting at 6am. Sometimes the ftp takes 15 minutes; other times it takes 3 hours. Other than at month-end, there is only a minimal difference in file sizes. We checked that both the AIX side and the network side are set to 100_full_duplex, which they are. We've had the network people put a sniffer on server A and they found no issues. Is there anything else I need to check?
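(For anyone checking the same thing, here is a rough sketch of how the speed/duplex and error counters can be read on the AIX side; ent0 is just an example adapter name, substitute your own.)

Code:
# Sketch only: check configured vs. negotiated speed/duplex and error counters
# (ent0 is an example adapter name)
lsattr -El ent0 -a media_speed            # what the adapter is configured for
entstat -d ent0 | grep -i "Media Speed"   # what the link actually negotiated
entstat -d ent0 | grep -i "Errors"        # any CRC/alignment/collision errors?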
 
Here's the netstat -v output:
-------------------------------------------------------------
ETHERNET STATISTICS (ent1) :
Device Type: IBM 10/100/1000 Base-T Ethernet PCI Adapter (14100401)
Hardware Address: 00:06:29:6b:7a:3a
Elapsed Time: 0 days 0 hours 0 minutes 2 seconds

Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 0 Packets: 0
Bytes: 0 Bytes: 0
Interrupts: 0 Interrupts: 0
Transmit Errors: 0 Receive Errors: 0
Packets Dropped: 0 Packets Dropped: 0
Bad Packets: 0
Max Packets on S/W Transmit Queue: 0
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 0

Broadcast Packets: 0 Broadcast Packets: 0
Multicast Packets: 0 Multicast Packets: 0
No Carrier Sense: 0 CRC Errors: 0
DMA Underrun: 0 DMA Overrun: 0
Lost CTS Errors: 0 Alignment Errors: 0
Max Collision Errors: 0 No Resource Errors: 0
Late Collision Errors: 0 Receive Collision Errors: 0
Deferred: 0 Packet Too Short Errors: 0
SQE Test: 0 Packet Too Long Errors: 0
Timeout Errors: 0 Packets Discarded by Adapter: 0
Single Collision Count: 0 Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 0

General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 2000
Driver Flags: Up Broadcast Running
Simplex 64BitSupport PrivateSegment
DataRateSet

Adapter Specific Statistics:
----------------------------
Additional Driver Flags: Autonegotiate
Entries to transmit timeout routine: 0
Firmware Level: 13.0.9
Transmit and Receive Flow Control Status: Disabled
Link Status: Up
Media Speed Selected: Autonegotiation
Media Speed Running: 1000 Mbps Full Duplex
Packets with Transmit collisions:
1 collisions: 0 6 collisions: 0 11 collisions: 0
2 collisions: 0 7 collisions: 0 12 collisions: 0
3 collisions: 0 8 collisions: 0 13 collisions: 0
4 collisions: 0 9 collisions: 0 14 collisions: 0
5 collisions: 0 10 collisions: 0 15 collisions: 0
-------------------------------------------------------------
ETHERNET STATISTICS (ent0) :
Device Type: IBM 10/100 Mbps Ethernet PCI Adapter (23100020)
Hardware Address: 00:04:ac:57:3b:e0
Elapsed Time: 13 days 1 hours 32 minutes 55 seconds

Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 38059800 Packets: 51221185
Bytes: 10090027714 Bytes: 6597016251
Interrupts: 87865 Interrupts: 50784366
Transmit Errors: 0 Receive Errors: 0
Packets Dropped: 0 Packets Dropped: 193
Bad Packets: 0
Max Packets on S/W Transmit Queue: 46
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 1

Broadcast Packets: 2069 Broadcast Packets: 4697348
Multicast Packets: 0 Multicast Packets: 24655
No Carrier Sense: 0 CRC Errors: 0
DMA Underrun: 0 DMA Overrun: 0
Lost CTS Errors: 0 Alignment Errors: 0
Max Collision Errors: 0 No Resource Errors: 0
Late Collision Errors: 0 Receive Collision Errors: 0
Deferred: 0 Packet Too Short Errors: 193
SQE Test: 0 Packet Too Long Errors: 0
Timeout Errors: 0 Packets Discarded by Adapter: 0
Single Collision Count: 0 Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 1

General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 200
Driver Flags: Up Broadcast Running
Simplex AlternateAddress 64BitSupport
PrivateSegment DataRateSet

IBM 10/100 Mbps Ethernet PCI Adapter Specific Statistics:
------------------------------------------------
Chip Version: 26
RJ45 Port Link Status : up
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex
Receive Pool Buffer Size: 384
Free Receive Pool Buffers: 125
No Receive Pool Buffer Errors: 43433
Inter Packet Gap: 96
Adapter Restarts due to IOCTL commands: 0
Packets with Transmit collisions:
1 collisions: 0 6 collisions: 0 11 collisions: 0
2 collisions: 0 7 collisions: 0 12 collisions: 0
3 collisions: 0 8 collisions: 0 13 collisions: 0
4 collisions: 0 9 collisions: 0 14 collisions: 0
5 collisions: 0 10 collisions: 0 15 collisions: 0
Excessive deferral errors: 0x0
 
I don't remember if you need to change /etc/environment as well, and I forgot the name of the variable to set! Anyway, as long as you've fixed the name resolution issues, it should be fine to have this much broadcast traffic, as I think that was the result of the previous problem.
I don't see any problems with hueyii's network, except that ent1 isn't receiving anything. Is that usual?

I still doubt that it's a network problem, as I mentioned previously.

I can see that you have problems with pbufs and fsbufs on hueyii, as shown by the vmstat -v output above. This might be causing the problem.
 
You're also running low on free memory pages on hueyii.

Can you run vmstat 1 10 and svmon -G and post the output for both machines?
 
Here's the output for hueyii
vmstat 1 10
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
1 1 407982 248 0 0 0 277 633 0 990 128601 2721 4 4 89 4
1 0 407989 125 0 0 0 89 113 0 898 508840 1334 10 8 82 1
3 1 407989 125 0 0 0 513 591 0 1088 534387 2881 11 9 63 17
2 0 406455 1398 0 0 0 151 161 0 953 552083 3859 13 10 72 5
1 0 406282 1364 0 0 0 0 0 0 862 529872 816 9 8 82 0
1 0 406268 1107 0 0 0 0 0 0 896 539367 1945 8 11 79 2
1 1 406311 757 0 0 0 0 0 0 928 541768 1934 10 10 73 6
2 1 406311 448 0 0 0 0 0 0 905 538024 1821 10 9 75 5
1 0 406314 247 0 0 0 0 0 0 845 528209 954 10 8 81 0
1 0 406314 124 0 0 0 78 85 0 842 528250 906 8 10 82 0
[klehr]hueyii:/usr/reprint# svmon -G
size inuse free pin virtual
memory 1441792 1441655 137 95268 406878
pg space 770048 107875

work pers clnt lpage
pin 95268 0 0 0
in use 337628 1104027 0 0

Here's the output for egate1
vmstat 1 10
System Configuration: lcpu=4 mem=10240MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
2 1 582209 1690 0 0 0 88 211 0 1656 24691 6955 11 11 71 7
6 0 582100 1798 0 0 0 0 0 0 1642 23298 7055 6 8 85 0
2 0 581975 1922 0 0 0 0 0 0 1732 35780 9228 7 12 82 0
1 1 582209 1686 0 0 0 0 0 0 1799 28169 6900 9 11 78 2
2 1 582130 1765 0 0 0 0 0 0 1720 21006 6730 3 11 65 21
4 1 581949 1946 0 0 0 0 0 0 1782 25727 6827 4 11 63 21
0 1 582177 1718 0 0 0 0 0 0 1641 22618 5951 3 10 66 20
5 0 582088 1802 0 0 0 0 0 0 1850 34058 8341 25 19 44 12
1 0 581975 1909 0 0 0 0 0 0 1960 38351 9138 32 20 42 7
3 0 582241 1646 0 0 0 0 0 0 2144 43500 10050 19 19 62 0
[klehr]egate1:/home/klehr# svmon -G
size inuse free pin virtual
memory 2621440 2617402 4038 206362 582050
pg space 524288 26360

work pers clnt lpage
pin 206101 261 0 0
in use 567050 2050352 0 0

As for ent1 on hueyii, I think there is a card there, but we don't have anything configured on it.
 
You know what just happened?! The power went out while I was typing my reply, and I lost everything I'd typed. :p

Anyway, in short, I was saying that I think you have a RAM problem, based on the little information you provided above. It seems you don't have any paging, which suggests you're running with just barely enough RAM, but at the times you run the ftp there might be something else running, so the ftp gets slower as free RAM gets lower and lower, as indicated by:

6854 pending disk I/Os blocked with no pbuf
258952 paging space I/Os blocked with no psbuf
11179553 filesystem I/Os blocked with no fsbuf
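(Those counters are the last lines of vmstat -v; a quick sketch for watching whether they keep growing while the ftp runs:)

Code:
# Sketch: capture the blocked-I/O counters before and after a slow ftp run
vmstat -v | grep -i "blocked"     # pbuf / psbuf / fsbuf lines
# ... let the ftp run ...
vmstat -v | grep -i "blocked"     # if the numbers keep climbing, buffers are short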
 
I guess you are using jfs not jfs2 right?

What's the output of:

ioo -o numfsbufs

ioo -o hd_pbuf_cnt

I'm not sure which filesystem is having the problem, but you can increase the values above (this might cause more of a RAM shortage, but it could solve the issue):

ioo -o numfsbufs=<#value>

ioo -o hd_pbuf_cnt=<#value>

You have to unmount the filesystem you think is causing the problem and mount it back for the change to take effect.
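As a rough sketch only (1024 and /data are example values; size them for your own system):

Code:
# Sketch: raise the buffer counts, then remount the suspect filesystem so the
# new numfsbufs value takes effect (/data and 1024 are placeholders)
ioo -o numfsbufs=1024
ioo -o hd_pbuf_cnt=1024
umount /data
mount /data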

Regards,
Khalid
 
What type of applications do you run on each system? I can see a large amount of file caching on the hueyii machine. You might need to tune minperm and maxperm, and minfree and maxfree as well; see the sketch just below.
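Something like this, purely as a sketch (the values are placeholders, not recommendations; size them from your vmstat/svmon output):

Code:
# Sketch only: example vmo tuning for file caching and free-page thresholds
# (percentages and counts are placeholders)
vmo -o minperm%=10 -o maxperm%=50    # cap how much RAM file caching can hold
vmo -o minfree=240 -o maxfree=248    # give the page stealer more headroom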

What's the output of vmo -a?

I guess you're away for now, so I'll leave and come back tomorrow to see your output.

Cheers,
Khalid
 
Here's the ioo info

ioo -o numfsbufs
numfsbufs = 186
[klehr]hueyii:/home/klehr# ioo -o hd_pbuf_cnt
hd_pbuf_cnt = 576

vmo -a
memory_frames = 1441792
pinnable_frames = 1346632
maxfree = 128
minfree = 120
minperm% = 20
minperm = 271830
maxperm% = 80
maxperm = 1087320
strict_maxperm = 0
maxpin% = 80
maxpin = 1153434
maxclient% = 80
lrubucket = 131072
defps = 1
nokilluid = 0
numpsblks = 770048
npskill = 6016
npswarn = 24064
v_pinshm = 0
pta_balance_threshold = 50
pagecoloring = 0
framesets = 2
mempools = 1
lgpg_size = 0
lgpg_regions = 0
num_spec_dataseg = n/a
spec_dataseg_int = n/a
memory_affinity = n/a
htabscale = -1
force_relalias_lite = 0
relalias_percentage = 0
data_stagger_interval = 161
large_page_heap_size = n/a
kernel_heap_psize = n/a
soft_min_lgpgs_vmpool = 0


Egate1 runs interface applications (eways), and hueyii runs the ufos application.
 
Can any errors be seen on the network switches that both boxes are attached to?
To eliminate any CPU/RAM/disk issue, maybe have vmstat and iostat running on both boxes while doing that ftp.
Maybe also try an ftp from each involved box to some other box, checking whether it stays consistently fast, to sort out which of the boxes might have the problem. You know best how many ftps are needed to say whether the slowdown should have shown up.
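For example, something along these lines (the host, file names, credentials and 10-second interval are only placeholders):

Code:
# Sketch: log system stats on both boxes while the ftp runs, then time a test
# transfer to a third box (somethirdbox and the /tmp paths are placeholders)
nohup vmstat 10 > /tmp/vmstat.out 2>&1 &
nohup iostat 10 > /tmp/iostat.out 2>&1 &
time ftp -n somethirdbox <<EOF
user testuser testpass
put /tmp/testfile.dat
bye
EOF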

laters
zaxxon
 
We've had a sniffer on, and we haven't seen any issues in the last few days, but the ftp has only been going through on the .62 address, not the .65. It could be that the last DNS changes that were made have fixed it, but we won't know until it ftps out on the .65 IP. Is there a way to specify which address to ftp out on?
 
I would first think of increasing the values of numfsbufs & hd_pbuf_cnt, and then think of tuning the other values (minfree, maxfree, minperm and maxperm).

Code:
The last five lines of the vmstat -v report are useful when you're looking for I/O problems. The first line is for disk I/Os that were blocked because there were no pbufs. Pbufs are pinned memory buffers used to hold I/O requests at the logical volume manager layer. Prior to AIX v5.3, this was a systemwide parameter. It's now tuneable on a volume-group basis using the lvmo command. The ioo parameter that controls the default number of pbufs to add when a disk is added to a volume group is pv_min_pbuf, and it defaults to 512. This specifies the minimum number of pbufs per PV that the LVM uses, and it's a global value that applies to all VGs on the system. If you see the pbuf blocked I/Os field above increasing over time, you may want to use the lvmo -a command to find out which volume groups are having problems with pbufs and then slowly increase pbufs for that volume group using the lvmo command. I normally increase the global value to 1,024. 

Paging space I/Os blocked with no psbuf refers to the number of paging space I/O requests blocked because no psbuf was available. These are pinned memory buffers used to hold I/O requests at the virtual memory manager layer. If you see these increasing, then you need to either find out why the system is paging or increase the size of the page datasets. 

Filesystem I/Os blocked with no fsbufs refers to the number of filesystem I/O requests blocked because no fsbuf was available. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer. If this is constantly increasing, then it may be necessary to use ioo to increase numfsbufs so that more bufstructs are available. The default numfsbufs value is determined by the system and seems to normally default to 196. I regularly increase this to either 1,024 or 2,048.

You can read more on that here.
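On AIX 5.3 and later, the per-volume-group tuning that excerpt describes looks roughly like this (datavg and 1024 are example values):

Code:
# Sketch only (AIX 5.3+): inspect and raise pbufs per volume group
# (datavg and 1024 are placeholders)
lvmo -a -v datavg                     # shows pv_pbuf_count and pervg_blocked_io_count
lvmo -v datavg -o pv_pbuf_count=1024  # raise pbufs for that volume group
ioo -p -o pv_min_pbuf=1024            # raise the global default for new disks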


Regarding specifying which address to ftp out on: I guess this could be controlled by the switches. I've never thought of doing it on the box, because it would create more of a headache maintaining the routing tables and things like that.
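That said, the stock ftp client just uses whatever source address the routing table picks for the outgoing interface, so if you want to see which interface (and therefore which IP) the transfer is currently leaving on, a quick look-only sketch:

Code:
# Sketch: see which interface (and therefore source IP) the route to
# server B uses; no changes made, just lookups
netstat -rn      # which en interface does the route toward server B go out on?
ifconfig -a      # which interface holds the .62 address and which holds the .65?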

Regards,
Khalid
 
Just thought I'd update this. We found the problem to be on the switch port for the 2nd IP address: it was set to Auto-negotiate instead of 100/Full. Once this was changed (on 12/27), we had to wait until the ftp used this IP. It has gone through on this IP the last 2 nights successfully with no delays.

Thanks for all your help and suggestions.
 